绑定变量超过65535 导致数据库集群频繁crash
本站文章除注明转载外,均为本站原创: 转载自love wife love life —Roger的Oracle/MySQL/PostgreSQL数据恢复博客
本文链接地址: 绑定变量超过65535 导致数据库集群频繁crash
今天凌晨5点半,被客户的电话声惊喜。对于DBA来讲,最紧张的时刻莫过于半夜接到突然突然。若非紧急故障,客户通常不会大半夜打电话。从客户提供的信息来看,数据库不断crash:
| 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 | Thu Aug 25 00:05:59 2022 Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0x186D3B7, opiaba()+639] [flags: 0x0, count: 1] Errors in file /u02/app/oracle/diag/rdbms/gdfw/gdfw1/trace/gdfw1_ora_93981.trc  (incident=590479): ORA-07445: exception encountered: core dump [opiaba()+639] [SIGSEGV] [ADDR:0x0] [PC:0x186D3B7] [SI_KERNEL(general_protection)] [] Incident details in: /u02/app/oracle/diag/rdbms/xxxx/xxxx1/incident/incdir_590479/xxxx1_ora_93981_i590479.trc Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. Thu Aug 25 00:06:17 2022 Dumping diagnostic data in directory=[cdmp_20220825000617], requested by (instance=1, osid=93981), summary=[incident=590479]. Thu Aug 25 00:06:19 2022 Sweep [inc][590479]: completed Sweep [inc2][590479]: completed Thu Aug 25 00:06:33 2022 Errors in file /u02/app/oracle/diag/rdbms/xxxx/xxxx1/trace/xxxx1_pmon_94310.trc  (incident=587614): ORA-00600: internal error code, arguments: [17147], [0x2223138B60], [], [], [], [], [], [], [], [], [], [] Incident details in: /u02/app/oracle/diag/rdbms/xxxx/xxxx1/incident/incdir_587614/xxxx1_pmon_94310_i587614.trc Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. Thu Aug 25 00:06:36 2022 Dumping diagnostic data in directory=[cdmp_20220825000636], requested by (instance=1, osid=94310 (PMON)), summary=[incident=587614]. Errors in file /u02/app/oracle/diag/rdbms/xxxx/xxxx1/trace/xxxx1_pmon_94310.trc: ORA-00600: internal error code, arguments: [17147], [0x2223138B60], [], [], [], [], [], [], [], [], [], [] PMON (ospid: 94310): terminating the instance due to error 472 Thu Aug 25 00:06:37 2022 ORA-1092 : opitsk aborting process System state dump requested by (instance=1, osid=94310 (PMON)), summary=[abnormal instance termination]. System State dumped to trace file /u02/app/oracle/diag/rdbms/xxxx/xxxx1/trace/xxxx1_diag_94335_20220825000641.trc Instance terminated by PMON, pid = 94310 Thu Aug 25 00:06:48 2022 Adjusting the default value of parameter parallel_max_servers from 3600 to 1970 due to the value of parameter processes (2000) Starting ORACLE instance (normal) ************************ Large Pages Information ******************* Per process system memlock (soft) limit = UNLIMITED | 
可以看到实例最终被pmon进程异常终止了。而ora-00600 17147 也是较为常见的。
进一步分析客户提供的trace 发现了如下内容,其中报错的存储过程传参超过7万个:
| 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 | *** 2022-08-25 00:05:59.450 dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x3, level=3, mask=0x0) ----- Current SQL Statement for this session (sql_id=dz3895jnbahxj) ----- BEGIN             INSERT INTO  PSSC.T_H_SG186FAULT_EVENT             (             ID,             CITY_UNIT_NAME,             CITY_UNIT_ID,             ........             SYS_GUID(),             :1 ,             :2 ,             :3 ,             :4 ,             :5 ,             :6 ,             :7 ,             :78693 ,             :78694 ,             TO_DATE(:78695 ,'YYYY-MM-DD HH24:MI:SS'),             :78696 ,             :78697 ,             TO_DATE(:78698 ,'YYYY-MM-DD HH24:MI:SS'),             :78699 ,             :78700 ,             :78701 ,             :78702 ,             :78703 ,             :78704 ,             :78705              );         COMMIT;         END; ----- Call Stack Trace ----- calling              call     entry                argument values in hex       location             type     point                (? means dubious value)      -------------------- -------- -------------------- ---------------------------- skdstdst()+41        call     kgdsdst()            000000000 ? 000000000 ?                                                    7F046159BC40 ? 7F046159BD18 ?                                                    7F04615A07C0 ? 000000003 ? ksedst1()+103        call     skdstdst()           000000000 ? 000000000 ?                                                    7F046159BC40 ? 7F046159BD18 ?                                                    7F04615A07C0 ? 000000003 ? ksedst()+39          call     ksedst1()            000000001 ? 000000001 ?                                                    7F046159BC40 ? 7F046159BD18 ?                                                    7F04615A07C0 ? 000000003 ? dbkedDefDump()+2746  call     ksedst()             000000001 ? 000000001 ?                                                    7F046159BC40 ? 7F046159BD18 ?                                                    7F04615A07C0 ? 000000003 ? ksedmp()+41          call     dbkedDefDump()       000000003 ? 000000003 ?                                                    7F046159BC40 ? 7F046159BD18 ?                                                    7F04615A07C0 ? 000000003 ? ssexhd()+2462        call     ksedmp()             000000003 ? 000000003 ?                                                    7F046159BC40 ? 7F046159BD18 ?                                                    7F04615A07C0 ? 000000003 ? __sighandler()       call     ssexhd()             00000000B ? 7F04615A1BF0 ?                                                    7F04615A1AE8 ? 7F046159BD18 ?                                                    7F04615A07C0 ? 000000003 ? opiaba()+639         signal   __sighandler()       000003380 ? 000003380 ?                                                    000000000 ? 000003200 ?                                                    222311EFD8 ? 0000000A1 ? opiprs()+307         call     opiaba()             000003380 ? 2000000226B1F0F ?                                                    000000000 ? 000003200 ?                                                    222311EFD8 ? 0000000A1 ? kksParseChildCursor  call     opiprs()             7FFC39C77780 ? 7F045E3BC3E8 ? ()+572                                             00042860A ? 21CC3DE210 ?                                                    000000004 ? 0000000A1 ? rpiswu2()+1776       call     kksParseChildCursor  7FFC39C77960 ? 7F045E3BC3E8 ?                               ()                   00042860A ? 21CC3DE210 ?                                                    000000004 ? 0000000A1 ? kksLoadChild()+5255  call     rpiswu2()            2602464080 ? 000000027 ?                                                    2263B6CC34 ? 000000017 ?                                                    000000004 ? 0000000A1 ? kxsGetRuntimeLock()  call     kksLoadChild()       00C113DA0 ? 2736366FA8 ? +2209                                              7FFC39C78EB0 ? 000000017 ?                                                    000000004 ? 0000000A1 ? kksfbc()+15108       call     kxsGetRuntimeLock()  00C113DA0 ? 7F0461397608 ?                                                    7FFC39C78EB0 ? 000000017 ?                                                    000000004 ? 0000000A1 ? kkspsc0()+1171       call     kksfbc()             7F0461397608 ? 000000003 ?                                                    000000108 ? 7F045E3BC3E8 ?                                                    000000004 ? 0000000A1 ? kksParseCursor()+11  call     kkspsc0()            7F04613826D8 ? 7F045E3BC3E8 ? 6                                                  00042860A ? 000000003 ?                                                    000000006 ? 0000000A4 ? opiosq0()+2019       call     kksParseCursor()     7FFC39C79A20 ? 7F045E3BC3E8 ?                                                    00042860A ? 000000003 ?                                                    000000006 ? 0000000A4 ? kpooprx()+274        call     opiosq0()            000000003 ? 00000000E ?                                                    7FFC39C79BF0 ? 0000000A4 ?                                                    000000006 ? 0000000A4 ? kpoal8()+842         call     kpooprx()            7FFC39C7D3D4 ? 7F045E3BC3E8 ?                                                    000428609 ? 000000001 ?                                                    000000000 ? 0000000A4 ? opiodr()+917         call     kpoal8()             00000005E ? 7F045E3BC3E8 ?                                                    000428609 ? 000000001 ?                                                    000000000 ? 0000000A4 ? ttcpip()+2183        call     opiodr()             00000005E ? 00000001C ?                                                    7FFC39C7D3D0 ? 000000001 ?                                                    000000000 ? 0000000A4 ? opitsk()+1710        call     ttcpip()             00C1319F0 ? 0099D5D10 ?                                                    7FFC39C7D3D0 ? 000000000 ?                                                    7FFC39C7CE28 ? 7FFC39C7D3CC ? opiino()+969         call     opitsk()             00C1319F8 ? 000000000 ?                                                    7FFC39C7D3D0 ? 000000000 ?                                                    7FFC39C7CE28 ? 7FFC39C7D3CC ? opiodr()+917         call     opiino()             00000003C ? 000000004 ?                                                    7FFC39C7EBC8 ? 000000000 ?                                                    7FFC39C7CE28 ? 7FFC39C7D3CC ? opidrv()+570         call     opiodr()             00000003C ? 000000004 ?                                                    7FFC39C7EBC8 ? 000000000 ?                                                    7FFC39C7CE28 ? 7FFC39C7D3CC ? sou2o()+103          call     opidrv()             00000003C ? 000000004 ?                                                    7FFC39C7EBC8 ? 000000000 ?                                                    7FFC39C7CE28 ? 7FFC39C7D3CC ? opimai_real()+133    call     sou2o()              7FFC39C7EBA0 ? 00000003C ?                                                    000000004 ? 7FFC39C7EBC8 ?                                                    7FFC39C7CE28 ? 7FFC39C7D3CC ? ssthrdmain()+265     call     opimai_real()        000000002 ? 7FFC39C7ED90 ?                                                    000000004 ? 7FFC39C7EBC8 ?                                                    7FFC39C7CE28 ? 7FFC39C7D3CC ? main()+201           call     ssthrdmain()         000000002 ? 7FFC39C7ED90 ?                                                    000000001 ? 000000000 ?                                                    7FFC39C7CE28 ? 7FFC39C7D3CC ? __libc_start_main()  call     main()               000000002 ? 7FFC39C7EF38 ? +253                                               000000001 ? 000000000 ?                                                    7FFC39C7CE28 ? 7FFC39C7D3CC ? _start()+41          call     __libc_start_main()  000A2A354 ? 000000002 ?                                                    7FFC39C7EF28 ? 000000000 ?                                                    7FFC39C7CE28 ? 7FFC39C7D3CC ? | 
对比上述相关错误以及call stack堆栈信息,发现为Oracle bug导致。可参考如下文档:
摘取其中一段描述:
CAUSE
Instance terminated due to ora-7445 [opiaba] which leads to ora-600 [17147]. ora-7445 [opiaba] error is reported due to the use of more than 65535 binds in the same sql / plsql statement.
You may find some or all of the following function codes in the ‘Call Stack’ portion of the trace file:
opiaba opiprs rpiswu2 kksLoadChild kxsGetRuntimeLock kksfbc
This scenario is reported in bug 13973845 which is closed of duplicated bug 12578873.
紧急下载该patch发客户,打上补丁之后,数据库实例不再频繁crash。从描述来看,该patch并不能彻底解决问题,只是让实例不crash而已。最根本的解决方法还是调整应用,将存储过程拆分,不要一次性传入数万个参数,超过oracle 65535的临界值。
这个问题很简答, 我们在其他客户之前都遇到过了多次,在我司内部mes平台也能看到相关的案例分享。简单记录一下吧!



Leave a Reply
You must be logged in to post a comment.