lmon terminating the instance due to error 481
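Unless marked as a repost, all articles on this site are original. Reposted from love wife love life, Roger's Oracle/MySQL/PostgreSQL data recovery blog
Today a customer reported that the database instance of one of their business systems had crashed and restarted. Analysis of the logs showed the following errors: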
Tue Oct 18 21:34:34 2022
Errors in file /u01/app/oracle/diag/rdbms/xxxx/xxxx1/trace/xxxx1_lmon_2491932.trc  (incident=512156):
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x1108E6388], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/xxxx/xxxx1/incident/incdir_512156/xxxx1_lmon_2491932_i512156.trc
Tue Oct 18 21:34:44 2022
Dumping diagnostic data in directory=[cdmp_20221018213444], requested by (instance=1, osid=2491932 (LMON)), summary=[incident=512156].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/oracle/diag/rdbms/xxxx/xxxx1/trace/xxxx1_lmon_2491932.trc:
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x1108E6388], [], [], [], [], [], [], [], [], [], []
LMON (ospid: 2491932): terminating the instance due to error 481
Tue Oct 18 21:34:44 2022
opiodr aborting process unknown ospid (7210544) as a result of ORA-1092
Tue Oct 18 21:34:44 2022
opiodr aborting process unknown ospid (5047352) as a result of ORA-1092
Tue Oct 18 21:34:44 2022
opiodr aborting process unknown ospid (10618076) as a result of ORA-1092
Tue Oct 18 21:34:44 2022
ORA-1092 : opitsk aborting process
Tue Oct 18 21:34:44 2022
ORA-1092 : opitsk aborting process
Tue Oct 18 21:34:44 2022
opiodr aborting process unknown ospid (5441064) as a result of ORA-1092
Tue Oct 18 21:34:44 2022
ORA-1092 : opitsk aborting process
Tue Oct 18 21:34:49 2022
Instance terminated by LMON, pid = 2491932
Tue Oct 18 21:34:53 2022
Starting ORACLE instance (normal)
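When the same pattern has to be spotted across several alert logs, the two key facts in the excerpt above (the first ORA-00600 argument and the "terminating the instance" line) can be pulled out with a short script. The following is only a minimal sketch: the function name `summarize_crash` and both regular expressions are illustrative assumptions derived from this one excerpt, not an Oracle tool or API, and other versions may format these lines differently.

```python
import re

# A few lines copied from the alert log excerpt above.
ALERT_LOG = """\
Tue Oct 18 21:34:34 2022
Errors in file /u01/app/oracle/diag/rdbms/xxxx/xxxx1/trace/xxxx1_lmon_2491932.trc  (incident=512156):
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x1108E6388], [], [], [], [], [], [], [], [], [], []
LMON (ospid: 2491932): terminating the instance due to error 481
Tue Oct 18 21:34:49 2022
Instance terminated by LMON, pid = 2491932
"""

# Hypothetical patterns, written to match the line shapes seen in this excerpt only.
TERMINATE_RE = re.compile(
    r"^(\w+) \(ospid: (\d+)\): terminating the instance due to error (\d+)$"
)
ORA600_RE = re.compile(
    r"^ORA-00600: internal error code, arguments: \[([^\]]*)\]"
)


def summarize_crash(log_text):
    """Return the terminating process, its ospid, the error number and the
    first ORA-00600 argument seen before it, or None if no termination line."""
    first_arg = None
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        ora600 = ORA600_RE.match(line)
        if ora600 and first_arg is None:
            first_arg = ora600.group(1)
        terminated = TERMINATE_RE.match(line)
        if terminated:
            return {
                "process": terminated.group(1),
                "ospid": int(terminated.group(2)),
                "error": int(terminated.group(3)),
                "ora600_arg": first_arg,
            }
    return None


if __name__ == "__main__":
    print(summarize_crash(ALERT_LOG))
```

The first ORA-00600 argument is usually the most useful search key on MOS, which is why the sketch keeps it next to the name of the process that terminated the instance.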
We can see that the instance was abnormally terminated by the LMON process. For the details we need to look further at the LMON trace:
*** SERVICE NAME:(SYS$BACKGROUND) 2022-10-18 21:34:34.668
*** MODULE NAME:() 2022-10-18 21:34:34.668
*** ACTION NAME:() 2022-10-18 21:34:34.668

Dump continued from file: /u01/app/oracle/diag/rdbms/xxxx/xxxx1/trace/xxxx1_lmon_2491932.trc
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x1108E6388], [], [], [], [], [], [], [], [], [], []

========= Dump for incident 512156 (ORA 600 [kghstack_underflow_internal_2]) ========

*** 2022-10-18 21:34:34.691
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- SQL Statement (None) -----
Current SQL information unavailable - no cursor.

----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
skdstdst()+40        bl       0000000109B4CD24     000000000 ? 000000001 ?
                                                   000000003 ? 000000000 ?
                                                   000000000 ? 000000001 ?
                                                   000000003 ? 000000000 ?
ksedst1()+112        call     skdstdst()           171F2D30C8558AB1 ?
                                                   4844284100000000 ?
                                                   FFFFFFFFFFF6500 ?
                                                   28E4DEBE4CBF3 ? 10A81AD8C ?
                                                   000000000 ? 11072A8C0 ?
                                                   2050033FFFF6508 ?
ksedst()+40          call     ksedst1()            000000000 ? 00000000A ?
                                                   000003000 ? 10A5BFFA8 ?
                                                   000000000 ? 000000000 ?
                                                   000002004 ? 000000001 ?
dbkedDefDump()+1516  call     ksedst()             000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 300000003 ?
ksedmp()+72          call     dbkedDefDump()       31072A8C0 ? 110000A60 ?
                                                   FFFFFFFFFFF6D10 ? 1106AC1B8 ?
                                                   100125838 ? FFFFFFFFFFF7730 ?
                                                   1000F0D94 ? 1106AC1B8 ?
ksfdmp()+100         call     ksedmp()             000000002 ? 000000000 ?
                                                   000000002 ? 10AAE5CB0 ?
                                                   10A07CFD0 ? 000000000 ?
                                                   1109D3E30 ? 11072A8C0 ?
dbgexPhaseII()+1904  call     ksfdmp()             000000000 ? 00000000A ?
                                                   000000002 ? 000000000 ?
                                                   000000002 ? 10A07CFC8 ?
                                                   000000000 ? 001050005 ?
dbgexProcessError()  call     dbgexPhaseII()       11072A8C0 ? 1109D2040 ?
+1556                                              00007D09C ? 200000000 ?
                                                   FFFFFFFFFFF7C28 ? 000000082 ?
                                                   000000000 ? 000000000 ?
dbgeExecuteForError  call     dbgexProcessError()  11072A8C0 ? 1109D3E30 ?
()+72                                              1FFFFB6A0 ? 000000001 ?
                                                   000000703 ? 000000011 ?
                                                   000000006 ? 1109D5B78 ?
dbgePostErrorKGE()+  call     dbgeExecuteForError  000000000 ? 00A4D1050 ?
2044                          ()                   FFFFFFFFFFFFB210 ?
                                                   00A4D1050 ? 000000000 ?
                                                   90000000D6969D8 ? 000000000 ?
                                                   110000C58 ?
dbkePostKGE_kgsf()+  call     dbgePostErrorKGE()   000003000 ? 10A5BFFA8 ?
68                                                 25800000002 ? 109E85570 ?
                                                   000000000 ? 000000000 ?
                                                   FFFFFFFFFFFBEE0 ? 11113A600 ?
kgeadse()+380        call     dbkePostKGE_kgsf()   102DA1484 ? 100000000 ?
                                                   FFFFFFFFFFFC0D8 ? 000000000 ?
                                                   110AED1A0 ? 1108EA610 ?
                                                   000000002 ? 700000000013680 ?
kgerinv_internal()+  call     kgeadse()            000000000 ? 000000000 ?
48                                                 000000000 ? 1700000010 ?
                                                   100000000 ? 000003000 ?
                                                   110D33350 ? 1108EA610 ?
kgerinv()+48         call     kgerinv_internal()   8311AABF3BAF ? 8311AABF3FD4 ?
                                                   8311AABF3BAF ? 8311AABF3BAF ?
                                                   000000000 ? 10A5A3090 ?
                                                   000000000 ? 000000000 ?
kgeasnmierr()+72     call     kgerinv()            000000000 ? 000000023 ?
                                                   000000001 ? 000000004 ?
                                                   000000000 ? 000000001 ?
                                                   110D33350 ? 110AED398 ?
kghstack_underflow_  call     kgeasnmierr()        000000000 ? FFFFFFFFFFFC100 ?
internal()+280                                     00000001E ? 100000001 ?
                                                   000000002 ? 1108E6388 ?
                                                   000000000 ? 000000000 ?
kghstack_free()+716  call     kghstack_underflow_  000000001 ? 08DBD1E85 ?
                              internal()           700011351BB7B48 ? 0000F4240 ?
                                                   000000000 ? 00000000A ?
                                                   000003000 ? 10A5BFFA8 ?
kccgrd()+264         call     kghstack_free()      FFFFFFFFFFFC0C0 ?
                                                   4224282B00000000 ?
                                                   103D2C888 ? 000004000 ?
                                                   500000005 ? C0000000C ?
                                                   400003000 ? 10A5BFFA8 ?
kjxgrf_rr_read()+66  call     kccgrd()             1FFFD02FAFF35E5 ? 110A5BD70 ?
0                                                  FFFFFFFFFFFC180 ? 000000000 ?
                                                   110A5BD70 ? 110FBCF48 ?
                                                   0037D6E50 ? 1106AC1B8 ?
kjxgrDD_rr_read()+1  call     kjxgrf_rr_read()     110A032D0 ? 700011342677E98 ?
04                                                 000000000 ? 000000001 ?
                                                   FFFFFFFFFFFC6A4 ? 110A03B38 ?
                                                   FFFFFFFFFFFC630 ?
                                                   42245280FFFFC790 ?
kjxgrimember()+124   call     kjxgrDD_rr_read()    000003000 ? 10A5BFFA8 ?
                                                   000000002 ? 700000000013680 ?
                                                   11011EAD0 ? FFFFFFFFFFFCD80 ?
                                                   000000001 ? 218DBD1E85 ?
kjxggpoll()+804      call     kjxgrimember()       FFFFFFFFFFFC6D0 ? 0000186A0 ?
                                                   101FECE90 ? 8311AABD8A40 ?
                                                   70000000000C0D0 ? 000000000 ?
                                                   000001568 ? 100000000 ?
kjfmact()+508        call     kjxggpoll()          000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   FFFFFFFFFFFC7A0 ? 000000000 ?
                                                   1037BB124 ? 000000000 ?
kjfdact()+32         call     kjfmact()            11011EAD0 ? FFFFFFFFFFFCD80 ?
                                                   000000001 ? 000000000 ?
                                                   002050000 ? 001160000 ?
                                                   10896F91E ?
                                                   14616E27FFFFC930 ?
kjfcln()+2240        call     kjfdact()            000000000 ? 10A5B3014 ?
                                                   700011351BB7B48 ? 000000002 ?
                                                   700011351BB7B54 ? 000000004 ?
                                                   2FFFFF570 ? 200000002 ?
ksbrdp()+2216        call     kjfcln()             700000000013198 ?
                                                   7000000000131B4 ? 048245028 ?
                                                   000000E00 ? 1108B2310 ?
                                                   100638128 ? 000000001 ?
                                                   700000007 ?
opirip()+1620        call     ksbrdp()             FFFFFFFFFFFFEA7 ? 10B2ADCF0 ?
                                                   FFFFFFFFFFFDE50 ? 000000000 ?
                                                   000000001 ? 000000000 ?
                                                   01099067F ? 000000001 ?
opidrv()+608         call     opirip()             10AE1FAC0 ? 410134198 ?
                                                   FFFFFFFFFFFEFC0 ?
                                                   2F7530312F ? 108354684 ?
                                                   1106AC1B8 ?
                                                   7264626D732F6462 ?
                                                   1106AC1B8 ?
sou2o()+136          call     opidrv()             32067E1DB0 ? 4FFFFF388 ?
                                                   FFFFFFFFFFFEFC0 ?
                                                   25001D022C0000 ? 000000010 ?
                                                   1106AC1B8 ? 000000000 ?
                                                   000000000 ?
opimai_real()+188    call     sou2o()              FFFFFFFFFFFF030 ?
                                                   5524445B00000001 ?
                                                   9000000000DC64C ?
                                                   BADC0FFEE0DDF00D ?
                                                   000000003 ? 9001000A008DB98 ?
                                                   A0000000A000000 ? 10B6B6F40 ?
ssthrdmain()+276     call     opimai_real()        FFFFFFFFFFFF110 ?
                                                   9001000A0092DC0 ?
                                                   FFFFFFFFFFFF130 ? 10B6F72B8 ?
                                                   90000000008AB0C ?
                                                   9001000A008DB98 ?
                                                   FFFFFFFFFFFF110 ?
                                                   9001000A008DB98 ?
main()+204           call     ssthrdmain()         3F0003720 ? FFFFFFFFFFFF478 ?
                                                   FFFFFFFFFFFF4E0 ?
                                                   9FFFFFFF000D6F0 ?
                                                   9FFFFFFF00009E0 ? 000000000 ?
                                                   000000000 ? 9FFFFFFF000D6F0 ?
__start()+112        call     main()               000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
Following the call stack above, it is easy to identify the bug below. For details, see the MOS article:
SYMPTOMS
- The LMON or LMS process crashes the instance with an error like:
ORA-00600: internal error code, arguments: [kghstack_underflow_internal_2], [0x110A10838], [], [], [], [], [], [], [], [], [], []
ORA-1092 : opitsk aborting process
Instance terminated by LMS1, pid = 14024818
- Review of the generated tracefiles reveals a call stack similar to:
… kghstack_underflow_internal kghstack_free kccgrd kjxgrf_rr_read kjxgrDD_rr_read kjxgrimember kjxggpoll kjfmact kjfdact kjfcln ksbrdp …
– OR –
… kghstack_underflow_internal kghstack_free ktundo kturcrbackoutonechg ktrgcm ktrget3 ktrget2 kclgcr …
 
CHANGES
CAUSE
The cause of this problem has been identified in a.o.:
Bug 18687067 – ORA-600 [KGHSTACK_UNDERFLOW_INTERNAL_2]
closed as duplicate of Bug 20675347 – ORA-07445 [KGHSTACK_OVERFLOW_INTERNAL()+644]
The bug is caused by an AIX compiler issue causing volatile variables in the Oracle kernel not to be handled properly.
The bug is a regression introduced in 11.2.0.4.
The issue does not reproduce in later versions, i.e. 12.1.
SOLUTION
To solve the issue, use any of below alternatives:
- Upgrade to 12.1
– OR –
 
- Apply interim patch 20675347, if available for your platform and Oracle version.
To check for conflicting patches, please use the MOS Patch Planner Tool.
Please refer to Note 1317012.1 – How To Use MOS Patch Planner To Check And Request The Conflict Patches?
If no patch exists for your version, please contact Oracle Support for a backport request.
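To compare a trace against the two call stack signatures quoted in the SYMPTOMS section of the note, one can check whether the signature's function names appear as a contiguous run in the trace's frames. The sketch below is illustrative only: `frame_name` and `contains_signature` are hypothetical helper names, and the frame list was written out by hand from the LMON trace above, with the names and offsets that the trace wraps across two lines rejoined.

```python
# Call stack signatures quoted in the MOS note (the "…" ends are left out).
SIGNATURE_LMON = [
    "kghstack_underflow_internal", "kghstack_free", "kccgrd",
    "kjxgrf_rr_read", "kjxgrDD_rr_read", "kjxgrimember",
    "kjxggpoll", "kjfmact", "kjfdact", "kjfcln", "ksbrdp",
]
SIGNATURE_LMS = [
    "kghstack_underflow_internal", "kghstack_free", "ktundo",
    "kturcrbackoutonechg", "ktrgcm", "ktrget3", "ktrget2", "kclgcr",
]

# Frames from the LMON trace above, written as "function()+offset".
# Wrapped entries such as "kghstack_underflow_" / "internal()+280" were joined by hand.
TRACE_FRAMES = [
    "kgeasnmierr()+72",
    "kghstack_underflow_internal()+280",
    "kghstack_free()+716",
    "kccgrd()+264",
    "kjxgrf_rr_read()+660",
    "kjxgrDD_rr_read()+104",
    "kjxgrimember()+124",
    "kjxggpoll()+804",
    "kjfmact()+508",
    "kjfdact()+32",
    "kjfcln()+2240",
    "ksbrdp()+2216",
    "opirip()+1620",
]


def frame_name(frame):
    """Strip the "()+offset" suffix and keep only the function name."""
    return frame.split("(", 1)[0]


def contains_signature(frames, signature):
    """True if the signature appears as a contiguous run of function names."""
    names = [frame_name(f) for f in frames]
    n = len(signature)
    return any(names[i:i + n] == signature for i in range(len(names) - n + 1))


if __name__ == "__main__":
    print("LMON variant:", contains_signature(TRACE_FRAMES, SIGNATURE_LMON))
    print("LMS variant:", contains_signature(TRACE_FRAMES, SIGNATURE_LMS))
```

A contiguous match is deliberately strict. The note only says the stack is "similar to" these signatures, so a looser subsequence match may be more appropriate for traces from other patch levels.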
 
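According to the document, this issue is fairly common on 11.2.0.4. It occurred here mainly because this customer had not installed the corresponding PSU. The problem is relatively simple, so this is just a brief note for future reference.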

