某客户RAC由于掉电导致系统崩溃的恢复过程

本站文章除注明转载外，均为本站原创： 转载自love wife love life —Roger的Oracle/MySQL/PostgreSQL数据恢复博客

这里简单记录一下，此次国庆加班恢复的某客户的2套Oracle RAC数据库，整个恢复过程中，2套rac差不多，因此这里以其中一套数据库的恢复过程为例进行简单分析说明。数据库由于为非归档，由于掉电导致重启之后系统无法正常open。

在正常open的过程中，报错如下错误：

SQL&gt; alter database open;
alter database open
*
ERROR at line 1:
ORA-00600: internal error code, arguments: [kcratr1_lastbwr], [], [], [], [],
[], [], []
SQL&gt; shutdown immediate
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
SQL&gt; startup
ORACLE instance started.
Total System Global Area 4294967296 bytes
Fixed Size                  2089576 bytes
Variable Size            2751466904 bytes
Database Buffers         1526726656 bytes
Redo Buffers               14684160 bytes
Database mounted.
ORA-00600: internal error code, arguments: [kcratr1_lastbwr], [], [], [], [],
[], [], []

SQL> alter database open;

alter database open

ERROR at line 1:

ORA-00600: internal error code, arguments: [kcratr1_lastbwr], [], [], [], [],

[], [], []

SQL> shutdown immediate

ORA-01109: database not open

Database dismounted.

ORACLE instance shut down.

SQL> startup

ORACLE instance started.

Total System Global Area 4294967296 bytes

Fixed Size 2089576 bytes

Variable Size 2751466904 bytes

Database Buffers 1526726656 bytes

Redo Buffers 14684160 bytes

Database mounted.

ORA-00600: internal error code, arguments: [kcratr1_lastbwr], [], [], [], [],

[], [], []

对于该错误，网上的解决方法也很多，可惜都不管用。这种情况之下，往往都是需要强制打开数据库的，首先需要做一个不完全恢复，如下：

SQL&gt; recover database
ORA-00279: change 236912204 generated at 09/29/2015 12:49:13 needed for thread
1
ORA-00289: suggestion : /xxxx/1_5112_877094801.dbf
ORA-00280: change 236912204 for thread 1 is in sequence #5112
Specify log: {&lt;RET&gt;=suggested | filename | AUTO | CANCEL}
auto
ORA-00308: cannot open archived log '/xxxx/1_5112_877094801.dbf'
ORA-27037: unable to obtain file status
IBM AIX RISC System/6000 Error: 2: No such file or directory
Additional information: 3

SQL> recover database

ORA-00279: change 236912204 generated at 09/29/2015 12:49:13 needed for thread

ORA-00289: suggestion : /xxxx/1_5112_877094801.dbf

ORA-00280: change 236912204 for thread 1 is in sequence #5112

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

auto

ORA-00308: cannot open archived log '/xxxx/1_5112_877094801.dbf'

ORA-27037: unable to obtain file status

IBM AIX RISC System/6000 Error: 2: No such file or directory

Additional information: 3

在进行相关操作之后，我备份了一下当前的控制文件信息，便于后面如果有问题，方便处理。强制open的过程中，发现报如下错误：

Sat Oct  3 11:49:31 2015
Instance recovery: looking for dead threads
Instance recovery: lock domain invalid but no dead threads
Sat Oct  3 11:49:33 2015
Errors in file /oracle/oracle/admin/cwdb/udump/cwdb1_ora_6029586.trc:
ORA-00600: internal error code, arguments: [kclchkblk_4], [1], [18446744072394632417], [1], [18446744072392296306], [], [], []
Sat Oct  3 11:49:34 2015
Errors in file /oracle/oracle/admin/xxxx/udump/xxxx1_ora_6029586.trc:
ORA-00600: internal error code, arguments: [kclchkblk_4], [1], [18446744072394632417], [1], [18446744072392296306], [], [], []
Sat Oct  3 11:49:34 2015
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 6029586

Sat Oct 3 11:49:31 2015

Instance recovery: looking for dead threads

Instance recovery: lock domain invalid but no dead threads

Sat Oct 3 11:49:33 2015

Errors in file /oracle/oracle/admin/cwdb/udump/cwdb1_ora_6029586.trc:

ORA-00600: internal error code, arguments: [kclchkblk_4], [1], [18446744072394632417], [1], [18446744072392296306], [], [], []

Sat Oct 3 11:49:34 2015

Errors in file /oracle/oracle/admin/xxxx/udump/xxxx1_ora_6029586.trc:

ORA-00600: internal error code, arguments: [kclchkblk_4], [1], [18446744072394632417], [1], [18446744072392296306], [], [], []

Sat Oct 3 11:49:34 2015

Error 600 happened during db open, shutting down database

USER: terminating instance due to error 600

Instance terminated by USER, pid = 6029586

这个错误已经处理过多次了。同样，百度一下，会发现很多人都写过相关的文章，包括Oracle mos的文章解释也是说这是临时块的scn过大导致，通过drop tempfile即可绕过该问题。实际上，这种情况之下，根本不会起作用。

但是不管如何，这个问题很明显都是跟block的scn有关系。既然是跟scn有关系，那么处理就不难了，通过推进scn即可。

通过推进scn 之后，再次open resetlogs成功打开数据库，可惜的是alert log报了一堆错误，如下所示：

Sat Oct  3 13:10:34 2015
Errors in file /oracle/oracle/admin/xxxx/bdump/xxxx1_smon_10420246.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
Opening with internal Resource Manager plan
where NUMA PG = 1, CPUs = 40
Sat Oct  3 13:10:35 2015
ORACLE Instance xxxx1 (pid = 25) - Error 600 encountered while recovering transaction (23, 85).
Sat Oct  3 13:10:35 2015
Errors in file /oracle/oracle/admin/xxxx/bdump/xxxx1_smon_10420246.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
Sat Oct  3 13:10:35 2015
Trace dumping is performing id=[cdmp_20151003131035]
Sat Oct  3 13:10:35 2015
replication_dependency_tracking turned off (no async multimaster replication found)
Sat Oct  3 13:10:36 2015
Instance recovery: looking for dead threads
Instance recovery: lock domain invalid but no dead threads
Sat Oct  3 13:10:36 2015
Errors in file /oracle/oracle/admin/xxxx/bdump/xxxx1_smon_10420246.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
Sat Oct  3 13:10:37 2015
Starting background process QMNC
Sat Oct  3 13:10:37 2015
ORACLE Instance xxxx1 (pid = 25) - Error 600 encountered while recovering transaction (23, 85).
Sat Oct  3 13:10:37 2015
Errors in file /oracle/oracle/admin/xxxx/bdump/xxxx1_smon_10420246.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
QMNC started with pid=53, OS id=7536816
Sat Oct  3 13:10:41 2015
LOGSTDBY: Validating controlfile with logical metadata
Sat Oct  3 13:10:41 2015
LOGSTDBY: Validation complete
Sat Oct  3 13:10:46 2015
Errors in file /oracle/oracle/admin/xxxx/bdump/xxxx1_mmon_9110004.trc:
ORA-00600: internal error code, arguments: [qertbFetchByRowID], [], [], [], [], [], [], []
Sat Oct  3 13:10:48 2015
Errors in file /oracle/oracle/admin/xxxx/udump/xxxx1_ora_6619434.trc:
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], []
Completed: alter database open resetlogs

Sat Oct 3 13:10:34 2015

Errors in file /oracle/oracle/admin/xxxx/bdump/xxxx1_smon_10420246.trc:

ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []

Opening with internal Resource Manager plan

where NUMA PG = 1, CPUs = 40

Sat Oct 3 13:10:35 2015

ORACLE Instance xxxx1 (pid = 25) - Error 600 encountered while recovering transaction (23, 85).

Sat Oct 3 13:10:35 2015

Errors in file /oracle/oracle/admin/xxxx/bdump/xxxx1_smon_10420246.trc:

ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []

Sat Oct 3 13:10:35 2015

Trace dumping is performing id=[cdmp_20151003131035]

Sat Oct 3 13:10:35 2015

replication_dependency_tracking turned off (no async multimaster replication found)

Sat Oct 3 13:10:36 2015

Instance recovery: looking for dead threads

Instance recovery: lock domain invalid but no dead threads

Sat Oct 3 13:10:36 2015

Errors in file /oracle/oracle/admin/xxxx/bdump/xxxx1_smon_10420246.trc:

ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []

Sat Oct 3 13:10:37 2015

Starting background process QMNC

Sat Oct 3 13:10:37 2015

ORACLE Instance xxxx1 (pid = 25) - Error 600 encountered while recovering transaction (23, 85).

Sat Oct 3 13:10:37 2015

Errors in file /oracle/oracle/admin/xxxx/bdump/xxxx1_smon_10420246.trc:

ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []

QMNC started with pid=53, OS id=7536816

Sat Oct 3 13:10:41 2015

LOGSTDBY: Validating controlfile with logical metadata

Sat Oct 3 13:10:41 2015

LOGSTDBY: Validation complete

Sat Oct 3 13:10:46 2015

Errors in file /oracle/oracle/admin/xxxx/bdump/xxxx1_mmon_9110004.trc:

ORA-00600: internal error code, arguments: [qertbFetchByRowID], [], [], [], [], [], [], []

Sat Oct 3 13:10:48 2015

Errors in file /oracle/oracle/admin/xxxx/udump/xxxx1_ora_6619434.trc:

ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], []

Completed: alter database open resetlogs

这部分错误处理其实都不难。对于第一个ora-00600 [4137] 错误，很明显这是跟undo有关系的，其中(23,85)中的23表现第23号回滚段；通过屏蔽第23号回滚段可以很容易解决该错误，当然，这会儿导致事务的不一致性，这是没办法的，已经undo异常，Oracle 已经没有办法进行正常的事务恢复了。

其次，对于第2个ora-00600 [qertbFetchByRowID] 错误，处理也很简单，其大致意思是通过rowid访问获取数据有异常，很明显这是跟index有关系，通过重建index 可以解决该问题，其次最后一个[kdsgrp1] 错误就更常见了，通常也是Index的问题，重建即可。

看上去一切的恢复过程都很简单，很顺利，然而这里真正的难题，真正的问题才开始。

1	看上去一切的恢复过程都很简单，很顺利，然而这里真正的难题，真正的问题才开始。

也就是最后一个看似很简单的错误ora-00600 [kdsgrp1]错误，对我们产生了极大的困难。首先我们来看下产生该错误时涉及到那些对象：

1	也就是最后一个看似很简单的错误ora-00600 [kdsgrp1]错误，对我们产生了极大的困难。首先我们来看下产生该错误时涉及到那些对象：

<pre class="brush:plain">Validate domain 0
Validated domain 0, flags = 0x0
kwqmnich: current time:: 13: 31: 34
kwqmnich: instance no 0 check_only flag 1
kwqmnich: initialized job cache structure
row 0041edda.2e continuation at
file# 1 block# 126426 slot 47 not found
**************************************************
KDSTABN_GET: 0 ..... ntab: 1
curSlot: 47 ..... nrows: 175
**************************************************
*** 2015-10-03 13:31:40.864
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], []
Current SQL statement for this session:
SELECT OWNER,NAME,TYPE,COUNT(*) FROM DBA_SOURCE WHERE SUBSTR(OWNER,1,4)='FMIS' GROUP BY OWNER,NAME,TYPE HAVING COUNT(*)&gt;1000
----- PL/SQL Call Stack -----
Object id on Block? Y
seg/obj: 0x12  csc: 0x01.b1957474  itc: 3  flg: O  typ: 1 - DATA
fsl: 0  fnx: 0x0 ver: 0x01
Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0009.00e.0003968d  0x0c07f50f.1ea7.5d  --U-    1  fsc 0x0053.b1957475
0x02   0x0017.005.0006755c  0x0c0774de.1be3.38  C---    0  scn 0x0001.b1957451
0x03   0x0002.042.00010d39  0x0080b268.0cf1.22  C---    0  scn 0x0001.b1957429
data_block_dump,data header at 0x110f46074
===============
tsiz: 0x3f88
hsiz: 0x170
pbl: 0x110f46074
bdba: 0x0041edda
76543210
flag=--------
ntab=1
nrow=175
frre=0
fsbo=0x170
fseo=0x783
avsp=0x3cbe
tosp=0x3d13
0xe:pti[0]      nrow=175        offs=0
0x12:pri[0]     sfll=1
0x14:pri[1]     sfll=2

<pre class="brush:plain">Validate domain 0

Validated domain 0, flags = 0x0

kwqmnich: current time:: 13: 31: 34

kwqmnich: instance no 0 check_only flag 1

kwqmnich: initialized job cache structure

row 0041edda.2e continuation at

file# 1 block# 126426 slot 47 not found

**************************************************

KDSTABN_GET: 0 ..... ntab: 1

curSlot: 47 ..... nrows: 175

**************************************************

*** 2015-10-03 13:31:40.864

ksedmp: internal or fatal error

ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], []

Current SQL statement for this session:

SELECT OWNER,NAME,TYPE,COUNT(*) FROM DBA_SOURCE WHERE SUBSTR(OWNER,1,4)='FMIS' GROUP BY OWNER,NAME,TYPE HAVING COUNT(*)>1000

----- PL/SQL Call Stack -----

Object id on Block? Y

seg/obj: 0x12 csc: 0x01.b1957474 itc: 3 flg: O typ: 1 - DATA

fsl: 0 fnx: 0x0 ver: 0x01

Itl Xid Uba Flag Lck Scn/Fsc

0x01 0x0009.00e.0003968d 0x0c07f50f.1ea7.5d --U- 1 fsc 0x0053.b1957475

0x02 0x0017.005.0006755c 0x0c0774de.1be3.38 C--- 0 scn 0x0001.b1957451

0x03 0x0002.042.00010d39 0x0080b268.0cf1.22 C--- 0 scn 0x0001.b1957429

data_block_dump,data header at 0x110f46074

===============

tsiz: 0x3f88

hsiz: 0x170

pbl: 0x110f46074

bdba: 0x0041edda

76543210

flag=--------

ntab=1

nrow=175

frre=0

fsbo=0x170

fseo=0x783

avsp=0x3cbe

tosp=0x3d13

0xe:pti[0] nrow=175 offs=0

0x12:pri[0] sfll=1

0x14:pri[1] sfll=2

我们可以发现，除开其他的非核心对象之后，这里还涉及到一个obj#=18，也就是obj$ 这个核心的数据字典表。而该数据字典表上的几个Index，

1	我们可以发现，除开其他的非核心对象之后，这里还涉及到一个obj#=18，也就是obj$ 这个核心的数据字典表。而该数据字典表上的几个Index，

i_obj1，i_obj2，i_obj3 都是object_id 小于57的核心对象，这部分对象是属于bootstrap$ 的核心数据字典对象。即是Index也无法通过

1	i_obj1，i_obj2，i_obj3 都是object_id 小于57的核心对象，这部分对象是属于bootstrap$ 的核心数据字典对象。即是Index也无法通过

rebuild，38003 event或在upgrade模式下进行重建。

1	rebuild，38003 event或在upgrade模式下进行重建。

当然，这里也不是说完全无法去重建上述数据字典表，我后面有一篇文章会相信讲解如何去重建。

1	当然，这里也不是说完全无法去重建上述数据字典表，我后面有一篇文章会相信讲解如何去重建。

在分析过程中，我发现其中的前面2个Index都有问题，如下：

1	在分析过程中，我发现其中的前面2个Index都有问题，如下：

<pre class="brush:plain">SQL&gt; analyze table obj$ validate structure;
Table analyzed.
SQL&gt; select index_name from dba_indexes where table_name='OBJ$';
INDEX_NAME
------------------------------
I_OBJ1
I_OBJ2
I_OBJ3
SQL&gt; analyze index I_OBJ1 validate structure;
analyze index I_OBJ1 validate structure
*
ERROR at line 1:
ORA-08100: index is not valid - see trace file for diagnostics
SQL&gt; analyze index I_OBJ2 validate structure;
analyze index I_OBJ2 validate structure
*
ERROR at line 1:
ORA-08100: index is not valid - see trace file for diagnostics
SQL&gt; analyze index I_OBJ3 validate structure;
Index analyzed.

<pre class="brush:plain">SQL> analyze table obj$ validate structure;

Table analyzed.

SQL> select index_name from dba_indexes where table_name='OBJ$';

INDEX_NAME

------------------------------

I_OBJ1

I_OBJ2

I_OBJ3

SQL> analyze index I_OBJ1 validate structure;

analyze index I_OBJ1 validate structure

ERROR at line 1:

ORA-08100: index is not valid - see trace file for diagnostics

SQL> analyze index I_OBJ2 validate structure;

analyze index I_OBJ2 validate structure

ERROR at line 1:

ORA-08100: index is not valid - see trace file for diagnostics

SQL> analyze index I_OBJ3 validate structure;

Index analyzed.

不过这里我们也要注意的时，虽然前面2个index都有问题，然而上述错误产生时涉及到的index并不是2个都用到了，其实只是用到了第一个index就

1	不过这里我们也要注意的时，虽然前面2个index都有问题，然而上述错误产生时涉及到的index并不是2个都用到了，其实只是用到了第一个index就

报错ora-00600错误了。由于客户想通过expdp schema的方式去导出数据，然而发现执行时报错ora-00600 [kdsgrp1],包括exp执行时也报该

1	报错ora-00600错误了。由于客户想通过expdp schema的方式去导出数据，然而发现执行时报错ora-00600 [kdsgrp1],包括exp执行时也报该

错误，不过exp tables的方式，不会报错；由于对象太多，将近50万个对象（包括表，index以及其他）。很明显，只能通过用户级别的导出。

1	错误，不过exp tables的方式，不会报错；由于对象太多，将近50万个对象（包括表，index以及其他）。很明显，只能通过用户级别的导出。

那么也就在意味着我们必须修复这个错误才。

1	那么也就在意味着我们必须修复这个错误才。

通过dump 相关的block，我们发现错误是很奇怪的，如下：

1	通过dump 相关的block，我们发现错误是很奇怪的，如下：

<pre class="brush:plain">*** SESSION ID:(1052.1243) 2015-10-03 17:56:59.784
Block Checking: DBA = 4684328, Block Type = KTB-managed data block
*** previous block dba dba: 477a2adoes not match my previous block dba dba: 477a26
**** row 0: key out of order
---- end index block validation
---- for block 4684328 = 0x00477a28
Block header dump:  0x00477a28
Object id on Block? Y
seg/obj: 0x24  csc: 0x02.80026908  itc: 19  flg: O  typ: 2 - INDEX
fsl: 0  fnx: 0x477a1c ver: 0x01
Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0015.01f.00020a0a  0x00867c38.0862.04  C---    0  scn 0x0001.83f32656
0x02   0x002e.044.00002415  0x0082989a.0477.52  C---    0  scn 0x0001.74d90109
0x03   0x002c.00a.000012c5  0x0b804b78.012b.70  C---    0  scn 0x0001.74d9011c
0x04   0x0002.001.00003a72  0x00861aab.024b.0a  C---    0  scn 0x0001.74fa8cf1
0x05   0x002f.02e.000065eb  0x0b816866.0217.5b  C---    0  scn 0x0001.74fb0205
0x06   0x0026.011.00002000  0x0b814f02.016d.14  C---    0  scn 0x0001.74fc3dbf
0x07   0x0006.011.00003b1c  0x0084c65b.01f9.10  C---    0  scn 0x0001.74fde478
0x08   0x0029.05f.00003258  0x0084faa6.017c.2f  C---    0  scn 0x0001.74fdedad
0x09   0x001b.023.00003fe6  0x008a8718.025b.3b  C---    0  scn 0x0001.74fdfedc
0x0a   0x001b.006.00004057  0x0b804f24.025c.33  C---    0  scn 0x0001.751a25ed
0x0b   0x001b.02a.00004039  0x0b804f24.025c.49  C---    0  scn 0x0001.751a2609
0x0c   0x001b.024.0000404f  0x0b804f24.025c.63  C---    0  scn 0x0001.751a2627
0x0d   0x000a.018.00021b0b  0x0b821c8b.04cc.45  C---    0  scn 0x0001.751a263f
0x0e   0x000a.055.00021b0a  0x0b821c8b.04cc.5e  C---    0  scn 0x0001.751a265c
0x0f   0x000a.031.00021aec  0x0b821c8c.04cc.09  C---    0  scn 0x0001.751a2678
0x10   0x0022.05e.00001dbd  0x008a88df.01e0.2e  C---    0  scn 0x0001.74d8ff09
0x11   0x0025.010.00001ead  0x00860446.01ac.20  C---    0  scn 0x0001.74d8ff7d
0x12   0x002b.043.00001228  0x0084dc65.0164.11  C---    0  scn 0x0001.74d8ffe2
0x13   0x001f.015.00001de8  0x0b81a76a.0183.4e  C---    0  scn 0x0001.74d90071
Leaf block dump
===============
header address 504403185228956148=0x70000064725c1f4
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y
kdxconco 1
kdxcosdc 2
kdxconro 0
kdxcofbo 36=0x24
kdxcofeo 15816=0x3dc8
kdxcoavs 15780
kdxlespl 0
kdxlende 0
kdxlenxt 4684330=0x477a2a
kdxleprv 4684326=0x477a26
kdxledsz 6
kdxlebksz 15816
*** dummy key ***
row#0[9549] flag: ---D--, lock: 0, len=14, data:(6):  00 44 03 10 00 7c
col 0; len 5; (5):  c4 04 24 25 2f
----- end of leaf block dump -----
dumping parent of corrupted subtree, row # = 1202
dumping parent of corrupted subtree, row # = 3

<pre class="brush:plain">*** SESSION ID:(1052.1243) 2015-10-03 17:56:59.784

Block Checking: DBA = 4684328, Block Type = KTB-managed data block

*** previous block dba dba: 477a2adoes not match my previous block dba dba: 477a26

**** row 0: key out of order

---- end index block validation

---- for block 4684328 = 0x00477a28

Block header dump: 0x00477a28

Object id on Block? Y

seg/obj: 0x24 csc: 0x02.80026908 itc: 19 flg: O typ: 2 - INDEX

fsl: 0 fnx: 0x477a1c ver: 0x01

Itl Xid Uba Flag Lck Scn/Fsc

0x01 0x0015.01f.00020a0a 0x00867c38.0862.04 C--- 0 scn 0x0001.83f32656

0x02 0x002e.044.00002415 0x0082989a.0477.52 C--- 0 scn 0x0001.74d90109

0x03 0x002c.00a.000012c5 0x0b804b78.012b.70 C--- 0 scn 0x0001.74d9011c

0x04 0x0002.001.00003a72 0x00861aab.024b.0a C--- 0 scn 0x0001.74fa8cf1

0x05 0x002f.02e.000065eb 0x0b816866.0217.5b C--- 0 scn 0x0001.74fb0205

0x06 0x0026.011.00002000 0x0b814f02.016d.14 C--- 0 scn 0x0001.74fc3dbf

0x07 0x0006.011.00003b1c 0x0084c65b.01f9.10 C--- 0 scn 0x0001.74fde478

0x08 0x0029.05f.00003258 0x0084faa6.017c.2f C--- 0 scn 0x0001.74fdedad

0x09 0x001b.023.00003fe6 0x008a8718.025b.3b C--- 0 scn 0x0001.74fdfedc

0x0a 0x001b.006.00004057 0x0b804f24.025c.33 C--- 0 scn 0x0001.751a25ed

0x0b 0x001b.02a.00004039 0x0b804f24.025c.49 C--- 0 scn 0x0001.751a2609

0x0c 0x001b.024.0000404f 0x0b804f24.025c.63 C--- 0 scn 0x0001.751a2627

0x0d 0x000a.018.00021b0b 0x0b821c8b.04cc.45 C--- 0 scn 0x0001.751a263f

0x0e 0x000a.055.00021b0a 0x0b821c8b.04cc.5e C--- 0 scn 0x0001.751a265c

0x0f 0x000a.031.00021aec 0x0b821c8c.04cc.09 C--- 0 scn 0x0001.751a2678

0x10 0x0022.05e.00001dbd 0x008a88df.01e0.2e C--- 0 scn 0x0001.74d8ff09

0x11 0x0025.010.00001ead 0x00860446.01ac.20 C--- 0 scn 0x0001.74d8ff7d

0x12 0x002b.043.00001228 0x0084dc65.0164.11 C--- 0 scn 0x0001.74d8ffe2

0x13 0x001f.015.00001de8 0x0b81a76a.0183.4e C--- 0 scn 0x0001.74d90071

Leaf block dump

===============

header address 504403185228956148=0x70000064725c1f4

kdxcolev 0

KDXCOLEV Flags = - - -

kdxcolok 0

kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y

kdxconco 1

kdxcosdc 2

kdxconro 0

kdxcofbo 36=0x24

kdxcofeo 15816=0x3dc8

kdxcoavs 15780

kdxlespl 0

kdxlende 0

kdxlenxt 4684330=0x477a2a

kdxleprv 4684326=0x477a26

kdxledsz 6

kdxlebksz 15816

*** dummy key ***

row#0[9549] flag: ---D--, lock: 0, len=14, data:(6): 00 44 03 10 00 7c

col 0; len 5; (5): c4 04 24 25 2f

----- end of leaf block dump -----

dumping parent of corrupted subtree, row # = 1202

dumping parent of corrupted subtree, row # = 3

对于上述这个index 的错误，我是第一次遇见，跟老熊讨论了一下，他认为可能是index split情况之下出现的。在远程时，我也用过bbed

1	对于上述这个index 的错误，我是第一次遇见，跟老熊讨论了一下，他认为可能是index split情况之下出现的。在远程时，我也用过bbed

对前后将近10个index block进行了分析，通过比较index 的链表，发现确实不匹配。

1	对前后将近10个index block进行了分析，通过比较index 的链表，发现确实不匹配。

对于这种情况之下，想通过bbed去修复 index，难度可想而知，因此果断放弃这种方式。最后无奈之下，只能通过处理数据字典表

1	对于这种情况之下，想通过bbed去修复 index，难度可想而知，因此果断放弃这种方式。最后无奈之下，只能通过处理数据字典表

的方式来处理掉i_obj1，i_obj2 这2个index。最后再让客户进行exp 用户级别的导出，只不过这个导出的时间比较漫长了。

1	的方式来处理掉i_obj1，i_obj2 这2个index。最后再让客户进行exp 用户级别的导出，只不过这个导出的时间比较漫长了。

Tags: kclchkblk_4, ora-00600, [4137], [kcratr1_lastbwr], [kdsgrp1], [qertbFetchByRowID]
Posted in backup & rcovery on 10月 7, 2015

2 Responses to “某客户RAC由于掉电导致系统崩溃的恢复过程”

jyc Says:

10月 7, 2015 at 上午 10:17

recover database的时候没有选择online redo吗？
lizhenxu Says:

10月 7, 2015 at 下午 2:10

客户之前已经recover了N次了。。

You must be logged in to post a comment.

love wife love life —Roger的Oracle/MySQL/PostgreSQL数据恢复博客

Categories

Archives

最新评论

国内圈子

oracle security

某客户RAC由于掉电导致系统崩溃的恢复过程

2 Responses to “某客户RAC由于掉电导致系统崩溃的恢复过程”

Leave a Reply