love wife love life: Roger's Oracle/MySQL/PostgreSQL data recovery blog


Oracle ASM Internals Series (9): ASM Dynamic Volume Manager (ADVM)

Oracle 11gR2 introduced ACFS, and along with it ASM Dynamic Volume Manager (ADVM) to support ACFS.
In 11.2, ASM is no longer used only to store database files. It can also hold unstructured data such as clusterware files,
ordinary binaries, external files and text files.

In short, from 11gR2 onward ASM is simpler and smarter, with much lower overhead, and it removes the dependency on third-party volume management software.

This article looks at ADVM, introduced in 11gR2, which we see as ASM metadata file 7.

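Before any ADVM volume has been created, file 7 does not show up when we query for it, so let's create one first.
First, the ACFS-related services need to be started: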
[root@11gR2test bin]# ./acfsroot install
ACFS-9300: ADVM/ACFS distribution files found.
ACFS-9312: Existing ADVM/ACFS installation detected.
ACFS-9314: Removing previous ADVM/ACFS installation.
ACFS-9315: Previous ADVM/ACFS components successfully removed.
ACFS-9307: Installing requested ADVM/ACFS software.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9321: Creating udev for ADVM/ACFS.
ACFS-9323: Creating module dependencies - this may take some time.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9309: ADVM/ACFS installation correctness verified.
[root@11gR2test bin]# ./acfsload  start
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9322: completed
[root@11gR2test bin]# ./acfsdriverstate version
ACFS-9325:     Driver OS kernel version = 2.6.18-8.el5(i386).
ACFS-9326:     Driver Oracle version = 111206.

Create a diskgroup:

Create the ADVM volumes:
[ora11g@11gR2test ~]$ asmcmd volcreate -G acfs -s 1g acfs_v1
[ora11g@11gR2test ~]$ asmcmd volcreate -G acfs -s 1g acfs_v2
ORA-15032: not all alterations performed
ORA-15041: diskgroup "ACFS" space exhausted (DBD ERROR: OCIStmtExecute)
[ora11g@11gR2test ~]$ asmcmd volinfo -a
Diskgroup Name: ACFS

Volume Name: ACFS_V1
Volume Device: /dev/asm/acfs_v1-41
State: ENABLED
Size (MB): 1024
Resize Unit (MB): 256
Redundancy: MIRROR
Stripe Columns: 4
Stripe Width (K): 128
Usage:
Mountpath:

[ora11g@11gR2test ~]$ exit
exit

Volume Name: ACFS_V1
Volume Device: /dev/asm/acfs_v1-41
State: ENABLED
Size (MB): 1024
Resize Unit (MB): 256
Redundancy: MIRROR
Stripe Columns: 4
Stripe Width (K): 128
Usage:
Mountpath:

Volume Name: ACFS_V2
Volume Device: /dev/asm/acfs_v2-41
State: ENABLED
Size (MB): 512
Resize Unit (MB): 256
Redundancy: MIRROR
Stripe Columns: 4
Stripe Width (K): 128
Usage:
Mountpath:

With the ADVM volumes created, let's query the view again and see whether ASM file 7 is now visible.

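From the volinfo output above, you can see that the volumes created here with default options are mirrored, with a resize unit of 256 MB, 4 stripe columns and a stripe width of 128 KB.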
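The volume attributes in the `asmcmd volinfo` output can also be extracted programmatically. The following is a minimal sketch in Python, assuming the "Key: value" text layout shown above; the function name `parse_volinfo` is hypothetical and is not part of any Oracle tool.

```python
def parse_volinfo(text):
    """Parse `asmcmd volinfo` text into a list of per-volume dicts.

    Assumes the 'Key: value' layout shown in this article.
    """
    volumes = []
    diskgroup = None
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue  # skip blank lines and anything that is not 'Key: value'
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Diskgroup Name":
            diskgroup = value
        elif key == "Volume Name":
            # A 'Volume Name' line starts a new volume record.
            current = {"Diskgroup Name": diskgroup, "Volume Name": value}
            volumes.append(current)
        elif current is not None:
            current[key] = value
    return volumes


sample = """\
Diskgroup Name: ACFS

Volume Name: ACFS_V1
Volume Device: /dev/asm/acfs_v1-41
State: ENABLED
Size (MB): 1024
Resize Unit (MB): 256
Redundancy: MIRROR
Stripe Columns: 4
Stripe Width (K): 128
Usage:
Mountpath:
"""

vols = parse_volinfo(sample)
for v in vols:
    print(v["Volume Name"], v["Size (MB)"], v["Redundancy"])
```

Empty attributes such as Usage and Mountpath come back as empty strings, which makes it easy to spot volumes that are not yet formatted or mounted.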
Create an ACFS file system:
[ora11g@11gR2test ~]$ /sbin/mkfs -t acfs /dev/asm/acfs_v1-41
mkfs.acfs: version                   = 11.2.0.2.0
mkfs.acfs: on-disk version           = 39.0
mkfs.acfs: volume                    = /dev/asm/acfs_v1-41
mkfs.acfs: volume size               = 1073741824
mkfs.acfs: Format complete.
[root@11gR2test bin]# mkdir /acfs_test
[root@11gR2test bin]# chown -R ora11g:oinstall /acfs_test
[root@11gR2test bin]# mount -t acfs  /dev/asm/acfs_v1-41  /acfs_test
[root@11gR2test bin]# mount
/dev/sda2 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda5 on /home type ext3 (rw)
/dev/sda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
none on /proc/fs/vmblock/mountPoint type vmblock (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
/dev/asm/acfs_v1-41 on /acfs_test type acfs (rw)
[ora11g@11gR2test ~]$ asmcmd volinfo -G acfs ACFS_V1
Diskgroup Name: ACFS

Volume Name: ACFS_V1
Volume Device: /dev/asm/acfs_v1-41
State: ENABLED
Size (MB): 1024
Resize Unit (MB): 256
Redundancy: MIRROR
Stripe Columns: 4
Stripe Width (K): 128
Usage: ACFS
Mountpath: /acfs_test
[ora11g@11gR2test ~]$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2             3.8G  2.9G  767M  80% /
/dev/sda5              19G   18G  419M  98% /home
/dev/sda1              46M   13M   31M  31% /boot
tmpfs                 506M  154M  352M  31% /dev/shm
/dev/asm/acfs_v1-41   1.0G   39M  986M   4% /acfs_test

Next, we use kfed to read the ADVM metadata:

-- block 1

-- block 2

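Since there are only 2 ADVM volumes here, reading the first 2 blocks with kfed is enough: block 1 corresponds to the first ADVM volume, block 2 to the second,
and so on. Because there are only 2 volumes, reading block 3 returns empty data, which is expected, as shown here: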

Now to the main topic, the structure of the ADVM metadata. From the kfed output above, it breaks down into 3 parts:

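1) kfbh, the block header. I won't go into detail here; it is much the same as described in earlier articles.
kfbh.type: the block type
kfbh.block.obj: the ASM file number of this metadata. ADVM is file 7, so the value here is 7.
kfbh.block.blk: the block number where this data resides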

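2) kffdnd. From the output above it is not hard to guess that this part locates and describes the block's position in the directory tree.
It is the same kffdnd structure described earlier for the disk directory, so I won't say much about it here.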

kffdnd.bnode.incarn:                  1 ; 0x000: A=1 NUMM=0x0   -- allocation info, including the block's incarnation number and the pointer to the next freelist block
kffdnd.bnode.frlist.number:  4294967295 ; 0x004: 0xffffffff
kffdnd.bnode.frlist.incarn:           0 ; 0x008: A=0 NUMM=0x0
kffdnd.overfl.number:                 2 ; 0x00c: 0x00000002     -- overfl: points to the next block at the same level
kffdnd.overfl.incarn:                 1 ; 0x010: A=1 NUMM=0x0
kffdnd.parent.number:        4294967295 ; 0x014: 0xffffffff
kffdnd.parent.incarn:                 0 ; 0x018: A=0 NUMM=0x0
kffdnd.fstblk.number:                 0 ; 0x01c: 0x00000000     -- points to the block one level up
kffdnd.fstblk.incarn:                 1 ; 0x020: A=1 NUMM=0x0
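
When comparing several blocks it helps to turn kfed's text output into a structure. The following is a minimal sketch in Python, assuming the `field: value ; offset: detail` line layout shown above; the names `KFED_LINE` and `parse_kfed` are hypothetical helpers, not part of kfed.

```python
import re

# One kfed line looks like:
#   kffdnd.overfl.number:                 2 ; 0x00c: 0x00000002
KFED_LINE = re.compile(
    r"^(?P<field>\S+):\s+(?P<value>.*?)\s*;\s*"
    r"(?P<offset>0x[0-9a-fA-F]+):\s*(?P<detail>.*)$"
)


def parse_kfed(text):
    """Return {field: {"value", "offset", "detail"}} for each kfed line."""
    fields = {}
    for line in text.splitlines():
        m = KFED_LINE.match(line.strip())
        if not m:
            continue  # ignore anything that is not a kfed field line
        value = m.group("value")
        fields[m.group("field")] = {
            # Numeric values become ints; names and paths stay strings.
            "value": int(value) if value.isdigit() else value,
            "offset": int(m.group("offset"), 16),
            "detail": m.group("detail"),
        }
    return fields


sample = """\
kffdnd.bnode.frlist.number:  4294967295 ; 0x004: 0xffffffff
kffdnd.overfl.number:                 2 ; 0x00c: 0x00000002
kffdnd.overfl.incarn:                 1 ; 0x010: A=1 NUMM=0x0
"""

node = parse_kfed(sample)
print(node["kffdnd.overfl.number"]["value"])
print(hex(node["kffdnd.bnode.frlist.number"]["value"]))
```

The decimal value 4294967295 is simply 0xffffffff, which is why the frlist and parent pointers in the dump look so large.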

3) kfvvde. This part holds the ASM ADVM metadata definitions.

-- The first part is entry information, which is not important here
kfvvde.entry.incarn:                  1 ; 0x024: A=1 NUMM=0x0
kfvvde.entry.hash:                    0 ; 0x028: 0x00000000
kfvvde.entry.refer.number:   4294967295 ; 0x02c: 0xffffffff
kfvvde.entry.refer.incarn:            0 ; 0x030: A=0 NUMM=0x0

-- The following fields are what we really care about:
kfvvde.volnm:                   ACFS_V1 ; 0x034: length=7  -- the ASM ADVM volume name
kfvvde.usage:                      ACFS ; 0x054: length=4  -- the usage type of the ADVM volume; here it is used by ACFS
kfvvde.dgname:                          ; 0x074: length=0
kfvvde.clname:                          ; 0x094: length=0
kfvvde.mountpath:            /acfs_test ; 0x0b4: length=10  -- the ACFS mount path
kfvvde.drlinit:                       1 ; 0x4b5: 0x01
kfvvde.pad1:                          0 ; 0x4b6: 0x0000
kfvvde.volfnum.number:              256 ; 0x4b8: 0x00000100 -- the volume file number
kfvvde.volfnum.incarn:        808033913 ; 0x4bc: 0x30299e79
kfvvde.drlfnum.number:              257 ; 0x4c0: 0x00000101 -- the file number of the volume's dirty region logging (DRL) file
kfvvde.drlfnum.incarn:        808033913 ; 0x4c4: 0x30299e79
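kfvvde.volnum:                        1 ; 0x4c8: 0x0001   -- the volume number, starting from 1
kfvvde.avddgnum:                     41 ; 0x4ca: 0x0029   -- meaning unclear to me; note that it matches the -41 suffix of the volume device /dev/asm/acfs_v1-41
kfvvde.extentsz:                     64 ; 0x4cc: 0x00000040  -- the ADVM extent size, somewhat like the extent concept in a database.
There are 4 stripe columns and the resize unit is 256 MB, so the value here is 256 / 4 = 64.
kfvvde.volstate:                      2 ; 0x4d0: D=0 C=1 R=0 -- the ADVM volume state; 2 presumably means the volume is available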
kfvvde.pad[0]:                        0 ; 0x4d1: 0x00
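
The relationship between extentsz, the stripe columns and the resize unit, and the D/C/R breakdown that kfed prints for volstate, can be sketched as follows. Two things here are assumptions: that extentsz is expressed in MB (which is what the 256 / 4 = 64 reasoning above implies), and that D, C and R are bits 0, 1 and 2 of the state byte (inferred only from the single sample above, where the value 2 prints as D=0 C=1 R=0). The helper names are hypothetical.

```python
def resize_unit_mb(extent_size_mb, stripe_columns):
    """Resize unit = one extent per stripe column (assumes extent size in MB)."""
    return extent_size_mb * stripe_columns


def decode_volstate(value):
    """Split the volstate byte into the D/C/R flags that kfed prints.

    ASSUMPTION: D is bit 0, C is bit 1, R is bit 2. This is inferred from
    one sample (2 -> D=0 C=1 R=0) and is not taken from Oracle documentation.
    """
    return {"D": value & 1, "C": (value >> 1) & 1, "R": (value >> 2) & 1}


# Values taken from the kfed dump and the volinfo output in this article.
print(resize_unit_mb(64, 4))
print(decode_volstate(2))
```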
The kfvvde.drlfnum.number field above refers to dirty region logging. I could not find any ASM material on it, but a Google search shows that Veritas Volume Manager has a similar mechanism.

Dirty region logging (DRL) is a fault recovery mechanism used in Veritas Volume Manager. If DRL is enabled, speeds recovery of
mirrored volumes after a system crash. DRL keeps track of the regions that have changed due to I/O writes to a mirrored volume.
DRL uses this information to recover only those portions of the volume that need to be recovered.

If DRL is not used and a system failure occurs, all mirrors of the volumes must be restored to a consistent state. Restoration is
done by copying the full contents of the volume between its mirrors. This process can be lengthy and I/O intensive. It may also be
necessary to recover the areas of volumes that are already consistent.

Dirty Region Logs
DRL logically divides a volume into a set of consecutive regions, and maintains a log on disk where each region is represented by
a status bit. This log records regions of a volume for which writes are pending. Before data is written to a region, DRL synchronously
marks the corresponding status bit in the log as dirty. To enhance performance, the log bit remains set to dirty until the region
becomes the least recently accessed for writes. This allows writes to the same region to be written immediately to disk if the
region’s log bit is set to dirty.

On restarting a system after a crash, VxVM recovers only those regions of the volume that are marked as dirty in the dirty region log.

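Judging from this description, my guess is that DRL in Oracle ASM ADVM is the same as, or at least similar to, the DRL mechanism in Veritas volume management. A dirty region log,
as the name suggests, records which regions of the volume have pending (dirty) writes. It speeds up recovery after a system crash and helps guard against problems caused by interrupted I/O writes.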
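
To make the idea concrete, here is a toy model of a dirty region log in Python. It is only a sketch of the general technique described in the Veritas text quoted above, not of how ADVM actually implements DRL; the class name, the region size and the sizes in MB are all made up for illustration. It also clears the dirty bit as soon as a write completes, whereas the quoted text says real implementations keep the bit set longer for performance.

```python
class DirtyRegionLog:
    """Toy dirty region log for a mirrored volume (illustrative only)."""

    def __init__(self, volume_size, region_size):
        self.region_size = region_size
        n_regions = (volume_size + region_size - 1) // region_size
        self.dirty = [False] * n_regions  # one status bit per region

    def _regions_for(self, offset, length):
        first = offset // self.region_size
        last = (offset + length - 1) // self.region_size
        return range(first, last + 1)

    def begin_write(self, offset, length):
        # Mark the regions dirty BEFORE the data goes to any mirror.
        for r in self._regions_for(offset, length):
            self.dirty[r] = True

    def end_write(self, offset, length):
        # Simplification: clear the bit once all mirrors hold the data.
        for r in self._regions_for(offset, length):
            self.dirty[r] = False

    def regions_to_recover(self):
        # After a crash, only dirty regions need to be resynchronized.
        return [i for i, d in enumerate(self.dirty) if d]


# Hypothetical 1024 MB volume tracked in 64 MB regions (16 regions).
drl = DirtyRegionLog(volume_size=1024, region_size=64)
drl.begin_write(0, 10)
drl.end_write(0, 10)      # this write completed on all mirrors
drl.begin_write(120, 20)  # this write was in flight when the system crashed

todo = drl.regions_to_recover()
print(f"resync {len(todo)} of {len(drl.dirty)} regions: {todo}")
```

Without the log, all 16 regions would have to be copied between mirrors after the crash; with it, only the regions touched by the in-flight write are resynchronized.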

In fact this is not limited to Veritas; other vendors have similar technology. I found the following similar description in the Sun Cluster 2.2 Cluster Volume Manager Guide:
Dirty Region Logging (DRL) is an optional property of a volume that provides speedy recovery of mirrored volumes after a system failure.
Dirty Region Logging is supported in cluster-shareable disk groups. This section provides a brief overview of DRL and outlines differences
between SSVM DRL and the CVM implementation of DRL. For more information on DRL, refer to Chapter 1 in the applicable Sun StorEdge
Volume Manager 2.6 System Administrator’s Guide.

DRL keeps track of the regions that have changed due to I/O writes to a mirrored volume and uses this information to recover only
the portions of the volume that need to be recovered. DRL logically divides a volume into a set of consecutive regions and maintains
a dirty region log that contains a status bit representing each region of the volume. Log subdisks store the dirty region log of a
volume that has DRL enabled. A volume with DRL has at least one log subdisk, that is associated with one of the volume’s plexes.

For details, see Oracle's official documentation: http://docs.oracle.com/cd/E19957-01/806-2329/ch2admin-39382/index.html

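Finally, a brief summary:

1) 11gR2 introduced the ACFS file system together with ADVM volume management, which works much like third-party volume managers such as Veritas and Solaris Cluster.
2) ASM in 11gR2 is more capable than in earlier versions. It can store many kinds of files: not only datafiles but also external files, text files and so on.
3) ACFS administration is not complicated, and the ADVM metadata structure is fairly simple. It has 3 parts: the first is the block header, the second is the pointer information, and the third is
the definition of the ADVM volume.
4) $GRID_HOME/bin contains several ACFS tools that ship with the grid software and can be used to manage and monitor ACFS.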
