love wife love life —Roger的Oracle/MySQL/PostgreSQL数据恢复博客

Phone:18180207355 提供专业Oracle/MySQL/PostgreSQL数据恢复、性能优化、迁移升级、紧急救援等服务

关于parallel rollback的一点总结

本站文章除注明转载外,均为本站原创: 转载自love wife love life —Roger的Oracle/MySQL/PostgreSQL数据恢复博客

本文链接地址: 关于parallel rollback的一点总结

3 Responses to “关于parallel rollback的一点总结”

  1. roger Says:

    DATABASE HANG DUE TO PARALLEL TRANSACTION RECOVERY [ID 464246.1]

    ——————————————————————————–

    修改时间 18-OCT-2010 类型 PROBLEM 状态 PUBLISHED

    In this Document
    Symptoms
    Cause
    Solution

    ——————————————————————————–

    Applies to:
    Oracle Server – Enterprise Edition – Version: 9.2.0.1 to 10.2.0.4 – Release: 9.2 to 10.2
    Information in this document applies to any platform.

    Symptoms
    Database is hanging. Undo tablespace is growing.

    The hang encountered because of parallel transaction recovery as the systemstate dump shows the significant waits for “Wait for a undo record” and “Wait for stopper event to be increased”.

    Cause
    The systemstate dump shows the following waitevents:

    SO: 70000008c9de498, type: 4, owner: 70000008c626cc0, flag: INIT/-/-/0x00
    (session) sid: 1099 trans: 0, creator: 70000008c626cc0, flag: (51) USR/- BSY/-/-/-/-/-
    DID: 0001-0008-00000003, short-term DID: 0000-0000-00000000
    txn branch: 0
    oct: 0, prv: 0, sql: 0, psql: 0, user: 0/SYS
    waiting for ‘wait for stopper event to be increased’ blocking sess=0x0 seq=82 wait_time=0
    seconds since wait started=156
    =0, =0, =0
    Dumping Session Wait History
    for ‘wait for stopper event to be increased’ count=1 wait_time=97720 =0, =0, =0
    for ‘wait for stopper event to be increased’ count=1 wait_time=97681 =0, =0, =0
    for ‘wait for stopper event to be increased’ count=1 wait_time=97676 =0, =0, =0
    for ‘wait for stopper event to be increased’ count=1 wait_time=97676 …

    SO: 70000008c9d6a58, type: 4, owner: 70000008c628460, flag: INIT/-/-/0x00
    (session) sid: 1087 trans: 0, creator: 70000008c628460, flag: (41) USR/- BSY/-/-/-/-/-
    DID: 0001-000E-00000004, short-term DID: 0000-0000-00000000
    txn branch: 0
    oct: 0, prv: 0, sql: 0, psql: 0, user: 0/SYS
    O/S info: user: oracle, term: UNKNOWN, ospid: 2998446, machine: ct-db3
    program: oracle@ct-db3 (P000)
    waiting for ‘wait for a undo record’ blocking sess=0x0 seq=69 wait_time=0 seconds since wait
    started=243 =0, =0, =0
    Dumping Session Wait History
    for ‘wait for a undo record’ count=1 wait_time=97677 =0, =0, =0
    for ‘wait for a undo record’ count=1 wait_time=97675 =0, =0, =0
    for ‘wait for a undo record’ count=1 wait_time=97673 =0, =0, =0
    for ‘wait for a undo record’ count=1 wait_time=97682 =0, =0, =0
    for ‘wait for a undo record’ count=1 wait_time=97679 …

    The above 2 wait events are mainly meant for parallel transaction recovery.

    Solution
    Solution

    ======

    To disable the parallel rollback by setting the following parameter

    fast_start_parallel_rollback = false

    Explanation

    ========

    Sometimes Parallel Rollback of Large Transaction may become very slow. After killing a large running transaction (either by killing the shadow process or aborting the database) then database seems to hang, or SMON and parallel query servers taking all the available CPU.

    In fast-start parallel rollback, the background process SMON acts as a coordinator and rolls back a set of transactions in parallel using multiple server processes.

    Fast start parallel rollback is mainly useful when a system has transactions that run a long time before a commit, especially parallel Inserts, Updates, Deletes operations. When SMON discovers that the amount of recovery work is above a certain threshold, it automatically begins parallel rollback by dispersing the work among several parallel processes.

    There are cases where parallel transaction recovery is not as fast as serial transaction recovery, because the PQ slaves are interfering with each other. It looks like the changes made by this transaction cannot be recovered in parallel without causing a performance problem. The parallel rollback slave processes are most likely contending for the same resource, which results in even worse rollback performance compared to a serial rollback.

  2. roger Says:

    How to Disable Parallel Transaction Recovery When Parallel Txn Recovery is Active [ID 238507.1]

    ——————————————————————————–

    修改时间 29-APR-2010 类型 PROBLEM 状态 PUBLISHED

    Checked for relevance on 18-Sep-2008
    Checked for relevance on 29-April-2010

    Symptom(s)
    ~~~~~~~~~~

    Parallel Transaction Recovery is taking too long.

    You can use V$TRANSACTION USED_UBLK to estimate how long the rollback is going
    to take but there is no formula for this. If you shutdown the database after
    rollback has started, it will begin where it left off.

    You can also look at V$FAST_START_TRANSACTIONS for UNDOBLOCKSDONE
    versus UNDOBLOCKSTOTAL.

    Change(s)
    ~~~~~~~~~~

    A large transaction got killed or rolled back.

    Cause
    ~~~~~~~

    There are cases where parallel transaction recovery is not as fast as serial
    transaction recovery, because the pq slaves are interfering with each other.
    This depends mainly on the type of changes that need to be made during rollback
    and usually may happen when rolling back INDEX Updates in parallel.

    Fix
    ~~~~

    Dynamically switch from parallel recovery to serial. If you are in a clustered
    environment you will need to do this on all instances at the same time:

    1. Find SMON’s Oracle PID:

    Example:

    SQL> select pid, program from v$process where program like ‘%SMON%’;

    PID PROGRAM
    ———- ————————————————
    6 oracle@stsun7 (SMON)

    2. Disable SMON transaction cleanup:

    SVRMGR> oradebug setorapid
    SVRMGR> oradebug event 10513 trace name context forever, level 2

    3. Kill the PQ slaves that are doing parallel transaction recovery.
    You can check V$FAST_START_SERVERS to find these.

    4. Turn off fast_start_parallel_rollback:

    alter system set fast_start_parallel_rollback=false;

    If SMON is recovering, this command might hang, if it does just control-C out of it. You may need to try this many times to get this to complete (between SMON cycles).

    5. Re-enable SMON txn recovery:

    SVRMGR> oradebug setorapid
    SVRMGR> oradebug event 10513 trace name context off

    References
    ~~~~~~~~~~~
    NOTE 144332.1
    Parallel Rollback may hang database, Parallel query servers get 100% cpu

  3. 兜兜 Says:

    这个实验如果添加 insert into select @dblink ,也许就能看到 并行恢复慢了,我也只是猜想!

    这里有一个AWR ,和这个有关 http://www.itpub.net/thread-1814543-1-1.html

Leave a Reply

You must be logged in to post a comment.