After a migration of the database to a faster machine, the database was slow and appeared to hang. The cause was a large and slow LOB segment. The following bug was the reason for this behaviour:
Bug 6376915: ENQ: HW - CONTENTION WITH LOB SEGMENTS.
The bigger the LOB segment, the more concurrent CPUs, and the more read and write I/O, the bigger the problem.
Oracle LOBs + ASSM + many CPUs = enq: HW – contention
Our new hardware architecture had significantly more cores and threads per CPU. The problem already existed with the old hardware, but the new hardware made it much worse. Take a look at the statspack report:
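For reference, a report like the one below can be produced with the standard statspack reporting script (assuming statspack is installed and snapshots are being taken); from SQL*Plus run:

@?/rdbms/admin/spreport.sql

and pick the begin and end snapshot when prompted.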
Top 5 Timed Events                                                 Avg %Total
~~~~~~~~~~~~~~~~~~                                                wait   Call
Event                                 Waits    Time (s)   (ms)   Time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
enq: HW - contention                224,968     658,323   2926   93.1 Configurat
db file sequential read           1,028,687      22,611     22    3.2 User I/O
CPU time                                          20,024           2.8
read by other session               191,140       2,677     14    0.4 User I/O
direct path read                    379,201       2,261      6    0.3 User I/O
Wait Events
-> s - second
...
                                                                  Avg
                                             %Time  Total Wait   wait     Waits
Event                                 Waits  -outs    Time (s)   (ms)      /txn
---------------------------- -------------- ------ ----------- ------- ---------
enq: HW - contention                224,968   99.9     658,323    2926       1.9
db file sequential read           1,028,687     .0      22,611      22       8.7
...
Enqueue Activity
-> only enqueues with waits are shown
-> Enqueue stats gathered prior to 10g should not be compared with 10g data
-> ordered by Wait Time desc, Waits desc

Enqueue Type (Request Reason)
------------------------------------------------------------------------------
    Requests    Succ Gets Failed Gets       Waits  Wt Time (s) Av Wt Time(ms)
------------ ------------ ----------- ----------- ------------ --------------
HW-Segment High Water Mark
      12,154        8,437       3,398          23       16,624     722,796.87
RO-Multiple Object Reuse (fast object reuse)
         333          333           0          37            8         223.78
TX-Transaction (index contention)
          23           23           0          23            2          70.52
The problem was the movement of the high water mark, i.e. the allocation of space above the high water mark. It happens in chunks that are too small and in many concurrent steps, causing contention. This only happens in tablespaces with automatic segment space management (ASSM). More CPU cores and threads compound the problem and lead to more contention wait events.
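To confirm which segment is behind the waits while the problem is happening, the HW enqueue parameters of the waiting sessions can be decoded. A minimal sketch, assuming the usual parameter meaning for this wait event (p2 = tablespace number, p3 = data block address of the segment header block):

-- decode the block the waiting sessions are fighting over
SELECT s.sid,
       dbms_utility.data_block_address_file(s.p3)  AS file_id,
       dbms_utility.data_block_address_block(s.p3) AS block_id
  FROM v$session s
 WHERE s.event = 'enq: HW - contention';

-- map the decoded file/block to the owning segment (the LOB segment in our case)
SELECT owner, segment_name, segment_type
  FROM dba_extents
 WHERE file_id = &file_id
   AND &block_id BETWEEN block_id AND block_id + blocks - 1;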
The solution is to move the high water mark manually, in bigger steps than the database engine would use on its own. One step should cover the space needed for one to ten days and relieves the database of the allocation work for the segment:
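A minimal sketch of such a manual allocation; the table name, LOB column and extent size below are hypothetical placeholders, and the size has to be derived from how much the segment actually grows per day:

-- pre-allocate space for the LOB segment in one big step
-- (my_table, my_lob_col and SIZE 10G are placeholders)
ALTER TABLE my_table MODIFY LOB (my_lob_col) (ALLOCATE EXTENT (SIZE 10G));

Done once per day or per week, this keeps the high water mark ahead of the load instead of letting every session fight over the HW enqueue for each small chunk.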
But be careful:
Bug 9711859 ORA-600 [ktsptrn_fix-extmap] during extent allocation caused by bug 8198906
ORA-600 [ktsptrn_fix-extmap] caused by a corruption in a LOB segment stored in an ASSM tablespace when there is a manual extent allocation through:
ALTER TABLE ALLOCATE EXTENT.
ERROR at line 1:
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kddummy_blkchk], [22], [1858570], [18009], [], [], [], []
If you see this error, it is the end of the extent allocation mechanism for this segment: it cannot allocate any more extents. There is a patch for this bug, but it only fixes the software defect. The segment itself stays broken; it has to be migrated to a new segment and the old one dropped afterwards.
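One possible way to do that migration is to move the LOB storage into a fresh segment, assuming the LOB data itself is still readable; table, column, index and tablespace names below are hypothetical placeholders:

-- rebuild the LOB in a new segment in another tablespace;
-- the old LOB segment is released as part of the move
ALTER TABLE my_table MOVE LOB (my_lob_col) STORE AS (TABLESPACE new_lob_ts);

-- indexes on the table become UNUSABLE after the move and need a rebuild
ALTER INDEX my_table_pk REBUILD;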