After a migration of the database to a faster machine, the database was slow and appeared to hang. The cause was a large and slow LOB segment. The following bug was the reason for this behaviour:
Bug 6376915: ENQ: HW - CONTENTION WITH LOB SEGMENTS.
The bigger the LOB segment, the more concurrent CPUs, and the more read and write I/O, the bigger the problem.
Oracle LOBs + ASSM + many CPUs = enq: HW – contention
Our new hardware architecture had significantly more cores and threads per CPU. The problem already existed with the old hardware, but the new hardware made it much worse. Take a look at the statspack report:
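For reference, a report like the one below can be produced with the standard statspack reporting script (assuming statspack is installed and snapshots are being taken); from SQL*Plus run:

@?/rdbms/admin/spreport.sql

and pick the begin and end snapshot when prompted.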
Top 5 Timed Events                                                 Avg %Total
~~~~~~~~~~~~~~~~~~                                                wait   Call
Event                                 Waits    Time (s)   (ms)   Time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
enq: HW - contention                224,968     658,323   2926   93.1 Configurat
db file sequential read           1,028,687      22,611     22    3.2 User I/O
CPU time                                          20,024           2.8
read by other session               191,140       2,677     14    0.4 User I/O
direct path read                    379,201       2,261      6    0.3 User I/O
Wait Events
-> s - second
...
                                                                  Avg
                                             %Time  Total Wait   wait     Waits
Event                                 Waits  -outs    Time (s)   (ms)      /txn
---------------------------- -------------- ------ ----------- ------- ---------
enq: HW - contention                224,968   99.9     658,323    2926       1.9
db file sequential read           1,028,687     .0      22,611      22       8.7
...
Enqueue Activity
-> only enqueues with waits are shown
-> Enqueue stats gathered prior to 10g should not be compared with 10g data
-> ordered by Wait Time desc, Waits desc

Enqueue Type (Request Reason)
------------------------------------------------------------------------------
    Requests    Succ Gets Failed Gets       Waits  Wt Time (s) Av Wt Time(ms)
------------ ------------ ----------- ----------- ------------ --------------
HW-Segment High Water Mark
      12,154        8,437       3,398          23       16,624     722,796.87
RO-Multiple Object Reuse (fast object reuse)
         333          333           0          37            8         223.78
TX-Transaction (index contention)
          23           23           0          23            2          70.52
The problem was the movement of the high water mark, i.e. the allocation of space above the high water mark. It happens in chunks that are too small and in many concurrent steps, causing contention. This only happens in tablespaces with automatic segment space management (ASSM). More CPU cores and threads compound the problem and lead to more contention wait events.
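To confirm which segment is behind the waits while the problem is happening, the HW enqueue parameters of the waiting sessions can be decoded. A minimal sketch, assuming the usual parameter meaning for this wait event (p2 = tablespace number, p3 = data block address of the segment header block):

-- decode the block the waiting sessions are fighting over
SELECT s.sid,
       dbms_utility.data_block_address_file(s.p3)  AS file_id,
       dbms_utility.data_block_address_block(s.p3) AS block_id
  FROM v$session s
 WHERE s.event = 'enq: HW - contention';

-- map the decoded file/block to the owning segment (the LOB segment in our case)
SELECT owner, segment_name, segment_type
  FROM dba_extents
 WHERE file_id = &file_id
   AND &block_id BETWEEN block_id AND block_id + blocks - 1;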
The solution is to move the high water mark manually, in bigger steps than the database engine would use on its own. One step should cover the space needed for one to ten days and relieves the database of the allocation work for the segment:
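A minimal sketch of such a manual allocation; the table name, LOB column and extent size below are hypothetical placeholders, and the size has to be derived from how much the segment actually grows per day:

-- pre-allocate space for the LOB segment in one big step
-- (my_table, my_lob_col and SIZE 10G are placeholders)
ALTER TABLE my_table MODIFY LOB (my_lob_col) (ALLOCATE EXTENT (SIZE 10G));

Done once per day or per week, this keeps the high water mark ahead of the load instead of letting every session fight over the HW enqueue for each small chunk.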
But be careful:
Bug 9711859 ORA-600 [ktsptrn_fix-extmap] during extent allocation caused by bug 8198906
ORA-600 [ktsptrn_fix-extmap] caused by a corruption in a LOB segment stored in an ASSM tablespace when there is a manual extent allocation through:
ALTER TABLE ALLOCATE EXTENT.
ERROR at line 1:
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kddummy_blkchk], [22], [1858570], [18009], [], [], [], []
If you see this error, it is the end of the extent allocation mechanism for this segment: it cannot allocate any more extents. There is a patch for this bug, but it only fixes the software defect. The segment itself stays broken; it has to be migrated to a new segment and the old one dropped afterwards.
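One possible way to do that migration is to move the LOB storage into a fresh segment, assuming the LOB data itself is still readable; table, column, index and tablespace names below are hypothetical placeholders:

-- rebuild the LOB in a new segment in another tablespace;
-- the old LOB segment is released as part of the move
ALTER TABLE my_table MOVE LOB (my_lob_col) STORE AS (TABLESPACE new_lob_ts);

-- indexes on the table become UNUSABLE after the move and need a rebuild
ALTER INDEX my_table_pk REBUILD;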