Database Version: 9.2.0.8
OS: AIX

Recently I faced a huge resource utilization on a Production environment. The cpu utilization was in a constant range above 90%. Looking at active session query found out PX wait ( PX Deq: Execute Reply, PX Deq: Table Q Normal, PX Deq: Execution Msg ) in a huge amount.

@sess_wait_hash

EVENT                       SQL_HASH_VALUE NAME          CNT
--------------------------- -------------- ----------- -----
PX Deq: Execute Reply            112922866                 1
PX Deq: Execution Msg            251445661                 8
PX Deq: Execution Msg            251445661                 8
PX Deq: Execution Msg            251445661                 8
PX Deq: Execution Msg            251445661                 2
PX Deq: Execution Msg            251445661                 6
PX Deq: Execution Msg            251445661                 8
PX Deq: Execution Msg            251445661                24
db file sequential read          338604825                 1
buffer busy waits                338604825                 1
db file sequential read          338604825                 1
db file sequential read          338604825                 1
db file sequential read          365717211                 1
db file sequential read          376953346                 1
db file scattered read           551940432                 1
PX Deq: Execution Msg            624512707                 1
PX Deq: Execution Msg            624512707                 1
direct path read                 624512707                 1
db file sequential read          624512707                 1
db file sequential read          624512707                 1
db file sequential read          624512707                 1
db file sequential read          624512707                 1
db file sequential read          624512707                 1
db file sequential read          624512707                 1
db file sequential read          624512707                 1
db file sequential read          624512707                 1
db file sequential read          624512707                 1
db file sequential read          624512707                 1
direct path read                 624512707                 1
db file sequential read          624512707                 1
db file sequential read          624512707                 1
db file sequential read          624512707                 1
db file sequential read          624512707                 1
db file sequential read          624512707                 1
db file sequential read          624512707                 1
db file sequential read          624512707                 1
db file sequential read          624512707                 1
db file sequential read          624512707                 1
db file sequential read          624512707                 1
db file sequential read          624512707                 1
db file sequential read          624512707                 1
db file sequential read          624512707                 1
db file sequential read          624512707                 1
db file sequential read          624512707                 1
db file sequential read          624512707                 1
				.
				.
				.
PX Deq: Execute Reply           1443065605                 1
PX Deq: Execute Reply           1443065605                 1
PX Deq: Execute Reply           1443065605                 1
PX Deq: Execute Reply           1443065605                 1
PX Deq: Execute Reply           1443065605                 2
PX Deq: Execute Reply           1443065605                 1
PX Deq: Execute Reply           1443065605                 1
db file sequential read         4029308862                 1

Well looking into statspack report found out “PX Deq: Execute Reply” in top 2 waits for almost every snap period. Started looking out for those query which was addressing these events.

Started with the first one,

hash_value => 251445661

SQL> select hash_value, sql_text from v$sql where hash_value=251445661;

HASH_VALUE SQL_TEXT
---------- -----------------------------------------------------
251445661 SELECT A1.C0 C0 FROM :Q1086010000 A1 ORDER BY A1.C0

Didn’t get much clue from this statement. So went for the second one.

hash_value => 624512707

HASH_VALUE SQL_TEXT
---------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------
624512707 SELECT /*+ Q1086010000 NO_EXPAND ROWID(A1) */ A1."COD_ACCT_NO" C0 FROM "FCRHOST"."CH_ACCT_MAST" PX_GRANULE(0, BLOCK_RANGE, DYNAMIC)  A1 WHERE A1."FLG_MNT_STATUS"=:B1 AND A1."COD_LAST_MNT_MAKERID"=:B2

explain for the sql text,

        ID  PARENT_ID OPERATION                      OBJECT_NAME             COST CARDINALITY
---------- ---------- ------------------------------ ----------------- ---------- -----------
         0            SELECT STATEMENT                                      29284
         1          0 GRANULE ITERATOR ROWID RANGE
         2          1 TABLE ACCESS BY ROWID RANGE    CH_ACCT_MAST           29284        2652

hmmm well got something at least. ok so lets go for next one,

hash_value => 1443065605

SQL> select hash_value, sql_text from v$sql where hash_value=1443065605;

HASH_VALUE SQL_TEXT
---------- ----------------------------------------------------------------------------------------------------------------------------------------------
1443065605 Select cod_acct_no  "Account Number" from ch_acct_mast where flg_mnt_status=:"SYS_B_0"and cod_last_mnt_makerid=:"SYS_B_1" order by cod_acct_no

        ID  PARENT_ID OPERATION              OBJECT_NAME            COST CARDINALITY
---------- ---------- ---------------------- ---------------- ---------- -----------
         0            SELECT STATEMENT                              3665
         1          0 SORT ORDER BY                                 3665           1
         2          1 PARTITION RANGE ALL
         3          2 TABLE ACCESS FULL      CH_ACCT_MAST           3661           1

4 rows selected.

Well looking at all above query, a common thing observed was segment CH_ACCT_MAST table. For all PX wait, this segment was the major culprit which made me to lock deep into. Had a chat with the development team and found out this was among the major tables. Looked for the table structure,

SQL> select NUM_ROWS, DEGREE, SAMPLE_SIZE, LAST_ANALYZED from dba_tables where table_name='CH_ACCT_MAST';

  NUM_ROWS DEGREE     SAMPLE_SIZE LAST_ANAL
---------- ---------- ----------- ---------
   2481998          8      992799 07-AUG-11

The degree of the table was set to 8, which made it to parallely access the block. One thing that came in my mind was to change the degree to 1 and then check.

alter table FCRHOST.CH_ACCT_MAST noparallel;

SQL> select NUM_ROWS, DEGREE, SAMPLE_SIZE, LAST_ANALYZED from dba_tables where table_name='CH_ACCT_MAST';

  NUM_ROWS DEGREE     SAMPLE_SIZE LAST_ANAL
---------- ---------- ----------- ---------
   2481998          1      992799 07-AUG-11

The utilization didn’t reduce on the fly. The optimizer for the current running query was considering the degree as 8 so was going for “PX Deq: Execute Reply” for the existing session. But then the resource utilization gradually started reducing. Later then the PX wait was not observed in any of the statspack report.

The total session count was around 1200 and the active session was around 80 to 100 before changes.

SQL> @last_call

  SID SPID         USERNAME     LOGONDATE          OSUSER      STATUS   MODULE                           LAST_CALL_ET
----- ------------ ------------ ------------------ ----------- -------- -------------------------------- ------------
  643 3174526      FCRHOST      10-08 17:06:36     orabdb1     ACTIVE   oracle@p57011A (TNS V1-V3)               2005
  365 9167048      FCRHOST      10-08 17:06:36     orabdb1     ACTIVE   oracle@p57011A (TNS V1-V3)               2005
  728 9285638      FCRHOST      10-08 17:06:36     orabdb1     ACTIVE   oracle@p57011A (TNS V1-V3)               2005
  483 3928154      FCRHOST      10-08 17:06:36     orabdb1     ACTIVE   oracle@p57011A (TNS V1-V3)               2005
  361 9867444      FCRHOST      10-08 17:16:14     user1       ACTIVE   dllhost7.exe                             1414
   87 8712240      FCRHOST      10-08 17:18:46     user1       ACTIVE   dllhost7.exe                             1275
  654 6840380      FCRHOST      10-08 17:18:46     user1       ACTIVE   dllhost7.exe                             1275
 2026 7516224      FCRHOST      10-08 17:18:46     user1       ACTIVE   dllhost7.exe                             1275
 1894 3194890      FCRHOST      10-08 17:18:46     user1       ACTIVE   dllhost7.exe                             1275
.																.
 1183 5259468      FCRHOST      10-08 17:18:46     user1       ACTIVE   dllhost7.exe                             1275
  243 5279874      FCRHOST      10-08 17:18:46     user1       ACTIVE   dllhost7.exe                             1275
 1692 9523290      FCRHOST      10-08 17:18:46     user1       ACTIVE   dllhost7.exe                             1275
 1235 7114776      FCRHOST      10-08 17:20:06     user1       ACTIVE   dllhost7.exe                             1101
 1121 8740888      FCRHOST      10-08 17:23:54     user1       ACTIVE   dllhost7.exe                              967
  312 8208586      FCRHOST      10-08 17:23:54     user1       ACTIVE   dllhost7.exe                              967
  993 9633796      FCRHOST      10-08 17:23:54     user1       ACTIVE   dllhost7.exe                              967
  636 9900162      FCRHOST      10-08 17:23:54     user1       ACTIVE   dllhost7.exe                              967
							.
  698 5849132      FCRHOST      10-08 17:23:54     user1       ACTIVE   dllhost7.exe                              967
 1570 3366932      FCRHOST      10-08 17:23:54     user1       ACTIVE   dllhost7.exe                              967
 1270 7934128      FCRHOST      10-08 17:23:54     user1       ACTIVE   dllhost7.exe                              967
 1412 8261864      FCRHOST      10-08 17:23:54     user1       ACTIVE   dllhost7.exe                              967
 1723 7975126      FCRHOST      10-08 17:23:54     user1       ACTIVE   dllhost7.exe                              967
 1519 7139466      FCRHOST      10-08 17:24:06     user1       ACTIVE   dllhost7.exe                              947
							.
 1565 5406894      FCRHOST      10-08 17:27:05     user1       ACTIVE   dllhost7.exe                              776
  974 10002540     FCRHOST      10-08 17:27:05     user1       ACTIVE   dllhost7.exe                              776
 1767 5025922      FCRHOST      10-08 17:27:05     user1       ACTIVE   dllhost7.exe                              776
 1616 9642074      FCRHOST      10-08 17:27:05     user1       ACTIVE   dllhost7.exe                              776
  595 5181608      FCRHOST      10-08 17:34:23     user1       ACTIVE   JDBC Thin Client                          338
  321 1503428      FCRHOST      10-08 17:35:47     user1       ACTIVE   dllhost7.exe                              177
 1766 3956864      FCRHOST      10-08 16:31:53     orabdb1     ACTIVE   oracle@p57011A (TNS V1-V3)                 42
 1975 7815352      FCRHOST      10-08 17:35:24     user1       ACTIVE   dllhost.exe                                31
  128 2871532      FCRHOST      10-08 17:39:44     user1       ACTIVE   dllhost.exe                                12
  261 8384530      FCRHOST      10-08 17:35:33     oracle      ACTIVE   oracle@X36601 (TNS V1-V3)                   9
 1756 4186120      FCRHOST      10-08 17:39:43     user1       ACTIVE   dllhost.exe                                 0
 1562 4132940      FCRHOST      10-08 13:37:43     SYSTEM      ACTIVE   oracle@p57011A (TNS V1-V3)                  0
  382 7737388      FCRHOST      10-08 15:06:37     SYSTEM      ACTIVE   oracle@p57011A (TNS V1-V3)                  0
  947 2105378      FCRHOST      10-08 09:58:16     SYSTEM      ACTIVE   PosService.exe                              0
  921 4595712      SYS          10-08 15:39:33     ora9ifcr    ACTIVE   sqlplus@p57011B (TNS V1-V3)                 0

84 rows selected.

This then gradually reduced to 20 – 30 active session.

SQL> @last_call

  SID SPID         USERNAME      LOGONDATE          OSUSER        STATUS   MODULE                         LAST_CALL_ET
----- ------------ ------------- ------------------ ------------- -------- ------------------------------ ------------
  261 8384530      FCRHOST       10-08 17:35:33     oracle        ACTIVE   oracle@X36601 (TNS V1-V3)               288
  526 5939426      FCRHOST       10-08 17:40:52     orabdb1       ACTIVE   oracle@p57011A (TNS V1-V3)              228
 1925 2883608      FCR_ALERTS    10-08 17:41:28                   ACTIVE                                           192
 1982 5668986      FCRHOST       10-08 17:40:02     oracle        ACTIVE   oracle@X36601 (TNS V1-V3)               108
  248 7995432      FCRHOST       10-08 17:43:18     user1         ACTIVE   dllhost.exe                              31
  682 9953450      FCRHOST       10-08 17:42:28     oracle        ACTIVE   oracle@X36601 (TNS V1-V3)                26
  243 3010692      FCRHOST       10-08 17:44:14     user1         ACTIVE   dllhost7.exe                             16
   78 3440676      FCRHOST       10-08 17:43:03     user1         ACTIVE   dllhost.exe                              12
 1254 6750266      FCRHOST       10-08 17:44:04     user1         ACTIVE   dllhost.exe                              11
  452 9298112      FCRHOST       10-08 17:40:26     user1         ACTIVE   dllhost.exe                              10
 1975 7815352      FCRHOST       10-08 17:35:24     user1         ACTIVE   dllhost.exe                               1
 1741 7876842      FCRHOST       10-08 17:44:39     SYSTEM        ACTIVE   JDBC Thin Client                          1
  947 2105378      FCRHOST       10-08 09:58:16     SYSTEM        ACTIVE   PosService.exe                            1
  921 4595712      SYS           10-08 15:39:33     ora9ifcr      ACTIVE   sqlplus@p57011B (TNS V1-V3)               0
  835 6664196      FCRHOST       10-08 13:27:32     SYSTEM        ACTIVE   oracle@p57011A (TNS V1-V3)                0
 1586 3707046      FCRHOST       10-08 17:43:12     user1         ACTIVE   dllhost.exe                               0
  293 827422       FCRHOST       10-08 17:36:58     user1         ACTIVE   dllhost.exe                               0

17 rows selected.

The session query which was taking around 30 mins for the complete process was then reduced to less than 5 mins. The time consumption was then reduced to a great extend.

Advertisements