We recently upgraded 10g Database from 10.2.0.4 to 10.2.0.5 psu patch 10. Upgradation was perfect, but unfortunately we faced an issue. The auto AWR snapshot was not generating. If we try to go for manual snapshot, it use to get hanged. By hang I mean to say stuck at command prompt. We had raised an SR with Oracle for this issue. Took a long while to identify and resolve. I thought why not share this issue with resolution.
AWR snapshot was not generated. With manual snap creation the execution used to get hanged.
They asked for 10046 trace. To take trace for manual snap creation “execute dbms_workload_repository.create_snaphot();”
SQL> oradebug setospid 25807 -- ospid SQL> oradebug unlimit SQL> oradebug Event 10046 trace name context forever, level 12 -- waiting for few minutes SQL> oradebug Event 10046 trace name context off SQL> oradebug dump errorstack 3 -- waiting for few minutes SQL> oradebug dump errorstack 3 SQL> exit
oradebug level 3 errorstack provides the following information:
Call Stack Trace
Files Currently Opened
Process State for the pid
Pinned Buffer History
Fixed PGA, SGA and UGA
In memory trace dump
Trace file shows:
application name: sqlplus.exe, hash value=254292535 waiting for 'enq: WF - contention' wait_time=0, seconds since wait started=1432 name|mode=57460006, 0=0, 0=0 blocking sess=0x000000057A706358 seq=37 Dumping Session Wait History for 'enq: WF - contention' count=1 wait_time=2.999985 sec name|mode=57460006, 0=0, 0=0
The session collecting snapshot is continually waiting for “enq: WF – contention”. So, the waiting was for this wait to flush the statistics to AWR. mmon trace shows, it error out while flushing the stats.
MMON trace are generated in bdump location.
MMON trace shows:
*** KEWRAFM1: Error=13518 encountered by Auto Flush Main. 13518, 00000, "Invalid database id (%s)" // *Cause: The operation failed because the specified database ID // does not exist in the Workload Repository. // *Action: check the database id and retry the operation.
The mmon trace provided some more hint. The error state that database ID is invalid as the database ID doesn’t exists.
The MMON process takes snapshots at regular intervals (default 60 minutes) and inserts that data into the AWR tables. These tables containing AWR information are under SYS schema. The AWR related tables all begin with “WR”:
SQL> select distinct dbid from WRM$_DATABASE_INSTANCE ; no row selected.
It has to show DB ID similar to the output for next statement. If it doesn’t, then in that case the DB ID needs to be populated in AWR tables i.e the missing DB ID.
SQL> select dbid from v$database;
WRM$_DATABASE_INSTANCE will be populated when instance is bounced. This is the case when we try to recreate AWR with the help of catnoawr.sql and catawrtb.sql. The DBID is not populated in the AWR tables. The Database needs to be bounced.
When a function / procedure is invoked from DBMS_WORKLOAD_REPOSITORY package, it refers to WRM$_DATABASE_INSTANCE table to get dbid information. If the AWR repositories are recreated, the table doesn’t have any data in it.
After a reboot of the database, the dbid was recorded in WRM$_DATABASE_INSTANCE table of sys. But this didn’t resolved our problem. We were still not able to generate AWR snaps.
Then we went for systemstate 266 trace. Collected system state dump level 266 for the session for creating manual snapshot.
sqlplus /nolog SQL> oradebug setmypid Statement processed. SQL> oradebug unlimit Statement processed. SQL> oradebug dump systemstate 266 Statement processed. -- waiting for few minutes SQL> oradebug dump systemstate 266 Statement processed. SQL> oradebug tracefile_name c:\oracle112\diag\rdbms\orcl\orcl\trace\orcl_ora_1136.trc SQL> exit
*** 2011-10-04 11:23:47.339 *** KEWRAFC: Flush slave failed, AWR Enqueue Timeout *** 2011-10-04 11:24:47.339 *** KEWRAFC: Flush slave failed, AWR Enqueue Timeout *** 2011-10-04 11:25:50.339
The trace shows the waiting is for flushing snapshots. Remedy for the problem is to disable the flushing of AWR data.
alter system set "_awr_disabled_flush_tables" = 'wrh$_tempstatxs';
Its a dynamic parameter, doesn’t require bounce of database.
For More: AUTOMATED WORKLOAD REPOSITORY