Monday, 7 November 2022

Troubleshooting Data Guard Log Transport Services in oracle

 Troubleshooting Data Guard Log Transport Services in oracle

Log Transport Services Overview:

Data Guard is using an Oracle NET-Connection from the Primary to the Standby Database. The Initialization Parameter ‘log_archive_dest_n’ is used for the Transport Services to a particular Standby Database. This Parameter consists of several Attributes to configure the Log Transport Services to the corresponding Standby Database.

If there is a log_archive_dest_n defined on the Primary Database which is pointing to a Standby Database (ie. the ‘SERVICE’-Attribute is used), the Primary Database tries to connect to the Standby Database using the Descriptor or TNS-Alias given in that Attribute. The LGWR-Process creates a Worker Process (LNS-, NSS-, NSA- or TTXX-Process) which is responsible for the Connection to the Standby Database and Transporting the Redo. The corresponding Server Process is a RFS-Process on the Standby Database. In Addition there is an ARCH-Process connected to the Standby Database responsible for the Heartbeat and Gap Resolution Mechanism.

Troubleshooting Steps:

 First of all verify if you can connect to the Standby Database at all. An easy Way to find out is simply open a SQL*PLUS-Session from the Primary Database ORACLE_HOME and try to connect to the Standby Database using the same Descriptor or TNS-Alias given in the ‘SERVICE’-Attribute of log_archive_dest_n:


SQL> connect sys/<Password>@<Standby Service> as sysdba

SQL> select database_role, db_unique_name from v$database;


If it connects and returns the correct Database Role and db_unique_name the Connection basically looks fine. If it returns any Error, likely the Heartbeat is hitting the same Error (you should see in the ALERT.LOG). In that Case you can follow up on

Note 1432367.1: Troubleshooting - Heartbeat failed to connect to standby 

Each log_archive_dest_n has a corresponding Initialization Parameter to toggle the Status for this Destination. This Parameter is called ‘log_archive_dest_state_n’. Verify it is set to ‘enable’ for the corresponding log_archive_dest_n.

Ensure there is sufficient Bandwith available on the Network. It must be able to cope with the Redo Generation Rate at any Time – even during Peak Times. See

Run the ORAchk-Tool against your Configuration to verify if your current Setup matches with the recommended Setup. You can get further Information about ORAchk here:

Note 1268927.2: ORAchk - Health Checks for the Oracle Stack 

Verify if there is any active Firewall or Router in the Network Connection between the Primary and Standby Database. Those can have Features enabled which are able to modify TCP-Packets which can impact the SQLNET-Communication and raise Errors like ORA-3135 or Redo Corruptions in the Log Transport Services

SERVICE

DB_UNIQUE_NAME

VALID_FOR

Compare the Settings for the DG_CONFIG in the Initialization Parameter ‘log_archive_config’ with the db_unique_name for your remote Archive Destinations. It must contain the db_unique_names from the Primary and all its Standby Databases. It must be set to the same Value on all the participating Standby Databases, too.

In order to have most Benefit, Standby RedoLogs should be configured and used by Log Transport Services. Review

Note 219344.1: Usage, Benefits and Limitations of Standby Redo Logs (SRL)

If the Data Guard Broker is used, note that all Initialization Parameters and Attributes are maintained by the Data Guard Broker. If you want to modify anything you have to modify the corresponding Data Guard Broker Property. In Case of any Problems you can also review the Data Guard Broker Logfile (DRC<SID>.LOG) for each Instance. It can be found in the ‘trace’-Directory of each Instance (see the Initialization Parameters 'background_dump_dest' or 'diagnostic_dest' for the exact Paths).

To gather further Tracing Information you can enable Archive Tracing on the Primary and Standby Database. Archive Tracing can be enabled via the Initialization Parameter ‘log_archive_trace’. There are different Levels available. Enabling different Levels at the same Time can be done by setting ‘log_archive_trace’ to the Sum of the desired Levels. Setting back to ‘0’ disables the Tracing. Relevant and common Levels for Log Transport Services are:

For the Primary Database:


1:            Tracks archiving of log files

2:            Tracks archive status by archive log file destination

64:         Tracks ARCn process state activity

128:       Tracks FAL server process activity

512:       Tracks LGWR redo shipping network activity

2048:     Tracks RFS/ARCn ping heartbeat


For the Standby Database:


1:            Tracks archiving of log files

2:            Tracks archive status by archive log file destination

64:          Tracks ARCn process state activity

256:       Tracks RFS Logical Client

1024:     Tracks RFS physical client

2048:     Tracks RFS/ARCn ping heartbeat


There are some more Levels which are concerning local Archiving and Log Apply Services which can be set as well if required. See

for a full List of available Levels. The Tracing Output can be found in the current Process-Tracefiles (eg. LGWR, NSS, RFS…). You can review those Traces to get detailed Information.


 


No comments:

Post a Comment