Monday, 7 November 2022

What is an Archive Gap and its possible Solutions



PURPOSE

This Document shows the various Possibilities to detect and resolve a Redo Gap on a Standby Database.

 

What is an Archive Gap?

 

An Archive Gap is a Range of missing Redo on the Standby Site that prevents Log Apply Services from proceeding. This typically happens when the Standby Site is unable to receive Redo from the Primary Database or the Redo Information is not available on the Standby Database. Common Causes for Archive Gaps are:

Network Disconnects or a Stop of Log Transport Services

Outages of the Standby Database

Misconfigurations of Log Transport Services

I/O-Issues on the Standby Site

Manual Deletion of ArchiveLogs before they are applied to the Standby

Insufficient Bandwidth in the Network between the Primary and Standby Site

Once there is an Archive Gap on the Standby Database, the Log Apply Services get stuck until the Gap is resolved, i.e. until the missing Redo is fetched in the Form of ArchiveLogs and made available on the Standby Site. Log Apply Services can then pick it up and proceed.
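One Way to see whether Log Apply Services are stuck on a Gap is to look at the Apply Process on a Physical Standby Database. The Query below is a minimal Sketch, assuming a Release in which v$managed_standby is available (later Releases deprecate it in Favour of v$dataguard_process). An MRP0 Process that is waiting on a Gap typically reports the Status WAIT_FOR_GAP.

```sql
-- Run on the Physical Standby Database.
-- Shows the Managed Recovery Process (MRP0) and the Sequence it is working on.
-- A STATUS of WAIT_FOR_GAP indicates that Apply is blocked by missing Redo.
SELECT process, status, thread#, sequence#, block#
  FROM v$managed_standby
 WHERE process LIKE 'MRP%';
```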

 

Methods of Gap Resolution


There are four Possibilities to resolve an Archive Gap on a Standby Database. They are discussed below.

Automatic Gap Resolution

Automatic Gap Resolution is performed by the Log Transport Services without any Intervention. Basically, the Redo currently being transferred is compared with the last Redo received. If there is a Mismatch, the receiving RFS-Process on the Standby Database detects it and automatically requests the missing Log Sequence from the Primary Database again via the ARCH-RFS Heartbeat Ping. This Type of Gap Resolution uses the Service defined in log_archive_dest_n on the Primary Database serving this Standby Database. In Addition, the ARCH-RFS Heartbeat Ping polls the current Sequence to detect an Archive Gap; if one is detected, it is resolved the same Way. Once a Gap is resolved, the Transport Process (ARCH or LGWR) is notified about the Resolution. Automatic Gap Resolution requires no special Setting or Monitoring.
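Although no Monitoring is required, it can be useful to confirm from the Primary Database that the Destination serving the Standby is healthy. The following is a Sketch only: it assumes the Standby is served by log_archive_dest_2 (hence dest_id = 2) and a Release in which v$archive_dest_status provides the gap_status Column (Oracle 11.2 and later, as far as I know).

```sql
-- Run on the Primary Database.
-- dest_id = 2 is an Assumption; use the log_archive_dest_n that serves your Standby.
-- gap_status reports whether the Destination currently has a Gap,
-- error shows the last Transport Error for the Destination, if any.
SELECT dest_id, status, gap_status, error
  FROM v$archive_dest_status
 WHERE dest_id = 2;
```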

 

FAL (Fetch Archive Log) Gap Resolution

Once an ArchiveLog is received or archived from a Standby RedoLog on the Standby Database, it is registered in the Standby Controlfile (you can query the Registration via v$archived_log on a Physical Standby Database and dba_logstdby_log on a Logical Standby Database). If such a File is missing or corrupted for any Reason (e.g. it got deleted by Mistake), FAL is called to perform a Gap Resolution. This is the Case because such missing Logfiles are typically detected by the Log Apply Services on the Standby Database. These work independently of the Log Transport Services and do not have a direct Link to the Primary Database. To use FAL, one or two (prior to Oracle 11.2.0) Initialization Parameters must be set up on the Standby Database:

FAL_SERVER: Specify an Oracle Net Service Name (TNS-Alias or Connect Descriptor) that points to the Database from which the missing ArchiveLog(s) should be requested. This can be the Primary Database, but also another Standby, ArchiveLog Repository or Far Sync Standby (Oracle 12.1.0 and later) Database inside the Data Guard Configuration. It is possible to specify multiple Service Names (Comma separated). FAL will then try those Databases in Sequence to resolve the Gap.

FAL_CLIENT (< Oracle 11.2.0): Specify an Oracle Net Service Name (TNS-Alias or Connect Descriptor) that points from the FAL_SERVER Database(s) back to the Standby Database (i.e. the Destination to which the FAL_SERVER Database should send the Redo). Ensure this TNS-Alias exists in the TNSNAMES.ORA of your FAL_SERVER Database(s). This Parameter is no longer required as of Oracle 11.2.0. However, you have to ensure that a corresponding log_archive_dest_n exists on your FAL_SERVER Database(s) which points to the Standby Database requesting the Gap Resolution.
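As an Illustration, the two Parameters could be set as sketched below. The TNS-Aliases prim_tns and stby_tns are hypothetical, and SCOPE = BOTH assumes the Standby Instance was started with an SPFILE.

```sql
-- Run on the Standby Database.
-- 'prim_tns' is a hypothetical TNS-Alias pointing to the Primary Database.
ALTER SYSTEM SET fal_server = 'prim_tns' SCOPE = BOTH;

-- Only needed prior to Oracle 11.2.0.
-- 'stby_tns' is a hypothetical TNS-Alias pointing back to this Standby Database.
ALTER SYSTEM SET fal_client = 'stby_tns' SCOPE = BOTH;

-- Verify the current Settings of both Parameters.
SELECT name, value
  FROM v$parameter
 WHERE name LIKE 'fal%';
```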

Once the Log Apply Services detect an Archive Gap, they send a FAL Request to the FAL_SERVER, handing over the FAL_CLIENT (or the db_unique_name for Versions > 11.1.0). An ARCH-Process on the FAL_SERVER tries to pick up the requested Sequence(s) from that Database and sends them back to the FAL_CLIENT (or uses the Destination valid for this db_unique_name). If the first FAL_SERVER is not able to resolve the Gap, the next FAL_SERVER in the List is tried. If the Gap cannot be resolved by any of the FAL_SERVERs, the FAL-Request fails and a corresponding Message is written to the ALERT.LOG of the Standby Database.

In order to successfully complete a Gap Request, the requested ArchiveLog Sequence(s) must be available on the FAL_SERVER Database (both on Disk and as a corresponding Entry in the Controlfile).
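Whether the Controlfile of the FAL_SERVER Database still knows the requested Sequences can be checked with a Query like the following Sketch. Thread 1 and the Sequences 100 to 105 are placeholder Values for the missing Range; the Existence of the Files on Disk still has to be verified at the Operating System Level using the returned Names.

```sql
-- Run on the FAL_SERVER Database.
-- Thread 1 and Sequences 100 to 105 are placeholder Values for the missing Range.
-- standby_dest = 'NO' restricts the Result to locally archived Copies.
-- deleted = 'NO' and status = 'A' indicate that the Controlfile Entry
-- still refers to an available File.
SELECT thread#, sequence#, name, deleted, status
  FROM v$archived_log
 WHERE thread# = 1
   AND sequence# BETWEEN 100 AND 105
   AND standby_dest = 'NO'
 ORDER BY sequence#;
```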

FAL is available since Oracle 9.2.0 for Physical Standby Databases and Oracle 10.1.0 for Logical Standby Databases.

 

Manual Gap Resolution

If an Archive Gap cannot be resolved automatically by any of the previously mentioned Methods, you can still try to resolve it manually.

You can query v$archive_gap on a Physical Standby Database or dba_logstdby_log on a Logical Standby Database to determine a current Archive Gap, e.g.:

 

On Physical standby


SQL> select * from v$archive_gap;

On Logical standby

SQL> select thread#, sequence# from dba_logstdby_log l where next_change# not in
     (select first_change# from dba_logstdby_log where l.thread#=thread#)
     order by thread#, sequence#;

 

Now manually copy the ArchiveLogs of the returned Sequences to the desired Location on the Standby Database. If the missing ArchiveLogs are not yet registered on the Standby Database, you have to register them before the Log Apply Services are able to read those Logfiles. You can register ArchiveLogs using:

Physical Standby:

SQL> alter database register logfile '<File-Specification>';

Logical Standby:

SQL> alter database register logical logfile '<File-Specification>';

 

Once they are registered, Log Apply Services will pick up the ArchiveLogs and proceed.
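To confirm that Apply has caught up on a Physical Standby Database after the Registration, the following Sketch lists the registered ArchiveLogs that are not yet applied and re-checks the Gap View. It assumes that Managed Recovery is running on the Standby.

```sql
-- Run on the Physical Standby Database.
-- Lists registered ArchiveLogs that have not been (fully) applied yet.
SELECT thread#, sequence#, applied
  FROM v$archived_log
 WHERE applied <> 'YES'
 ORDER BY thread#, sequence#;

-- Once the Gap is resolved, this Query should return no Rows.
SELECT * FROM v$archive_gap;
```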

 

Roll forward using Incremental Backup (Physical Standby only)

If a Gap cannot be resolved with any of the previous Methods, if the Gap is so large that it would take a long Time to resolve, or if the missing ArchiveLogs no longer exist on any Database, you can still roll forward a Physical Standby Database using an incremental Backup from SCN. This Feature is available since Oracle 10.2.0. The Idea is to record the latest SCN applied to the Standby Database, then create an incremental Backup from that SCN on the Primary Database using RMAN, together with a Backup of the current Controlfile (as Standby Controlfile).
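The SCN to start from can be determined on the Standby Database with Queries like the following Sketch. Using the lower of the two Values is a cautious Assumption on my Part, not a Rule stated here; the chosen Value is then passed to RMAN on the Primary Database (BACKUP INCREMENTAL FROM SCN).

```sql
-- Run on the Physical Standby Database before taking the Backup on the Primary.
-- SCN up to which the Standby Database has been recovered.
SELECT current_scn FROM v$database;

-- Lowest Checkpoint SCN across all Datafile Headers.
-- Assumption: if this Value is lower than current_scn, use it as the
-- Starting Point so that no Datafile is left behind.
SELECT MIN(checkpoint_change#) AS min_datafile_scn
  FROM v$datafile_header;
```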

We then first replace the old Standby Controlfile with the Standby Controlfile from the incremental Backup and apply the incremental Backup on the Standby Database. This is a fast and easy Way to bring a Standby Database back close to the current Status of the Primary Database. Since the Steps to take can differ between Releases, please refer to the Chapter “Using RMAN Incremental Backups to Roll Forward a Physical Standby Database” in the Oracle Documentation for your Release.

