5. Managing Databases in ASM
In the chapters
so far, we have discussed the various components of ASM and how they
all work together to provide a solid storage management solution, taking
advantage of Oracle metadata and providing performance benefits by
storing data the way that Oracle databases need it. But how do the
relational database management system (RDBMS) and the new storage
manager work together? At this point, every reader must be wondering
about the answer to this important question. In this chapter, we
discuss the interaction between ASM and the RDBMS. We will also discuss
the various interfaces to the ASM instance and how to migrate data into
and out of ASM storage.
ASM supports both a single-instance version
of the Oracle Database and a clustered version. Although from a
strategic point of view ASM best fits a clustered configuration such as
Real Application Clusters (RAC), it is not short of any features when it
comes to implementing ASM in a single-instance database. So in this
chapter, unless otherwise mentioned, all discussions apply to both types
of configurations.
Interaction Between ASM and Database
An RDBMS instance is the standard Oracle
instance and is the client of the ASM instance. The RDBMS-to-ASM
communications are always intranode—that is, the RDBMS will not contact
a remote ASM instance (in the case of RAC) to service disk group
requests. This holds true until 12c ASM; see Chapter 14 for details.
It is important to note that RDBMS instances
do not pass an I/O request to the ASM instance—that is, RDBMS instances
do not proxy the I/O request through ASM. When an RDBMS instance opens
an ASM file such as the system tablespace data file, ASM ships the
file's extent map to the RDBMS instance, which stores the extent map
in its System Global Area (SGA). Given that the RDBMS instance has the
extent map for the ASM file cached in the SGA, the RDBMS instance can
process reads/writes directly from the disk without further intervention
from the ASM instance. As is the case with raw devices, the I/O request
size is dictated by the RDBMS instance, not by ASM. Therefore, if the
RDBMS needs to read or flush an 8KB database block, it will issue an 8KB
I/O, not a 1MB I/O.
Users often ask whether there is any
dependency between Automatic Segment Space Management (ASSM)–based data
files and ASM. There is no dependency between ASM and ASSM (although
their acronyms only differ by one extra letter). ASM manages how data
files are laid out across disks (among other things) without looking at
the data stored in those data files. ASSM deals with empty blocks that
exist within a single segment (it replaces freelists and freelist
groups). ASM does not care if the data files contain segments that are
freelist-managed or automatic segment space managed via bitmap managed
blocks (BMBs), so no conversion of internal structures takes place.
An active RDBMS instance that uses ASM storage
can operate just like a typical database instance. All file access is
directly performed with minimal ASM intervention. When a file is opened
or created, the ASM instance sends the file layout to the RDBMS instance
and all subsequent inputs/outputs (I/Os) are done using the extent map
stored in the RDBMS instance.
All ASM file access can be performed only via the
RDBMS instance and its utilities. For example, database backups of ASM
files can be performed only with RMAN. Note that utilities such as the
Unix dd command are not recommended for backing up or restoring ASM disk
groups.
Database instances usually interact with the
ASM instance when files are created, resized, deleted, or opened. ASM
and RDBMS instances also interact if the storage configuration changes,
such as when disks are added or dropped, or when disks fail.
From a user perspective, the
database-file-level access (read/write) of ASM files is similar to
non-ASM, except that any database filename that begins with a plus sign
(+) will automatically be handled and managed using the ASM code path.
However, with ASM files, the database file access inherently has the
characteristics of raw devices—that is, unbuffered (direct I/O) with
kernelized asynchronous I/O (KAIO). Because the RDBMS instance issues
I/Os against raw devices, there is no need to set the database parameter
FILESYSTEMIO_OPTIONS unless network file system (NFS) volumes are used
as ASM disks. Additionally, because asynchronous I/O is enabled by default, no
additional parameters are needed to leverage ASM. In fact, from the database
side, no database init.ora parameters are mandatory to support ASM.
Cloning ASM and Database Homes
As of Oracle Database 11g Release 2,
ASM is now part of the Grid Infrastructure (GI) stack and no longer part
of the database software stack. In this section, we review how to clone
the ASM and database software stack for non-RAC environments. To start
the ASM software cloning process, the software stack should be archived
with the tar or zip command as root. Savvy Unix folks can take advantage
of the tar command and redirect the files to the ssh command and thus
extract the files in a single command on the target server. Here’s an
example of the single command that can be leveraged as root from the
/u01/app/oracle/product/11.2.0 directory:
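A minimal sketch of such a command (the target hostname is an illustrative assumption) could look like this:

cd /u01/app/oracle/product/11.2.0
tar cpf - grid | ssh targethost "cd /u01/app/oracle/product/11.2.0; tar xpf -"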
Note that grid is the Grid Infrastructure
HOME software directory. This single command will archive the complete
subdirectory of the grid directory, copy the contents to the target
host, and extract it, preserving all the permissions. The tar command is
the preferred option because symbolic links will be preserved.
A golden copy of the non-RAC Grid
Infrastructure Home should be maintained to clone to other servers.
Certain directories that contain log files can be cleaned up prior to
establishing a golden image copy of the ASM software stack.
Once the software is copied to the target
server, we can start the cloning process. To clone the GI stack, change
your directory to the $ORACLE_HOME/clone/bin directory and execute the
Perl cloning script with the following options:
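The exact arguments vary per environment; a hedged sketch of the invocation (home name and paths are assumptions) follows:

cd /u01/app/oracle/product/11.2.0/grid/clone/bin
perl clone.pl ORACLE_BASE=/u01/app/oracle \
  ORACLE_HOME=/u01/app/oracle/product/11.2.0/grid \
  ORACLE_HOME_NAME=OraGI112Home1 \
  INVENTORY_LOCATION=/u01/app/oraInventory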
Executing the shell script will result in an output that looks like this:
As root, execute the orainstRoot.sh script to
register the new GI Home with the central inventory located in the
/u01/app/oraInventory subdirectory. The location of oraInventory can be
identified by perusing the contents of the /etc/oraInst.loc file.
Executing the orainstRoot.sh script produces the following result:
As instructed, execute the root.sh script:
As the final step of the cloning process, execute the Perl script again, invoking the roothas.pl script:
We now have successfully configured the GI
stack with ASM for the standalone server. The next step will be to
create the ASM disk groups. There are several ways to create the ASM
disk groups. ASMCA (ASM Configuration Assistant) and SQL*Plus (with the
create diskgroup command) are two such options. In this chapter, we
introduce the asmcmd mkdg command:
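A hedged example, assuming disk group configuration files named DATA.xml and FRA.xml as described next:

asmcmd mkdg DATA.xml
asmcmd mkdg FRA.xml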
If the server is configured with ASMLIB, the
ASMLIB disk names can be retrieved with the /etc/init.d/oracleasm
listdisks command. Simply replace the “ORCL:” disk names with the ASMLIB
disk names retrieved from the listdisks command. The contents of the
sample DATA.xml and FRA.xml files are shown here:
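The actual file contents depend on your disks; a minimal hedged sketch of a DATA.xml for external redundancy with ASMLIB disk names (disk names and compatibility setting are assumptions) might look like this:

<dg name="DATA" redundancy="external">
  <dsk string="ORCL:DATA1"/>
  <dsk string="ORCL:DATA2"/>
  <a name="compatible.asm" value="11.2"/>
</dg>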
Create Database in Command-Line Mode
Now that the ASM disk groups are created,
our next step is to clone over the database software stack. The database
software cloning process is similar to the GI stack. With the database
software stack, we also issue the clone.pl script and pass the following
arguments for ORACLE_BASE, ORACLE_HOME, OSDBA_GROUP, OSOPER_GROUP,
ORACLE_HOME_NAME, and INVENTORY_LOCATION:
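A hedged sketch of that database-home clone.pl invocation (home name, group names, and paths are assumptions):

cd /u01/app/oracle/product/11.2.0/dbhome_1/clone/bin
perl clone.pl ORACLE_BASE=/u01/app/oracle \
  ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1 \
  ORACLE_HOME_NAME=OraDb112Home1 \
  INVENTORY_LOCATION=/u01/app/oraInventory \
  OSDBA_GROUP=dba OSOPER_GROUP=oper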
Standardization is crucial for effective and
rapid Oracle GI and database software provisioning. If all the DBAs were
standardized with the same directory structures for ORACLE_HOME,
ORACLE_BASE, and INVENTORY_LOCATION, we could leverage the same cloning
scripts to clone the database home software from one server to another.
We could realistically provision fully functional databases in a
matter of hours instead of days or even weeks. The biggest benefit of
cloning the GI and database software stack is that we do not need to
configure X-Windows or vncserver on the database server or deal with the
X-Windows client or vncviewer on our desktops.
Executing the clone_db.txt script yields the following results:
As instructed, execute the root.sh script:
Even the root.sh shell script is executed in
silent mode. As a final review, check the suggested log file to see the
output of root.sh:
We have successfully cloned over the GI and
database software stack; now the last step is to create the database. Of
course, our goal is also to create the database in silent mode with
DBCA for a standalone server (in the upcoming sections, we will review
the other options for creating a database). In the following example, we
create a database called RMANPOC on the DATA disk group with 300MB redo
logs:
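A hedged sketch of such a dbca invocation (template choice and passwords are assumptions):

dbca -silent -createDatabase \
  -templateName General_Purpose.dbc \
  -gdbName RMANPOC -sid RMANPOC \
  -sysPassword xxxxx -systemPassword xxxxx \
  -storageType ASM -diskGroupName DATA \
  -redoLogFileSize 300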
Executing dbca in silent mode with the options just specified will produce the following output:
As suggested in the dbca output, you should
review the log file for additional details. For RAC databases, the
syntax for dbca will be similar except that the -nodelist option will
include all the nodes of the RAC cluster. The following example was
used to create the DBFSPRD database on a half-rack Exadata:
One note of caution: We cannot enable
archivelog mode leveraging dbca in silent mode unless we create a
template database. Just like we create a golden image for our GI and
database software stack, we should also create a golden image template
database that has all of our security policies in place, the correct
redo log sizes and number, standby redo logs, monitoring schemas, custom
profiles, custom password management, and so on. After we establish our
golden image database with all of our standards intact, we can leverage
DBCA to create a golden image template database that can be leveraged
to easily create new databases. Creating a golden image template from
DBCA will essentially shut down the database and perform a cold backup
of the database. Once the golden image database is created, all new
databases we create will automatically inherit all the changes that were
incorporated into the golden image database.
Unfortunately, creating the golden image
template database is outside the scope of this book. For advanced
examples on leveraging DBCA in silent mode, visit the http://dbaexpert.com/blog website.
As an alternative approach, we can create the
database in the old traditional mode with SQL*Plus. This method is no
longer a recommended option because we have the DBCA with the silent
option, but a few of the proud still prefer this approach. To create a
database with SQL*Plus, the following script can be leveraged:
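A trimmed-down sketch of such a script, assuming the spfile already points db_create_file_dest at +DATA (sizes, character sets, and passwords are assumptions):

CREATE DATABASE CLIDBA
   USER SYS IDENTIFIED BY xxxxx
   USER SYSTEM IDENTIFIED BY xxxxx
   LOGFILE GROUP 1 ('+DATA','+FRA') SIZE 300M,
           GROUP 2 ('+DATA','+FRA') SIZE 300M,
           GROUP 3 ('+DATA','+FRA') SIZE 300M
   CHARACTER SET AL32UTF8
   NATIONAL CHARACTER SET AL16UTF16
   EXTENT MANAGEMENT LOCAL
   DATAFILE '+DATA' SIZE 1G AUTOEXTEND ON
   SYSAUX DATAFILE '+DATA' SIZE 1G AUTOEXTEND ON
   DEFAULT TEMPORARY TABLESPACE temp TEMPFILE '+DATA' SIZE 1G
   UNDO TABLESPACE undotbs1 DATAFILE '+DATA' SIZE 1G;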
In this example, we are creating a database
called CLIDBA in the DATA disk group. The redo logs are multiplexed to
the DATA and FRA disk groups with a size of 300MB per member. This
script assumes that all the necessary directories under
$ORACLE_BASE/admin/$ORACLE_SID are created and the initialization
parameters (or spfile) are already established under $ORACLE_HOME/dbs.
For old-time veterans, this approach may still be a preferred method.
After we create the database, we execute the standard set of scripts:
catalog.sql, catproc.sql, catexp.sql, and catdbsyn.sql. If you create a
custom database with DBCA, the database is created in a manner similar to this
approach, and these scripts are also executed behind the scenes.
Database Interaction with ASM
DBAs predominantly choose DBCA in GUI
mode as their tool of choice; it is provided with the product. As
shown in the preceding examples, DBCA can also be invoked in silent mode
(a nongraphical approach). Creating a database using the GUI requires you
to have some sort of X server or a VNC server running to export your
display to. This also implies that you have to forward your X packets
using PuTTY or SecureCRT. This section highlights key screens in the
DBCA as they relate to ASM and RAC.
The creation of a Real Application Clusters
(RAC) database is different from the regular standalone configuration
because a RAC configuration consists of one database and two or more
instances.
In the past, we preferred the GUI simply
because there are fewer steps to remember. When you use DBCA, the steps
are predefined, and based on the selected template the appropriate type of
database is automatically created, sized, and configured. However, the
script approach has an advantage over the GUI approach in the sense that
the user can see what is happening during the creation process and can
physically monitor it. Another advantage of this option is that
you can create the script based on the needs of the enterprise.
DBCA helps in the creation of the database. It
follows the standard naming and placement conventions defined in the
Optimal Flexible Architecture (OFA) standards.
DBCA can be launched automatically as part of
the Oracle installation process or manually by directly executing the
dbca command from the $ORACLE_HOME/bin directory. Figure 5-1 shows the
DBCA Welcome screen, where the type of database to be created is selected.
The screen provides two choices: Oracle Real Application Clusters (RAC)
database and Oracle single-instance database. Note that RAC One Node is a
RAC database that runs on a single node. The steps needed to configure
RAC One Node and regular RAC are the same.
Configuration of the database using either
option should be similar. For the purposes of our discussions, we will
use the RAC option. Therefore, select the Oracle Real Application
Clusters (RAC) database option and then click Next.
NOTE
The Oracle Real Application Clusters
database option is visible only if the Clusterware is configured. If
this option is not shown on this screen, the database administrator
(DBA) should cancel the configuration process and verify that the
Clusterware and ASM are running before proceeding.
The third screen defines how the database is configured in the cluster (see Figure 5-2).
In this example, we will choose Admin-Managed over Policy-Managed
because the cluster size is only three nodes. As a general rule,
policy-managed databases make the most sense with four or more nodes. On
this same screen is the node selection window, where the appropriate nodes
on which RAC needs to be configured are selected. Because the Oracle
Universal Installer (OUI) copies the required files to all nodes
participating in the cluster, it is advisable to select all nodes listed
and then click Next.
On DBCA step 6, shown in Figure 5-3, the database file storage location choices are listed. For 11gR2
RAC, only two choices are offered: ASM and Clustered Filesystem (raw is
no longer supported). Choose ASM, select the Use Oracle-Managed Files
option, and indicate the default disk group location. Oracle-managed
files are detailed in Chapter 7.
As Figure 5-4 shows, when ASM is chosen as the storage type, a pop-up screen appears that requires the user to specify the ASMSNMP password.
When Oracle Clusterware is initially
installed, the ASMSNMP user is generated by the ASM Configuration
Assistant (ASMCA) when an ASM instance is created. During this process,
the user is also prompted for the ASMSNMP password. ASMSNMP is primarily
used by Oracle Enterprise Manager to monitor ASM instances. This
account is granted the SYSDBA privilege.
The next screen, shown in Figure 5-5,
is the Recovery Configuration screen. In this screen, the DBA has the
option to create a fast recovery area (FRA) and enable archiving. For
easy sharing of recovery data among the various nodes participating in
the cluster, it is a requirement that these areas are located on shared
storage. OUI has options to select the appropriate location by browsing
through a list.
Both the archive log files and FRA are to be configured on the same type of storage.
The next several screens (steps 8–11 in the
DBCA process, and not shown here) present the option to create sample
schemas when the database is configured, and they allow the DBA to
select various instance-specific information such as memory allocation
(for example, shared pool and buffer cache), sizing (such as processes),
character sets (for example, UTF8), and the connection methods
(dedicated, for instance). The DBA defines these parameters by selecting
the respective tabs on the screen.
The last screen shows the database creation
options. In this screen, the DBCA provides the option to generate the
database creation scripts or to create the database. The DBA can select
both the options, in which case OUI generates the scripts and
subsequently creates the database automatically. Alternatively, users
can allow DBCA to generate the scripts and execute them manually. Ensure
that the Create Database check box is selected and then click Finish.
When the DBA selects the Finish option, the
DBCA begins creating the database. When the process has finished, a new
database and the required database instances are created.
Based on the database configuration specified,
the progress screen shows the various stages of the installation
process. After all the stages of the installation are complete, the DBCA
starts the instances on the respective nodes. Finally, the OCR is
updated with the new database information and its dependencies, such as
ASM disk groups, ACFS mount points, and the listener.
Disk Groups and Databases
A disk group can contain files from many
different Oracle databases. These databases can be from the same server
or they can reside on different servers. However, a disk and a database
file can be part of only one disk group. Furthermore, one Oracle
Database may also store its files in multiple disk groups managed by the
same ASM instance. Allowing multiple databases to share a disk group
provides greater potential for improved disk utilization and greater
overall throughput.
To reduce the complexity of managing ASM and
its disk groups, it is recommended that you maintain and manage two ASM
disk groups per RAC cluster. Additional disk groups may be added to
support tiered storage classes in information lifecycle management (ILM)
or hierarchical storage management (HSM) deployments. Another variation
of this model deployed by customers is where a third disk group is
created and used to house the temporary tablespaces. In these cases, the
temporary tablespace disk group is usually on less-expensive storage.
The two primary disk groups used for the database map to the following areas:
Database area This
area stores active database files such as data files, control files,
online redo logs, and change-tracking files used in incremental backups.
Fast recovery area This
is where recovery-related files are created, such as multiplexed copies
of the current control file and online redo logs, archived redo logs,
backup sets, and flashback log files.
If chosen at database creation time (via
DBCA), an FRA stores an active copy of the control file and one member
set of the redo log group to provide higher availability for the
database. Note that additional copies of the control file or extra log
files can be created and placed in either disk group.
NOTE
When using multiple control files in
different disk groups, all the disk groups need to be available before
the database can be started. The dependency between database and ASM
disk groups maintained by CRS ensures that all necessary disk groups are
mounted correctly.
What about cases where there are several
databases on the server, each one with differing recovery point
objectives (RPOs) and recovery time objectives (RTOs)? One solution is
to create a single large disk group for all the database areas and
(possibly) a separate FRA for each database. This caters to environments
where the database growth can be contained and where disk utilization
is of great importance. There are many other alternatives, such as
grouping databases together into a single DATA disk group and a single
FRA disk group. The main point here is not to create too many disk
groups or to mimic a file system environment. This just ends up
producing “islands of storage pools” and becomes highly unmanageable.
An additional scenario is where multiple
single-instance databases share disk groups managed by ASM in a
clustered environment. In such situations, the primary objective is
database consolidation.
Transportable Tablespaces and ASM
This section describes the method for
creating transportable tablespaces (TTSs), where the source and
destination databases are ASM based. This method uses standard utilities
such as Datapump and the database package DBMS_FILE_TRANSFER. This
package is used to transfer the Datapump metadata and data files to the
remote database server. For completeness, the command CONVERT DATABASE
is covered in the last subsection.
We will examine a working example of creating
and plugging in transportable tablespaces. In the example, two servers
(host1 and host2) are in disparate locations, with each one on an
independent Hitachi storage array. Server host1 is the source database
and will house the database named db1. The server host2, which is deemed
the target server and will subscribe to the transportable tables, will
house the database SSKYDB.
Performing the Preliminary Steps to Set Up the TTSs
Before setting up the TTSs, you need to take the following steps (a consolidated sketch of the key commands appears after the list):
1. Create or use two existing tablespaces on the source database:
Although having two tablespaces is not
necessary in all cases, we will need them for this example to
demonstrate object dependency.
NOTE
OMF is being employed in this example, so set the init.ora parameter DB_CREATE_FILE_DEST to the appropriate disk group name.
2. Create a table in TTS_1 and an index in TTS_2 to ensure that the tablespaces have object dependencies:
3. Check to make sure you have a self-contained transportable set. Oracle provides a PL/SQL package that aids in this check:
4. Query the TRANSPORT_SET_VIOLATIONS view to see whether any dependency violations exist:
5. Create a new service names entry,
which will point to the destination database to which the tablespaces
will be transported. For example, add the following lines to
tnsnames.ora:
6. As SYSTEM, create a database link
between the two databases. This is needed because you will be using the
DBMS_FILE_TRANSFER package to move metadata between the two databases:
7. Create a directory object in the
source database db1 to hold the dump file. Because we are using ASM,
this needs to be an ASM object; make sure that the directory path ends
with a slash (/):
8. Create another directory object in the source database that points to an operating system path for the log file:
9. Create a directory object in the source database that points to the data files:
10. Grant read/write access to the
user you will perform the export as (you need only perform this step if
you are using a nonprivileged user):
11. Repeat steps 7 through 10 for the target database SSKYDB as well.
12. Set all tablespaces in the transportable set to read-only:
13. Check the status of the tablespaces on the source database:
14. Export the metadata for the two tablespaces:
15. Use DBMS_FILE_TRANSFER to send the dump file across to the target. Alternatively, asmcmd copy could also be used.
16. Check the filenames on the source database for the two tablespaces being transported:
17. Transfer the two data files to the target database using DBMS_FILE_TRANSFER:
18. On host2 (the target server), import the data file metadata using Datapump:
NOTE
For the TRANSPORT_DATAFILES parameter, you
can either use the alias names (filenames in the DBMS_FILE_TRANSFER) or
use the names generated by DBMS_FILE_TRANSFER (these start with the name
File_Transfer.xxxx.xxxxxx). To determine the latter system-generated
names, use the asmcmd command-line tool by simply entering cd
+DATA/sskydb/datafile followed by the ls -l command.
19. Switch the tablespaces back to read/write mode:
20. Verify that the data files are successfully plugged in:
21. To validate that the data was imported, select the required tables:
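The individual listings are environment specific, but the following consolidated sketch (directory object names, passwords, and file names are illustrative assumptions) shows the shape of the key commands behind steps 3, 12, 14, 17, and 18:

-- Step 3: verify that the transportable set is self-contained
EXEC DBMS_TTS.TRANSPORT_SET_CHECK('TTS_1,TTS_2', TRUE);
SELECT * FROM transport_set_violations;

-- Step 12: place the tablespaces in read-only mode
ALTER TABLESPACE tts_1 READ ONLY;
ALTER TABLESPACE tts_2 READ ONLY;

-- Step 14: export the metadata for the transportable set
$ expdp system/xxxxx dumpfile=tts_dump_dir:tts1.dmp logfile=tts_log_dir:tts1.log \
    transport_tablespaces=TTS_1,TTS_2

-- Step 17: push a data file to the target with DBMS_FILE_TRANSFER
BEGIN
  DBMS_FILE_TRANSFER.PUT_FILE(
    source_directory_object      => 'TTS_DATAFILE_DIR',
    source_file_name             => 'tts_1.dbf',
    destination_directory_object => 'TTS_DATAFILE_DIR',
    destination_file_name        => 'tts_1_df.dbf',
    destination_database         => 'SSKYDB');
END;
/

-- Step 18: plug the tablespaces in on the target with Datapump
$ impdp system/xxxxx dumpfile=tts_dump_dir:tts1.dmp logfile=tts_log_dir:tts1_imp.log \
    transport_datafiles='+DATA/sskydb/datafile/tts_1_df.dbf','+DATA/sskydb/datafile/tts_2_df.dbf'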
Alert Log Monitoring
Monitoring the ASM instance is no different from monitoring
any other database instance, and it starts with scrutinizing the alert log file.
The ASM instance alert log file is located in the
$DIAG_DEST/asm/+asm/+ASM1/trace directory for the +ASM1 instance, in the
$DIAG_DEST/asm/+asm/+ASM2/trace directory for the +ASM2 instance, and
so on. The DIAG_DEST location is governed by the DIAGNOSTIC_DEST initialization
parameter, which defaults to $ORACLE_BASE.
The following function in bash or ksh shell can easily help you change your directory to the ASM trace directory:
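One hedged way to write such a function (this is a sketch that assumes the standard ADR layout under $ORACLE_BASE/diag and strips a trailing instance number from the ORACLE_SID):

function trace
{
  typeset dbtype=${1:-rdbms}      # pass "asm" for an ASM ORACLE_SID
  typeset dbname=$(echo ${ORACLE_SID%[0-9]} | tr '[:upper:]' '[:lower:]')
  cd ${ORACLE_BASE:-/u01/app/oracle}/diag/${dbtype}/${dbname}/${ORACLE_SID}/trace && pwd
}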
This function interprets the current
ORACLE_SID and changes your directory to the trace subdirectory for your
current ORACLE_SID. For non-ASM ORACLE_SIDs, simply type the word trace and press ENTER.
You will instantly be changed to the trace subdirectory of the
$DIAG_DEST directory. For ASM ORACLE_SIDs, you will need to pass the
parameter of asm, as shown here:
We can monitor the ASM alert log with the
following Perl script. The entire premise of the alert log monitoring
script is to capture all the ORA- messages in the alert_+ASM[i].log,
where i represents the ASM instance ORACLE_SID number. Efficiency
is the key to the alert log monitoring script, so we capture and store
the last line number at the end of the alert log file. A subsequent scan
of the alert log starts from the last stored line number and scans to the
end of the file for ORA- messages. At the end of the file, the line
number is stored again to be used as the beginning point for the next
scan.
As a prerequisite, we should create a bdump directory under
$ORACLE_BASE/admin/$ORACLE_SID, such as
/u01/app/oracle/admin/+ASM/bdump. In addition, we need to create a
symbolic link in the bdump directory that points to the
alert_+ASM[i].log in the trace directory. Here’s an example of the alert
log symbolic link:
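A hedged example, assuming the standard ADR trace location for the +ASM1 instance:

mkdir -p /u01/app/oracle/admin/+ASM/bdump
ln -s /u01/app/oracle/diag/asm/+asm/+ASM1/trace/alert_+ASM1.log \
      /u01/app/oracle/admin/+ASM/bdump/alert_+ASM.log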
In this particular example, the symbolic link
is created as alert_+ASM.log (without the ASM instance name) that points
to the alert_+ASM1.log file. We do this intentionally so that our cron
jobs can be the same on every database server, regardless of whether the
server is a standalone server or a RAC node. Here’s what the cron job
should look like:
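A hedged example of the crontab entry (the script location and 15-minute schedule are assumptions):

# Scan the ASM alert log for new ORA- messages every 15 minutes
0,15,30,45 * * * * /home/oracle/general/sh/alertlog.pl > /tmp/alertlog_asm.cron.log 2>&1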
SH_TOP, listed in the alertlog.pl file,
defines where the Perl script is being executed from. Underneath the
SH_TOP directory, the last line number of the alert log will be placed
in the log subdirectory. The log subdirectory should be created as part
of the alert log file monitoring configuration:
If, for some reason, you want the alert log
mining to start from the beginning, you can simply change the contents
of the alert_+ASM.id file to 0.
The alert_notification_db.ksh line of the Perl
script can be replaced with mail or mailx. In our example, each ORA-
message detected in the alert log file is uploaded to an Oracle table
for future analytics.
Just like the database alert log, the ASM alert log also needs to be purged and archived:
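A hedged example of the logrotate invocation (the state file and configuration file paths are assumptions):

/usr/sbin/logrotate -s /home/oracle/logrotate.status -f /home/oracle/asm_alertlog.conf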
The -s option specifies an alternative state
file for the logrotate utility. In order for the oracle or grid Unix
account to execute logrotate, the -s option must be specified because
the default state file is /var/lib/logrotate.status and typically only the
root account has write access to this file. The -f option specifies
the force option to rotate the file with a vengeance, even if the
logrotate utility does not think that the rotation is necessary. The last
argument to the logrotate command is the configuration file,
in which we can specify the following (a sample configuration follows the list):
The location of the log file to rotate and options for log rotation
The frequency of the rotation
The number of files to keep
Whether to compress or not compress
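A minimal sketch of such a configuration file (the alert log path, weekly frequency, and retention of four rotations are assumptions):

/u01/app/oracle/diag/asm/+asm/+ASM1/trace/alert_+ASM1.log {
    weekly
    rotate 4
    compress
    copytruncate
    missingok
}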
After we execute the log rotate script for
the first time, we can see that the old alert log is compressed and the
new file has a file size of zero bytes:
Monitoring the ASM Disk Group
The ASM disk group can easily be remotely
monitored from any server on your network. All that is required is
SQL*Plus, access to the database, and SELECT privileges against the
V$ASM_DISKGROUP_STAT dictionary view. The shell script and all the
ancillary scripts shown in this section can be downloaded from http://viscosityna.com/asmbook/.
Instead of querying V$ASM_DISKGROUP_STAT from
the ASM instance, we can query the underlying database. In most
environments, we should be able to query a single database to get ASM
disk group free space information for the server. This shell script
accepts two parameters:
The paging threshold
The TNS alias listed in the TNSNAMES.ORA file
The script sends an alert if the free space
threshold drops below the $PAGER_THRESHOLD parameter for the specified
database $DB as the TNS ALIAS in the TNSNAMES.ORA file.
The shell script also logs into the database as the RODBA schema, which stands for read-only DBA, and only has select dictionary privileges against the database:
This shell script is unique in the way it
sends all alerts by calling the alert_notification.ksh shell script. The
alert_notification.ksh shell script is set up to handle mailx, Tivoli
alerts integrated with wpostemsg, and even HP Openview alerts with
opcmsg. The alert_notification.ksh shell script can also be downloaded
from http://viscosityna.com/asmbook/.
Issues DBAs Encounter with ASM
Many DBAs struggle with the same set of
problems inside of ASM because they normally don’t encounter these
problems in the file system world. In particular, several distinct areas
can cause additional production outages if DBAs do not know the correct
procedures for navigating inside of ASM. We want to bring these common
problems to the surface so that everyone can easily tackle them and
mitigate unnecessary outages.
Spfile with an Incorrect Initialization Parameter
What do you do if you cannot start up the
instance because of an incorrect initialization parameter and your
spfile is inside ASM? You will not even be able to start the database in
nomount mode. As a general rule, always create a backup of the spfile into a
pfile before you issue any alter system commands and commit the changes
to the spfile:
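For example (the pfile location is an assumption):

SQL> CREATE PFILE='/tmp/initprod.ora' FROM SPFILE;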
However, if you happen to get stuck in a
situation where you cannot bring up a database instance due to an
incorrect syntax in the initialization parameter, you can change your
environment to the ASM instance and create the pfile. First, determine
where your spfile is located with the srvctl command:
Then, from the ASM instance, create a copy of the spfile from asmcmd:
You can invoke the native Unix strings command to pull out the contents of the spfile to a text file:
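A consolidated hedged sketch of these three steps, assuming a database named prod whose spfile lives in +DATA_EXAD:

$ srvctl config database -d prod | grep -i spfile
$ asmcmd cp +DATA_EXAD/prod/spfileprod.ora /tmp/spfileprod.ora
$ strings /tmp/spfileprod.ora > /tmp/initprod.ora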
You will have to do some manual massaging of
the file because some lines will overlap and some lines will be cut off.
You can take this opportunity to fix the erroneous initialization
parameter entry that caused this issue to start. Once you get your
golden initialization parameter, you can replace your old spfile. You
can proactively remove the existing +DATA_EXAD/prod/spfileprod.ora file
first. Once you start your database instance in nomount mode, you can
create the spfile from pfile using the following syntax and then restart
the database:
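A hedged sketch of the sequence (paths are assumptions, and $ORACLE_HOME/dbs/initprod.ora is assumed to point at the ASM spfile location):

SQL> STARTUP NOMOUNT PFILE='/tmp/initprod.ora';
SQL> CREATE SPFILE='+DATA_EXAD/prod/spfileprod.ora' FROM PFILE='/tmp/initprod.ora';
SQL> SHUTDOWN IMMEDIATE;
SQL> STARTUP;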
Archiver Is Stuck
What do you do when your archive destination
(in this case, the FRA disk group) hits 100-percent utilization? Every
now and then, we encounter the following dreaded Oracle error message:
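This is typically the classic archiver error:

ORA-00257: archiver error. Connect internal only, until freed.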
DBAs never want to see this message. Often, it
is not because we ran out of space in the ASM FRA disk group but rather
because our initialization parameter db_recovery_file_dest_size is not
sized adequately. We can fix this easily by modifying the
db_recovery_file_dest_size parameter to a value that is significantly
higher. For example, we can set the db_recovery_file_dest_size to 5TB,
like so:
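For example (scope and SID clauses are assumptions for a RAC environment):

SQL> ALTER SYSTEM SET db_recovery_file_dest_size=5T SCOPE=BOTH SID='*';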
However, if the FRA disk group runs out of
space and becomes 100-percent utilized, you can end up in a situation
where you have to perform some heroics to get out of the situation. When
the FRA disk group reaches 100-percent full, you will want to perform
one of these tasks:
Remove old archive logs (if possible).
Back up and delete archive logs.
Add more storage to FRA by adding new disks. See Chapter 4 for details on how to add storage to a given disk group.
To remove old archive logs, log into asmcmd
as the grid user. Often the grid user is simply oracle, as it is in the
Exadata world. After logging in as the grid user, source your
environment so that you can successfully invoke the asmcmd command.
After sourcing your environment, launch asmcmd. Within asmcmd, change
your directory to the FRA directory, followed by the database name,
followed by the archive log directory, as demonstrated here:
The easiest approach is to remove the oldest
directories. In this scenario, we can safely remove the 2012_09_19/,
2012_11_07/, and 2012_11_08/ subdirectories with the following command:
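A hedged sketch, assuming the archive logs live under +FRA/PROD/ARCHIVELOG:

ASMCMD [+FRA/PROD/ARCHIVELOG] > rm -rf 2012_09_19/ 2012_11_07/ 2012_11_08/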
If you are pressed for space and do not have
the luxury of removing days of archive logs, you can back up the archive
logs first to the file system and then remove the older backups. The
best way to back up and remove archive logs is by using RMAN with the
following command:
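A hedged example (the file system destination is an assumption):

RMAN> BACKUP ARCHIVELOG ALL FORMAT '/u01/backups/arch_%U' DELETE INPUT;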
This will simply back up all the archive logs
and delete them. You will immediately start noticing available space in
the FRA because this command will back up and delete five archive logs
at a time. Another option is to simply delete the old archive logs from
RMAN as well:
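For example, keeping the last two days of archive logs (the retention window is an assumption):

RMAN> DELETE NOPROMPT ARCHIVELOG UNTIL TIME 'SYSDATE-2';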
Moving the Control Files to ASM
DBAs often struggle trying to move control
files from the file system to the ASM disk groups. The first thing to be
aware of is that the database should be in nomount mode when these
activities occur. Second, there should be at least one control file copy
per ASM disk group. Some DBAs often make the mistake of creating three
control files in the DATA disk group and forget to create a control file
copy for the FRA disk group. Instead of creating three control files
for the DATA disk group, the recommended architecture is to place
one control file in the DATA disk group and the second copy of the
control file in the FRA disk group. To move the control file to ASM, you
can do one of two things:
Copy the control file using the asmcmd cp command from the file system to the ASM disk groups.
Restore the control files to the ASM disk groups using RMAN.
As a preliminary step, restore the control
file from the file system to the ASM disk groups. You should issue the
restore command to each of the disk groups:
Repeat the restore step for the FRA disk group (in our example, +dba_pf101):
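For example, assuming the current control file lives on the file system and the database area disk group is +DATA (both assumptions), the two restores would look like this:

RMAN> RESTORE CONTROLFILE TO '+DATA' FROM '/u01/oradata/prod/control01.ctl';
RMAN> RESTORE CONTROLFILE TO '+dba_pf101' FROM '/u01/oradata/prod/control01.ctl';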
As an alternative option, starting with Oracle Database 11g
Release 2, we can take advantage of the asmcmd command in lieu of the
RMAN restore command and copy (cp) the files from the file system to the
ASM disk group:
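A hedged sketch (the source path, database name, and alias names are assumptions):

ASMCMD> cp /u01/oradata/prod/control01.ctl +DATA/PROD/CONTROLFILE/control01.ctl
ASMCMD> cp /u01/oradata/prod/control01.ctl +FRA/PROD/CONTROLFILE/control02.ctl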
As the final step, change the control_files
initialization parameter with the new location in ASM where these files
should be placed with the alter system command:
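A hedged example using the alias names from the sketch above:

SQL> ALTER SYSTEM SET control_files='+DATA/PROD/CONTROLFILE/control01.ctl','+FRA/PROD/CONTROLFILE/control02.ctl' SCOPE=SPFILE SID='*';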
In the preceding examples, we restored the
control files and specified the target location in our RMAN command.
Instead, we can modify the location target of the control files in our
CONTROL_FILES initialization parameter and simply issue the following
command:
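A hedged example, restoring from the existing file system control file (path is an assumption):

RMAN> RESTORE CONTROLFILE FROM '/u01/oradata/prod/control01.ctl';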
This command will restore the control files to all the locations specified in the initialization parameters.
As a final step, you will need to shut down and restart the database with the new control files in ASM.
Starting with Oracle Database 11g
Release 2, the control file backup and snapshot control file need to be
located on shared storage. Shared storage can be in the ASM disk group,
NFS, or ACFS. In a RAC configuration, any instance in the cluster can
write to the snapshot control file. For this reason, the snapshot
control file needs to be visible to all instances. You will encounter an
“ORA-00245: control file backup failed; target is likely on a local
file system” error while backing up the control file using
SQL*Plus/RMAN if an auto-backup of the control file is configured on a
nonshared location. You need to move the snapshot control file to an ASM
disk group:
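A hedged example (the disk group and snapshot file name are assumptions):

RMAN> CONFIGURE SNAPSHOT CONTROLFILE NAME TO '+RECO_EXAD/PROD/snapcf_PROD.f';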
You also need to back up the control file to
shared storage. To back up the control file to the FRA disk group
(+RECO_EXAD on Exadata), issue the following command:
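A hedged example (the tag name is an assumption):

RMAN> BACKUP CURRENT CONTROLFILE TAG 'ctl_fra_bkup' FORMAT '+RECO_EXAD';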
Notice that the backup control file’s name includes the tag specified in the control file backup.
Use Cases for DBAs
This section describes how various use-case
scenarios affect DBAs who manage ASM configurations as well as
illustrates some specific tasks when ASM comes into the picture.
Impact of DB_UNIQUE_NAME for the ASM Directory Structure
The DB_UNIQUE_NAME plays a vital role in the
ASM directory structure. By default, the DB_NAME and DB_UNIQUE_NAME
initialization parameters are the same. The DB_UNIQUE_NAME dictates the
directory name used after the +DATA and +FRA disk group name. The
following directory structures are created below the DB_UNIQUE_NAME
directory:
+DATA/DB_UNIQUE_NAME/datafile
+DATA/DB_UNIQUE_NAME/controlfile
+DATA/DB_UNIQUE_NAME/parameterfile
+DATA/DB_UNIQUE_NAME/onlinelog
+DATA/DB_UNIQUE_NAME/changetracking
+DATA/DB_UNIQUE_NAME/tempfile
+FRA/DB_UNIQUE_NAME/archivelog
+FRA/DB_UNIQUE_NAME/backupset
+FRA/DB_UNIQUE_NAME/controlfile
+FRA/DB_UNIQUE_NAME/onlinelog
So you ask, why is this important? This
allows you to create a physical standby database with a different
directory structure than the primary database. The DB_UNIQUE_NAME also
plays a critical role when we are manually cloning a database. If we
want to clone the PROD database to QA, we keep the DB_NAME as PROD but
change the DB_UNIQUE_NAME to QA during the RMAN restore process. Once
the database restore and recovery are complete, we rename the PROD
database to QA. At this point, we change the initialization parameters
for DB_NAME to match DB_UNIQUE_NAME and rebuild the control file.
Backing Up the Database to the FRA Disk Group
OK, you have backed up your database to the
+FRA disk group. You do not own a license of the media management layer
(MML), nor do you ever plan to purchase a license. So what do you do
now? Let’s start by reviewing our backup script to perform a full
database backup:
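The details differ per site, but a hedged sketch that writes to the FRA (the default destination when db_recovery_file_dest is set) and uses the FULL_DB_BACKUP tag referenced below might look like this:

RMAN> BACKUP DATABASE TAG 'FULL_DB_BACKUP';
RMAN> BACKUP ARCHIVELOG ALL TAG 'FULL_DB_BACKUP';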
By querying the RMAN repository, we can collect high-level information about the backup that was performed:
Taking advantage of the tag from the full
database backup, we recognize that the FULL_DB_BACKUP backup set
includes the primary keys 946, 947, 948, 949, and 950. Leveraging the
primary keys of the backup set, we can perform a backup of the backup
set to the file system. You can perform a copy of the backup set or the
image copy to the backup file system using this command:
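A hedged sketch, using the backup set keys mentioned above (the file system destination is an assumption):

RMAN> BACKUP BACKUPSET 946,947,948,949,950 FORMAT '/u01/backups/PROD/%U';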
A preferable approach would be to leverage the tag again to perform the backup of the backup set to the file system:
As you will notice throughout this chapter,
RMAN tags are heavily utilized to assist and simplify our backup and
restore infrastructure. The same logic applies to image copies that we
apply to the FRA disk group. To copy database images from the FRA disk
group to the file system, issue the following RMAN command:
Specifying Default Locations for Data Files and Redo Logs
The db_create_file_dest parameter should
always be set to the DATA disk group, where the database data/temp files
reside. Instead of specifying a location of the database file during
tablespace creation or adding a data file to a tablespace, we can
leverage the default location for all newly created data files. The
db_create_online_log_dest_1 and db_create_online_log_dest_2 parameters
refer to redo logs and play an important role as well. Here are the
three initialization parameters being investigated:
Because we have our db_create_file_dest parameter set, we can issue a simple command to create a tablespace:
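For example:

SQL> CREATE TABLESPACE asm_default_test;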
Notice that no location option was specified.
Automatically, the asm_default_test tablespace will be created in the
DATA disk group (+DATA_EXAD) with a 100MB data file. Because auto-extend
is automatically enabled, this newly created data file will be able to
auto-extend to approximately 32GB.
Adding a data file to a tablespace has never been so easy. We simply issue the following command:
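For example:

SQL> ALTER TABLESPACE asm_default_test ADD DATAFILE;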
Again, a 100MB data file will be created in
the location specified in the db_create_file_dest parameter. Like
everyone else who manages large databases, you want to be able to specify
32GB data files. The syntax is extremely simple for that, too:
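For example (32767MB is the practical maximum for a smallfile tablespace with an 8KB block size):

SQL> ALTER TABLESPACE asm_default_test ADD DATAFILE SIZE 32767M;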
Not only does the db_create_file_dest
parameter save time in creating/altering tablespaces, it can also
mitigate potential fat-fingering during the tablespace growth process.
If the + sign is left out accidentally during a tablespace growth
process, the new data file would locate itself in the $ORACLE_HOME/dbs
directory, as shown here:
Other RAC instances will report foul play on
the newly created data file because they will not be able to read from or
write to the data file. To get yourself out of this situation, you can copy the
data file from the file system back to ASM with either RMAN or asmcmd
after offlining the data file. Once the data file is copied into ASM,
you can switch the data file to copy and initiate a data file recovery.
Once data file recovery completes, you can online the data file for
production usage.
So far we have covered default locations for
data files. The two parameters, db_create_online_log_dest_1 and
db_create_online_dest_2, specify the default locations for online redo
logs. For the RAC instance, we can create another redo log group with
the size 250MB:
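A hedged sketch (the thread and group numbers are assumptions; the members are created automatically in the two default online log destinations):

SQL> ALTER DATABASE ADD LOGFILE THREAD 1 GROUP 30 SIZE 250M;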
Once the redo log is created in our +DATA_EXAD
and +RECO_EXAD disk groups, we can issue a query against the v$logfile
and v$log views to retrieve the location of our redo log group:
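For example:

SQL> SELECT l.group#, l.thread#, l.bytes/1024/1024 AS mb, lf.member
       FROM v$log l JOIN v$logfile lf ON l.group# = lf.group#
      WHERE l.group# = 30;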
As you can see, redo log group 30 was
successfully created in the +DATA_EXAD and +RECO_EXAD disk groups. The
default locations for the redo logs will come in extremely handy as we
clone databases with RMAN. When we issue the “alter database open
resetlogs” command, all the redo logs will be created in the
db_create_online_log_dest_1 and db_create_online_log_dest_2 disk groups.
Copying Files from One ASM Instance to Another
What do you do if you need to ship archive
logs from the primary database to the standby database? What do you do
if you need to ship the backups from the primary RAC cluster on the
primary data center to the disaster recovery data center physical
standby RAC cluster and you do not have a file system to stage your
backups? Oracle Database 11g Release 1 provided the feature to
copy files from an ASM disk group to another ASM disk group with the
asmcmd cp command. We can leverage this technique, but for a majority of
DBAs, an easier approach would be to copy the archivelog backups from
source to target destination using the standard OS scp command.
The classic problem that many larger companies
face today is that they have multiterabyte databases, or even
databases that are hundreds of terabytes in size. Obviously, if you have a
multiterabyte database, you can’t copy that over the network, especially
over a WAN. In this case, you will be backing up to tape and restoring
the backup. If you happen to own the Media Management Layer (MML)
license, you can restore directly to ASM. If you do not, ASM Clustered
File System (ACFS) plays a vital role in this scenario. Like a local
file system or a network file system, ACFS can be used as the
destination for disk-to-disk backup. You can ask the system
administrators to sweep the ACFS file system to tape. On the target Data
Guard server, the system administrator can restore the backup taken
from the primary data center to the ACFS or local file system.
Lots of companies have spare space on the ASM
disk groups. We can easily carve a volume out of an ASM disk group and
instantly create an ACFS file system on demand as needed. With ACFS, you
can dynamically resize an ACFS file system as needed to meet your space
requirements. For larger companies, the time to perform the backup to
tape, collect and ship the tapes to the disaster recovery site, add the
tapes to the tape array, and restore the tape content to the ACFS file
system can take days. During this time, we need to manually ship the
archive logs from the primary data center to the disaster recovery data
center. To make the process simpler, we want to ship archive logs from
the primary data center FRA ASM disk group (+ASM1 instance) to the
disaster recovery site FRA ASM disk group (+ASM1). The tedious tasks of
performing RMAN backups of the archive logs to the ACFS, copying the
archive logs to the ACFS file system on the disaster recovery site, and
restoring the archive logs will overburden most of the DBAs. We need a
much simpler approach to transferring the archive log from the primary
data center to the disaster recovery data center.
We can copy Oracle-related files from one ASM
instance to another ASM instance using the asmcmd cp command. The syntax
to perform the copy command is as follows:
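In its general form (the exact syntax varies by version; the user, host, instance, and angle-bracketed placeholders here are assumptions), the remote copy looks something like this:

asmcmd cp +FRA/PROD/ARCHIVELOG/<date>/<archive_log_file> \
  sys@<target_host>.+ASM1:+FRA/ARCH/PROD/<archive_log_alias>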
If you are pushing archive logs, you will want
to push increments, either by the day or incrementally throughout the
day, of the delta archive logs that were generated since the last push
of the archive log files. The following script does exactly that:
In this script example, replace the ARCH_DATE
using the format YYYY_MM_DD with the actual date for the archive logs we
are interested in transferring. The push_passwd variable represents the
password for the user account called dgmenu. Instead of using the SYS
account, we created a user account called dgmenu with the proper sysasm
privileges on the ASM instance. The logic behind this decision is to
drop or disable the dgmenu account after the archivelog propagation
exercise is complete. The shell script generates a file called
push_arch_asm.list.[i] that houses the list of archive log names that
match for the day. The following output only lists the first three
archive logs in question:
Once the script identifies the archive logs to
push by listing out the directory contents, it composes the remainder
of the syntax. Because this script has the requirements of copying the
deltas of the archive logs since the last push, it performs a diff
between the current push_arch_asm.list file and the latest file it
detects. You will also notice that the password for the privileged user
account with the sysasm privilege is hard-coded in the script. The
passwords are not echoed back during the execution of the asmcmd cp command.
Prior to starting to send the archive logs to
the disaster recovery site, you will need to manually create the
directory structures on the ASM +FRA disk group. From the asmcmd command
prompt, cd to +FRA and then mkdir ARCH; from the ARCH directory, mkdir
PROD. This will complete the setup required on the target ASM instance.
Obviously, the directory structure can be different and most likely will
be different in your environment.
Capture the output of the preceding shell
script, redirect the output to another shell script, and execute the
shell script in no-hang-up mode (nohup) in the background. Executing the
new captured shell script will yield the following output:
As the archive logs are being shipped, you can
check the +FRA/ARCH/PROD directory and issue the du command to see
space consumption increase on the directory. One very important thing to
note is that we intentionally do not fully qualify the target filename;
we cannot provide the file.incarnation file/incarnation pair, which is
used to ensure uniqueness. If the target filename is fully qualified,
you will encounter an ASMCMD-8016 error. The cp command will fail if you
fully qualify the target filename because the ASM filename is not a
single-file creation form that can be used to create a single file. The
filename should not contain the file number/incarnation:
Another important piece of information to
mention is that the target filename generates an alias rather than using
the system-generated filename. Notice the filename alias for
thread_2_seq_2455:
How to Migrate from File System to ASM with Minimal Downtime
You can use a couple of clever techniques to
move a database from file system to ASM with minimal downtime. One
out-of-the-box solution is to leverage Data Guard in the equation and
simply perform a switchover to the database on ASM. Another option is to
leverage RMAN image copies with forever incremental updates. In this
section, we cover the option to perform incremental updates to image
copy backups.
Because we can perform our due diligence and
prepare our ASM environment ahead of time, we can significantly reduce
our outage window. No matter what the size of the database is—500GB,
5TB, or even 50TB—we can cut our downtime to less than 30 minutes, even
when moving the largest databases out there.
With the Data Guard approach, we have to worry
about the database being in force logging mode. If the database is not
in force logging mode, we will have to go back to development and
business to investigate batch jobs, scheduled maintenance, and any
specific tasks executed in unrecoverable or nologging mode. If we
identify any jobs running in nologging mode, we will have to modify the
jobs or come to the realization that we cannot enable force logging mode quite
yet. On the other hand, the Data Guard option does provide an added
advantage because we can switch back over to the file system if the
storage on the ASM does not meet the IOPS and throughput requirements of
the database workload.
To start our migration to the ASM, we must
first perform an RMAN image copy of the database to the DATA disk group.
This step will be the longest-running step of the process. We can
expect to see about 1TB an hour for the RMAN image copy to ASM. For
performance considerations, you can increase the number of RMAN channels
from six to eight. To perform an RMAN image copy of the database to
ASM, execute the following command:
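A hedged sketch of the level 0 image copy (the tag name ASM_MIG is an assumption):

RMAN> BACKUP AS COPY INCREMENTAL LEVEL 0 DATABASE FORMAT '+DATA' TAG 'ASM_MIG';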
Notice the “tag=” embedded in the syntax to
identify the backup. Tags are crucial to the forever incremental update
strategy. If we do not specify a user-friendly tag, RMAN will generate a
tag automatically and assign it to the backup.
Another tidbit to add is that the RMAN image
copy is an exact copy of the database files. The size of the RMAN image
copy will be identical to the size of the database. You cannot perform a
compressed RMAN image copy.
The RMAN image copy can be taken a week prior
to cutover, several days prior to cutover, or even the night before
cutover. You will have to make the decision based on how long it takes
and how comfortable you are with the process. The closer you take the
RMAN image copy to the cutover date, the fewer incremental backups and
updates you will have to perform. Once we have successfully performed
our level 0 image copy, we can proceed with performing the level 1
incremental backup:
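A hedged sketch, reusing the same tag and writing the incremental to the file system (paths and tag are assumptions):

RMAN> BACKUP INCREMENTAL LEVEL 1 FOR RECOVER OF COPY WITH TAG 'ASM_MIG' DATABASE FORMAT '/u01/backups/%U';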
In the level 1 incremental backup, notice that
we specify that it is to be used to recover our level 0 image copy
backup. Also, pay particular attention to the tag being used. The tag is
the same one we specified in our level 0 image copy backup. For backup
simplicity purposes, we are placing the level 1 incremental backups in
the file system. Next, we perform the incremental update, which is
really a recovery to our database image copy to bring the database to a
state that’s current with the production database. Once the update
process is complete, we can perform our incremental backup again. This
is the process we will repeat, thus the marketing term “forever
incrementals.” Let’s see what the syntax for the update image copy
process looks like:
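A hedged sketch (the tag name carries over from the earlier assumed ASM_MIG tag):

RMAN> RECOVER COPY OF DATABASE WITH TAG 'ASM_MIG';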
Notice that the tag is referenced again in the
recovery process as well. We will need to repeat steps 2 and 3 until
cutover time. Furthermore, we will want to repeat steps 2 and 3 as much
as possible, at a minimum of once per day. On the day of cutover, we
will want to repeat steps 2 and 3 often. As we near the cutover window,
we will want to perform another incremental update to the database image
copy. At the time of cutover, we will want to shut down the database
and bring it back up in mount mode. As the last step, we will perform
the final incremental update:
Now we are ready to switch our database to the
image copy. Old-timers would spend time generating the script to alter
the database rename file from source to target for each data file. We
can rename data files for the entire database with a single switch
database command:
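With the database mounted, the single command looks like this:

RMAN> SWITCH DATABASE TO COPY;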
We are almost there. We need to issue the
recover database command to get our database on ASM to be completely
consistent and then open the database:
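For example:

RMAN> RECOVER DATABASE;
RMAN> ALTER DATABASE OPEN;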
We still have some cleanup to do. We still
need to move the redo logs (really, create new groups and drop old
groups) to ASM, move the control files to ASM, move the spfile to ASM,
and move our temporary tablespace to ASM. Of the steps listed earlier,
the control file and spfile relocation efforts need an outage window.
All other activities can be done while the database is online. We
encourage you to perform the redo log activity prior to the cutover
weekend. The same logic applies for the temp tablespace move to ASM as
well. Because we can easily ping-pong back and forth between ASM and the
file system, you should reduce the stress during the cutover weekend
and perform these two tasks beforehand. Oracle fully supports a hybrid
ASM and file system database placement configuration. The cookbook
approach to moving the redo logs and temporary tablespaces to ASM is
available in the subsequent sections of this chapter.
How to Move the Temporary Tablespace to ASM
The temporary tablespace also needs to be
moved from file system to ASM. Remember that RMAN does not back up the
temporary tablespace. The easiest way to move the temporary tablespace
to ASM is to create a new tablespace and move all the database accounts
to use the new temporary tablespace:
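A hedged sketch (the tablespace name temp_asm and its size are assumptions):

SQL> CREATE TEMPORARY TABLESPACE temp_asm TEMPFILE '+DATA_EXAD' SIZE 10G;
SQL> ALTER DATABASE DEFAULT TEMPORARY TABLESPACE temp_asm;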
If a database account was created specifying
the temporary tablespace explicitly, you will have to modify these
accounts to use the newly created temporary tablespace. In the following
example, we are pointing the system account's default temporary tablespace to
the new temporary tablespace:
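For example (the tablespace name temp_asm carries over from the sketch above):

SQL> ALTER USER system TEMPORARY TABLESPACE temp_asm;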
As the final step, we need to drop the
previous temporary tablespace. If the drop of the temporary tablespace
takes too long, you can query the V$SORT_USAGE view for consumers of the
temporary tablespace.
Alternatively, we can add the new temp files
to ASM and drop the old temp files. First, we will query the V$TEMPFILE
dictionary view to identify the temp file we want to drop from the file
system:
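For example:

SQL> SELECT file#, name, bytes/1024/1024 AS mb FROM v$tempfile;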
In this example, we will add another temp file
to the +DATA_EXAD disk group and drop the old temp file that existed on
the file system:
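A hedged sketch (the old temp file path and the new temp file size are assumptions):

SQL> ALTER TABLESPACE temp ADD TEMPFILE '+DATA_EXAD' SIZE 10G;
SQL> ALTER TABLESPACE temp DROP TEMPFILE '/u01/oradata/prod/temp01.dbf';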
How to Move the Spfile to ASM
Moving the spfile to ASM is a simple
process. We can leverage several techniques to achieve our end goal. We
can create the spfile from pfile and specify the location inside ASM,
restore the spfile from our last backup to ASM, or copy the spfile with
asmcmd to the DATA disk group. The easiest way is to leverage asmcmd:
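A hedged example (the source path and database name are assumptions):

$ asmcmd cp /u01/app/oracle/product/11.2.0/dbhome_1/dbs/spfileprod.ora +DATA_EXAD/PROD/spfileprod.ora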
Once you copy the spfile from the file system to ASM, you need to update the contents of the OCR to reflect the new location:
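A hedged example for a database named prod:

$ srvctl modify database -d prod -p '+DATA_EXAD/PROD/spfileprod.ora'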
As a precautionary measure, you can shut down
and restart the database to make sure you are able to restart the
database with the spfile inside of ASM:
As a general rule, you may want to consider creating a backup pfile to a file system before touching the spfile.
How to Move a Tablespace from One Disk Group to Another
Often we have to move a tablespace or data
file from one disk group to another because we either ran out of space
in the existing disk group or the existing disk group does not provide
the level of IOPS and throughput necessary for the database workload.
To move a tablespace from one disk group to
another, we must take tablespaces offline, copy the data files from one
disk group to the other, rename the data files for the tablespace, and
bring the tablespace back online. For large tablespaces, the amount of
downtime required to copy data files from one disk group to another can
take hours or even days. We can introduce tablespace recovery in the
equation to reduce the downtime. We can copy the tablespace from one
disk group to another, rename the data files for the tablespace, and as
the final step recover the tablespace to forward the System Change
Number (SCN) to be consistent with the database. Again, for small
tablespaces, this approach is perfectly acceptable because the amount of
downtime will be minimal. The amount of downtime to perform the final
recovery step can be excessive when we are talking about rolling forward a
half day or a full day’s worth of database activity.
As DBAs working with ASM, especially working
in the RAC world, we should have high availability and minimal downtime
in mind with everything we do. To move a tablespace with minimal
downtime, we need to introduce incremental backups to reduce the
downtime. In our test case, we expose four scripts: r1.rman,
r2.rman, r3.rman, and rf.rman. Here are the steps:
1. r1.rman performs the image copy of the tablespace.
2. r2.rman performs the incremental backup for the tablespace.
3. r3.rman performs the incremental
update (recovery) in the new tablespace on the new disk group. We can
perform steps r2.rman and r3.rman as many times as needed to reduce the
downtime required for rf.rman.
4. rf.rman, which stands for RMAN
final, executes the final step, where we take the tablespace offline,
switch the tablespace to copy, recover the new tablespace on the new
disk group, and bring the tablespace back online.
In the r1.rman script, we simply copy the tablespace called ck as an image copy from the DATA disk group to the RECO disk group:
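A sketch of what r1.rman might contain, using an illustrative tag name of ck_move:

RMAN> run {
        backup as copy tag 'ck_move' tablespace ck format '+RECO';
      }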
As a second step, we will perform an
incremental backup for the tablespace. In our test case, we perform the
incremental backup to the file system on /u01. The bkups subdirectory is
a symbolic link that points to the NFS mount point. Alternatively, we
can perform the backup to an ASM disk group:
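A sketch of what r2.rman might contain, assuming the same ck_move tag and the /u01/bkups destination mentioned above:

RMAN> run {
        backup incremental level 1 for recover of copy
          with tag 'ck_move' tablespace ck
          format '/u01/bkups/%U';
      }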
As a third step, we perform our incremental
update to the new ck tablespace in the new disk group. Tags are
instrumental in performing incremental updates.
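A sketch of what r3.rman might contain; the ck_move tag ties the incremental backup to the image copy being rolled forward:

RMAN> run {
        recover copy of tablespace ck with tag 'ck_move';
      }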
We can repeat steps r2.rman and r3.rman as
often as we need to keep the newly copied tablespace as current as
possible. When we can get a brief outage window or when we can quiesce
the tablespace, we can perform the rf.rman final step. As you can see
next, the final step takes the tablespace offline, switches the
tablespace to the copy image that we have been recovering, recovers the
new tablespace, and brings the new tablespace on the new disk group
online:
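A sketch of what rf.rman might contain, again assuming the ck_move tag:

RMAN> run {
        sql 'alter tablespace ck offline';
        switch tablespace ck to copy;
        recover tablespace ck;
        sql 'alter tablespace ck online';
      }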
Executing the rf.rman final step yields the following output:
After we execute rf.rman as the final step,
the tablespace is now running from the new disk group. You can leverage
this same technique to do the following:
Move a data file from one ASM disk group to another disk group
Move a tablespace from file system to an ASM disk group
Move a data file from file system to an ASM disk group
Create Multiplexed Redo Logs
From time to time, DBAs need to resize redo
logs for optimal performance. The size of online redo log files depends
on the amount of redo the database generates. As a general rule, we
should not see a log switch more than once every 20 minutes. If you have
excessive redo log switches (more than three or four per hour),
you should consider creating new redo groups and dropping the old redo
groups. Unfortunately, we cannot resize existing online redo logs. It is
reasonable to see a redo log in the range of a hundred megabytes to
several gigabytes in size. To create new redo logs, you can leverage the
alter database add logfile command:
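For example, to add a multiplexed group to thread 1 with members in the DATA and FRA disk groups (the group number and size are illustrative):

SQL> ALTER DATABASE ADD LOGFILE THREAD 1 GROUP 5 ('+DATA','+FRA') SIZE 1G;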
As a best practice, we should multiplex redo
log groups with one member in the DATA disk group and one in the FRA disk
group, and we should have at least four groups for each thread.
Standby redo logs (SRLs) do not need to be
multiplexed. We just need one more SRL than the total number
of redo groups for the entire database. An SRL should be created on the
fastest disk group possible; it does not need to belong to a specific
thread. SRLs should be created on both the primary and standby
databases. As a best practice, SRLs should be created on the primary
database prior to the Data Guard instantiation process. SRLs will
automatically be copied to the physical standby. The syntax to create an
SRL looks like this:
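For example (size illustrative; SRLs are typically sized to match the online redo logs):

SQL> ALTER DATABASE ADD STANDBY LOGFILE '+DATA' SIZE 1G;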
Dropping existing redo log groups is a little
trickier. You just need to make sure that the redo group you are
dropping is not current or active. If it is not active, you can drop the
redo log group with the following command:
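For example (group number illustrative):

SQL> ALTER DATABASE DROP LOGFILE GROUP 3;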
To switch between redo logs, you can consider leveraging one of these three commands:
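The exact commands are not reproduced here; likely candidates include the following, depending on whether you want to switch only the current thread or force archiving across all threads:

SQL> ALTER SYSTEM SWITCH LOGFILE;
SQL> ALTER SYSTEM ARCHIVE LOG CURRENT;
SQL> ALTER SYSTEM ARCHIVE LOG ALL;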
Summary
ASM is a storage management solution from
Oracle Corporation that makes the storage management layer simpler and more
flexible. In this chapter, we discussed this technology, and
you learned how the ASM storage management solution differs from other
solutions available on the market as well as how ASM complements RAC.
This chapter covered using ASM with an RDBMS
instance, starting with the basic installation and configuration through
maintenance administration and performance monitoring. While
configuring the RDBMS instance using DBCA, we followed the best practice
of using two disk groups—one for data and the other for FRA. Regardless
of whether the database is a clustered solution or a single instance,
the configuration and deployment of ASM are no different. We covered the
various methods in which data can be migrated from a non-ASM-based
Oracle database into ASM. Finally, we looked into the monitoring
options.
CHAPTER
6
ASMLIB Concepts and Overview
Early in the
design of ASM, Oracle decided that an alternative to the standard
operating system interface for device management, disk discovery, and
provisioning was necessary and that this mechanism needed to be easily
integrated into ASM. The core concept was to produce a storage
management interface called ASMLIB. ASMLIB is not required to run ASM;
it is instead an add-on module that simplifies the management and
discovery of ASM disks.
This chapter covers the ASMLIB layers and their corresponding benefits.
Benefits of ASMLIB
The objective of ASMLIB is to provide a more
efficient mechanism for managing disks and input/output (I/O)
processing of ASM storage. ASMLIB provides a set of interdependent
functions that are implemented in a layered fashion. From an
implementation perspective, these functions are grouped into two sets of
functions:
Device discovery
I/O processing
Each function group is dependent on the
existence of the lower-level group. For example, device discovery
functions are the lowest-layer functions.
Device Discovery
Device discovery identifies and names the
storage devices that are operated on by higher-level functions. The
discovery function makes the characteristics of the disk available to
ASM. Disks discovered through ASMLIB do not need to be available through
normal operating system interfaces.
The device discovery function takes a
discovery string as input from the user and then performs a disk
discovery. The output from this procedure is a set of all qualified
disks. In a Real Application Clusters (RAC) environment, the discover
procedure executes the discovery function on the node where the
procedure was issued.
Disk Management and Discovery
ASMLIB automatically takes the ASM disk name
from the name that the ASMLIB administrative tool provides. This
simplifies adding disks and correlating OS names with ASM names; also,
pre-naming disks eliminates erroneous disk management activities. The
user can manually change the name by using the NAME clause while issuing
a CREATE DISKGROUP or ADD DISK statement. The default discovery string
for ASM is NULL, so with ASMLIB in place, all disks are automatically
discovered if NULL is kept as the discovery string. Because the user
does not even need to modify the ASM_DISKSTRING, this makes disk
discovery much more straightforward.
The ASMLIB permissions are persistent across
reboot, and in the event of major/minor number changes, ASMLIB will
continue to find its labeled disks. In RAC environments, the disk
identification and discovery process is similar to that of a
single-instance environment. Once the disks are labeled on one node, the
other clustered nodes simply use the default disk discovery string, and
discovery is seamless.
I/O Processing
The current standard I/O model (one where
ASMLIB is not implemented) imposes OS central processing unit (CPU)
overhead partly because of mode and context switches. The deployment of
ASMLIB reduces the number of state transitions from kernel-to-user mode
by employing a more efficient I/O scheduling and call processing
mechanism. A single call to ASMLIB can submit and reap multiple I/Os.
This dramatically reduces the number of calls to the operating system
when performing I/O. Additionally, all the processes in an instance can
use one I/O handle to access the same disk. This eliminates multiple
open calls and multiple file descriptors.
The ASMLIB I/O processing function provides
an optimized asynchronous interface for scheduling I/O operations and
managing I/O operation completion events. These functions, in effect,
extend the operating system interface. Consequently, the I/O processing
functions must be implemented as a device driver within the operating
system kernel.
NOTE
ASMLIB does not use the kernel asynchronous
(async) I/O calls, such as the standard io_submit(2) and io_getevents(2)
system calls. In fact, it does not use kioctx types at all. ASMLIB uses
the interface provided by the ASMLIB kernel driver. Async I/O is
automatically used with ASMLIB.
Reduced Overhead
ASMLIB provides the capability for a process
(ASM RBAL) to perform a global open/close on the disks that are being
dropped or closed. Every file and device currently open on the system
has an open file descriptor, and the system has a finite number (defined
at the system level) of these file descriptors. ASMLIB reduces the
number of open file descriptors on the system, thus making it less
likely that the system will run out of global file descriptors. Also,
the open and close operations are reduced, ensuring orderly cleanup of
file descriptors when storage configurations change. A side benefit of
this capability is that the database starts up faster. Without ASMLIB,
file descriptors often cannot be cleaned up for dropped disks until ASM
instances are shut down. This is because idle database foreground
processes, which have open file descriptors, never get around to closing
file descriptors.
ASMLIB performance benefits are evident when
the CPU usage on the server is very high (85 to 90 percent); this is
when ASMLIB’s reduced context switches and efficient reaping of I/Os
really help to minimize CPU overhead. Also, for very big databases with a
large number of open data files, ASMLIB handles file descriptors
efficiently, reducing the number of open file descriptors on the
system and making it less likely that the system will run out of global
file descriptors.
Although it helps reduce CPU overhead, ASMLIB
is foremost a device management tool used to simplify ASM disk
administration. This is especially evident when large clusters of four
or more nodes are being deployed. Once ASMLIB is used to mark the ASM
disks on one node, the remaining nodes in the cluster only need to scan and
discover the devices. The repetitive administrative and error-prone
tasks needed to manage OS devices for RAC are significantly minimized.
ASMLIB Implementation
This section will cover the essentials of ASMLIB packaging and implementation.
ASMLIB consists of two main components:
ASMLib This consists of a userspace library and its configuration
tools. Because the userspace library contains Oracle header files, it is
distributed as a binary-only module.
oracleasm.ko A kernel module that implements the ASM device interface, exposed via /dev/oracleasm.
Note that there is a single ASM library for
the various Linux platforms, and this library is opened by ASM and
Oracle Database processes. The library interacts with the OS
through /dev/oracleasm. Further note that the userspace library is
independent of the underlying OS version; however, the kernel module and
device driver are kernel and OS dependent.
ASMLIB Installation
Oracle’s ASMLIB software is available from the Oracle Technology Network (OTN) at http://otn.oracle.com. Users are encouraged to go to the ASMLIB site to download the appropriate Linux rpm files for their platform.
Three ASMLIB packages are available for each
Linux platform. For example, the following three packages are for the
Intel (x86_64) architecture (as of this writing):
Package | Description
oracleasm-support-2.1.7-1.el5.x86_64.rpm | This package contains the command-line utilities.
oracleasmlib-2.0.4-1.el5.x86_64.rpm | This is the core ASMLIB library.
oracleasm-2.6.18-274.7.1.el5-2.0.5-1.el5.x86_64.rpm | This is the kernel driver and is kernel dependent.
Choose the correct kernel driver rpm file
based on the output of the uname –r command on the system to be installed. For
example, if the OS is Red Hat Enterprise Linux 5 AS and the kernel is
2.6.18-274.7.1.el5, then accordingly choose the suite of packages
associated with this kernel:
oracleasm-2.6.18-274.7.1.el5-2.0.5-1.el5.x86_64.rpm
oracleasm-2.6.18-274.7.1.el5xen-2.0.5-1.el5.x86_64.rpm
oracleasm-2.6.18-274.7.1.el5debug-2.0.5-1.el5.x86_64.rpm
oracleasm-2.6.18-274.7.1.el5-debuginfo-2.0.5-1.el5.x86_64.rpm
The ASMLib kernel driver is included in the
Unbreakable Enterprise Kernel, thus no driver package needs to be
installed when using this kernel. However, the oracleasm-support and
oracleasmlib packages still need to be installed. These can be obtained
through the Oracle Unbreakable Linux Network (ULN), available at the
following location:
Note that the version number listed in the
ASMLib kernel package is the Linux kernel version that was used to
compile the driver. Install the required rpm files using the rpm –Uvh
command. For example, if the platform is the Intel x86_64 architecture
and the kernel is 2.6.18-274.7.1-el, then the following would need to be
installed:
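A sketch of the install command, using the package names listed earlier:

# rpm -Uvh oracleasm-support-2.1.7-1.el5.x86_64.rpm \
           oracleasmlib-2.0.4-1.el5.x86_64.rpm \
           oracleasm-2.6.18-274.7.1.el5-2.0.5-1.el5.x86_64.rpm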
Installing ASMLIB for Unbreakable Linux Network
For systems registered with Oracle’s
Unbreakable Linux Network, all the ASMLIB software can be downloaded
directly from ULN. This greatly simplifies ASMLIB installation and
upgrades.
The ASMLIB rpm file is part of the Oracle
Software for Enterprise Linux channel. A system must be added to this
channel to install the library package. This requires that users log in
at the ULN website, https://linux.oracle.com,
and select the Systems tab. This brings up a list of systems registered
with ULN. Select the system that will install the ASMLIB package. Then,
select the Subscribe button under the Oracle Software for Enterprise
Linux link. The system is now able to download the ASMLIB rpm file.
Once the system is registered with ULN, it
will automatically download (and optionally install) new upgraded
versions of all the ASMLIB packages via the up2date –u command. Log in
as root and run the following:
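A sketch of the command, assuming the standard ASMLIB package names (the exact invocation may differ by ULN release):

# up2date -u oracleasm-support oracleasmlib oracleasm-`uname -r`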
Upgrading ASMLIB
In certain cases the Linux kernel needs to be
upgraded. When this occurs, the ASMLIB kernel driver needs to be
upgraded as well. The oracleasm update-driver command can be used to
download and install the latest ASMLib driver for the installed kernel.
This program queries the Oracle Technology
Network and supports drivers available via the Unbreakable Linux Network
and on the Oracle Technology Network. If no kernel version is specified
and on the Oracle Technology Network. If no kernel version is specified
for the command, it will update the currently running kernel. If
multiple kernels are installed, this command can be used to update
multiple kernel versions.
To upgrade ASMLIB, the system administrator
can take the following steps. Note that this is strictly a software
upgrade and no disk labels are manipulated. Also, this is a local node
upgrade; therefore, for RAC environments, each individual node should be
upgraded.
1. Shut down ASM.
2. Execute /usr/sbin/oracleasm stop.
3. Execute /usr/sbin/oracleasm update-driver (see the sketch after step 5).
4. Execute /usr/sbin/oracleasm start.
5. Start up ASM.
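A sketch of the full sequence on one node; how ASM itself is stopped and started depends on your configuration (for example, srvctl or SQL*Plus), and the node name here is illustrative:

$ srvctl stop asm -n node1        # or shut down ASM from SQL*Plus
# /usr/sbin/oracleasm stop
# /usr/sbin/oracleasm update-driver
# /usr/sbin/oracleasm start
$ srvctl start asm -n node1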
Setting Up and Configuring ASMLIB
Once the appropriate Linux rpm files are
installed, the ASMLIB installation places a utility called oracleasm in both
the /usr/sbin and /etc/init.d directories. As a best practice, the
/etc/init.d/oracleasm utility should be used only to configure, enable,
and disable ASMLIB, while the /usr/sbin/oracleasm utility should be used as
the front-end tool for all device management activities, such as creating,
deleting, listing, and scanning disks. The next step is to set up and
configure ASMLIB. Typically, a system administrator with root access
performs this step.
Execute the /etc/init.d/oracleasm script with
the configure option. This sets up and configures ASMLIB for the system.
You must execute this command on each node of the RAC cluster. In our
example, ASM is running as the grid user and the asmadmin group:
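For example, run the following as root and answer the prompts with grid as the owning user, asmadmin as the owning group, and yes for loading the driver and scanning disks on boot (the exact prompt wording varies by version):

# /etc/init.d/oracleasm configure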
At the end of this configuration, a virtual
file system (specific to ASMLIB) called /dev/oracleasm is created and
mounted. This file system has two subdirectories called disks and iid.
The disks subdirectory lists the ASMLIB disks scanned and present on the
system. The iid directory contains metadata information about all
instances (ASM and databases) that have registered with ASMLIB. These
directories are for internal use only and should not be manipulated in
any way by users.
Finally, the configuration loads the
oracleasm.ko driver module and mounts the ASM driver file system, because
on-boot loading of the module was selected during the configuration.
Now that ASMLIB is installed, the automatic
startup of ASMLIB can be enabled or disabled manually with the enable
and disable options. Using the /etc/init.d/oracleasm utility, the
process runs as follows:
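For example:

# /etc/init.d/oracleasm enable
# /etc/init.d/oracleasm disable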
Managing Disks in ASMLIB
After the storage administrator provides
disk devices to the server, they become available to the server and can
be seen in the /proc/partitions virtual file on Linux.
oracleasm requires that disks have a disk
partition defined. However, oracleasm maintains a whitelist of devices
that do not need to be partitioned; this whitelist includes
RAM disks and device mapper (DM) devices.
Creating ASMLIB Disks
Once the system administrator appropriately
partitions the disk devices using the Linux fdisk utility, each disk
device is ready to be configured as an ASMLIB disk (in other words, the
system administrator creates an ASMLIB disk). The administrator does
this using the oracleasm createdisk command. The createdisk command
takes two input parameters—the user-defined disk name followed by the
device name—as follows:
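For example, to label the partition /dev/sdb1 as VOL1 (both names are illustrative):

# /usr/sbin/oracleasm createdisk VOL1 /dev/sdb1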
Disk names are limited to 24 characters. They
must start with a letter but may otherwise consist of any ASCII
characters, including capital letters, numbers, and underscores.
Each disk successfully created is considered a marked ASMLIB disk and is listed in the oracleasm file system, /dev/oracleasm/disks.
You can query disk devices to determine
whether they are valid ASMLIB disks. To do so, use the following
oracleasm querydisk commands. The querydisk command is applicable to
both the raw device and the ASMLIB disk. The oracleasm querydisk command
has two options: the -d and -p options. The -d option displays the
device number of the queried disk. The -p option attempts to locate
matching device paths if the blkid(8) tool is installed.
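For example, using the illustrative disk name VOL1:

# /usr/sbin/oracleasm querydisk -d VOL1
# /usr/sbin/oracleasm querydisk -p VOL1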
Alternatively, all the disks marked and created using ASMLIB can be listed using the following oracleasm listdisks command:
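For example:

# /usr/sbin/oracleasm listdisks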
If a disk device is not an ASMLIB disk, the following is displayed:
Disks that are no longer needed by ASM can be unmarked and deleted using the oracleasm deletedisk command:
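For example, using the illustrative disk name VOL1:

# /usr/sbin/oracleasm deletedisk VOL1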
When ASMLIB is deployed in a RAC environment,
the shared disk architecture of RAC allows the createdisk command to be
run on only one node; all other nodes of the RAC cluster simply need to
pick up the disk list via an ASMLIB scan command. For example, in a
two-node RAC cluster, node 1 can mark an ASMLIB disk and node 2 can then
execute a scan of the ASMLIB disks to discover the marked ASMLIB disks.
This process mitigates the need to discover and set up the disks
manually on each node and provides each disk with a unique clusterwide
name.
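A sketch of the two-node flow (node and disk names are illustrative):

[root@node1]# /usr/sbin/oracleasm createdisk VOL1 /dev/sdb1
[root@node2]# /usr/sbin/oracleasm scandisks
[root@node2]# /usr/sbin/oracleasm listdisks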
NOTE
The ASMLIB scandisks command should not be
used against devices that are currently open and in use (that is, are
part of an open disk group). Running this command against open disks
will close and reopen them, which causes the disk group to be dismounted
and any database using the disk group to abort. Users should instead
run oracleasm scandisks specifically for any newly configured or
modified disks (for example, oracleasm scandisks /dev/sdxx).
After the appropriate ASMLIB disks are
created, the ASM init.ora parameter ASM_DISKSTRING can either be left
with the default (NULL) or set to ‘ORCL:*’. Once this is set, ASM will
discover the disks as follows:
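For example, a query such as the following, run from the ASM instance, shows the library and path of each discovered disk:

SQL> SELECT name, library, path FROM v$asm_disk;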
Note that if ASMLIB is not being used, then
the preceding query returns “System” under the LIBRARY column. See the
“Troubleshooting ASMLIB” section in this chapter for more details.
Renaming Disks
The renamedisk command is used for changing
the label of an existing member without losing data. Note that the
modification of the disk must be done while ASM is not accessing the
disk. Therefore, the disk group must be dismounted, and in a RAC
environment, all ASM nodes in the cluster must have dismounted the disk
group as well. Corruption may result if a renamedisk operation is done
while any ASM instance is accessing the disk to be relabeled. Because
this renamedisk command is dangerous, ASM indicates this by printing out
a gentle message after the command is executed:
The renamedisk command takes two parameters: the raw device name followed by the ASM disk name.
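For example, to relabel the illustrative device /dev/sdc1 as VOL2 (only with the disk group dismounted everywhere):

# /usr/sbin/oracleasm renamedisk /dev/sdc1 VOL2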
Discovering Disks
The command /usr/sbin/oracleasm-discover is a
simple utility that determines which devices Oracle’s Linux ASMLIB sees
during discovery. This command is more of a debugging tool to validate
that the discovery is listing all the required disks. This command also
lists the maximum I/O size (maxio) per disk—that is, the maximum I/O, in
sectors, that ASMLIB can send to the device as one command.
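For example, using the standard ASMLIB discovery string:

$ /usr/sbin/oracleasm-discover 'ORCL:*'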
The maximum I/O size comes from the small
computer system interface (SCSI) host bus adapter (HBA) driver and
anything else in the disk device chain. In the preceding case, maxio
equals 512—that is, 512 byte sectors, or 256K. If maxio shows low
values, such as 128, then it is possible that intermediary components,
such as multipathing drivers or HBA drivers, may be the limiting factor.
The following script is useful to map ASMLIB devices to OS devices:
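The original script is not reproduced here; a minimal sketch that achieves the same mapping by matching major/minor numbers against /proc/partitions might look like this:

#!/bin/bash
# Map each ASMLIB disk to its underlying OS device by major/minor number.
for asmdisk in /dev/oracleasm/disks/*; do
  major=$((16#$(stat -c '%t' "$asmdisk")))   # major number (hex -> decimal)
  minor=$((16#$(stat -c '%T' "$asmdisk")))   # minor number (hex -> decimal)
  osdev=$(awk -v ma="$major" -v mi="$minor" '$1==ma && $2==mi {print $4}' /proc/partitions)
  echo "$asmdisk => /dev/$osdev"
done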
Migrating to ASMLIB
There may be cases where you need to convert
a “member” ASM raw disk to an ASMLIB disk; for example, you may need to
do so when installing ASMLIB after the ASM infrastructure is in place.
Conversely, if you’re de-installing ASMLIB, you can convert an ASMLIB
disk to a standard ASM disk. All this can be done without destroying the
data on the disk. However, the disk group must be dismounted before the
conversion because ASM cannot be actively using the disk.
This conversion works without destroying data
because of the structures on the disk and what ASM reads on disk open.
ASM reads the header off of the disk and recognizes the disk group to
which it belongs. When a disk is added to a disk group, ASM writes
several things on disk. Two of the most important items are the disk
marker (or tag) and the ASMLIB label. All ASM disks have the tag
ORCLDISK stamped on them. If the tag already exists, the disk is either
currently or was formerly in use by ASM. This tag can be created by ASM
or ASMLIB. For example, when ASM initially uses a disk as a raw device
via disk group creation, ASM automatically adds the tag. Conversely,
ASMLIB adds this tag when a disk is configured by the ASMLIB command
/usr/sbin/oracleasm createdisk. Regardless of which tool creates the
tag, once this tag is added, the disk is considered “marked.”
The second part of the disk header that is
relevant to ASMLIB is the ASMLIB label. This string of 24 characters
allotted for ASMLIB is used to identify the disk. When the disk is
configured by ASMLIB using the /usr/sbin/oracleasm createdisk command,
the associated label is written to the device. Note that ASM preserves
the disk’s contents, but it does write to this section reserved for
ASMLIB.
The following is a sample dump of an ASM disk
(without ASMLIB) that was incorporated into a disk group. Notice the ASM
disk name in the header, DATA_0002:
The following is a sample dump of an ASMLIB
disk created using the following /usr/sbin/oracleasm createdisk command.
Notice that VOL1 is appended to the ORCLDISK label:
In this example, the tag ORCLDISK and the
label VOL1 are stamped to the disk. The next time that ASMLIB scans for
disks, it will see the tag ORCLDISK and the label VOL1 and create an
ASMLIB disk-to-device mapping using the name VOL1. Subsequently, ASM
will discover this disk via the ASMLIB driver. If the disk were not
initially managed by ASMLIB—that is, if it were instead managed by ASM’s
block device (non-ASMLIB) access—then when ASM discovers the disk, it
would detect that no ASMLIB label exists, bypass this section of the
header, and open the device as a native (non-ASMLIB) ASM disk. This is
why a disk can move from ASMLIB access to raw device access with no
problems.
Using ASMLIB and Multipathing Utilities
As was stated in Chapter 4,
multipathing drivers generally virtualize subpaths using pseudo
devices. During disk discovery, ASMLIB uses the pseudo file
/proc/partitions. This is a Linux file that records all the devices and
partitions presented to the machine. To function correctly with
multipathing, ASMLIB must operate only on the pseudo devices. Thus, the
behavior of the ASMLIB discovery must be altered to leverage the pseudo
devices. You can configure this behavior in the file
/etc/sysconfig/oracleasm by changing two parameters:
ORACLEASM_SCANORDER Dictates the ASMLIB scan order of the devices, using a generic device-name prefix
ORACLEASM_SCANEXCLUDE Indicates which devices should not be discovered by ASMLIB
For example, if you are configuring ASMLIB with EMC’s PowerPath, you can use the following setup:
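For example, the two parameters in /etc/sysconfig/oracleasm might be set as follows:

ORACLEASM_SCANORDER="emcpower"
ORACLEASM_SCANEXCLUDE="sd"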
In this example, ASMLIB scans EMC PowerPath
disks identified under /dev/emcpower* and excludes all the devices in
/dev/sd*. Consult your multipathing vendor for details and a support
matrix.
The following is an example of an /etc/sysconfig/oracleasm file:
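A representative sketch of the file, with values illustrative for an installation owned by grid:asmadmin:

# ORACLEASM_ENABLED: 'true' means the driver is loaded on boot.
ORACLEASM_ENABLED=true
# ORACLEASM_UID / ORACLEASM_GID: owner of the /dev/oracleasm mount point.
ORACLEASM_UID=grid
ORACLEASM_GID=asmadmin
# ORACLEASM_SCANBOOT: 'true' means scan for ASM disks on boot.
ORACLEASM_SCANBOOT=true
# Scan ordering and exclusion (empty by default).
ORACLEASM_SCANORDER=""
ORACLEASM_SCANEXCLUDE=""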
Using ASMLIB and Device Mapper
I/O multipathing is the ability to access
storage block devices via multiple paths from the host computer. To have
high availability in storage access, you must make use of the I/O
multipathing solution, which provides failover between I/O paths. Many
of the multipathing solutions also provide load balancing of I/O between
the I/O paths, which gives better performance. A path can be thought
of as a connection between a port on the host and a port on the target
storage device. Many multipathing solutions are provided by various
vendors. In this section, we are mainly going to talk about Linux
multipathing based on Linux Device Mapper. Linux Device Mapper–based
multipathing provides path failover and path load sharing capability on
the redundant paths to the block device from the host. The block device
could be of any type—SCSI, IDE, and so on.
Use Case Scenario
A customer configures ASMLIB on Linux Device
Mapper. There are three locations where device mapper devices exist.
Which device should be used for ASMLIB?
/dev/dm-*
/dev/mpath/mpathN
/dev/mapper/mpathN
ASMLib currently acknowledges only devices
listed in /proc/partitions. The only device mapper devices listed in
/proc/partitions are the /dev/dm-* devices. However, these are not
generated by default. In order to generate /dev/dm-* names
appropriately, users will need to comment out the line in
/etc/udev/rules.d/50-udev.rules that reads
like this:
NOTE
The /dev/dm-* devices are not named
consistently across nodes in a cluster. For this reason, users should
create ASMLib devices using the /dev/mapper names. However, users do not
need to change any permissions because ASMLib handles permissions for
you.
Troubleshooting ASMLIB
This section provides a top-down list of items to review to validate that ASMLIB is installed and working correctly.
1. Use the current release of
ASMLIB and verify the software versions. ASMLIB requires a driver that
exactly matches the running kernel, so the oracleasm kernel package must
match the output of the uname –r command:
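For example:

$ uname -r
$ rpm -qa | grep oracleasm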
Verify the ASMLIB installation as indicated in the installation documentation.
2. Make sure that the oracleasm configure command ran properly. Confirm this configuration as follows:
a. Execute
the lsmod command (as root) to show the loaded oracleasm module. The
oracleasm module should be listed with a “Used by” column setting of 1.
b. Execute the command cat /proc/filesystems, and make sure that an entry named oracleasmfs exists in this file system listing.
c. Execute the command df –ha. This should show you that oracleasmfs is mounted on /dev/oracleasm:
d. Make
sure that oracleasm createdisk was properly run for the candidate
disks. To be used for ASM, a disk must be marked by the createdisk
command. When a disk is “marked,” a signature is written to the header
of the disk—that is, the disk is stamped for ASM use. You can validate
this by using the following commands:
Execute the oracleasm listdisks command to display all marked ASMLIB disks.
Execute the oracleasm querydisk command for each listed disk to confirm that it is marked.
3. Execute ls -l
/dev/oracleasm/disks to ensure the ownership and permissions are
grid:asmadmin (that is, the user and group specified in the oracleasm
configure command) for each disk name that was created using
the oracleasm createdisk command:
4. Verify that the ASMLIB discovery
string (either at the ASM Configuration Assistant [ASMCA] prompt or in
the ASM init.ora ASM_DISKSTRING parameter) is set to ORCL:* or to NULL.
Also, if the ASM instance is active, check the ASM alert log to see
whether the correct string is being used.
5. Use the Unix grep command against
the ASM alert log to see whether ASM is displaying any messages
regarding discovery. A successfully loaded ASMLIB will display the
following general message (depending on the version):
The following query shows disks that ASMLIB properly discovered:
NOTE
If the query shows SYSTEM under the Library
column, then the ASM_DISKSTRING is not set correctly—that is, ASMLIB is
not used to access the disks. ASMLIB needs to access the disks through
the diskstring ORCL:*. Check ASM_DISKSTRING and verify that it is set to
ORCL:* or to NULL:
6. Look for errors:
a. Use /usr/sbin/oracleasm-discover ‘ORCL:*’ to discover devices or watch for the following error messages in the ASM alert log:
This message usually means that the
ASMLIB configuration permissions were incorrectly specified during the
/usr/sbin/oracleasm configure stage.
NOTE
In its current implementation, ASMLIB
creates devices in /dev/oracleasm/disks/. Setting a discovery string to
/dev/oracleasm/disks/* causes ASM to use standard OS system calls to
access the devices. It is not guaranteed that future versions of ASMLIB
will continue to create entries in this location. Users should not set
the ASM_DISKSTRING to /dev/oracleasm/disks/*.
b. Use strace to debug the command /usr/sbin/oracleasm-discover ‘ORCL:*’:
Check for the reference to library libasm.so:
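For example (ORCL:* is the standard ASMLIB discovery string):

$ strace /usr/sbin/oracleasm-discover 'ORCL:*' 2>&1 | grep libasm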
If the library reference is missing or an error is returned, check
whether the library exists and whether its permissions are correct (755).
Also validate that the directories in the path have the correct
permissions (755).
The ASM instance reports errors ORA-604, ORA-15183, and ORA-15180 if the file libasm.so has been deleted.
When all the files under the /opt/oracle/extapi path are deleted and ASMLIB
is in use, the following errors are reported when mounting disk groups:
c. ASMLIB provides additional logging for the following functional areas:
ENTRY Function call entry
EXIT Function call exit
DISK Disk information
REQUEST I/O requests
BIO BIOs backing I/O
IOC ASM_IOCs
ABI ABI entry points
The settings are recorded in the file /proc/fs/oracleasm. The following are the default values:
The three possible values are deny, off, and allow.
You can change the logging by executing the following (note that this does not require the oracleasm service to be restarted):
Here, xxxx is the function listed previously.
Typically, when users have problems accessing
devices via the ASMLIB discovery string or cannot list ASMLIB disks,
the cause is a failure to install the right version of one of
the three packages in the current Linux ASMLIB implementation, or a
missing or inaccessible library at
/opt/oracle/extapi/64/asm/orcl/1/libasm.so. Also, if you're using a
32-bit Linux installation, you need to verify the permissions of the
ASMLIB library file /opt/oracle/extapi/32/asm/orcl/1/libasm.so.
Summary
ASMLIB is the support library for ASM.
ASMLIB provides an Oracle database using ASM with more efficient access
to disk groups. The purpose of ASMLIB, which is an add-on to ASM, is to
provide an alternative interface to identify and access block devices.
These features provide benefits such as improved manageability and
performance as well as greater integrity.