Automatic Storage Management in a Cloud World
Cloud computing
means many things to different people. At the core of most people’s
concept of cloud computing is a set of concepts speaking to information
technology agility, scalability, and cost minimization. Early proponents
of cloud computing visualized IT becoming a utility very much like
electricity. To some extent, in the mobile platform space, that vision
has become reality. The cost of mobile applications has never been
lower, and the data feeding these applications is almost universally
accessible.
This chapter’s focus is how cloud computing
influences the management of storage in an Oracle Database world. An
important aspect of storage management for Oracle Database is the
database feature known as Automatic Storage Management (ASM). ASM is an
integrated volume manager and file system for Oracle’s database. It
takes the place of host-based volume managers and file systems, which
have long been the practice when deploying Oracle. Bundled with ASM is
ASM Cluster File System (ACFS), which is a POSIX-compliant file system
that relies on the volume management provided by ASM as volume space for
a file system supporting files outside the database.
The subsequent chapters of this book focus on
the currently available capabilities of ASM and related technologies.
This chapter addresses many of the recent changes that have been
influenced by cloud computing, and how cloud computing will impact ASM
in the future.
The Early Years
ASM was developed in the pre–Oracle 9i
days, before the release of Real Application Clusters (RAC) brought
commodity-class clustered databases. The product that became RAC was
then called Oracle Parallel Server, or simply OPS. Although it was
expected that ASM would serve a wide
selection of needs, from small databases upward, Oracle development
believed that ASM would likely be important to OPS and the clustering of
database instances. Originally, ASM was called Oracle Storage Manager
(OSM), reflecting that relationship with OPS. In those early days,
storage management for customers deploying large Oracle databases was
rather challenging. As a customer, if you wanted to deploy OPS, you were
stuck with only two choices: NFS-based storage and raw-attached
storage. This limited choice was because Oracle’s clustered database
architecture is a shared-everything architecture requiring all storage
devices to be accessible by all database servers running in a cluster.
Furthermore, host-based global file systems that could accommodate
Oracle’s shared-everything architecture were not readily available.
Consequently, the storage management choices for the customer were
either to use the storage provided by network filers or to eliminate the
file system altogether and deploy raw volumes for the Oracle database
instances to access directly without any intermediary file system. Most
customers chose the latter because it was perceived that file servers
could not deliver the required I/O performance for large-scale
databases.
Although deploying raw storage for the Oracle
database can deliver excellent storage performance, it comes with a high
management overhead because of the ongoing storage management needed to
maintain the data layout as database workloads change. Furthermore,
database and systems administrators usually had to work closely together
to match the I/O demand requirements of database objects with the I/O
capabilities of storage devices.
Elsewhere in the storage industry, vendors
began developing enterprise-class cluster-aware storage management
products. The most successful of these vendors was Veritas, with its
cluster-aware file system and logical volume manager. Platform vendors,
most notably IBM and HP, also introduced products in this space, either
through partnering with Veritas or developing cluster-aware storage
products independently. Because these products were important to
customers, Oracle partnered with these vendors to ensure that they had
viable solutions supporting the needs for Oracle Parallel Server, which
later became RAC.
To help simplify storage management, Oracle
started an initiative called SAME, for “Stripe and Mirror Everything.”
The idea behind this initiative was simple, yet powerful. It proposed
that customers organize database file systems in a way that follows two
principles:
Stripe all files across all disks
Mirror file systems across disks for high availability
Previously, customers invested considerable
time optimizing many separate file systems to match the requirements for
each database object. The SAME concept greatly simplified things by
reducing the number of file systems and, consequently, the ongoing
tuning of storage resources for the Oracle database. At the core of ASM
is the SAME concept. ASM stripes all files across all disks and
optionally mirrors the files.
With its introduction, ASM offered features
that set it apart from conventional file system alternatives. Storage
management for an Oracle database environment typically involves
operating system–based volume managers and file systems. File systems
and volume managers are most often managed by system administrators, who
set up the file system for the database administrator. This means
that system administrators have frequent and recurring interactions with
database administrators. ASM, however, provides a database
administrator–centric file system from which database files are managed.
The entity of space management in ASM is called a disk group,
which can be thought of as a file system. The disk group is usually the
responsibility of the database administrator, rather than the system
administrator. This changed the way in which logical space is managed:
system administrators continue to provide the physical volumes that
populate a disk group, but DBAs become responsible for managing the disk
groups, which serve as the file system that the database depends on.
Furthermore, because ASM inherently
implements the Stripe and Mirror Everything concept within the disk
group, it eliminates the kind of management overhead previously required
from system administrators.
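As a brief sketch of this division of labor (the disk group name and device paths here are hypothetical), the system administrator presents the candidate devices, and the DBA then creates and manages the disk group from the ASM instance:
-- Hypothetical example: run as SYSASM on the ASM instance.
-- The devices were provisioned by the system administrator; the DBA
-- groups them into a disk group that the database will use.
CREATE DISKGROUP data NORMAL REDUNDANCY
  DISK '/dev/mapper/asmdisk1',
       '/dev/mapper/asmdisk2',
       '/dev/mapper/asmdisk3';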
Unlike conventional file systems, ASM is
integrated with the Oracle database. This integration provides
optimization not possible with a conventional file system. Like the
database, ASM utilizes an instance that is simply a collection of
processes sharing a region of memory. The ASM instance is responsible
for file metadata that maps the locations of file blocks to physical
storage locations. However, when the database performs an I/O operation,
that operation does not pass through an ASM instance. The database
executes I/O directly to the storage device. With a conventional file
system, when the database performs an I/O operation, that operation is
processed by a layer in the file system. Another way of stating this is
that unlike a conventional file system, ASM is not in the I/O path
between the database and storage.
Perhaps the most significant aspect of ASM is
that it works well with RAC, or Real Application Clusters. As of Oracle
Release 11.2, each node of a cluster has an ASM instance. All the
database instances depend on the ASM instance operating on that node.
The communication between the databases and the ASM instance on the node
is efficient. And all the ASM instances within a cluster coordinate
their activities and provide the management of the shared disk group
space for the associated databases utilizing that cluster. Because of
the simplicity of management and efficiency in clusters, usage of ASM
for RAC is quite high and believed to be over 85 percent of deployments.
First Release of ASM and Beyond
Automatic Storage Management was introduced with Oracle 10g.
From a development perspective, the release came at the Oracle Open
World conference in September of 2003. During that event, Larry Ellison
spoke of grid computing and how Real Application Clusters delivers the
reliability and performance of a mainframe at a fraction of the cost.
Charles Rozwat, Executive VP of Development at the time, delivered a
keynote speech with a demonstration of ASM. ASM fundamentally changed
the way in which storage for the Oracle database is managed, and this
fact kept all of us very active with customers and storage partners
presenting ASM’s value to them.
Another product-related activity in the early
days is best described as the development of an ASM ecosystem. The
development team met with storage vendors presenting the value
proposition of ASM to Oracle’s customers. Another partner activity in
the early days was performance measurements and compatibility testing
with the partner’s storage arrays. It was in the mutual interest of
Oracle and the storage array vendors to ensure that ASM performed well
with their equipment. From these measurements, whitepapers were written
that documented best-practice procedures for using ASM with their
storage hardware. One of these efforts led to the validation of thin
provisioning in an ASM environment. Oracle worked with thin provisioning
pioneer 3Par to illustrate compatibility between thin provisioning
storage and ASM.
ASM provides an integrated volume manager and
file system for the Oracle database. While the Oracle database is
possibly one of the most important storage deployments for enterprise
customers, it is not the only consumer of storage. Obviously, every
customer has applications and nonrelational data requiring storage. This
means that ASM users had to fragment their storage management between
Oracle databases and everything else. Consequently, the ASM development
team came up with the idea that it would be really useful to enable ASM
as a platform for general-purpose storage management for all of the
customer’s data needs. This concept became the basis for the next major
ASM development focus. Oracle hired an entire development team that had
worked on file systems in the VMS operating system, and they became the
group that delivered ASM’s next major stage of evolution.
To expand the data managed outside of the
database environment meant that a POSIX-compliant file system had to be
able to utilize storage residing in an ASM disk group. The architecture
for providing this capability came in two parts. The first is the
capability of exposing an ASM file as a logical volume on which a file
system resides. The second part is a cluster file system utilizing that
exposed logical volume. Exposing ASM files as file system volumes makes
sense in that ASM files are normally quite large because they are
intended for a database. The component providing this capability is
called ASM Dynamic Volume Manager (ADVM). It is a loadable operating
system module that talks to an ASM instance for the purpose of acquiring
volume extent layout metadata. ADVM, in turn, presents the storage
space represented in the extent map metadata as a logical device in the
operating system. Finally, ADVM is “cluster aware,” meaning it can
present its logical volumes coherently across a cluster.
The second part of extending ASM storage
management was the development of a cluster file system utilizing the
storage space presented by ADVM. That component is called ASM Cluster
File System (ACFS). It is a POSIX-compliant cluster file system
implementing a wide range of features described in Chapter 10.
The combination of ADVM and ACFS enables a platform for storage
management that extends beyond the database and is available on all
Oracle platforms, except HP-UX. (ACFS is not available on HP-UX at this
time because the HP-UX internals information required to develop drivers
for that environment is not available.)
ADVM and ACFS became available with the 11.2
release of Oracle. This was a major release for ASM, which included the
ability to support Oracle’s Clusterware files in an ASM disk group.
Oracle’s Clusterware requires two critical storage components called the
“voting disks” and the Oracle Cluster Registry (OCR). Previously,
these two entities had to be stored on dedicated storage devices. With
the release of 11.2, the Clusterware files can be kept in an ASM disk
group, thus eliminating a major management challenge. This feature is a
bit of magic because ASM depends on Oracle Clusterware, yet the
Clusterware files can now reside in ASM. It’s a chicken-and-egg problem
that development cleverly solved.
ASM is intimately related to Oracle
Clusterware and Real Application Clusters. Although ASM provides many
advantages to single-instance databases, the clustered version of the
Oracle database (that is, RAC) is where ASM sees most of its use. For
these reasons, ASM is now bundled with ACFS/ADVM and Oracle Clusterware
into a package called Grid Infrastructure. Although ASM and ACFS can be
installed in a non-RAC environment, the Grid Infrastructure bundling
greatly simplifies the installation for customers running RAC.
The Cloud Changes Everything
The “cloud” or the enablement of cloud has
had a transformative impact not only on the IT industry but also on our
daily lives. This section covers cloud computing as it impacts
Private Database Clouds and specifically ASM.
What Is the Cloud?
This chapter started by alluding to a
definition for cloud computing. Although many companies re-label their
products and features as “cloud enabled” (something we call
“cloudification”), the cloud moniker does imply something real with
respect to product features and what customers want. At the core of the
definition is a customer desire for improved scalability, greater
agility, and cost reduction with respect to otherwise conventional
products. Cloud enabling generally implies transforming consumer and
business applications into a nebulous infrastructure where the
applications are managed elsewhere and access is presented via a
network, which is generally the Internet. Cloud applications and their
related infrastructure are most often thought to be managed by a third
party, with the perception that infinite resources are available to meet
your changing demands and you only pay for what you use. The electric
utility is often used as the perfect metaphor.
Certainly, from a societal perspective, the more
significant impact of cloud computing is on end consumers. However, the
question here is what does cloud computing mean for the Oracle database
in general, and the ASM feature in particular? This question is
examined by looking at cloud computing trends, what the requirements are
for enterprise relational databases, and how the storage-related
aspects of the latter question influence product evolution in the
future.
Relational Databases in the Cloud
For the purposes of examining the impact of
cloud computing on Oracle storage management, we must consider the
environments and needs of Oracle’s customers. Oracle’s largest customers
deploy business-critical applications around Oracle’s database. It is
often the case that if these databases fail, the impact to the
underlying application can profoundly affect the business. Oracle
customers typically have teams of people managing the operation of their
enterprise applications and the underlying infrastructure. For some
customers, there can be a cost-savings opportunity in outsourcing
particular applications to third-party companies. Examples
of this are seen in the Salesforce.com market as well as Oracle’s own
hosting business. However, the focus in this discussion is with respect
to database and related storage management customer requirements. The
more critical application environments deploy a redundant infrastructure
that ensures continuity of operation because business continuity
depends on the uptime of these environments. Such companies simply
cannot depend on a third party to supply the necessary resources.
When discussing the deployment models, the descriptions most often used are public cloud computing and private cloud computing.
Public cloud computing can be thought of as the outsourcing of whole
applications to third-party providers delivering access to applications
over the public Internet. Obviously, customers could access infrastructure
elements of applications, such as databases, over the Internet as well.
However, the purpose here is to focus on the counterpart of public cloud
computing: private cloud computing.
Private cloud computing is the means of delivering some of the value of
cloud computing, but through corporate-managed networks and computing
infrastructures. The tradeoff of private clouds is that customers retain
control over security and deployment of vital elements, although at a
higher cost to the enterprise than would be possible through a public
cloud. For the purposes of the following discussion, private clouds in
the enterprise are simply referred to as enterprise cloud computing.
What does it mean to manage an Oracle database
in an enterprise cloud? The change to an enterprise cloud model is
primarily how the databases and underlying infrastructure are managed.
In a conventional model, separate applications and databases are
deployed in a vertical fashion. This means new applications and
supporting infrastructure are committed to their own hardware stack. In
an enterprise cloud model, new applications and databases are not
deployed on dedicated platforms and software infrastructure, but may
share platforms with other applications. One model that supports such
sharing is multitenancy, which means the sharing of a software
component by multiple consumers through logical separation of the
consumers by the software component. With respect to a database, an
example of multitenancy is the sharing of a single database instance by
multiple applications where each application contains its own schema and
the database provides the means of enforcing access separation between
the applications sharing the database instance.
Another architectural tool used to create
enterprise clouds is virtualization. An example of virtualization used
in the service of enterprise clouds is server virtualization, where
companies deploy several applications and associated infrastructure on a
single platform, but each application environment operates on its own
virtual server image. There is public debate as to the merits of these
approaches for creating an enterprise cloud environment, but from an
Oracle perspective, a lot of development attention surrounds product
features supporting these trends.
From a conceptual level, there are at least
four obvious product development areas, with respect to Oracle’s
database, that will evolve to support customers creating enterprise
clouds:
Large-scale clustering An
enterprise cloud is a collection of IT resources that is malleable and
can expand or contract to changing application demands. From an Oracle
database perspective, this flexibility is provided with database
clustering. Enterprise clouds dictate that the enterprise will have a
growing need for larger clusters of servers and storage that are easily
managed.
Large-scale platform sharing As
much as there is a need to scale database services for demanding
applications operating within a cluster, there is also a requirement to
effectively share database instances for less demanding applications on a
single server. Examples of technologies providing such sharing include
database multitenancy and server virtualization.
Efficient cluster reconfiguration An
enterprise cloud with respect to clustering is not one large single
cluster, but many separately managed clusters. An enterprise cloud
requires these collections of clusters to be easily reconfigured to
adapt to changing needs. There are also unplanned reconfigurations,
which are the results of component failures. Consequently, cluster
reconfigurations must be as seamless and transparent to the applications
as possible.
Enterprise cloud management model Cloud
computing in the enterprise is as much about a change regarding
management mindset as a technology change. The enterprise cloud
management model dictates thinking of IT as delivering a set of services
rather than components. It does not matter to the end consumer where
their business applications run, as long as the expected services are
delivered as agreed upon by their service-level agreements (SLAs). This
means that the staff managing the enterprise cloud must have tools for
ensuring the SLAs are delivered and that services are appropriately
charged for.
The preceding key requirements regarding
cloud computing in the enterprise will drive product development for the
next several years. Next, we’ll look at how these requirements will
likely affect ASM evolution.
ASM in the Cloud
ASM was originally intended to solve one
problem: reduce the management challenge associated with storage used
for the Oracle database. However, the original approach implemented with
ASM meant that it could not be used for managing storage and data
outside of the database. The second major development phase for ASM
brought ACFS that extended ASM’s storage management model to data
outside of the database. Cloud computing in the enterprise will likely
further the idea of ASM being the underpinning for storage management
for all elements remotely associated with the Oracle database. Cloud
computing means applications and supporting infrastructure must be
malleable within and across servers and clusters. Storage management
that is tied to an isolated platform impedes this malleability.
Common Storage Management
Storage management must be global with
respect to the enterprise cloud. From an architectural perspective,
global storage management can be achieved either at the storage level or
at the host level. At the extremes, global storage management at the
storage level implies storage is totally managed and made available to
applications and infrastructure through storage array products, such as
those available from EMC and Network Appliance. Global storage
management at the host means that less is expected from the physical
storage and that storage management is principally provided by host
management components providing a global management structure with
respect to the enterprise cloud. ASM/ACFS is an example of host-based
global storage management, and over time it will extend to provide a
greater reach of management, not only with respect to application data,
but across cluster boundaries. The idea is that ASM/ACFS will be the
common storage and data management platform for the enterprise cloud.
Enterprise Cloud Robustness
For ASM to be an effective platform for
storage and data management in an enterprise cloud, it must adapt to the
four product development areas described in the previous section. It
should be expected that ASM evolution will include growing support for
larger clusters. For example, as cluster size increases, ASM must not
become an impediment to that growth through lock contention or
reconfiguration overhead. All
host-based global storage management components require some form of
serialization for access to shared resources. This commonly involves a
global lock manager. As the cluster size increases, contention for locks
increases. If not implemented effectively, this contention can limit
the effective maximum size of a cluster.
A related ASM evolution is associated with the
cost of cluster reconfiguration. Whenever a cluster is
reconfigured—either planned or unplanned—overhead is associated with
reconfiguring management elements and updating the metadata associated
with all the active members of the cluster. Larger clusters,
particularly in the enterprise cloud, imply a far greater frequency of
cluster reconfiguration events. It should be expected that ASM will not
only evolve to minimize this overhead, but also to limit the impact to
services that might otherwise be affected by a reconfiguration event.
Enterprise Cloud Policy Management
The cloud computing environment is far more
dynamic than a non-cloud environment. Cluster membership will change
frequently, and the storage management infrastructure must quickly adapt
to these frequent changes. Additionally, storage management will
require provisioning and must deliver a wide range of service levels
with respect to performance and reliability. Matching the available
storage capabilities against a constantly changing set of demands
could lead to an unmanageable environment unless it is governed by
policies rather than by manual administration.
Cross-Cluster Sharing
A typical enterprise cloud will likely
contain many separate cluster environments. Separate cluster
environments provide fault and performance isolation between workloads
that are highly independent and require varying service levels. Yet,
even with this separation, there will be a need to share access to data
between the clusters. An example of this is the needed separation
between the production environment and the test and development
environment of the same application. Although the test and development
environment is isolated from the production environment, testing may
require controlled access to the production environment. This places a
requirement on ASM to enable cross-cluster access of data. This is not
easily available in Oracle 11.2, but will be a requirement in the
future.
Summary
Cloud computing has been driven by the need
to reduce costs, improve utilization, and increase efficiency. At
the center of this cloud movement has been the Private Database Cloud.
With its support for all file and content types, Oracle ASM similarly
has become a core component of the Oracle Private Database Cloud
movement.
2
ASM and Grid Infrastructure Stack
In releases prior to 11gR2, Automatic Storage Management (ASM) was tightly integrated with the Clusterware stack. In 11gR2,
ASM is not only tightly integrated with the Clusterware stack, it’s
actually part of the Clusterware stack. The Grid Infrastructure stack is
the foundation of Oracle’s Private Database Cloud, and it provides the
essential Cloud Pool capabilities, such as growing server and storage
capacity as needed. This chapter discusses how ASM fits into the Oracle
Clusterware stack.
Clusterware Primer
Oracle Clusterware is the cross-platform
cluster software required to run the Real Application Clusters (RAC)
option for Oracle Database and provides the basic clustering services at
the operating system level that enable Oracle software to run in
clustered mode. The two main components of Oracle Clusterware are
Cluster Ready Services and Cluster Synchronization Services:
Cluster Ready Services (CRS) Provides
high-availability operations in a cluster. The CRS daemon (CRSd)
manages cluster resources based on the persistent configuration
information stored in Oracle Cluster Registry (OCR). These cluster
resources include the Oracle Database instance, listener, VIPs, SCAN
VIPs, and ASM. CRSd provides start, stop, monitor, and failover
operations for all the cluster resources, and it generates events when
the status of a resource changes.
Cluster Synchronization Services (CSS) Manages
the cluster configuration by controlling which nodes are members of the
cluster and by notifying members when a member (node) joins or leaves
the cluster. The following functions are provided by the Oracle Cluster
Synchronization Services daemon (OCSSd):
Group Services A distributed group membership system that allows for the synchronization of services between nodes
Lock Services Provide the basic cluster-wide serialization locking functions
Node Services Use OCR to store state data and update the information during reconfiguration
OCR Overview
The Oracle Cluster Registry is the central
repository for all the resources registered with Oracle Clusterware. It
contains the profile, state, and ownership details of the resources.
This includes both Oracle resources and user-defined application
resources. Oracle resources include the node apps (VIP, ONS, GSD, and
Listener) and database resources, such as database instances and
database services. Oracle resources are added to the OCR by tools such
as DBCA, NETCA, and srvctl.
Voting File Overview
Oracle Clusterware maintains membership of the nodes in the cluster using a special file called voting disk (mistakenly also referred to as quorum disk).
Sometimes, the voting disk is also referred to as the vote file, so
you’ll see this referenced both ways, and both are correct. This file
contains the heartbeat records from all the nodes in the cluster. If a
node loses access to the voting file or is not able to complete the
heartbeat I/O within the threshold time, then that node is evicted from
the cluster. Oracle Clusterware also maintains a heartbeat with the
other member nodes of the cluster via the shared private interconnect
network. Split-brain syndrome occurs when a failure in the private
interconnect splits the clustered nodes into multiple sub-clusters; the
nodes in different sub-clusters can no longer communicate with each
other over the interconnect network, but they still
have access to the voting files. The voting file enables Clusterware to
resolve network split brain among the cluster nodes. In such a
situation, the largest active sub-cluster survives. Oracle Clusterware
requires an odd number of voting files (1, 3, 5, …) to be created. This
is done to ensure that at any point in time, an active member of the
cluster has access to the majority number (n / 2 + 1) of voting files.
Here’s a list of some interesting 11gR2 changes for voting files:
The
voting files’ critical data is now stored in the voting file itself and
no longer in the OCR. From a voting file perspective, the OCR is not
touched at all. The critical data each node must agree on to form a
cluster includes, for example, the CSS misscount setting and the list of
configured voting files.
In Oracle Clusterware 11g
Release 2 (11.2), it is no longer necessary to back up the voting
files. The voting file data is automatically backed up in OCR as part of
any configuration change and is automatically restored as needed. If
all voting files are corrupted, users can restore them as described in
the Oracle Clusterware Administration and Deployment Guide.
Grid Infrastructure Stack Overview
The Grid Infrastructure stack includes
Oracle Clusterware components, ASM, and ASM Cluster File System (ACFS).
Throughout this chapter, as well as the book, we will refer to Grid
Infrastructure as the GI stack.
The Oracle GI stack consists of two
sub-stacks: one managed by the Cluster Ready Services daemon (CRSd) and
the other by the Oracle High Availability Services daemon (OHASd). How
these sub-stacks come into play depends on how the GI stack is
installed. The GI stack is installed in two ways:
Grid Infrastructure for Standalone Server
Grid Infrastructure for Cluster
ASM is available in both these software stack
installations. When Oracle Universal Installer (OUI) is invoked to
install Grid Infrastructure, the main screen will show four options (see
Figure 2-1).
In this section, the options we want to focus on are Grid
Infrastructure for Standalone Server and Grid Infrastructure for
Cluster.
Grid Infrastructure for Standalone Server
Grid Infrastructure for Standalone Server is
essentially the single-instance (non-clustered) configuration, as in
previous releases. It is important to note that in 11gR2, because
ASM is part of the GI stack, Clusterware must be installed first before
the database software is installed; this holds true even for
single-instance deployments. Keep in mind that ASM will not need to be
in a separate ORACLE_HOME; it is installed and housed in the GI
ORACLE_HOME.
Grid Infrastructure for Standalone Server does
not configure the full Clusterware stack; just the minimal components
are set up and enabled—that is, private interconnect, CRS, and
OCR/voting files are not enabled or required. The OHASd daemon and its
startup framework replace all the existing pre-11.2 init scripts. The
entry point for OHASd is /etc/inittab, which executes the
/etc/init.d/ohasd and /etc/init.d/init.ohasd control scripts, including
their start and stop actions. The init.ohasd script is the framework
control script, which spawns the $GI_HOME/bin/ohasd.bin executable.
OHASd is the main
daemon that provides High Availability Services (HAS) and starts the
remaining stack, including ASM, listener, and the database in a
single-instance environment.
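On Linux systems that use /etc/inittab, the bootstrap entry typically looks something like the following (the exact run levels vary by platform and release, so treat this as an illustrative sketch):
# Respawn entry that keeps the OHASd framework script running.
h1:35:respawn:/etc/init.d/init.ohasd run >/dev/null 2>&1 </dev/null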
A new feature that’s automatically enabled as
part of Grid Infrastructure for Standalone Server installation is Oracle
Restart, which provides high-availability restart functionality for
failed instances (database and ASM), services, listeners, and dismounted
disk groups. It also ensures these protected components start up and
shut down according to the dependency order required. This functionality
essentially replaces the legacy dbstart/dbshut scripts used in the pre-11gR2
single-instance configurations. Oracle Restart also executes health
checks that periodically monitor the health of these components. If a
check operation fails for a component, the component is forcibly shut
down and restarted. Note that Oracle Restart is only enabled in GI for
Standalone Server (non-clustered) environments. For clustered
configurations, health checks and the monitoring capability are provided
by Oracle Clusterware CRS agents.
When a server that has Grid Infrastructure for
Standalone Server enabled is booted, the HAS process initializes and
starts the stack, beginning with ASM. ASM has a hard-start (pull-up)
dependency on CSS, so CSS is started first. Note that there is also a
hard-stop dependency between ASM and CSS, so on stack shutdown ASM stops
first and then CSS stops.
Grid Infrastructure for Cluster
Grid Infrastructure for Cluster is the
traditional installation of Clusterware. It includes multinode RAC
support, private interconnect, Clusterware files, and now also installs
ASM and ACFS drivers. With Oracle Clusterware 11gR2, ASM is not
simply the storage manager for database files, but also houses the
Clusterware files (OCR and voting files) and the ASM spfile.
When you select the Grid Infrastructure for Cluster option in OUI, as shown previously in Figure 2-1,
you will be prompted next on file storage options for the Clusterware
files (Oracle Cluster Registry and Clusterware voting file). This is
shown in Figure 2-2.
Users are prompted to place Clusterware files
on either a shared file system or ASM. Note that raw disks are not
supported any longer for new installations. Oracle will support the
legacy method of storing Clusterware files (raw and so on) in upgrade
scenarios only.
When ASM is selected as the storage location for Clusterware files, the Create ASM Disk Group screen is shown next (see Figure 2-3).
You can choose external or ASM redundancy for the storage of
Clusterware files. However, keep in mind that the type of redundancy
affects the redundancy (or number of copies) of the voting files.
For example, for normal redundancy, there
needs to be a minimum of three failure groups, and for high redundancy a
minimum of five failure groups. This requirement stems from the fact
that an odd number of voting files must exist to enable a vote quorum.
Additionally, this allows the cluster to tolerate one (normal redundancy)
or two (high redundancy) disk failures and still maintain a voting quorum.
This first disk group that is created during
the installation can also be used to store database files. In previous
versions of ASM, this disk group was referred to as the DATA disk group.
Although it is recommended that you create a single disk group for
storing the Clusterware files and database files, users who employ
third-party snapshot technology against the ASM disk group may want a
separate disk group for the Clusterware files. Users may also deploy a
separate disk group for the Clusterware files to leverage normal or high
redundancy for them. In both cases, users should create a small CRSDATA
disk group with a 1MB AU and enough failure groups to support the
required redundancy. After the installation, users can then use ASMCA to
create the DATA disk group.
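A sketch of creating such a Clusterware-only disk group from SQL*Plus follows; the disk group name, device paths, and failure group names are illustrative only:
-- Small normal-redundancy disk group for Clusterware files, with a 1MB
-- allocation unit and three failure groups (one per voting file copy).
CREATE DISKGROUP crsdata NORMAL REDUNDANCY
  FAILGROUP fg1 DISK '/dev/mapper/crs_disk1'
  FAILGROUP fg2 DISK '/dev/mapper/crs_disk2'
  FAILGROUP fg3 DISK '/dev/mapper/crs_disk3'
  ATTRIBUTE 'au_size' = '1M',
            'compatible.asm' = '11.2';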
Voting Files and Oracle Cluster Registry Files in ASM
In versions prior to 11gR2, users
needed to configure and set up raw devices for housing the Clusterware
files (OCR and voting files). This step created additional management
overhead and was error prone. An incorrect OCR/voting file setup creates
havoc for the Clusterware installation and directly affects run-time
environments. To mitigate these install preparation issues, 11gR2
allows the storing of the Clusterware files in ASM; this also
eliminates the need for a third-party cluster file system and eliminates
the complexity of managing disk partitions for the OCR and voting
files. The COMPATIBLE.ASM disk group compatibility attribute must be set
to 11.2 or greater to store the OCR or voting file data in a disk
group. This attribute is automatically set for new installations with
the OUI. Note that COMPATIBLE.RDBMS does not need to be advanced to
enable this feature. The COMPATIBLE.* attributes topic is covered in Chapter 3.
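For an existing disk group, the attribute can be advanced manually; a minimal sketch, assuming a disk group named DATA:
-- Advancing compatible.asm allows OCR and voting files to be placed in
-- the disk group; note that the attribute cannot be lowered afterward.
ALTER DISKGROUP data SET ATTRIBUTE 'compatible.asm' = '11.2';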
Voting Files in ASM
If you choose to store voting files in ASM,
then all voting files must reside in ASM in a single disk group (in
other words, Oracle does not support mixed configurations of storing
some voting files in ASM and some on NAS devices).
Unlike most ASM files, the voting files are
wholly consumed in multiple contiguous AUs. Additionally, the voting
file is not stored as a standard ASM file (that is, it cannot be listed
in the asmcmd ls command). However, the disk that contains the voting
file is reflected in the V$ASM_DISK view:
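For example, a query along these lines, run against the ASM instance, shows which disks carry voting files (output varies by configuration):
-- VOTING_FILE is 'Y' for disks that hold a voting file.
SELECT name, path, failgroup, voting_file
FROM   v$asm_disk
WHERE  voting_file = 'Y';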
The number of voting files you want to create in a particular Oracle ASM disk group depends on the redundancy of the disk group:
External redundancy A
disk group with external redundancy can store only one voting file.
Currently, no supported way exists to have multiple voting files stored
on an external redundancy disk group.
Normal redundancy A disk group with normal redundancy can store up to three voting files.
High redundancy A disk group with high redundancy can store up to five voting files.
In this example, we created a normal-redundancy ASM disk group to
contain the voting files. The following can be seen:
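One way to observe this is with the crsctl utility (a sketch; the output itself, omitted here, lists each voting file with its backing disk and disk group):
# With a normal-redundancy disk group, three voting files are listed,
# each on a disk from a different failure group.
crsctl query css votedisk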
ASM puts each voting file in its own failure group within the disk group.
A failure group is defined as a collection of disks that share a hardware
component whose failure you want to prevent from causing a loss of data.
For example, four drives that are in a single
removable tray of a large JBOD (just a bunch of disks) array are in the
same failure group because the tray could be removed, making all four
drives fail at the same time. Conversely, drives in the same cabinet can
be in multiple failure groups if the cabinet has redundant power and
cooling so that it is not necessary to protect against the failure of
the entire cabinet. If voting files are stored in ASM with normal or
high redundancy and the storage hardware in one failure group fails, ASM
allocates a new voting file on a candidate disk in an unaffected failure
group, provided such a disk is available in the disk group.
Voting files are managed differently from
other files that are stored in ASM. When voting files are placed on
disks in an ASM disk group, Oracle Clusterware records exactly on which
disks in that disk group they are located. Note that CSS has access to
voting files even if ASM becomes unavailable.
Voting files can be migrated from raw/block
devices into ASM. This is typical in upgrade scenarios. For
example, when a user upgrades from 10g to 11gR2, they are
allowed to continue storing their OCR/voting files on raw, but at a
later convenient time they can migrate these Clusterware files into ASM.
It is important to point out that users cannot migrate to Oracle
Clusterware 12c from 10g without first moving the voting
files into ASM (or shared file system), since raw disks are no longer
supported even for upgraded environments in 12c.
The following illustrates this:
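A hedged sketch of such a migration, run as root from the Grid Infrastructure home (the disk group name and old raw device path are examples):
# Move the voting files from their current location into an ASM disk group.
crsctl replace votedisk +CRSDATA

# Relocate the OCR into ASM, then drop the old raw location.
ocrconfig -add +CRSDATA
ocrconfig -delete /dev/raw/raw1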
Voting File Discovery
The method by which CSS identifies and locates voting files changed in 11.2. Before 11gR2, the voting files were located via a lookup in the OCR; in 11gR2, voting files are located via a Grid Plug and Play (GPnP) query. GPnP, a new component in the 11gR2
Clusterware stack, allows other GI stack components to query or modify
cluster-generic (non-node-specific) attributes. For example, the cluster
name and network profiles are stored in the GPnP profile. The GPnP
configuration, which consists of the GPnP profile and wallet, is created
during the GI stack installation. The GPnP profile is an XML file that
contains bootstrap information necessary to form a cluster. This profile
is identical on every peer node in the cluster. The profile is managed
by gpnpd and exists on every node (in gpnpd caches). The profile should
never be edited because it has a profile signature that maintains its
integrity.
When the CSS component of the Clusterware
stack starts up, it queries the GPnP profile to obtain the disk
discovery string. Using this disk string, CSS performs a discovery to
locate its voting files.
The following is an example of a CSS GPnP
profile entry. To query the GPnP profile, the user should use the
supplied (in CRS ORACLE_HOME) gpnptool utility:
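A sketch of querying the profile; the element and attribute values shown in the comments are abbreviated and illustrative, not verbatim output:
# Dump the local GPnP profile XML. Among other entries, it contains
# something similar to the following (abbreviated):
#   <orcl:CSS-Profile id="css" DiscoveryString="+asm" .../>
#   <orcl:ASM-Profile id="asm" DiscoveryString="/dev/mapper/*" SPFile="..."/>
$GRID_HOME/bin/gpnptool get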
The CSS voting file discovery string anchors
into the ASM profile entry; that is, it derives its DiscoveryString from
the ASM profile entry. In this example, the ASM profile lists the value
of the ASM discovery string as ‘/dev/mapper/*’. Additionally, ASM uses this GPnP
profile entry to locate its spfile.
Voting File Recovery
Here’s a question that is often heard: If
ASM houses the Clusterware files, then what happens if the ASM instance
is stopped? This is an important point about the relationship between
CSS and ASM. CSS and ASM do not communicate directly. CSS discovers its
voting files independently and outside of ASM. This is evident at
cluster startup when CSS initializes before ASM is available. Thus, if
ASM is stopped, CSS continues to access the voting files, uninterrupted.
Additionally, the voting file is backed up into the OCR at every
configuration change and can be restored with the crsctl command.
If all voting files are corrupted, you can restore them as described next.
Furthermore, if the cluster is down and cannot
restart due to lost voting files, you must start CSS in exclusive mode
to replace the voting files by entering the following command:
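A hedged sketch of that recovery sequence, run as root (the -nocrs option assumes 11.2.0.2 or later, and the disk group name is an example):
# Start the stack in exclusive mode without CRS, re-create the voting
# files inside an ASM disk group, then restart the stack normally.
crsctl start crs -excl -nocrs
crsctl replace votedisk +CRSDATA
crsctl stop crs -f
crsctl start crs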
Oracle Cluster Registry (OCR)
Oracle Clusterware 11gR2 provides the
ability to store the OCR in ASM. Up to five OCR files can be stored in
ASM, although each has to be stored in a separate disk group.
The OCR is created, along with the voting
disk, when root.sh of the OUI installation is executed. The OCR is
stored in an ASM disk group as a standard ASM file with the file type
OCRFILE. The OCR file is stored like other ASM files and striped across
all the disks in the disk group. It also inherits the redundancy of the
disk group. To determine which ASM disk group the OCR is stored in, view
the default configuration location at /etc/oracle/ocr.loc:
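For example (the disk group name shown is illustrative):
$ cat /etc/oracle/ocr.loc
ocrconfig_loc=+DATA
local_only=FALSE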
The disk group that houses the OCR file is automounted by the ASM instance during startup.
All 11gR2 OCR commands now support the
ASM disk group. From a user perspective, OCR management and maintenance
works the same as in previous versions, with the exception of OCR
recovery, which is covered later in this section. As in previous
versions, the OCR is backed up automatically every four hours. However,
the new default backup location is <GRID_HOME>/cdata/<cluster name>.
A single OCR file is stored when an external
redundancy disk group is used. It is recommended that for external
redundancy disk groups an additional OCR file be created in another disk
group for added redundancy. This can be done as follows:
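A sketch, assuming a second disk group named DATA2 already exists and is mounted:
# Run as root: register an additional OCR location, then verify.
ocrconfig -add +DATA2
ocrcheck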
In an ASM redundancy disk group, the ASM
partnership and status table (PST) is replicated on multiple disks. In
the same way, redundant extents of the OCR file are stored in an ASM
redundancy disk group. Consequently, OCR can tolerate the loss of the
same number of disks as are in the underlying disk group, and it can be
relocated/rebalanced in response to disk failures. The ASM PST is
covered in Chapter 9.
OCR Recovery
When a process (OCR client) that wants to
read the OCR incurs a corrupt block, the OCR client I/O will
transparently reissue the read to the mirrored extents for a normal- or
high-redundancy disk group. In the background the OCR master (nominated
by CRS) provides a hint to the ASM layer identifying the corrupt disk.
ASM will subsequently start “check disk group” or “check disk,” which
takes the corrupt disk offline. This corrupt block recovery is only
possible when the OCR is configured in a normal- or high-redundancy disk
group.
In a normal- or high-redundancy disk group, users can recover from the corruption by taking either of the following steps:
Use the ALTER DISKGROUP ... CHECK statement if the disk group is already mounted.
Remount the disk group with the FORCE option, which also takes a disk offline when it detects disk header corruption.
If you are using an external redundancy disk group, you must restore the
OCR from backup to recover from a corruption. Starting in Oracle
Clusterware 11.2.0.3, the OCR backup can be stored in a disk group as well.
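Hedged SQL sketches of the two statements mentioned above, using an illustrative disk group named DATA:
-- If the disk group is already mounted, check its metadata consistency.
ALTER DISKGROUP data CHECK;

-- Otherwise, mount with FORCE, which offlines disks with corrupt headers.
ALTER DISKGROUP data MOUNT FORCE;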
For an external redundancy disk group, an alternative safeguard is to
configure an additional OCR location on different storage using the ocrconfig -add
command. OCR clients can tolerate a corrupt block returned by ASM, as
long as the same block from the other OCR locations (mirrors) is not
corrupt. The following guidelines can be used to set up a redundant OCR
copy:
Ensure
that the ASM instance is up and running with the required disk group
mounted, and/or check the ASM alert.log for the status of the ASM instance.
Verify
that the OCR files were properly created in the disk group, using
asmcmd ls. Because the Clusterware stack keeps accessing OCR files, most
of the time the error will show up as a CRSD error in the crsd.log. Any
errors related to an ocr* command will generate a trace file in the
Grid_home/log/<hostname>/client directory; look for kgfo, kgfp, or
kgfn at the top of the error stack.
Use Case Example
A customer has an existing three-node cluster with an 11gR1
stack (CRS 11.1.0.7; ASM 11.1.0.7; DB 11.1.0.7). They want to migrate
to a new cluster with new server hardware but the same storage. They
don’t want to install 11.1.0.7 on the new servers; they just want to
install 11.2.0.3. In other words, instead of doing an upgrade, they want
to create a new “empty” cluster and then “import” the ASM disks into
the 11.2 ASM instance. Is this possible?
Yes. To make this solution work, you will
install the GI stack and create a new cluster on the new servers, stop
the old cluster, and then rezone the SAN paths to the new servers.
During the GI stack install, when you’re prompted in the OUI to
configure the ASM disk group for a storage location for the OCR and
voting files, use the drop-down box to use an existing disk group. The
other option is to create a new disk group for the Clusterware files and
then, after the GI installation, discover and mount the old 11.1.0.7
disk group. You will need to do some post-install work to register the
databases and services with the new cluster.
The Quorum Failure Group
In certain circumstances, customers might want to build a stretch cluster. A stretch cluster
provides protection from site failure by allowing a RAC configuration
to be set up across distances greater than what’s typical “in the data
center.” In these RAC configurations, a third voting file must be
created at a third location for cluster arbitration. In pre-11gR2 configurations, users set up this third voting file on a NAS from a third location. In 11gR2, the third voting file can now be stored in an ASM quorum failure group.
The “Quorum Failgroup” clause was introduced
for setups with Extended RAC and/or for setups with disk groups that
have only two disks (and therefore only two failure groups) but want to
use normal redundancy.
A quorum failure group is a special type of
failure group where the disks do not contain user data and are not
considered when determining redundancy requirements. Unfortunately,
during GI stack installation, the OUI does not offer the capability to
create a quorum failure group. However, this can be set up after the
installation. In the following example, we create a disk group with a
failure group and optionally a quorum failure group if a third array is
available:
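A minimal sketch of such a disk group creation; the names and paths are examples, and the quorum disk is typically a small device (often NFS based) at a third site:
-- Normal redundancy across two storage arrays, plus a quorum failure
-- group on a third site that holds only a voting file, never user data.
CREATE DISKGROUP ocrvote NORMAL REDUNDANCY
  FAILGROUP fg_site1 DISK '/dev/mapper/site1_disk1'
  FAILGROUP fg_site2 DISK '/dev/mapper/site2_disk1'
  QUORUM FAILGROUP fg_quorum DISK '/voting_nfs/vote_disk3'
  ATTRIBUTE 'compatible.asm' = '11.2';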
If the disk group creation was done using
ASMCA, then after we add a quorum disk to the disk group, Oracle
Clusterware will automatically change the CSS vote disk location to the
following:
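The new voting file layout can then be confirmed (a sketch; the output, omitted here, should show one voting file on the quorum failure group disk and the others on the regular failure groups):
crsctl query css votedisk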
Clusterware Startup Sequence—Bootstrap If OCR Is Located in ASM
Oracle Clusterware 11g Release 2 introduces an integral component called the cluster agents. These agents are highly available, multithreaded daemons that implement entry points for multiple resource types.
ASM has to be up with the disk group mounted
before any OCR operations can be performed. OHASd maintains the resource
dependency and will bring up ASM with the required disk group mounted
before it starts the CRSd. Once ASM is up with the disk group mounted,
the usual ocr* commands (ocrcheck, ocrconfig, and so on) can be used. Figure 2-4 displays the client connections into ASM once the entire stack, including the database, is active.
NOTE: This lists the processes connected to ASM using the OS ps command. Note that most of these are bequeath connections.
The following output displays a similar listing but from an ASM client perspective:
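One hedged way to produce such a listing is to query V$ASM_CLIENT on the ASM instance (asmcmd lsct gives a similar view); the exact rows depend on what is running:
-- Lists the database instances and other clients (including the OCR
-- accessed by CRSd) connected to each mounted disk group.
SELECT group_number, instance_name, db_name, status
FROM   v$asm_client;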
There will be an ASM client listed for the connection OCR:
Here, +data.255 is the OCR file number, which is used to identify the OCR file within ASM.
The voting files, OCR, and spfile are processed differently at bootstrap:
Voting file The
GPnP profile contains the disk group name where the voting files are
kept. The profile also contains the discovery string that covers the
disk group in question. When CSS starts up, it scans each disk group for
the matching string and keeps track of the ones containing a voting
disk. CSS then directly reads the voting file.
ASM spfile The
ASM spfile location is recorded in the header(s) of the disk(s) that hold
the spfile data. The spfile is always just one AU. The logic is similar to that of CSS and
is used by the ASM server to find the parameter file and complete the
bootstrap.
OCR file OCR is stored as a regular ASM file. Once the ASM instance comes up, it mounts the disk group needed by the CRSd.
Disk Groups and Clusterware Integration
Before discussing the relationship of ASM
and Oracle Clusterware, it’s best to provide background on CRS modeling,
which describes the relationship between a resource, the resource
profile, and the resource relationship. A resource, as described
previously, is any entity that is being managed by CRS—for example,
physical (network cards, disks, and so on) or logical (VIPs, listeners,
databases, disk groups, and so on). The resource relationship defines
the dependency between resources (for example, state dependencies or
proximities) and is considered to be a fundamental building block for
expressing how an application’s components interact with each other. Two
or more resources are said to have a relationship when one (or both)
resource(s) either depends on or affects the other. For example, CRS
modeling mandated that the DB instance resource depend on the ASM
instance and the required disk groups.
As discussed earlier, because Oracle Clusterware version 11gR2
allows the Clusterware files to be stored in ASM, the ASM resources are
also managed by CRS. The key resource managed by CRS is the ASM disk
group resource.
Oracle Clusterware 11g Release 2
introduces a new agent concept that makes cluster resource management
very efficient and scalable. These agents are multithreaded daemons that
implement entry points for multiple resource types and spawn new
processes for different users. The agents are highly available and,
besides oraagent, orarootagent, and cssdagent/cssdmonitor, there can be
an application agent and a script agent. The two main agents are
oraagent and orarootagent. As the names suggest, oraagent and
orarootagent manage resources owned by Oracle and root, respectively. If
the CRS user is different from the ORACLE user, then CRSd would utilize
two oraagents and one orarootagent. The main agents perform different
tasks with respect to ASM. For example, oraagent performs the
start/stop/check/clean actions for ora.asm, database, and disk group
resources, whereas orarootagent performs start/stop/check/clean actions
for the ora.diskmon and ora.drivers.acfs resources.
The following output shows typical ASM-related CRS resources:
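A hedged way to produce such a listing (resource and disk group names differ per installation):
# Tabular status of all CRS resources, including ora.asm and the
# ora.<DGNAME>.dg disk group resources.
crsctl stat res -t

# Restrict the listing to disk group resources only.
crsctl stat res -w "TYPE = ora.diskgroup.type"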
When the disk group is created, the disk group
resource is automatically created with the name, ora.<DGNAME>.dg,
and the status is set to ONLINE. The status OFFLINE will be set if the
disk group is dismounted, because this is a CRS-managed resource now.
When the disk group is dropped, the disk group resource is removed as
well. A dependency between the database and the disk group is
automatically created when the database tries to access the ASM files.
More specifically, a “hard” dependency type is created for the following
file types: datafiles, controlfiles, online logs, and SPFile. These
are the files that are absolutely needed to start up the database; for
all other files, the dependency is set to weak. This becomes important
when there are more than two disk groups: one for archive, another for
flash or temp, and so on. However, when the database no longer uses the
ASM files or the ASM files are removed, the database dependency is not
removed automatically. This must be done using the srvctl command-line
tool.
The following database CRS profile illustrates the dependency relationships between the database and ASM:
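A hedged way to view that profile; the database resource name ora.orcl.db is hypothetical:
# Print the resource profile. The START_DEPENDENCIES attribute shows
# hard(...) entries for the disk groups holding datafiles, controlfiles,
# online logs, and the spfile, and weak/pullup entries for the rest.
crsctl stat res ora.orcl.db -p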
Summary
The tighter integration between ASM and
Oracle Clusterware provides the capability for quickly deploying new
applications as well as managing changing workloads and capacity
requirements. This faster agility and elasticity are key drivers for the
Private Database Cloud. In addition, the ASM/Clusterware integration
with the database is the platform at the core of Oracle’s Engineered
Systems.