Friday, 6 July 2018

ASM and ACFS in 12c Database

Automatic Storage Management in a Cloud World
Cloud computing means many things to different people. At the core of most people’s concept of cloud computing are notions of information technology agility, scalability, and cost minimization. Early proponents of cloud computing visualized IT becoming a utility, much like electricity. To some extent, in the mobile platform space, that vision has become reality. The cost of mobile applications has never been lower, and the data feeding these applications is almost universally accessible.
This chapter’s focus is how cloud computing influences the management of storage in an Oracle Database world. An important aspect of storage management for Oracle Database is the feature known as Automatic Storage Management (ASM). ASM is an integrated volume manager and file system for the Oracle database. It takes the place of the host-based volume managers and file systems that have long been the practice when deploying Oracle. Bundled with ASM is the ASM Cluster File System (ACFS), a POSIX-compliant file system that relies on ASM’s volume management to provide space for files outside the database.
The subsequent chapters of this book focus on the currently available capabilities of ASM and related technologies. This chapter addresses many of the recent changes that have been influenced by cloud computing, and how cloud computing will impact ASM in the future.
The Early Years
ASM was developed in the pre–Oracle 9i days, before the release of Real Application Clusters (RAC) brought commodity-class clustered databases. At the time, the product that became RAC was called Oracle Parallel Server, or simply OPS. Although it was expected that ASM would serve a wide range of needs, from small databases upward, Oracle development believed that ASM would be especially important to OPS and the clustering of database instances. Originally, ASM was called Oracle Storage Manager (OSM), reflecting that relationship with OPS. In those early days, storage management for customers deploying large Oracle databases was rather challenging. As a customer, if you wanted to deploy OPS, you were stuck with only two choices: NFS-based storage or raw-attached storage. This limited choice arose because Oracle’s clustered database architecture is a shared-everything architecture requiring all storage devices to be accessible by all database servers running in a cluster. Furthermore, host-based global file systems that could accommodate Oracle’s shared-everything architecture were not readily available. Consequently, the storage management choices for the customer were either to use the storage provided by network filers or to eliminate the file system altogether and deploy raw volumes that the Oracle database instances accessed directly, without any intermediary file system. Most customers chose the latter because it was perceived that file servers could not deliver the required I/O performance for large-scale databases.
Although deploying raw storage for the Oracle database can deliver excellent storage performance, it comes with a high management overhead because of the ongoing storage management needed to maintain the data layout as database workloads change. Furthermore, database and systems administrators usually had to work closely together to match the I/O demand requirements of database objects with the I/O capabilities of storage devices.
Elsewhere in the storage industry, vendors began developing enterprise-class cluster-aware storage management products. The most successful of these vendors was Veritas, with its cluster-aware file system and logical volume manager. Platform vendors, most notably IBM and HP, also introduced products in this space, either by partnering with Veritas or by developing cluster-aware storage products independently. Because these products were important to customers, Oracle partnered with these vendors to ensure that they had viable solutions supporting the needs of Oracle Parallel Server, which later became RAC.
To help simplify storage management, Oracle started an initiative called SAME, for “Stripe and Mirror Everything.” The idea behind this initiative was simple, yet powerful. It proposed that customers organize database file systems in a way that follows two principles:
image   Stripe all files across all disks
image   Mirror file systems across disks for high availability
Previously, customers invested considerable time optimizing many separate file systems to match the requirements for each database object. The SAME concept greatly simplified things by reducing the number of file systems and, consequently, the ongoing tuning of storage resources for the Oracle database. At the core of ASM is the SAME concept. ASM stripes all files across all disks and optionally mirrors the files.
With its introduction, ASM offered features that set it apart from conventional file system alternatives. Storage management for an Oracle database environment typically involves operating system–based volume managers and file systems. These are most often managed by system administrators, who set up the file systems for the database administrator; as a result, system administrators have frequent and recurring interactions with database administrators. ASM, however, provides a database administrator–centric file system from which database files are managed. The unit of space management in ASM is called a disk group, which can be thought of as a file system. The disk group is usually the responsibility of the database administrator rather than the system administrator, and this changed the way in which logical space is managed: system administrators continue to provide the physical volumes that populate a disk group, but DBAs become responsible for managing the disk groups, which form the file system the database depends on. Furthermore, because ASM inherently implements the Stripe and Mirror Everything concept within the disk group, it eliminates the kind of management overhead previously required from system administrators.
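As a rough sketch of this division of labor (the DATA disk group name and the disk paths here are illustrative, not from the text), the system administrator provisions the devices, the DBA builds the disk group, and the database then places files in it by naming only the disk group:

-- On the ASM instance: the DBA creates a disk group from devices
-- provisioned by the system administrator.
CREATE DISKGROUP data NORMAL REDUNDANCY
  DISK '/dev/mapper/disk1', '/dev/mapper/disk2',
       '/dev/mapper/disk3', '/dev/mapper/disk4';

-- On the database instance: a file is created by naming only the
-- disk group; ASM stripes it across all disks in the group.
CREATE TABLESPACE app_data DATAFILE '+DATA' SIZE 10G;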
Unlike conventional file systems, ASM is integrated with the Oracle database. This integration provides optimization not possible with a conventional file system. Like the database, ASM utilizes an instance that is simply a collection of processes sharing a region of memory. The ASM instance is responsible for file metadata that maps the locations of file blocks to physical storage locations. However, when the database performs an I/O operation, that operation does not pass through an ASM instance. The database executes I/O directly to the storage device. With a conventional file system, when the database performs an I/O operation, that operation is processed by a layer in the file system. Another way of stating this is that unlike a conventional file system, ASM is not in the I/O path between the database and storage.
Perhaps the most significant aspect of ASM is that it works well with RAC, or Real Application Clusters. As of Oracle Release 11.2, each node of a cluster has an ASM instance. All the database instances depend on the ASM instance operating on that node. The communication between the databases and the ASM instance on the node is efficient. And all the ASM instances within a cluster coordinate their activities and provide the management of the shared disk group space for the associated databases utilizing that cluster. Because of the simplicity of management and efficiency in clusters, usage of ASM for RAC is quite high and believed to be over 85 percent of deployments.
First Release of ASM and Beyond
Automatic Storage Management was introduced with Oracle 10g. From a development perspective, the release came at the Oracle Open World conference in September of 2003. During that event, Larry Ellison spoke of grid computing and how Real Application Clusters delivers the reliability and performance of a mainframe at a fraction of the cost. Charles Rozwat, Executive VP of Development at the time, delivered a keynote speech with a demonstration of ASM. ASM fundamentally changed the way in which storage for the Oracle database is managed, and this fact kept all of us very active with customers and storage partners presenting ASM’s value to them.
Another product-related activity in the early days is best described as the development of an ASM ecosystem. The development team met with storage vendors presenting the value proposition of ASM to Oracle’s customers. Another partner activity in the early days was performance measurements and compatibility testing with the partner’s storage arrays. It was in the mutual interest of Oracle and the storage array vendors to ensure that ASM performed well with their equipment. From these measurements, whitepapers were written that documented best-practice procedures for using ASM with their storage hardware. One of these efforts led to the validation of thin provisioning in an ASM environment. Oracle worked with thin provisioning pioneer 3Par to illustrate compatibility between thin provisioning storage and ASM.
ASM provides an integrated volume manager and file system for the Oracle database. While the Oracle database is possibly one of the most important storage deployments for enterprise customers, it is not the only consumer of storage. Obviously, every customer has applications and nonrelational data requiring storage. This means that ASM users had to fragment their storage management between Oracle databases and everything else. Consequently, the ASM development team came up with the idea that it would be really useful to enable ASM as a platform for general-purpose storage management for all of the customer’s data needs. This concept became the basis for the next major ASM development focus. Oracle hired an entire development team that had worked on file systems for the VMS operating system, and that group delivered ASM’s next major stage of evolution.
To expand the data managed outside of the database environment meant that a POSIX-compliant file system had to be able to utilize storage residing in an ASM disk group. The architecture for providing this capability came in two parts. The first is the capability of exposing an ASM file as a logical volume on which a file system resides. The second part is a cluster file system utilizing that exposed logical volume. Exposing ASM files as file system volumes makes sense in that ASM files are normally quite large because they are intended for a database. The component providing this capability is called ASM Dynamic Volume Manager (ADVM). It is a loadable operating system module that talks to an ASM instance for the purpose of acquiring volume extent layout metadata. ADVM, in turn, presents the storage space represented in the extent map metadata as a logical device in the operating system. Finally, ADVM is “cluster aware,” meaning it can present its logical volumes coherently across a cluster.
The second part of extending ASM storage management was the development of a cluster file system utilizing the storage space presented by ADVM. That component is called ASM Cluster File System (ACFS). It is a POSIX-compliant cluster file system implementing a wide range of features described in Chapter 10. The combination of ADVM and ACFS enables a platform for storage management that extends beyond the database and is available on all Oracle platforms except HP-UX. (HP-UX is not supported at this time because the HP-UX internal information required to develop drivers for that environment is not available.)
ADVM and ACFS became available with the 11.2 release of Oracle. This was a major release for ASM, which included the ability to support Oracle’s Clusterware files in an ASM disk group. Oracle’s Clusterware requires two critical storage components called the “voting disks” and the Oracle Clusterware Repository (OCR). Previously, these two entities had to be stored on dedicated storage devices. With the release of 11.2, the Clusterware files can be kept in an ASM disk group, thus eliminating a major management challenge. This feature is a bit of magic because ASM depends on Oracle Clusterware, yet the Clusterware files can now reside in ASM. It’s a chicken-and-egg problem that development cleverly solved.
ASM is intimately related to Oracle Clusterware and Real Application Clusters. Although ASM provides many advantages to single-instance databases, the clustered version of the Oracle database (that is, RAC) is where ASM sees most of its use. For these reasons, ASM is now bundled with ACFS/ADVM and Oracle Clusterware into a package called Grid Infrastructure. Although ASM and ACFS can be installed in a non-RAC environment, the Grid Infrastructure bundling greatly simplifies the installation for customers running RAC.
The Cloud Changes Everything
The “cloud” or the enablement of cloud has had a transformative impact not only on the IT industry but in our daily lives as well. This section covers cloud computing as it impacts Private Database Clouds and specifically ASM.
What Is the Cloud?
This chapter started by alluding to a definition for cloud computing. Although many companies re-label their products and features as “cloud enabled” (something we call “cloudification”), the cloud moniker does imply something real with respect to product features and what customers want. At the core of the definition is a customer desire for improved scalability, greater agility, and cost reduction with respect to otherwise conventional products. Cloud enabling generally implies transforming consumer and business applications into a nebulous infrastructure where the applications are managed elsewhere and access is presented via a network, which is generally the Internet. Cloud applications and their related infrastructure are most often thought to be managed by a third party, with the perception that infinite resources are available to meet your changing demands and you only pay for what you use. The electric utility is often used as the perfect metaphor.
Certainly, from a societal standpoint, the more significant impact of cloud computing is on end consumers. However, the question here is what does cloud computing mean for the Oracle database in general, and the ASM feature in particular? This question is examined by looking at cloud computing trends, what the requirements are for enterprise relational databases, and how the storage-related aspects of the latter question influence product evolution in the future.
Relational Databases in the Cloud
For the purposes of examining the impact of cloud computing on Oracle storage management, we must consider the environments and needs of Oracle’s customers. Oracle’s largest customers deploy business-critical applications around Oracle’s database, and if these databases fail, the impact on the applications they support can profoundly affect the business. Oracle customers typically have teams of people managing the operation of their enterprise applications and the underlying infrastructure. There can be a cost-savings opportunity for some customers in outsourcing particular applications to third-party companies; examples are seen in the Salesforce.com market as well as Oracle’s own hosting business. However, the focus of this discussion is on database and related storage management requirements. The more critical application environments deploy a redundant infrastructure that ensures continuity of operation because business continuity depends on the uptime of these environments. Such companies simply cannot depend on a third party to supply the necessary resources.
When discussing deployment models, the descriptions most often used are public cloud computing and private cloud computing. Public cloud computing can be thought of as the outsourcing of whole applications to third-party providers that deliver access to those applications over the public Internet. Obviously, customers could access infrastructure elements of applications, such as databases, over the Internet as well. However, the purpose here is to focus on cloud computing’s impact on the counterpart of public cloud computing: private cloud computing. Private cloud computing delivers some of the value of cloud computing, but through corporate-managed networks and computing infrastructures. The tradeoff of private clouds is that customers retain control over security and the deployment of vital elements, although typically at a higher cost to the enterprise than a public cloud. For the purposes of the following discussion, private clouds in the enterprise are simply referred to as enterprise cloud computing.
What does it mean to manage an Oracle database in an enterprise cloud? The change to an enterprise cloud model is primarily how the databases and underlying infrastructure are managed. In a conventional model, separate applications and databases are deployed in a vertical fashion. This means new applications and supporting infrastructure are committed to their own hardware stack. In an enterprise cloud model, new applications and databases are not deployed on dedicated platforms and software infrastructure, but may share platforms with other applications. One model that supports such sharing is multitenancy, which means the sharing of a software component by multiple consumers through logical separation of the consumers by the software component. With respect to a database, an example of multitenancy is the sharing of a single database instance by multiple applications where each application contains its own schema and the database provides the means of enforcing access separation between the applications sharing the database instance.
Another architectural tool used to create enterprise clouds is virtualization. An example of virtualization used in the service of enterprise clouds is server virtualization, where companies deploy several applications and associated infrastructure on a single platform, but each application environment operates on its own virtual server image. There is public debate as to the merits of these approaches for creating an enterprise cloud environment, but from an Oracle perspective, a lot of development attention surrounds product features supporting these trends.
From a conceptual level, there are at least four obvious product development areas, with respect to Oracle’s database, that will evolve to support customers creating enterprise clouds:
image   Large-scale clustering   An enterprise cloud is a collection of IT resources that is malleable and can expand or contract to changing application demands. From an Oracle database perspective, this flexibility is provided with database clustering. Enterprise clouds dictate that the enterprise will have a growing need for larger clusters of servers and storage that are easily managed.
image   Large-scale platform sharing   As much as there is a need to scale database services for demanding applications operating within a cluster, there is also a requirement to effectively share database instances for less demanding applications on a single server. Examples of technologies providing such sharing include database multitenancy and server virtualization.
image   Efficient cluster reconfiguration   An enterprise cloud with respect to clustering is not one large single cluster, but many separately managed clusters. An enterprise cloud requires these collections of clusters to be easily reconfigured to adapt to changing needs. There are also unplanned reconfigurations, which are the results of component failures. Consequently, cluster reconfigurations must be as seamless and transparent to the applications as possible.
image   Enterprise cloud management model   Cloud computing in the enterprise is as much about a change regarding management mindset as a technology change. The enterprise cloud management model dictates thinking of IT as delivering a set of services rather than components. It does not matter to the end consumer where their business applications run, as long as the expected services are delivered as agreed upon by their service-level agreements (SLAs). This means that the staff managing the enterprise cloud must have tools for ensuring the SLAs are delivered and that services are appropriately charged for.
The preceding key requirements regarding cloud computing in the enterprise will drive product development for the next several years. Next, we’ll look at how these requirements will likely affect ASM evolution.
ASM in the Cloud
ASM was originally intended to solve one problem: reduce the management challenge associated with storage used for the Oracle database. However, the original approach implemented with ASM meant that it could not be used for managing storage and data outside of the database. The second major development phase for ASM brought ACFS, which extended ASM’s storage management model to data outside of the database. Cloud computing in the enterprise will likely further the idea of ASM being the underpinning for storage management for all elements even remotely associated with the Oracle database. Cloud computing means applications and supporting infrastructure must be malleable within and across servers and clusters. Storage management that is tied to an isolated platform impedes this malleability.
Common Storage Management
Storage management must be global with respect to the enterprise cloud. From an architectural perspective, global storage management can be achieved either at the storage level or at the host level. At the extremes, global storage management at the storage level implies storage is totally managed and made available to applications and infrastructure through storage array products, such as those available from EMC and Network Appliance. Global storage management at the host means that less is expected from the physical storage and that storage management is principally provided by host management components providing a global management structure with respect to the enterprise cloud. ASM/ACFS is an example of host-based global storage management, and over time it will extend to provide a greater reach of management, not only with respect to application data, but across cluster boundaries. The idea is that ASM/ACFS will be the common storage and data management platform for the enterprise cloud.
Enterprise Cloud Robustness
For ASM to be an effective platform for storage and data management in an enterprise cloud, it must adapt to the four product development areas described in the previous section. It should be expected that ASM evolution will include growing support for larger clusters. For example, as cluster size increases, ASM must not become an impediment to that growth through lock contention or reconfiguration overhead. All host-based global storage management components require some form of serialization for access to shared resources, which commonly involves a global lock manager. As the cluster size increases, contention for locks increases. If not implemented effectively, this contention can limit the largest effective size of a cluster.
A related ASM evolution is associated with the cost of cluster reconfiguration. Whenever a cluster is reconfigured—either planned or unplanned—overhead is associated with reconfiguring management elements and updating the metadata associated with all the active members of the cluster. Larger clusters, particularly in the enterprise cloud, imply a far greater frequency of cluster reconfiguration events. It should be expected that ASM will not only evolve to minimize this overhead, but also to limit the impact to services that might otherwise be affected by a reconfiguration event.
Enterprise Cloud Policy Management
The cloud computing environment is far more dynamic than a non-cloud environment. Cluster membership will change frequently, and the storage management infrastructure must quickly adapt to these frequent changes. Additionally, storage management will require provisioning and must deliver a wide range of service levels with respect to performance and reliability. Matching the available storage capabilities against a constantly changing set of demands could lead to an unmanageable environment unless it is governed by policies that map service-level requirements to storage capabilities.
Cross-Cluster Sharing
A typical enterprise cloud will likely contain many separate cluster environments. Separate cluster environments provide fault and performance isolation between workloads that are highly independent and require varying service levels. Yet, even with this separation, there will be a need to share access to data between the clusters. An example of this is the needed separation between the production environment and the test and development environment of the same application. Although the test and development environment is isolated from the production environment, testing may require controlled access to the production environment. This places a requirement on ASM to enable cross-cluster access of data. This is not easily available in Oracle 11.2, but will be a requirement in the future.
Summary
Cloud computing has been driven by the need to reduce costs, improve utilization, and increase efficiency. At the center of this cloud movement has been the Private Database Cloud. With its support for all file and content types, Oracle ASM has likewise become a core component of the Oracle Private Database Cloud.
 
 
2
ASM and Grid Infrastructure Stack
In releases prior to 11gR2, Automatic Storage Management (ASM) was tightly integrated with the Clusterware stack. In 11gR2, ASM is not only tightly integrated with the Clusterware stack, it’s actually part of the Clusterware stack. The Grid Infrastructure stack is the foundation of Oracle’s Private Database Cloud, and it provides the essential Cloud Pool capabilities, such as growing server and storage capacity as needed. This chapter discusses how ASM fits into the Oracle Clusterware stack.
Clusterware Primer
Oracle Clusterware is the cross-platform cluster software required to run the Real Application Clusters (RAC) option for Oracle Database and provides the basic clustering services at the operating system level that enable Oracle software to run in clustered mode. The two main components of Oracle Clusterware are Cluster Ready Services and Cluster Synchronization Services:
image   Cluster Ready Services (CRS)   Provides high-availability operations in a cluster. The CRS daemon (CRSd) manages cluster resources based on the persistent configuration information stored in Oracle Cluster Registry (OCR). These cluster resources include the Oracle Database instance, listener, VIPs, SCAN VIPs, and ASM. CRSd provides start, stop, monitor, and failover operations for all the cluster resources, and it generates events when the status of a resource changes.
image   Cluster Synchronization Services (CSS)   Manages the cluster configuration by controlling which nodes are members of the cluster and by notifying members when a member (node) joins or leaves the cluster. The following functions are provided by the Oracle Cluster Synchronization Services daemon (OCSSd):
image   Group Services   A distributed group membership system that allows for the synchronization of services between nodes
image   Lock Services   Provide the basic cluster-wide serialization locking functions
image   Node Services   Use OCR to store state data and update the information during reconfiguration
OCR Overview
The Oracle Cluster Registry is the central repository for all the resources registered with Oracle Clusterware. It contains the profile, state, and ownership details of the resources. This includes both Oracle resources and user-defined application resources. Oracle resources include the node apps (VIP, ONS, GSD, and Listener) and database resources, such as database instances and database services. Oracle resources are added to the OCR by tools such as DBCA, NETCA, and srvctl.
Voting File Overview
Oracle Clusterware maintains membership of the nodes in the cluster using a special file called the voting disk (mistakenly also referred to as a quorum disk). Sometimes, the voting disk is also referred to as the vote file, so you’ll see this referenced both ways, and both are correct. This file contains the heartbeat records from all the nodes in the cluster. If a node loses access to the voting file, or is not able to complete its heartbeat I/O within the threshold time, that node is evicted from the cluster. Oracle Clusterware also maintains a heartbeat with the other member nodes of the cluster via the shared private interconnect network. A split-brain syndrome occurs when a failure in the private interconnect causes multiple sub-clusters to form within the clustered nodes; the nodes in different sub-clusters cannot communicate with each other via the interconnect network, but they still have access to the voting files. The voting file enables Clusterware to resolve a network split brain among the cluster nodes: in such a situation, the largest active sub-cluster survives. Oracle Clusterware requires an odd number of voting files (1, 3, 5, …) to be created. This ensures that at any point in time, an active member of the cluster has access to a majority (n / 2 + 1) of the voting files.
Here’s a list of some interesting 11gR2 changes for voting files:
image   The voting files’ critical data is stored in the voting files themselves and no longer in the OCR. From a voting file perspective, the OCR is not touched at all. This critical data, which every node must agree on to form a cluster, includes, for example, the misscount setting and the list of configured voting files.
image   In Oracle Clusterware 11g Release 2 (11.2), it is no longer necessary to back up the voting files. The voting file data is automatically backed up in OCR as part of any configuration change and is automatically restored as needed. If all voting files are corrupted, users can restore them as described in the Oracle Clusterware Administration and Deployment Guide.
Grid Infrastructure Stack Overview
The Grid Infrastructure stack includes Oracle Clusterware components, ASM, and ASM Cluster File System (ACFS). Throughout this chapter, as well as the book, we will refer to Grid Infrastructure as the GI stack.
The Oracle GI stack consists of two sub-stacks: one managed by the Cluster Ready Services daemon (CRSd) and the other by the Oracle High Availability Services daemon (OHASd). How these sub-stacks come into play depends on how the GI stack is installed. The GI stack is installed in two ways:
image   Grid Infrastructure for Standalone Server
image   Grid Infrastructure for Cluster
ASM is available in both these software stack installations. When Oracle Universal Installer (OUI) is invoked to install Grid Infrastructure, the main screen will show four options (see Figure 2-1). In this section, the options we want to focus on are Grid Infrastructure for Standalone Server and Grid Infrastructure for Cluster.
FIGURE 2-1.   Oracle Universal Installer for Grid Infrastructure install
Grid Infrastructure for Standalone Server
Grid Infrastructure for Standalone Server is essentially the single-instance (non-clustered) configuration, as in previous releases. It is important to note that in 11gR2, because ASM is part of the GI stack, Clusterware must be installed before the database software is installed; this holds true even for single-instance deployments. Keep in mind that ASM does not need a separate ORACLE_HOME; it is installed and housed in the GI ORACLE_HOME.
Grid Infrastructure for Standalone Server does not configure the full Clusterware stack; just the minimal components are set up and enabled—that is, the private interconnect, CRS, and OCR/voting files are not enabled or required. The OHASd daemon and its startup framework replace all the existing pre-11.2 init scripts. The entry point for OHASd is /etc/inittab, which executes the /etc/init.d/ohasd and /etc/init.d/init.ohasd control scripts, including the start and stop actions. The ohasd script is the framework control script, which spawns the $GI_HOME/bin/ohasd.bin executable. OHASd is the main daemon that provides High Availability Services (HAS) and starts the remaining stack, including ASM, the listener, and the database in a single-instance environment.
A new feature that’s automatically enabled as part of Grid Infrastructure for Standalone Server installation is Oracle Restart, which provides high-availability restart functionality for failed instances (database and ASM), services, listeners, and dismounted disk groups. It also ensures these protected components start up and shut down according to the dependency order required. This functionality essentially replaces the legacy dbstart/dbstop script used in the pre-11gR2 single-instance configurations. Oracle Restart also executes health checks that periodically monitor the health of these components. If a check operation fails for a component, the component is forcibly shut down and restarted. Note that Oracle Restart is only enabled in GI for Standalone Server (non-clustered) environments. For clustered configurations, health checks and the monitoring capability are provided by Oracle Clusterware CRS agents.
When a server that has Grid Infrastructure for Standalone Server enabled is booted up, the HAS process will initialize and start up by first starting up ASM. ASM has a hard-start (pull-up) dependency with CSS, so CSS is started up. Note that there is a hard-stop dependency between ASM and CSS, so on stack shutdown ASM will stop and then CSS will stop.
Grid Infrastructure for Cluster
Grid Infrastructure for Cluster is the traditional installation of Clusterware. It includes multinode RAC support, private interconnect, Clusterware files, and now also installs ASM and ACFS drivers. With Oracle Clusterware 11gR2, ASM is not simply the storage manager for database files, but also houses the Clusterware files (OCR and voting files) and the ASM spfile.
When you select the Grid Infrastructure for Cluster option in OUI, as shown previously in Figure 2-1, you will next be prompted for the storage options for the Clusterware files (the Oracle Cluster Registry and the Clusterware voting files). This is shown in Figure 2-2.
FIGURE 2-2.   Oracle Universal Installer Storage Option screen
Users are prompted to place Clusterware files on either a shared file system or ASM. Note that raw disks are not supported any longer for new installations. Oracle will support the legacy method of storing Clusterware files (raw and so on) in upgrade scenarios only.
When ASM is selected as the storage location for Clusterware files, the Create ASM Disk Group screen is shown next (see Figure 2-3). You can choose external or ASM redundancy for the storage of Clusterware files. However, keep in mind that the type of redundancy affects the redundancy (or number of copies) of the voting files.
FIGURE 2-3.   Create ASM Disk Group screen
For example, normal redundancy requires a minimum of three failure groups, and high redundancy a minimum of five. This requirement stems from the fact that an odd number of voting files must exist to enable a vote quorum. Additionally, it allows the cluster to tolerate one or two disk failures, respectively, and still maintain a quorum.
The first disk group created during the installation can also be used to store database files. In previous versions of ASM, this disk group was typically referred to as the DATA disk group. Although it is recommended that you create a single disk group for storing both the Clusterware files and database files, users who employ a third-party vendor snapshot technology against the ASM disk group may want a separate disk group for the Clusterware files. Users may also deploy a separate disk group for the Clusterware files in order to use normal or high redundancy for them. In both cases, users should create a small CRSDATA disk group with a 1MB AU and enough failure groups to support the required redundancy. After the installation, users can then use ASMCA to create the DATA disk group.
Voting Files and Oracle Cluster Repository Files in ASM
In versions prior to 11gR2, users needed to configure and set up raw devices for housing the Clusterware files (OCR and voting files). This step created additional management overhead and was error prone: an incorrect OCR/voting file setup created havoc for the Clusterware installation and directly affected run-time environments. To mitigate these install preparation issues, 11gR2 allows the Clusterware files to be stored in ASM; this also eliminates the need for a third-party cluster file system and removes the complexity of managing disk partitions for the OCR and voting files. The COMPATIBLE.ASM disk group compatibility attribute must be set to 11.2 or greater to store the OCR or voting file data in a disk group. This attribute is automatically set for new installations done with the OUI. Note that COMPATIBLE.RDBMS does not need to be advanced to enable this feature. The COMPATIBLE.* attributes topic is covered in Chapter 3.
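For an existing disk group, the attribute can be advanced with a statement along these lines (the disk group name and group number are illustrative, and the change cannot be reversed once made):

-- Advance ASM compatibility so the disk group can hold OCR/voting files.
ALTER DISKGROUP data SET ATTRIBUTE 'compatible.asm' = '11.2';

-- Verify the current setting from the ASM instance.
SELECT name, value FROM v$asm_attribute
 WHERE name = 'compatible.asm' AND group_number = 1;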
Voting Files in ASM
If you choose to store voting files in ASM, then all voting files must reside in ASM in a single disk group (in other words, Oracle does not support mixed configurations of storing some voting files in ASM and some on NAS devices).
Unlike most ASM files, the voting files are wholly consumed in multiple contiguous AUs. Additionally, the voting file is not stored as a standard ASM file (that is, it cannot be listed in the asmcmd ls command). However, the disk that contains the voting file is reflected in the V$ASM_DISK view:
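For example, a query along these lines (disk names and the sample output are illustrative) identifies the disks holding voting files:

SQL> SELECT name, path, voting_file FROM v$asm_disk WHERE group_number = 1;

NAME        PATH                VOTING_FILE
----------- ------------------- -----------
DATA_0000   /dev/mapper/disk1   Y
DATA_0001   /dev/mapper/disk2   Y
DATA_0002   /dev/mapper/disk3   Y
DATA_0003   /dev/mapper/disk4   N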
The number of voting files you want to create in a particular Oracle ASM disk group depends on the redundancy of the disk group:
image   External redundancy   A disk group with external redundancy can store only one voting file. Currently, no supported way exists to have multiple voting files stored on an external redundancy disk group.
image   Normal redundancy   A disk group with normal redundancy can store up to three voting files.
image   High redundancy   A disk group with high redundancy can store up to five voting files.
In this example, we created a normal-redundancy ASM disk group to contain the voting files. The following can be seen:
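With crsctl, for instance, the three voting files and their underlying disks might appear roughly as follows (the file universal IDs and disk paths are illustrative):

$ crsctl query css votedisk
##  STATE    File Universal Id                     File Name            Disk group
--  -----    -----------------                     ---------            ----------
 1. ONLINE   6e5058cbf5f94fe6bfb174bba6b1b27c      (/dev/mapper/disk1)  [DATA]
 2. ONLINE   7c3a15d2a4614f0dbf5c2a9ef0d3c41a      (/dev/mapper/disk2)  [DATA]
 3. ONLINE   8d2b04e1b7624f1cbf6d3b8fe1e4d52b      (/dev/mapper/disk3)  [DATA]
Located 3 voting disk(s).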
ASM puts each voting file in its own failure group within the disk group. A failure group is defined as the collection of disks that have a shared hardware component for which you want to prevent its loss from causing a loss of data.
For example, four drives that are in a single removable tray of a large JBOD (just a bunch of disks) array are in the same failure group because the tray could be removed, making all four drives fail at the same time. Conversely, drives in the same cabinet can be in multiple failure groups if the cabinet has redundant power and cooling, so that it is not necessary to protect against the failure of the entire cabinet. If voting files are stored in ASM with normal or high redundancy and the storage hardware in one failure group fails, ASM allocates a new voting file on a candidate disk in an unaffected failure group, provided one is available in the disk group.
Voting files are managed differently from other files that are stored in ASM. When voting files are placed on disks in an ASM disk group, Oracle Clusterware records exactly on which disks in that disk group they are located. Note that CSS has access to voting files even if ASM becomes unavailable.
Voting files can be migrated from raw/block devices into ASM. This is typical in upgrade scenarios. For example, when a user upgrades from 10g to 11gR2, they are allowed to continue storing their OCR/voting files on raw devices, but at a later, convenient time they can migrate these Clusterware files into ASM. It is important to point out that users cannot migrate to Oracle Clusterware 12c from 10g without first moving the voting files into ASM (or a shared file system), since raw disks are no longer supported in 12c, even for upgraded environments.
The following illustrates this:
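In outline, and assuming the target disk group is named DATA (the raw device path is also illustrative), the migration is performed as root from the GI home:

# Check where the voting files and OCR currently reside.
$ crsctl query css votedisk
$ ocrcheck

# Move the voting files into the ASM disk group in one step.
$ crsctl replace votedisk +DATA

# Move the OCR by adding the new location and dropping the old one.
$ ocrconfig -add +DATA
$ ocrconfig -delete /dev/raw/raw1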
Voting File Discovery
The method by which CSS identifies and locates voting files changed in 11.2. Before 11gR2, the voting files were located via a lookup in the OCR; in 11gR2, voting files are located via a Grid Plug and Play (GPnP) query. GPnP, a new component in the 11gR2 Clusterware stack, allows other GI stack components to query or modify cluster-generic (non-node-specific) attributes. For example, the cluster name and network profiles are stored in the GPnP profile. The GPnP configuration, which consists of the GPnP profile and wallet, is created during the GI stack installation. The GPnP profile is an XML file that contains the bootstrap information necessary to form a cluster. This profile is identical on every peer node in the cluster. The profile is managed by gpnpd and exists on every node (in gpnpd caches). The profile should never be edited because it carries a profile signature that maintains its integrity.
When the CSS component of the Clusterware stack starts up, it queries the GPnP profile to obtain the disk discovery string. Using this disk string, CSS performs a discovery to locate its voting files.
The following is an example of a CSS GPnP profile entry. To query the GPnP profile, the user should use the supplied (in CRS ORACLE_HOME) gpnptool utility:
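An abridged and approximate view of the relevant profile entries (the XML is heavily trimmed and the attribute values are illustrative) might look like this:

$ $GI_HOME/bin/gpnptool get
...
  <orcl:CSS-Profile id="css" DiscoveryString="+asm" LeaseDuration="400"/>
  <orcl:ASM-Profile id="asm" DiscoveryString="/dev/mapper/*"
        SPFile="+DATA/mycluster/asmparameterfile/registry.253.789456123"/>
...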
The CSS voting file discovery string anchors into the ASM profile entry; that is, it derives its DiscoveryString from the ASM profile entry. The ASM profile lists the value in the ASM discovery string as ‘/dev/mapper/*’. Additionally, ASM uses this GPnP profile entry to locate its spfile.
Voting File Recovery
Here’s a question that is often heard: If ASM houses the Clusterware files, then what happens if the ASM instance is stopped? This is an important point about the relationship between CSS and ASM. CSS and ASM do not communicate directly. CSS discovers its voting files independently and outside of ASM. This is evident at cluster startup when CSS initializes before ASM is available. Thus, if ASM is stopped, CSS continues to access the voting files, uninterrupted. Additionally, the voting file is backed up into the OCR at every configuration change and can be restored with the crsctl command.
If all voting files are corrupted, you can restore them as described next. If the cluster is down and cannot restart due to lost voting files, you must start CSS in exclusive mode and then replace the voting files by entering the following commands:
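The sequence is roughly as follows, run as root from the GI home (the disk group name is illustrative):

# Start the stack in exclusive mode on one node only.
$ crsctl start crs -excl

# Recreate the voting files in the target disk group.
$ crsctl replace votedisk +DATA

# Restart the stack normally.
$ crsctl stop crs -f
$ crsctl start crs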
Oracle Cluster Registry (OCR)
Oracle Clusterware 11gR2 provides the ability to store the OCR in ASM. Up to five OCR files can be stored in ASM, although each has to be stored in a separate disk group.
The OCR is created, along with the voting disk, when root.sh of the OUI installation is executed. The OCR is stored in an ASM disk group as a standard ASM file with the file type OCRFILE. The OCR file is stored like other ASM files and striped across all the disks in the disk group. It also inherits the redundancy of the disk group. To determine which ASM disk group the OCR is stored in, view the default configuration location at /etc/oracle/ocr.loc:
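The file contents are essentially a pointer to the disk group holding the OCR (the disk group name is illustrative):

$ cat /etc/oracle/ocr.loc
ocrconfig_loc=+DATA
local_only=FALSE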
The disk group that houses the OCR file is automounted by the ASM instance during startup.
All 11gR2 OCR commands now support the ASM disk group. From a user perspective, OCR management and maintenance work the same as in previous versions, with the exception of OCR recovery, which is covered later in this section. As in previous versions, the OCR is backed up automatically every four hours. However, the new backup location is <GRID_HOME>/cdata/<cluster name>.
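The automatic backups can be listed with ocrconfig (the node name, timestamps, and paths shown are illustrative):

$ ocrconfig -showbackup

node1  2012/02/18 04:15:34  /u01/app/11.2.0/grid/cdata/mycluster/backup00.ocr
node1  2012/02/18 00:15:32  /u01/app/11.2.0/grid/cdata/mycluster/backup01.ocr
node1  2012/02/17 20:15:31  /u01/app/11.2.0/grid/cdata/mycluster/backup02.ocr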
A single OCR file is stored when an external redundancy disk group is used. It is recommended that for external redundancy disk groups an additional OCR file be created in another disk group for added redundancy. This can be done as follows:
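A sketch of this, assuming a second disk group named OCRMIRROR already exists and is mounted:

# Run as root: add a second OCR location in another disk group.
$ ocrconfig -add +OCRMIRROR

# Confirm that both OCR locations are registered and in sync.
$ ocrcheck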
In an ASM redundancy disk group, the ASM partnership and status table (PST) is replicated on multiple disks. In the same way, there are redundant extents of OCR file stored in an ASM redundancy disk group. Consequently, OCR can tolerate the loss of the same number of disks as are in the underlying disk group, and it can be relocated/rebalanced in response to disk failures. The ASM PST is covered in Chapter 9.
OCR Recovery
When a process (OCR client) that wants to read the OCR incurs a corrupt block, the OCR client I/O will transparently reissue the read to the mirrored extents for a normal- or high-redundancy disk group. In the background the OCR master (nominated by CRS) provides a hint to the ASM layer identifying the corrupt disk. ASM will subsequently start “check disk group” or “check disk,” which takes the corrupt disk offline. This corrupt block recovery is only possible when the OCR is configured in a normal- or high-redundancy disk group.
In a normal- or high-redundancy disk group, users can recover from the corruption by taking either of the following steps:
image   Use the ALTER DISK GROUP CHECK statement if the disk group is already mounted.
image   Remount the disk group with the FORCE option, which also takes the disk offline when it detects the disk header corruption.
If you are using an external redundancy disk group, you must restore the OCR from backup to recover from a corruption. Starting in Oracle Clusterware 11.2.0.3, the OCR backup can be stored in a disk group as well.
The workaround is to configure an additional OCR location on different storage using the ocrconfig -add command. OCR clients can tolerate a corrupt block returned by ASM, as long as the same block from the other OCR locations (mirrors) is not corrupt. The following guidelines can be used to set up a redundant OCR copy:
image   Ensure that the ASM instance is up and running with the required disk group mounted, and/or check the ASM alert.log for the status of the ASM instance.
image   Verify that the OCR files were properly created in the disk group, using asmcmd ls. Because the Clusterware stack keeps accessing OCR files, most of the time the error will show up as a CRSD error in the crsd.log. Any errors related to an ocr* command will generate a trace file in the Grid_home/log/<hostname>/client directory; look for kgfo, kgfp, or kgfn at the top of the error stack.
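For example, the OCR file can be listed from the ASM side roughly as follows (the cluster name, timestamps, and file numbers are illustrative):

$ asmcmd ls -l +DATA/mycluster/OCRFILE
Type     Redund  Striped  Time             Sys  Name
OCRFILE  MIRROR  COARSE   FEB 18 04:00:00  Y    REGISTRY.255.784562819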
Use Case Example
A customer has an existing three-node cluster with an 11gR1 stack (CRS 11.1.0.7; ASM 11.1.0.7; DB 11.1.0.7). They want to migrate to a new cluster with new server hardware but the same storage. They don’t want to install 11.1.0.7 on the new servers; they just want to install 11.2.0.3. In other words, instead of doing an upgrade, they want to create a new “empty” cluster and then “import” the ASM disks into the 11.2 ASM instance. Is this possible?
Yes. To make this solution work, you will install the GI stack and create a new cluster on the new servers, stop the old cluster, and then rezone the SAN paths to the new servers. During the GI stack install, when you’re prompted in the OUI to configure the ASM disk group for a storage location for the OCR and voting files, use the drop-down box to use an existing disk group. The other option is to create a new disk group for the Clusterware files and then, after the GI installation, discover and mount the old 11.1.0.7 disk group. You will need to do some post-install work to register the databases and services with the new cluster.
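That post-install work amounts to registering the existing databases, instances, and services with the new cluster using srvctl, for example (names, nodes, and paths are illustrative):

# Register the pre-existing database and its instances.
$ srvctl add database -d ORCL -o /u01/app/oracle/product/11.1.0/db_1 \
      -p +DATA/ORCL/spfileORCL.ora
$ srvctl add instance -d ORCL -i ORCL1 -n newnode1
$ srvctl add instance -d ORCL -i ORCL2 -n newnode2

# Re-create any database services used by the applications.
$ srvctl add service -d ORCL -s oltp_svc -r ORCL1,ORCL2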
The Quorum Failure Group
In certain circumstances, customers might want to build a stretch cluster. A stretch cluster provides protection from site failure by allowing a RAC configuration to be set up across distances greater than what’s typical “in the data center.” In these RAC configurations, a third voting file must be created at a third location for cluster arbitration. In pre-11gR2 configurations, users set up this third voting file on a NAS from a third location. In 11gR2, the third voting file can now be stored in an ASM quorum failure group.
The “Quorum Failgroup” clause was introduced for Extended RAC setups and/or for disk groups that have only two disks (and hence only two failure groups) but need to use normal redundancy.
A quorum failure group is a special type of failure group where the disks do not contain user data and are not considered when determining redundancy requirements. Unfortunately, during GI stack installation, the OUI does not offer the capability to create a quorum failure group. However, this can be set up after the installation. In the following example, we create a disk group with a failure group and optionally a quorum failure group if a third array is available:
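A statement of this general shape creates such a disk group (the disk paths and names are illustrative); the QUORUM keyword marks the failure group that will hold only a voting file and no user data:

CREATE DISKGROUP ocrvote NORMAL REDUNDANCY
  FAILGROUP fg1 DISK '/dev/mapper/array1_disk1'
  FAILGROUP fg2 DISK '/dev/mapper/array2_disk1'
  QUORUM FAILGROUP fg3 DISK '/dev/mapper/array3_disk1'
  ATTRIBUTE 'compatible.asm' = '11.2.0.0.0';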
If the disk group was created using ASMCA, then after we add a quorum disk to the disk group, Oracle Clusterware will automatically move one of the CSS voting files onto the quorum disk; the new locations can be confirmed with crsctl query css votedisk.
Clusterware Startup Sequence—Bootstrap If OCR Is Located in ASM
Oracle Clusterware 11g Release 2 introduces an integral component called the cluster agents. These agents are highly available, multithreaded daemons that implement entry points for multiple resource types.
ASM has to be up with the disk group mounted before any OCR operations can be performed. OHASd maintains the resource dependency and will bring up ASM with the required disk group mounted before it starts the CRSd. Once ASM is up with the disk group mounted, the usual ocr* commands (ocrcheck, ocrconfig, and so on) can be used. Figure 2-4 displays the Clusterware startup sequence; the listing that follows shows the client connections into ASM once the entire stack, including the database, is active.
FIGURE 2-4.   Clusterware startup sequence
NOTE
The processes connected to ASM can be listed with the OS ps command, as shown next. Note that most of these are bequeath connections.
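A heavily trimmed example of what such a listing might look like (process IDs, owner, and instance names are illustrative):

$ ps -ef | grep -i asm | grep 'LOCAL=YES'
grid   9876   1  0 10:02 ?  00:00:01 oracle+ASM1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
grid   9921   1  0 10:02 ?  00:00:00 oracle+ASM1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))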
The following output displays a similar listing but from an ASM client perspective:
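A query of V$ASM_CLIENT from the ASM instance gives this perspective (instance and database names are illustrative):

SQL> SELECT group_number, instance_name, db_name, status FROM v$asm_client;

GROUP_NUMBER INSTANCE_NAME   DB_NAME    STATUS
------------ --------------- ---------- ------------
           1 +ASM1           +ASM       CONNECTED
           1 ORCL1           ORCL       CONNECTED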
An ASM client is also listed for the OCR connection. Here, +data.255 is the OCR file number, which is used to identify the OCR file within ASM.
The voting files, OCR, and spfile are processed differently at bootstrap:
image   Voting file   The GPnP profile contains the name of the disk group where the voting files are kept, as well as the discovery string that covers the disks in question. When CSS starts up, it scans the disks matching the discovery string and keeps track of the ones containing a voting file. CSS then reads the voting files directly.
image   ASM spfile   The ASM spfile location is recorded in the header(s) of the disk(s) that contain the spfile data; the spfile always occupies just one AU. The logic is similar to that of CSS and is used by the ASM server to find its parameter file and complete the bootstrap.
image   OCR file   OCR is stored as a regular ASM file. Once the ASM instance comes up, it mounts the disk group needed by the CRSd.
Disk Groups and Clusterware Integration
Before discussing the relationship of ASM and Oracle Clusterware, it’s best to provide background on CRS modeling, which describes the relationship between a resource, the resource profile, and resource relationships. A resource, as described previously, is any entity that is managed by CRS—for example, physical entities (network cards, disks, and so on) or logical ones (VIPs, listeners, databases, disk groups, and so on). A resource relationship defines the dependency between resources (for example, state dependencies or proximities) and is a fundamental building block for expressing how an application’s components interact with each other. Two or more resources are said to have a relationship when one (or both) either depends on or affects the other. For example, CRS modeling mandates that the DB instance resource depend on the ASM instance and the required disk groups.
As discussed earlier, because Oracle Clusterware version 11gR2 allows the Clusterware files to be stored in ASM, the ASM resources are also managed by CRS. The key resource managed by CRS is the ASM disk group resource.
Oracle Clusterware 11g Release 2 introduces a new agent concept that makes cluster resource management very efficient and scalable. These agents are multithreaded daemons that implement entry points for multiple resource types and spawn new processes for different users. The agents are highly available and, besides oraagent, orarootagent, and cssdagent/cssdmonitor, there can be an application agent and a script agent. The two main agents are oraagent and orarootagent. As the names suggest, oraagent and orarootagent manage resources owned by Oracle and root, respectively. If the CRS user is different from the ORACLE user, then CRSd would utilize two oraagents and one orarootagent. The main agents perform different tasks with respect to ASM. For example, oraagent performs the start/stop/check/clean actions for ora.asm, database, and disk group resources, whereas orarootagent performs start/stop/check/clean actions for the ora.diskmon and ora.drivers.acfs resources.
The following output shows typical ASM-related CRS resources:
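An abridged listing of this kind can be produced with crsctl (the disk group, node, and resource names are illustrative):

$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER      STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       node1
               ONLINE  ONLINE       node2
ora.asm
               ONLINE  ONLINE       node1       Started
               ONLINE  ONLINE       node2       Started
ora.drivers.acfs
               ONLINE  ONLINE       node1
               ONLINE  ONLINE       node2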
When a disk group is created, the corresponding disk group resource is automatically created with the name ora.<DGNAME>.dg, and its status is set to ONLINE. The status is set to OFFLINE if the disk group is dismounted, because this is now a CRS-managed resource. When the disk group is dropped, the disk group resource is removed as well. A dependency between the database and the disk group is automatically created when the database first accesses ASM files in that disk group. More specifically, a “hard” dependency type is created for the following file types: datafiles, controlfiles, online redo logs, and the spfile. These are the files that are absolutely needed to start up the database; for all other files, the dependency is set to weak. This becomes important when there are more than two disk groups: one for archive logs, another for flash recovery or temp, and so on. However, when the database no longer uses the ASM files, or the ASM files are removed, the database dependency is not removed automatically. This must be done using the srvctl command-line tool.
The following database CRS profile illustrates the dependency relationships between the database and ASM:
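In abridged and illustrative form, the dependency entries of a database resource profile can be pulled with crsctl and look something like the following; note the hard dependency on the DATA disk group:

$ crsctl stat res ora.orcl.db -p | grep DEPENDENCIES
START_DEPENDENCIES=hard(ora.DATA.dg) weak(type:ora.listener.type,uniform:ora.ons) pullup(ora.DATA.dg)
STOP_DEPENDENCIES=hard(intermediate:ora.asm,shutdown:ora.DATA.dg)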
Summary
The tighter integration between ASM and Oracle Clusterware provides the capability for quickly deploying new applications as well as managing changing workloads and capacity requirements. This agility and elasticity are key drivers for the Private Database Cloud. In addition, the ASM/Clusterware integration with the database is the platform at the core of Oracle’s Engineered Systems.
 
