Exadata Storage Server
- Exadata Storage Server is highly optimized storage for Oracle Database.
- It delivers outstanding I/O and SQL processing performance for the database.
- A single storage server is also called a cell. A cell is the building block of the storage grid.
- Each cell has its own OS (Linux x86_64), CPUs, memory, a bus, disks and network adapters.
- Storage server software backups happen automatically. Each cell uses an internal USB drive, the CELLBOOT USB flash drive, to back up its software.
There are two types of storage server:
1) High Capacity Storage Server - provides more storage capacity on lower-RPM hard disks. The maximum SQL bandwidth for a full-rack Database Machine (14 cells) is 25 GB/s.
2) Extreme Flash Storage Server - provides less raw capacity but much higher performance; data permanently resides on high-performance flash drives. The maximum SQL bandwidth for a full-rack Database Machine (14 cells) is 263 GB/s.
Advantage of Storage Server
- The database can offload some processing to the storage servers; this capability is called Smart Scan.
- It is highly optimized for fast processing of large queries.
- It intelligently uses high-performance flash memory to boost performance.
- It uses the InfiniBand network for higher throughput.
- It supports Hybrid Columnar Compression, which provides a high level of data compression.
- It manages I/O resources through I/O Resource Management (IORM); a sample IORM plan is sketched after this list.
- It uses ASM to evenly distribute the storage load across all databases.
- We can also assign dedicated storage to a single database. Shared storage is not always an ideal solution: running multiple types of workloads and databases on shared storage often leads to performance problems. Large parallel queries on one database can impact the performance of critical queries on another database, and a data load on an analytics database can impact critical queries running on that same database.
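For illustration, an inter-database IORM plan can be set on each cell with CellCLI. The database names and allocations below are hypothetical; this is a sketch, not a recommended plan:
CellCLI> ALTER IORMPLAN dbplan=((name=prod, level=1, allocation=70), (name=other, level=2, allocation=100))
CellCLI> LIST IORMPLAN DETAIL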
Disk Layout
The disk layout needs some additional explanation because that is where most of the activity occurs. The disks are attached to the storage cells and presented as logical units (LUNs), on which physical volumes are built. Each High Capacity cell has 12 physical disks of roughly 8 TB each, while Extreme Flash cells use high-performance flash drives of roughly 3.2 TB each. The disks are used for database storage. Two of the disks are also used for the home directory and other Linux operating system files.
- The physical disks are divided into multiple partitions. Each partition is then presented as a LUN to the cell. Some LUNs are used to create a file system for the OS; the others are presented as storage to the cell software and are called cell disks. The cell disks are further divided into grid disks, and these grid disks are used to build ASM disk groups, so they serve as ASM disks. An ASM disk group is made up of several ASM disks from multiple storage cells. If the disk group is built with normal or high redundancy (which is the usual case), the failure groups are placed in different cells. As a result, if one cell fails, the data is still available on the other cells. Finally, the database is built on these disk groups.
- Cell disks and grid disks are logical components of the physical Exadata storage. A cell, or Exadata Storage Server, is a combination of disk drives put together to store user data.
- Each cell disk corresponds to a LUN (logical unit) that has been formatted by the Exadata Storage Server software. Typically, each cell has 12 disk drives mapped to it.
- Grid disks are created on top of cell disks and are presented to Oracle ASM as ASM disks. Space is allocated in chunks starting from the outer tracks of the cell disk and moving inwards. One can have multiple grid disks per cell disk.
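For illustration, grid disks are usually created across all hard-disk-based cell disks with a common prefix. The prefix and size below are hypothetical; a sketch only (the second command consumes the remaining space):
CellCLI> CREATE GRIDDISK ALL HARDDISK PREFIX=DATA, size=400G
CellCLI> CREATE GRIDDISK ALL HARDDISK PREFIX=RECO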
List the LUNs on your primary Exadata cell/Storage Server.
CellCLI> list lun
- A cell disk is a higher level storage abstraction. Each cell disk is based on a LUN and contains additional attributes and metadata.
CellCLI> list celldisk CD_09_qr01celadm01 detail
- A grid disk defines an area of storage on a cell disk. Grid disks are consumed by ASM and are used as the storage for ASM disk groups.
- Each cell disk can contain a number of grid disks. Examine the grid disks associated with a cell disk:
CellCLI> list griddisk where celldisk=CD_09_qr01celadm01 detail
- By default, Exadata Smart Flash Cache is configured across all the flash-based cell disks. Use the LIST FLASHCACHE DETAIL command to confirm that Exadata Smart Flash Cache is configured on your flash-based cell disks.
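The command named above is run as follows:
CellCLI> list flashcache detail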
Storage Objects in Exadata
In an Exadata environment we have the following disk types:
Physical disk is a hard disk on a storage cell. Each storage cell has 12 physical disks, all with the same capacity (600 GB, 2 TB or 3 TB, depending on the hardware generation).
Flashdisk is a Sun Flash Accelerator PCIe solid-state disk on a storage cell. Each storage cell has 16 flashdisks: 24 GB each in X2 (Sun Fire X4270 M2) and 100 GB each in X3 (Sun Fire X4270 M3) servers.
Celldisk is a logical disk created on every physicaldisk and every flashdisk on a storage cell. Celldisks created on physicaldisks are named CD_00_cellname, CD_01_cellname ... CD_11_cellname. Celldisks created on flashdisks are named FD_00_cellname, FD_01_cellname ... FD_15_cellname.
Griddisk is a logical disk that can be created on a celldisk. In a standard Exadata deployment we create griddisks on hard-disk-based celldisks only. While it is possible to create griddisks on flashdisks, this is not a standard practice; instead the flash-based celldisks are used for the flashcache and flashlog (see the sketch after this list).
ASM disk, in an Exadata environment, is a grid disk. ASM disks are used to create ASM disk groups, and both ASM and database instances have access to them.
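For illustration, the flash cache and flash log are created on the flash-based cell disks with commands like the following; sizes are omitted here so the defaults consume the available flash (a sketch, not a deployment procedure):
CellCLI> create flashlog all
CellCLI> create flashcache all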
Auto Disk Management Concept in Exadata
These are the disk operations that are automated in Exadata:
1. Grid disk status change to OFFLINE/ONLINE
If a griddisk becomes temporarily unavailable, it will be automatically OFFLINED by ASM. When the griddisk becomes available, it will be automatically ONLINED by ASM.
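The resulting status changes can be observed from the ASM instance; a minimal query against standard V$ASM_DISK columns:
SQL> select name, mode_status, state from v$asm_disk order by name;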
2. Grid disk DROP/ADD
If a physical disk fails, all grid disks on that physical disk will be DROPPED with FORCE option by ASM. If a physical disk status changes to predictive failure, all griddisks on that physical disk will be DROPPED by ASM. If a flash disk performance degrades, the corresponding griddisks (if any) will be DROPPED with FORCE option by ASM.
When a physical disk is replaced, the celldisk and griddisks will be recreated by CELLSRV, and the griddisks will be automatically ADDED by ASM.
NOTE: If a griddisk that is in NORMAL state and ONLINE mode status is manually dropped with the FORCE option (for example, by a DBA running 'alter diskgroup ... drop disk ... force'), it will be automatically added back by ASM. In other words, dropping a healthy disk with the FORCE option will not achieve the desired effect.
3. Grid disk OFFLINE/ONLINE for rolling Exadata software (storage cells) upgrade
Before the rolling upgrade all griddisks will be inactivated on the storage cell by CELLSRV and OFFLINED by ASM. After the upgrade all griddisks will be activated on the storage cell and ONLINED in ASM.
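Before and after the rolling upgrade, it is common to confirm the griddisk state on each cell; a minimal check using the standard griddisk attributes:
CellCLI> list griddisk attributes name,asmmodestatus,asmdeactivationoutcome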
4. Manual grid disk activation/inactivation
If a griddisk is manually inactivated on a storage cell, by running 'cellcli -e alter griddisk ... inactive', it will be automatically OFFLINED by ASM. When a griddisk is activated on a storage cell, it will be automatically ONLINED by ASM.
5. Grid disk confined ONLINE/OFFLINE
If a grid disk is taken offline by CELLSRV because the underlying disk is suspected of poor performance, all grid disks on that cell disk will be automatically OFFLINED by ASM. If the tests confirm that the celldisk is performing poorly, ASM will drop all griddisks on that celldisk. If the tests find that the disk is actually fine, ASM will online all grid disks on that celldisk.
Software components
1. Cell Server (CELLSRV)
The Cell Server (CELLSRV) runs on the storage cell and it's the main component of Exadata software. In the context of automatic disk management, its tasks are to process the Management Server notifications and handle ASM queries about the state of griddisks.
2. Management Server (MS)
The Management Server (MS) runs on the storage cell and implements a web service for cell management commands, and runs background monitoring threads. The MS monitors the storage cell for hardware changes (e.g. disk plugged in) or alerts (e.g. disk failure), and notifies the CELLSRV about those events.
3. Automatic Storage Management (ASM)
The Automatic Storage Management (ASM) instance runs on the compute (database) node and has two processes that are relevant to the automatic disk management feature:
Exadata Automation Manager (XDMG) initiates automation tasks involved in managing Exadata storage. It monitors all configured storage cells for state changes, such as a failed disk getting replaced, and performs the required tasks for such events. Its primary tasks are to watch for inaccessible disks and cells and when they become accessible again, to initiate the ASM ONLINE operation.
Exadata Automation Worker (XDWK) performs automation tasks requested by XDMG. It gets started when asynchronous actions such as disk ONLINE, DROP and ADD are requested by XDMG. After a 5 minute period of inactivity, this process will shut itself down.
Working together
All three software components work together to achieve automatic disk management.
In the case of disk failure, the MS detects that the disk has failed. It then notifies the CELLSRV about it. If there are griddisks on the failed disk, the CELLSRV notifies ASM about the event. ASM then drops all griddisks from the corresponding disk groups.
In the case of a replacement disk inserted into the storage cell, the MS detects the new disk and checks the cell configuration file to see if celldisk and griddisks need to be created on it. If yes, it notifies the CELLSRV to do so. Once finished, the CELLSRV notifies ASM about new griddisks and ASM then adds them to the corresponding disk groups.
In the case of a poorly performing disk, the CELLSRV first notifies ASM to offline the disk. If possible, ASM then offlines the disk. One example of when ASM would refuse to offline the disk is when a partner disk is already offline; offlining the disk would result in the disk group dismounting, so ASM would not do that. Once the disk is offlined by ASM, it notifies the CELLSRV that the performance tests can be carried out. Once done with the tests, the CELLSRV will either tell ASM to drop that disk (if it failed the tests) or online it (if it passed the tests).
The actions by MS, CELLSRV and ASM are coordinated in a similar fashion, for other disk events.
ASM initialization parameters
The following are the ASM initialization parameters relevant to the auto disk management feature:
_AUTO_MANAGE_EXADATA_DISKS controls the auto disk management feature. To disable the feature set this parameter to FALSE. Range of values: TRUE [default] or FALSE.
_AUTO_MANAGE_NUM_TRIES controls the maximum number of attempts to perform an automatic operation. Range of values: 1-10. Default value is 2.
_AUTO_MANAGE_MAX_ONLINE_TRIES controls the maximum number of attempts to ONLINE a disk. Range of values: 1-10. Default value is 3.
All three parameters are static, which means changing them requires an ASM instance restart. Note that all of these are hidden (underscore) parameters that should not be modified unless advised by Oracle Support.
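If Oracle Support does advise disabling the feature, the change would look something like the following, applied to the ASM spfile and followed by a rolling restart of the ASM instances (a sketch only):
SQL> alter system set "_auto_manage_exadata_disks"=FALSE scope=spfile sid='*';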
Files
The following are the files relevant to the automatic disk management feature:
1. Cell configuration file - $OSSCONF/cell_disk_config.xml. An XML file on the storage cell that contains information about all configured objects (storage cell, disks, IORM plans, etc) except alerts and metrics. The CELLSRV reads this file during startup and writes to it when an object is updated (e.g. updates to IORM plan).
2. Grid disk file - $OSSCONF/griddisk.owners.dat. A binary file on the storage cell that contains the following information for all griddisks:
ASM disk name
ASM disk group name
ASM failgroup name
Cluster identifier (which cluster this disk belongs to)
Requires DROP/ADD (should the disk be dropped from or added to ASM)
3. MS log and trace files - ms-odl.log and ms-odl.trc in $ADR_BASE/diag/asm/cell/`hostname -s`/trace directory on the storage cell.
4. CELLSRV alert log - alert.log in $ADR_BASE/diag/asm/cell/`hostname -s`/trace directory on the storage cell.
5. ASM alert log - alert_+ASMn.log in $ORACLE_BASE/diag/asm/+asm/+ASMn/trace directory on the compute node.
6. XDMG and XDWK trace files - +ASMn_xdmg_nnnnn.trc and +ASMn_xdwk_nnnnn.trc in $ORACLE_BASE/diag/asm/+asm/+ASMn/trace directory on the compute node.
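For troubleshooting, the cell-side logs listed above can be followed directly on the storage cell; for example (cell hostname reused from the earlier CellCLI examples, paths as given above):
[celladmin@qr01celadm01 ~]$ tail -f $ADR_BASE/diag/asm/cell/`hostname -s`/trace/alert.log
[celladmin@qr01celadm01 ~]$ tail -f $ADR_BASE/diag/asm/cell/`hostname -s`/trace/ms-odl.log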
Exadata – Change Diskgroup Redundancy from High to Normal
Step 1: Drop Diskgroup
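The command issued on the ASM instance was of the following form (reconstructed from the alert log excerpt below):
SQL> drop diskgroup RECO_FHDB including contents;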
SUCCESS: drop diskgroup RECO_FHDB including contents
Tue Jul 17 22:55:09 2012
NOTE: diskgroup resource ora.RECO_FHDB.dg is dropped
Step 2: Extract the DDL (create diskgroup) command for RECO_FHDB from the ASM alert log, replace the redundancy clause, and run the create diskgroup command on the ASM instance.
SQL> CREATE DISKGROUP RECO_FHDB NORMAL REDUNDANCY DISK
'o/192.168.10.10/RECO_FHDB_CD_00_fhdbcel06',
…….
'o/192.168.10.9/RECO_FHDB_CD_11_fhdbcel05' ATTRIBUTE
'compatible.asm'='11.2.0.2','compatible.rdbms'='11.2.0.2','au_size'='4M','cell.smart_scan_capable'='TRUE' /* ASMCA */
SUCCESS: diskgroup RECO_FHDB was mounted
The ASM spfile, OCR and voting disks were located on the DATA_FHDB diskgroup, so I had to relocate those files from DATA_FHDB to RECO_FHDB before recreating the DATA_FHDB diskgroup with normal redundancy.
Step 1: Dropping the diskgroup throws the following error when the ASM spfile is located on the same diskgroup:
SQL> drop diskgroup DATA_FHDB including contents
NOTE: Active use of SPFILE in group
Wed Jul 18 14:49:29 2012
GMON querying group 1 at 18 for pid 19, osid 9914
Wed Jul 18 14:49:29 2012
NOTE: Instance updated compatible.asm to 11.2.0.2.0 for grp 1
ORA-15039: diskgroup not dropped
ORA-15027: active use of diskgroup "DATA_FHDB" precludes its dismount
Step 2: Move OCR and voting disk to RECO_FHDB
[oracle@fhdbdb01 ~]$ ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 3344
Available space (kbytes) : 258776
ID : 1272363019
Device/File Name : +DATA_FHDB
Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check bypassed due to non-privileged user
[root@fhdbdb01 cssd]# ocrconfig -add +RECO_FHDB
[root@fhdbdb01 cssd]#
[root@fhdbdb01 cssd]# ocrconfig -delete +DATA_FHDB
[root@fhdbdb01 cssd]# ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 3364
Available space (kbytes) : 258756
ID : 1272363019
Device/File Name : +RECO_FHDB
Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check succeeded
[root@fhdbdb01 ~]$ crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 75c79c52f88b4fcebf2f84ccad0be646 (o/192.168.10.10/DATA_FHDB_CD_00_fhdbcel06) [DATA_FHDB]
2. ONLINE 14f6d0e1c8b94f3bbf222b821f7f48ab (o/192.168.10.11/DATA_FHDB_CD_00_fhdbcel07) [DATA_FHDB]
3. ONLINE 7aed830fb6ee4f70bf9160b2f39ea64b (o/192.168.10.5/DATA_FHDB_CD_00_fhdbcel01) [DATA_FHDB]
4. ONLINE 9cc87608cabd4fb0bfea7e1f7d403134 (o/192.168.10.6/DATA_FHDB_CD_00_fhdbcel02) [DATA_FHDB]
5. ONLINE 2c6008a2c0864fbfbf4ae1c9cbc60d5c (o/192.168.10.7/DATA_FHDB_CD_00_fhdbcel03) [DATA_FHDB]
[root@fhdbdb01 cssd]# crsctl replace votedisk +RECO_FHDB
Successful addition of voting disk 161fa97cc71e4fffbfe10408e1e32aa0.
Successful addition of voting disk 128fb088bd7c4fe7bf6dff63d946dbc6.
Successful addition of voting disk 804b6348a5974f53bfccb328b92f9350.
Successful deletion of voting disk 75c79c52f88b4fcebf2f84ccad0be646.
Successful deletion of voting disk 14f6d0e1c8b94f3bbf222b821f7f48ab.
Successful deletion of voting disk 7aed830fb6ee4f70bf9160b2f39ea64b.
Successful deletion of voting disk 9cc87608cabd4fb0bfea7e1f7d403134.
Successful deletion of voting disk 2c6008a2c0864fbfbf4ae1c9cbc60d5c.
Successfully replaced voting disk group with +RECO_FHDB.
CRS-4266: Voting file(s) successfully replaced
[root@fhdbdb01 cssd]# crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 161fa97cc71e4fffbfe10408e1e32aa0 (o/192.168.10.10/RECO_FHDB_CD_00_fhdbcel06) [RECO_FHDB]
2. ONLINE 128fb088bd7c4fe7bf6dff63d946dbc6 (o/192.168.10.11/RECO_FHDB_CD_00_fhdbcel07) [RECO_FHDB]
3. ONLINE 804b6348a5974f53bfccb328b92f9350 (o/192.168.10.5/RECO_FHDB_CD_00_fhdbcel01) [RECO_FHDB]
Located 3 voting disk(s).
Step 3: Move ASM spfile.
SQL> create pfile='/nfs/zfs/init+ASM.ora' from spfile;
File created.
SQL> create spfile='+RECO_FHDB/fhdb-cluster/asmparameterfile/spfileASM.ora' from pfile='/nfs/zfs/init+ASM.ora';
File created.
echo "SPFILE='+RECO_FHDB/fhdb-cluster/asmparameterfile/spfileASM.ora'" > init+ASM.ora
Step 4: Drop DATA_FHDB diskgroup
SQL> drop diskgroup DATA_FHDB including contents;
drop diskgroup DATA_FHDB including contents
*
ERROR at line 1:
ORA-15039: diskgroup not dropped
ORA-15027: active use of diskgroup "DATA_FHDB" precludes its dismount
ASMCMD> cd DATA_FHDB/
ASMCMD> ls
fhdb-cluster/
ASMCMD> cd fhdb-cluster
ASMCMD> ls
ASMPARAMETERFILE/
OCRFILE/
ASMCMD> cd ASMPARAMETERFILE/
ASMCMD> ls
REGISTRY.253.788355279
ASMCMD> rm REGISTRY.253.788355279
ORA-15032: not all alterations performed
ORA-15028: ASM file '+DATA_FHDB/fhdb-cluster/ASMPARAMETERFILE/REGISTRY.253.788355279' not dropped; currently being accessed (DBD ERROR: OCIStmtExecute)
SQL> alter diskgroup DATA_FHDB dismount force;
Diskgroup altered.
SQL> drop diskgroup DATA_FHDB force including contents;
Diskgroup dropped.
Step 5: Create DATA_FHDB diskgroup
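The create command is analogous to the RECO_FHDB command shown in Step 2 above, using the DATA_FHDB grid disks and NORMAL redundancy. A sketch using a discovery wildcard; the exact disk list and attributes should be taken from the original DDL recorded in the ASM alert log:
SQL> CREATE DISKGROUP DATA_FHDB NORMAL REDUNDANCY DISK 'o/*/DATA_FHDB_CD_*' ATTRIBUTE
'compatible.asm'='11.2.0.2','compatible.rdbms'='11.2.0.2','au_size'='4M','cell.smart_scan_capable'='TRUE';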
Configuring Hosts to Access Exadata Cells
Configuration files on each database server enable access to Exadata storage.
- cellinit.ora identifies the storage network (InfiniBand) IP addresses on the database server.
- The cellinit.ora file contains the database server IP addresses that connect to the storage network. This file is host specific and contains the IP addresses of the InfiniBand storage network interfaces for that database server. The IP addresses are specified in Classless Inter-Domain Routing (CIDR) format.
- cellip.ora identifies the Exadata cells that are accessible to the database server (sample contents for both files are sketched after this list).
- To ensure that ASM discovers Exadata grid disks, set the ASM_DISKSTRING initialization parameter. A search string with the following form is used to discover Exadata grid disks:
o/<cell IP address>/<grid disk name>
Wildcards may be used to expand the search string. For example, to explicitly discover all the available Exadata grid disks set ASM_DISKSTRING='o/*/*'. To discover a subset of available grid disks having names that begin with data, set
ASM_DISKSTRING='o/*/data*'.
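For illustration, the two files might look like this on a database server; the addresses below are hypothetical and site specific:
cellinit.ora:
ipaddress1=192.168.10.1/22
cellip.ora:
cell="192.168.10.5"
cell="192.168.10.6"
cell="192.168.10.7"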
Bear in mind the following general considerations when reconfiguring Exadata storage.
Reconfiguring an existing disk group requires the ability to drop disks from the disk group, reconfigure them, and then add them back into the disk group. If the amount of free space in the disk group is greater than the REQUIRED_MIRROR_FREE_MB value reported in V$ASM_DISKGROUP, then you can use methods which reconfigure the disk group one cell at a time. If the free space is less than REQUIRED_MIRROR_FREE_MB, then you may need to reorganize your storage to create more free space. It may also be possible, though not recommended, to reconfigure the storage one disk at a time.
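The headroom can be checked with a simple query against V$ASM_DISKGROUP, for example:
SQL> select name, total_mb, free_mb, required_mirror_free_mb, usable_file_mb from v$asm_diskgroup;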
Best practices recommend that all disks in an ASM disk group should be of equal size and have equal performance characteristics. For Exadata this means that all the grid disks allocated to a disk group should be the same size and occupy the same region on each disk. There should not be a mixture of interleaved and non-interleaved grid disks; likewise, there should not be a mixture of disks from high-capacity cells and high-performance cells. Finally, the grid disks should all occupy the same location on each disk.
- If you try to drop a grid disk without the FORCE option, the command will not be processed and an error will be displayed if the grid disk is being used by an ASM disk group. If you remove a disk from an ASM disk group, ensure that the resulting rebalance operation completes before attempting to drop the associated grid disk (a safe sequence is sketched after this list). If you need to use the DROP GRIDDISK command with the FORCE option, use extreme caution since incorrectly dropping an active grid disk could result in data loss. Likewise, if you try to drop a cell disk without the FORCE option, the command will not be processed and an error will be displayed if the cell disk contains any grid disks. It is possible to use the DROP CELLDISK command with the FORCE option to drop a cell disk and all the associated grid disks. Use the FORCE option with extreme caution since incorrectly dropping an active grid disk could result in data loss.
- Clusterware files (cluster registry and voting disks) are stored by default in a special ASM disk group named DBFS_DG. Resizing the DBFS_DG disk group is generally not recommended since the grid disks associated with it are sized specifically to match the size of the system areas on the first two disks in each cell. If there is a requirement to alter this disk group, or the underlying grid disks or cell disks, special care must be taken to preserve the clusterware files.
- Reconfiguring Exadata storage on an active system without any downtime is possible; however, doing so can be a time-consuming process involving many ASM rebalancing operations. The time required depends on the number of storage cells, the existing disk usage and the load on the system.
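A safe sequence for removing one grid disk, using hypothetical disk group and grid disk names, might look like the following; wait until V$ASM_OPERATION shows no running rebalance before dropping anything on the cell that owns the grid disk:
SQL> alter diskgroup DATA_FHDB drop disk DATA_FHDB_CD_11_FHDBCEL05;
SQL> select * from v$asm_operation;
CellCLI> drop griddisk DATA_FHDB_CD_11_fhdbcel05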
How to configure Exadata cell alert notification?
We can configure SMTP alerting at the cell level, using the cellcli utility on each cell, to receive alert notifications by providing the details shown below.
To check the current configuration:
CellCLI> list cell detail
.........
notificationMethod: mail
notificationPolicy: critical,warning,clear
.........
smtpFrom: "Oracle Database Machine"
smtpFromAddr: exadata@company.com
smtpPort: 25
smtpPwd: ******
smtpServer: 192.168.1.11
smtpToAddr: "system@company.com, admin@company.com"
smtpUser: exadata
smtpUseSSL: FALSE
CellCLI> ALTER CELL smtpServer='mailserver.example.com',
smtpFromAddr='exadataalert@easyreliable.com',
smtpFrom='Exadata Alert',
smtpToAddr='exaadtadba@easyreliable.com',
notificationPolicy='maintenance,clear,warning,critical',
notificationMethod='mail,snmp'
Here,
smtpServer - mail server name
smtpFromAddr - email address from which alerts are sent
smtpToAddr - email address(es) to which alerts are sent
notificationPolicy - defines which alert severities are sent
notificationMethod - method of notification (mail, SNMP)
Validate email notification on a cell by executing:
CellCLI> ALTER CELL VALIDATE MAIL
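To apply or check the same notification settings across all cells at once, the dcli utility can be used from a database server. A sketch, assuming a cell_group file listing the cell hostnames and the celladmin user:
[oracle@fhdbdb01 ~]$ dcli -g cell_group -l celladmin "cellcli -e alter cell validate mail"
[oracle@fhdbdb01 ~]$ dcli -g cell_group -l celladmin "cellcli -e list cell attributes name,notificationMethod,notificationPolicy"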
We can also change the format of the alert emails by executing commands like:
CellCLI> ALTER CELL emailFormat='text'
CellCLI> ALTER CELL emailFormat='html'
To disable alert emails:
CellCLI> alter cell notificationMethod=null
To re-enable alert emails:
CellCLI> alter cell notificationMethod='mail,snmp'
or
CellCLI> alter cell notificationMethod='mail'