APPENDIX A
RAID Concepts
In the not-too-distant past, a 1TB
database was considered to be pretty big. Currently, 1PB–2PB defines the
lower boundary for a large database. In the not-too-distant future, exabyte, zettabyte, and yottabyte will become commonly bandied terms near the DBA water cooler.
As companies store more and more data, the need for
disk space continues to grow. Managing database storage is a key
responsibility of every database administrator. DBAs are tasked with
estimating the initial size of databases, recognizing growth patterns,
and monitoring disk usage. Overseeing these operations is critical to
ensuring the availability of company data.
Here are some common DBA tasks associated with storage management:
- Determining disk architecture for database applications
- Planning database capacity
- Monitoring and managing growth of database files
Before more storage is added to a database server, system administrators (SAs)
and DBAs should sit down and figure out which disk architecture offers
the best availability and performance for a given budget. When working
with SAs, an effective DBA needs to be somewhat fluent in the language
of disk technologies. Specifically, DBAs must have a basic understanding
of RAID disk technology and its implications for database performance
and availability.
Even if your opinion isn’t solicited in regard to disk
technology, you still need to be familiar with the basic RAID
configurations that will allow you to make informed decisions about
database tuning and troubleshooting. This appendix discusses the
fundamental information a DBA needs to know about RAID.
Understanding RAID
As a DBA, you need to be knowledgeable about RAID
designs to ensure that you use an appropriate disk architecture for your
database application. RAID, which is an acronym for Redundant Array
of Inexpensive (or Independent) Disks, allows you to configure several
independent disks to logically appear as one disk to the application.
There are two important reasons to use RAID:
- To spread I/O across several disks, thus improving bandwidth
- To eliminate a lone physical disk as a single point of failure
If the database process that is
reading and writing updates to disk can parallelize I/O across many
disks (instead of a single disk), the bandwidth can be dramatically
improved. RAID also allows you to configure several disks so that you
never have one disk as a single point of failure. For most database
systems, it is critical to have redundant hardware to ensure database
availability.
The purpose of this section is not to espouse one RAID
technology over another. You’ll find bazillions of blogs and white
papers on the subject of RAID. Each source of information has its own
guru that evangelizes one form of RAID over another. All these sources
have valid arguments for why their favorite flavor of RAID is the best
for a particular situation.
Be wary of blanket statements
regarding the performance and availability of RAID technology. For
example, you might hear somebody state that RAID 5 is always better than
RAID 1 for database applications. You might also hear somebody state
that RAID 1 has superior fault tolerance over RAID 5. In most cases, the
superiority of one RAID technology over another depends on several
factors, such as the I/O behavior of the database application and the
various components of the underlying stack of hardware and software. You
may discover that what performs well in one scenario is not true in
another; it really depends on the entire suite of technology in use.
The goal here is to describe the performance and fault
tolerance characteristics of the most commonly used RAID technologies.
We explain in simple terms and with clear examples how the basic forms
of RAID technology work. This base knowledge enables you to make an
informed disk technology decision dependent on the business requirements
of your current environment. You should also be able to take the
information contained in this section and apply it to the more
sophisticated and emerging RAID architectures.
Defining Array, Stripe Width, Stripe Size, Chunk Size
Before diving into the technical details of RAID, you first need to be familiar with a few terms: array, stripe width, stripe size, and chunk size.
An array is simply a
collection of disks grouped together to appear as a single device to the
application. Disk arrays allow for increased performance and fault
tolerance.
The stripe width is the
number of parallel pieces of data that can be written or read
simultaneously to an array. The stripe width is usually equal to the
number of disks in the array. In general (with all other factors being
equal), the larger the stripe width, the greater the throughput
performance of the array. For example, you will generally see greater
read/write performance from an array of twelve 32GB drives than from an
array of four 96GB drives.
The stripe size is the
amount of data you want written in parallel to an array of disks.
Determining the optimal stripe size can be a highly debatable topic.
Decreasing the stripe size usually increases the number of drives a file
will use to store its data. Increasing the stripe size usually
decreases the number of drives a file will employ to write and read to
an array. The optimal stripe size depends on your database application
I/O characteristics, along with the hardware and software of the system.
Note The
stripe size is usually a configurable parameter that the storage
administrator can change dynamically. Contrast that with the stripe
width, which can be changed only by increasing or decreasing the
physical number of disks in the array.
The chunk size is a subset of the stripe size. The chunk size (also called the striping unit) is the amount of data written to each disk in the array as part of a stripe.
Figure A-1
shows a 4KB stripe size that is being written to an array of four disks
(a stripe width of 4). Each disk gets a 1KB chunk written to it.
Figure A-1. A 4KB stripe of data is written to four disks as 1KB chunks
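In other words, chunk size equals the stripe size divided by the stripe width. Here is a minimal shell sketch of that arithmetic, using the numbers from Figure A-1 (the variable names are only for illustration):

# Chunk size derived from stripe size and stripe width
# (numbers taken from the Figure A-1 example).
stripe_size_kb=4
stripe_width=4
echo "chunk size: $(( stripe_size_kb / stripe_width )) KB per disk"   # chunk size: 1 KB per disk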
The chunk size can have significant performance
effects. An inappropriate chunk size can result in I/O being
concentrated on single disks within the array. If this happens, you may
end up with an expensive array of disks that perform no better than a
single disk.
What’s the correct chunk size to use for database
applications? It depends somewhat on the average size of the I/O requests
your database generates. Typically, database I/O consists of several
simultaneous and small I/O requests. Ideally, each small I/O request
should be serviced by one disk, with the multiple I/O requests spread
out across all disks in the array. So in this scenario, you want your
chunk size to be a little larger than the average database I/O size.
Tip You’ll
have to test your particular database and disk configuration to
determine which chunk size results in the best I/O distribution for a
given application and its average I/O size.
RAID 0
RAID 0 is commonly known as striping,
which is a technique that writes chunks of data across an array of
disks in a parallel fashion. Data is also read from disks in the same
way, which allows several disks to participate in the read/write
operations. The idea behind striping is that simultaneous access to
multiple disks will have greater bandwidth than I/O to a single disk.
Note One
disk can be larger than the other disks in a RAID 0 device (and the
additional space is still used). However, this is not recommended
because I/O will be concentrated on the large disk where more space is
available.
Figure A-2 demonstrates how RAID 0 works. This RAID 0 disk array physically comprises four disks. Logically, it looks like one disk (/mount01)
to the application. The stripe of data written to the RAID 0 device
consists of 16 bits: 0001001000110100. Each disk receives a 4-bit chunk
of the stripe.
Figure A-2. Four-disk RAID 0 striped device
With RAID 0, your realized disk capacity is the number
of disks times the size of the disk. For example, if you have four
100GB drives, the overall realized disk capacity available to the
application is 400GB. In this sense, RAID 0 is a very cost-effective
solution.
RAID 0 also provides excellent I/O performance. It
allows for simultaneous reading and writing on all disks in the array.
This spreads out the I/O, which reduces disk contention, alleviates
bottlenecks, and provides excellent I/O performance.
The huge downside to RAID 0 is that it doesn’t provide
any redundancy. If one disk fails, the entire array fails. Therefore,
you should never use RAID 0 for data you consider to be critical. You
should use RAID 0 only for files that you can easily recover and only
when you don’t require a high degree of availability.
Tip One
way to remember what RAID 0 means is that it provides “0” redundancy.
You get zero fault tolerance with RAID 0. If one disk fails, the whole
array of disks fails.
RAID 1
RAID 1 is commonly known as mirroring,
which means that each time data is written to the storage device, it is
physically written to two (or more) disks. In this configuration, if
you lose one disk of the array, you still have another disk that
contains a byte-for-byte copy of the data.
Figure A-3
shows how RAID 1 works. The mirrored disk array is composed of two
disks. Disk 1b is a copy (mirror) of Disk 1a. As the data bits 0001 are
written to Disk 1a, a copy of the data is also written to Disk 1b.
Logically, the RAID 1 array of two disks looks like one disk (/mount01) to the application.
Figure A-3. RAID 1 two-disk mirror
Writes to a RAID 1 device take a little longer than writes
to a single disk because data must be written to each participating
mirrored disk. However, read bandwidth is increased because of parallel
access to the data contained in the mirrored array.
RAID 1 is popular because it is simple to implement
and provides fault tolerance. You can lose one mirrored disk and still
continue operations as long as there is one surviving member. One
downside to RAID 1 is that it reduces the amount of realized disk space
available to the application. Although typically there are only two
disks in a mirrored array, you can have more than two disks in a mirror.
The realized disk space in a mirrored array is the size of one disk, regardless of how many disks are in the mirror.
Here’s the formula for calculating realized disk space for RAID 1:
Number of mirrored arrays * Disk Capacity
For example, suppose that you have four 100GB disks
and you want to create two mirrored arrays with two disks in each array.
The realized available disk space is calculated as shown here:
2 arrays * 100 gigabytes = 200 gigabytes
Another way of formulating it is as follows:
(Number of disks available / number of disks in the array) * Disk Capacity
This formula also shows that the amount of disk space available to the application is 200GB:
(4 / 2) * 100 gigabytes = 200 gigabytes
Tip One
way to remember the meaning of RAID 1 is that it provides 100%
redundancy. You can lose one member of the RAID 1 array and still
continue operations.
Generating Parity
Before discussing the next levels of RAID, it is important to understand the concept of parity
and how it is generated. RAID 4 and RAID 5 configurations use parity
information to provide redundancy against a single disk failure. For a
three-disk RAID 4 or RAID 5 configuration, each write results in two
disks being written to in a striped fashion, with the third disk storing
the parity information.
Parity data contains the information needed to
reconstruct data in the event one disk fails. Parity information is
generated from an XOR (exclusive OR) operation.
Table A-1
describes the inputs and outputs of an XOR operation. The table reads
as follows: if one and only one of the inputs is a 1, the output will be
a 1; otherwise, the output is a 0.
Table A-1. Behavior of an XOR Operation

| Input A | Input B | Output |
|---------|---------|--------|
| 1       | 1       | 0      |
| 1       | 0       | 1      |
| 0       | 1       | 1      |
| 0       | 0       | 0      |
For example, from the first row in Table A-1,
if both bits are a 1, the output of an XOR operation is a 0. From the
second and third rows, if one bit is a 1 and the other bit is a 0, the
output of an XOR operation is a 1. The last row shows that if both bits
are a 0, the output is a 0.
A slightly more complicated example will help clarify this concept. In the example shown in Figure A-4,
there are three disks. Disk 1 is written 0110, and Disk 2 is written
1110. Disk 3 contains the parity information generated by the output of
an XOR operation on data written to Disk 1 and Disk 2.
Figure A-4. Disk 1 XOR Disk 2 = Disk 3 (parity)
How was the parity value of 1000 calculated? The
first bits written to Disk 1 and Disk 2 are 0 and 1, respectively;
therefore, the XOR output is 1. The second bits are both 1, so
the XOR output is 0. The third bits are also both 1, so the output
is 0. The fourth bits are both 0, so the output is 0.
This discussion is summarized here in equation form:
Disk1 XOR Disk2 = Disk3 (parity disk)
----- --- -----   -----
0110  XOR 1110  = 1000
How does parity allow for the recalculation of data in
the event of a failure? For this example, suppose that you lose Disk 2.
The information on Disk 2 can be regenerated by taking an XOR operation
on the parity information (Disk 3) with the data written to Disk 1. An
XOR operation of 0110 and 1000 yields 1110 (which was originally written
to Disk 2). This discussion is summarized here in equation form:
Disk1 XOR Disk3 = Disk2
----- --- -----   -----
0110  XOR 1000  = 1110
You can perform an XOR operation with any number of
disks. Suppose that you have a four-disk configuration. Disk 1 is
written 0101, Disk 2 is written 1110, and Disk 3 is written 0001. Disk 4
contains the parity information, which is the result of Disk 1 XOR Disk
2 XOR Disk 3:
Disk1 XOR Disk2 XOR Disk3 = Disk4 (parity disk)
----- --- ----- --- -----   -----
0101  XOR 1110  XOR 0001  = 1010
Suppose that you lose Disk 2. To regenerate the
information on Disk 2, you perform an XOR operation on Disk 1, Disk 3,
and the parity information (Disk 4), which results in 1110:
Disk1 XOR Disk3 XOR Disk4 = Disk2
----- --- ----- --- -----   -----
0101  XOR 0001  XOR 1010  = 1110
You can always regenerate the data on the drive that
becomes damaged by performing an XOR operation on the remaining disks
with the parity information. RAID 4 and RAID 5 technologies use parity
as a key component for providing fault tolerance. These parity-centric
technologies are described in the next two sections.
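If you want to verify these XOR calculations yourself, the arithmetic can be reproduced at a bash prompt. The following is a minimal sketch using the bit patterns from the preceding examples (it assumes the bc calculator is available; the to_bin helper exists only to print values as 4-bit binary):

# Minimal sketch: reproduce the parity examples above with bash arithmetic.
# to_bin is a display helper that prints a value as 4-bit binary (requires bc).
to_bin() { printf '%04d\n' "$(bc <<< "obase=2; $1")"; }

# Two data disks plus parity
disk1=2#0110; disk2=2#1110
parity=$(( disk1 ^ disk2 ))            # Disk 3 (parity)
to_bin $parity                         # 1000
to_bin $(( disk1 ^ parity ))           # 1110 -- Disk 2 regenerated after a failure

# Three data disks plus parity
disk1=2#0101; disk2=2#1110; disk3=2#0001
parity=$(( disk1 ^ disk2 ^ disk3 ))    # Disk 4 (parity)
to_bin $parity                         # 1010
to_bin $(( disk1 ^ disk3 ^ parity ))   # 1110 -- Disk 2 regenerated after a failure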
RAID 4
RAID 4, which is sometimes referred to as dedicated parity,
writes a stripe (in chunks) across a disk array. One drive is always
dedicated for parity information. A RAID 4 configuration minimally
requires three disks: two disks for data and one for parity. The term RAID 4 does not mean there are four disks in the array; there can be three or more disks in a RAID 4 configuration.
Figure A-5
shows a four-disk RAID 4 configuration. Disk 4 is the dedicated parity
disk. The first stripe consists of the data 000100100011. Chunks of data
0001, 0010, and 0011 are written to Disks 1, 2, and 3, respectively.
The parity value of 0000 is calculated and written to Disk 4.
Figure A-5. Four-disk RAID 4 dedicated parity device
RAID 4 uses an XOR operation to generate the parity information. For each stripe in Figure A-5, the parity information is generated as follows:
Disk1 XOR Disk2 XOR Disk3 = Parity
----- --- ----- --- -----   ------
0001  XOR 0010  XOR 0011  = 0000
0100  XOR 0101  XOR 0110  = 0111
0111  XOR 1000  XOR 1001  = 0110
1010  XOR 1011  XOR 1100  = 1101
Tip Refer to the previous “Generating Parity” section for details on how an XOR operation works.
RAID 4 requires that parity information be generated
and updated for each write, so writes take longer in a RAID 4
configuration than in a RAID 0 configuration. Reading from a RAID 4 configuration
is fast because the data is spread across multiple drives (and
potentially multiple controllers).
With RAID 4, you get more realized disk space than you
do with RAID 1. The RAID 4 amount of disk space available to the
application is calculated with this formula:
(Number of disks – 1) * Disk Capacity
For example, if you have four 100GB disks, the realized disk capacity available to the application is calculated as shown here:
(4 - 1) * 100 gigabytes = 300 gigabytes
In the event of a single disk failure, the remaining
disks of the array can continue to function. For example, suppose that
Disk 1 fails. The Disk 1 information can be regenerated with the parity
information, as shown here:
Disk2 XOR Disk3 XOR Parity = Disk1
----- --- ----- --- ------   -----
0010  XOR 0011  XOR 0000   = 0001
0101  XOR 0110  XOR 0111   = 0100
1000  XOR 1001  XOR 0110   = 0111
1011  XOR 1100  XOR 1101   = 1010
During a single disk failure, RAID 4 performance will
be degraded because the parity information is required for generating
the data on the failed drive. Performance will return to normal levels
after the failed disk has been replaced and its information regenerated.
In practice, RAID 4 is seldom used because of the inherent bottleneck
with the dedicated parity disk.
RAID 5
RAID 5, which is sometimes referred to as distributed parity,
is similar to RAID 4 except that RAID 5 interleaves the parity
information among all the drives available in the disk array. A RAID 5
configuration minimally requires three disks: two for data and one for
parity. The term RAID 5 does not mean there are five disks in the array; there can be three or more disks in a RAID 5 configuration.
Figure A-6
shows a four-disk RAID 5 array. The first stripe of data consists of
000100100011. Three chunks of 0001, 0010, and 0011 are written to Disks
1, 2, and 3; the parity of 0000 is written to Disk 4. The second stripe
writes its parity information to Disk 1, the third stripe writes its
parity to Disk 2, and so on.
Figure A-6. Four-disk RAID 5 distributed parity device
RAID 5 uses an XOR operation to generate the parity information. For each stripe in Figure A-6, the parity information is generated as follows:
0001 XOR 0010 XOR 0011 = 0000
0100 XOR 0101 XOR 0110 = 0111
0111 XOR 1000 XOR 1001 = 0110
1010 XOR 1011 XOR 1100 = 1101
Tip Refer to the previous “Generating Parity” section for details on how an XOR operation works.
Like RAID 4, RAID 5 writes suffer a slight write
performance hit because of the additional update required for the parity
information. RAID 5 performs better than RAID 4 because it spreads the
load of generating and updating parity information to all disks in the
array. For this reason, RAID 5 is almost always preferred over RAID 4.
RAID 5 is popular because it combines good I/O
performance with fault tolerance and cost effectiveness. With RAID 5,
you get more realized disk space than you do with RAID 1. The RAID 5
amount of disk space available to the application is calculated with
this formula:
(Number of disks – 1) * Disk Capacity
Using the previous formula, if you have four 100GB
disks, the realized disk capacity available to the application is
calculated as follows:
(4 - 1) * 100 gigabytes = 300 gigabytes
RAID 5 provides protection against a single disk
failure through the parity information. If one disk fails, the
information from the failed disk can always be recalculated from the
remaining drives in the RAID 5 array. For example, suppose that Disk 3
fails; the remaining data on Disk 1, Disk 2, and Disk 4 can regenerate
the required Disk 3 information as follows:
Disk1 XOR Disk2 XOR Disk4 = Disk3
----- --- ----- --- -----   -----
0001  XOR 0010  XOR 0000  = 0011
0111  XOR 0100  XOR 0110  = 0101
0111  XOR 0110  XOR 1001  = 1000
1010  XOR 1011  XOR 1100  = 1101
During a single disk failure, RAID 5 performance will
be degraded because the parity information is required for generating
the data on the failed drive. Performance will return to normal levels
after the failed disk has been replaced and its information regenerated.
Building Hybrid (Nested) RAID Devices
The RAID 0, RAID 1, and RAID 5 architectures are the
building blocks for more sophisticated storage architectures. Companies
that need better availability can combine these base RAID technologies
to build disk arrays with better fault tolerance. Some common hybrid
RAID architectures are as follows:
- RAID 0+1 (striping and then mirroring)
- RAID 1+0 (mirroring and then striping)
- RAID 5+0 (RAID 5 and then striping)
These configurations are sometimes referred to as hybrid or nested
RAID levels. Much like Lego blocks, you can take the underlying RAID
architectures and snap them together for some interesting configurations
that have performance, fault tolerance, and cost advantages and
disadvantages. These technologies are described in detail in the
following sections.
Note Some
degree of confusion exists about the naming standards for various RAID
levels. The most common industry standard for nested RAID levels is that
RAID A+B means that RAID level A is built first and then RAID level B
is layered on top of RAID level A. This standard is not consistently
applied by all storage vendors. You have to carefully read the
specifications for a given storage device to ensure that you understand
which level of RAID is in use.
RAID 0+1
RAID 0+1 is a disk array that is first striped and then mirrored (a mirror of stripes). Figure A-7
shows an eight-disk RAID 0+1 configuration. Disks 1 through 4 are
written to in a striped fashion. Disks 5 through 8 are a mirror of Disks
1 through 4.
Figure A-7. RAID 0+1 striped and then mirrored device
RAID 0+1 provides the I/O benefits of striping while
providing the sturdy fault tolerance of a mirrored device. This is a
relatively expensive solution because only half the disks in the array
comprise your usable disk space. The RAID 0+1 amount of disk space
available to the application is calculated with this formula:
(Number of disks in stripe) * Disk Capacity
Using the previous formula, if you have eight 100GB
drives with four drives in each stripe, the realized disk capacity
available to the application is calculated as follows:
4 * 100 gigabytes = 400 gigabytes
The RAID 0+1 configuration can survive multiple disk
failures only if the failures occur within one stripe. RAID 0+1 cannot
survive two disk failures if one failure is in one stripe (/dev01) and the other disk failure is in the second stripe (/dev02).
RAID 1+0
RAID 1+0 is a disk array that is first mirrored and then striped (a stripe of mirrors). Figure A-8 displays an eight-disk RAID 1+0 configuration. This configuration is also commonly referred to as RAID 10.
Figure A-8. RAID 1+0 mirrored and then striped device
RAID 1+0 combines the fault tolerance of mirroring
with the performance benefits of striping. This is a relatively
expensive solution because only half the disks in the array comprise
your usable disk space. The RAID 1+0 amount of disk space available to
the application is calculated with this formula:
(Number of mirrored devices) * Disk Capacity
For example, if you start with eight 100GB drives, and
you build four mirrored devices of two disks each, the overall realized
capacity to the application is calculated as follows:
4 * 100 gigabytes = 400 gigabytes
Interestingly, the RAID 1+0 arrangement provides much better fault tolerance than RAID 0+1. Analyze Figure A-8
carefully. The RAID 1+0 hybrid configuration can survive a disk failure
in each stripe and can also survive one disk failure within each
mirror. For example, in this configuration, Disk 1a, Disk 2b, Disk 3a,
and Disk 4b could fail; but the overall device would continue to
function because of the mirrors in Disk 1b, Disk 2a, Disk 3b, and Disk
4a.
Likewise, an entire RAID 1+0 stripe could fail, and
the overall device would continue to function because of the surviving
mirrored members. For example, Disk 1b, Disk 2b, Disk 3b, and Disk 4b
could fail; but the overall device would continue to function because of
the mirrors in Disk 1a, Disk 2a, Disk 3a, and Disk 4a.
Many articles, books, and storage vendor documentation
confuse the RAID 0+1 and RAID 1+0 configurations (they refer to one
when really meaning the other). It is important to understand the
differences in fault tolerance between the two architectures. If you’re
architecting a disk array, ensure that you use the one that meets your
business needs.
Both RAID 0+1 and RAID 1+0 architectures possess the excellent performance attributes of striped storage devices
without the overhead of generating parity. Does RAID 1+0 perform better
than RAID 0+1, or vice versa? Unfortunately, we have to waffle a bit
(no pun intended) on the answer to this question: it depends.
Performance characteristics are dependent on items such as the
configuration of the underlying RAID devices, amount of cache, number of
controllers, I/O distribution of the database application, and so on.
We recommend that you perform an I/O load test to determine which RAID
architecture works best for your environment.
RAID 5+0
RAID 5+0 is a set of disk arrays placed in a RAID 5 configuration and then striped. Figure A-9 displays the architecture of an eight-disk RAID 5+0 configuration.
Figure A-9. RAID 5+0 (RAID 5 and then striped) device
RAID 5+0 is sometimes referred to as striping parity.
Its read performance is slightly lower than that of the other hybrid (nested)
approaches. The write performance is good, however, because each stripe
consists of a RAID 5 device. Because this hybrid is underpinned by RAID 5
devices, it is more cost effective than the RAID 0+1 and RAID 1+0
configurations. The RAID 5+0 amount of disk space available to the
application is calculated with this formula:
(Number of disks - number of disks used for parity) * Disk Capacity
For example, if you have eight 100GB disks with four
disks in each RAID 5 device, the total realized capacity would be
calculated as shown here:
(8 - 2) * 100 gigabytes = 600 gigabytes
RAID 5+0 can survive a single disk failure in either
RAID 5 device. However, if there are two disk failures in one RAID 5
device, the entire RAID 5+0 device will fail.
Determining Disk Requirements
Which RAID technology is best for your environment?
It depends on your business requirements. Some storage gurus recommend
RAID 5 for databases; others argue that RAID 5 should never be used.
There are valid arguments on both sides of the fence. You may be part of
a shop that already has a group of storage experts who predetermine the
underlying disk technology without input from the DBA team. Ideally,
you want to be involved with architecture decisions that affect the
database, but realistically that does not always happen.
Or you might be in a shop that is constrained by cost
and might conclude that a RAID 5 configuration is the only viable
architecture. For your database application, you’ll have to determine
the cost–effective RAID solution that performs well while also providing
the required fault tolerance. This will most likely require you to work
with your storage experts to monitor disk performance and I/O
characteristics.
Tip Refer to Chapter 8 for details on how to use tools such as iostat and sar to monitor disk I/O behavior.
Table A-2
summarizes the various characteristics of each RAID technology. These
are general guidelines, so test the underlying architecture to ensure
that it meets your business requirements before you implement a
production system.
Table A-2. Comparison of RAID Technologies
Table A-2
is intended only to provide general heuristics for determining the
appropriate RAID technology for your environment. There will be some
technologists who might disagree with some of these general guidelines.
In our experience, there are often two strongly opposing opinions about RAID,
and both have valid points of view.
Some variables that are unique to a particular
environment also influence the decision about the best solution. For
this reason, it can be difficult to determine exactly which combination
of chunk, stripe size, stripe width, underlying RAID technology, and
storage vendor will work best over a wide variety of database
applications. If you have the resources to test every permutation under
every type of I/O load, you probably can determine the perfect
combination of the previously mentioned variables.
Realistically, few shops have the time and money to
exercise every possible storage architecture for each database
application. You’ll have to work with your SA and storage vendor to
architect a cost-effective solution for your business that performs well
over a variety of database applications.
Caution Using RAID technology doesn’t
eliminate the need for a backup and recovery strategy. You should
always have a strategy in place to ensure that you can restore and
recover your database. You should periodically test your backup and
recovery strategy to make sure it protects you if all disks fail
(because of a fire, earthquake, tornado, avalanche, grenade, hurricane,
and so on).
Capacity Planning
DBAs are often involved with disk storage capacity planning.
They have to ensure that adequate disk space will be available, both
initially and for future growth, when the database server disk
requirements are first spec’ed out (specified). When using RAID
technologies, you have to be able to calculate the actual amount of disk
space that will be available given the available disks.
For example, when the SA says that there are x
number of type Y disks configured with a given RAID level, you have to
calculate whether there will be enough disk space for your database
requirements.
Table A-3 details the formulas used to calculate the amount of available disk space for each RAID level.
Table A-3. Calculating the Amount of RAID Disk Space Realized

| Disk Technology | Realized Disk Capacity |
|---|---|
| RAID 0 (striped) | Num Disks in Stripe * Disk Size |
| RAID 1 (mirrored) | Num Mirrored Arrays * Disk Size |
| RAID 4 (dedicated parity) | (Num Disks - 1) * Disk Size |
| RAID 5 (distributed parity) | (Num Disks - 1) * Disk Size |
| RAID 0+1 (striped and then mirrored) | Num Disks in Stripe * Disk Size |
| RAID 1+0 (mirrored and then striped) | Num Mirrored Arrays * Disk Size |
| RAID 5+0 (RAID 5 and then striped) | (Num Disks - Num Parity Disks) * Disk Size |
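The formulas in Table A-3 are simple enough to script. Here is a minimal bash sketch (the function names and sample disk counts are only for illustration) that reproduces the worked examples from earlier in this appendix:

# Realized capacity for the RAID levels in Table A-3 (illustrative only).
disk_size_gb=100

raid0()  { echo $(( $1 * disk_size_gb )); }          # $1 = disks in stripe
raid1()  { echo $(( $1 * disk_size_gb )); }          # $1 = mirrored arrays
raid5()  { echo $(( ($1 - 1) * disk_size_gb )); }    # $1 = disks in array (also RAID 4)
raid50() { echo $(( ($1 - $2) * disk_size_gb )); }   # $1 = disks, $2 = parity disks

echo "RAID 0, 4 disks:                 $(raid0 4) GB"      # 400 GB
echo "RAID 1, 2 two-disk mirrors:      $(raid1 2) GB"      # 200 GB
echo "RAID 5, 4 disks:                 $(raid5 4) GB"      # 300 GB
echo "RAID 5+0, 8 disks, 2 for parity: $(raid50 8 2) GB"   # 600 GB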
Be sure to include future database growth requirements
in your disk space calculations. Also consider the amount of disk space
needed for files such as database transaction logs and database
binaries, as well as the space required for database backups (keep in
mind that you may want to keep multiple days’ worth of backups on disk).
Tip A
good rule of thumb is to always keep one database backup on disk, back
up the database backup files to tape, and then move the backup tapes
offsite. That way, you get the performance required for routine
backup and recovery tasks as well as protection against a complete disaster.
APPENDIX B
Server Log Files
Server log files contain
informational messages about the kernel, applications, and services
running on a system. These files can be very useful for troubleshooting
and debugging system-level issues. DBAs often look in the system log
files as a first step in diagnosing server issues. Even if you’re
working with competent SAs, you can still save time and gain valuable
insights into the root cause of a problem by inspecting these log files.
This appendix covers managing Linux and Solaris log
files. You’ll learn about the basic information contained in the log
files and the tools available to rotate the logs.
Managing Linux Log Files
Most of the system log files are located in the /var/log directory. There is usually a log file for a specific application or service. For example, the cron utility has a log file named cron (no surprise) in the /var/log directory. Depending on your system, you may need root privileges to view certain log files.
The log files will vary somewhat by the version of the OS and the applications running on your system. Table B-1 contains the names of some of the more common log files and their descriptions.
Table B-1. Typical Linux Log Files and Descriptions

| Log File Name | Purpose |
|---|---|
| /var/log/boot.log | System boot messages |
| /var/log/cron | cron utility log file |
| /var/log/maillog | Mail server log file |
| /var/log/messages | General system messages |
| /var/log/secure | Authentication log file |
| /var/log/wtmp | Login records |
| /var/log/yum.log | yum utility log file |
Note Some utilities can have their own subdirectory under the /var/log directory.
Rotating Log Files
The system log files will continue to grow unless they are somehow moved or removed. Moving and removing log files is known as rotating the log files, which means that the current log file is renamed, and a new log file is created.
Most Linux systems use the logrotate
utility to rotate the log files. This tool automates the rotation,
compression, removal, and mailing of log files. Typically, you’ll rotate
your log files so that they don’t become too large and cluttered with
old data. You should delete log files that are older than a certain
number of days.
By default, the logrotate utility is automatically run from the cron scheduling tool on most Linux systems. Here’s a typical listing of the contents of the /etc/crontab file:
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/
# run-parts
01 * * * * root run-parts /etc/cron.hourly
02 4 * * * root run-parts /etc/cron.daily
22 4 * * 0 root run-parts /etc/cron.weekly
42 4 1 * * root run-parts /etc/cron.monthly
Notice that the /etc/crontab uses the run-parts utility to run all scripts located within a specified directory. For example, when run-parts inspects the /etc/cron.daily directory, it finds a file named logrotate that calls the logrotate utility. Listed here are the contents of a typical logrotate script:
#!/bin/sh
/usr/sbin/logrotate /etc/logrotate.conf
EXITVALUE=$?
if [ $EXITVALUE != 0 ]; then
    /usr/bin/logger -t logrotate "ALERT exited abnormally with [$EXITVALUE]"
fi
exit 0
The behavior of the logrotate utility is governed by the /etc/logrotate.conf file. Here’s a listing of a typical /etc/logrotate.conf file:
# see "man logrotate" for details # rotate log files weekly weekly # keep 4 weeks worth of backlogs rotate 4 # create new (empty) log files after rotating old ones create # uncomment this if you want your log files compressed #compress # RPM packages drop log rotation information into this directory include /etc/logrotate.d # no packages own wtmp -- we’ll rotate them here /var/log/wtmp { monthly create 0664 root utmp rotate 1 } # system-specific logs may be also be configured here.
By default, the logs are rotated weekly on most Linux
systems, and four weeks’ worth of logs are preserved. These are
designated by the lines weekly and rotate 4 in the /etc/logrotate.conf file. You can change the values within the /etc/logrotate.conf file to suit the rotating requirements of your environment.
If you list the files in the /var/log directory, notice that some log files end with an extension of .1 or .gz. This indicates that the logrotate utility is running on your system.
You can manually run the logrotate utility to rotate the log files. Use the -f option to force a rotation, even if logrotate doesn’t think it is necessary:
# logrotate -f /etc/logrotate.conf
Application-specific logrotate configurations are stored in the /etc/logrotate.d directory. Change to the /etc/logrotate.d directory and list its contents to see some typical application-specific configuration files on a Linux server:
# cd /etc/logrotate.d
# ls
acpid  cups  mgetty  ppp  psacct  rpm  samba  syslog  up2date  yum
Setting Up a Custom Log Rotation
The logrotate utility is sometimes perceived as a utility only for SAs. However, any user on the system can use logrotate to rotate log files for applications for which they have read/write permissions on the log files. For example, as the oracle user, you can use logrotate to rotate your database alert.log file.
Here are the steps for setting up a job to rotate the alert log file of an Oracle database:
- Create a configuration file named alert.conf in the directory $HOME/config (create the config directory if it doesn’t already exist):
/oracle/RMDB1/admin/bdump/*.log {
    daily
    missingok
    rotate 7
    compress
    mail oracle@localhost
}
- In the preceding configuration file, the first line specifies the location of the log file. The asterisk (wildcard) tells logrotate to look for any file with the extension of .log in that directory. The daily keyword specifies that the log file should be rotated on a daily basis. The missingok keyword specifies that logrotate should not throw an error if it doesn’t find any log files. The rotate 7 keyword specifies that the log files should be kept for seven days. The compress keyword compresses the rotated log file. Finally, a status e-mail is sent to the local oracle user on the server.
- Create a cron job to automatically run the job on a daily basis:
0 9 * * * /usr/sbin/logrotate -f -s /home/oracle/config/alrotate.status /home/oracle/config/alert.conf
Note The cron entry must appear as a single line in your cron table (it may wrap when displayed here). - The cron job runs the logrotate utility every day at 9 a.m. The -s (status) option directs the status file to the specified directory and file. The configuration file used is /home/oracle/config/alert.conf.
- Manually test the job to see whether it rotates the alert log correctly. Use the -f switch to force logrotate to do a rotation:
$ /usr/sbin/logrotate -f -s /home/oracle/config/alrotate.status \
  /home/oracle/config/alert.conf
As shown in the previous steps, you can use the logrotate utility to set up log rotation jobs.
Consider using logrotate instead of writing a custom shell script such as the one described in Recipe 10-8.
Monitoring Log Files
Many Linux systems have graphical interfaces for monitoring and managing
the log files. As a DBA, you often need to look only at a specific log
file when trying to troubleshoot a problem. In these scenarios, it is
usually sufficient to manually inspect the log files with a text editor
such as vi or a paging utility such as more or less.
You can also monitor the logs with the logwatch utility. You can modify the default behavior of logwatch by modifying the logwatch.conf file. Depending on your Linux system, the logwatch.conf file is usually located in a directory named /etc/log.d. To print the default log message details, use the --print option:
# logwatch --print
Many SAs set up a daily job to be run that automatically e-mails the logwatch report to a specified user. Usually this functionality is implemented as a script located in the /etc/cron.daily directory. The name of the script will vary by Linux system. Typically, these scripts are named something like 0logwatch or 00-logwatch.
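The exact contents vary by distribution, but such a script can be as simple as the following sketch (the script location, the logwatch path, and the root recipient are assumptions for illustration):

#!/bin/bash
# Hypothetical /etc/cron.daily/0logwatch sketch: mail the daily logwatch
# report to the root user (the path and recipient are assumptions).
/usr/sbin/logwatch --print | mail -s "logwatch report for $(hostname)" root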
Managing Solaris Log Files
The Solaris OS logs can be found under the /var directory. Table B-2 documents the names and purpose of commonly used log files in a Solaris environment.
Table B-2. Typical Solaris Log Files

| Log File Name | Purpose |
|---|---|
| /var/adm/messages | General-purpose, catch-all file for system messages |
| /var/adm/sulog | Records each attempt to use the su command |
| /var/cron/log | Contains entries for cron jobs running on the server |
| /var/log/syslog | Logging output from various system utilities (e.g., mail) |
Viewing System Message Log Files
The syslogd daemon automatically records various system errors, warnings, and faults in message log files. You can use the dmesg command to view the most recently generated system-level messages. For example, run the following as the root user:
# dmesg
Here’s some sample output:
Apr  1 12:27:56 sb-gate su: [ID 810491 auth.crit] 'su root' failed for mt...
Apr  2 11:14:09 sb-gate sshd[15969]: [ID 800047 auth.crit] monitor fatal: protocol error...
The /var/adm directory contains several log directories and files. The most recent system log entries are in the /var/adm/messages file. Periodically (typically every 10 days), the contents of the messages file are rotated and renamed to messages.N. For example, you should see a messages.0, messages.1, messages.2, and messages.3 file (older files are deleted). Use the following command to view the current messages file:
# more /var/adm/messages
If you want to view all logged messages, enter the following command:
# more /var/adm/messages*
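Because the rotated files share the messages prefix, you can also search the current and archived logs in one pass; for example (the search pattern shown is only an example):

# grep -i error /var/adm/messages*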
Rotating Solaris Log Files
You can rotate logs in a Solaris environment via the logadm utility, which is a very flexible and powerful tool that you can use to manage your log files. The logadm utility is called from the root user’s cron table. Here’s an example:
10 3 * * * /usr/sbin/logadm
This code shows that the logadm utility is called once per day at 3:10 a.m. The logadm utility rotates files based on information in the /etc/logadm.conf file. Although you can modify this file manually, the recommended approach is to update it via the logadm utility itself.
A short example will help illustrate how to add an entry. This next line of code instructs the logadm utility to add an entry with the -w switch:
# logadm -w /orahome/logs/mylog.log -C 8 -c -p 1d -t '/orahome/logs/mylog.log.$n' -z 1
Now if you inspect the contents of the /etc/logadm.conf file, the prior line has been added to the file:
/orahome/logs/mylog.log -C 8 -c -p 1d -t '/orahome/logs/mylog.log.$n' -z 1
The preceding line of code instructs logadm to rotate the /orahome/logs/mylog.log file. The -C 8 switch specifies that it should keep eight old versions before deleting the oldest file. The -c switch instructs the file to be copied and truncated (and not moved). The -p 1d switch specifies that the log file should be rotated on a daily basis. The -t switch provides a template for the rotated log file name. The -z 1 switch specifies that the number 1 rotated log should be compressed.
You can validate your entry by running logadm with the -V switch. Here’s an example:
# logadm -V
You can also force an immediate execution of the entry via the -p now switch:
# logadm -p now /orahome/logs/mylog.log
After running the preceding command, you should see that your log has been rotated:
# cd /orahome/logs
# ls -altr
-rw-r--r--   1 root     root           0 Apr  5 16:40 mylog.log.0
-rw-r--r--   1 root     root           0 Apr  5 16:40 mylog.log
To remove an entry from the /etc/logadm.conf file, use the -r switch. Here’s an example:
# logadm -r /orahome/logs/mylog.log
Summary
Server log files are often the first places to look
when you experience performance and security issues. These files contain
messages that help diagnose and troubleshoot problems. Because log
files tend to grow very fast, it is important to understand how to
rotate the logs, which ensures that they are archived, compressed, and
deleted at regular intervals.
On Linux systems, use the logrotate utility to rotate log files; on Solaris servers, use the logadm utility.