Monday, 27 November 2017

RAID Configuration

There is significant confusion in many organizations using ORACLE and RAID technologies. I will attempt to make this information very clear and understandable.
What Is RAID?
RAID is a technology for expanding the capacity of the I/O system and providing data redundancy. It stands for Redundant Array of Inexpensive Disks or Redundant Array of Independent Disks.
Conceptually, RAID is the use of 2 or more physical disks to create 1 logical disk, where the physical disks operate in tandem to provide greater size and more bandwidth.  RAID has become an indispensable part of the I/O fabric of any system today and is the foundation for storage technologies supported by many mass storage vendors. The use of RAID technology has re-defined the design methods used for building storage systems that support Oracle databases.

The 3 Main Concepts In RAID
When you talk about RAID, there are 3 terms that are important and relevant.
•    Striping
•    Mirroring
•    Parity.

1- What Is Striping? 
Striping is the process of breaking down data into pieces and distributing it across multiple disks that support a logical volume – “Divide, Conquer & Rule”.  This often results in a “logical volume” that is larger and has greater I/O bandwidth than a single disk. It is purely based on the linear power of incrementally adding disks to a volume to increase the size and I/O bandwidth of the logical volume.  The increase in bandwidth is a result of how read/write operations are done on a striped volume.
Imagine that you are in a grocery store. With you are about two hundred of your closest friends and neighbors all shopping for the week’s groceries. Now consider what it’s like when you get to the checkout area and find that only one checkout line is open. That poor clerk can only deal with a limited number of customers per hour. The line grows progressively longer.  The same is true of your I/O sub-system. A given disk can process a specific number of I/O operations per second. Anything more than that and the requests start to queue up. Now stop and think about how great it feels when you get to the front of the store and find that all 20 lines are open. You find your way to the shortest line and you’re headed out the door in no time.
Striping has a similar effect on your I/O system. By creating a single volume from pieces of data on several disks, we can increase the capacity to handle I/O requests in a linear fashion, by combining each disk’s I/O bandwidth. Now, when multiple I/O requests for a file on a striped volume are processed, they can be serviced by multiple drives in the volume, as the requests are sub-divided across several disks. This way all drives in the striped volume can engage and service multiple I/O requests in a more efficient manner.  This “cohesive and independent” functioning of all the drives in a logical volume is relevant for both read and write operations.  It must be noted that striping by itself does not reduce “response time” for servicing I/O requests.  However, it does provide predictable response times and facilitates better performance by balancing I/O requests across multiple drives in the striped volume.
Figure 1 depicts a 4-way striped volume (v1) with 4 disks (1-4).  A given stripe of data (Data1) in a file on v1 will be split/striped across the 4 disks, into 4 pieces (Data11-Data14).
Disk1     Disk2     Disk3     Disk4
Data11    Data12    Data13    Data14
Data21    Data22    Data23    Data24
Figure 1

2-What Is Mirroring?
Mirroring is the process of writing the same data to another “member” of the same volume simultaneously. Mirroring provides protection for data by writing exactly the same information to every member in the volume. Additionally, mirroring can provide enhanced read operations because read requests can be serviced from either “member” of the volume. If you have ever made a photocopy of a document before mailing the original, then you have mirrored data.  One of the common myths about mirroring is that it takes “twice as long” to write.  But in many performance measurements and benchmarks, the overhead has been observed to be around 15-20%.
Figure 2 illustrates a 4-way striped mirrored volume (v1) with 8 Disks (1-8).  A given stripe of data (Data1) in a file on v1 will be split/striped across the Disks (1-4) and then mirrored across Disks (5-8).  Disks (1-4) and (5-8) are called “Mirror Members” of the volume v1.
Disk1     Disk2     Disk3     Disk4     Disk5     Disk6     Disk7     Disk8
Data11    Data12    Data13    Data14    Data11    Data12    Data13    Data14
Data21    Data22    Data23    Data24    Data21    Data22    Data23    Data24
Figure 2

3- What Is Parity?
Parity is the term for error checking. Some levels of RAID perform calculations when reading and writing data.  The calculations are primarily done on write operations.  However, if one or more disks in a volume are unavailable, then, depending on the level of RAID, even read operations require parity operations to rebuild the pieces on the failed disks.  Parity is used to determine the write location and validity of each stripe that is written in a striped volume.  Parity is implemented on those levels of RAID that do not support mirroring.
Parity algorithms contain Error Correction Code (ECC) capabilities, which calculate parity for a given ‘stripe or chunk’ of data within a RAID volume. The size of a chunk is operating system (O/S) and hardware specific.  The codes generated by the parity algorithm are used to recreate data in the event of disk failure(s).  Because the algorithm can reverse this parity calculation, it can rebuild data lost as a result of disk failures.  It’s just like solving a math problem where you know the answer (checksum) and one part of the question, e.g. 2 + X = 5, what is X?  Of course, X = 3.
Figure 3 depicts a 4-way striped RAID 3 volume with parity – v1 with 5 Disks (1-5).  A given stripe of data (Data1) in a file on v1 will be split/striped across the Disks (1-4) and the parity for Data1 will be stored on Disk 5.  There are other levels of RAID that store parity differently and those will be covered in the following sections.
Disk1     Disk2     Disk3     Disk4     Disk5
Data11    Data12    Data13    Data14    Parity1
Data21    Data22    Data23    Data24    Parity2
Figure 3

Putting It All Together
Striping yields better I/O performance, mirroring provides protection and parity (when applicable) is a way to check the work. With these 3 aspects of RAID, we can achieve scalable, protected, highly available I/O performance.


Types of RAID available
Vendors typically offer the following choices for RAID configurations:
RAID0
Disk striping. RAID0 requires at least two physical disks. Data is read and written across multiple drives, so disk I/O is relatively evenly spread. Writes can occur in a block or streaming manner (similar to non-RAIDed disks) as requested by the operating system. Disk failure results in lost data. Compared to a single disk drive, RAID0 has the following attributes:
- Better read performance
- Better write performance
- Inexpensive in cost
- Not fault-tolerant
- Storage equivalent to sum of physical drive storage in the array
- Readily available from most vendors
RAID1 Shadowing/Mirroring
Disk mirroring. RAID1 requires two physical disks. Logical writes are done by physically writing the data to both disks simultaneously, and can typically be done in a block manner or streaming manner, as requested by the operating system. Reads can be done using either disk. In the event of a disk failure, data can still be retrieved and written to the surviving disk. Compared to a single disk drive, RAID1 has the following attributes:
- Better read performance
- Similar write performance
- Expensive
- Fault-tolerant
- Storage equivalent to 1/2 the sum of the physical drive storage in the mirrored set.
- Readily available from most vendors
RAID5 Striping with Rotating Parity
Disk striping with parity. RAID5 requires at least three physical disks. On a logical write, a block of data is physically written to disk, parity information is calculated using the block just written plus blocks already existing on disk, then the parity information is written to disk. In RAID5, the parity information is rotated among the physical disks to prevent bottlenecks caused by a dedicated parity disk. Note that writes occur in a block manner regardless of whether the O/S is sending a stream of data to be written or requests to write whole blocks. On a logical read, data is read from multiple disks in a manner very similar to RAID0. In the event of a disk failure, data can be reconstructed on the fly using the parity information. Compared to a single disk drive, RAID5 has the following attributes:
- Data is striped across multiple physical disks and parity data is striped across storage equivalent to one disk.
- Better read performance
- Poorer write performance
- Inexpensive
- Fault-tolerant
- Storage equivalent to N - 1 physical drives, where N is the number of physical drives in the array
- Readily available from most vendors
RAID10 (or RAID0+1)
Mirrored stripe sets. RAID10 requires at least 4 physical drives, and combines the performance gains of RAID0 with the fault-tolerance and expense of RAID1. Data is written simultaneously to two mirrored sets of striped disks in blocks or streams. Reads can be performed against either striped set. In the event of a failure of a disk drive in one striped set, data can be written to and read from the surviving striped set. Compared to a single disk drive, RAID10 has the following attributes:
- Better read performance
- Better write performance
- Expensive
- Fault-tolerant
- Storage is 1/2 of the sum of the physical drives' storage
- Currently available from only a few vendors (at the time of this writing)

Possible configurations using 4 physical disks:

Configuration   Number of disks   Available space   Max Reads/Sec      Max Writes/Sec
Single disk     1                 4 GB              60                 60
RAID0           4                 16 GB             240                240
RAID1           4                 8 GB              240 (2 arrays)     120 (2 arrays)
RAID5           4                 12 GB             180                60
Possible configurations using 6 physical disks:

Configuration   Number of disks   Available space   Max Reads/Sec      Max Writes/Sec
Single disk     1                 4 GB              60                 60
RAID0           6                 24 GB             360                360
RAID1           6                 12 GB             360 (3 arrays)     180 (3 arrays)
RAID5           6                 20 GB             300                90
 
As can be seen from the charts, RAID0 offers good read and write performance, but no fault tolerance.
RAID1 offers good read performance, and half as much write performance, but provides fault-tolerance.
RAID5 reclaims most of the space lost to RAID1, provides fault-tolerance, offers reasonably good read performance, but poor write performance. (In fact, RAID5 requires 4 disks to regain the same write performance as a single disk). Also, note that streaming logical writes, as well as block-level logical writes, to RAID5 arrays are handled as block-level physical writes. Finally, read or write workload capacity can be increased in any RAID configuration by adding physical disks.
The RAID "hierarchy" begins with RAID 0 (striping) and RAID 1 (mirroring).  Combining RAID 0 and RAID 1 is called  RAID-0+1 or RAID-1+0, depending on how you combine them.  (RAID 0+1 is also called RAID-01, and RAID-1+0 is also called RAID-10.)  The performance of RAID-10 and RAID-01 are identical, but they have different levels of data integrity.
RAID-01 (or RAID 0+1) is a mirrored pair (RAID-1) made from two stripe sets (RAID-0); hence the name RAID 0+1, because it is created by first creating two RAID-0 sets and adding RAID-1.  If you lose a drive on one side of a RAID-01 array, then lose another drive on the other side of that array before the first side is recovered, you will suffer complete data loss.  It is also important to note that all drives in the surviving mirror are involved in rebuilding the entire damaged stripe set, even if only a single drive was damaged.  Performance is severely degraded during recovery unless the RAID subsystem allows adjusting the priority of recovery. However, shifting the priority toward production will lengthen recovery time and increase the risk of the kind of catastrophic data loss mentioned earlier.

RAID-10 (or RAID 1+0) is a stripe set made up from N mirrored pairs.  Only the loss of both drives in the same mirrored pair can result in any data loss, and the loss of that particular drive is 1/Nth as likely as the loss of some drive on the opposite mirror in RAID-01.  Recovery only involves the replacement drive and its mirror, so the rest of the array performs at 100% capacity during recovery.  Also, since only the single drive needs recovery, bandwidth requirements during recovery are lower and recovery takes far less time, reducing the risk of catastrophic data loss.
The most appropriate RAID configuration for a specific filesystem or database tablespace must be determined based on data access patterns and cost versus performance tradeoffs.  RAID-0 offers no increased reliability.  It can, however, supply performance acceleration at no increased storage cost.  RAID-1 provides the highest performance for redundant storage, because it does not require read-modify-write cycles to update data, and because multiple copies of data may be used to accelerate read-intensive applications.  Unfortunately, RAID-1 requires at least double the disk capacity of RAID-0.  On the other hand, if more than two copies of the data exist, RAID-1 arrays may be constructed to endure the loss of multiple disks without interruption.  Parity RAID allows redundancy with less total storage cost.  The read-modify-write cycle it requires, however, will reduce total throughput for any small write operations (read-only or extremely read-intensive applications are fine). The loss of a single disk will cause read performance to be degraded while the system reads all other disks in the array and recomputes the missing data. Additionally, parity RAID cannot survive the loss of multiple disks and cannot be made more redundant.

ORACLE database files on RAID
Given the information regarding the advantages and disadvantages of various RAID configurations, how does this information apply to an ORACLE instance? The discussion below will provide information about how database files are used by an ORACLE instance under OLTP and DSS classifications of workload.
Note that the perspectives presented below are very sensitive to the number of users: if your organization has a 10-20 user OLTP system (and thus, a low throughput requirement), then you may get very acceptable performance with all database files stored on RAID5 arrays. On the other hand, if your organization has a 100 user OLTP system (resulting in a higher throughput requirement), then a different RAID configuration may be absolutely necessary. An initial configuration can be outlined by estimating the number of transactions (based on the number of users), performing adjustments to encompass additional activity (such as hot backups, nightly batch jobs, etc.), then performing the necessary mathematical calculations.
You definitely want to keep rollback segments, temp tablespaces and redo logs off RAID5, since it is too slow for these write-intensive, sequentially accessed Oracle files. Redo logs should have their *own* dedicated drives.
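As a quick sanity check, you can list where these files currently live from the data dictionary. A minimal sketch using the standard views (the undo filter assumes your undo tablespaces are named UNDO%):

-- current placement of redo logs, temp files and undo datafiles
SELECT member FROM v$logfile;
SELECT file_name FROM dba_temp_files;
SELECT file_name FROM dba_data_files WHERE tablespace_name LIKE 'UNDO%';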

OLTP (On-line transaction processing) workloads
Characterized by multi-user concurrent INSERTS, UPDATES, and DELETES during normal working hours, plus possibly some mixture of batch jobs nightly. Large SELECTS may generate reports, but the reports will typically be "pre-defined" reports rather than ad-hoc queries. The focus, though, is on enabling update activity that completes within an acceptable response time. Ideally, each type of database file would be spread out over its own private disk subsystem, although grouping certain types of files together (when the number of disks, arrays, and controllers is less than ideal) may yield adequate performance. (Please see the article on Instance tuning for information regarding groupings of database files in an OLTP system.)
Redo logs.
During update activity, redo logs are written to in a continuous and sequential manner, and are not read under normal circumstances. RAID5 would be the worst choice for performance. Oracle Corporation recommends placing redo logs on single non-RAIDed disk drives, under the assumption that this configuration provides the best overall performance for simple sequential writes. Redo logs should always be multiplexed at the ORACLE software level, so RAID1 provides few additional benefits. Since non-RAID and RAID0 configurations can vary with hardware from different vendors, the organization should contact their hardware vendor to determine whether non-RAIDed disks or RAID0 arrays will yield the best performance for continuous sequential writes. Note that even if redo logs are placed on RAID1 arrays, they should still be multiplexed at the ORACLE level.
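For reference, multiplexing redo logs at the ORACLE level simply means adding a second member to each group on a separate drive or array. A minimal sketch (the paths and group numbers below are hypothetical placeholders):

-- add a second member to each redo log group on a different disk/array
ALTER DATABASE ADD LOGFILE MEMBER '/u03/oradata/prod/redo01b.log' TO GROUP 1;
ALTER DATABASE ADD LOGFILE MEMBER '/u03/oradata/prod/redo02b.log' TO GROUP 2;
ALTER DATABASE ADD LOGFILE MEMBER '/u03/oradata/prod/redo03b.log' TO GROUP 3;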
Archive logs
As redo logs are filled, archive logs are written to disk one whole file at a time (assuming, of course, that the database is running in archivelog mode), and are not read under normal circumstances. Any RAID or non-RAID configuration could be used, depending upon the performance requirements and size of the redo logs. For instance, if the redo logs are large, then they will become full and be archived less often. If an archive log is likely to be written no more than once per minute, then RAID5 may provide acceptable performance. If RAID5 proves too slow, then a different RAID configuration can be chosen, or the redo logs can simply be made larger. Note that a fault-tolerant configuration is advisable: if the archive log destination becomes unavailable, the database will halt.
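To see how often archive logs are actually being produced, the log switch history can be summarized per hour. A rough sketch against V$LOG_HISTORY:

-- number of log switches (and therefore archived logs) per hour
SELECT TO_CHAR(first_time, 'YYYY-MM-DD HH24') AS hour,
       COUNT(*) AS log_switches
  FROM v$log_history
 GROUP BY TO_CHAR(first_time, 'YYYY-MM-DD HH24')
 ORDER BY 1;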
Rollback Segments
As modifications are made to the database tables, undo information is written to the buffer cache in memory. These rollback segments are used to maintain commitment control and read consistency. Rollback segment data is periodically flushed to disk by checkpoints. Consequently, the changes to the rollback segments are also recorded in the redo logs. However, a smaller amount of information is typically written to the rollback segments than to the redo logs, so the write rate is less stringent. A fault-tolerant configuration is advisable, since the database cannot operate without rollback segments, and recovery of common rollback segments will typically require an instance shutdown. If the transaction rate is reasonably small, RAID5 may provide adequate performance. If it does not, then RAID1 (or RAID10) should be considered.
User tables and indexes
As updates are performed, these changes are stored in memory. Periodically, a checkpoint will flush the changes to disk. Checkpoints occur under two normal circumstances: a redo log switch occurred, or the time interval for a checkpoint expired. (There are a variety of other situations that trigger a checkpoint. Please check the ORACLE documentation for more detail.) Like redo log switches and generation of archive logs, checkpoints can normally be configured so that they occur approximately once per minute. Recovery can be performed up to the most recent checkpoint, so the interval should not be too large for an OLTP system. If the volume of updated data written to disk at each checkpoint is reasonably small (i.e. the transaction rate is not extremely large), then RAID5 may provide acceptable performance. Additionally, analysis should be performed to determine the ratio of reads to writes. Recalling that RAID5 offers reasonably good read performance, if the percentage of reads is much larger than the percentage of writes (for instance, 80% to 20%), then RAID5 may offer acceptable performance for small, medium, and even some large installations. A fault-tolerant configuration is preferable to maximize availability (assuming availability is an objective of the organization), although only failures damaging datafiles for the SYSTEM tablespace (and active rollback segments) require the instance to be shut down. Disk failures damaging datafiles for non-SYSTEM tablespaces can be recovered with the instance on-line, meaning that only the applications using data in tablespaces impacted by the failure will be unavailable. With this in mind, RAID0 could be considered if RAID5 does not provide the necessary performance. If high availability and high performance on a medium to large system are explicit requirements, then RAID1 or RAID10 should be considered.
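One simple (if approximate) way to gauge the read/write ratio for the datafiles is to compare the cumulative physical I/O counters gathered since instance startup, for example:

-- overall physical read vs write ratio for datafiles since instance startup
SELECT SUM(phyrds)  AS physical_reads,
       SUM(phywrts) AS physical_writes,
       ROUND(100 * SUM(phyrds) / (SUM(phyrds) + SUM(phywrts)), 1) AS pct_reads
  FROM v$filestat;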
Temp segments
Sorts too large to be performed in memory are performed on disk. Sort data is written to disk in a block-oriented manner. Sorts do not normally occur with INSERT/UPDATE/DELETE activity. Rather, SELECTS with ORDER BY or GROUP BY clauses and aggregate functions (i.e. operational reports), index rebuilds, etc., will use TEMP segments only if the sort is too large to perform in memory. Temp segments are good candidates for non-RAIDed drives or RAID0 arrays. Fault-tolerance is not critical: if a drive failure occurs and datafiles for a temp segment are lost, then the temp segment can either be recovered in the normal manner (restore from tape and perform a tablespace recovery), or the temp segment can simply be dropped and re-created since there is no permanent data stored in the temp segment. Note that while a temp segment is unavailable, certain reports or index creations may not execute without errors, but update activity will typically not be impacted. With this in mind, RAID1 arrays are a bit unnecessary for temp segments, and should be used for more critical database files. RAID5 will provide adequate performance if the sort area hit ratios are such that very few sorts are performed on disk rather than in memory.
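The ratio of in-memory to on-disk sorts can be checked from V$SYSSTAT; a minimal sketch (the counters are cumulative since instance startup):

-- how many sorts spilled to disk versus completed in memory
SELECT name, value
  FROM v$sysstat
 WHERE name IN ('sorts (memory)', 'sorts (disk)');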
Control files
Control files are critical to the instance operation, as they contain the structural information for the database. Control files are updated periodically (at a checkpoint and at structural changes), but the data written to the control files is a very small quantity compared to other database files. Control files, like redo logs, should be multiplexed at the ORACLE software level onto different drives or arrays. Non-RAIDed drives or any RAID configuration would be acceptable for control files, although most organizations will typically distribute the multiple copies of the control files with the other database files, given that the read and write requirements are so minimal. For control files, maintaining multiple copies in different locations should be favored over any other concern.
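Multiplexing control files on an spfile-based instance is a parameter change plus a restart (the additional copies must also be physically created with RMAN or an OS copy). A sketch with hypothetical locations:

-- point the instance at multiple control file copies; takes effect at the next restart
ALTER SYSTEM SET control_files =
   '/u02/oradata/prod/control01.ctl',
   '/u04/oradata/prod/control02.ctl'
   SCOPE = SPFILE;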
Software and static files
The ORACLE software, configuration files, etc. are very good candidates for RAID5 arrays. This information is not constantly updated, so the RAID5 write penalty is of little concern. Fault-tolerance is advisable: if the database software (or O/S software) becomes unavailable due to a disk failure, then the database instance will abort. Also, recovery will include restore or re-installation of ORACLE software (and possible operating system software) as well as restore and recovery of the database files. RAID5 provides the necessary fault-tolerance to prevent this all-inclusive recovery, and good read performance for dynamic loading and unloading of executable components at the operating system level.

DSS (Decision Support System) workloads
In comparison to OLTP systems, DSS or data warehousing systems are characterized by primarily SELECT activity during normal working hours, and batch INSERT, UPDATE, and DELETE activity run on a periodic basis (nightly, weekly, or monthly). There will typically be a large amount of variability in the number of rows accessed by any particular SELECT, and the queries will tend to be of a more ad-hoc nature. The number of users will typically be smaller than on the adjoining OLTP systems (where the data originates). The focus is on enabling SELECT activity that completes within an acceptable response time, while ensuring that the batch update activity still has capacity to complete in its allowable time window. Note that there are two areas of performance to be concerned with: periodic refreshes and ad-hoc read activity. The general directive in this case should be to configure the database such that the read-only activity performed by end users is as good as it can get without rendering refreshes incapable of completion. As with OLTP systems, each type of database file would ideally have its own private disk subsystem (disks, arrays, and controller channel), but with less than ideal resources certain groupings tend to work well for DSS systems. (Please see the article on Instance tuning for information on these groupings.)
Redo logs
Redo logs are only written to while update activity is occurring. In a DSS-oriented system, a significant portion of data entered interactively during the day may be loaded into the DSS database during only a few hours. Given this characteristic, redo logging may tend to be more of a bottleneck on the periodic refresh processes of a DSS database than on its adjoining OLTP systems. If nightly loads are taking longer than their allotted window, then redo logging should be the first place to look. The same RAID/non-RAID suggestions that apply to redo logging in OLTP also apply to DSS systems. As with OLTP systems, redo logs should always be mirrored at the ORACLE software level, even if they are stored on fault-tolerant disk arrays.
Archive logs
Like redo logging, archive logs are only written out during update activity. If the archive log destination appears to be over-loaded with I/O requests, then consider changing the RAID configuration, or simply increase the size of the redo logs. Since there is a large volume of data being entered in a short period of time, it may be very reasonable to make the redo logs for the DSS or data warehouse much larger (10 or more times) than the redo logs used by the OLTP system. A reasonable rule of thumb is to target about one log switch per hour. With this objective met, then the disk configuration and fault-tolerance can be chosen based on the same rules used for OLTP systems.
Rollback Segments
Again like redo logging, rollback segments will be highly utilized during the periodic refreshes, and virtually unused during the normal work hours. Use the same logic for determining RAID or non-RAID configurations on the DSS database that would be used for the OLTP systems.
User tables and indexes
Writes are done to tablespaces containing data and indexes during periodic refreshes, but during the normal work hours read activity on the tables and indexes will typically far exceed the update work performed on a refresh. A fault-tolerant RAID configuration is suggested to sustain availability. However, in most cases the business can still operate if the DSS system is unavailable for several hours due to a disk failure. Information for strategic decisions may not be available, but orders can still be entered. If the DSS has high availability requirements, select a fault-tolerant disk configuration. If RAID5 arrays can sustain the periodic refresh updates, then RAID5 is typically a reasonably good choice due to its good read performance. As seen above, the read and write workload capacities can be adjusted by adding physical drives to the array.
Temp segments
In a decision support system or data warehouse, expect temp segment usage to be much greater than what would be found in a transaction system. Recalling that temp segments do not store any permanent data and are not absolutely necessary for recovery, RAID0 may be a good choice. Keep in mind, though, that the loss of a large temp segment due to drive failure may render the DSS unusable (unable to perform sorts to answer large queries) until the failed drives are replaced. If availability requirements are high, then a fault-tolerant solution should be selected, or at least considered. If the percentage of sorts on disk is low, then RAID5 may offer acceptable performance; if this percentage is high, RAID1 or RAID10 may be required.
Control files
As with OLTP systems, control files should always be mirrored at the ORACLE software level regardless of any fault-tolerant disk configurations. Since reads and writes to these files are minimal, any disk configuration should be acceptable. Most organizations will typically disperse control files onto different disk arrays and controller cards, along with other database files.
Software and static files
Like OLTP systems, these files should be placed on fault-tolerant disk configurations. Since very little write activity is present, these are again good candidates for RAID5.
Taking the above information into consideration, can an organization run an entire ORACLE database instance on a single RAID5 array? The answer is "yes". Will the organization get a good level of fault-tolerance? Again, the answer is "yes". Will the organization get acceptable performance? The answer is "it depends". This dependency includes the type of workload, the number of users, the throughput requirements, and a whole host of other variables. If the organization has an extremely limited budget, then it can always start with a single RAID5 array, perform the necessary analysis to see where improvement is needed, and proceed to correct the deficiencies.
 
Summary

RAID 1 is mirroring - blocks are written simultaneously to both members of the raid set so if one member fails, the other still has a full set of data.

RAID 5 is striping with distributed parity - data blocks are written in a stripe across all except one of the disks and a parity block is written to the last disk. Each data write is written to a different set of n-1 disks so that the data and parity are scattered equally amongst the drives.

RAID 1 gives a marginal read speed increase with no real write overhead. RAID 5 gives quite a high speed increase for reads but incurs a high overhead for write operations (normally the RAID5 controller will have a write-back cache attached to alleviate this).

For high write volumes R5 is far from ideal - it is much better suited to low writes but high reads. Don't put actively written files on an R5 set - especially things like redo logs!

To really tune the I/Os, you need to know the chunk size of your RAID controller and tweak db_file_multiblock_read_count appropriately.
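For example, if the controller chunk size were 1 MB and the database block size 8 KB, setting the parameter to 128 would let one multiblock read line up with one stripe unit. A hedged sketch (the values are illustrative, not a recommendation):

SQL> SHOW PARAMETER db_block_size
SQL> SHOW PARAMETER db_file_multiblock_read_count

-- 1 MB chunk / 8 KB block = 128 blocks per multiblock read (illustrative only)
SQL> ALTER SYSTEM SET db_file_multiblock_read_count = 128;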

Distribution of Oracle Files
The following table shows what Oracle suggests for RAID usage:
RAID   Type of RAID                    Control File   Database File                    Redo Log File   Archive Log File
0      Striping                        Avoid          OK                               Avoid           Avoid
1      Shadowing                       Best           OK                               Best            Best
0+1    Striping and Shadowing          OK             Best                             Avoid           Avoid
3      Striping with static parity     OK             OK                               Avoid           Avoid
5      Striping with rotating parity   OK             Best if RAID 0+1 not available   Avoid           Avoid

Create AWR and ADDM Reports

First, let's check how we can generate AWR reports. To be able to get AWR reports in plain text, we can use:

Syntax (for Oracle 10.2)

DBMS_WORKLOAD_REPOSITORY.AWR_REPORT_TEXT(
   l_dbid IN NUMBER,
   l_inst_num IN NUMBER,
   l_bid IN NUMBER,
   l_eid IN NUMBER,
   l_options IN NUMBER DEFAULT 0)
RETURN awrrpt_text_type_table PIPELINED;

If we want the report in HTML, we can use:

DBMS_WORKLOAD_REPOSITORY.AWR_REPORT_HTML(
   l_dbid IN NUMBER,
   l_inst_num IN NUMBER,
   l_bid IN NUMBER,
   l_eid IN NUMBER,
   l_options IN NUMBER DEFAULT 0)
RETURN awrrpt_html_type_table PIPELINED;

As you see, both functions accept the same arguments; the only difference is the format of the report. The arguments of these functions are:
l_dbid: The database identifier
l_inst_num: The instance number
l_bid: The 'Begin Snapshot' ID
l_eid: The 'End Snapshot' ID
l_options: A flag to control the output of the report. The default value is 0; if we set it to 8, the report will include the ADDM-specific sections, including the Buffer Pool Advice, Shared Pool Advice, and PGA Target Advice.
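Because both functions are pipelined, they can also be run ad hoc from SQL*Plus. A minimal sketch (the dbid and snapshot IDs below are placeholders to be replaced with your own values):

SELECT output
  FROM TABLE(DBMS_WORKLOAD_REPOSITORY.AWR_REPORT_TEXT(
                1234567890,   -- dbid (placeholder: SELECT dbid FROM v$database)
                1,            -- instance number
                100,          -- 'Begin Snapshot' ID (placeholder)
                101));        -- 'End Snapshot' ID (placeholder)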


One of the important points is that we need to determine the database ID, instance number and snapshot IDs dynamically to be able to define this as a job in OEM Grid Control. I query DBA_HIST_SNAPSHOT to find the beginning and ending snapshot IDs according to the variables starttime and endtime, and I read the database ID and instance number from GV$DATABASE.

Here's the simple PL/SQL block to create AWR report in HTML and mail it:
DECLARE
   dbid           NUMBER;
   inst_id        NUMBER;
   bid            NUMBER;
   eid            NUMBER;
   db_unique_name VARCHAR2(30);  
   host_name      VARCHAR2(64);
   starttime      CHAR(5);
   endtime        CHAR(5);
   v_from         VARCHAR2(80);
   v_recipient    VARCHAR2(80) := 'someguy@myemailaddress.com';
   v_mail_host    VARCHAR2(30) := 'somesmtpserver';
   v_mail_conn    UTL_SMTP.connection;
BEGIN
   starttime:= '06:00';
   endtime:= '10:00';
 
   SELECT MIN(snap_id), MAX(snap_id) INTO bid, eid
     FROM dba_hist_snapshot
    WHERE TO_CHAR(begin_interval_time, 'hh24:mi') >= starttime
      AND TO_CHAR(end_interval_time, 'hh24:mi') <= endtime
      AND TRUNC(begin_interval_time) = TRUNC (SYSDATE)
      AND TRUNC(end_interval_time) = TRUNC (SYSDATE);
 
   SELECT dbid, inst_id, db_unique_name INTO dbid, inst_id, db_unique_name
     FROM gv$database;
 
   SELECT host_name INTO host_name
     FROM v$instance;
 
   v_from := db_unique_name ||  '@' || host_name;
 
   v_mail_conn := UTL_SMTP.OPEN_CONNECTION(v_mail_host, 25);
   UTL_SMTP.HELO(v_mail_conn, v_mail_host);
   UTL_SMTP.MAIL(v_mail_conn, v_from);
   UTL_SMTP.RCPT(v_mail_conn, v_recipient);
   UTL_SMTP.OPEN_DATA(v_mail_conn );
   UTL_SMTP.WRITE_DATA( v_mail_conn, 'From:'|| v_from || UTL_TCP.CRLF );
   UTL_SMTP.WRITE_DATA( v_mail_conn, 'To:' || v_recipient || UTL_TCP.CRLF );
   UTL_SMTP.WRITE_DATA( v_mail_conn, 'Subject:' ||'AWR Report of ' || v_from || ' '  || SYSDATE || ' ' || starttime || '-' || endtime || UTL_TCP.CRLF  );
   UTL_SMTP.WRITE_DATA( v_mail_conn, 'Content-Type: text/html; charset=utf8' || UTL_TCP.CRLF || UTL_TCP.CRLF);    
  
   FOR c1_rec IN (SELECT output FROM TABLE (DBMS_WORKLOAD_REPOSITORY.AWR_REPORT_HTML(dbid,inst_id,bid, eid, 8 )))
   LOOP
      UTL_SMTP.WRITE_DATA(v_mail_conn, c1_rec.output || UTL_TCP.CRLF );
   END LOOP;
    
   UTL_SMTP.CLOSE_DATA(v_mail_conn); 
   UTL_SMTP.QUIT(v_mail_conn);
    
EXCEPTION
   WHEN UTL_SMTP.TRANSIENT_ERROR OR UTL_SMTP.PERMANENT_ERROR THEN
      RAISE_APPLICATION_ERROR(-20000, 'Unable to send mail: ' || SQLERRM);
END;
/

To be able to create ADDM Reports, we can use DBMS_ADVISOR package:
  DBMS_ADVISOR.CREATE_TASK('ADDM',tid,tname,'Name of the ADDM Report');
  DBMS_ADVISOR.SET_TASK_PARAMETER(tname,'START_SNAPSHOT',bid);
  DBMS_ADVISOR.SET_TASK_PARAMETER(tname,'END_SNAPSHOT',eid);
  DBMS_ADVISOR.EXECUTE_TASK(tname);


ADDM reports are created as jobs, so we wait until they're completed and get the report:
SELECT DBMS_ADVISOR.GET_TASK_REPORT(tname) FROM DUAL;


Here's the simple PL/SQL block to create ADDM report and mail it:
DECLARE
   dbid           NUMBER;
   inst_id        NUMBER;
   bid            NUMBER;
   eid            NUMBER;
   db_unique_name VARCHAR2(30);  
   host_name      VARCHAR2(64);
   status         VARCHAR2(11);
   starttime      CHAR(5);
   endtime        CHAR(5);
   output         VARCHAR2(32000);
   v_from         VARCHAR2(80);
   v_recipient    VARCHAR2(80) := 'someguy@myemailaddress.com'; 
   v_mail_host    VARCHAR2(30) := 'somesmtpserver'; 
   v_mail_conn    UTL_SMTP.connection;
   tname          VARCHAR2(50);
   tid  number;  
BEGIN
   starttime := '01:00';
   endtime := '12:00';
 
   SELECT MIN(snap_id),MAX(snap_id) INTO bid, eid
     FROM dba_hist_snapshot
     WHERE TO_CHAR(begin_interval_time,'hh24:mi') >= starttime
       AND TO_CHAR(end_interval_time, 'hh24:mi') <= endtime       
       AND TRUNC(begin_interval_time)= TRUNC (SYSDATE)
       AND TRUNC(end_interval_time) = TRUNC (SYSDATE);
 
   SELECT dbid, inst_id, db_unique_name INTO dbid, inst_id, db_unique_name
     FROM gv$database;
 
   SELECT host_name INTO host_name
      FROM v$instance;
      
   DBMS_ADVISOR.CREATE_TASK('ADDM',tid,tname,'ADDM Report( ' || bid || ',' || eid || ' )');
   DBMS_ADVISOR.SET_TASK_PARAMETER(tname,'START_SNAPSHOT',bid);
   DBMS_ADVISOR.SET_TASK_PARAMETER(tname,'END_SNAPSHOT',eid);
   DBMS_ADVISOR.EXECUTE_TASK(tname);    
     
   status := 0;
   while status <> 'COMPLETED' loop
      select status into status from dba_advisor_tasks where task_id = tid;
      dbms_lock.sleep(5);
   end loop;
 
   v_from := db_unique_name ||  '@' || host_name;
 
   v_mail_conn := UTL_SMTP.OPEN_CONNECTION (v_mail_host, 25);
   UTL_SMTP.HELO(v_mail_conn, v_mail_host);
   UTL_SMTP.MAIL(v_mail_conn, v_from);
   UTL_SMTP.RCPT(v_mail_conn, v_recipient);
   UTL_SMTP.OPEN_DATA(v_mail_conn);
   UTL_SMTP.WRITE_DATA(v_mail_conn, 'From:' || v_from || UTL_TCP.CRLF );
   UTL_SMTP.WRITE_DATA(v_mail_conn, 'To:' || v_recipient || UTL_TCP.CRLF );
   UTL_SMTP.WRITE_DATA(v_mail_conn, 'Subject: '  || 'ADDM Report of ' || v_from || ''    || SYSDATE || ' ' || starttime || '-' || endtime || UTL_TCP.CRLF || UTL_TCP.CRLF );
 
   SELECT DBMS_ADVISOR.GET_TASK_REPORT(tname) INTO output FROM DUAL;
   UTL_SMTP.WRITE_DATA(v_mail_conn, output);   
   UTL_SMTP.CLOSE_DATA(v_mail_conn); 
   UTL_SMTP.QUIT(v_mail_conn);
    
EXCEPTION
   WHEN UTL_SMTP.TRANSIENT_ERROR OR UTL_SMTP.PERMANENT_ERROR THEN
      RAISE_APPLICATION_ERROR(-20000, 'Unable to send mail: ' || SQLERRM);
END;
/

I have tested these scripts on both 10.2 and 11.2, and they worked fine. By the way, don't forget to set up an ACL if you'll run them on 11g or later:
BEGIN
  DBMS_NETWORK_ACL_ADMIN.create_acl(acl  => 'smtpserver.xml',
                                    description => 'Connection to SMTP',
                                    principal  => 'GOKHAN',    
                                    is_grant   => TRUE, 
                                    privilege  => 'connect',
                                    start_date => SYSTIMESTAMP,
                                    end_date  => NULL);
  COMMIT;
END;
/
 
BEGIN
  DBMS_NETWORK_ACL_ADMIN.assign_acl(acl => 'smtpserver.xml',  
                                    host => 'oursmtpserver', 
                                    lower_port  => 25, 
                                    upper_port => 25 );
  COMMIT;
END;
/

Now we can create the job in Grid Control. Log in to OEM Grid Control, click the Jobs tab, then choose SQL Script as the job type and click Go. Give the new job a name and description, then click Add to add the target databases. Click the "Parameters" tab, and copy-paste one of the above scripts into the textbox labeled "SQL Script". Set the credentials and schedule as you wish and click Submit when all done.

Saturday, 25 November 2017

Step_by_Step_RAC_on_Linux

http://www.pafumi.net/Step_by_Step_RAC_on_Linux3.pdf

New parameters in expdp Utility

ATTACH Attach the client session to existing data pump jobs
CONTENT Specify what to export (ALL, DATA_ONLY, METADATA_ONLY)
DIRECTORY Location to write the dump file and log file.
ESTIMATE Show how much disk space each table in the export job consumes.
ESTIMATE_ONLY Estimates the space, but does not perform the export
EXCLUDE List of objects to be excluded
INCLUDE List of objects to be included
JOB_NAME Name of the export job
KEEP_MASTER Specify Y not to drop the master table after export
NETWORK_LINK Specify a dblink to export from a remote database
NOLOGFILE Specify Y if you do not want to create a log file
PARALLEL Specify the maximum number of threads for the export job
VERSION DB objects that are incompatible with the specified version will not be exported.
ENCRYPTION_PASSWORD If a table column is encrypted and no password is specified, the data is written as clear text in the dump file set. We can define any string as the password for this parameter.
COMPRESSION Specifies whether to compress metadata before writing to the dump file set. Values are ALL, METADATA_ONLY (the default), DATA_ONLY and NONE.
SAMPLE - Allows you to specify a percentage of data to be sampled and unloaded from the source database. The sample_percent indicates the probability that a block of rows will be selected as part of the sample.
REMAP_DATA = Used to Mask (hide) data
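As an illustration of how a few of these parameters combine, the command below exports a 10% sample of one schema; the directory object, dump file name and schema are hypothetical placeholders:

expdp system/password DIRECTORY=DATA_PUMP_DIR DUMPFILE=scott_sample.dmp \
      LOGFILE=scott_sample.log SCHEMAS=scott SAMPLE=10 COMPRESSION=METADATA_ONLY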


Equivalent imp & impdp parameters
These below parameters are equivalent parameters between imp & impdp
imp Command       impdp Command
DATAFILES         TRANSPORT_DATAFILES
DESTROY           REUSE_DATAFILES
FEEDBACK          STATUS
FILE              DUMPFILE
FROMUSER          SCHEMAS, REMAP_SCHEMA
IGNORE            TABLE_EXISTS_ACTION (SKIP, APPEND, TRUNCATE, REPLACE)
INDEXFILE, SHOW   SQLFILE
LOG               LOGFILE
TOUSER            REMAP_SCHEMA

Single Instance Vs RAC Instance

As can be seen from the diagram below, an Oracle RAC database has more than one instance, whereas the database itself is single and shared across all the instances. Thus the focus of RAC is on the share-everything concept. This ensures availability of the Oracle database and also scalability. Each RAC instance is installed on a different machine (called a node), thus scaling the computing power and maximizing availability.

Oracle RAC

What is Clustering?
Clustering is the grouping of individual computers in such a way that they can act as a single computer system. These combined resources can then be presented to applications as a single system.  One benefit of a cluster is that the application does not need to be cluster “aware”, and requires no special coding to take advantage of most cluster related services.

Oracle clustering is a shared everything architecture that involves sharing of storage by the systems of the cluster.

Oracle RAC:
Oracle RAC stands for Real Application Clusters. It was introduced by Oracle with the 9i release; the latest version at the time of writing is 11gR2.
Oracle RAC is a cluster database with a shared cache architecture that overcomes the limitations of traditional shared-nothing and shared-disk approaches to provide highly scalable and available database solutions for all your business applications. Oracle RAC is a key component of Oracle’s private cloud architecture.

Tuesday, 23 May 2017

Configuration Document - SAN Storage Migration for Oracle 11g RAC database with ASM
SAN Storage Migration for Oracle 11g RAC database with ASM
Introduction
We have a requirement to migrate our storage from SOURCE to TARGET, as our organization is migrating all of its storage to TARGET.  Presently we have (NAME) storage and a RAC database.
We prepared and implemented the plan below to achieve this migration.
Solution
The following steps need to be executed to fulfill this requirement.
Cluster DB server Details
1: - Node 1 (5 GB RAM,60 GB HDD,1 Core) (OEL5.4)
2: - Node 2 (5 GB RAM,60 GB HDD,1 Core) (OEL5.4)
3: -  Storage (Software SAN) (Openfiler) (1 GB RAM,10+100 GB HDD,1 Core)
Before Migration 
·         Please note down the server configuration and also take the backup.
·         Note the server configuration and OS details.
·         Stop the application.
·         Backup the databases.
·         Make sure the corresponding hardware infrastructure (Eg: Host HBA, Storage Switches) is in place before migration.
·         Make sure that the storage has enough space to fit the legacy storage data.
·         Create volumes on the new storage whose number and sizes are equal to or greater than those on the legacy storage.
·         Make sure that the new volumes are discovered, multipath is configured in the target nodes.
·         New Raw devices are configured and ASM stamping is done for the disks wherever appropriate.
·         Appropriate permissions set for the newly discovered Volumes.
Oracle RAC Database Cluster includes the below critical components.
·         Votedisk : The voting disk is a file that manages information about node membership.
·         OCR: It is a file that manages cluster and Oracle RAC database configuration information.
·         Datafiles : Datafiles are physical files of the operating system that store the data of all logical structures in the database.
Storage Migration of Oracle RAC database involves the following steps.
·         Migrating Oracle Cluster Registry (OCR) to the new storage device
·         Migrating Cluster heartbeat (Voting disk) to the new storage device
·         Migrating Database to the new storage.
Backup procedure for OCR, Voting Disk and Data
Before migrating the data to a new Storage, it is always recommended to backup the critical components of Oracle RAC using the below mentioned steps.
OCR Backup
Below command will help you to check backups
[root@rac1 ~]# ocrconfig -showbackup
 
rac1     2016/07/14 12:18:09    
 /u01/app/11.2.0/grid/cdata/rac-cluster/backup00.ocr
 
rac1     2016/07/14 08:18:06    
 /u01/app/11.2.0/grid/cdata/rac-cluster/backup01.ocr
 
rac1     2016/07/14 04:18:05     
/u01/app/11.2.0/grid/cdata/rac-cluster/backup02.ocr
 
rac1     2016/07/13 02:56:40     
/u01/app/11.2.0/grid/cdata/rac-cluster/day.ocr
 
rac1     2016/07/13 02:56:40    
 /u01/app/11.2.0/grid/cdata/rac-cluster/week.ocr
PROT-25: Manual backups for the Oracle Cluster Registry are not available
[root@rac1 ~]# ocrconfig -manualbackup
 
rac1     2016/07/14 13:15:45     
/u01/app/11.2.0/grid/cdata/rac-cluster/backup_20160714_131545.ocr
[root@rac1 ~]#

Voting Disk Backup:
Run below command to check the configured voting devices for backup.
[root@rac1 ~]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   9366acf106e74f4bbfd6cf281ffc5b3d (ORCL:NOCR) [NOCR]
Located 1 voting disk(s).
[root@rac1 ~]#

Take a backup of any one of the above configured voting disks using the below command:
root@rac1 ~]# dd if=/dev/rdsk/c0t6000402002F4550C627487C700000000d0s6
 of=/dev/rdsk/c0t6000402002F4550C627487C700000000d0s6.bkp

Database Backup:
Perform a full database backup including control files, redo logs, database files, init files and password files using RMAN.


SQL> select name,TOTAL_MB,FREE_MB  from  v$asm_disk;
 
NAME                             TOTAL_MB    FREE_MB
------------------------------ ---------- ----------
DATA                                38148      26293
OCR                                  9537       4018
 
SQL> SELECT name FROM v$asm_diskgroup;
 
NAME
------------------------------
ADATA
OCR_VD
 
SQL> SELECT file_name FROM dba_data_files;
 
FILE_NAME
-----------------------------------------------------
+ADATA/prod/users01.dbf
+ADATA/prod/undotbs01.dbf
+ADATA/prod/sysaux01.dbf
+ADATA/prod/system01.dbf
+ADATA/prod/undotbs02.dbf

Set up an iSCSI Target and Initiator

iscsiadm -m discovery -t st -p 147.43.0.10

#service iscsi restart

Creating  ASM disks.

# oracleasm createdisk NDATA /dev/sdc1
 
# oracleasm createdisk NOCR /dev/sdc2
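Since this is a RAC cluster, the ASMLib disks created on node 1 also have to be visible on node 2 before the diskgroups are created; the usual check on the remaining node(s) is:

# oracleasm scandisks

# oracleasm listdisks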

Creating Diskgroup

SQL> CREATE DISKGROUP NDATA
  EXTERNAL REDUNDANCY
  DISK 'ORCL:NDATA';

SQL> CREATE DISKGROUP NOCR
  EXTERNAL REDUNDANCY
  DISK 'ORCL:NOCR';

Stop RAC Database

$srvctl stop database -d prod

Startup mount

srvctl start database -d prod -o mount

SQL> SELECT name FROM v$asm_diskgroup;
 
NAME
------------------------------
ADATA
OCR_VD
NDATA
NOCR
 
SQL>
 
 
SQL> select name,TOTAL_MB,FREE_MB  from  v$asm_disk;
 
NAME                             TOTAL_MB    FREE_MB
------------------------------ ---------- ----------
DATA                                38148      36470
OCR                                  9537       9141
NDATA                               28610      28515
NOCR                                 9538       9443
 

Make a copy of the Oracle ASM database files to be moved.

[oracle@rac1 ~]$ rman target /
 
RMAN> COPY DATAFILE '+ADATA/prod/system01.dbf' TO '+NDATA';
RMAN> COPY DATAFILE '+ADATA/prod/users01.dbf' TO '+NDATA';
RMAN> COPY DATAFILE '+ADATA/prod/undotbs01.dbf' TO '+NDATA';
RMAN> COPY DATAFILE '+ADATA/prod/sysaux01.dbf' TO '+NDATA';
RMAN> COPY DATAFILE '+ADATA/prod/undotbs02.dbf' TO '+NDATA';

Alter the Database

SQL> ALTER DATABASE RENAME FILE '+ADATA/prod/system01.dbf' TO 
'+NDATA/prod/datafile/system.256.917015899' ;
 
Database altered.
 
SQL> ALTER DATABASE RENAME FILE '+ADATA/prod/users01.dbf' TO 
'+NDATA/prod/datafile/users.257.917016043';
 
Database altered.
 
SQL> ALTER DATABASE RENAME FILE '+ADATA/prod/undotbs01.dbf' TO 
'+NDATA/prod/datafile/undotbs1.258.917016077';
 
Database altered.
 
SQL> ALTER DATABASE RENAME FILE '+ADATA/prod/sysaux01.dbf' TO 
'+NDATA/prod/datafile/sysaux.259.917016145';
 
Database altered.
 
SQL> ALTER DATABASE RENAME FILE '+ADATA/prod/undotbs02.dbf' TO 
'+NDATA/prod/datafile/undotbs2.260.917016877';
 
Database altered.


Switch Datafiles by RMAN

[oracle@rac1 ~]$ rman target /
 
Recovery Manager: Release 11.2.0.3.0 - Production on Tue Jul 12 15:02:52 2016
 
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
 
connected to target database: PROD (DBID=275885043, not open)
 
RMAN> SWITCH DATAFILE '+NDATA/prod/datafile/system.256.917015899' TO COPY;
 
using target database control file instead of recovery catalog
datafile 1 switched to datafile copy "+NDATA/prod/datafile/system.256.917015899"
 
RMAN> SWITCH DATAFILE '+NDATA/prod/datafile/users.257.917016043' TO COPY;
 
datafile 4 switched to datafile copy "+NDATA/prod/datafile/users.257.917016043"
 
RMAN> SWITCH DATAFILE '+NDATA/prod/datafile/undotbs1.258.917016077' TO COPY;
 
datafile 3 switched to datafile copy 
"+NDATA/prod/datafile/undotbs1.258.917016077"
 
RMAN> SWITCH DATAFILE '+NDATA/prod/datafile/sysaux.259.917016145' TO COPY;
 
datafile 2 switched to datafile copy "+NDATA/prod/datafile/sysaux.259.917016145"
 
RMAN> SWITCH DATAFILE '+NDATA/prod/datafile/undotbs2.260.917016877' TO COPY;
 
datafile 5 switched to datafile copy 
"+NDATA/prod/datafile/undotbs2.260.917016877"
 
RMAN>

Recover the new ASM database files.

RECOVER DATAFILE '+NDATA/prod/datafile/system.256.917015899';
RECOVER DATAFILE '+NDATA/prod/datafile/users.257.917016043';
RECOVER DATAFILE '+NDATA/prod/datafile/undotbs1.258.917016077';
RECOVER DATAFILE '+NDATA/prod/datafile/sysaux.259.917016145';
RECOVER DATAFILE '+NDATA/prod/datafile/undotbs2.260.917016877';
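As a side note, instead of copying, renaming, switching and recovering each datafile individually, the whole database can usually be moved to the new diskgroup in one pass with an image copy; a sketch of that alternative (run with the database mounted):

RMAN> BACKUP AS COPY DATABASE FORMAT '+NDATA';
RMAN> SWITCH DATABASE TO COPY;
RMAN> RECOVER DATABASE;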
 


Now we are going to move the OCR and Voting Disk.

We go with a rebalance power of 11, which is full throttle, because this is planned maintenance.
ALTER DISKGROUP OCR_VD add DISK 'ORCL:NOCR' NAME NOCR REBALANCE POWER 11;

NOCR is a new disk in ASM; we add it to OCR_VD and let ASM rebalance the data onto it. We can check the rebalance status from Enterprise Manager or V$ASM_OPERATION.
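A minimal check while the rebalance is running:

SQL> SELECT group_number, operation, state, power, est_minutes
       FROM v$asm_operation;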
When rebalance completes, drop the old disks.
Delete the old disk from the group. Note that DROP DISK takes the ASM disk name (as shown in V$ASM_DISK), not the device path:
ALTER DISKGROUP OCR_VD DROP DISK OCR REBALANCE POWER 11;

Moving the temp tablespace.
SQL> alter tablespace temp add tempfile '+ndata/prod/datafile/temp02.dbf' 
size 1000m reuse autoextend on next 20m maxsize 5G;

SQL> alter tablespace temp drop tempfile '+ADATA/prod/temp01.dbf';





Creating redo log
alter database add logfile thread 1 group 5 ('+ndata/prod/datafile/redo05.log')
 size  100M;
alter database add logfile thread 1 group 6 ('+ndata/prod/datafile/redo06.log')
 size 100M;
alter database add logfile thread 1 group 7 ('+ndata/prod/datafile/redo09.log')
 size 100M;
alter database add logfile thread 1 group 8 ('+ndata/prod/datafile/redo10.log')
 size 100M;

Delete the old redo logs.
alter database drop logfile group 1;
alter database drop logfile group 2;
alter database drop logfile group 3;
alter database drop logfile group 4;
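A redo log group can only be dropped while it is INACTIVE and not the current group, so before running the drops above it helps to check V$LOG and, if needed, force a switch and checkpoint:

-- old groups must be INACTIVE before they can be dropped
SQL> SELECT group#, thread#, status FROM v$log;

SQL> ALTER SYSTEM SWITCH LOGFILE;
SQL> ALTER SYSTEM CHECKPOINT;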

To check whether all data has been moved, go to ASMCMD:
ASMCMD> lsof
DB_Name  Instance_Name  Path
+ASM     +ASM1          +nocr_vd.255.4294967295
prod     prod1          +ndata/prod/control01.ctl
prod     prod1          +ndata/prod/control02.ctl
prod     prod1          +ndata/prod/redo05.log
prod     prod1          +ndata/prod/redo06.log
prod     prod1          +ndata/prod/redo07.log
prod     prod1          +ndata/prod/redo08.log
prod     prod1          +ndata/prod/sysaux01.dbf
prod     prod1          +ndata/prod/system01.dbf
prod     prod1          +ndata/prod/temp01.dbf
prod     prod1          +ndata/prod/undotbs01.dbf
prod     prod1          +ndata/prod/undotbs02.dbf
prod     prod1          +ndata/prod/users01.dbf
ASMCMD>


Now that the SAN migration has been completed, we can shut down the old SAN.