 Block Access, Grants and Interrupts
Oracle Tips by Burleson Consulting

Oracle 11g Grid & Real Application Clusters by Rampant TechPress is written by four of the top Oracle database experts (Steve Karam, Bryan Jones, Mike Ault and Madhu Tumma).  The following is an excerpt from the book.

The GCS maintains the status of each resource and keeps an inventory of the access requests for data blocks. After a block is transferred from one instance to another to satisfy a request, the requesting process must be notified that the block is actually available. Interrupts serve this purpose: they signal the arrival or completion of block transfers. The GCS uses the following interrupts to manage resource allocation:

Blocking Interrupt - When a requestor needs exclusive access, the GCS sends a blocking interrupt to the process that currently owns the shared resource, notifying it that a request for exclusive access is waiting.

Acquisition Interrupt - When the requested access mode (e.g., exclusive) becomes available because an earlier access mode has been released, the GCS sends an acquisition interrupt to alert the process that requested it.

Block Arrival Interrupt - When a process requests a block from the GCS, the request is forwarded to the instance holding the block. The requested block is then sent to the requesting process, and the process informs the GCS that it has received the block. This notification is called a block arrival interrupt.

Block requests can be granted to many processes at the same time, but they follow a queuing mechanism. The GCS maintains two types of queues for resource requests: the convert queue and the granted queue. If the GCS cannot grant a resource request immediately, it places the request in the convert queue and tracks it there along with all other waiting requests. Once the resource is granted to the requesting process, the request moves to the granted queue, where the GCS continues to track it.
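The two-queue behavior can be illustrated with a small conceptual model. This is only a sketch: the `Resource` class, the "S"/"X" mode names, and the FIFO promotion rule are assumptions made for illustration and do not reflect Oracle's internal structures.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Toy model of one GCS-managed resource (hypothetical, not Oracle's structure)."""
    name: str
    granted: list = field(default_factory=list)    # (process, mode) pairs holding the resource
    convert: deque = field(default_factory=deque)  # (process, mode) requests still waiting

    def _compatible(self, mode):
        # Shared ("S") requests coexist with other shared holders;
        # an exclusive ("X") request needs the resource to be free.
        if not self.granted:
            return True
        return mode == "S" and all(m == "S" for _, m in self.granted)

    def request(self, process, mode):
        """Grant immediately if possible, otherwise park the request in the convert queue."""
        if not self.convert and self._compatible(mode):
            self.granted.append((process, mode))
            return "granted"
        self.convert.append((process, mode))
        return "queued"

    def release(self, process):
        """Drop a holder, then promote waiting requests in FIFO order (assumed policy)."""
        self.granted = [(p, m) for p, m in self.granted if p != process]
        promoted = []
        while self.convert and self._compatible(self.convert[0][1]):
            waiter = self.convert.popleft()
            self.granted.append(waiter)
            promoted.append(waiter[0])
        return promoted


block = Resource("file 5, block 120")
print(block.request("P1", "S"))  # granted
print(block.request("P2", "X"))  # queued
print(block.release("P1"))       # ['P2']
```

In this model, the return value of `release` plays the role of the acquisition interrupt: it lists the waiting processes that would be notified that their requested mode is now available.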

Cache Fusion and Recovery

In a RAC system, whenever a node fails, the instance running on that node crashes and becomes unusable. There can be several reasons for such a failure. This section focuses on the changes that take place in the global cache and on how one of the surviving instances recovers the failed instance.

Recovery Features

Only the cache resources that reside on the failed nodes or are mastered by the GCS on the failed nodes need to be rebuilt or re-mastered. Rebuilding or re-mastering does not mean reconstructing a block; only the lock ownership changes, as explained later with examples.

 

All resources previously mastered at the failed instance are redistributed across the remaining instances. These resources are reconstructed at their new master instance. All other resources previously mastered at surviving instances remain unaffected.
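A minimal sketch of this redistribution follows. The `remaster` function, the instance names, and the round-robin placement are hypothetical choices for illustration; the text does not say how Oracle picks the new master for each resource.

```python
def remaster(directory, failed, survivors):
    """Return a new resource -> master map after `failed` leaves the cluster.

    Only resources mastered by the failed instance are reassigned; round-robin
    placement is an assumption of this sketch, not Oracle's actual policy.
    """
    if not survivors:
        raise ValueError("no surviving instance to take over mastership")
    new_directory = dict(directory)
    orphaned = sorted(r for r, m in directory.items() if m == failed)
    for i, resource in enumerate(orphaned):
        new_directory[resource] = survivors[i % len(survivors)]
    return new_directory


directory = {"blk_1": "inst1", "blk_2": "inst2", "blk_3": "inst3", "blk_4": "inst2"}
after = remaster(directory, failed="inst2", survivors=["inst1", "inst3"])
print(after)
# {'blk_1': 'inst1', 'blk_2': 'inst1', 'blk_3': 'inst3', 'blk_4': 'inst3'}
```

Note that `blk_1` and `blk_3` keep their original masters: resources mastered at surviving instances are untouched, and no block contents are rebuilt, only the ownership entries change.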

 

The cluster manager first detects the node and instance failure. It communicates the failure status to the GCS by way of the LMON process. At this stage, any surviving instance in the cluster initiates the recovery process.

 

Remember, instance recovery does not include restarting the failed instance or recovering applications that were running on that instance. Also note that, even after a node failure and instance loss, the redo log file of the failed instance is still available to the recovering instance, because the redo log file is located on the shared cluster file system or a shared raw partition. This is an important feature of the RAC system.

 

Because of past images, instance recovery is performed differently in a RAC implementation. The SMON process of a surviving instance performs recovery of the failed instance or thread. In a stand-alone instance, by contrast, a foreground process performs the recovery.

Recovery Methodology and Steps

Oracle performs the following steps to recover:

  1. In the initial phase of recovery, GES enqueues are reconfigured and the global resource directory is frozen. All GCS resource requests and writes are temporarily halted.

  2. GCS resources are reconfigured among the surviving instances. One of the surviving instances becomes the recovering instance. The SMON process of the recovering instance starts a first pass of the redo log read of the failed instance's redo thread.

  3. Block resources that need to be recovered are identified and the global resource directory is reconstructed. Pending requests or writes are cancelled or replayed.

  4. Resources identified in the previous log read phase are defined as recovery resources. Buffer space for recovery is allocated.

  5. Assuming that there are past images of blocks to be recovered in other caches in the cluster, source buffers are requested from other instances. These buffers serve as the starting point of recovery for a particular block.

  6. All resources and enqueues required for subsequent processing have been acquired and the global resource directory is now unfrozen. Any data blocks that are not in recovery can now be accessed. At this time, the system is partially available.

  7. SMON merges the redo threads, ordered by SCN, to ensure that changes are written in an orderly fashion. This step is important for multiple simultaneous failures. If multiple instances die simultaneously, neither the PI buffers nor the current buffers for a data block can be found in any surviving instance's cache, so the redo logs of the failed instances are merged.

  8. Now the second pass of recovery begins and redo is applied to data files, releasing the recovery resources immediately after block recovery, so that more and more blocks become available as cache recovery proceeds.

  9. After all blocks have been recovered and recovery resources have been released, the system is available for normal use.
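The two redo passes in the steps above can be pictured with a toy model. Everything here is an assumption made for illustration: redo records are plain dictionaries, `on_disk_scn` stands in for the SCN already reflected in each block on disk, and the function names are hypothetical.

```python
def first_pass(redo_records, on_disk_scn):
    """Scan redo once and return the set of blocks that need recovery."""
    return {rec["block"] for rec in redo_records
            if rec["scn"] > on_disk_scn.get(rec["block"], 0)}


def second_pass(redo_records, datafile, on_disk_scn, recovery_set):
    """Apply redo block by block, yielding each block as soon as it is recovered."""
    for block in sorted(recovery_set):
        for rec in redo_records:  # records are assumed to be in SCN order
            if rec["block"] == block and rec["scn"] > on_disk_scn.get(block, 0):
                datafile[block] = rec["value"]
                on_disk_scn[block] = rec["scn"]
        yield block


redo = [
    {"scn": 101, "block": "A", "value": "a1"},
    {"scn": 102, "block": "B", "value": "b1"},
    {"scn": 103, "block": "A", "value": "a2"},
]
datafile = {"A": "a0", "B": "b1", "C": "c0"}
on_disk_scn = {"A": 100, "B": 102, "C": 90}

recovery_set = first_pass(redo, on_disk_scn)  # blocks outside this set stay accessible
locked = set(recovery_set)                    # blocks still unavailable to users
for block in second_pass(redo, datafile, on_disk_scn, recovery_set):
    locked.discard(block)                     # released right after its redo is applied

print(recovery_set, datafile, locked)
```

Blocks "B" and "C" never enter the recovery set, which mirrors the partial availability described in step 6, and each recovered block leaves `locked` as soon as its redo has been applied, as in step 8.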

Figure 2.16 shows the basic steps in the recovery.

 

Figure 2.16:  Online Instance Recovery Steps
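The SCN-ordered merge of redo threads described in step 7 can be sketched with a standard k-way merge. The record layout and the `merge_redo_threads` name are hypothetical, and the sketch assumes each thread is already sorted by SCN.

```python
import heapq
from operator import itemgetter


def merge_redo_threads(*threads):
    """Merge per-instance redo threads (each already SCN-ordered) into one SCN-ordered stream."""
    return list(heapq.merge(*threads, key=itemgetter("scn")))


thread1 = [
    {"scn": 100, "thread": 1, "block": "A"},
    {"scn": 104, "thread": 1, "block": "B"},
]
thread2 = [
    {"scn": 101, "thread": 2, "block": "A"},
    {"scn": 103, "thread": 2, "block": "C"},
]

merged = merge_redo_threads(thread1, thread2)
print([rec["scn"] for rec in merged])  # [100, 101, 103, 104]
```

Applying the merged stream in order ensures that the two changes to block "A", which come from different failed instances, are replayed in the sequence in which they originally occurred.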

     


Copyright © 1996 -  2013 by Burleson. All rights reserved.

Oracle® is the registered trademark of Oracle Corporation.