1. First, a Look at the Oracle Official Documentation
Reference:
http://download.oracle.com/docs/cd/E11882_01/rac.112/e16794/intro.htm#CWADD91998
Oracle Clusterware Software Concepts and Requirements
Oracle Clusterware uses voting disk files to provide fencing and cluster node membership determination. OCR provides cluster configuration information. You can place the Oracle Clusterware files on either Oracle ASM or on shared common disk storage. If you configure Oracle Clusterware on storage that does not provide file redundancy, then Oracle recommends that you configure multiple locations for OCR and voting disks. The voting disks and OCR are described as follows:
Oracle Clusterware uses voting disk files to determine which nodes are members of a cluster. You can configure voting disks on Oracle ASM, or you can configure voting disks on shared storage.
If you configure voting disks on Oracle ASM, then you do not need to manually configure the voting disks. Depending on the redundancy of your disk group, an appropriate number of voting disks are created.
If you do not configure voting disks on Oracle ASM, then for high availability, Oracle recommends that you have a minimum of three voting disks on physically separate storage. This avoids having a single point of failure. If you configure a single voting disk, then you must use external mirroring to provide redundancy.
You should have at least three voting disks, unless you have a storage device, such as a disk array that provides external redundancy. Oracle recommends that you do not use more than five voting disks. The maximum number of voting disks that is supported is 15.
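On a live cluster the voting disks can be listed with "crsctl query css votedisk". As a minimal sketch (the output below is invented; real GUIDs and device paths will differ), counting the ONLINE entries verifies the recommended minimum of three:

```shell
# Hypothetical "crsctl query css votedisk" output for an ASM-backed cluster;
# the GUIDs and device names are invented for illustration.
votedisk_output='##  STATE    File Universal Id                File Name Disk group
 1. ONLINE   6e5ed1e0b4a34f8abf1c2d3e4f5a6b7c (/dev/asm-disk1) [CRS]
 2. ONLINE   7f6fe2f1c5b45f9bcf2d3e4f5a6b7c8d (/dev/asm-disk2) [CRS]
 3. ONLINE   8a7af3a2d6c56fabdf3e4f5a6b7c8d9e (/dev/asm-disk3) [CRS]'

# Count the voting disks that are ONLINE; without external redundancy,
# high availability calls for at least three.
online=$(printf '%s\n' "$votedisk_output" | grep -c ' ONLINE ')
echo "online voting disks: $online"
```

On an ASM-backed cluster the voting disk count follows the disk group redundancy: external redundancy gives one, normal redundancy three, and high redundancy five.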
Oracle Clusterware uses the Oracle Cluster Registry (OCR) to store and manage information about the components that Oracle Clusterware controls, such as Oracle RAC databases, listeners, virtual IP addresses (VIPs), services, and any applications. OCR stores configuration information in a series of key-value pairs in a tree structure. To ensure cluster high availability, Oracle recommends that you define multiple OCR locations. In addition:
o You can have up to five OCR locations
o Each OCR location must reside on shared storage that is accessible by all of the nodes in the cluster
o You can replace a failed OCR location online if it is not the only OCR location
o You must update OCR through supported utilities such as Oracle Enterprise Manager, the Server Control Utility (SRVCTL), the OCR configuration utility (OCRCONFIG), or the Database Configuration Assistant (DBCA)
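The health and location of OCR can be inspected with the ocrcheck utility. The sketch below parses sample ocrcheck output to extract the configured OCR location; all of the values, including the +CRS disk group, are hypothetical:

```shell
# Hypothetical "ocrcheck" output; the sizes and the +CRS disk group
# are illustrative only.
ocrcheck_output='Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       2404
         Available space (kbytes) :     259716
         Device/File Name         :      +CRS
                                    Device/File integrity check succeeded'

# Extract the configured OCR location (the value after the colon).
ocr_location=$(printf '%s\n' "$ocrcheck_output" | awk -F': *' '/Device\/File Name/ {print $2}')
echo "OCR location: $ocr_location"
```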
See Also:
Chapter 2, "Administering Oracle Clusterware" for more information about voting disks and OCR
Oracle Clusterware Network Configuration Concepts
Oracle Clusterware enables a dynamic Grid Infrastructure through the self-management of the network requirements for the cluster. Oracle Clusterware 11g release 2 (11.2) supports the use of dynamic host configuration protocol (DHCP) for all private interconnect addresses, as well as for most of the VIP addresses. DHCP provides dynamic configuration of the host's IP address, but it does not provide an optimal method of producing names that are useful to external clients.
When you are using Oracle RAC, all of the clients must be able to reach the database. This means that the VIP addresses must be resolved by the clients. This problem is solved by the addition of the Oracle Grid Naming Service (GNS) to the cluster. GNS is linked to the corporate domain name service (DNS) so that clients can easily connect to the cluster and the databases running there. Activating GNS in a cluster requires a DHCP service on the public network.
Implementing GNS
To implement GNS, you must collaborate with your network administrator to obtain an IP address on the public network for the GNS VIP. DNS uses the GNS VIP to forward requests for access to the cluster to GNS. The network administrator must delegate a subdomain in the network to the cluster. The subdomain forwards all requests for addresses in the subdomain to the GNS VIP.
GNS and the GNS VIP run on one node in the cluster. The GNS daemon listens on the GNS VIP using port 53 for DNS requests. Oracle Clusterware manages the GNS and the GNS VIP to ensure that they are always available. If the server on which GNS is running fails, then Oracle Clusterware fails GNS over, along with the GNS VIP, to another node in the cluster.
With DHCP on the network, Oracle Clusterware obtains an IP address from the server along with other network information, such as what gateway to use, what DNS servers to use, what domain to use, and what NTP server to use. Oracle Clusterware initially obtains the necessary IP addresses during cluster configuration and it updates the Oracle Clusterware resources with the correct information obtained from the DHCP server.
Single Client Access Name (SCAN)
Oracle RAC 11g release 2 (11.2) introduces the Single Client Access Name (SCAN). The SCAN is a single name that resolves to three IP addresses in the public network. When using GNS and DHCP, Oracle Clusterware configures the VIP addresses for the SCAN name that is provided during cluster configuration.
The node VIP and the three SCAN VIPs are obtained from the DHCP server when using GNS. If a new server joins the cluster, then Oracle Clusterware dynamically obtains the required VIP address from the DHCP server, updates the cluster resource, and makes the server accessible through GNS.
Example 1-1 shows the DNS entries that delegate a domain to the cluster.
# Delegate to gns on mycluster
mycluster.example.com NS myclustergns.example.com
#Let the world know to go to the GNS vip
myclustergns.example.com. 10.9.8.7
See Also:
Oracle Grid Infrastructure Installation Guide for details about establishing resolution through DNS
Configuring Addresses Manually
Alternatively, you can choose manual address configuration, in which you configure the following:
· One public host name for each node.
· One VIP address for each node.
You must assign a VIP address to each node in the cluster. Each VIP address must be on the same subnet as the public IP address for the node and should be an address that is assigned a name in the DNS. Each VIP address must also be unused and unpingable from within the network before you install Oracle Clusterware.
· Up to three SCAN addresses for the entire cluster.
Note:
The SCAN must resolve to at least one address on the public network. For high availability and scalability, Oracle recommends that you configure the SCAN to resolve to three addresses.
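Whether a SCAN meets this recommendation can be checked with an ordinary DNS lookup. The sketch below counts the addresses in a hypothetical nslookup answer; the SCAN name and addresses are made up:

```shell
# Hypothetical "nslookup mycluster-scan.example.com" answer section;
# the SCAN name and the three addresses are invented for illustration.
scan_lookup='Name:   mycluster-scan.example.com
Address: 192.168.1.101
Name:   mycluster-scan.example.com
Address: 192.168.1.102
Name:   mycluster-scan.example.com
Address: 192.168.1.103'

# A SCAN configured for high availability should resolve to three addresses.
addr_count=$(printf '%s\n' "$scan_lookup" | grep -c '^Address:')
echo "SCAN resolves to $addr_count addresses"
```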
See Also:
Your platform-specific Oracle Grid Infrastructure Installation Guide installation documentation for information about system requirements and configuring network addresses
Overview of Oracle Clusterware Platform-Specific Software Components
When Oracle Clusterware is operational, several platform-specific processes or services run on each node in the cluster. This section describes these various processes and services.
The Oracle Clusterware Stack
Oracle Clusterware consists of two separate stacks: an upper stack anchored by the Cluster Ready Services (CRS) daemon (crsd) and a lower stack anchored by the Oracle High Availability Services daemon (ohasd). These two stacks have several processes that facilitate cluster operations. The following sections describe these stacks in more detail:
· The Cluster Ready Services Stack
· The Oracle High Availability Services Stack
The Cluster Ready Services Stack
The list in this section describes the processes that comprise CRS. The list includes components that are processes on Linux and UNIX operating systems, or services on Windows.
· Cluster Ready Services (CRS): The primary program for managing high availability operations in a cluster. The CRS daemon (crsd) manages cluster resources based on the configuration information that is stored in OCR for each resource. This includes start, stop, monitor, and failover operations. The crsd process generates events when the status of a resource changes. When you have Oracle RAC installed, the crsd process monitors the Oracle database instance, listener, and so on, and automatically restarts these components when a failure occurs.
· Cluster Synchronization Services (CSS): Manages the cluster configuration by controlling which nodes are members of the cluster and by notifying members when a node joins or leaves the cluster. If you are using certified third-party clusterware, then CSS processes interface with your clusterware to manage node membership information.
The cssdagent process monitors the cluster and provides I/O fencing. This service formerly was provided by Oracle Process Monitor Daemon (oprocd), also known as OraFenceService on Windows. A cssdagent failure may result in Oracle Clusterware restarting the node.
· Oracle ASM: Provides disk management for Oracle Clusterware and Oracle Database.
· Cluster Time Synchronization Service (CTSS): Provides time management in a cluster for Oracle Clusterware.
· Event Management (EVM): A background process that publishes events that Oracle Clusterware creates.
· Oracle Notification Service (ONS): A publish and subscribe service for communicating Fast Application Notification (FAN) events.
· Oracle Agent (oraagent): Extends clusterware to support Oracle-specific requirements and complex resources. This process runs server callout scripts when FAN events occur. This process was known as RACG in Oracle Clusterware 11g release 1 (11.1).
· Oracle Root Agent (orarootagent): A specialized oraagent process that helps crsd manage resources owned by root, such as the network, and the Grid virtual IP address.
The Cluster Synchronization Service (CSS), Event Management (EVM), and Oracle Notification Services (ONS) components communicate with other cluster component layers on other nodes in the same cluster database environment. These components are also the main communication links between Oracle Database, applications, and the Oracle Clusterware high availability components. In addition, these background processes monitor and manage database operations.
The Oracle High Availability Services Stack
This section describes the processes that comprise the Oracle High Availability Services stack. The list includes components that are processes on Linux and UNIX operating systems, or services on Windows.
· Cluster Logger Service (ologgerd): Receives information from all the nodes in the cluster and persists it in a CHM repository-based database. This service runs on only two nodes in a cluster.
· System Monitor Service (osysmond): The monitoring and operating system metric collection service that sends the data to the cluster logger service. This service runs on every node in a cluster.
· Grid Plug and Play (GPNPD): Provides access to the Grid Plug and Play profile, and coordinates updates to the profile among the nodes of the cluster to ensure that all of the nodes have the most recent profile.
· Grid Interprocess Communication (GIPC): A support daemon that enables Redundant Interconnect Usage.
· Multicast Domain Name Service (mDNS): Used by Grid Plug and Play to locate profiles in the cluster, as well as by GNS to perform name resolution. The mDNS process is a background process on Linux and UNIX, and a service on Windows.
· Oracle Grid Naming Service (GNS): Handles requests sent by external DNS servers, performing name resolution for names defined by the cluster.
2. Examining the OHASD Resources
Oracle High Availability Services Daemon (OHASD) :This process anchors the lower part of the Oracle Clusterware stack, which consists of processes that facilitate cluster operations.
When you start CRS in 11gR2, you are told that ohasd has started. So which resources does OHASD actually include? We can check with the following command:
[grid@racnode1 ~]$ crsctl stat res -init -t
---------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
---------------------------------------------------------
Cluster Resources
---------------------------------------------------------
ora.asm
1 ONLINE ONLINE racnode1 Started
ora.crsd
1 ONLINE ONLINE racnode1
ora.cssd
1 ONLINE ONLINE racnode1
ora.cssdmonitor
1 ONLINE ONLINE racnode1
ora.ctssd
1 ONLINE ONLINE racnode1 OBSERVER
ora.diskmon
1 ONLINE ONLINE racnode1
ora.drivers.acfs
1 ONLINE UNKNOWN racnode1
ora.evmd
1 ONLINE ONLINE racnode1
ora.gipcd
1 ONLINE ONLINE racnode1
ora.gpnpd
1 ONLINE ONLINE racnode1
ora.mdnsd
1 ONLINE ONLINE racnode1
For the 10g platform, these RAC resources are already explained in my blog post "RAC 的一些概念性和原理性的知識" (conceptual and theoretical knowledge of RAC):
http://blog.csdn.net/tianlesoftware/archive/2010/02/27/5331067.aspx
Let's look at each of these processes in turn:
(1) ora.asm: the resource for the ASM instance. In 10g, OCR and the voting disks were placed on other shared storage; in 11gR2 they are placed in ASM by default. Clusterware must read this information when it starts, so the ASM instance has to be brought up first during cluster startup.
(2) ora.crsd, ora.cssd, and ora.evmd:
These are the three most important processes in Clusterware.
In 10g, at the end of the clusterware installation you are prompted to run the root.sh script on each node. This script appends these three processes to the end of /etc/inittab as startup entries, so Clusterware starts automatically every time the system boots. If EVMD or CRSD terminates abnormally, the system automatically restarts the process; if CSSD terminates abnormally, the system reboots immediately.
In 11gR2, only ohasd is written to /etc/inittab:
[grid@racnode1 init.d]$ cat /etc/inittab
h1:35:respawn:/etc/init.d/init.ohasd run >/dev/null 2>&1 </dev/null
So the commands commonly used in 10g, such as /etc/init.d/init.crs, are gone; only /etc/init.d/init.ohasd remains.
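The structure of that inittab entry is easy to see by splitting its colon-separated fields. The sketch below hard-codes the line quoted above for illustration; on a real node you would read it with "grep init.ohasd /etc/inittab":

```shell
# The 11gR2 inittab entry quoted above, parsed field by field.
inittab_line='h1:35:respawn:/etc/init.d/init.ohasd run >/dev/null 2>&1 </dev/null'

# inittab fields are id:runlevels:action:process, separated by colons.
action=$(printf '%s\n' "$inittab_line" | cut -d: -f3)
process=$(printf '%s\n' "$inittab_line" | cut -d: -f4)
echo "action:  $action"
echo "process: $process"
```

The "respawn" action is what makes init restart ohasd whenever it dies, which is how the lower stack survives process failures.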
OCSSD: the most critical process in Clusterware; if it terminates abnormally, the node reboots. It provides the CSS (Cluster Synchronization Service), which monitors the cluster state in real time through several heartbeat mechanisms and provides fundamental cluster services such as split-brain protection.
CRSD: the main process implementing high availability (HA); the service it provides is called CRS (Cluster Ready Service). Every component that requires high availability is registered in OCR as a CRS resource during installation and configuration, and the crsd process uses the contents of OCR to decide which processes to monitor, how to monitor them, and what to do when a problem occurs. In other words, crsd tracks the state of each CRS resource and is responsible for starting, stopping, monitoring, and failing over these resources. By default, CRS automatically tries to restart a resource five times; if it still fails, it gives up.
CRS resources include GSD (Global Service Daemon), ONS (Oracle Notification Service), VIP, Database, Instance, and Service.
EVMD: publishes the events generated by CRS. These events can be delivered to clients in two ways: ONS and callout scripts.
The roles of these three processes are described in more detail in the same post, "RAC 的一些概念性和原理性的知識":
http://blog.csdn.net/tianlesoftware/archive/2010/02/27/5331067.aspx
(3)Grid Plug and Play (GPNPD):
Provides access to the Grid Plug and Play profile, and coordinates updates to the profile among the nodes of the cluster to ensure that all of the nodes have the most recent profile.
(4)Grid Interprocess Communication (GIPC):
A support daemon that enables Redundant Interconnect Usage.
(5) Multicast Domain Name Service (mDNS), the ora.mdnsd resource:
Used by Grid Plug and Play to locate profiles in the cluster, as well as by GNS to perform name resolution. The mDNS process is a background process on Linux and UNIX, and a service on Windows.
(6)Cluster Time Synchronization Service (CTSS):
Provides time management in a cluster for Oracle Clusterware. In the query output above, the state of CTSS is OBSERVER, that is, it is only observing.
In 11gR2, time synchronization for RAC can be provided in either of two ways: NTP or CTSS. If the installer finds that NTP is inactive, the Cluster Time Synchronization Service is installed in active mode and synchronizes the time across all nodes. If NTP is found to be configured, the Cluster Time Synchronization Service starts in observer mode, and Oracle Clusterware does not perform active time synchronization within the cluster.
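On a running node the CTSS mode can be confirmed with "crsctl check ctss". A minimal sketch, assuming a CRS-4700-style message (the exact wording may vary by release):

```shell
# Hypothetical "crsctl check ctss" message; the exact wording can differ
# between releases.
ctss_output='CRS-4700: The Cluster Time Synchronization Service is in Observer mode.'

# Classify the mode from the message text.
case "$ctss_output" in
  *Observer*) ctss_mode=observer ;;
  *Active*)   ctss_mode=active ;;
  *)          ctss_mode=unknown ;;
esac
echo "CTSS mode: $ctss_mode"
```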
(7)Automatic Storage Management Cluster File System (Oracle ACFS):
Oracle Automatic Storage Management Cluster File System (Oracle ACFS) is a multi-platform, scalable file system, and storage management technology that extends Oracle Automatic Storage Management (Oracle ASM) functionality to support customer files maintained outside of Oracle Database. Oracle ACFS supports many database and application files, including executables, database trace files, database alert logs, application reports, BFILEs, and configuration files. Other supported files are video, audio, text, images, engineering drawings, and other general-purpose application file data.
An Oracle ACFS file system is a layer on Oracle ASM and is configured with Oracle ASM storage, as shown in Figure 5-1. Oracle ACFS leverages Oracle ASM functionality that enables:
· Oracle ACFS dynamic file system resizing
· Maximized performance through direct access to Oracle ASM disk group storage
· Balanced distribution of Oracle ACFS across Oracle ASM disk group storage for increased I/O parallelism
· Data reliability through Oracle ASM mirroring protection mechanisms
For more details, see:
http://download.oracle.com/docs/cd/E11882_01/server.112/e16102/asmfilesystem.htm#OSTMG31000
3. Examining the CRS Resources
In 11.2, CRSD resources are reclassified into two categories: Local Resources and Cluster Resources. The OHASD resources shown earlier belong to the cluster resources.
[grid@racnode1 ~]$ crsctl stat res -t
---------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
---------------------------------------------------------
Local Resources
---------------------------------------------------------
ora.CRS.dg
ONLINE ONLINE racnode1
ONLINE ONLINE racnode2
ora.DATA.dg
ONLINE ONLINE racnode1
ONLINE ONLINE racnode2
ora.FRA.dg
ONLINE ONLINE racnode1
ONLINE ONLINE racnode2
ora.LISTENER.lsnr
ONLINE ONLINE racnode1
ONLINE ONLINE racnode2
ora.asm
ONLINE ONLINE racnode1 Started
ONLINE ONLINE racnode2 Started
ora.eons
ONLINE ONLINE racnode1
ONLINE ONLINE racnode2
ora.gsd
OFFLINE OFFLINE racnode1
OFFLINE OFFLINE racnode2
ora.net1.network
ONLINE ONLINE racnode1
ONLINE ONLINE racnode2
ora.ons
ONLINE ONLINE racnode1
ONLINE ONLINE racnode2
ora.registry.acfs
ONLINE UNKNOWN racnode1
ONLINE ONLINE racnode2
---------------------------------------------------------
Cluster Resources
---------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE racnode2
ora.oc4j
1 OFFLINE OFFLINE
ora.racdb.db
1 ONLINE ONLINE racnode1 Open
2 ONLINE ONLINE racnode2 Open
ora.racnode1.vip
1 ONLINE ONLINE racnode1
ora.racnode2.vip
1 ONLINE ONLINE racnode2
ora.scan1.vip
1 ONLINE ONLINE racnode2
[grid@racnode1 ~]$
From the query output above you can see that in 11gR2 the network, disk groups, eons, and ASM are also managed as resources.
Note also that the gsd and oc4j resources are OFFLINE. The explanation is as follows:
ora.gsd is OFFLINE by default if there is no Oracle9i database in the cluster.
ora.oc4j is OFFLINE in 11.2.0.1 because Database Workload Management (DBWLM) is unavailable. Both can be ignored in 11gR2 RAC.
You can also view the resources with the following command:
[root@racnode1 ~]# crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.CRS.dg ora....up.type ONLINE ONLINE racnode1
ora.DATA.dg ora....up.type ONLINE ONLINE racnode1
ora.FRA.dg ora....up.type ONLINE ONLINE racnode1
ora....ER.lsnr ora....er.type ONLINE ONLINE racnode1
ora....N1.lsnr ora....er.type ONLINE ONLINE racnode2
ora.asm ora.asm.type ONLINE ONLINE racnode1
ora.eons ora.eons.type ONLINE ONLINE racnode1
ora.gsd ora.gsd.type OFFLINE OFFLINE
ora....network ora....rk.type ONLINE ONLINE racnode1
ora.oc4j ora.oc4j.type OFFLINE OFFLINE
ora.ons ora.ons.type ONLINE ONLINE racnode1
ora.racdb.db ora....se.type ONLINE ONLINE racnode1
ora....SM1.asm application ONLINE ONLINE racnode1
ora....E1.lsnr application ONLINE ONLINE racnode1
ora....de1.gsd application OFFLINE OFFLINE
ora....de1.ons application ONLINE ONLINE racnode1
ora....de1.vip ora....t1.type ONLINE ONLINE racnode1
ora....SM2.asm application ONLINE ONLINE racnode2
ora....E2.lsnr application ONLINE ONLINE racnode2
ora....de2.gsd application OFFLINE OFFLINE
ora....de2.ons application ONLINE ONLINE racnode2
ora....de2.vip ora....t1.type ONLINE ONLINE racnode2
ora....ry.acfs ora....fs.type ONLINE ONLINE racnode2
ora.scan1.vip ora....ip.type ONLINE ONLINE racnode1
ora.scan2.vip ora....ip.type ONLINE ONLINE racnode2
[root@racnode1 ~]#
4. Examining Dependencies Between Resources
For example, a disk group resource depends on ASM, and a VIP depends on the network. These dependencies can be seen in each resource's detailed attributes:
[root@racnode1 ~]# crsctl stat res ora.DATA.dg -p
NAME=ora.DATA.dg
TYPE=ora.diskgroup.type
ACL=owner:grid:rwx,pgrp:oinstall:rwx,other::r--
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX%
ALIAS_NAME=
AUTO_START=never
CHECK_INTERVAL=300
CHECK_TIMEOUT=600
DEFAULT_TEMPLATE=
DEGREE=1
DESCRIPTION=CRS resource type definition for ASM disk group resource
ENABLED=1
LOAD=1
LOGGING_LEVEL=1
NLS_LANG=
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=0
PROFILE_CHANGE_TEMPLATE=
RESTART_ATTEMPTS=5
SCRIPT_TIMEOUT=60
START_DEPENDENCIES=hard(ora.asm) pullup(ora.asm)
START_TIMEOUT=900
STATE_CHANGE_TEMPLATE=
STOP_DEPENDENCIES=hard(intermediate:ora.asm)
STOP_TIMEOUT=180
UPTIME_THRESHOLD=1d
USR_ORA_ENV=
USR_ORA_OPI=false
USR_ORA_STOP_MODE=
VERSION=11.2.0.1.0
[grid@racnode1 ~]$ crsctl stat res ora.racnode1.vip -p
NAME=ora.racnode1.vip
TYPE=ora.cluster_vip_net1.type
ACL=owner:root:rwx,pgrp:root:r-x,other::r--,group:oinstall:r-x,user:grid:r-x
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
ACTIVE_PLACEMENT=1
AGENT_FILENAME=%CRS_HOME%/bin/orarootagent%CRS_EXE_SUFFIX%
AUTO_START=restore
CARDINALITY=1
CHECK_INTERVAL=1
DEFAULT_TEMPLATE=PROPERTY(RESOURCE_CLASS=vip)
DEGREE=1
DESCRIPTION=Oracle VIP resource
ENABLED=1
FAILOVER_DELAY=0
FAILURE_INTERVAL=0
FAILURE_THRESHOLD=0
HOSTING_MEMBERS=racnode1
LOAD=1
LOGGING_LEVEL=1
NLS_LANG=
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=0
PLACEMENT=favored
PROFILE_CHANGE_TEMPLATE=
RESTART_ATTEMPTS=0
SCRIPT_TIMEOUT=60
SERVER_POOLS=*
START_DEPENDENCIES=hard(ora.net1.network) pullup(ora.net1.network)
START_TIMEOUT=0
STATE_CHANGE_TEMPLATE=
STOP_DEPENDENCIES=hard(ora.net1.network)
STOP_TIMEOUT=0
UPTIME_THRESHOLD=1h
USR_ORA_ENV=
USR_ORA_VIP=racnode1-vip
VERSION=11.2.0.1.0
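These attributes lend themselves to scripting: the dependency lines can be pulled out of "crsctl stat res <name> -p" output with awk. A minimal sketch, reusing a trimmed copy of the ora.racnode1.vip profile shown above:

```shell
# A trimmed copy of the ora.racnode1.vip profile shown above.
profile='NAME=ora.racnode1.vip
START_DEPENDENCIES=hard(ora.net1.network) pullup(ora.net1.network)
STOP_DEPENDENCIES=hard(ora.net1.network)'

# Pull out the start and stop dependencies (everything after the "=").
start_dep=$(printf '%s\n' "$profile" | awk -F= '/^START_DEPENDENCIES/ {print $2}')
stop_dep=$(printf '%s\n' "$profile" | awk -F= '/^STOP_DEPENDENCIES/ {print $2}')
echo "start: $start_dep"
echo "stop:  $stop_dep"
```

The hard(ora.net1.network) clause means the VIP cannot start unless the network resource is up, and pullup(ora.net1.network) makes Clusterware start the network automatically when the VIP is requested.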