# CDDL HEADER START
# The contents of this file are subject to the terms of the
# Common Development and Distribution License (the "License").
# You may not use this file except in compliance with the License.
# You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
# See the License for the specific language governing permissions
# and limitations under the License.
# When distributing Covered Code, include this CDDL HEADER in each
# file and include the License file at usr/src/OPENSOLARIS.LICENSE.
# If applicable, add the following below this CDDL HEADER, with the
# fields enclosed by brackets "[]" replaced with your own identifying
# information: Portions Copyright [yyyy] [name of copyright owner]
# CDDL HEADER END

# Copyright 2008 Sun Microsystems, Inc. All rights reserved.
# Use is subject to license terms.

# NWS DataServices within SunCluster reconfiguration script.
# Description:
# This script is called from /usr/cluster/lib/sc/run_reserve at
# appropriate times to start and stop the NWS DataServices as SunCluster
# disk device groups are brought online or taken offline.
# SNDR configuration requires that a resource group be configured.
# 1. The resource group name should be the same as the device group name
#    with -stor-rg appended, e.g. if the device group name is abc-dg then
#    the resource group name would be abc-dg-stor-rg.
# 2. It should contain 2 resources, unless one of the resource types is
#    SUNW.GeoCtlAVS: one of type SUNW.LogicalHostname and one of type
#    either SUNW.HAStorage or SUNW.HAStoragePlus. Resource type versioning
#    is ignored.
#    A HAStorage type resource should have its ServicePaths property set
#    to the device group name. A HAStoragePlus type resource should have
#    either its FilesystemMountPoints property pointing to a file system
#    associated with the device group name, or its GlobalDevicePaths
#    property set to the device group name.
#    The LogicalHostname type resource should have a failover IP address
#    in it, which SNDR uses to communicate with the secondary side.
# SNDR requires that the LogicalHost (failover) IP address, which is part
# of the resource group for SNDR, be hosted on the same node as the device
# group. In the become_primary case of the run_reserve script, this script
# therefore tries to move the resource group along with the device group.
# In the primary_to_secondary case, after stopping NWS data services, it
# tries to kill the switchover function if it is still running in the
# background.
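
# Illustration only, not part of the script proper: a minimal sketch of the
# naming rule in item 1 above. The function name rg_name_for_dg is
# hypothetical; the script itself may derive the name differently.

```shell
# rg_name_for_dg DG
# Print the resource group name expected for device group DG,
# i.e. the device group name with "-stor-rg" appended.
rg_name_for_dg()
{
	printf '%s\n' "${1}-stor-rg"
}
```

# For example, "rg_name_for_dg abc-dg" prints abc-dg-stor-rg.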
# Usage: /usr/cluster/sbin/dscfg_reconfigure { start | stop } diskgroup
# Configuration:
# Scripts to be run should have been symlinked into $NWS_START_DIR and
# $NWS_STOP_DIR. Note that the scripts are processed in lexical order,
# and that unlike /etc/rc?.d/ there is no leading S or K character.
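
# Illustration only: a minimal dry-run sketch of processing a directory of
# scripts in lexical order. It relies on the shell sorting the results of
# pathname expansion. The function name run_scripts_in_order is hypothetical,
# and the sketch only prints what it would run instead of executing anything.

```shell
# run_scripts_in_order DIR ACTION DG
# Visit every non-empty file in DIR in the order the shell expands
# the glob (sorted), and print the command that would be run for it.
run_scripts_in_order()
{
	dir=$1
	action=$2
	dg=$3
	for f in "$dir"/*
	do
		# -s is false for empty files and for an unmatched glob pattern
		[ -s "$f" ] || continue
		echo "would run: $f $action $dg"
	done
}
```

# Because ordering comes from the file names alone, a numeric prefix such as
# 10 or 20 on each symlink is one way to control the sequence.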
# Exit status:
# 0 - success

# Global variables
# this program
# directory full of start scripts
typeset -r NWS_START_DIR=/usr/cluster/lib/dscfg/start
# directory full of stop scripts
typeset -r NWS_STOP_DIR=/usr/cluster/lib/dscfg/stop
# the syslog facility to use.
# - conceptually this should be based on the output of
# "scha_cluster_get -O SYSLOG_FACILITY", but that won't work early
# during boot.
# Variables for retrying scswitch of Resource group for SNDR

# Since the switchover of the resource group is run in the background,
# the stop action of the reconfig script kills the background switchover
# if it is still running. Because the NWS services are being stopped on
# this node, there is no need to switch the resource group.
# The pid of the process is kept in the file /var/run/scnws/$dg.pid.
# Input: dg - device group
# Output: Nothing, kills the process
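
# Illustration only: a minimal sketch of killing a background process whose
# pid was recorded in a file, as described above. The function name
# kill_by_pidfile is hypothetical, and the pid file path is passed in rather
# than assumed to live under /var/run/scnws.

```shell
# kill_by_pidfile PIDFILE
# If PIDFILE exists and names a live process, send it SIGTERM.
# The pid file is removed in either case. Always returns 0.
kill_by_pidfile()
{
	pidfile=$1
	[ -f "$pidfile" ] || return 0
	pid=$(cat "$pidfile")
	# kill -0 sends no signal; it only tests that the pid can be signalled
	if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
	then
		kill "$pid" 2>/dev/null
	fi
	rm -f "$pidfile"
	return 0
}
```

# A recorded pid can be reused by an unrelated process after the original
# exits, so a production version may want a stronger identity check.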

# Get the status of the resource group on this node, using scha commands.
# Input: resource group - $1
# Output: Status
	rgstat=`scha_resourcegroup_get -O RG_STATE -G $rg`

# This function is called in the background from the do_scswitch function
# to switch the resource group to this node, which is becoming primary for
# the diskgroup. If the status of the resource group is Offline, it uses
# the scswitch command to switch the resource group to this node. If it has
# become Online, it cleans up the pid file. If it is Pending, the resource
# group is in the process of coming online, so it waits for some time for
# it to become Online.
# scswitch may fail, so the function retries $retry_num times, waiting
# $retry_interval seconds between attempts.
# Input: resource group - $1, Diskgroup/Diskset - $2
# Output: 0 - success, 1 - failure
	-t "NWS.[$ARGV0]" `gettext "scswitch of resource group"` "$rg"
	-t "NWS.[$ARGV0]" `gettext "pending online of resource group"` "$rg"
	-t "NWS.[$ARGV0]" `gettext "Improper resource group status for Remote Mirror"` "$rgstat"
	-t "NWS.[$ARGV0]" "Did not switch resource group for Remote Mirror. System Administrator intervention required"
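
# Illustration only: a simplified sketch of the retry pattern described for
# the switch function, with a fixed number of attempts and a fixed wait
# between them. The function name retry_switch is hypothetical. The command
# to retry is passed in as arguments, so the sketch does not call scswitch or
# inspect resource group states the way the real function does.

```shell
# retry_switch RETRY_NUM RETRY_INTERVAL CMD [ARGS...]
# Run CMD up to RETRY_NUM times, sleeping RETRY_INTERVAL seconds
# between failed attempts.
# Returns 0 on the first success, 1 if every attempt fails.
retry_switch()
{
	retry_num=$1
	retry_interval=$2
	shift 2
	attempt=0
	while [ "$attempt" -lt "$retry_num" ]
	do
		if "$@"
		then
			return 0
		fi
		attempt=$((attempt + 1))
		# no point in sleeping after the final failed attempt
		if [ "$attempt" -lt "$retry_num" ]
		then
			sleep "$retry_interval"
		fi
	done
	return 1
}
```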

# This function calls the switchfunc function in the background to switch
# the resource group for SNDR. It validates that the diskgroup/diskset is
# configured for SNDR, checks that the resource group is in the Managed
# state, etc.
# If it detects a mis-configuration, it will disable SNDR for the
# device group being processed. This is to prevent cluster hangs and panics.
# The ServicePaths extension property of a HAStorage type resource or the
# GlobalDevicePaths extension property of HAStoragePlus, both of which
# specify the device group, serve as a link or mapping to retrieve the
# resource group associated with the SNDR configured device group.
# Switchfunc is called in the background to avoid the deadlock that would
# arise from switching the resource group from within a device group
# switchover.
# In the run_reserve context, a device group switchover is in progress,
# trying to bring the device group online on this node. The device group is
# not completely switched online until the calling script, run_reserve,
# returns. During that process, this script switches the associated SNDR
# resource group using the scswitch command.
# A resource group switchover also triggers a switchover of the device group.
# If the resource group switchover is called in the foreground, before the
# device group has come online, it results in switching the device group
# again, causing a deadlock: the resource group cannot come online until
# the device group is online, and the device group cannot come online until
# this script returns.
# Calling the resource group switch in the background allows the current
# run_reserve script to return immediately, allowing the device group to
# come online.
# If the device group is already online on the node, then the resource group
# switchover does not cause the device group to be switched again.
# Input: Device group dg - $1
# Output: 0 - success
# 1 - either dg not applicable for SNDR or error
# 2 - SNDR mis-configuration
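
# Illustration only: a minimal sketch of the background-call pattern
# described above. The command is started in the background, its pid is
# recorded, and the caller returns at once instead of waiting for it.
# The function name start_in_background is hypothetical, and the pid file
# path is an argument rather than the script's own /var/run/scnws/$dg.pid.

```shell
# start_in_background PIDFILE CMD [ARGS...]
# Start CMD in the background and record its pid in PIDFILE so that
# a later stop action can find and kill it. Does not wait for CMD.
start_in_background()
{
	pidfile=$1
	shift
	"$@" &
	# $! is the pid of the most recently started background command
	echo $! > "$pidfile"
}
```

# Returning without waiting is the point of the pattern: the caller's own
# completion is what lets the background work make progress.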
	-o ! -x /usr/cluster/bin/scha_resourcegroup_get ]
# hard coded rg name from dg
	scha_resourcegroup_get -O rg_description -G $rgname > /dev/null
	if [ $? != 0 ]
# There is no device group configured in cluster for SNDR with this cluster tag
# Check the state of resource group
	-o "$rgstat" = "UNMANAGED" -o "$rgstat" = "ERROR_STOP_FAILED" ]
	`gettext "Improper Remote Mirror resource group state"` "$rgstat"
# Check whether resources are of proper type and they are enabled
	rs_list=`scha_resourcegroup_get -O resource_list -G $rgname`
	`gettext "No resources in Remote Mirror resource group <$rgname>"`
	rs_type=`scha_resource_get -O type -R $rs -G $rgname | cut -d':' -f1`
	rs_enb=`scha_resource_get -O ON_OFF_SWITCH -R $rs -G $rgname`
	count_LogicalHostname=$(($count_LogicalHostname + 1))
	rs_enb=`scha_resource_get -O ON_OFF_SWITCH -R $rs -G $rgname`
	count_HAStoragePlus=$(($count_HAStoragePlus + 1))
	-t "NWS.[$ARGV0]" `gettext "Missing Enabled Logical Host in resource group <$rgname> for Remote Mirror"`
	-t "NWS.[$ARGV0]" `gettext "Too Many Enabled Logical Host in resource group <$rgname> for Remote Mirror"`
	-t "NWS.[$ARGV0]" `gettext "Missing Enabled HAStoragePlus in resource group <$rgname> for Remote Mirror"`
	-t "NWS.[$ARGV0]" `gettext "Too Many Enabled HAStoragePlus in resource group <$rgname> for Remote Mirror"`
# Invoke switchfunc to switch the resource group.
	-t "NWS.[$ARGV0]" "usage: $ARGV0 { start | stop } diskgroup"

# Input: arg1) $NWS_START_DIR - location of NWS scripts
# arg2) start / stop
# arg3) device group - $2
# arg4) sndr_ena / sndr_dis
# Output: Nothing. Log error if seen
	# process scripts in the directories in lexical order
	if [ -s "$f" ] && [ "$f" != "$RDC" ]
	# not reached
	logger -p ${SYSLOG_FACILITY}.notice -t "NWS.[$ARGV0]" "starting: $ARGV0 $*"
	-t "NWS.[$ARGV0]" "**FATAL ERROR** Remote Mirror is mis-configured and DISABLED for devicegroup <$2>"
	# Disable SNDR
	logger -p ${SYSLOG_FACILITY}.notice -t "NWS.[$ARGV0]" "stopping: $ARGV0 $*"
	# not reached
logger -p ${SYSLOG_FACILITY}.notice -t "NWS.[$ARGV0]" "completed: $ARGV0 $*"