Remove a Node
How to remove a cluster node
In this example we will remove a host from the cluster. At any point we can run the "scconf" command to see what still needs to be done before the node can finally be removed. The example uses a three-node cluster; the steps are similar in a two-node cluster.
# scconf -r -h <nodename>
scconf: Failed to remove node (nodename) - node is still cabled or otherwise in use.
scconf: The node is still cabled.
scconf: The node is still in use by resource group "nfs-rg".
scconf: The node is still in use by resource group "pt1-db-rg".
scconf: The node is still in use by resource group "pt1-ci-rg".
scconf: The node is still in use by resource group "pt1-ai30-rg".
scconf: The node is still in use by resource group "pi1-db-rg".
scconf: The node is still in use by device group "ora1-ds".
scconf: The node is still in use by device group "dsk/d31".
scconf: The node is still in use by device group "dsk/d30".
scconf: The node is still in use by device group "dsk/d29".
scconf: The node is still in use by device group "dsk/d28".
scconf: The node is still in use by device group "dsk/d27".
scconf: The node is still in use by device group "dsk/d26".
scconf: The node is still in use by device group "dsk/d25".
scconf: The node is still in use by device group "dsk/d24".
scconf: The node is still in use by device group "dsk/d23".
scconf: The node is still in use by device group "dsk/d22".
scconf: The node is still in use by device group "dsk/d21".
scconf: The node is still in use by device group "dsk/d20".
scconf: The node is still in use by device group "dsk/d19".
scconf: The node is still in use by device group "dsk/d18".
scconf: The node is still in use by device group "dsk/d17".
scconf: The node is still in use by device group "dsk/d16".
scconf: The node is still in use by device group "dsk/d15".
scconf: The node is still in use by device group "dsk/d14".
scconf: The node is still in use by device group "ora-ds".
scconf: The node is still in use by device group "sap-ds".
scconf: The node is still in use by device group "dsk/d8".
scconf: The node is still in use by device group "dsk/d7".
scconf: The node is still in use by device group "dsk/d6".
scconf: The node is still in use by quorum device "d28".
1) To remove all services running on the host (the node to be removed), evacuate all resource groups and device groups from it:
# scswitch -S -h <nodename>
2) To see if the switchover worked, look at the web GUI or run:
# scstat -g
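As a quick sanity check on the command line (a sketch; node3 is the node being removed in the later steps), filter the status output for that node and make sure it is no longer listed as online for any resource group or as primary of any device group:

# scstat -g | grep node3
# scstat -D | grep node3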
3) Remove the node from the configured cluster services. First, list the resource groups that still include the node:
# scrgadm -pv | grep -i nodelist.*<nodename>
(nfs-rg) Res Group Nodelist:       node1 node2 node3
(pt1-db-rg) Res Group Nodelist:    node1 node2 node3
(pt1-ci-rg) Res Group Nodelist:    node1 node2 node3
(pt1-ai30-rg) Res Group Nodelist:  node1 node2 node3
(pi1-db-rg) Res Group Nodelist:    node1 node2 node3
#
Redefine the possible primaries for each group:
# scrgadm -c -g nfs-rg -h node1,node2
# scrgadm -c -g pt1-db-rg -h node1,node2
# scrgadm -c -g pt1-ci-rg -h node1,node2
# scrgadm -c -g pt1-ai30-rg -h node1,node2
# scrgadm -c -g pi1-db-rg -h node1,node2
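The five calls above can also be wrapped into a small loop, in the same style as the device group loop used further down in this article:

# for rg in nfs-rg pt1-db-rg pt1-ci-rg pt1-ai30-rg pi1-db-rg; \
> do scrgadm -c -g $rg -h node1,node2; \
> done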
4) If metasets are used, remove the host from the disksets. This step is not necessary with ZFS pools:
# metaset -s sap-ds -d -h node3
# metaset -s ora-ds -d -h node3
Then remove the now-unnecessary mediator hosts (mediators are only needed in a dual-string storage configuration):
# metaset -s sap-ds -d -m node1 node2
# metaset -s ora-ds -d -m node1 node2
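To verify, print both sets again and check that node3 has disappeared from the host list and that no mediator hosts remain:

# metaset -s sap-ds
# metaset -s ora-ds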
5) Remove the node from the remaining IPMP configurations of each logical-host resource (LH-res):
# scrgadm -pvv | grep ":NetIfList) Res property value"
(nfs-rg:lh6:NetIfList) Res property value: ipmpA@1 ipmpB@2
(pt1-db-rg:lh1:NetIfList) Res property value: ipmpC@1 ipmpD@2 ipmpE@3
(pt1-ci-rg:lh2:NetIfList) Res property value: ipmpA@1 ipmpB@2 ipmpF@3
(pt1-ci-rg:lh3:NetIfList) Res property value: ipmpC@1 ipmpD@2 ipmpE@3
(pt1-ai30-rg:lh4:NetIfList) Res property value: ipmpA@1 ipmpB@2 ipmpF@3
# scrgadm -c -j lh3 -x netiflist=ipmpC@1,ipmpD@2
# scrgadm -c -j lh2 -x netiflist=ipmpA@1,ipmpB@2
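The "@3" suffix refers to the node ID of the node being removed, so lh1 and lh4 in this output would need the same treatment. A simple way to list every resource whose netiflist still references node ID 3 is to add another grep to the command above:

# scrgadm -pvv | grep ":NetIfList) Res property value" | grep "@3"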
6) Clear the localonly flag on the local disk device groups that appear in the output of both of the following commands (here dsk/d20 and dsk/d21):
# scconf -pvv | grep -i local_disk
(dsk/d21) Device group type: Local_Disk
(dsk/d20) Device group type: Local_Disk
# scconf -r -h node3
scconf: Failed to remove node (node3) - node is still cabled or otherwise in use.
scconf: The node is still cabled.
scconf: The node is still in use by device group "ora1-ds".
scconf: The node is still in use by device group "dsk/d31".
[...]
scconf: The node is still in use by device group "dsk/d6".
scconf: The node is still in use by quorum device "d28".
# scconf -c -D name=dsk/d20,localonly=false
# scconf -c -D name=dsk/d21,localonly=false
7) Remove the node from all the device groups still listed in this output:
# scconf -r -h node3
scconf: Failed to remove node (node3) - node is still cabled or otherwise in use.
scconf: The node is still cabled.
scconf: The node is still in use by device group "ora-ds".
scconf: The node is still in use by device group "dsk/d31".
[...]
scconf: The node is still in use by device group "dsk/d6".
scconf: The node is still in use by quorum device "d28".
# for x in dsk/d30 dsk/d29 dsk/d28 dsk/d27 dsk/d26 dsk/d25 dsk/d24 dsk/d23 dsk/d22 dsk/d21 dsk/d20 dsk/d19 dsk/d18 dsk/d17 dsk/d16 dsk/d15 dsk/d14 dsk/d8 dsk/d7 dsk/d6; \
> do scconf -r -D name=$x,nodelist=node3; \
> done
# metaset -s ora1-ds

Set name = ora1-ds, Set number = 1

Host                Owner
  node1
  node2

Drive  Dbase
d29    Yes
d30    Yes

# metaset -s ora1-ds -a -h node3
Proxy command to: node1
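A quick way to confirm that node3 has disappeared from every device group nodelist is to grep the verbose configuration output (a sketch; the exact label of the nodelist line may differ slightly between releases):

# scconf -pv | grep -i "device group node list" | grep node3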
8) Shut down the node that is to be removed:
# init 0
9) Put the node into maintenance state:
# scconf -c -q node=node3,maintstate
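Putting the node into maintenance state sets its quorum vote count to 0, which can be checked with:

# scstat -q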
10) Display the current transport configuration:
# scstat -W
-- Cluster Transport Paths --
                     Endpoint            Endpoint            Status
                     --------            --------            ------
  Transport path:    node1:ce5           node2:ce5           Path online
  Transport path:    node1:ce0           node2:ce0           Path online
  Transport path:    node1:ce5           node3:bge1          faulted
  Transport path:    node1:ce0           node3:bge0          faulted
  Transport path:    node2:ce0           node3:bge0          faulted
  Transport path:    node2:ce5           node3:bge1          faulted
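Since the node is already halted, only the paths towards node3 should show up as faulted; a quick filter:

# scstat -W | grep node3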
11) Remove the remaining interconnects for the node:
# scsetup
   4) Cluster interconnect
      -> 4) Remove a transport cable
      -> name + adapter (for each interconnect)
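scsetup is only a front end for scconf. If you prefer the command line, the removal can be sketched roughly as follows; the endpoint/adapter syntax is an assumption based on the scconf transport options, and bge0/bge1 are taken from the scstat output above:

# scconf -r -m endpoint=node3:bge0
# scconf -r -m endpoint=node3:bge1
# scconf -r -A name=bge0,node=node3
# scconf -r -A name=bge1,node=node3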
12) Now remove the quorum device (only needed in a two-node cluster):
# scconf -c -q installmode
# scstat -q
# scconf -r -q globaldev=d#
13) Now we can remove the node from the cluster:
# scconf -r -h node3
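To confirm that the node is gone from the cluster configuration, check the node status and the node list (the "Cluster nodes" label is from memory and may differ slightly between releases):

# scstat -n
# scconf -p | grep -i "cluster nodes"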
14) In a three-node cluster you run into a "chicken and egg" problem: to remove a third node that is attached to a quorum device, you must remove that quorum device, yet the cluster also requires a quorum device to be present.
To work around this, remove the quorum device first:
# scconf -r -q globaldev=d28

Add a bogus node:
# scconf -a -h dummy

Remove the third node:
# scconf -r -h node3

Scrub the SCSI reservation:
# /usr/cluster/lib/sc/scsi -c scrub -d /dev/rdsk/c#t#d#s2

Now you can add the quorum device back:
# scconf -a -q globaldev=d28

And remove the dummy node:
# scconf -r -h dummy
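Finally, check that the quorum device is back online and that the vote count matches the two remaining nodes:

# scstat -q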