Disaster recovery of global-active device

Understanding the global-active device (GAD) failure locations and the SIMs issued for GAD failures helps you initiate the correct recovery procedure for each failure.

You can use a volume in an external storage system or a disk in a server as a quorum disk.

Failure locations

The following figure and table describe the locations where GAD failures can occur, the SIMs that are issued, and whether the P-VOL and S-VOL are accessible. All GAD-related SIMs are described in SIMs related to GAD.

Figure 1: Failure locations

| # | Failure location | Detail | SIM reference codes: primary storage system | SIM reference codes: secondary storage system | P-VOL accessible? (note 1) | S-VOL accessible? (note 1) |
|---|---|---|---|---|---|---|
| 1 | Server | - | None (normal) | None (normal) | Yes | Yes |
| 2 | Path between the server and the storage system | Path between the server and the primary storage system | None (normal) | None (normal) | No | Yes (note 2) |
| 3 | Path between the server and the storage system | Path between the server and the secondary storage system | None (normal) | None (normal) | Yes (note 3) | No |
| 4 | GAD pair volume | P-VOL | 3A0xxx, DD1xyy, DFAxxx, DFBxxx, EF9xxx | DD1xyy | No | Yes (note 2) |
| 5 | GAD pair volume | S-VOL | DD1xyy | 3A0xxx, DD1xyy, DFAxxx, DFBxxx, EF9xxx | Yes (note 3) | No |
| 6 | Pool for GAD pair (note 4) | Pool for P-VOL | 622xxx, 62axxx, DD1xyy | DD1xyy | No | Yes (note 2) |
| 7 | Pool for GAD pair (note 4) | Pool for S-VOL | DD1xyy | 622xxx, 62axxx, DD1xyy | Yes (note 3) | No |
| 8 | Path between storage systems | Remote path from the primary to secondary storage system | 2180xx, DD0xyy | DD3xyy | Yes (note 3) | No |
| 9 | Path between storage systems | Remote path from the secondary to primary storage system | DD3xyy | 2180xx, DD0xyy | No | Yes (note 2) |
| 10 | Storage system | Primary storage system | Depends on the failure type (note 5) | 2180xx, DD0xyy, DD3xyy | No | Yes (note 2) |
| 11 | Storage system | Secondary storage system | 2180xx, DD0xyy, DD3xyy | Depends on the failure type (note 5) | Yes (note 3) | No |
| 12 | Quorum disk | Path between the primary storage system and quorum disk | 21D0xx, 21D2xx, DD2xyy, DEF0zz, EF5xyy, EFD000, FF5xyy | DD2xyy | Yes (note 3) | No |
| 13 | Quorum disk | Path between the secondary storage system and quorum disk | DD2xyy | 21D0xx, 21D2xx, DD2xyy, DEF0zz, EF5xyy, EFD000, FF5xyy | Yes (note 3) | No |
| 14 | Quorum disk | Quorum disk | 21D0xx, 21D2xx, DD2xyy, DEF0zz, EF5xyy, EFD000, FF5xyy | 21D0xx, 21D2xx, DD2xyy, DEF0zz, EF5xyy, EFD000, FF5xyy | Yes (note 3) | No |
| 15 | Quorum disk | External storage system | 21D0xx, 21D2xx, DD2xyy, DEF0zz, EF5xyy, EFD000, FF5xyy | 21D0xx, 21D2xx, DD2xyy, DEF0zz, EF5xyy, EFD000, FF5xyy | Yes (note 3) | No |

Notes:

  1. Pairs are not suspended and do not become inaccessible for:
    • Failure in hardware used for redundancy in the storage system, such as drives, cache, front-end director (CHB), back-end director (BED), and MPU
    • Failure in redundant physical paths
  2. The volume is not accessible if a failure occurs while the S-VOL pair status is COPY, SSUS, or PSUE.
  3. The volume is not accessible if a failure occurs while the P-VOL pair status is PSUS or PSUE and the I/O mode is BLOCK.
  4. A failure occurs due to a full pool for a GAD pair.
  5. The SIM might not be viewable, depending on the failure (for example, all cache failure, all MP failure, storage system failure).

SIMs related to GAD

The following table shows SIMs related to global-active device operations. All SIMs in the following table are reported to the service processor (SVP) of the storage system or recorded in the storage system, depending on your storage system.

| SIM reference code | Description |
|---|---|
| 2180xx | Logical path(s) on the remote copy connections was logically blocked (due to an error condition) |
| 21D0xx | External storage system connection path blocking |
| 21D2xx | Threshold over by external storage system connection path response time-out |
| 3A0xyy | LDEV blockade (effect of microcode error) |
| 622xxx | The DP POOL FULL |
| 62axxx | Actual DP pool use rate reaches upper limit |
| DD0xyy | GAD for this volume was suspended (due to an unrecoverable failure on the remote copy connections) |
| DD1xyy | GAD for this volume was suspended (due to a failure on the volume) |
| DD2xyy | GAD for this volume was suspended (due to an internal error condition detected) |
| DD3xyy | Status of the P-VOL was not consistent with the S-VOL |
| DEE0zz | Quorum disk restore |
| DEF0xx | Quorum disk blocked |
| DFAxxx | LDEV blockade (drive path: boundary 0/effect of drive port blockade) |
| DFBxxx | LDEV blockade (drive path: boundary 1/effect of drive port blockade) |
| EF5xyy | Abnormal end of write processing in external storage system |
| EF9xxx | LDEV blockade (effect of drive blockade) |
| EFD000 | External storage system connection device blockade |
| FF5xyy | Abnormal end of read processing in external storage system |
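
When triaging a reported SIM, it can help to map the concrete reference code back to the pattern and description listed in this section. The following Python sketch is one possible way to do that. It assumes, as an interpretation that this document does not state explicitly, that the lowercase letters x, y, and z in a pattern each stand for one variable hexadecimal digit. The function name describe_sim is hypothetical and not part of any Hitachi tool.

```python
import re

# Patterns and descriptions copied from the SIM list in this section.
SIM_DESCRIPTIONS = {
    "2180xx": "Logical path(s) on the remote copy connections was logically blocked (due to an error condition)",
    "21D0xx": "External storage system connection path blocking",
    "21D2xx": "Threshold over by external storage system connection path response time-out",
    "3A0xyy": "LDEV blockade (effect of microcode error)",
    "622xxx": "The DP POOL FULL",
    "62axxx": "Actual DP pool use rate reaches upper limit",
    "DD0xyy": "GAD for this volume was suspended (due to an unrecoverable failure on the remote copy connections)",
    "DD1xyy": "GAD for this volume was suspended (due to a failure on the volume)",
    "DD2xyy": "GAD for this volume was suspended (due to an internal error condition detected)",
    "DD3xyy": "Status of the P-VOL was not consistent with the S-VOL",
    "DEE0zz": "Quorum disk restore",
    "DEF0xx": "Quorum disk blocked",
    "DFAxxx": "LDEV blockade (drive path: boundary 0/effect of drive port blockade)",
    "DFBxxx": "LDEV blockade (drive path: boundary 1/effect of drive port blockade)",
    "EF5xyy": "Abnormal end of write processing in external storage system",
    "EF9xxx": "LDEV blockade (effect of drive blockade)",
    "EFD000": "External storage system connection device blockade",
    "FF5xyy": "Abnormal end of read processing in external storage system",
}


def _to_regex(pattern):
    # Assumption: x, y, z are wildcards for a single hex digit each;
    # every other character (including the "a" in 62axxx) is literal.
    body = "".join(
        "[0-9A-F]" if ch in "xyz" else re.escape(ch.upper()) for ch in pattern
    )
    return re.compile(body)


_COMPILED = [(_to_regex(p), p, d) for p, d in SIM_DESCRIPTIONS.items()]


def describe_sim(code):
    """Return (pattern, description) for a concrete SIM code, or None."""
    normalized = code.strip().upper()
    for regex, pattern, description in _COMPILED:
        if regex.fullmatch(normalized):
            return pattern, description
    return None
```

For example, describe_sim("DD1234") returns the DD1xyy entry, and a code that matches no pattern returns None, which signals that the SIM is not one of the GAD-related codes listed here.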

Pair condition before failure

The pair status and I/O mode of a GAD pair, the accessibility of the server, and the storage location of the latest data depend on the status before a failure occurs.

The following table shows pair status and I/O mode, the volumes accessible from the server, and the location of the latest data before a failure occurs. You can compare this information with the changes that take place after a failure occurs, as described in the following topics.

| P-VOL pair status (I/O mode) | S-VOL pair status (I/O mode) | P-VOL accessible from the server | S-VOL accessible from the server | Volume with latest data |
|---|---|---|---|---|
| PAIR (Mirror (RL)) | PAIR (Mirror (RL)) | OK | OK | Both P-VOL and S-VOL |
| PAIR (Mirror (RL)) | PAIR (Block) | OK | NG | Both P-VOL and S-VOL |
| COPY (Mirror (RL)) | COPY (Block) | OK | NG | P-VOL |
| PSUS/PSUE (Local) | SSUS/PSUE (Block) | OK | NG | P-VOL |
| PSUS/PSUE (Block) | SSWS (Local) | NG | OK | S-VOL |
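
The pre-failure conditions listed in this section can be expressed as a small lookup, which is convenient when a script needs to decide which volume holds the latest data. The following Python sketch encodes only the five combinations given here. The function name and the tuple layout are illustrative choices, not part of any Hitachi interface.

```python
# Each row: (P-VOL statuses, P-VOL I/O mode, S-VOL statuses, S-VOL I/O mode,
#            P-VOL accessible, S-VOL accessible, volume with latest data)
PAIR_CONDITIONS = [
    ({"PAIR"}, "Mirror (RL)", {"PAIR"}, "Mirror (RL)", True, True, "Both P-VOL and S-VOL"),
    ({"PAIR"}, "Mirror (RL)", {"PAIR"}, "Block", True, False, "Both P-VOL and S-VOL"),
    ({"COPY"}, "Mirror (RL)", {"COPY"}, "Block", True, False, "P-VOL"),
    ({"PSUS", "PSUE"}, "Local", {"SSUS", "PSUE"}, "Block", True, False, "P-VOL"),
    ({"PSUS", "PSUE"}, "Block", {"SSWS"}, "Local", False, True, "S-VOL"),
]


def pair_condition(p_status, p_mode, s_status, s_mode):
    """Return (p_vol_accessible, s_vol_accessible, latest_data) or None.

    None means the combination is not one of the listed conditions.
    """
    for p_set, p_io, s_set, s_io, p_ok, s_ok, latest in PAIR_CONDITIONS:
        if p_status in p_set and p_mode == p_io and s_status in s_set and s_mode == s_io:
            return p_ok, s_ok, latest
    return None
```

A call such as pair_condition("PSUE", "Block", "SSWS", "Local") reports that only the S-VOL is accessible and that the S-VOL holds the latest data.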

Pair condition and recovery: server failures

The following table shows transitions for pair status and I/O mode, the volumes that are accessible from the server, and the location of the latest data when a server failure occurs.

| Before failure: P-VOL pair status (I/O mode) | Before failure: S-VOL pair status (I/O mode) | After failure: P-VOL pair status (I/O mode) | After failure: S-VOL pair status (I/O mode) | P-VOL accessible from the server* | S-VOL accessible from the server* | Volume with latest data |
|---|---|---|---|---|---|---|
| PAIR (Mirror (RL)) | PAIR (Mirror (RL)) | PAIR (Mirror (RL)) | PAIR (Mirror (RL)) | OK | OK | Both P-VOL and S-VOL |
| PAIR (Mirror (RL)) | PAIR (Block) | PAIR (Mirror (RL)) | PAIR (Block) | OK | NG | Both P-VOL and S-VOL |
| COPY (Mirror (RL)) | COPY (Block) | COPY (Mirror (RL)) | COPY (Block) | OK | NG | P-VOL |
| PSUS/PSUE (Local) | SSUS/PSUE (Block) | PSUS/PSUE (Local) | SSUS/PSUE (Block) | OK | NG | P-VOL |
| PSUS/PSUE (Block) | SSWS (Local) | PSUS/PSUE (Block) | SSWS (Local) | NG | OK | S-VOL |

* If failures occur in all servers that access the P-VOL or S-VOL, then you cannot access either volume.

SIMs
  • Primary storage system: None
  • Secondary storage system: None

Procedure

  1. Recover the server.

  2. Recover the path from the server to the pair volumes.

Pair condition and recovery: path failure between the server and storage system

If a server cannot access a pair volume whose status is PAIR, and no SIM has been issued, a failure might have occurred between the server and the storage system. The following topics provide procedures for recovering the physical path between the server and the storage systems.

The following table shows transitions for pair status and I/O mode, the volumes that are accessible from the server, and the location of the latest data when you can no longer use a physical path between the server and a storage system.

| Before failure: P-VOL pair status (I/O mode) | Before failure: S-VOL pair status (I/O mode) | After failure: P-VOL pair status (I/O mode) | After failure: S-VOL pair status (I/O mode) | P-VOL accessible from the server* | S-VOL accessible from the server* | Volume with latest data |
|---|---|---|---|---|---|---|
| PAIR (Mirror (RL)) | PAIR (Mirror (RL)) | PAIR (Mirror (RL)) | PAIR (Mirror (RL)) | OK | OK | Both P-VOL and S-VOL |
| PAIR (Mirror (RL)) | PAIR (Block) | PAIR (Mirror (RL)) | PAIR (Block) | OK | NG | Both P-VOL and S-VOL |
| COPY (Mirror (RL)) | COPY (Block) | COPY (Mirror (RL)) | COPY (Block) | OK | NG | P-VOL |
| PSUS/PSUE (Local) | SSUS/PSUE (Block) | PSUS/PSUE (Local) | SSUS/PSUE (Block) | OK | NG | P-VOL |
| PSUS/PSUE (Block) | SSWS (Local) | PSUS/PSUE (Block) | SSWS (Local) | NG | OK | S-VOL |

* If failures occur in all servers that access the P-VOL or S-VOL, then you cannot access either volume.

SIMs
  • Primary storage system: None
  • Secondary storage system: None

Procedure

  1. Recover the path between the server and the storage system.

  2. Recover the path from the server to the pair volume.

Recovering from a path failure: server to primary storage system

The following figure shows the failure area and recovery when the path between the server and the primary storage system fails.


Procedure

  1. Recover the path.

    1. Using the alternate path software and other tools, identify the path that cannot be accessed from the server.

    2. Using the SAN management software, identify the failure location; for example, a host bus adapter, FC cable, switch, or other location.

    3. Remove the cause of failure and recover the path.

  2. Using the alternate path software, resume I/O from the server to the recovered path (I/O might resume automatically).

Recovering from a path failure: server to secondary storage system

The following figure shows the failure area and recovery when the path between the server and secondary storage system fails.


Procedure

  1. Recover the path.

    1. Using the alternate path software or other tools, identify the path that cannot be accessed from the server.

    2. Using SAN management software, identify the failure location; for example, a host bus adapter, FC cable, switch, or other location.

    3. Remove the cause of failure and recover the path.

  2. Using the alternate path software, resume I/O from the server to the recovered path (I/O might resume automatically).

Pair condition and recovery: P-VOL failure (LDEV blockade)

The following table shows transitions for pair status and I/O mode, the volumes that are accessible from the server, and the location of the latest data when you can no longer use the P-VOL due to LDEV blockade.

| Before failure: P-VOL pair status (I/O mode) | Before failure: S-VOL pair status (I/O mode) | After failure: P-VOL pair status (I/O mode) | After failure: S-VOL pair status (I/O mode) | P-VOL accessible from the server (note 1) | S-VOL accessible from the server (note 1) | Volume with latest data |
|---|---|---|---|---|---|---|
| PAIR (Mirror (RL)) | PAIR (Mirror (RL)) | PSUE (Block) | SSWS (Local) | NG | OK | S-VOL |
| PAIR (Mirror (RL)) | PAIR (Block) | PSUE (Block) | SSWS (Local) | NG | OK | S-VOL |
| COPY (Mirror (RL)) | COPY (Block) | PSUE (Local) | PSUE (Block) | NG | NG | None (note 1) |
| PSUS/PSUE (Local) | SSUS/PSUE (Block) | PSUS/PSUE (Local) | SSUS/PSUE (Block) | NG | NG | None (note 2) |
| PSUS/PSUE (Block) | SSWS (Local) | PSUS/PSUE (Block) | SSWS (Local) | NG | OK | S-VOL |

Notes:

  1. Recover the data from ShadowImage, Thin Image, or other backup data.
  2. Recover the data using the S-VOL data that is not the latest, ShadowImage, Thin Image, or other backup data.
SIMs
  • Primary storage system: 3A0xyy, DD1xyy, DFAxxx, DFBxxx, EF9xxx
  • Secondary storage system: DD1xyy

Procedure

  1. Recover the P-VOL.

  2. Re-create the pair.

Recovering the P-VOL (DP-VOL) (pair status: PAIR)

The following figure shows the failure area and recovery when a pair is suspended due to a P-VOL failure and the P-VOL is a DP-VOL.

Note: In this example, no consistency group is specified.

Procedure

  1. Delete the alternate path (logical path) to the volume that cannot be accessed from the server.

    1. Using the alternate path software, identify the volume that cannot be accessed.

    2. Confirm whether the volume (P-VOL) is blocked, and check the pool ID (B_POOLID) of the pool with which the P-VOL is associated.

      raidcom get ldev -ldev_id 0x2222 -IH0
      (snip)
      B_POOLID : 0
      (snip)
      STS : BLK
      (snip)
    3. Display the status of the volumes configuring the pool (pool volume) to identify the blocked volume.

      raidcom get ldev -ldev_list pool -pool_id 0 -IH0
      (snip)
      LDEV : 16384
      (snip) 
      STS : BLK 
      (snip) 

      For a blocked volume, BLK is indicated in the STS column.

    4. Using the alternate path software, delete the alternate path to the volume that cannot be accessed from the server.

    Go to the next step even if the alternate path cannot be deleted.

  2. Delete the pair.

    1. From the secondary storage system, delete the pair specifying the actual LDEV ID of the S-VOL.

      pairsplit -g oraHA -R -d dev1 -IH1
      Note: To delete the pair specifying the S-VOL, use the -R option of the pairsplit command. Specify the actual LDEV ID (device name) of the S-VOL in the -d option.
    2. Confirm that the pair is deleted.

      pairdisplay -g oraHA -fxce -IH1
      Group   PairVol(L/R) (Port#,TID, LU), Seq#,   LDEV#.P/S,Status,Fence,   %,    P-LDEV# M   CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)      (CL1-C-1, 0,   0)522222  4444.SMPL  ---- ------,  ----- -----   -   -   -   -  -        -      -       -/-
      oraHA   dev1(R)      (CL1-A-0, 0,   0)511111  2222.SMPL  ---- ------,  ----- -----   -   -   -   -  -        -      -       -/-
  3. Remove the failure.

    The following example shows recovery from a pool-volume failure.

    1. Recover a pool volume that configures the P-VOL (DP-VOL).

    2. Display the status of the pool volumes to confirm that the pool volume has been recovered.

      raidcom get ldev -ldev_list pool -pool_id 0 -IH0
       (snip) 
      LDEV : 16384 
      (snip) 
      STS : NML 
      (snip)

      For a normal volume, NML is indicated in the STS column.

  4. If the volume cannot be recovered, follow the procedure below to re-create the P-VOL:

    1. At the primary storage system, delete the LU path to the P-VOL.

    2. Delete the P-VOL.

    3. Create a new volume.

    4. Set an LU path to the new volume.

  5. Confirm the quorum disk status. If the quorum disk is blocked, recover from the blockade. If you do not set a volume for the quorum disk, you can skip this step.

  6. Re-create the pair.

    1. If you created a volume in step 4, set the GAD reserve attribute to the created volume.

      raidcom map resource -ldev_id 0x2222 -virtual_ldev_id reserve -IH0
    2. From the secondary storage system, create the pair specifying the S-VOL's actual LDEV ID.

      paircreate -g oraHA -f never -vl -jq 0 -d dev1 -IH1
      Note: To create the pair specifying the S-VOL, specify the actual LDEV ID (device name) of the S-VOL in the -d option of the paircreate command.

      The volume of the primary storage system changes to an S-VOL, and the volume of the secondary storage system changes to a P-VOL.

    3. Confirm that the P-VOL and S-VOL pair statuses change to PAIR (Mirror (RL)).

      pairdisplay -g oraHA -fxce -IH0
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-A-0, 0,   0)511111  2222.P-VOL PAIR 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/M
      oraHA   dev1(R)     (CL1-C-1, 0,   0)522222  4444.S-VOL PAIR 
      NEVER ,  100  2222 -   -   0  -  -            -       - L/M
      pairdisplay -g oraHA -fxce -IH1
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-C-1, 0,   0)522222  4444.S-VOL PAIR 
      NEVER ,  100  2222 -   -   0  -  -            -       - L/M
      oraHA   dev1(R)     (CL1-A-0, 0,   0)511111  2222.P-VOL PAIR 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/M
  7. Using the alternate path software, add an alternate path from the server to the S-VOL (P-VOL before the failure).

  8. Using the alternate path software, resume I/O from the server to the S-VOL (P-VOL before the failure).

    Note: I/O from the server might resume automatically.
  9. Reverse the P-VOL and the S-VOL if necessary.
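
The pool-volume checks in the procedure above rely on reading the STS field in the raidcom get ldev output. The following Python sketch shows one way to pick out blocked volumes from such output. It assumes that each volume in the listing starts with an "LDEV :" line and carries an "STS :" line, as in the abbreviated excerpts above. The function name is hypothetical, and LDEV 16385 in the sample text is invented for illustration.

```python
def blocked_ldevs(raidcom_output):
    """Return the LDEV numbers whose STS field is BLK.

    Assumes "KEY : VALUE" lines, with an "LDEV :" line opening each volume.
    """
    blocked = []
    current_ldev = None
    for line in raidcom_output.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "KEY : VALUE" line, for example "(snip)"
        key, value = key.strip(), value.strip()
        if key == "LDEV":
            current_ldev = value
        elif key == "STS" and value == "BLK" and current_ldev is not None:
            blocked.append(current_ldev)
    return blocked


# Abbreviated, partly hypothetical sample in the style of the excerpts above.
sample = """\
LDEV : 16384
STS : BLK
LDEV : 16385
STS : NML
"""
print(blocked_ldevs(sample))  # ['16384']
```

After the pool volume is recovered, the same function returns an empty list for the listing, which corresponds to every STS field showing NML.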

Recovering the P-VOL (other than DP-VOL) (pair status: PAIR)

The following figure shows the failure area and recovery when a pair is suspended due to a P-VOL failure and the P-VOL is not a DP-VOL.

For details about storage system support (models, microcode) for volumes other than DP-VOLs, see Requirements and restrictions.

Note: In this example, no consistency group is specified.

Procedure

  1. Delete the alternate path (logical path) to the volume that cannot be accessed from the server.

    1. Using the alternate path software, identify the volume that cannot be accessed.

    2. Confirm whether the volume (P-VOL) is blocked.

      raidcom get ldev -ldev_id 0x2222 -IH0
      (snip)
      STS : BLK
      (snip)

      For a blocked volume, BLK is indicated in the STS column.

    3. Using the alternate path software, delete the alternate path to the volume that cannot be accessed from the server.

    Go to the next step even if the alternate path cannot be deleted.

  2. Delete the pair.

    1. From the secondary storage system, delete the pair specifying the actual LDEV ID of the S-VOL.

      pairsplit -g oraHA -R -d dev1 -IH1
      Note: To delete the pair specifying the S-VOL, use the -R option of the pairsplit command. Specify the actual LDEV ID (device name) of the S-VOL in the -d option.
    2. Confirm that the pair is deleted.

      pairdisplay -g oraHA -fxce -IH1
      Group PairVol(L/R) (Port#,TID, LU), Seq#, LDEV#.P/S,Status,Fence, %, P-LDEV# M CTG JID AP EM E-Seq# E-LDEV# R/W
      oraHA dev1(L) (CL1-C-1, 0, 0)522222 4444.SMPL ---- ------, ----- ----- - - - - - - - -/-
      oraHA dev1(R) (CL1-A-0, 0, 0)511111 2222.SMPL ---- ------, ----- ----- - - - - - - - -/-
  3. Remove the failure. The following example shows recovery from a volume failure.

    1. Recover the P-VOL.

    2. Display the status of the P-VOL to confirm that the P-VOL has been recovered.

      raidcom get ldev -ldev_id 0x2222 -IH0
      (snip)
      STS : NML
      (snip)

      For a normal volume, NML is indicated in the STS column.

  4. If the volume cannot be recovered, follow the procedure below to re-create the P-VOL:

    1. At the primary storage system, delete the LU path to the P-VOL.

    2. Delete the P-VOL.

    3. Create a new volume.

    4. Set an LU path to the new volume.

  5. Confirm the quorum disk status. If the quorum disk is blocked, recover from the blockade. If you do not set a volume for the quorum disk, you can skip this step.

  6. Re-create the pair.

    1. If you created a volume in step 4, set the GAD reserve attribute to the created volume.

      raidcom map resource -ldev_id 0x2222 -virtual_ldev_id reserve -IH0
    2. From the secondary storage system, create the pair specifying the S-VOL's actual LDEV ID.

      paircreate -g oraHA -f never -vl -jq 0 -d dev1 -IH1
      Note: To create the pair specifying the S-VOL, specify the actual LDEV ID (device name) of the S-VOL in the -d option of the paircreate command. The volume in the primary storage system changes to an S-VOL, and the volume in the secondary storage system changes to a P-VOL.
    3. Confirm that the P-VOL and S-VOL pair statuses have changed to PAIR (Mirror (RL)).

      pairdisplay -g oraHA -fxce -IH0
      Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status, Fence, 
      %,P-LDEV# M CTG JID AP EM E-Seq# E-LDEV# R/W
      oraHA dev1(L) (CL1-A-0, 0, 0)511111 2222.P-VOL PAIR NEVER , 100 4444 - - 0 - - - - L/M
      oraHA dev1(R) (CL1-C-1, 0, 0)522222 4444.S-VOL PAIR NEVER , 100 2222 - - 0 - - - - L/M
      pairdisplay -g oraHA -fxce -IH1
      Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status, Fence, 
      %,P-LDEV# M CTG JID AP EM E-Seq# E-LDEV# R/W
      oraHA dev1(L) (CL1-C-1, 0, 0)522222 4444.S-VOL PAIR NEVER , 100 2222 - - 0 - - - - L/M
      oraHA dev1(R) (CL1-A-0, 0, 0)511111 2222.P-VOL PAIR NEVER , 100 4444 - - 0 - - - - L/M
  7. Using the alternate path software, add an alternate path from the server to the S-VOL (P-VOL before the failure).

  8. Using the alternate path software, resume I/O from the server to the S-VOL (P-VOL before the failure).

    Note: I/O from the server might resume automatically.
  9. Reverse the P-VOL and the S-VOL if necessary.
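
The two confirmation steps in this procedure (pair deleted, pair back in PAIR status) both come down to reading the LDEV#.P/S and Status columns of the pairdisplay output. The following Python sketch extracts those columns with a regular expression. It is written against the sample output shown in this document only, so the pattern and the function names are assumptions rather than a documented output grammar.

```python
import re

# Matches tokens such as "4444.SMPL  ----" or "2222.P-VOL PAIR".
_ROW = re.compile(r"\b([0-9A-Fa-f]+)\.(P-VOL|S-VOL|SMPL)\s+(\S+)")


def pair_rows(pairdisplay_output):
    """Return a list of (ldev, role, status) tuples found in the output."""
    return _ROW.findall(pairdisplay_output)


def pair_is_deleted(pairdisplay_output):
    """True when every listed volume is shown as SMPL."""
    rows = pair_rows(pairdisplay_output)
    return bool(rows) and all(role == "SMPL" for _, role, _ in rows)


def pair_status_is_pair(pairdisplay_output):
    """True when a P-VOL and an S-VOL are listed and both show PAIR."""
    rows = pair_rows(pairdisplay_output)
    roles = {role for _, role, _ in rows}
    return roles == {"P-VOL", "S-VOL"} and all(s == "PAIR" for _, _, s in rows)


# Sample rows copied from the output examples in this procedure.
deleted = """\
oraHA dev1(L) (CL1-C-1, 0, 0)522222 4444.SMPL ---- ------, ----- ----- - - - - - - - -/-
oraHA dev1(R) (CL1-A-0, 0, 0)511111 2222.SMPL ---- ------, ----- ----- - - - - - - - -/-
"""
print(pair_is_deleted(deleted))  # True
```

Note that pair_status_is_pair checks only the status column. The I/O mode shown in the R/W column still needs to be checked separately to confirm Mirror (RL).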

Pair condition and recovery: S-VOL failure (LDEV blockade)

The following table shows the transitions for pair status and I/O mode, the volumes that are accessible from the server, and the location of the latest data when you can no longer use the S-VOL due to LDEV blockade.

| Before failure: P-VOL pair status (I/O mode) | Before failure: S-VOL pair status (I/O mode) | After failure: P-VOL pair status (I/O mode) | After failure: S-VOL pair status (I/O mode) | P-VOL accessible from the server | S-VOL accessible from the server | Volume with latest data |
|---|---|---|---|---|---|---|
| PAIR (Mirror (RL)) | PAIR (Mirror (RL)) | PSUE (Local) | PSUE (Block) | OK | NG | P-VOL |
| PAIR (Mirror (RL)) | PAIR (Block) | PSUE (Local) | PSUE (Block) | OK | NG | P-VOL |
| COPY (Mirror (RL)) | COPY (Block) | PSUE (Local) | PSUE (Block) | OK | NG | P-VOL |
| PSUS/PSUE (Local) | SSUS/PSUE (Block) | PSUS/PSUE (Local) | SSUS/PSUE (Block) | OK | NG | P-VOL |
| PSUS/PSUE (Block) | SSWS (Local) | PSUS/PSUE (Block) | SSWS (Local) | NG | NG | None* |

* Recover data using the P-VOL data that is not the latest, ShadowImage, Thin Image, or other backup data.

SIMs
  • Primary storage system: DD1xyy
  • Secondary storage system: 3A0xyy, DD1xyy, DFAxxx, DFBxxx, EF9xxx

Procedure

  1. Recover the S-VOL.

  2. Re-create the pair.

Recovering the S-VOL (DP-VOL) (pair status: PAIR)

The following figure shows the failure area and recovery when a pair is suspended due to an S-VOL failure and the S-VOL is a DP-VOL.

Note: In this example, no consistency group is specified.

Procedure

  1. Delete the alternate path (logical path) to the volume that cannot be accessed from the server.

    1. Using the alternate path software, identify the volume that cannot be accessed.

    2. Confirm whether the volume (S-VOL) is blocked, and the pool ID (B_POOLID) of the pool to which the S-VOL is associated.

      raidcom get ldev -ldev_id 0x4444 -IH1 
      (snip) 
      B_POOLID : 0 
      (snip) 
      STS : BLK 
      (snip)
    3. Display the status of the volumes configuring the pool (pool volume) to identify the blocked volume.

      raidcom get ldev -ldev_list pool -pool_id 0 -IH1 
      (snip) 
      LDEV : 16384 
      (snip) 
      STS : BLK 
      (snip)

      For the blocked volume, BLK is indicated in the STS column.

    4. Using the alternate path software, delete the alternate path to the volume.

    Go to the next step even if the alternate path cannot be deleted.

  2. Delete the pair.

    1. From the primary storage system, delete the pair specifying the P-VOL's actual LDEV ID.

      pairsplit -g oraHA -S -d dev1 -IH0
      Note: To delete the pair specifying the P-VOL, use the -S option of the pairsplit command. Specify the actual LDEV ID (device name) of the P-VOL in the -d option.
    2. Confirm that the pair is deleted.

      pairdisplay -g oraHA -fxce -IH0
      Group   PairVol(L/R) (Port#,TID, LU), Seq#,   LDEV#.P/S,Status,
      Fence,   %,    P-LDEV# M   CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)      (CL1-A-0, 0,   0)511111  2222.SMPL  ---- ------,  
      ----- -----   -   -   -   -  -        -      -       -/-
      oraHA   dev1(R)      (CL1-C-1, 0,   0)522222  4444.SMPL  ---- ------,  
      ----- -----   -   -   -   -  -        -      -       -/- 
  3. Remove the failure.

    The following example shows recovery from a pool-volume failure.

    1. Recover a pool volume that configures the S-VOL (DP-VOL).

    2. Display the status of the pool volumes to confirm that the pool volume has been recovered.

      raidcom get ldev -ldev_list pool -pool_id 0 -IH1 
      (snip) 
      LDEV : 16384 
      (snip) 
      STS : NML 
      (snip)

      For a normal volume, NML is indicated in the STS column.

  4. If the volume cannot be recovered, follow the procedure below to create the S-VOL again:

    1. At the secondary storage system, delete the LU path to the S-VOL.

    2. Delete the S-VOL.

    3. Create a new volume.

    4. Set an LU path to the new volume.

  5. Confirm the quorum disk status. If the quorum disk is blocked, recover from the blockade. If you do not set a volume for the quorum disk, you can skip this step.

  6. Re-create the pair.

    1. If you created a volume in step 4, set the GAD reserve attribute to the created volume.

      raidcom map resource -ldev_id 0x4444 -virtual_ldev_id reserve -IH1
    2. From the primary storage system, create the pair specifying the P-VOL's actual LDEV ID.

      paircreate -g oraHA -f never -vl -jq 0 -d dev1 -IH0
      Note: To create the pair specifying the P-VOL, specify the actual LDEV ID (device name) of the P-VOL in the -d option of the paircreate command.
    3. Confirm that the P-VOL and S-VOL pair statuses change to PAIR (Mirror (RL)).

      pairdisplay -g oraHA -fxce -IH0
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-A-0, 0,   0)511111  2222.P-VOL PAIR 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/M
      oraHA   dev1(R)     (CL1-C-1, 0,   0)522222  4444.S-VOL PAIR 
      NEVER ,  100  2222 -   -   0  -  -            -       - L/M
      pairdisplay -g oraHA -fxce -IH1
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-C-1, 0,   0)522222  4444.S-VOL PAIR 
      NEVER ,  100  2222 -   -   0  -  -            -       - L/M
      oraHA   dev1(R)     (CL1-A-0, 0,   0)511111  2222.P-VOL PAIR 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/M
  7. Using the alternate path software, add an alternate path from the server to the S-VOL.

  8. Using the alternate path software, resume I/O from the server to the S-VOL.

    Note: I/O from the server might resume automatically.
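
The P-VOL and S-VOL recovery procedures differ mainly in which side issues the commands and which pairsplit option deletes the pair: -R from the secondary side (-IH1) when the P-VOL failed, and -S from the primary side (-IH0) when the S-VOL failed. The following Python sketch assembles the command lines from the examples in this document for either case. It only builds argument lists and does not run them. The function and parameter names are hypothetical, the defaults (oraHA, dev1, instances 0 and 1) are the example values used above, and the value after -jq is assumed here to be the quorum disk ID.

```python
def recreate_pair_commands(failed_volume, group="oraHA", device="dev1",
                           primary_instance=0, secondary_instance=1,
                           quorum_id=0):
    """Build the delete, verify, and re-create commands for a failed volume.

    failed_volume is "P-VOL" or "S-VOL". The commands are issued from the
    storage system that holds the surviving volume.
    """
    if failed_volume == "P-VOL":
        delete_option, instance = "-R", secondary_instance
    elif failed_volume == "S-VOL":
        delete_option, instance = "-S", primary_instance
    else:
        raise ValueError("failed_volume must be 'P-VOL' or 'S-VOL'")

    inst = f"-IH{instance}"
    return [
        ["pairsplit", "-g", group, delete_option, "-d", device, inst],
        ["pairdisplay", "-g", group, "-fxce", inst],
        # quorum_id: assumed meaning of the -jq value (0 in the examples).
        ["paircreate", "-g", group, "-f", "never", "-vl",
         "-jq", str(quorum_id), "-d", device, inst],
    ]


for command in recreate_pair_commands("S-VOL"):
    print(" ".join(command))
```

Keeping the commands as argument lists makes it straightforward to review them before handing them to an operator or to a runner such as subprocess.run. The failure itself still has to be removed between the delete and re-create steps, as described in the procedures.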

Recovering the S-VOL (other than DP-VOL) (pair status: PAIR)

The following figure shows the failure area and recovery when a pair is suspended due to an S-VOL failure and the S-VOL is not a DP-VOL.

For details about storage system support (models, microcode) for volumes other than DP-VOLs, see Requirements and restrictions.

Note: In this example, no consistency group is specified.

Procedure

  1. Delete the alternate path (logical path) to the volume that cannot be accessed from the server.

    1. Using the alternate path software, identify the volume that cannot be accessed.

    2. Confirm whether the volume (S-VOL) is blocked.

      raidcom get ldev -ldev_id 0x4444 -IH1
      (snip)
      STS : BLK
      (snip)

      For a blocked volume, BLK is indicated in the STS column.

    3. Using the alternate path software, delete the alternate path to the volume that cannot be accessed from the server.

    Go to the next step even if the alternate path cannot be deleted.

  2. Delete the pair.

    1. From the primary storage system, delete the pair specifying the actual LDEV ID of the P-VOL.

      pairsplit -g oraHA -S -d dev1 -IH0
      Note: To delete the pair specifying the P-VOL, use the -S option of the pairsplit command. Specify the actual LDEV ID (device name) of the P-VOL in the -d option.
    2. Confirm that the pair is deleted.

      pairdisplay -g oraHA -fxce -IH0
      Group PairVol(L/R) (Port#,TID, LU), Seq#, LDEV#.P/S,Status,
      Fence, %, P-LDEV# M CTG JID AP EM E-Seq# E-LDEV# R/W
      oraHA dev1(L) (CL1-A-0, 0, 0)511111 2222.SMPL ---- ------, 
      ----- ----- - - - - - - - -/-
      oraHA dev1(R) (CL1-C-1, 0, 0)522222 4444.SMPL ---- ------, 
      ----- ----- - - - - - - - -/-
  3. Remove the failure. The following example shows recovery from a volume failure.

    1. Recover the S-VOL.

    2. Display the status of the S-VOL to confirm that the S-VOL has been recovered.

      raidcom get ldev -ldev_id 0x4444 -IH1
      (snip)
      STS : NML
      (snip)

      For a normal volume, NML is indicated in the STS column.

  4. If the volume cannot be recovered, follow the procedure below to re-create the S-VOL:

    1. At the secondary storage system, delete the LU path to the S-VOL.

    2. Delete the S-VOL.

    3. Create a new volume.

    4. Set an LU path to the new volume.

  5. Confirm the quorum disk status. If the quorum disk is blocked, recover from the blockade. If you do not set a volume for the quorum disk, you can skip this step.

  6. Re-create the pair.

    1. If you created a volume in step 4, set the GAD reserve attribute to the created volume.

      raidcom map resource -ldev_id 0x4444 -virtual_ldev_id reserve -IH1
    2. From the primary storage system, create the pair specifying the P-VOL's actual LDEV ID.

      paircreate -g oraHA -f never -vl -jq 0 -d dev1 -IH0
      Note: To create the pair specifying the P-VOL, specify the actual LDEV ID (device name) of the P-VOL in the -d option of the paircreate command.
    3. Confirm that the P-VOL and S-VOL pair statuses have changed to PAIR (Mirror (RL)).

      pairdisplay -g oraHA -fxce -IH0
      Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status, 
      Fence, %,P-LDEV# M CTG JID AP EM E-Seq# E-LDEV# R/W
      oraHA dev1(L) (CL1-A-0, 0, 0)511111 2222.P-VOL PAIR NEVER , 
      100 4444 - - 0 - - - - L/M
      oraHA dev1(R) (CL1-C-1, 0, 0)522222 4444.S-VOL PAIR NEVER , 
      100 2222 - - 0 - - - - L/M
      pairdisplay -g oraHA -fxce -IH1
      Group PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status, 
      Fence, %,P-LDEV# M CTG JID AP EM E-Seq# E-LDEV# R/W
      oraHA dev1(L) (CL1-C-1, 0, 0)522222 4444.S-VOL PAIR NEVER , 
      100 2222 - - 0 - - - - L/M
      oraHA dev1(R) (CL1-A-0, 0, 0)511111 2222.P-VOL PAIR NEVER , 
      100 4444 - - 0 - - - - L/M
  7. Using the alternate path software, add an alternate path from the server to the S-VOL.

  8. Using the alternate path software, resume I/O from the server to the S-VOL.

    Note: I/O from the server might resume automatically.

Pair condition and recovery: full pool for the P-VOL

When the P-VOL cannot be used due to a full pool, the GAD pair is suspended.

The following table shows the transitions for pair status and I/O mode, the volumes that are accessible from the server, and the location of the latest data when you can no longer use the P-VOL due to a full pool.

| Before failure: P-VOL pair status (I/O mode) | Before failure: S-VOL pair status (I/O mode) | After failure: P-VOL pair status (I/O mode) | After failure: S-VOL pair status (I/O mode) | P-VOL accessible from the server | S-VOL accessible from the server | Volume with latest data |
|---|---|---|---|---|---|---|
| PAIR (Mirror (RL)) | PAIR (Mirror (RL)) | PSUE (Block) | SSWS (Local) | NG | OK | S-VOL |
| PAIR (Mirror (RL)) | PAIR (Block) | PSUE (Block) | SSWS (Local) | NG | OK | S-VOL |
| COPY (Mirror (RL)) | COPY (Block) | PSUE (Local) | PSUE (Block) | NG | NG | P-VOL |
| PSUS/PSUE (Local) | SSUS/PSUE (Block) | PSUS/PSUE (Local) | SSUS/PSUE (Block) | NG | NG | P-VOL |
| PSUS/PSUE (Block) | SSWS (Local) | PSUS/PSUE (Block) | SSWS (Local) | NG | OK | S-VOL |

SIMs
  • Primary storage system: 622xxx, 62axxx, DD1xyy
  • Secondary storage system: DD1xyy

Procedure

  1. Increase the available capacity of the pool for the P-VOL.

  2. Resynchronize the pair.

Recovering a full pool for the P-VOL (pair status: PAIR)

The following figure shows the failure area and recovery when a pair is suspended due to a full pool of the P-VOL.


Procedure

  1. Increase the available capacity of the pool in which the full-pool condition was detected.

    For details on how to increase the available pool capacity, see the Provisioning Guide for the storage system.

  2. Confirm the quorum disk status. If the quorum disk is blocked, recover from the blockade. If you do not set a volume for the quorum disk, you can skip this step.

  3. Resynchronize the GAD pair.

    1. Confirm that the I/O mode of the S-VOL is Local.

      pairdisplay -g oraHA -fxce -IH1
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-C-1, 0,   0)522222  4444.S-VOL SSWS 
      NEVER ,  100  2222 -   -   0  -  -            -       - L/L
      oraHA   dev1(R)     (CL1-A-0, 0,   0)511111  2222.P-VOL PSUE 
      NEVER ,  100  4444 -   -   0  -  -            -       - B/B
    2. At the secondary storage system, resynchronize the pair.

      pairresync -g oraHA -swaps -IH1

      The volume of the primary storage system changes to an S-VOL, and the volume of the secondary storage system changes to a P-VOL.

    3. Confirm that the P-VOL and S-VOL pair statuses change to PAIR (Mirror (RL)).

      pairdisplay -g oraHA -fxce -IH0
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-A-0, 0,   0)511111  2222.S-VOL PAIR 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/M
      oraHA   dev1(R)     (CL1-C-1, 0,   0)522222  4444.P-VOL PAIR 
      NEVER ,  100  2222 -   -   0  -  -            -       - L/M
      pairdisplay -g oraHA -fxce -IH1
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-C-1, 0,   0)522222  4444.P-VOL PAIR 
      NEVER ,  100  2222 -   -   0  -  -            -       - L/M
      oraHA   dev1(R)     (CL1-A-0, 0,   0)511111  2222.S-VOL PAIR 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/M
  4. Using the alternate path software, resume I/Os to the S-VOL that was a P-VOL before the failure (I/O might resume automatically).

  5. Reverse the P-VOL and the S-VOL if necessary.
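
The confirmation steps in this procedure read the volume role, pair status, and I/O mode from pairdisplay output. As a minimal sketch (not part of CCI), the following Python helper extracts those fields from captured output. It assumes the layout of the samples above, where each volume entry may wrap onto two lines, and it infers the meaning of the R/W column (L/L as Local, L/M as Mirror (RL), B/B as Block) from those samples. The names PairEntry, parse_pairdisplay, and IO_MODE_NAMES are hypothetical.

```python
import re
from typing import List, NamedTuple

# Mapping inferred from the sample output in this section (an assumption).
IO_MODE_NAMES = {"L/L": "Local", "L/M": "Mirror (RL)", "B/B": "Block"}


class PairEntry(NamedTuple):
    group: str
    device: str
    side: str     # "L" = local instance, "R" = remote instance
    serial: str
    ldev: str
    role: str     # "P-VOL", "S-VOL", or "SMPL"
    status: str   # for example "PAIR", "PSUE", "SSWS"
    io_mode: str  # raw R/W column, for example "L/M"


# One entry starts at "<group> <device>(L|R)" and ends at the R/W column.
_ENTRY = re.compile(
    r"(\S+)\s+(\S+)\((L|R)\)\s+\([^)]*\)\s*(\d+)\s+([0-9a-fA-F]+)\."
    r"(P-VOL|S-VOL|SMPL)\s+(\S+)\s.*?\s([A-Z]/[A-Z])(?=\s|$)",
    re.DOTALL,
)


def parse_pairdisplay(text: str) -> List[PairEntry]:
    """Extract one PairEntry per volume from captured pairdisplay -fxce output."""
    return [PairEntry(*m.groups()) for m in _ENTRY.finditer(text)]


sample = """\
Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
oraHA   dev1(L)     (CL1-C-1, 0,   0)522222  4444.S-VOL SSWS
NEVER ,  100  2222 -   -   0  -  -            -       - L/L
oraHA   dev1(R)     (CL1-A-0, 0,   0)511111  2222.P-VOL PSUE
NEVER ,  100  4444 -   -   0  -  -            -       - B/B
"""

for entry in parse_pairdisplay(sample):
    print(entry.role, entry.status, IO_MODE_NAMES.get(entry.io_mode, entry.io_mode))
```

Parsing by pattern rather than by fixed column positions keeps the sketch tolerant of the line wrapping shown in the samples, but it should be checked against the output of your own CCI version before use.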

Pair condition and recovery: full pool for the S-VOL

When the S-VOL cannot be used due to a full pool, the GAD pair is suspended.

The following table shows the transitions for pair status and I/O mode, the volumes that are accessible from the server, and the location of the latest data when you can no longer use the S-VOL due to a full pool.

The before-failure and after-failure columns show the pair status and I/O mode.

Before failure: P-VOL | Before failure: S-VOL | After failure: P-VOL | After failure: S-VOL | Accessible from the server: P-VOL | Accessible from the server: S-VOL | Volume with latest data
PAIR (Mirror (RL)) | PAIR (Mirror (RL)) | PSUE (Local) | PSUE (Block) | OK | NG | P-VOL
PAIR (Mirror (RL)) | PAIR (Block) | PSUE (Local) | PSUE (Block) | OK | NG | P-VOL
COPY (Mirror (RL)) | COPY (Block) | PSUE (Local) | PSUE (Block) | OK | NG | P-VOL
PSUS/PSUE (Local) | SSUS/PSUE (Block) | PSUS/PSUE (Local) | SSUS/PSUE (Block) | OK | NG | P-VOL
PSUS/PSUE (Block) | SSWS (Local) | PSUS/PSUE (Block) | SSWS (Local) | NG | NG | S-VOL

SIMs
  • Primary storage system: DD1xyy
  • Secondary storage system: 622xxx, 62axxx, DD1xyy

Procedure

  1. Increase the available capacity of the pool for the S-VOL.

  2. Resynchronize the pair.

Recovering a full pool for the S-VOL (pair status: PAIR)

The following figure shows the failure area and recovery when a pair is suspended due to a full pool of the S-VOL.

GUID-D95F802A-E460-417A-BAE9-809374C329F1-low.png

Procedure

  1. Increase the available capacity of the pool on which the full pool was detected.

    For details on how to increase the available pool capacity, see the Provisioning Guide for the storage system.

  2. Confirm the quorum disk status. If the quorum disk is blocked, recover from the blockade. If you do not set a volume for the quorum disk, you can skip this step.

  3. Resynchronize the GAD pair.

    1. Confirm that the I/O mode of the P-VOL is Local.

      pairdisplay -g oraHA -fxce -IH0
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-A-0, 0,   0)511111  2222.P-VOL PSUE 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/L
      oraHA   dev1(R)     (CL1-C-1, 0,   0)522222  4444.S-VOL PSUE 
      NEVER ,  100  2222 -   -   0  -  -            -       - B/B
    2. At the primary storage system, resynchronize the pair.

      pairresync -g oraHA -IH0
    3. Confirm that the P-VOL and S-VOL pair statuses change to PAIR (Mirror (RL)).

      pairdisplay -g oraHA -fxce -IH0
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-A-0, 0,   0)511111  2222.P-VOL PAIR 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/M
      oraHA   dev1(R)     (CL1-C-1, 0,   0)522222  4444.S-VOL PAIR 
      NEVER ,  100  2222 -   -   0  -  -            -       - L/M
      pairdisplay -g oraHA -fxce -IH1
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-C-1, 0,   0)522222  4444.S-VOL PAIR 
      NEVER ,  100  2222 -   -   0  -  -            -       - L/M
      oraHA   dev1(R)     (CL1-A-0, 0,   0)511111  2222.P-VOL PAIR 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/M
  4. Using the alternate path software, resume I/O to the S-VOL (I/O might resume automatically).
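
The two full-pool procedures differ mainly in where the resynchronization is issued: from the secondary storage system with a swap when the S-VOL is SSWS (Local), and from the primary storage system when the P-VOL is Local. The following Python sketch encodes only that choice. The command strings are copied from the examples in this section (group oraHA, instances -IH0 and -IH1) and must be adapted to your configuration; the function names are hypothetical.

```python
from typing import Optional, Tuple

Volume = Tuple[str, str]  # (pair status, I/O mode), for example ("PSUE", "Local")


def choose_resync(p_vol: Volume, s_vol: Volume) -> Optional[str]:
    """Return the resync command used in this section's examples, or None."""
    p_status, p_mode = p_vol
    s_status, s_mode = s_vol
    if s_status == "SSWS" and s_mode == "Local" and p_mode == "Block":
        # The S-VOL holds the latest data: resynchronize from the secondary
        # storage system and swap the P-VOL and S-VOL roles.
        return "pairresync -g oraHA -swaps -IH1"
    if p_mode == "Local" and s_mode == "Block":
        # The P-VOL holds the latest data: resynchronize from the primary.
        return "pairresync -g oraHA -IH0"
    return None  # not covered by this sketch; follow the documented procedure


def is_mirrored(p_vol: Volume, s_vol: Volume) -> bool:
    """True when both volumes report PAIR with I/O mode Mirror (RL)."""
    expected = ("PAIR", "Mirror (RL)")
    return p_vol == expected and s_vol == expected


print(choose_resync(("PSUE", "Block"), ("SSWS", "Local")))
print(choose_resync(("PSUE", "Local"), ("PSUE", "Block")))
```

Returning None for any other combination is deliberate: states such as COPY or SMPL need the fuller procedures described elsewhere in this document.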

Pair condition and recovery: path failure, primary to secondary storage system

If the statuses of storage systems in both the primary and secondary sites are normal, a failure might have occurred in a physical path or switch between the storage systems. You can correct the issue by recovering the paths and resynchronizing the pair.

The following table shows transitions for pair status and I/O mode, the volumes that are accessible from the server, and the location of the latest data when you can no longer use any physical path from the primary storage system to the secondary storage system.

The before-failure and after-failure columns show the pair status and I/O mode.

Before failure: P-VOL | Before failure: S-VOL | After failure: P-VOL | After failure: S-VOL | Accessible from the server: P-VOL | Accessible from the server: S-VOL | Volume with latest data
PAIR (Mirror (RL)) (note 1) | PAIR (Mirror (RL)) (note 1) | PSUE (Local) | PSUE (Block) | OK | NG | P-VOL
PAIR (Mirror (RL)) | PAIR (Block) | PSUE (Local) | PAIR (Block) (note 2) | OK | NG | P-VOL
COPY (Mirror (RL)) | COPY (Block) | PSUE (Local) | PSUE/COPY (Block) (note 3) | OK | NG | P-VOL
PSUS/PSUE (Local) | SSUS/PSUE (Block) | PSUS/PSUE (Local) | SSUS/PSUE (Block) | OK | NG | P-VOL
PSUS/PSUE (Block) | SSWS (Local) | PSUS/PSUE (Block) | SSWS (Local) | NG | OK | S-VOL

Notes:

  1. The I/O modes for the P-VOL and S-VOL can change, even if I/Os are not issued from the server to the P-VOL. For example, a command that synchronizes the storage system at the primary site and the storage system at the secondary site might be issued to the P-VOL due to its health check (such as by ATS). When you execute this command, synchronization fails if a path between these storage systems has a failure. As a result, the I/O mode for the P-VOL to which I/Os were not issued from the server becomes Local, and the I/O mode for the S-VOL to which I/Os were issued becomes Block. This might cause a failure and suspend GAD pairs.
  2. For the recovery procedure, see Pair condition and recovery: quorum disk and primary-to-secondary path failure.
  3. If either of the following conditions is met, the S-VOL might become COPY (Block):
    • No volume is set for the quorum disk.
    • A quorum disk failure occurs.
SIMs
  • Primary storage system: DD0xyy, 2180xx
  • Secondary storage system: DD3xyy

Procedure

  1. Recover the paths from the primary storage system to the secondary storage system.

  2. Resynchronize the pair.

Recovering paths, primary to secondary storage system (pair status: PAIR)

The following figure shows the failure area and recovery when a pair is suspended due to path failure from the primary storage system to the secondary storage system.

GUID-B3AFBD82-704B-40CC-8779-EBF9ABD9243D-low.png

Procedure

  1. Reconnect the physical path or reconfigure the SAN to recover the path failure.

    When the path between the storage systems is recovered, the remote path either recovers automatically or requires manual recovery. To verify the remote path status and perform any recommended action, see Troubleshooting related to remote path status. If the remote path failure persists even after following the recommended action, contact customer support.

  2. Confirm the quorum disk status. If the quorum disk is blocked, recover from the blockade. If you do not set a volume for the quorum disk, you can skip this step.

  3. Resynchronize the pair.

    1. Confirm that the P-VOL I/O mode is Local.

      pairdisplay -g oraHA -fxce -IH0
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-A-0, 0,   0)511111  2222.P-VOL PSUE 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/L
      oraHA   dev1(R)     (CL1-C-1, 0,   0)522222  4444.S-VOL PSUE 
      NEVER ,  100  2222 -   -   0  -  -            -       - B/B
    2. At the primary storage system, resynchronize the pair.

      pairresync -g oraHA -IH0
    3. Confirm that the P-VOL and S-VOL pair statuses change to PAIR (Mirror (RL)).

      pairdisplay -g oraHA -fxce -IH0
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-A-0, 0,   0)511111  2222.P-VOL PAIR 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/M
      oraHA   dev1(R)     (CL1-C-1, 0,   0)522222  4444.S-VOL PAIR 
      NEVER ,  100  2222 -   -   0  -  -            -       - L/M
      pairdisplay -g oraHA -fxce -IH1
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-C-1, 0,   0)522222  4444.S-VOL PAIR 
      NEVER ,  100  2222 -   -   0  -  -            -       - L/M
      oraHA   dev1(R)     (CL1-A-0, 0,   0)511111  2222.P-VOL PAIR 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/M
  4. Using the alternate path software, resume I/O to the volume that could not be accessed from the server (I/O might resume automatically).

Pair condition and recovery: path failure, secondary to primary storage system

If the statuses of the storage systems in both the primary and secondary sites are normal, a failure might have occurred in a physical path or switch between the storage systems. The following table shows the transitions for pair status and I/O mode, the volumes that are accessible from the server, and the location of the latest data when you can no longer use any physical path from the secondary storage system to the primary storage system.

The before-failure and after-failure columns show the pair status and I/O mode.

Before failure: P-VOL | Before failure: S-VOL | After failure: P-VOL | After failure: S-VOL | Accessible from the server: P-VOL | Accessible from the server: S-VOL | Volume with latest data
PAIR (Mirror (RL)) (note 1) | PAIR (Mirror (RL)) (note 1) | PSUE (Block) | SSWS (Local) | NG | OK | S-VOL
PAIR (Mirror (RL)) | PAIR (Mirror (RL)) | PSUE (Local) (note 2) | PSUE (Block) (note 2) | OK | NG | P-VOL
PAIR (Mirror (RL)) | PAIR (Block) | PAIR (Mirror (RL)) | PAIR (Block) | OK | NG | P-VOL
COPY (Mirror (RL)) | COPY (Block) | COPY (Mirror (RL)) | COPY (Block) | OK | NG | P-VOL
PSUS/PSUE (Local) | SSUS/PSUE (Block) | PSUS/PSUE (Local) | SSUS/PSUE (Block) | OK | NG | P-VOL
PSUS/PSUE (Block) | SSWS (Local) | PSUS/PSUE (Block) | SSWS (Local) | NG | OK | S-VOL

Notes:

  1. The I/O modes for the P-VOL and S-VOL can change, even if I/Os are not issued from the server to the S-VOL.

    For example, a command that synchronizes the storage system at the primary site and the storage system at the secondary site might be issued to the S-VOL due to its health check (such as by ATS). When you execute this command, synchronization fails if a path between these storage systems has a failure. As a result, the I/O mode for the S-VOL to which I/Os were not issued from the server becomes Local, and the I/O mode for the P-VOL to which I/Os were issued becomes Block. This might cause a failure and suspend GAD pairs.

  2. If either of the following conditions is met, the P-VOL might become PSUE (Local) and the S-VOL might become PSUE (Block):
    • No volume is set for the quorum disk.
    • A quorum disk failure occurs.
SIMs
  • Primary storage system: DD3xyy
  • Secondary storage system: DD0xyy, 2180xx
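
The SIM lists given so far for the full-pool and path-failure cases can be compared against the SIMs actually observed on each storage system to narrow down the failure location. The following Python sketch shows one way to do that. It assumes that x and y in a SIM reference code stand for any hexadecimal digit, and it treats a failure as a candidate only when every observed code appears in that failure's list; both rules are assumptions for illustration, and the names used are hypothetical.

```python
import re
from typing import Dict, Iterable, List, Tuple

# SIM reference codes as listed in this section: (primary side, secondary side).
SIM_SIGNATURES: Dict[str, Tuple[List[str], List[str]]] = {
    "full pool for the P-VOL": (["622xxx", "62axxx", "DD1xyy"], ["DD1xyy"]),
    "full pool for the S-VOL": (["DD1xyy"], ["622xxx", "62axxx", "DD1xyy"]),
    "path failure, primary to secondary": (["DD0xyy", "2180xx"], ["DD3xyy"]),
    "path failure, secondary to primary": (["DD3xyy"], ["DD0xyy", "2180xx"]),
}


def sim_matches(pattern: str, code: str) -> bool:
    """Compare a code with a pattern; x and y are assumed hex-digit wildcards."""
    regex = "".join(
        "[0-9a-f]" if ch in "xy" else re.escape(ch.lower()) for ch in pattern
    )
    return re.fullmatch(regex, code.lower()) is not None


def _all_listed(codes: Iterable[str], patterns: List[str]) -> bool:
    # Every observed code must match at least one listed pattern.
    return all(any(sim_matches(p, c) for p in patterns) for c in codes)


def candidate_failures(primary: List[str], secondary: List[str]) -> List[str]:
    """Return failures whose listed SIMs cover every observed code."""
    if not primary and not secondary:
        return []
    return [
        name
        for name, (pri, sec) in SIM_SIGNATURES.items()
        if _all_listed(primary, pri) and _all_listed(secondary, sec)
    ]


print(candidate_failures(["622001", "DD1000"], ["DD1000"]))
```

More than one candidate can be returned when the observed codes are shared between failure types (for example, DD1xyy alone), so the result is a starting point for the pair status checks rather than a diagnosis.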

Procedure

  1. Recover the paths from the secondary storage system to the primary storage system.

  2. Resynchronize the pair.

Recovering paths, secondary to primary storage system (pair status: PAIR when a volume is set for the quorum disk)

The following figure shows the failure area and recovery in a configuration with a volume set for the quorum disk when a pair is suspended due to path failure from the secondary storage system to the primary storage system.

GUID-C9636DBD-66EC-465A-9FDD-3D804FF33D63-low.png

Procedure

  1. Reconnect the physical path or reconfigure the SAN to recover the path from the secondary storage system to the primary storage system.

    After the path between the storage systems is recovered, the remote path either recovers automatically or requires manual recovery. To verify the remote path status and perform any recommended action, see Troubleshooting related to remote path status. If the remote path failure persists even after following the recommended action, contact customer support.

  2. Confirm the quorum disk status. If the quorum disk is blocked, recover from the blockade. If you do not set a volume for the quorum disk, you can skip this step.

  3. Resynchronize the pair.

    1. Confirm that the S-VOL I/O mode is Local.

      pairdisplay -g oraHA -fxce -IH1
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-C-1, 0,   0)522222  4444.S-VOL SSWS 
      NEVER ,  100  2222 -   -   0  -  -            -       - L/L
      oraHA   dev1(R)     (CL1-A-0, 0,   0)511111  2222.P-VOL PSUE 
      NEVER ,  100  4444 -   -   0  -  -            -       - B/B
    2. At the secondary storage system, resynchronize the pair.

      pairresync -g oraHA -swaps -IH1

      The volume on the primary storage system changes to an S-VOL, and the volume on the secondary storage system changes to a P-VOL.

    3. Confirm that the P-VOL and S-VOL pair statuses change to PAIR (Mirror (RL)).

      pairdisplay -g oraHA -fxce -IH0
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-A-0, 0,   0)511111  2222.S-VOL PAIR 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/M
      oraHA   dev1(R)     (CL1-C-1, 0,   0)522222  4444.P-VOL PAIR 
      NEVER ,  100  2222 -   -   0  -  -            -       - L/M
      pairdisplay -g oraHA -fxce -IH1
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-C-1, 0,   0)522222  4444.P-VOL PAIR 
      NEVER ,  100  2222 -   -   0  -  -            -       - L/M
      oraHA   dev1(R)     (CL1-A-0, 0,   0)511111  2222.S-VOL PAIR 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/M
  4. Using the alternate path software, resume I/O to the S-VOL (P-VOL before the failure).

    I/O from the server might resume automatically.

  5. Reverse the P-VOL and the S-VOL if necessary.

Recovering paths, secondary to primary storage system (pair status: PAIR when no volume is set for the quorum disk)

If no volume is set for the quorum disk and a GAD pair in PAIR status is suspended due to a path failure from the secondary storage system to the primary storage system, perform the following procedure.

Procedure

  1. Recover the paths from the secondary storage system to the primary storage system.

  2. Resynchronize the pair.

Pair condition and recovery: primary storage system failure

When the primary storage system fails, you can correct the issue by recovering the primary storage system, recovering the physical path between primary and secondary storage systems, and recovering the pair.

The following table shows the transitions for pair status and I/O mode, the volumes accessible from the server, and the location of the latest data when you can no longer use the primary storage system due to failure.

The before-failure and after-failure columns show the pair status and I/O mode.

Before failure: P-VOL | Before failure: S-VOL | After failure: P-VOL (note 1) | After failure: S-VOL | Accessible from the server: P-VOL | Accessible from the server: S-VOL | Volume with latest data
PAIR (Mirror (RL)) | PAIR (Mirror (RL)) | PSUE (Block) | SSWS (Local) (note 2) | NG | OK | S-VOL
PAIR (Mirror (RL)) | PAIR (Mirror (RL)) | PSUE (Block) (note 3) | PSUE (Block) (note 3) | NG | NG | Both P-VOL and S-VOL
PAIR (Mirror (RL)) | PAIR (Block) | PSUE (Block) | PSUE (Block) | NG | NG | Both P-VOL and S-VOL
COPY (Mirror (RL)) | COPY (Block) | PSUE (Local) | COPY (Block) | NG | NG | P-VOL
PSUS/PSUE (Local) | SSUS/PSUE (Block) | PSUS/PSUE (Local) | SSUS/PSUE (Block) | NG | NG | P-VOL
PSUS/PSUE (Block) | SSWS (Local) | PSUS/PSUE (Block) | SSWS (Local) | NG | OK | S-VOL

Notes:

  1. If shared memory in the primary storage system becomes volatilized, the P-VOL status changes to SMPL, and the GAD reserve attribute is set for the volume, which prevents host access to the volume.
  2. If the server does not issue write I/O, the pair status is PAIR (Mirror (RL)).
  3. If either of the following conditions is met, the P-VOL might become PSUE (Block) and the S-VOL might become PSUE (Block):

    • No volume is set for the quorum disk.
    • A quorum disk failure occurs.
SIMs
  • Primary storage system: SIM varies depending on the failure type
  • Secondary storage system: 2180xx, DD0xyy, DD3xyy

Procedure

  1. When the primary storage system is powered off, delete an alternate path (logical path) to the P-VOL, and then turn on the power.

    1. Using the alternate path software, identify the volumes that cannot be accessed from the server.

    2. Using the alternate path software, delete the alternate paths to the P-VOL.

      If you cannot delete the alternate paths, detach all physical paths that are connected to the server at the primary site.
  2. Turn on the primary storage system.

  3. Recover the primary storage system.

    For details, contact customer support.

  4. Recover the physical path between the primary storage system and the secondary storage system.

  5. If the S-VOL pair status is PAIR, suspend the pair by specifying the S-VOL.

  6. Confirm the quorum disk status. If the quorum disk is blocked, recover from the blockade. If you do not set a volume for the quorum disk, you can skip this step.

  7. Resynchronize or re-create a pair using the procedure in the following table whose pair status and I/O mode match your pair's status and I/O mode.

    Pair status (P-VOL) | Pair status (S-VOL) | I/O mode (P-VOL) | I/O mode (S-VOL)
    PSUE | COPY | Local | Block

    1. Delete the pair forcibly from the S-VOL.

      pairsplit -g oraHA -d dev1 -RF -IH2

      When you perform this step, delete the virtual LDEV ID so that the volume cannot be accessed from the server.

    2. Check if the virtual LDEV ID of the P-VOL is not deleted.

      raidcom get ldev -ldev_id 0x1111 -fx -IH1
      (Omitted)
      LDEV : 1111
      (Omitted)

      The VIR_LDEV information is not displayed if it is the same as the LDEV information. If the virtual LDEV ID is deleted, set a correct LDEV ID.

    3. Delete the pair forcibly from the P-VOL.

      When you perform this step, make sure not to delete the virtual LDEV ID, which allows the volume to be accessed from the server.

      pairsplit -g oraHA -d dev1 -SFV -IH1
    4. Check if the virtual LDEV ID of the S-VOL indicates GAD reserve.

      raidcom get ldev -ldev_id 0x2222 -fx -IH2
      (Omitted)
      LDEV : 2222
      VIR_LDEV : ffff
      (Omitted)

      VIR_LDEV : ffff indicates GAD reserve. If it shows another value, set a correct virtual LDEV ID.

    5. Re-create the GAD pair by specifying the P-VOL.

      paircreate -g oraHA -d dev1 -f never -vl -jq 1 -IH1

    Pair status (P-VOL) | Pair status (S-VOL) | I/O mode (P-VOL) | I/O mode (S-VOL)
    SMPL | COPY | Not applicable | Block

    1. Delete the pair forcibly from the S-VOL.

      When you perform this step, delete the virtual LDEV ID so that the volume cannot be accessed from the server.

      pairsplit -g oraHA -d dev1 -RF -IH2
    2. Check if the virtual LDEV ID of the S-VOL indicates GAD reserve.

      raidcom get ldev -ldev_id 0x2222 -fx -IH2
      (Omitted)
      LDEV : 2222
      VIR_LDEV : ffff
      (Omitted)

      VIR_LDEV : ffff indicates GAD reserve. If it shows another value, set a correct virtual LDEV ID.

    3. Check if the virtual LDEV ID of the P-VOL is not deleted.

      raidcom get ldev -ldev_id 0x1111 -fx -IH1
      (Omitted)
      LDEV : 1111
      (Omitted)

      The VIR_LDEV information is not displayed if it is the same as the LDEV information. If the virtual LDEV ID is deleted, set a correct LDEV ID.

      Note: If shared memory in the secondary storage system becomes volatilized, the S-VOL pair status changes to SMPL, and GAD reserve is assigned to the virtual attribute.
    4. Re-create the GAD pair by specifying the P-VOL.

      paircreate -g oraHA -d dev1 -f never -vl -jq 1 -IH1

    Pair status (P-VOL) | Pair status (S-VOL) | I/O mode (P-VOL) | I/O mode (S-VOL)
    PSUS/PSUE | SSWS | Block | Local

    1. Resynchronize the pair specifying the S-VOL.

      pairresync -g oraHA -d dev1 -swaps -IH2

    Pair status (P-VOL) | Pair status (S-VOL) | I/O mode (P-VOL) | I/O mode (S-VOL)
    SMPL | SSWS | Not applicable | Local

    1. Check if the virtual LDEV ID of the P-VOL indicates GAD reserve.

      raidcom get ldev -ldev_id 0x1111 -fx -IH1
      (Omitted)
      LDEV : 1111
      VIR_LDEV : ffff
      (Omitted)
      

      VIR_LDEV : ffff indicates GAD reserve. If it shows another value, set a correct virtual LDEV ID.

    2. Delete the pair forcibly from the S-VOL.

      When you perform this step, make sure not to delete the virtual LDEV ID, which allows the volume to be accessed from the server.

      pairsplit -g oraHA -d dev1 -RFV -IH2
    3. Check if the virtual LDEV ID of the S-VOL is not deleted.

      raidcom get ldev -ldev_id 0x2222 -fx -IH2
      (Omitted)
      LDEV : 2222
      VIR_LDEV : 1111
      (Omitted)
      
    4. Re-create the pair specifying the S-VOL.

      paircreate -g oraHA -d dev1 -f never -vl -jq 1 -IH2

    Pair status (P-VOL) | Pair status (S-VOL) | I/O mode (P-VOL) | I/O mode (S-VOL)
    PSUS/PSUE | SSUS/PSUE | Local | Block

    1. Resynchronize the pair specifying the P-VOL.

      pairresync -g oraHA -d dev1 -IH1

    Pair status (P-VOL) | Pair status (S-VOL) | I/O mode (P-VOL) | I/O mode (S-VOL)
    PSUE | PSUE | Block | Block

    1. Delete the pair forcibly from the S-VOL.

      When you perform this step, delete the virtual LDEV ID so that the volume cannot be accessed from the server.

      pairsplit -g oraHA -d dev1 -RF -IH2
    2. Check if the virtual LDEV ID of the S-VOL indicates GAD reserve.

      raidcom get ldev -ldev_id 0x2222 -fx -IH2
      (Omitted)
      LDEV : 2222
      VIR_LDEV : ffff
      (Omitted)

      VIR_LDEV : ffff indicates GAD reserve. If it shows another value, set a correct virtual LDEV ID.

    3. Delete the pair forcibly from the P-VOL.

      When you perform this step, make sure not to delete the virtual LDEV ID, which allows the volume to be accessed from the server.

      pairsplit -g oraHA -d dev1 -SFV -IH1
    4. Check if the virtual LDEV ID of the P-VOL is not deleted.

      raidcom get ldev -ldev_id 0x1111 -fx -IH1
      (Omitted)
      LDEV : 1111
      (Omitted)

      The VIR_LDEV information is not displayed if it is the same as the LDEV information. If the virtual LDEV ID is deleted, set a correct LDEV ID.

    5. Re-create the pair specifying the P-VOL.

      paircreate -g oraHA -d dev1 -f never -vl -jq 1 -IH1

    Pair status (P-VOL) | Pair status (S-VOL) | I/O mode (P-VOL) | I/O mode (S-VOL)
    SMPL | SSUS/PSUE | Not applicable | Block

    1. Delete the pair forcibly from the S-VOL.

      For VSP G/F350, G/F370, G/F700, G/F900, when you perform this step, make sure not to delete the virtual LDEV ID, which allows the volume to be accessed from the server.

      For VSP 5000 series, when you perform this step, delete the virtual LDEV ID so that the volume cannot be accessed from the server.

      pairsplit -g oraHA -d dev1 -RF -IH2
    2. Check if the virtual LDEV ID of the S-VOL indicates GAD reserve.

      raidcom get ldev -ldev_id 0x2222 -fx -IH2
      (Omitted)
      LDEV : 2222
      VIR_LDEV : ffff
      (Omitted)
      

      VIR_LDEV : ffff indicates GAD reserve. If it shows another value, set a correct virtual LDEV ID.

    3. Check if the virtual LDEV ID of the P-VOL is not deleted.

      raidcom get ldev -ldev_id 0x1111 -fx -IH1
      (Omitted)
      LDEV : 1111
      (Omitted)
      

      The VIR_LDEV information is not displayed if it is the same as the LDEV information. If you cannot find the virtual LDEV ID, set a correct LDEV ID.

      Note: If shared memory in the secondary storage system becomes volatilized, the S-VOL pair status changes to SMPL, and GAD reserve is assigned to the virtual attribute.
    4. Re-create the pair specifying the P-VOL.

      paircreate -g oraHA -d dev1 -f never -vl -jq 1 -IH1

    Pair status (P-VOL) | Pair status (S-VOL) | I/O mode (P-VOL) | I/O mode (S-VOL)
    PSUE | PSUE | Block | Block

    These steps apply only to VSP 5000 series.

    1. Delete the pair forcibly from the S-VOL.

      When you perform this step, delete the virtual LDEV ID so that the volume cannot be accessed from the server.
      pairsplit -g oraHA -d dev1 -RF -IH2
    2. Check if the virtual LDEV ID of the S-VOL indicates GAD reserve.

      raidcom get ldev -ldev_id 0x2222 -fx -IH2
      (Omitted)
      LDEV : 2222
      VIR_LDEV : ffff
      (Omitted)

      VIR_LDEV : ffff indicates GAD reserve. If it shows another value, set a correct virtual LDEV ID.

    3. Delete the pair forcibly from the P-VOL.

      When you perform this step, make sure not to delete the virtual LDEV ID, which allows the volume to be accessed from the server.
      pairsplit -g oraHA -d dev1 -SFV -IH1
    4. Check if the virtual LDEV ID of the P-VOL is not deleted.

      raidcom get ldev -ldev_id 0x1111 -fx -IH1
      (Omitted)
      LDEV : 1111
      (Omitted)
      

      The VIR_LDEV information is not displayed if it is the same as the LDEV information. If the virtual LDEV ID is deleted, set a correct LDEV ID.

    5. Re-create the GAD pair by specifying the P-VOL.

      paircreate -g oraHA -d dev1 -f never -vl -jq 1 -IH1
      
  8. If the alternate path to the P-VOL has been deleted, add the alternate path.

    1. If you have detached the physical paths of the primary site, restore all physical paths to their original status, and then add the alternate path.

    2. Using the alternate path software, add the alternate path deleted at step 1 to the P-VOL.
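
Step 7 of this procedure is in effect a lookup from the observed pair status and I/O mode to a recovery path. As a hedged illustration only, the following Python sketch restates the step 7 tables as data and returns a one-line summary of the matching sub-procedure. It does not run any CCI command, the summaries are paraphrases of the steps above, and the names STEP7_ROWS and recovery_action are hypothetical. A P-VOL I/O mode of None stands for "Not applicable".

```python
from typing import List, Optional, Set, Tuple

Row = Tuple[Set[str], Set[str], Optional[str], str, str]

RECREATE_FROM_P = "delete the pair forcibly, then re-create it from the P-VOL"
RECREATE_FROM_S = "delete the pair forcibly, then re-create it from the S-VOL"
RESYNC_FROM_P = "resynchronize the pair from the P-VOL"
SWAP_RESYNC_FROM_S = "resynchronize the pair from the S-VOL with -swaps"

# (P-VOL statuses, S-VOL statuses, P-VOL I/O mode, S-VOL I/O mode, action),
# in the order the tables appear in step 7.
STEP7_ROWS: List[Row] = [
    ({"PSUE"}, {"COPY"}, "Local", "Block", RECREATE_FROM_P),
    ({"SMPL"}, {"COPY"}, None, "Block", RECREATE_FROM_P),
    ({"PSUS", "PSUE"}, {"SSWS"}, "Block", "Local", SWAP_RESYNC_FROM_S),
    ({"SMPL"}, {"SSWS"}, None, "Local", RECREATE_FROM_S),
    ({"PSUS", "PSUE"}, {"SSUS", "PSUE"}, "Local", "Block", RESYNC_FROM_P),
    ({"PSUE"}, {"PSUE"}, "Block", "Block", RECREATE_FROM_P),
    ({"SMPL"}, {"SSUS", "PSUE"}, None, "Block", RECREATE_FROM_P),
]


def recovery_action(
    p_status: str, s_status: str, p_mode: Optional[str], s_mode: str
) -> Optional[str]:
    """Return the summary of the first matching step 7 row, or None."""
    for p_set, s_set, row_p_mode, row_s_mode, action in STEP7_ROWS:
        if (
            p_status in p_set
            and s_status in s_set
            and row_p_mode in (None, p_mode)  # None: I/O mode not applicable
            and row_s_mode == s_mode
        ):
            return action
    return None


print(recovery_action("PSUS", "SSWS", "Block", "Local"))
```

The summary only identifies which sub-procedure applies; the virtual LDEV ID checks in each sub-procedure still have to be carried out as written.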

Setting correct virtual LDEV ID values

If a virtual LDEV ID value is not correct, you can set the correct value.

Procedure

  1. Check the virtual LDEV ID.

    raidcom get ldev -ldev_id 0x1111 -fx -IH1
    (Omitted)
    LDEV : 1111
    VIR_LDEV : 2222
    (Omitted)
    
  2. Delete the virtual LDEV ID.

    raidcom unmap resource -ldev_id 0x1111 -virtual_ldev_id 0x2222 -IH1
    • If the command execution result shows VIR_LDEV is ffff, specify reserve for -virtual_ldev_id to delete the virtual LDEV ID.
      raidcom unmap resource -ldev_id 0x1111 -virtual_ldev_id reserve -IH1
    • If the command execution result does not show VIR_LDEV, specify the LDEV ID for -virtual_ldev_id to delete the virtual LDEV ID.
  3. Confirm that the virtual LDEV ID is deleted.

    raidcom get ldev -ldev_id 0x1111 -fx -IH1
    (Omitted)
    LDEV : 1111
    VIR_LDEV : fffe
    (Omitted)
    

    VIR_LDEV : fffe indicates that the virtual LDEV ID is deleted.

  4. Set the LDEV ID as the virtual LDEV ID.

    To set a virtual LDEV ID that is different from the LDEV ID, specify that virtual LDEV ID instead.
    raidcom map resource -ldev_id 0x1111 -virtual_ldev_id 0x1111 -IH1

    If you want to set GAD reserve, specify reserve for -virtual_ldev_id.

    raidcom map resource -ldev_id 0x1111 -virtual_ldev_id reserve -IH1
  5. Check the virtual LDEV ID.

    raidcom get ldev -ldev_id 0x1111 -fx -IH1
    (Omitted)
    LDEV : 1111
    (Omitted)
    

    The VIR_LDEV information is not displayed if it is the same as the LDEV information.
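
The rules in this procedure for reading VIR_LDEV (ffff for GAD reserve, fffe for a deleted virtual LDEV ID, no VIR_LDEV line when it equals the LDEV ID) can be captured in a small helper. The following Python sketch parses only the LDEV and VIR_LDEV lines from captured raidcom get ldev output and derives the value to pass to -virtual_ldev_id when deleting the virtual LDEV ID, following step 2. It is an illustration under those stated rules, and the function names are hypothetical.

```python
import re
from typing import Dict, Optional

_FIELD = re.compile(r"\s*(LDEV|VIR_LDEV)\s*:\s*([0-9a-fA-F]+)\s*$")


def parse_ldev(output: str) -> Dict[str, str]:
    """Collect the LDEV and VIR_LDEV values (lowercase hex) from the output."""
    fields: Dict[str, str] = {}
    for line in output.splitlines():
        match = _FIELD.match(line)
        if match:
            fields[match.group(1)] = match.group(2).lower()
    return fields


def virtual_ldev_state(fields: Dict[str, str]) -> str:
    """Describe the virtual LDEV ID according to the rules in this section."""
    vir = fields.get("VIR_LDEV")
    if vir is None:
        return "same as LDEV ID"
    if vir == "ffff":
        return "GAD reserve"
    if vir == "fffe":
        return "deleted"
    return "virtual LDEV ID " + vir


def unmap_argument(fields: Dict[str, str]) -> Optional[str]:
    """Value for -virtual_ldev_id in step 2, or None if already deleted."""
    vir = fields.get("VIR_LDEV")
    if vir is None:
        return "0x" + fields["LDEV"]  # VIR_LDEV not shown: use the LDEV ID
    if vir == "ffff":
        return "reserve"
    if vir == "fffe":
        return None
    return "0x" + vir


sample = """(Omitted)
LDEV : 1111
VIR_LDEV : 2222
(Omitted)"""

fields = parse_ldev(sample)
print(virtual_ldev_state(fields), unmap_argument(fields))
```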

Pair condition and recovery: secondary storage system failure

When the secondary storage system fails, you can correct the issue by recovering the secondary storage system, recovering the physical path between primary and secondary storage systems, and recovering the pair. The following table shows the transitions for pair status and I/O mode, the volumes accessible from the server, and the location of the latest data when you can no longer use the secondary storage system due to failure.

The before-failure and after-failure columns show the pair status and I/O mode.

Before failure: P-VOL | Before failure: S-VOL | After failure: P-VOL | After failure: S-VOL (note 1) | Accessible from the server: P-VOL | Accessible from the server: S-VOL | Volume with latest data
PAIR (Mirror (RL)) | PAIR (Mirror (RL)) | PSUE (Local) (note 2) | PSUE (Block) | OK | NG | P-VOL
PAIR (Mirror (RL)) | PAIR (Block) | PSUE (Local) | PSUE (Block) | OK | NG | P-VOL
COPY (Mirror (RL)) | COPY (Block) | PSUE (Local) | PSUE (Block) | OK | NG | P-VOL
PSUS/PSUE (Local) | SSUS/PSUE (Block) | PSUS/PSUE (Local) | SSUS/PSUE (Block) | OK | NG | P-VOL
PSUS/PSUE (Block) | SSWS (Local) | PSUS/PSUE (Block) | SSWS (Local) | NG | NG | S-VOL

Notes:

  1. If shared memory in the secondary storage system becomes volatilized, the S-VOL pair status changes to SMPL, and the reserve attribute is set for the volume, which prevents host access to the volume.
  2. If the server does not issue write I/O, the pair status might be PAIR (Mirror (RL)).
SIMs
  • Primary storage system: 2180xx, DD0xyy, DD3xyy
  • Secondary storage system: SIM varies depending on the failure type

Procedure

  1. When the secondary storage system is powered off, delete an alternate path (logical path) to the S-VOL, and then turn on the power.

    1. Using the alternate path software, identify the volumes that cannot be accessed from the server.

    2. Using the alternate path software, delete the alternate paths to the S-VOL.

      If you cannot delete the alternate paths, detach all physical paths that are connected to the server at the secondary site.
  2. Turn on the secondary storage system.

  3. Recover the secondary storage system.

    For details, contact customer support.

  4. Recover the physical path between the primary storage system and the secondary storage system.

  5. If the P-VOL pair status is PAIR, suspend the pair by specifying the P-VOL.

  6. Confirm the quorum disk status. If the quorum disk is blocked, recover from the blockade. If you do not set a volume for the quorum disk, you can skip this step.

  7. Resynchronize or re-create the pair using the procedure in the following tables whose pair status and I/O mode match your pair's status and I/O mode.

    Pair status (P-VOL) | Pair status (S-VOL) | I/O mode (P-VOL) | I/O mode (S-VOL)
    PSUS/PSUE | PSUS/PSUE | Local | Block

    1. Resynchronize the pair specifying the P-VOL.

      pairresync -g oraHA -d dev1 -IH1

    Pair status (P-VOL) | Pair status (S-VOL) | I/O mode (P-VOL) | I/O mode (S-VOL)
    PSUS/PSUE | SMPL | Local | Not applicable

    1. Check if the virtual LDEV ID of the S-VOL indicates GAD reserve.

      raidcom get ldev -ldev_id 0x2222 -fx -IH2
      (Omitted)
      LDEV : 2222
      VIR_LDEV : ffff
      (Omitted)
      

      VIR_LDEV : ffff indicates GAD reserve. If it shows another value, set a correct virtual LDEV ID.

    2. Delete the pair forcibly from the P-VOL.

      When you perform this step, make sure not to delete the virtual LDEV ID, which allows the volume to be accessed from the server.

      pairsplit -g oraHA -d dev1 -SFV -IH1
    3. Check that the virtual LDEV ID of the P-VOL has not been deleted.

      raidcom get ldev -ldev_id 0x1111 -fx -IH1
      (Omitted)
      LDEV : 1111
      (Omitted)

      The VIR_LDEV information is not displayed if it is the same as the LDEV information. If the virtual LDEV ID has been deleted, set a correct virtual LDEV ID.

    4. Re-create the pair specifying the P-VOL.

      paircreate -g oraHA -d dev1 -f never -vl -jq 1 -IH1

    Pair status: P-VOL PSUS/PSUE, S-VOL SSWS
    I/O mode: P-VOL Block, S-VOL Local

    1. Resynchronize the pair specifying the S-VOL.

      pairresync -g oraHA -d dev1 -swaps -IH2

    Pair status: P-VOL PSUS/PSUE, S-VOL SMPL
    I/O mode: P-VOL Block, S-VOL Not applicable

    1. Delete the pair forcibly from the P-VOL.

      When you perform this step, delete the virtual LDEV ID so that the volume cannot be accessed from the server.

      pairsplit -g oraHA -d dev1 -SF -IH1
    2. Check if the virtual LDEV ID of the P-VOL indicates GAD reserve.

      raidcom get ldev -ldev_id 0x1111 -fx -IH1
      (Omitted)
      LDEV : 1111
      VIR_LDEV : ffff
      (Omitted)

      VIR_LDEV : ffff indicates GAD reserve. If it shows another value, set a correct virtual LDEV ID.

    3. Check that the virtual LDEV ID of the S-VOL has not been deleted.

      raidcom get ldev -ldev_id 0x2222 -fx -IH2
      (Omitted)
      LDEV : 2222
      VIR_LDEV : 1111
      (Omitted)

      If the virtual LDEV ID is deleted, set a correct virtual LDEV ID.

    4. Re-create the pair specifying the S-VOL.

      paircreate -g oraHA -d dev1 -f never -vl -jq 1 -IH2
    (VSP 5000 series)

    Pair status: P-VOL PSUE, S-VOL PSUE
    I/O mode: P-VOL Block, S-VOL Block

    1. Delete the pair forcibly from the S-VOL.

      When you perform this step, delete the virtual LDEV ID so that the volume cannot be accessed from the server.

      pairsplit -g oraHA -d dev1 -RF -IH2
    2. Check if the virtual LDEV ID of the S-VOL indicates GAD reserve.

      raidcom get ldev -ldev_id 0x2222 -fx -IH2
      (Omitted)
      LDEV : 2222
      VIR_LDEV : ffff
      (Omitted)

      VIR_LDEV : ffff indicates GAD reserve. If it shows another value, set a correct virtual LDEV ID.

    3. Delete the pair forcibly from the P-VOL.

      When you perform this step, make sure not to delete the virtual LDEV ID, which allows the volume to be accessed from the server.
      pairsplit -g oraHA -d dev1 -SFV -IH1
    4. Check that the virtual LDEV ID of the P-VOL has not been deleted.

      raidcom get ldev -ldev_id 0x1111 -fx -IH1
      (Omitted)
      LDEV : 1111
      (Omitted)

      The VIR_LDEV information is not displayed if it is the same as the LDEV information. If the virtual LDEV ID has been deleted, set a correct virtual LDEV ID.

    5. Re-create the pair specifying the P-VOL.

      paircreate -g oraHA -d dev1 -f never -vl -jq 1 -IH1
  8. If the alternate path to the S-VOL has been deleted, add the alternate path.

    1. If you have detached the physical paths of the secondary site, restore all physical paths to their original status, and then add the alternate path.

    2. Using the alternate path software, add the alternate path deleted at step 1 to the S-VOL.
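
The tables in step 7 amount to a lookup keyed on pair status and I/O mode. The following Python sketch is illustrative only and does not call CCI: the function name, the status sets, and the returned descriptions are hypothetical, and accepting SSUS for the S-VOL is an assumption based on how a suspended S-VOL is shown in other tables in this document.

```python
# Hypothetical summary of the step 7 tables. Status and I/O mode strings
# are written as they appear in the tables; nothing here calls CCI.
P_SUSPENDED = {"PSUS", "PSUE"}
# Assumption: SSUS is accepted for the S-VOL because other tables in this
# document show a suspended S-VOL as SSUS.
S_SUSPENDED = {"PSUS", "PSUE", "SSUS"}


def recovery_action(pvol_status, svol_status, pvol_io, svol_io=None):
    """Return the step 7 action that matches a pair, or None if no table matches."""
    if pvol_status not in P_SUSPENDED:
        return None
    io = (pvol_io, svol_io)
    if svol_status in S_SUSPENDED and io == ("Local", "Block"):
        return "resynchronize the pair specifying the P-VOL"
    if svol_status == "SSWS" and io == ("Block", "Local"):
        return "resynchronize the pair specifying the S-VOL"
    if svol_status == "SMPL" and pvol_io == "Local":
        return "forcibly delete the pair from the P-VOL, then re-create it specifying the P-VOL"
    if svol_status == "SMPL" and pvol_io == "Block":
        return "forcibly delete the pair from the P-VOL, then re-create it specifying the S-VOL"
    if (pvol_status, svol_status) == ("PSUE", "PSUE") and io == ("Block", "Block"):
        return ("forcibly delete the pair from the S-VOL and the P-VOL, "
                "then re-create it specifying the P-VOL (VSP 5000 series)")
    return None


print(recovery_action("PSUE", "SSWS", "Block", "Local"))
```

A result of None means that none of the tables applies, so the pair status should be checked again before any pair operation is run.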

Pair condition and recovery: path failure, primary to external storage system

If the status of the external storage system is normal, a failure might have occurred in a physical path from the primary or secondary storage system to the external storage system, or a switch. Recover from the failure that occurred in the physical path or switch.

The following table shows transitions for pair status and I/O mode, the volumes that are accessible from the server, and the location of the latest data when you can no longer use any physical path from the primary storage system to the quorum disk's external storage system.

Before failure                                  After failure
Pair status and I/O mode                        Pair status and I/O mode                        Volume accessible from the server   Volume with latest data
P-VOL                   S-VOL                   P-VOL                   S-VOL                   P-VOL             S-VOL
PAIR (Mirror (RL)) [1]  PAIR (Mirror (RL)) [1]  PAIR (Mirror (RL))      PAIR (Block)            OK                NG                Both P-VOL and S-VOL
PAIR (Mirror (RL))      PAIR (Mirror (RL))      PAIR (Mirror (RL))      PAIR (Mirror (RL))      OK                OK                Both P-VOL and S-VOL
PAIR (Mirror (RL))      PAIR (Block)            PAIR (Mirror (RL))      PAIR (Block)            OK                NG                Both P-VOL and S-VOL
COPY (Mirror (RL))      COPY (Block)            COPY (Mirror (RL)) [2]  COPY (Block) [2]        OK                NG                P-VOL
PSUS/PSUE (Local)       SSUS/PSUE (Block)       PSUS/PSUE (Local)       SSUS/PSUE (Block)       OK                NG                P-VOL
PSUS/PSUE (Block)       SSWS (Local)            PSUS/PSUE (Block)       SSWS (Local)            NG                OK                S-VOL

Notes:

  1. The pair status and I/O mode after the failure depend on the requirements of the pair. For details, see Server I/Os and data mirroring with blocked quorum disk or without quorum disk volumes.
  2. The P-VOL might change to PSUE (Local) and the S-VOL might change to PSUE (Block) if a failure occurs on a physical path to an external storage system for quorum disks immediately after the pair status changes to COPY.
SIMs
  • Primary storage system: 21D0xy, 21D2xx, DD2xyy, DEF0zz, EF5xyy, EFD000, FF5xyy
  • Secondary storage system: DD2xyy

Procedure

  1. Recover the paths to the external storage system.

  2. Resynchronize the pair suspended by a failure.
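
When triaging, it can help to check a reported SIM reference code against the patterns listed above. The following Python sketch shows one way to do that. It assumes that the lowercase letters x, y, and z in a pattern each stand for one hexadecimal digit, which this section does not state explicitly. The function names and the sample codes are hypothetical.

```python
import re

# SIM reference code patterns as listed above for this failure.
PRIMARY_SIMS = ["21D0xy", "21D2xx", "DD2xyy", "DEF0zz", "EF5xyy", "EFD000", "FF5xyy"]
SECONDARY_SIMS = ["DD2xyy"]


def sim_regex(pattern):
    """Build a regex in which lowercase x, y, and z match one hex digit (assumption)."""
    parts = ["[0-9A-Fa-f]" if ch in "xyz" else re.escape(ch) for ch in pattern]
    return re.compile("".join(parts), re.IGNORECASE)


def matching_patterns(code, patterns):
    """Return the listed patterns that a reported SIM reference code matches."""
    return [p for p in patterns if sim_regex(p).fullmatch(code)]


# Hypothetical reported codes, used only to exercise the matcher.
print(matching_patterns("21D012", PRIMARY_SIMS))
print(matching_patterns("DD2134", SECONDARY_SIMS))
```

An empty result means only that the code is not in the list for this failure location. It does not identify the actual cause.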

Recovering the path: primary to external storage system (pair status: PAIR)

The following figure shows the failure area and recovery when the GAD status of a pair changes to Suspended or Quorum disk blocked due to path failure from the primary storage system to the external storage system.

When the GAD status of the pair changes to Suspended, the P-VOL I/O mode changes to Local and the S-VOL I/O mode changes to Block. Server I/O continues on the P-VOL. When the GAD status of the pair changes to Quorum disk blocked, the P-VOL I/O mode remains Mirror (RL), and the S-VOL I/O mode changes to Block. Server I/O continues on the P-VOL.

GUID-AD6FEAF7-8FFD-4FD9-AC07-830A8D3DFD3C-low.png

Procedure

  1. Recover the path to the external storage system.

    1. Reconnect the physical path or reconfigure the SAN to recover the path to the external storage system.

      After the path is recovered, the remote path is automatically recovered.
    2. Confirm that the external storage system is connected correctly.

      raidcom get path -path_grp 1 -IH0
      PHG GROUP STS CM IF MP# PORT   WWN                 PR LUN PHS  Serial# PRODUCT_ID LB PM DM
        1 1-1   NML E  D    0 CL5-A  50060e8008000140     1   0 NML    55555 VSP 5000 series  N  M D
    3. Confirm the LDEV ID of the quorum disk by obtaining the information of the external volume from the primary storage system.

      raidcom get external_grp -external_grp_id 1-1 -IH0
      T GROUP  P_NO  LDEV#   STS         LOC_LBA        SIZE_LBA            Serial#
      E 1-1       0   9999   NML  0x000000000000  0x000003c00000             555555
    4. Confirm that the primary storage system recognizes the external volume as a quorum disk by specifying the LDEV ID of the quorum disk.

      raidcom get ldev -ldev_id 0x9999 -fx -IH0 
      (snip) 
      QRDID : 0 
      QRP_Serial# : 522222 
      QRP_ID : R9 
      (snip)
      Note: VSP 5000 series is displayed as R9 in command output. VSP Fx00 models and VSP Gx00 models are displayed as M8 in command output.
  2. If the GAD status of the pair is Quorum disk blocked: The pair changes to Mirrored status automatically.

    If the GAD status of the pair is Suspended: Resynchronize the pair as follows.

    1. Confirm that the I/O mode of the P-VOL is Local.

      pairdisplay -g oraHA -fxce -IH0
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-A-0, 0,   0)511111  2222.P-VOL PSUE 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/L
      oraHA   dev1(R)     (CL1-C-1, 0,   0)522222  4444.S-VOL PSUE 
      NEVER ,  100  2222 -   -   0  -  -            -       - B/B
    2. At the primary storage system, resynchronize the pair.

      pairresync -g oraHA -IH0
    3. Confirm that the P-VOL and S-VOL pair statuses change to PAIR (Mirror (RL)).

      pairdisplay -g oraHA -fxce -IH0
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-A-0, 0,   0)511111  2222.P-VOL PAIR 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/M
      oraHA   dev1(R)     (CL1-C-1, 0,   0)522222  4444.S-VOL PAIR 
      NEVER ,  100  2222 -   -   0  -  -            -       - L/M
      pairdisplay -g oraHA -fxce -IH1
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-C-1, 0,   0)522222  4444.S-VOL PAIR 
      NEVER ,  100  2222 -   -   0  -  -            -       - L/M
      oraHA   dev1(R)     (CL1-A-0, 0,   0)511111  2222.P-VOL PAIR 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/M
  3. Using the alternate path software, resume I/O to the volume that could not be accessed from the server (I/O might resume automatically).
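
The checks in step 2 read the pair status and the R/W column from pairdisplay output. The following Python sketch parses output in the wrapped layout shown above. It is a minimal illustration, not part of CCI: the function names are hypothetical, and the mapping of R/W values to I/O modes is inferred from the examples in this procedure (L/L with Local, B/B with Block, L/M with Mirror (RL)).

```python
import re

# R/W column values as they appear in the examples, mapped to the I/O mode
# named in the accompanying text.
IO_MODES = {"L/L": "Local", "L/M": "Mirror (RL)", "B/B": "Block"}

VOLUME = re.compile(r"\b[0-9A-Fa-f]+\.(P-VOL|S-VOL)\s+(\S+)")
RW = re.compile(r"\s([LB-]/[LMB-])\s*$")


def parse_pairdisplay(text):
    """Extract role, pair status, and I/O mode for each volume.

    Handles the wrapped layout shown in this document, where the R/W
    column falls on the line after the status.
    """
    volumes = []
    current = None
    for line in text.splitlines():
        found = VOLUME.search(line)
        if found:
            current = {"role": found.group(1), "status": found.group(2), "io_mode": None}
            volumes.append(current)
        rw = RW.search(line)
        if rw and current is not None and current["io_mode"] is None:
            current["io_mode"] = IO_MODES.get(rw.group(1), rw.group(1))
    return volumes


def pvol_is_local(volumes):
    """True if a P-VOL with I/O mode Local is present."""
    return any(v["role"] == "P-VOL" and v["io_mode"] == "Local" for v in volumes)


def is_mirrored(volumes):
    """True if every parsed volume is PAIR with I/O mode Mirror (RL)."""
    return bool(volumes) and all(
        v["status"] == "PAIR" and v["io_mode"] == "Mirror (RL)" for v in volumes
    )


SAMPLE = """\
oraHA   dev1(L)     (CL1-A-0, 0,   0)511111  2222.P-VOL PSUE
NEVER ,  100  4444 -   -   0  -  -            -       - L/L
oraHA   dev1(R)     (CL1-C-1, 0,   0)522222  4444.S-VOL PSUE
NEVER ,  100  2222 -   -   0  -  -            -       - B/B
"""

volumes = parse_pairdisplay(SAMPLE)
print(pvol_is_local(volumes), is_mirrored(volumes))
```

With the suspended-pair sample, pvol_is_local corresponds to the check before resynchronization, and is_mirrored corresponds to the check after it.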

Pair condition and recovery: path failure, secondary to external storage system

If the status of the external storage system is normal, a failure might have occurred in a physical path from the primary or secondary storage system to the external storage system, or a switch. Recover from the failure that occurred in the physical path or switch.

The following table shows transitions for pair status and I/O mode, the volumes that are accessible from the server, and the location of the latest data when you can no longer use any physical path from the secondary storage system to the quorum disk's external storage system.

Before failure                                  After failure
Pair status and I/O mode                        Pair status and I/O mode                        Volume accessible from the server   Volume with latest data
P-VOL                   S-VOL                   P-VOL                   S-VOL [1]               P-VOL             S-VOL
PAIR (Mirror (RL)) [1]  PAIR (Mirror (RL)) [1]  PAIR (Mirror (RL))      PAIR (Block)            OK                NG                Both P-VOL and S-VOL
PAIR (Mirror (RL))      PAIR (Mirror (RL))      PAIR (Mirror (RL))      PAIR (Mirror (RL))      OK                OK                Both P-VOL and S-VOL
PAIR (Mirror (RL))      PAIR (Block)            PAIR (Mirror (RL))      PAIR (Block)            OK                NG                Both P-VOL and S-VOL
COPY (Mirror (RL))      COPY (Block)            COPY (Mirror (RL)) [2]  COPY (Block) [2]        OK                NG                P-VOL
PSUS/PSUE (Local)       SSUS/PSUE (Block)       PSUS/PSUE (Local)       SSUS/PSUE (Block)       OK                NG                P-VOL
PSUS/PSUE (Block)       SSWS (Local)            PSUS/PSUE (Block)       SSWS (Local)            NG                OK                S-VOL

Notes:

  1. The pair status and I/O mode after the failure depend on the requirements of the pair. For details, see Server I/Os and data mirroring with blocked quorum disk or without quorum disk volumes.
  2. The P-VOL might change to PSUE (Local) and the S-VOL might change to PSUE (Block) if a failure occurs on a physical path to an external storage system for quorum disks immediately after the pair status changes to COPY.
SIMs
  • Primary storage system: DD2xyy
  • Secondary storage system: 21D0xy, 21D2xx, DD2xyy, DEF0zz, EF5xyy, EFD000, FF5xyy

Procedure

  1. Recover the paths to the external storage system.

  2. Resynchronize the pair suspended by a failure.

Recovering the path: secondary to external storage system (pair status: PAIR)

This section describes how to resolve a failure on the path between the secondary storage system and the external storage system and restore the GAD pair. The following figure shows the failure area and recovery when the GAD status of a pair changes to Suspended or Quorum disk blocked due to path failure from the secondary storage system to the external storage system.

When the GAD status of the pair changes to Suspended, the P-VOL I/O mode changes to Local and the S-VOL I/O mode changes to Block. Server I/O continues on the P-VOL. When the GAD status of the pair changes to Quorum disk blocked, the P-VOL I/O mode remains Mirror (RL), and the S-VOL I/O mode changes to Block. Server I/O continues on the P-VOL.

GUID-7CC2496C-8EBD-420F-AB55-95F7A3A21C42-low.png

Procedure

  1. Recover the path to the external storage system.

    1. Reconnect the physical path or reconfigure the SAN to recover the path to the external storage system.

      After the path is recovered, the remote path is automatically recovered.
    2. Confirm that the external storage system is connected correctly.

      (VSP 5000 series)
      raidcom get path -path_grp 1 -IH1
      PHG GROUP STS CM IF MP# PORT   WWN                 PR LUN PHS  Serial# PRODUCT_ID LB PM DM
        1 1-1   NML E  D    0 CL5-C  50060e8008000160     1   0 NML   555555 VSP 5000 series  N  M D
      (VSP Gx00 models and VSP Fx00 models)
      raidcom get path -path_grp 1 -IH1
      PHG GROUP STS CM IF MP# PORT   WWN                 PR LUN PHS  Serial# PRODUCT_ID LB PM
      1 1-2     NML E  D    0 CL5-C  50060e8007823521     1   0 NML   433333   VSP Gx00 N  M
      
    3. Confirm the LDEV ID of the quorum disk by obtaining the information of the external volume from the secondary storage system.

      raidcom get external_grp -external_grp_id 1-2 -IH1
      T GROUP  P_NO  LDEV#   STS         LOC_LBA        SIZE_LBA            Serial#
      E 1-2       0   8888   NML  0x000000000000  0x000003c00000             555555
    4. Confirm that the secondary storage system recognizes the external volume as a quorum disk by specifying the LDEV ID of the quorum disk.

      raidcom get ldev -ldev_id 0x8888 -fx -IH1
      (snip)
      QRDID : 0
      QRP_Serial# : 511111
      QRP_ID : R9
      (snip)
      Note: VSP 5000 series is displayed as R9 in command output. VSP Fx00 models and VSP Gx00 models are displayed as M8 in command output.
  2. If the GAD status of the pair is Quorum disk blocked: The pair changes to Mirrored status automatically.

    If the GAD status of the pair is Suspended: Resynchronize the pair as follows.

    1. Confirm that the P-VOL I/O mode is Local.

      pairdisplay -g oraHA -fxce -IH0
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-A-0, 0,   0)511111  2222.P-VOL PSUE 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/L
      oraHA   dev1(R)     (CL1-C-1, 0,   0)522222  4444.S-VOL PSUE 
      NEVER ,  100  2222 -   -   0  -  -            -       - B/B
    2. At the primary storage system, resynchronize the pair.

      pairresync -g oraHA -IH0
    3. Confirm that the P-VOL and S-VOL pair statuses change to PAIR (Mirror (RL)).

      pairdisplay -g oraHA -fxce -IH0
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-A-0, 0,   0)511111  2222.P-VOL PAIR 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/M
      oraHA   dev1(R)     (CL1-C-1, 0,   0)522222  4444.S-VOL PAIR 
      NEVER ,  100  2222 -   -   0  -  -            -       - L/M
      pairdisplay -g oraHA -fxce -IH1
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-C-1, 0,   0)522222  4444.S-VOL PAIR 
      NEVER ,  100  2222 -   -   0  -  -            -       - L/M
      oraHA   dev1(R)     (CL1-A-0, 0,   0)511111  2222.P-VOL PAIR 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/M
  3. Using the alternate path software, resume I/O to the S-VOL (I/O might resume automatically).
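
Substep 4 of step 1 relies on the QRDID, QRP_Serial#, and QRP_ID fields in raidcom get ldev output. The following Python sketch extracts those fields from text in the form shown above. The function names are hypothetical. Treating a missing QRDID line as "not recognized as a quorum disk" is an assumption made for this illustration.

```python
# Product names for QRP_ID values, taken from the note in this procedure.
QRP_MODELS = {"R9": "VSP 5000 series", "M8": "VSP Gx00 models and VSP Fx00 models"}


def parse_fields(text):
    """Parse 'KEY : value' lines; lines without a colon, such as (snip), are skipped."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def quorum_info(ldev_output):
    """Return quorum details from raidcom get ldev output, or None if QRDID is absent."""
    fields = parse_fields(ldev_output)
    if "QRDID" not in fields:
        # Assumption: no QRDID line means the volume is not shown as a quorum disk.
        return None
    return {
        "quorum_id": int(fields["QRDID"]),
        "qrp_serial": fields.get("QRP_Serial#"),
        "qrp_model": QRP_MODELS.get(fields.get("QRP_ID"), "unknown"),
    }


SAMPLE = """\
(snip)
QRDID : 0
QRP_Serial# : 522222
QRP_ID : R9
(snip)
"""

print(quorum_info(SAMPLE))
```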

Pair condition and recovery: quorum disk failure

The following table shows transitions for pair status and I/O mode, the volumes that are accessible from the server, and the location of the latest data when you can no longer use the quorum disk volume.

Before failure                                  After failure
Pair status and I/O mode                        Pair status and I/O mode                        Volume accessible from the server   Volume with latest data
P-VOL                   S-VOL                   P-VOL                   S-VOL [1]               P-VOL             S-VOL
PAIR (Mirror (RL)) [1]  PAIR (Mirror (RL)) [1]  PAIR (Mirror (RL))      PAIR (Block)            OK                NG                Both P-VOL and S-VOL
PAIR (Mirror (RL))      PAIR (Mirror (RL))      PAIR (Mirror (RL))      PAIR (Mirror (RL))      OK                OK                Both P-VOL and S-VOL
PAIR (Mirror (RL))      PAIR (Block)            PAIR (Mirror (RL))      PAIR (Block)            OK                NG                Both P-VOL and S-VOL
COPY (Mirror (RL))      COPY (Block)            COPY (Mirror (RL)) [2]  COPY (Block) [2]        OK                NG                P-VOL
PSUS/PSUE (Local)       SSUS/PSUE (Block)       PSUS/PSUE (Local)       SSUS/PSUE (Block)       OK                NG                P-VOL
PSUS/PSUE (Block)       SSWS (Local)            PSUS/PSUE (Block)       SSWS (Local)            NG                OK                S-VOL

Notes:

  1. The pair status and I/O mode after the failure depend on the requirements of the pair. For details, see Server I/Os and data mirroring with blocked quorum disk or without quorum disk volumes.
  2. The P-VOL might change to PSUE (Local) and the S-VOL might change to PSUE (Block) if a quorum disk failure occurs immediately after the pair status changes to COPY.
SIMs
  • Primary storage system: 21D0xy, 21D2xx, DD2xyy, DEF0zz, EF5xyy, EFD000, FF5xyy
  • Secondary storage system: 21D0xy, 21D2xx, DD2xyy, DEF0zz, EF5xyy, EFD000, FF5xyy

Procedure

  1. Recover the quorum disk.

  2. Resynchronize or re-create GAD pairs if they are suspended by a failure.

Recovering the quorum disk (pair status: PAIR)

When the GAD status of the pair changes to Suspended, the P-VOL I/O mode changes to Local and the S-VOL I/O mode changes to Block. Server I/O continues on the P-VOL. When the GAD status of the pair changes to Quorum disk blocked, the P-VOL I/O mode remains Mirror (RL), and the S-VOL I/O mode changes to Block. Server I/O continues on the P-VOL.

The following figure shows the failure area and recovery when a pair is suspended due to quorum disk failure.

Note: In this example, no consistency group is specified.
GUID-40FDC044-6981-42F2-99E7-229BA993D1B9-low.png
Note: The following procedure is also used for re-creating the quorum disk when it has been mistakenly reformatted.
Note: Steps 1 and 2 below describe the recovery procedure for an external storage system made by Hitachi, for example, VSP G1000. If you are using another vendor's storage system as external storage, follow the vendor's recovery procedure for the storage system. When you complete the recovery procedure for the external storage system, start the following procedure at step 3.

Procedure

  1. On the external storage system, recover the quorum disk.

    1. Block the quorum disk.

    2. Format the quorum disk.

      If the quorum disk recovers after formatting, go to substep 8 (reconnecting the external storage system or the quorum disk).

      If the quorum disk does not recover after formatting, continue to substep 3.

      Note: You can recover the quorum disk by replacing its external storage system with a new one while keeping the GAD pair.
    3. Confirm the following information about the quorum disk:

      - Vendor

      - Machine name

      - Volume properties

      - Device ID (if the information is valid)

      - Serial number

      - SSID

      - Product ID

      - LBA capacity (the capacity must be larger than the quorum disk before the failure occurred)

      - CVS attribute

      For details about confirming this information, see the Hitachi Universal Volume Manager User Guide.

      For details about confirming the CVS attribute, see Table 1: Confirming the CVS attribute on the external storage system.

    4. Delete the LU path to the quorum disk.

    5. Delete the volume that is used as the quorum disk.

    6. Create a new volume.

      For the LDEV ID, set the same value as the LDEV ID of the quorum disk that was in use before the failure occurred. If you cannot set the same value, go to step 3.

      Also set the following information to the same values that were used before the failure occurred. If you cannot set the same values, go to step 3.

      - Vendor

      - Machine name

      - Volume properties

      - Device ID (if the information is valid)

      - Serial number

      - SSID

      - Product ID

      - LBA capacity (the capacity must be larger than the quorum disk before the failure occurred)

      - CVS attribute

      For details about confirming this information, see the Hitachi Universal Volume Manager User Guide. For details about confirming the CVS attribute, see Table 1: Confirming the CVS attribute on the external storage system and Table 2: Conditions for the CVS attribute for volumes created in the external storage system.

    7. Set an LU path to the new volume.

      For the LU number, set the same value as the LU number of the quorum disk that was in use before the failure occurred. If you cannot set the same value, go to step 3.
    8. Reconnect the external storage system or the quorum disk to the primary and secondary storage systems.

  2. If the GAD status of the pair is Quorum disk blocked: The pair changes to the Mirrored status automatically.

    If the GAD status of the pair is Suspended: Resynchronize the pair as follows.

    1. Confirm that the P-VOL I/O mode is Local.

      pairdisplay -g oraHA -fxce -IH0
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-A-0, 0,   0)511111  2222.P-VOL PSUE 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/L
      oraHA   dev1(R)     (CL1-C-1, 0,   0)522222  4444.S-VOL PSUE 
      NEVER ,  100  2222 -   -   0  -  -            -       - B/B
    2. On the primary storage system, resynchronize the pair.

      pairresync -g oraHA -IH0
    3. Confirm that the P-VOL and S-VOL pair status has changed to PAIR (Mirror (RL)). If so, go to step 4.

      pairdisplay -g oraHA -fxce -IH0
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-A-0, 0,   0)511111  2222.P-VOL PAIR 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/M
      oraHA   dev1(R)     (CL1-C-1, 0,   0)522222  4444.S-VOL PAIR 
      NEVER ,  100  2222 -   -   0  -  -            -       - L/M
      pairdisplay -g oraHA -fxce -IH1
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-C-1, 0,   0)522222  4444.S-VOL PAIR 
      NEVER ,  100  2222 -   -   0  -  -            -       - L/M
      oraHA   dev1(R)     (CL1-A-0, 0,   0)511111  2222.P-VOL PAIR 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/M
  3. Re-create the pairs.

    1. On the primary storage system, delete all pairs that use the quorum disk where the failure occurred.

      pairsplit -g oraHA -S -d dev1 -IH0
    2. Confirm that the pairs were deleted.

      pairdisplay -g oraHA -fxce -IH0
      Group   PairVol(L/R) (Port#,TID, LU), Seq#,   LDEV#.P/S,Status,Fence,   %,    P-LDEV# M   CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)      (CL1-A-0, 0,   0)511111  2222.SMPL  ---- ------,  ----- -----   -   -   -   -  -        -      -       -/-
      oraHA   dev1(R)      (CL1-C-1, 0,   0)522222  4444.SMPL  ---- ------,  ----- -----   -   -   -   -  -        -      -       -/-
    3. On the primary and secondary storage systems, delete the quorum disk.

    4. On the primary and secondary storage systems, add a quorum disk.

    5. On the primary storage system, create the pairs.

      paircreate -g oraHA -f never -vl -jq 0 -d dev1 -IH0
    6. Confirm that the P-VOL and S-VOL pair statuses change to PAIR (Mirror (RL)).

      pairdisplay -g oraHA -fxce -IH0
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-A-0, 0,   0)511111  2222.P-VOL PAIR 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/M
      oraHA   dev1(R)     (CL1-C-1, 0,   0)522222  4444.S-VOL PAIR 
      NEVER ,  100  2222 -   -   0  -  -            -       - L/M
      pairdisplay -g oraHA -fxce -IH1
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-C-1, 0,   0)522222  4444.S-VOL PAIR 
      NEVER ,  100  2222 -   -   0  -  -            -       - L/M
      oraHA   dev1(R)     (CL1-A-0, 0,   0)511111  2222.P-VOL PAIR 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/M
  4. Using the alternate path software, resume I/O to the S-VOL (I/O might resume automatically).

Next steps

Note: When the external storage system is installed at the primary site, if a failure occurs in both the primary storage system and the external storage system, forcibly delete the pair on the secondary storage system, and then re-create the pair. For details, see Recovering the storage systems: primary site failure with external storage system.
Table 1: Confirming the CVS attribute on the external storage system

Interface

To confirm the CVS attribute:

HCS

Open the Logical device window, and then confirm whether the CVS attribute is displayed in the Emulation type column for the LDEV that is being used as the quorum disk.

CCI

Run the raidcom get ldev command from CCI against the LDEV that the external storage system uses as the quorum disk, and then confirm whether the CVS attribute is output for VOL_TYPE. For details about the raidcom get ldev command, see the Command Control Interface Command Reference.

Web Console*

Confirm whether the CVS attribute is displayed in the CVS column on the LUN Management window.

* Ask the maintenance personnel to operate the Web Console.
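
For the CCI row in Table 1, the check can be scripted against saved raidcom get ldev output. The following Python sketch is a minimal illustration. It assumes that the CVS attribute appears as the substring CVS in the VOL_TYPE value. The function name is hypothetical, and the sample values are illustrative rather than captured from a real system.

```python
def has_cvs_attribute(ldev_output):
    """Return True or False if a VOL_TYPE line is found, or None if it is missing."""
    for line in ldev_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "VOL_TYPE":
            # Assumption: the attribute is shown as "CVS" within the VOL_TYPE value.
            return "CVS" in value.upper()
    return None


# Illustrative VOL_TYPE values, not captured from a real system.
print(has_cvs_attribute("LDEV : 9999\nVOL_TYPE : OPEN-V-CVS"))
print(has_cvs_attribute("LDEV : 9999\nVOL_TYPE : OPEN-V"))
```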

Table 2: Conditions for the CVS attribute for volumes created in the external storage system

Interface

Condition

CVS attribute

HDvM - SN HCS

CCI

Internal volume or external volume

Allowed

HDP-VOL

VSP G1x00, VSP F1500, or VSP 5000 series

VSP Gx00 models and VSP Fx00 models or later

Allowed

VSP or earlier

HUS VM or earlier

Create LDEV of maximum size

Not allowed

Create LDEV less than maximum size

Allowed

Web Console*

The LDEV is created during a Define Config & Install or ECC/LDEV installation operation with the Number of LDEVs on the Device Emulation Type Define window left at its initial value.

Not allowed

Other than above

Allowed

* Ask the maintenance personnel to operate the Web Console.

Recovering from quorum disk failures

You can recover from quorum disk failures without deleting GAD pairs.

Replacing a failed external storage system with a new one
By replacing a failed external storage system with a new one, you can recover from quorum disk failures without deleting GAD pairs.

GUID-6B0E0E19-47F3-42A3-8130-A688CCB3E3EA-low.png

Caution: If no GAD pair is specified for the same quorum disk ID, delete the quorum disk first, and then re-create it. If you replace the external volume for the quorum disk while no GAD pair is specified for the same quorum disk ID, the replacement might fail. After you re-create the quorum disk, create GAD pairs if necessary.

Procedure

  1. Prepare a new quorum disk.

    1. Format the disk of a new external storage system.

    2. Map the formatted disk to the primary and secondary storage systems.

      Use the same procedure you use for creating a quorum disk. However, you do not need to set an external volume for the quorum disk.
  2. Check the status of the quorum disk.

    1. On the primary storage system, confirm that the status of the quorum disk is BLOCKED.

      (VSP 5000 series)

      raidcom get quorum -quorum_id 1 -IH0
      QRDID : 1
      LDEV : 2045
      QRP_Serial# : 511111
      QRP_ID : R9
      Timeout(s) : 30
      STS : BLOCKED
      

      (VSP Gx00 models and VSP Fx00 models)

      raidcom get quorum -quorum_id 1 -IH0
      QRDID : 1
      LDEV : 2045
      QRP_Serial# : 411111
      QRP_ID : M8
      Timeout(s) : 30
      STS : BLOCKED
      
    2. On the secondary storage system, confirm that the status of the quorum disk is BLOCKED.

      (VSP 5000 series)

      raidcom get quorum -quorum_id 1 -IH1
      QRDID : 1
      LDEV : 2045
      QRP_Serial# : 522222
      QRP_ID : R9
      Timeout(s) : 30
      STS : BLOCKED
      

      (VSP Gx00 models and VSP Fx00 models)

      raidcom get quorum -quorum_id 1 -IH1
      QRDID : 1
      LDEV : 2045
      QRP_Serial# : 422222
      QRP_ID : M8
      Timeout(s) : 30
      STS : BLOCKED
      
  3. Check the pair operation mode for the blocked quorum disk.

    Depending on the check results, you might have to split the GAD pair.
    1. If the QM column in the output of the pairdisplay -fcxe command is AA, the GAD pair is split.

      If this is the case, go to step 4.

      If the QM column shows a value other than AA, the GAD pair is not split in most cases. Go to substep 2.

      # pairdisplay -g oraHA -fcxe -d dev0
      Group   PairVol(L/R)(Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM E-Seq# E-LDEV# R/W QM
      oraHA   dev0(L)     (CL1-C-0, 0,0)511111   400.P-VOL PAIR 
      NEVER,  100   500  -   -   0  1  -      -       - L/M AA
      oraHA   dev0(R)     (CL7-C-0,28,0)522222   500.S-VOL PAIR 
      NEVER,  100   400  -   -   0  1  -      -       - L/M AA
      
    2. Split the GAD pair if it is not already split.

      pairsplit -g oraHA -IH0
  4. Replace the external volume for the quorum disk.

    1. On the primary storage system, replace the current external volume for the quorum disk with a new one.

      raidcom replace quorum -quorum_id 1 -ldev_id 1234 -IH0
    2. On the primary storage system, confirm that the status of the quorum disk is REPLACING.

      (VSP 5000 series)

      raidcom get quorum -quorum_id 1 -IH0
      QRDID : 1
      LDEV : 1234
      QRP_Serial# : 511111
      QRP_ID : R9
      Timeout(s) : 30
      STS : REPLACING
      

      (VSP Gx00 models and VSP Fx00 models)

      raidcom get quorum -quorum_id 1 -IH0
      QRDID : 1
      LDEV : 1234
      QRP_Serial# : 411111
      QRP_ID : M8
      Timeout(s) : 30
      STS : REPLACING
      
    3. On the secondary storage system, replace the current external volume for the quorum disk with a new one.

      raidcom replace quorum -quorum_id 1 -ldev_id 1234 -IH1
    4. On the secondary storage system, confirm that the status of the quorum disk is REPLACING.

      (VSP 5000 series)

      raidcom get quorum -quorum_id 1 -IH1
      QRDID : 1
      LDEV : 1234
      QRP_Serial# : 522222
      QRP_ID : R9
      Timeout(s) : 30
      STS : REPLACING
      

      (VSP Gx00 models and VSP Fx00 models)

      raidcom get quorum -quorum_id 1 -IH1
      QRDID : 1
      LDEV : 1234
      QRP_Serial# : 422222
      QRP_ID : M8
      Timeout(s) : 30
      STS : REPLACING
      
      Note: If the raidcom replace quorum command completes normally, the status of the quorum disk changes from BLOCKED to REPLACING within a few seconds. If the status does not change within a few minutes, contact customer support.
  5. Resynchronize the GAD pair you previously split.

    pairresync -g oraHA -IH0
  6. Confirm that the status of the quorum disk is NORMAL.

    1. On the primary storage system, confirm that the status of the quorum disk is NORMAL.

      (VSP 5000 series)

      raidcom get quorum -quorum_id 1 -IH0
      QRDID : 1
      LDEV : 1234
      QRP_Serial# : 511111
      QRP_ID : R9
      Timeout(s) : 30
      STS : NORMAL
      
      

      (VSP Gx00 models and VSP Fx00 models)

      raidcom get quorum -quorum_id 1 -IH0
      QRDID : 1
      LDEV : 1234
      QRP_Serial# : 411111
      QRP_ID : M8
      Timeout(s) : 30
      STS : NORMAL
      
      
    2. On the secondary storage system, confirm that the status of the quorum disk is NORMAL.

      (VSP 5000 series)

      raidcom get quorum -quorum_id 1 -IH1
      QRDID : 1
      LDEV : 1234
      QRP_Serial# : 522222
      QRP_ID : R9
      Timeout(s) : 30
      STS : NORMAL
      
      
      

      (VSP Gx00 models and VSP Fx00 models)

      raidcom get quorum -quorum_id 1 -IH1
      QRDID : 1
      LDEV : 1234
      QRP_Serial# : 422222
      QRP_ID : M8
      Timeout(s) : 30
      STS : NORMAL
      
      

      If the raidcom replace quorum command is executed normally, the status of the quorum disk changes from REPLACING to NORMAL in a minute.

  7. If the status does not change in five minutes, check whether remote paths between the storage systems are in Normal state.

  8. Confirm that the GAD pair you resynchronized in step 5 is synchronized normally.

    If the status of the replaced quorum disk is FAILED, the primary storage system and the secondary storage system might be connected to different quorum disks.
    1. Specify the external volume so that the primary storage system and the secondary storage system are connected to the same quorum disk.

    2. After specifying the correct external volume, perform steps 5 through 8.
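
The status checks in the procedure above can be scripted. The following is a minimal Python sketch, assuming the raidcom get quorum output uses the "KEY : value" layout shown in the examples. The function names parse_quorum_output and quorum_status are hypothetical helpers and are not part of CCI.

```python
def parse_quorum_output(output: str) -> dict:
    """Parse 'KEY : value' lines from raidcom get quorum output into a dict."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # lines without a colon (such as the echoed command) are skipped
            fields[key.strip()] = value.strip()
    return fields


def quorum_status(output: str) -> str:
    """Return the STS value (for example REPLACING or NORMAL), or '' if absent."""
    return parse_quorum_output(output).get("STS", "")


# Sample text copied from the VSP 5000 series example in this procedure.
sample = """raidcom get quorum -quorum_id 1 -IH0
QRDID : 1
LDEV : 1234
QRP_Serial# : 511111
QRP_ID : R9
Timeout(s) : 30
STS : REPLACING
"""

print(quorum_status(sample))
```

A script could call quorum_status on captured command output after each step and compare the result with the expected status (REPLACING or NORMAL) before continuing.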

Recovering from the FAILED status

When the primary storage system and the secondary storage system are connected to different quorum disks, the status of the quorum disk shows FAILED. If this happens, disconnect the storage systems from the quorum disk first, and then replace the external volume for the quorum disk with the new one.

Procedure

  1. Disconnect the primary and secondary storage systems from the quorum disk.

    1. On the primary storage system, disconnect from the quorum disk.

      (VSP 5000 series)

      raidcom disconnect external_grp -ldev_id 0x2045 -IH0

      (VSP Gx00 models and VSP Fx00 models)

      raidcom disconnect external_grp -ldev_id 0x2045 -IH0
      
    2. On the secondary storage system, disconnect from the quorum disk.

      (VSP 5000 series)

      raidcom disconnect external_grp -ldev_id 0x2045 -IH1

      (VSP Gx00 models and VSP Fx00 models)

      raidcom disconnect external_grp -ldev_id 0x2045 -IH1
      
  2. Confirm that the primary storage system and the secondary storage system are disconnected from the quorum disk.

    1. On the primary storage system, confirm that the connection with the quorum disk is disconnected.

      (VSP 5000 series)

      raidcom get path -path_grp 1 -IH0
      PHG GROUP STS CM IF MP# PORT WWN PR LUN PHS
      Serial# PRODUCT_ID LB PM DM
      1 1-1 DSC E D 0 CL5-A 50060e8008000140 1 0 NML
      55555 VSP 5000 series N M D
      

      (VSP Gx00 models and VSP Fx00 models)

      raidcom get path -path_grp 1 -IH0
      PHG GROUP STS CM IF MP# PORT WWN PR LUN PHS
      Serial# PRODUCT_ID LB PM
      1 1-1 DSC E D 0 CL5-A 50060e8007823520 1 0 NML
      433333 VSP Gx00 N M
      
    2. On the secondary storage system, confirm that the connection with the quorum disk is disconnected.

      (VSP 5000 series)

      raidcom get path -path_grp 1 -IH1
      PHG GROUP STS CM IF MP# PORT WWN PR LUN PHS
      Serial# PRODUCT_ID LB PM
      1 1-2 DSC E D 0 CL5-C 50060e8008000140 1 0 NML
      55555 VSP 5000 series N M
      

      (VSP Gx00 models and VSP Fx00 models)

      raidcom get path -path_grp 1 -IH1
      PHG GROUP STS CM IF MP# PORT WWN PR LUN PHS
      Serial# PRODUCT_ID LB PM
      1 1-2 DSC E D 0 CL5-C 50060e8007823521 1 0 NML
      433333 VSP Gx00 N M
      
  3. Replace the external volume for the quorum disk with a new one.

    1. On the primary storage system, replace the current external volume for the quorum disk with a new one.

      (VSP 5000 series)

      raidcom replace quorum -quorum_id 1 -ldev_id 1234 -IH0
      

      (VSP Gx00 models and VSP Fx00 models)

      raidcom replace quorum -quorum_id 1 -ldev_id 1234 -IH0
      
    2. On the primary storage system, confirm that the status of the quorum disk is REPLACING.

      (VSP 5000 series)

      raidcom get quorum -quorum_id 1 -IH0
      QRDID : 1
      LDEV : 1234
      QRP_Serial# : 511111
      QRP_ID : R9
      Timeout(s) : 30
      STS : REPLACING
      

      (VSP Gx00 models and VSP Fx00 models)

      raidcom get quorum -quorum_id 1 -IH0
      QRDID : 1
      LDEV : 1234
      QRP_Serial# : 411111
      QRP_ID : M8
      Timeout(s) : 30
      STS : REPLACING
      
    3. On the secondary storage system, replace the current external volume for the quorum disk with a new one.

      (VSP 5000 series)

      raidcom replace quorum -quorum_id 1 -ldev_id 1234 -IH1
      

      (VSP Gx00 models and VSP Fx00 models)

      raidcom replace quorum -quorum_id 1 -ldev_id 1234 -IH1
      
    4. On the secondary storage system, confirm that the status of the quorum disk is REPLACING.

      (VSP 5000 series)

      raidcom get quorum -quorum_id 1 -IH1
      QRDID : 1
      LDEV : 1234
      QRP_Serial# : 522222
      QRP_ID : R9
      Timeout(s) : 30
      STS : REPLACING
      

      (VSP Gx00 models and VSP Fx00 models)

      raidcom get quorum -quorum_id 1 -IH1
      QRDID : 1
      LDEV : 1234
      QRP_Serial# : 422222
      QRP_ID : M8
      Timeout(s) : 30
      STS : REPLACING
      
      Note When the raidcom replace quorum command is executed normally, the quorum disk status changes to REPLACING in a few seconds. If it does not change from FAILED in a few minutes, contact customer support.
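
The disconnection check in step 2 of this procedure can be sketched as follows. This is a minimal Python example, assuming that each path row in the raidcom get path output starts with the path group number and a group such as 1-1, that the third column is STS, and that DSC indicates the disconnected state, as in the sample output above. The helper names are hypothetical.

```python
import re

# A path row begins with "<PHG> <GROUP> <STS>", for example "1 1-1 DSC".
# This layout is assumed from the sample output in this procedure.
PATH_ROW = re.compile(r"^\s*\d+\s+\d+-\d+\s+(\S+)")


def path_statuses(output: str) -> list:
    """Collect the STS column of every path row in raidcom get path output."""
    statuses = []
    for line in output.splitlines():
        match = PATH_ROW.match(line)
        if match:
            statuses.append(match.group(1))
    return statuses


def is_disconnected(output: str) -> bool:
    """Return True if at least one path row exists and all rows show DSC."""
    statuses = path_statuses(output)
    return bool(statuses) and all(status == "DSC" for status in statuses)


# Sample text copied from the VSP Gx00/Fx00 example in this procedure.
sample = """raidcom get path -path_grp 1 -IH0
PHG GROUP STS CM IF MP# PORT WWN PR LUN PHS
Serial# PRODUCT_ID LB PM
1 1-1 DSC E D 0 CL5-A 50060e8007823520 1 0 NML
433333 VSP Gx00 N M
"""

print(is_disconnected(sample))
```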

Pair condition and recovery: external storage system failure

The following table shows transitions for pair status and I/O mode, the volumes that are accessible from the server, and the location of the latest data when you can no longer use the external storage system.

(The Before and After columns show the pair status and I/O mode before and after the failure.)

Before: P-VOL | Before: S-VOL | After: P-VOL | After: S-VOL | Accessible from server: P-VOL | Accessible from server: S-VOL | Volume with latest data
PAIR (Mirror(RL)) (note 1) | PAIR (Mirror(RL)) (note 1) | PAIR (Mirror(RL)) | PAIR (Block) | OK | NG | Both P-VOL and S-VOL
PAIR (Mirror(RL)) | PAIR (Mirror(RL)) | PAIR (Mirror(RL)) | PAIR (Mirror(RL)) | OK | OK | Both P-VOL and S-VOL
PAIR (Mirror(RL)) | PAIR (Block) | PAIR (Mirror(RL)) | PAIR (Block) | OK | NG | Both P-VOL and S-VOL
COPY (Mirror(RL)) | COPY (Block) | COPY (Mirror(RL)) (note 2) | COPY (Block) (note 2) | OK | NG | P-VOL
PSUS/PSUE (Local) | SSUS/PSUE (Block) | PSUS/PSUE (Local) | SSUS/PSUE (Block) | OK | NG | P-VOL
PSUS/PSUE (Block) | SSWS (Local) | PSUS/PSUE (Block) | SSWS (Local) | NG | OK | S-VOL

Notes:

  1. The pair status and I/O mode after the failure depend on the requirement of the pair. For details, see Server I/Os and data mirroring with blocked quorum disk or without quorum disk volumes.
  2. The P-VOL might change to PSUE (Local) and the S-VOL might change to PSUE (Block) if a failure occurs in the external storage system immediately after the pair status changes to COPY.
SIMs
  • Primary storage system: 21D0xy, 21D2xx, DD2xyy, DEF0zz, EF5xyy, EFD000, FF5xyy
  • Secondary storage system: 21D0xy, 21D2xx, DD2xyy, DEF0zz, EF5xyy, EFD000, FF5xyy
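
When you scan a list of issued SIMs for the reference codes above, a small matcher can help. The following Python sketch assumes that the lowercase letters x, y, and z in a reference code stand for any single hexadecimal digit and that all other characters are literal. That reading is an assumption of this example, and the function names are hypothetical.

```python
# SIM reference code patterns listed above for an external storage system failure.
EXTERNAL_STORAGE_SIMS = [
    "21D0xy", "21D2xx", "DD2xyy", "DEF0zz", "EF5xyy", "EFD000", "FF5xyy",
]

HEX_DIGITS = "0123456789ABCDEF"


def sim_matches(pattern: str, code: str) -> bool:
    """Return True if a SIM reference code fits a pattern such as DD2xyy.

    Assumption: lowercase x, y, z are placeholders for one hex digit each.
    """
    if len(pattern) != len(code):
        return False
    for p, c in zip(pattern, code.upper()):
        if p in "xyz":
            if c not in HEX_DIGITS:
                return False
        elif p.upper() != c:
            return False
    return True


def matching_patterns(code: str) -> list:
    """List every pattern from EXTERNAL_STORAGE_SIMS that the code fits."""
    return [p for p in EXTERNAL_STORAGE_SIMS if sim_matches(p, code)]


print(matching_patterns("DD2A01"))
```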

Procedure

  1. Recover the external storage system. For details, contact the vendor.

  2. Resynchronize or re-create GAD pairs if they are suspended by a failure.

Pair condition and recovery: other failures

Generally, you can correct failures by recovering the paths and resynchronizing the pair. The following table shows transitions for pair status and I/O mode, the volumes that are accessible from the server, and the location of the latest data when a failure other than those described above occurs.

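(The Before and After columns show the pair status and I/O mode before and after the failure.)

Before: P-VOL | Before: S-VOL | After: P-VOL | After: S-VOL | Accessible from server: P-VOL | Accessible from server: S-VOL | Volume with latest data
PAIR (Mirror (RL)) (note 1) | PAIR (Mirror (RL)) (note 1) | PSUE (Local) | PSUE (Block) | OK | NG | P-VOL
PAIR (Mirror (RL)) (note 1) | PAIR (Mirror (RL)) (note 1) | PSUE (Block) | SSWS (Local) | NG | OK | S-VOL
PAIR (Mirror (RL)) (note 1) | PAIR (Mirror (RL)) (note 1) | PAIR (Mirror (RL)) | PAIR (Block) | OK | NG | Both P-VOL and S-VOL
PAIR (Mirror (RL)) | PAIR (Block) | PSUE (Block) | SSWS (Local) | NG | OK | S-VOL
COPY (Mirror (RL)) | COPY (Block) | PSUE (Local) | PSUE/COPY (Block) | OK | NG | P-VOL
COPY (Mirror (RL)) | COPY (Block) | PSUE (Local) | PSUE/COPY (Block) | NG (note 2) | NG | P-VOL
PSUS/PSUE (Local) | SSUS/PSUE (Block) | PSUS/PSUE (Local) | SSUS/PSUE (Block) | OK | NG | P-VOL
PSUS/PSUE (Local) | SSUS/PSUE (Block) | PSUS/PSUE (Local) | SSUS/PSUE (Block) | NG (note 2) | NG | P-VOL
PSUS/PSUE (Block) | SSWS (Local) | PSUS/PSUE (Block) | SSWS (Local) | NG | OK | S-VOL
PSUS/PSUE (Block) | SSWS (Local) | PSUS/PSUE (Block) | SSWS (Local) | NG | NG (note 3) | S-VOL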

Notes:

  1. The pair status and I/O mode after the failure depend on the requirement of the pair. For details, see Server I/Os and data mirroring with blocked quorum disk or without quorum disk volumes.
  2. Depending on the cause of the failure, if the P-VOL is inaccessible, neither the P-VOL nor the S-VOL can be accessed.
  3. Depending on the cause of the failure, if the S-VOL is inaccessible, neither the P-VOL nor the S-VOL can be accessed.
SIMs
  • Primary storage system: SIM varies depending on the failure type
  • Secondary storage system: SIM varies depending on the failure type

Procedure

  1. Recover the system.

  2. Resynchronize the pair.

Recovery procedure for GAD pair suspension due to other failures

A GAD pair might be suspended due to a failure other than those described in this chapter. Use the following procedure to recover a suspended pair from other types of failure.

If you are not able to restore the GAD volumes using this procedure, contact customer support.

Procedure

  1. Recover from the failure.

    1. Check whether a failure, such as a suspended GAD pair, has occurred, for example by checking for SIMs issued by the primary or secondary storage system.

    2. If a failure has occurred, identify it and troubleshoot according to the failure type to remove the cause.

  2. Confirm the quorum disk status. If the quorum disk is blocked, recover from the blockade. If no volume is set for the quorum disk, skip this step.

  3. Resynchronize the GAD pair.

    1. Check the I/O mode of the P-VOL and the S-VOL of the suspended GAD pair.

      pairdisplay -g oraHA -fxce -IH0
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-A-0, 0,   0)511111  2222.P-VOL PSUE 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/L
      oraHA   dev1(R)     (CL1-C-1, 0,   0)522222  4444.S-VOL PSUE 
      NEVER ,  100  2222 -   -   0  -  -            -       - B/B
      pairdisplay -g oraHA -fxce -IH1
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-C-1, 0,   0)522222  4444.S-VOL PSUE 
      NEVER ,  100  2222 -   -   0  -  -            -       - B/B
      oraHA   dev1(R)     (CL1-A-0, 0,   0)511111  2222.P-VOL PSUE 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/L
    2. If the I/O mode of the P-VOL is Local, resynchronize the GAD pair at the primary storage system.

      pairresync -g oraHA -IH0
    3. If the I/O mode of the S-VOL is Local, resynchronize the GAD pair at the secondary storage system.

      pairresync -g oraHA -swaps -IH1

      The volume in the primary storage system changes to an S-VOL, and the volume in the secondary storage system changes to a P-VOL.

    4. Confirm that the pair status of the P-VOL and the S-VOL of the GAD pair has changed to PAIR (Mirror (RL)).

      pairdisplay -g oraHA -fxce -IH0
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-A-0, 0,   0)511111  2222.P-VOL PAIR 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/M
      oraHA   dev1(R)     (CL1-C-1, 0,   0)522222  4444.S-VOL PAIR 
      NEVER ,  100  2222 -   -   0  -  -            -       - L/M
      pairdisplay -g oraHA -fxce -IH1
      Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
      Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
      oraHA   dev1(L)     (CL1-C-1, 0,   0)522222  4444.S-VOL PAIR 
      NEVER ,  100  2222 -   -   0  -  -            -       - L/M
      oraHA   dev1(R)     (CL1-A-0, 0,   0)511111  2222.P-VOL PAIR 
      NEVER ,  100  4444 -   -   0  -  -            -       - L/M
  4. Using the alternate path software, resume I/O to the S-VOL.

  5. If necessary, reverse the P-VOL and the S-VOL.
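
The choice between the two resynchronization commands in step 3 can be expressed as a small decision function. This Python sketch is illustrative only: the function name resync_command is hypothetical, and the group name oraHA and the instance options -IH0 and -IH1 are taken from the examples in this procedure and will differ in your configuration.

```python
def resync_command(pvol_io_mode: str, svol_io_mode: str, group: str = "oraHA") -> list:
    """Return the pairresync argument list for a suspended GAD pair.

    P-VOL in Local mode: resynchronize at the primary storage system.
    S-VOL in Local mode: swap resync at the secondary storage system.
    """
    if pvol_io_mode == "Local":
        return ["pairresync", "-g", group, "-IH0"]
    if svol_io_mode == "Local":
        return ["pairresync", "-g", group, "-swaps", "-IH1"]
    # Neither volume holds the latest data in Local mode; this procedure
    # does not cover that case.
    raise ValueError("neither volume is in Local I/O mode")


print(" ".join(resync_command("Local", "Block")))
print(" ".join(resync_command("Block", "Local")))
```

Returning an argument list rather than a single string keeps the result usable with subprocess-style interfaces without shell quoting.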

Recovering the storage systems: primary site failure with external storage system

If a failure occurs at the primary site in a configuration with the external storage system for the quorum disk located at the primary site, the failure might affect the primary storage system and the external storage system simultaneously. In this case, the GAD pair is suspended, and access to the GAD volumes stops.

Figure: Failure at the primary site (external storage system at the primary site)

Failure location | SIMs that might be issued: primary storage system | SIMs that might be issued: secondary storage system | GAD volume accessible? (note 1): P-VOL | GAD volume accessible? (note 1): S-VOL
Both the primary storage system and the external storage system for the quorum disk | Depends on the failure type (note 2) | DD0xyy, DD2xyy, DD3xyy, 2180xx, 21D0xx, 21D2xx, EF5xyy, EFD000, FF5xyy, DEF0zz | No | No (note 3)

Notes:

  1. Hardware such as drives, cache, front-end directors (CHB), back-end directors (BED), and MPUs is redundant in the storage system configuration. A failure in part of the redundant hardware does not suspend a GAD pair or make a GAD volume inaccessible. Likewise, a failure in part of the hardware does not suspend a GAD pair or make a GAD volume inaccessible if the following physical paths are redundant:
    • Between a server and the storage systems of the primary and secondary sites
    • Between an external storage system and the storage systems of the primary and secondary sites
    • Between the storage systems of the primary and secondary sites
  2. A SIM that corresponds to the failure type is issued. Depending on the failure type, you might not be able to view SIMs.
  3. You can access the S-VOL, if the pair status of the S-VOL is SSWS, even if a failure occurs.

Procedure

  1. Using the alternate path software, delete the alternate path to the GAD P-VOL.

  2. At the secondary storage system, delete the GAD pair forcibly.

    When deleting the pair forcibly, do not delete the virtual ID, which allows the volume to be accessed from the server.

    pairsplit -g oraHA -d dev1 -RFV -IH2
  3. Confirm that the virtual LDEV ID is not deleted.

    raidcom get ldev -ldev_id 0x2222 -fx -IH2
    (Omitted)
    LDEV : 2222
    VIR_LDEV : 1111
    (Omitted)
  4. Confirm that the GAD pair is deleted.

  5. Using the alternate path software, resume I/Os from the server to the GAD S-VOL.

  6. Restore the primary storage system from the failure.

  7. At the primary storage system, delete the GAD pair forcibly.

    When deleting the pair forcibly, delete the virtual LDEV ID so that the volume cannot be accessed from the server.

    Depending on the failure type of the primary storage system, after the primary storage system is restored from a failure, the pair status of the P-VOL might change to SMPL, and the GAD reserve attribute might be set. In this case, you do not need to delete the GAD pair forcibly.

    pairsplit -g oraHA -d dev1 -SF -IH1
  8. Confirm that the virtual LDEV ID indicates GAD reserve.

    raidcom get ldev -ldev_id 0x1111 -fx -IH1
    (Omitted)
    LDEV : 1111
    VIR_LDEV : ffff
    (Omitted)

    VIR_LDEV : ffff indicates GAD reserve.

  9. Confirm that the GAD pair is deleted.

  10. Restore the external storage system from the failure.

  11. From the primary and secondary storage systems, delete the quorum disk.

    Depending on the failure type of the external storage system, the quorum disk might already have been deleted after the external storage system is restored from the failure. In this case, you do not need to delete the quorum disk.

  12. From the primary and secondary storage systems, add a quorum disk.

  13. From the secondary storage system, re-create a GAD pair.

  14. Using the alternate path software, add a path to the GAD P-VOL, and then resume I/Os.

  15. Reverse the P-VOL and the S-VOL if necessary.
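
The forced-deletion options used in this chapter differ by which volume you specify and by whether the virtual LDEV ID is kept. The following Python sketch summarizes that mapping as it appears in the command examples (-RFV and -RF for the S-VOL, -SFV and -SF for the P-VOL). The function name forced_delete_command is hypothetical, and the group, device, and instance values are the sample values from this chapter.

```python
# (volume role, keep virtual LDEV ID?) -> pairsplit option, as used in the
# command examples in this chapter.
FORCED_DELETE_OPTIONS = {
    ("P-VOL", True): "-SFV",   # keep the virtual LDEV ID (server access allowed)
    ("P-VOL", False): "-SF",   # delete the virtual LDEV ID (GAD reserve)
    ("S-VOL", True): "-RFV",   # keep the virtual LDEV ID (server access allowed)
    ("S-VOL", False): "-RF",   # delete the virtual LDEV ID (GAD reserve)
}


def forced_delete_command(role: str, keep_virtual_id: bool,
                          group: str, device: str, instance: int) -> str:
    """Build the pairsplit command line for a forced pair deletion."""
    option = FORCED_DELETE_OPTIONS[(role, keep_virtual_id)]
    return f"pairsplit -g {group} -d {device} {option} -IH{instance}"


# Step 2 of the procedure above: delete at the secondary side, keep the virtual ID.
print(forced_delete_command("S-VOL", True, "oraHA", "dev1", 2))
# Step 7 of the procedure above: delete at the primary side, remove the virtual ID.
print(forced_delete_command("P-VOL", False, "oraHA", "dev1", 1))
```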

Reversing the P-VOL and S-VOL

During disaster recovery operations, P-VOLs are changed to S-VOLs and S-VOLs to P-VOLs so that data flows from the secondary site to the primary site while the primary site is restored. When normal operations resume at the primary site, the copy direction is changed again so that the original P-VOLs and S-VOLs return to their original roles, with data flowing from the primary site to the secondary site.

Procedure

  1. Using the alternate path software, stop I/O from the server to P-VOLs in the secondary storage system.

    Continue to the next step even if the alternate path cannot be deleted.

  2. Confirm that the P-VOL and the S-VOL have been reversed.

    pairdisplay -g oraHA -fxce -IH0
    Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
    Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
    oraHA   dev1(L)     (CL1-A-0, 0,   0)511111  2222.S-VOL PAIR 
    NEVER ,  100  4444 -   -   0  -  -            -       - L/M
    oraHA   dev1(R)     (CL1-C-1, 0,   0)522222  4444.P-VOL PAIR 
    NEVER ,  100  2222 -   -   0  -  -            -       - L/M
  3. At the primary storage system, change the pair statuses of the S-VOLs to SSWS to suspend the pairs (swap suspension).

    pairsplit -g oraHA -d dev1 -RS -IH0
  4. At the primary storage system, reverse the P-VOL and the S-VOL, and then resynchronize the pairs (swap resync).

    pairresync -g oraHA -d dev1 -swaps -IH0
  5. Confirm that the P-VOL and the S-VOL pair statuses change to PAIR (Mirror (RL)).

    pairdisplay -g oraHA -fxce -IH0
    Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
    Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
    oraHA   dev1(L)     (CL1-A-0, 0,   0)511111  2222.P-VOL PAIR 
    NEVER ,  100  4444 -   -   0  -  -            -       - L/M
    oraHA   dev1(R)     (CL1-C-1, 0,   0)522222  4444.S-VOL PAIR 
    NEVER ,  100  2222 -   -   0  -  -            -       - L/M
    pairdisplay -g oraHA -fxce -IH1
    Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
    Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
    oraHA   dev1(L)     (CL1-C-1, 0,   0)522222  4444.S-VOL PAIR 
    NEVER ,  100  2222 -   -   0  -  -            -       - L/M
    oraHA   dev1(R)     (CL1-A-0, 0,   0)511111  2222.P-VOL PAIR 
    NEVER ,  100  4444 -   -   0  -  -            -       - L/M
  6. Using the alternate path software, restart I/Os from the server to S-VOLs in the secondary storage system.
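
The role checks in steps 2 and 5 can be automated by reading the Seq#, LDEV#.P/S, and Status fields from the pairdisplay output. The following Python sketch assumes the field layout shown in the examples above, where the serial number follows the closing parenthesis of the port field. The function names are hypothetical, and 511111 is the sample serial number of the primary storage system used in this document.

```python
import re

# Matches ")<Seq#>  <LDEV#>.<P-VOL|S-VOL> <Status>" as in the sample output.
VOLUME = re.compile(r"\)\s*(\d+)\s+([0-9A-Fa-f]+)\.(P-VOL|S-VOL)\s+(\S+)")


def volume_roles(output: str) -> dict:
    """Map each storage system serial number to (role, pair status)."""
    return {m.group(1): (m.group(3), m.group(4)) for m in VOLUME.finditer(output)}


def is_reversed(output: str, primary_serial: str = "511111") -> bool:
    """Return True if the volume in the primary storage system is the S-VOL."""
    role, _status = volume_roles(output)[primary_serial]
    return role == "S-VOL"


# Sample text copied from step 2 of this procedure.
sample = """Group   PairVol(L/R) (Port#,TID, LU),Seq#,LDEV#.P/S,Status,
Fence,   %,P-LDEV# M CTG JID AP EM       E-Seq# E-LDEV# R/W
oraHA   dev1(L)     (CL1-A-0, 0,   0)511111  2222.S-VOL PAIR
NEVER ,  100  4444 -   -   0  -  -            -       - L/M
oraHA   dev1(R)     (CL1-C-1, 0,   0)522222  4444.P-VOL PAIR
NEVER ,  100  2222 -   -   0  -  -            -       - L/M
"""

print(volume_roles(sample))
print(is_reversed(sample))
```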

Creating GAD pairs when virtual LDEV IDs are deleted from the P-VOL and S-VOL

Some failure recovery operations, such as a primary storage system failure or secondary storage system failure, require you to delete GAD pairs. You might not be able to create them again until you assign a virtual LDEV ID.

Procedure

  1. Confirm that the GAD reserve attribute is assigned to the P-VOL and the S-VOL by using the raidcom get ldev command.

    raidcom get ldev -ldev_id 0x4444 -fx -IH0
    LDEV : 4444 VIR_LDEV : ffff
    raidcom get ldev -ldev_id 0x5555 -fx -IH1
    LDEV : 5555 VIR_LDEV : ffff

    If you execute the raidcom get ldev command for a volume that has the GAD reserve attribute, ffff is displayed for VIR_LDEV (virtual LDEV ID).

  2. Delete all of the LU paths to the P-VOL.

  3. Release the GAD reserve attribute of the P-VOL (LDEV ID: 0x4444) by using the raidcom unmap resource command.

    raidcom unmap resource -ldev_id 0x4444 -virtual_ldev_id reserve -IH0
  4. Display the information about the P-VOL by using the raidcom get ldev command.

    raidcom get ldev -ldev_id 0x4444 -fx -IH0
    LDEV : 4444 VIR_LDEV : fffe

    For the volume whose GAD reserve attribute was released, a virtual LDEV ID is not assigned. If you execute the raidcom get ldev command for a volume to which a virtual LDEV ID is not assigned, fffe is displayed for VIR_LDEV (virtual LDEV ID).

  5. Set a virtual LDEV ID for the P-VOL (LDEV ID: 0x4444) by using the raidcom map resource command.

    raidcom map resource -ldev_id 0x4444 -virtual_ldev_id 0x4444 -IH0
  6. Display the information about the P-VOL by using the raidcom get ldev command.

    raidcom get ldev -ldev_id 0x4444 -fx -IH0
    LDEV : 4444 VIR_LDEV : 4444
  7. Check the virtual attributes of the P-VOL and the S-VOL by using the raidcom get ldev command.

    raidcom get ldev -ldev_id 0x4444 -fx -IH0
    LDEV : 4444
    raidcom get ldev -ldev_id 0x5555 -fx -IH1
    LDEV : 5555 VIR_LDEV : ffff

    The virtual LDEV ID (0x4444) is assigned to the P-VOL (LDEV ID: 0x4444) and the GAD reserve attribute ( VIR_LDEV : ffff) is assigned to the S-VOL (LDEV ID: 0x5555).

  8. Specify a port and host group for the P-VOL, and set the LU path again.

  9. Create GAD pairs again.
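
The VIR_LDEV values that appear in this procedure can be interpreted with a short helper. The following Python sketch relies on what this document states: ffff indicates the GAD reserve attribute, fffe indicates that no virtual LDEV ID is assigned, and VIR_LDEV may be omitted from the output when it is the same as the LDEV ID. The function name virtual_ldev_state and its return strings are hypothetical.

```python
import re

VIR_LDEV = re.compile(r"VIR_LDEV\s*:\s*([0-9A-Fa-f]+)")


def virtual_ldev_state(output: str) -> str:
    """Classify the virtual LDEV ID reported by raidcom get ldev."""
    match = VIR_LDEV.search(output)
    if match is None:
        # VIR_LDEV is not displayed when it is the same as the LDEV ID.
        return "same as LDEV"
    value = match.group(1).lower()
    if value == "ffff":
        return "GAD reserve"
    if value == "fffe":
        return "not assigned"
    return value


print(virtual_ldev_state("LDEV : 4444 VIR_LDEV : ffff"))
print(virtual_ldev_state("LDEV : 4444 VIR_LDEV : fffe"))
print(virtual_ldev_state("LDEV : 4444 VIR_LDEV : 4444"))
```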

Creating GAD pairs when virtual LDEV IDs are set for the P-VOL and S-VOL

Some failure recovery operations, such as a primary storage system failure or secondary storage system failure, require you to delete GAD pairs. You might not be able to create them again until you set a GAD reserve attribute as the virtual attribute of the S-VOL.

Procedure

  1. Check the virtual attributes of the P-VOL and the S-VOL by using the raidcom get ldev command.

    raidcom get ldev -ldev_id 0x4444 -fx -IH0
    LDEV : 4444 
    raidcom get ldev -ldev_id 0x5555 -fx -IH1
    LDEV : 5555 
  2. Delete all of the LU paths to the S-VOL.

  3. Delete the virtual LDEV ID (0x5555) of the S-VOL (LDEV ID: 0x5555) by using the raidcom unmap resource command.

    raidcom unmap resource -ldev_id 0x5555 -virtual_ldev_id 0x5555 -IH1
  4. Display the information about the S-VOL (LDEV ID: 0x5555) by using the raidcom get ldev command.

    raidcom get ldev -ldev_id 0x5555 -fx -IH1
    LDEV : 5555 VIR_LDEV : fffe

    If you execute the raidcom get ldev command for a volume to which a virtual LDEV ID is not assigned, fffe is displayed for VIR_LDEV (virtual LDEV ID).

  5. Set the GAD reserve attribute as the virtual attribute of the S-VOL (LDEV ID: 0x5555) by using the raidcom map resource command.

    raidcom map resource -ldev_id 0x5555 -virtual_ldev_id reserve -IH1
  6. Display the information about the S-VOL by using the raidcom get ldev command.

    raidcom get ldev -ldev_id 0x5555 -fx -IH1
    LDEV : 5555 VIR_LDEV : ffff

    The GAD reserve attribute (VIR_LDEV: ffff) is assigned to the S-VOL (LDEV ID: 0x5555).

  7. Check the reserve attributes of the P-VOL and the S-VOL by using the raidcom get ldev command.

    raidcom get ldev -ldev_id 0x4444 -fx -IH0
    LDEV : 4444
    raidcom get ldev -ldev_id 0x5555 -fx -IH1
    LDEV : 5555 VIR_LDEV : ffff

    The virtual LDEV ID (0x4444) is assigned to the P-VOL (LDEV ID: 0x4444) and the GAD reserve attribute ( VIR_LDEV : ffff) is assigned to the S-VOL (LDEV ID: 0x5555).

  8. Specify a port and host group for the S-VOL, and set the LU path again.

  9. Create GAD pairs again.
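
Steps 3 through 6 of this procedure repeat the same LDEV ID and instance number in each command, so a script can generate them from two values. The following Python sketch only builds the command strings shown above and does not run them. The function name reserve_svol_commands is hypothetical, and deleting and restoring the LU paths (steps 2 and 8) is outside its scope.

```python
def reserve_svol_commands(ldev_id: str, virtual_ldev_id: str, instance: int) -> list:
    """Return the raidcom commands that replace an S-VOL's virtual LDEV ID
    with the GAD reserve attribute, each followed by a verification command."""
    check = f"raidcom get ldev -ldev_id {ldev_id} -fx -IH{instance}"
    return [
        # Step 3: delete the current virtual LDEV ID.
        f"raidcom unmap resource -ldev_id {ldev_id} "
        f"-virtual_ldev_id {virtual_ldev_id} -IH{instance}",
        # Step 4: VIR_LDEV is expected to show fffe.
        check,
        # Step 5: set the GAD reserve attribute.
        f"raidcom map resource -ldev_id {ldev_id} -virtual_ldev_id reserve -IH{instance}",
        # Step 6: VIR_LDEV is expected to show ffff.
        check,
    ]


for command in reserve_svol_commands("0x5555", "0x5555", 1):
    print(command)
```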

Resolving failures in multiple locations

If failures occur in multiple locations, use the following recovery procedure:

Procedure

  1. Identify the failure locations by using the SIMs issued by the primary and secondary storage systems and the SAN management software, and then recover from the failures.

  2. If data has been lost from both volumes, recover from the backup data using ShadowImage or Thin Image volumes, or backup software.

  3. If I/O is stopped, resume I/O from the server.

  4. If GAD pairs are suspended, resynchronize the pairs.

    If the pairs cannot be resynchronized, perform the following recovery procedures depending on the pair status and I/O mode.

    Pair status I/O mode
    P-VOL S-VOL P-VOL S-VOL
    PSUE COPY Local Block
    1. Delete the pair forcibly from the S-VOL.

      When you perform this step, delete the virtual LDEV ID so that the volume cannot be accessed from the server.
      pairsplit -g oraHA -d dev1 -RF -IH2
    2. Confirm that the virtual LDEV ID of the P-VOL has not been deleted.

      raidcom get ldev -ldev_id 0x1111 -fx -IH1
      (Omitted)
      LDEV : 1111
      (Omitted)
      

      The VIR_LDEV information is not displayed if it is the same as the LDEV information. If the virtual LDEV ID is deleted, set a correct virtual LDEV ID.

    3. Delete the pair forcibly from the P-VOL.

      When you perform this step, make sure not to delete the virtual LDEV ID, which allows the volume to be accessed from the server.
      pairsplit -g oraHA -d dev1 -SFV -IH1
    4. Check if the virtual LDEV ID of the S-VOL indicates GAD reserve.

      raidcom get ldev -ldev_id 0x2222 -fx -IH2
      (Omitted)
      LDEV : 2222
      VIR_LDEV : ffff	
      (Omitted)
      

      VIR_LDEV: ffff indicates GAD reserve. If it shows another value, set a correct virtual LDEV ID.

    5. Re-create the pair by specifying the P-VOL.

      paircreate -g oraHA -d dev1 -f never -vl -jq 1 -IH1
    Pair status I/O mode
    P-VOL S-VOL P-VOL S-VOL
    SMPL COPY Not applicable Block
    1. Delete the pair forcibly from the S-VOL.

      When you perform this step, delete the virtual LDEV ID so that the volume cannot be accessed from the server.
      pairsplit -g oraHA -d dev1 -RF -IH2
    2. Check if the virtual LDEV ID of the S-VOL indicates GAD reserve.

      raidcom get ldev -ldev_id 0x2222 -fx -IH2
      (Omitted)
      LDEV : 2222
      VIR_LDEV : ffff	
      (Omitted)
      

      VIR_LDEV: ffff indicates GAD reserve. If it shows another value, set a correct virtual LDEV ID.

    3. Confirm that the virtual LDEV ID of the P-VOL has not been deleted.

      raidcom get ldev -ldev_id 0x1111 -fx -IH1
      (Omitted)
      LDEV : 1111
      (Omitted)
      

      The VIR_LDEV information is not displayed if it is the same as the LDEV information. If the virtual LDEV ID is deleted, set a correct virtual LDEV ID.

      Note If shared memory in the secondary storage system becomes volatilized, the S-VOL pair status changes to SMPL, and GAD reserve is assigned to the virtual attribute.
    4. Re-create the pair by specifying the P-VOL.

      paircreate -g oraHA -d dev1 -f never -vl -jq 1 -IH1
    Pair status I/O mode
    P-VOL S-VOL P-VOL S-VOL
    PSUS/PSUE SSWS Block Local
    1. Resynchronize the pair by specifying the S-VOL.

      pairresync -g oraHA -d dev1 -swaps -IH2
    Pair status I/O mode
    P-VOL S-VOL P-VOL S-VOL
    SMPL SSWS Not applicable Local
    1. Check if the virtual LDEV ID of the P-VOL indicates GAD reserve.

      raidcom get ldev -ldev_id 0x1111 -fx -IH1
      (Omitted)
      LDEV : 1111
      VIR_LDEV : ffff	
      (Omitted)
      

      VIR_LDEV: ffff indicates GAD reserve. If it shows another value, set a correct virtual LDEV ID.

    2. Delete the pair forcibly from the S-VOL.

      When you perform this step, make sure not to delete the virtual LDEV ID, which allows the volume to be accessed from the server.
      pairsplit -g oraHA -d dev1 -RFV -IH2
    3. Confirm that the virtual LDEV ID of the S-VOL has not been deleted.

      raidcom get ldev -ldev_id 0x2222 -fx -IH2
      (Omitted)
      LDEV : 2222
      VIR_LDEV : 1111	
      (Omitted)
      
    4. Re-create the pair by specifying the S-VOL.

      paircreate -g oraHA -d dev1 -f never -vl -jq 1 -IH2
    Pair status I/O mode
    P-VOL S-VOL P-VOL S-VOL
    PSUS/PSUE SSUS/PSUE Local Block
    1. Resynchronize the pair by specifying the P-VOL.

      pairresync -g oraHA -d dev1 -IH1
    Pair status I/O mode
    P-VOL S-VOL P-VOL S-VOL
    SMPL SSUS/PSUE Not applicable Block
    1. Delete the pair forcibly from the S-VOL.

      When you perform this step, delete the virtual LDEV ID so that the volume cannot be accessed from the server.
      pairsplit -g oraHA -d dev1 -RF -IH2
    2. Check if the virtual LDEV ID of the S-VOL indicates GAD reserve.

      raidcom get ldev -ldev_id 0x2222 -fx -IH2
      (Omitted)
      LDEV : 2222
      VIR_LDEV : ffff	
      (Omitted)
      

      VIR_LDEV: ffff indicates GAD reserve. If it shows another value, set a correct virtual LDEV ID.

    3. Confirm that the virtual LDEV ID of the P-VOL has not been deleted.

      raidcom get ldev -ldev_id 0x1111 -fx -IH1
      (Omitted)
      LDEV : 1111
      (Omitted)
      

      The VIR_LDEV information is not displayed if it is the same as the LDEV information. If the virtual LDEV ID is deleted, set a correct virtual LDEV ID.

      Note If shared memory in the secondary storage system becomes volatilized, the S-VOL pair status changes to SMPL, and GAD reserve is assigned to the virtual attribute.
    4. Re-create the pair by specifying the P-VOL.

      paircreate -g oraHA -d dev1 -f never -vl -jq 1 -IH1
    Pair status I/O mode
    P-VOL S-VOL P-VOL S-VOL
    PSUE PSUE Block Block
    1. Delete the pair forcibly from the S-VOL.

      When you perform this step, delete the virtual LDEV ID so that the volume cannot be accessed from the server.
      pairsplit -g oraHA -d dev1 -RF -IH2
    2. Check if the virtual LDEV ID of the S-VOL indicates GAD reserve.

      raidcom get ldev -ldev_id 0x2222 -fx -IH2
      (Omitted)
      LDEV : 2222
      VIR_LDEV : ffff	
      (Omitted)
      

      VIR_LDEV: ffff indicates GAD reserve. If it shows another value, set a correct virtual LDEV ID.

    3. Delete the pair forcibly from the P-VOL.

      When you perform this step, make sure not to delete the virtual LDEV ID, which allows the volume to be accessed from the server.
      pairsplit -g oraHA -d dev1 -SFV -IH1
    4. Confirm that the virtual LDEV ID of the P-VOL has not been deleted.

      raidcom get ldev -ldev_id 0x1111 -fx -IH1
      (Omitted)
      LDEV : 1111
      (Omitted)
      

      The VIR_LDEV information is not displayed if it is the same as the LDEV information. If the virtual LDEV ID is deleted, set a correct virtual LDEV ID.

    5. Re-create the pair by specifying the P-VOL.

      paircreate -g oraHA -d dev1 -f never -vl -jq 1 -IH1
    Pair status I/O mode
    P-VOL S-VOL P-VOL S-VOL
    PSUS/PSUE PSUS/PSUE Local Block
    1. Resynchronize the pair by specifying the P-VOL.

      pairresync -g oraHA -d dev1 -IH1
    Pair status I/O mode
    P-VOL S-VOL P-VOL S-VOL
    PSUS/PSUE SMPL Local Not applicable
    1. Check if the virtual LDEV ID of the S-VOL indicates GAD reserve.

      raidcom get ldev -ldev_id 0x2222 -fx -IH2
      (Omitted)
      LDEV : 2222
      VIR_LDEV : ffff	
      (Omitted)
      

      VIR_LDEV: ffff indicates GAD reserve. If it shows another value, set a correct virtual LDEV ID.

    2. Delete the pair forcibly from the P-VOL.

      When you perform this step, make sure not to delete the virtual LDEV ID, which allows the volume to be accessed from the server.
      pairsplit -g oraHA -d dev1 -SFV -IH1
    3. Confirm that the virtual LDEV ID of the P-VOL has not been deleted.

      raidcom get ldev -ldev_id 0x1111 -fx -IH1
      (Omitted)
      LDEV : 1111
      (Omitted)
      

      The VIR_LDEV information is not displayed if it is the same as the LDEV information. If the virtual LDEV ID is deleted, set a correct virtual LDEV ID.

    4. Re-create the pair by specifying the P-VOL.

      paircreate -g oraHA -d dev1 -f never -vl -jq 1 -IH1
    Pair status I/O mode
    P-VOL S-VOL P-VOL S-VOL
    PSUS/PSUE SMPL Block Not applicable
    1. Delete the pair forcibly from the P-VOL.

      When you perform this step, delete the virtual LDEV ID so that the volume cannot be accessed from the server.
      pairsplit -g oraHA -d dev1 -SF -IH1
    2. Check if the virtual LDEV ID of the P-VOL indicates GAD reserve.

      raidcom get ldev -ldev_id 0x1111 -fx -IH1
      (Omitted)
      LDEV : 1111
      VIR_LDEV : ffff	
      (Omitted)
      

      VIR_LDEV: ffff indicates GAD reserve. If it shows another value, set a correct virtual LDEV ID.

    3. Confirm that the virtual LDEV ID of the S-VOL has not been deleted.

      raidcom get ldev -ldev_id 0x2222 -fx -IH2
      (Omitted)
      LDEV : 2222
      VIR_LDEV : 1111
      (Omitted)
      

      If the virtual LDEV ID has been deleted, set the correct virtual LDEV ID.

      Note: If shared memory in the secondary storage system is volatilized, the S-VOL pair status changes to SMPL, and GAD reserve is assigned to the virtual attribute.
    4. Re-create the pair by specifying the S-VOL.

      paircreate -g oraHA -d dev1 -f never -vl -jq 1 -IH2
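
The virtual LDEV ID checks in the steps above can be scripted. The following is a minimal sketch, not part of the product: the function name classify_vir_ldev is hypothetical, and it assumes the raidcom get ldev output has the "VIR_LDEV : value" layout shown in the examples above. It reads the command output from standard input and reports whether the virtual LDEV ID is GAD reserve (ffff), is not displayed (same as the LDEV ID), or is another value.

```shell
#!/bin/sh
# Hypothetical helper: classify the virtual LDEV ID in `raidcom get ldev ... -fx`
# output read from stdin. Prints one of:
#   reserve        VIR_LDEV is ffff (GAD reserve)
#   same-as-ldev   no VIR_LDEV line (virtual LDEV ID equals the LDEV ID)
#   <value>        any other virtual LDEV ID, printed as displayed
# Caution: empty input also prints same-as-ldev, so check that raidcom succeeded.
classify_vir_ldev() {
    awk '
        /^[ \t]*VIR_LDEV[ \t]*:/ {
            vir = $0
            sub(/^[^:]*:[ \t]*/, "", vir)   # drop the label and the colon
            sub(/[ \t\r]+$/, "", vir)       # drop trailing whitespace
            found = 1
        }
        END {
            if (!found)                      print "same-as-ldev"
            else if (tolower(vir) == "ffff") print "reserve"
            else                             print vir
        }
    '
}

# Example usage (requires a running CCI instance):
#   raidcom get ldev -ldev_id 0x2222 -fx -IH2 | classify_vir_ldev
```

A result other than the one the step expects means that the virtual LDEV ID must be corrected before the pair is re-created.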

Pair condition and recovery: quorum disk and primary-to-secondary path failure

If a failure occurs in the quorum disk, the pair status of the P-VOL and S-VOL does not change from PAIR (Mirror(RL)). However, if a failure then occurs on the physical path from the storage system at the primary site to the storage system at the secondary site, the status of the P-VOL changes from PAIR (Mirror(RL)) to PSUE (Local), and the status of the S-VOL changes from PAIR (Mirror(RL)) to PAIR (Block).

GUID-0D108133-5951-41F7-882F-DD1EEA2F6A69-low.png

The following table shows transitions for pair status and I/O mode, the volumes that are accessible from the server, and the location of the latest data when you can no longer use any physical path from the primary storage system to the secondary storage system after the quorum disk failure.

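| After quorum disk failure: P-VOL pair status and I/O mode | After quorum disk failure: S-VOL pair status and I/O mode | After primary-to-secondary path failure: P-VOL pair status and I/O mode | After primary-to-secondary path failure: S-VOL pair status and I/O mode | P-VOL accessible from the server | S-VOL accessible from the server | Volume with latest data |
|---|---|---|---|---|---|---|
| PAIR (Mirror(RL)) | PAIR (Mirror(RL)) | PSUE (Local) | PAIR (Block) | OK | NG | P-VOL |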

SIMs
  • Primary storage system: 21D0xy, 21D2xx, DD2xyy, DEF0zz, EF5xyy, EFD000, FF5xyy, DD0xyy, 2180xx
  • Secondary storage system: 21D0xy, 21D2xx, DD2xyy, DEF0zz, EF5xyy, EFD000, FF5xyy, DD3xyy, DD0xyy, DD1xyy
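
When reviewing reported SIMs against these lists, a small matcher can help. The following is a minimal sketch under one stated assumption: the lowercase letters x, y, and z in a reference code each stand for one variable character. The function and variable names are hypothetical.

```shell
#!/bin/sh
# SIM reference code patterns copied from the lists above.
PRIMARY_SIMS="21D0xy 21D2xx DD2xyy DEF0zz EF5xyy EFD000 FF5xyy DD0xyy 2180xx"
SECONDARY_SIMS="21D0xy 21D2xx DD2xyy DEF0zz EF5xyy EFD000 FF5xyy DD3xyy DD0xyy DD1xyy"

# Succeed if SIM code $1 matches pattern $2.
# Assumption: each lowercase x, y, or z in the pattern is a one-character wildcard.
sim_matches() {
    code=$1
    glob=$(printf '%s' "$2" | tr 'xyz' '???')   # placeholders become shell "?"
    case $code in
        $glob) return 0 ;;
        *)     return 1 ;;
    esac
}

# Print the first pattern in the space-separated list $2 that matches code $1.
# Returns non-zero and prints nothing when no pattern matches.
sim_lookup() {
    for pat in $2; do
        if sim_matches "$1" "$pat"; then
            printf '%s\n' "$pat"
            return 0
        fi
    done
    return 1
}

# Example: sim_lookup DD1045 "$SECONDARY_SIMS"   prints DD1xyy
```

A code that matches neither list suggests a failure other than the one described in this topic.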

Procedure

  1. Recover the quorum disk failure and the path to the external storage system.

  2. Recover the path from the primary storage system to the secondary storage system.

  3. Resynchronize the GAD pair suspended due to the failure.

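  4. Confirm the pair status.

    When the pair status of the P-VOL and S-VOL is PAIR (Mirror(RL)), the recovery is completed. For any pair whose P-VOL pair status is PSUE (Local) and S-VOL pair status is PAIR (Block), continue with step 5.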

  5. Suspend the GAD pair by specifying the S-VOL (swap suspend).

    The pair suspension operation fails, but the S-VOL pair status changes to PSUE (Block).

  6. Resynchronize the GAD pair by specifying the P-VOL.

    The pair status of the P-VOL and S-VOL changes to PAIR (Mirror(RL)).
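
The branching in this procedure can be summarized as a small decision function. The following is an illustrative sketch only: the function name and the returned labels are hypothetical, and any combination not described in this topic is reported as "investigate" rather than guessed.

```shell
#!/bin/sh
# Hypothetical helper: choose the next action from the pair state.
# Arguments: P-VOL status, P-VOL I/O mode, S-VOL status, S-VOL I/O mode,
# written as in the text above, for example: PSUE Local PAIR Block
gad_next_action() {
    case "$1/$2/$3/$4" in
        "PAIR/Mirror(RL)/PAIR/Mirror(RL)")
            echo "done" ;;                       # step 4: recovery is completed
        "PSUE/Local/PAIR/Block")
            echo "swap-suspend-then-resync" ;;   # steps 5 and 6
        "PSUE/Local/PSUE/Block")
            echo "resync-from-pvol" ;;           # resynchronize by specifying the P-VOL
        *)
            echo "investigate" ;;                # not covered by this procedure
    esac
}

# Example: gad_next_action PSUE Local PAIR Block
```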

Recovering the quorum disk and primary-to-secondary path failure (VSP 5000 series)

The following figure shows the failure area and recovery from the path failure from the primary storage system to the secondary storage system after the GAD status changes to Quorum disk blocked.

GUID-F95F2BAE-B1F1-4D40-AB4B-CBAD807760C0-low.png

Procedure

  1. Recover the quorum disk failure and the path to the external storage system.

    1. Recover the quorum disk.

    2. Reconnect the physical path or reconfigure the SAN to recover the path to the external storage system. When the path is recovered, the external path is automatically recovered.

    3. Confirm that the external storage system is connected correctly.

      raidcom get path -path_grp 1 -IH0
      PHG GROUP STS CM IF MP# PORT WWN PR LUN PHS Serial# PRODUCT_ID LB PM
      1 1-1 NML E D 0 CL5-A 50060e8008000140 1 0 NML 55555 VSP 5000 series  N M
    4. Confirm the LDEV ID of the quorum disk by obtaining the information of the external volume from the primary storage system.

      raidcom get external_grp -external_grp_id 1-1 -IH0
      T GROUP P_NO LDEV# STS LOC_LBA SIZE_LBA Serial#
      E 1-1 0 9999 NML 0x000000000000 0x000003c00000 55555
    5. Confirm that the primary storage system recognizes the external volume as a quorum disk by specifying the LDEV ID of the quorum disk.

      raidcom get ldev -ldev_id 0x9999 -fx -IH0
      (snip)
      QRDID : 0
      QRP_Serial# : 522222
      QRP_ID : R9
      (snip)
      Note: The VSP 5000 series is displayed as R9 in command output. VSP Fx00 models and VSP Gx00 models are displayed as M8 in command output.
  2. Reconnect the physical path or reconfigure the SAN to recover the path failure from the primary to secondary storage system.

    When the path between the storage systems is recovered, the remote path either recovers automatically or might require manual recovery. To verify the remote path status and perform any recommended action, see Troubleshooting related to remote path status. If the remote path failure persists even after you follow the recommended action, contact customer support.

  3. Resynchronize the GAD pair whose GAD status is Suspended.

    1. Confirm that the P-VOL I/O mode is Local.

      pairdisplay -g oraHA -fxce -IH0
      Group PairVol(L/R) (Port#,TID, LU), Seq#,LDEV#.P/S,Status,
      Fence, %,P-LDEV# M CTG JID AP EM E-Seq# E-LDEV# R/W
      oraHA dev1(L) (CL1-A-0, 0, 0)511111 2222.P-VOL PSUE NEVER , 100 4444 - - 0 - - - - L/L 
      oraHA dev1(R) (CL1-C-1, 0, 0)522222 4444.S-VOL PSUE NEVER , 100 2222 - - 0 - - - - B/B
      
    2. At the primary storage system, resynchronize the pair.

      pairresync -g oraHA -IH0
      Note: When the P-VOL pair status is PSUE (Local) and the S-VOL pair status is PAIR (Block), the pair resynchronization fails. How it fails depends on whether the GAD pair is registered to a consistency group.
      • When the GAD pair is registered to a consistency group, the pair resynchronization operation itself fails.
      • When the GAD pair is not registered to a consistency group, the pair resynchronization operation succeeds, but the resynchronization process then fails, and the P-VOL pair status remains PSUE.
    3. Confirm that the pair status of the P-VOL and S-VOL changes to PAIR (Mirror(RL)).

      pairdisplay -g oraHA -fxce -IH0
      Group PairVol(L/R) (Port#,TID, LU), Seq#, LDEV#.P/S,Status,
      Fence, %,P-LDEV# M CTG JID AP EM E-Seq# E-LDEV# R/W
      oraHA dev1(L) (CL1-A-0, 0, 0)511111 2222.P-VOL PAIR NEVER , 100 4444 - - 0 - - - - L/M
      oraHA dev1(R) (CL1-C-1, 0, 0)522222 4444.S-VOL PAIR NEVER , 100 2222 - - 0 - - - - L/M
      pairdisplay -g oraHA -fxce -IH1
      Group PairVol(L/R) (Port#,TID, LU), Seq#, LDEV#.P/S,Status,
      Fence, %,P-LDEV# M CTG JID AP EM E-Seq# E-LDEV# R/W
      oraHA dev1(L) (CL1-C-1, 0, 0)522222 4444.S-VOL PAIR NEVER , 100 2222 - - 0 - - - - L/M
      oraHA dev1(R) (CL1-A-0, 0, 0)511111 2222.P-VOL PAIR NEVER , 100 4444 - - 0 - - - - L/M
      
      Note: If any pair has a P-VOL pair status of PSUE (Local) and an S-VOL pair status of PAIR (Block), go to step 4. If no pair meets this condition, go to step 6.
  4. Suspend all GAD pairs whose P-VOL pair status is PSUE (Local) and S-VOL pair status is PAIR (Block) by specifying the S-VOL (swap suspend).

    pairsplit -g oraHA -RS -d dev1 -IH0

    The pair suspension operation fails, but the S-VOL pair status changes to PSUE.

    Note
    • Even if the pairs are registered to a consistency group, swap-suspend them one pair at a time.
    • The following SIMs might be issued, but you do not need to address these SIMs: DD0xyy, DD1xyy, DD2xyy, DD3xyy
  5. At the primary storage system, resynchronize the GAD pair.

    pairresync -g oraHA -IH0
  6. Using the alternate path software, resume I/O to the S-VOL (I/O might resume automatically).
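
The pairdisplay checks in this procedure can be reduced to a short summary per volume. The following is a minimal sketch with hypothetical function names. It assumes the -fxce output layout shown in the examples above: the field ending in .P-VOL or .S-VOL is followed by the pair status, and the last field is the R/W column, where the examples show L/M for Mirror(RL), L/L for Local, and B/B for Block.

```shell
#!/bin/sh
# Hypothetical helper: read `pairdisplay -g <group> -fxce` output from stdin and
# print one line per volume: <pair volume> <role> <pair status> <R/W column>.
# Header lines are skipped because no field in them ends in .P-VOL or .S-VOL.
gad_pair_summary() {
    awk '
        {
            for (i = 1; i < NF; i++) {
                if ($i ~ /\.(P-VOL|S-VOL)$/) {
                    role = $i
                    sub(/^.*\./, "", role)      # keep only P-VOL or S-VOL
                    print $2, role, $(i + 1), $NF
                    next
                }
            }
        }
    '
}

# Succeed only if at least one volume is listed and every volume shows
# pair status PAIR with L/M in the R/W column.
gad_all_mirrored() {
    gad_pair_summary | awk '
        { n++ }
        $3 != "PAIR" || $4 != "L/M" { bad++ }
        END { if (n > 0 && bad == 0) exit 0; exit 1 }
    '
}

# Example usage (requires a running CCI instance):
#   pairdisplay -g oraHA -fxce -IH0 | gad_pair_summary
#   pairdisplay -g oraHA -fxce -IH0 | gad_all_mirrored && echo "recovery completed"
```

Reading from standard input keeps the helpers independent of the instance number and group name, which differ between sites.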