
Planning the data path

You can plan the data path by calculating the required bandwidth and designing the data path network.

Data path design

The data path network must be designed to manage your organization’s throughput to the secondary site. You must determine the bandwidth, number of ports, and Fibre Channel or iSCSI data path configuration that will ensure the update data arrives at the secondary site in a time consistent with your organization’s RPO.

Before setting up a data path, review the following considerations:

Note
  • Before replacing a data path (Fibre Channel or iSCSI), first delete the pairs and the remote paths that use the data path to be replaced, and then replace the data path. Do not replace a data path that is being used for remote copy operations.
  • Use the same protocol for data paths between a host and a storage system and between primary and secondary storage systems. When different protocols are used in the data paths (for example, Fibre Channel data paths between the host and storage system and iSCSI data paths between the storage systems), make sure the timeout period for commands between the host and the storage system is equal to or greater than the timeout period for commands between the storage systems.
  • The interface that can be used for a data path is Fibre Channel or iSCSI. NVMe-oF is not supported.

Sizing bandwidth

Bandwidth is determined by the amount of data to be transferred from the primary storage system to the secondary storage system within a certain amount of time. This amount depends on the level of host I/O activity and on the amount of data copied during initial copy and resync copy operations.

If the amount of data to be transferred between the primary and secondary storage systems exceeds the available bandwidth, the data updated in the P-VOLs is stored in the journal volume until bandwidth capacity is added or becomes available. If the journal volume becomes full, the pairs are suspended with a failure, and you must perform a resync copy operation to restore the pair status.

While it can be costly to increase bandwidth, increasing the capacity of a journal volume is relatively inexpensive. However, as journal data accumulates in the journal volume, the data in the secondary volume falls further behind the primary volume. The differences between the primary and secondary volumes therefore grow as more update data accumulates in the journal, and the RPO (Recovery Point Objective) increases if a failure occurs on the primary storage system. Sizing bandwidth thus often involves a trade-off between expense and keeping the data currency of the secondary volumes within your RPO goals.
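This trade-off can be made concrete with a rough estimate of how long a journal can absorb a workload surplus. The sketch below uses hypothetical figures; substitute your own measurements.

```python
def journal_fill_hours(write_mb_s, bandwidth_mb_s, journal_capacity_gb):
    """Estimate hours until the journal volume fills when write-workload
    exceeds the data path bandwidth. Returns None if the journal drains."""
    surplus = write_mb_s - bandwidth_mb_s  # MB/s accumulating in the journal
    if surplus <= 0:
        return None  # bandwidth keeps up; the journal does not grow
    return (journal_capacity_gb * 1024) / surplus / 3600

# Hypothetical example: 80 MB/s writes over a 50 MB/s link, 500 GB journal
hours = journal_fill_hours(80, 50, 500)  # roughly 4.7 hours until suspend
```

If the result is shorter than the expected duration of a workload surge, either the bandwidth or the journal capacity must grow.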

Five sizing strategies

Refer to the following typical sizing strategies as you determine an approach to sizing bandwidth. This is not a complete list of sizing strategies, and your approach might combine several strategies.

  • Size bandwidth to peak workload: This approach results in the smallest difference between the data in the P-VOL and S-VOL. Identify peak workload on the production volumes, and then add extra capacity to accommodate packet loss and protocol overhead. RPO is at or near zero when bandwidth is sized to peak workload.
  • Size bandwidth to peak workload rolling average: The rolling average is less than peak but more than average. This guarantees that at some point data will accumulate in the journal, but most of the time it will not. Your system can afford to journal for the planned amount of time and still maintain RPO.
  • Size bandwidth to typical workload: When bandwidth is sized to typical write-workload and an extended peak workload is experienced, excess write-data is written to journal. This excess data is delayed for subsequent transmission to the secondary site until network capacity becomes available. The amount of differential data is proportional to the amplitude and duration of the workload surge.
  • Size bandwidth to average or mean workload: If you cannot determine a typical workload, size to the average or mean workload with a small compensation for network overhead. In this scenario, excess data in the journals will be completely emptied to the S-VOL only occasionally. If bandwidth is sized below average write-workload, the journals never fully drain and eventually overflow.
  • Alternate pair status between suspend and resync: You can size bandwidth and journal size for cases such as data migration in which data consistency is not required. In this strategy, you alternate the pair status between suspend and resync in order to process point-in-time copies in batches. When pairs are suspended, journals are not used to queue write operations. Instead, a bitmap is used to track the changed cylinders on the disks. For access patterns that favor multiple writes to a relatively small region of disk, this technique can provide especially efficient transfer of data: multiple writes to one region are not sent every time, and only the last update before resync is sent. The disadvantage of this strategy is that it does not guarantee I/O consistency on the secondary storage system until the resync is complete.
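The bitmap behavior of the suspend/resync strategy can be illustrated with a minimal sketch. The region numbers and write sequence are invented for the example:

```python
# Writes recorded while a pair is suspended, as (region, data) tuples.
# A journal would queue every write in order; a bitmap only marks dirty
# regions, so each region is copied once at resync however often it was
# rewritten.
writes = [(7, "a"), (7, "b"), (12, "x"), (7, "c"), (12, "y")]

journaled_transfers = len(writes)             # 5 records would be queued
dirty_regions = {region for region, _ in writes}
bitmap_transfers = len(dirty_regions)         # only 2 regions are resynced
```

The denser the rewrites to a small region, the larger the saving, at the cost of losing I/O consistency until the resync completes.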

Calculating bandwidth

To determine bandwidth for Universal Replicator, you must measure write-workload. Collect production system workload data using performance monitoring software. See Measuring write-workload.

When you have collected write-workload data, size your bandwidth according to your sizing strategy. In the following procedures, bandwidth is sized for peak and peak rolling average write-workload.

Sizing bandwidth for peak write-workload

  1. Make sure that write-workload data is imported into a spreadsheet tool. Column C in the following figure shows an example of collected raw data over 10-minute segments.

  2. Locate the highest peak. Based on your write-workload measurements, this is the greatest amount of data transferred during the collection period. It indicates the base amount of data that your bandwidth must be able to handle for near 0 RPO.

    Though the highest peak is used for determining bandwidth, you should take notice of extremely high peaks. In some cases a batch job, defragmentation, or other process could be driving workload to abnormally high levels. It is sometimes worthwhile to review the processes that are running. After careful analysis, it might be possible to lower or even eliminate some spikes by optimizing or streamlining high-workload processes. Changing the timing of a process can lower workload.
  3. With a base bandwidth value established, make adjustments for growth and a safety factor.

    • Projected growth rate accounts for the increase expected in write-workload over a 1, 2, or 3 year period.
    • A safety factor adds extra bandwidth for unusually high spikes that did not occur during write-workload measurement but could.
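Steps 2 and 3 can be expressed as a simple calculation. The growth rate, planning horizon, and safety factor below are placeholder assumptions, not recommendations:

```python
def sized_bandwidth_mb_s(peak_mb_s, annual_growth, years, safety_factor):
    """Base bandwidth from the measured peak write-workload, adjusted for
    projected growth over the planning period and a safety margin."""
    return peak_mb_s * (1 + annual_growth) ** years * safety_factor

# Hypothetical example: 60 MB/s measured peak, 20% yearly growth over a
# 3-year horizon, and a 1.25 safety factor for unmeasured spikes
required = sized_bandwidth_mb_s(60, 0.20, 3, 1.25)  # about 129.6 MB/s
```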

Sizing bandwidth for peak rolling average write-workload

  1. Using write-workload data imported into a spreadsheet and your RPO, calculate write rolling-averages.

    For example, if RPO time is 1 hour, then 60-minute rolling averages are calculated. Do this by arranging the values in six 10-minute intervals, as follows:
    1. In cell E4, type =AVERAGE(B2:B7) and press Enter. (Most spreadsheet tools have an AVERAGE function.)

      This instructs the tool to calculate the average of cells B2 through B7 (six 10-minute intervals) and populate cell E4 with the result. (The calculations used here are for example purposes only. Base your calculations on your RPO.)
    2. Copy the value that displays in E4.

    3. Highlight cells E5 to the last E cell of workload data in the spreadsheet.

    4. Right-click the highlighted cells and select the Paste option.

      Excel maintains the logic and increments the formula values initially entered in E4. It then calculates all of the 60-minute averages for every 10-minute increment, and populates the E cells, as shown in the following example. For comparison, 24-hour rolling averages are also shown.

      For another perspective, you can use 60-minute rolling averages graphed over raw data, as shown in the following figure.

  2. From the spreadsheet or graph, locate the largest or highest rolling average value. This is the peak rolling average, which indicates the base amount of data that your bandwidth must be able to handle.

  3. With a base bandwidth value established, make adjustments for growth and a safety factor.

    • Projected growth rate accounts for the increase expected in write-workload over a 1, 2, or 3 year period.
    • A safety factor adds extra bandwidth for unusually high spikes that did not occur during write-workload measurement but could.
    Other factors that must be taken into consideration because of their effect on bandwidth are latency and packet loss, as described in the following topics.
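The spreadsheet procedure above can also be sketched programmatically. The sample values are hypothetical, and the 6-sample window corresponds to a 1-hour RPO over 10-minute measurements:

```python
def rolling_averages(samples, window):
    """Rolling averages over fixed-interval samples, mirroring the
    AVERAGE formula copied down column E in the spreadsheet."""
    return [sum(samples[i:i + window]) / window
            for i in range(len(samples) - window + 1)]

# Hypothetical 10-minute write-workload samples in MB/s
workload = [20, 25, 90, 85, 30, 22, 21, 24, 26, 23]

averages = rolling_averages(workload, 6)   # six samples = 60 minutes
peak_rolling_avg = max(averages)           # base value for bandwidth sizing
```

The short spike (90, 85) is smoothed by the window, which is exactly why this strategy sizes below peak but above average.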

Latency

Network latency affects replication. Together with bandwidth, it determines the amount of data that can be present in the data path at any moment. In the event of network failure, a certain number of transmitted records will not yet be resident in the secondary storage system's journal, because they are still en route within the data path. During periods of low workload, there might be no records in the path, but during periods of heavy workload, the network might be fully used. This amount represents the minimum difference between data in the primary and secondary storage systems.
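The in-flight exposure described above is the bandwidth-delay product: data that has been sent but not yet acknowledged. The link rate and round-trip time below are hypothetical figures:

```python
def in_flight_mb(bandwidth_mb_s, round_trip_ms):
    """Bandwidth-delay product: the amount of data that can be in the
    data path (and so absent from the secondary journal if the network
    fails mid-transfer)."""
    return bandwidth_mb_s * (round_trip_ms / 1000.0)

# Hypothetical example: a 100 MB/s link with 40 ms round-trip latency
exposure = in_flight_mb(100, 40)   # up to 4 MB in transit at full load
```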

Packet loss

Packet loss reduces overall bandwidth because lost packets must be retransmitted, which consumes network capacity that would otherwise carry new data. A lossy network can also lengthen consistency time, because journals are not applied until a contiguous sequence of records has arrived at the secondary site.
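A first-order way to reason about loss is to discount nominal bandwidth by the retransmission overhead. This simple model (the loss rate is an assumed figure) understates the impact on window-based protocols, which back off on loss, so treat it as a lower bound on the degradation:

```python
def effective_bandwidth_mb_s(nominal_mb_s, loss_rate):
    """Bandwidth left for new data when lost frames must be resent.
    loss_rate is the fraction of frames lost (0.0 to 1.0)."""
    return nominal_mb_s * (1.0 - loss_rate)

# Hypothetical example: a 100 MB/s path with 2% packet loss leaves at
# most 98 MB/s for new journal data
usable = effective_bandwidth_mb_s(100, 0.02)
```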

Planning ports for data transfer

When new data exists in the P-VOL, the data is transferred through initiator ports and RCU target ports at the primary and secondary systems.

The operation commands (for pair creation and resynchronization) are sent from primary site initiator ports to secondary site RCU target ports.

For initial and update copy, the secondary site initiator port sends the read-journal command to the primary site RCU target port. The data is then sent through these ports, that is, from primary site RCU target ports to secondary site initiator ports.

Note the following:

  • An initiator port in one system must be connected to an RCU target port in the other system.
  • Two or more initiator ports must be configured before you can create the UR relationship with the secondary storage system and create pairs. (VSP 5000 series)
  • The amount of data that each port can transmit is limited. Therefore, it is critical to know the amount of data that will be transferred during peak periods. This knowledge will ensure that you can set up a sufficient number of ports as initiator and RCU target ports in order to handle all workloads.
  • If your UR system supports a disaster recovery failover/failback environment, the same number of initiator and RCU target ports should be configured on primary and secondary storage systems to enable replication from the secondary site to primary site in a failover scenario.
  • Up to eight paths can be established in both directions. It is recommended that you establish at least two independent data paths to provide hardware redundancy.
Example configuration
  • Two initiator ports on the primary storage system, with two matching RCU target ports on the secondary storage system.
  • Four initiator ports on the secondary storage system, with four matching RCU target ports on the primary storage system.
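Given a per-port transfer limit (an assumed figure below; confirm the actual limit for your storage system model), the number of initiator/RCU target port pairs needed for peak workload can be estimated as:

```python
import math

def ports_needed(peak_mb_s, per_port_mb_s):
    """Initiator/RCU target port pairs required to carry peak
    write-workload, with a floor of two for hardware redundancy.
    per_port_mb_s is the sustained per-port limit (an assumption)."""
    return max(2, math.ceil(peak_mb_s / per_port_mb_s))

# Hypothetical example: 900 MB/s peak over ports sustaining 400 MB/s each
n = ports_needed(900, 400)   # 3 port pairs
```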

Port types

You should know about the ports that Universal Replicator systems use for connecting storage systems to hosts, and for transmitting Universal Replicator commands and data between the primary and secondary storage systems.

Ports have the following characteristics:

  • Ports for receiving data and sending data are the same.
    Tip Establish bidirectional logical paths between the primary and secondary sites. When setting logical paths, confirm that the number of logical paths from the primary site to the secondary site, and the number of logical paths from the secondary site to the primary site are the same.
  • The amount of data that each port can transmit is limited.
    Tip
    • It is critical to know the amount of data that will be transferred during peak periods. This knowledge will ensure that you can set up a sufficient number of ports in order to handle all workloads.
    • You need to determine ports for Universal Replicator and for Universal Volume Manager before starting operation.
    • If a system supports failover for disaster recovery, set the same port size for the primary storage system and for the secondary storage system.
    Note When ports are shared by primary and secondary storage systems of Universal Replicator, and primary and secondary storage systems of Universal Volume Manager, if you perform either of the following operations, I/Os stop temporarily until the processing completes:
    • When a remote path and a path for Universal Volume Manager are defined, remove either of them.
    • When either a remote path or a path for Universal Volume Manager is defined, define the path which is not defined.

Supported data path configurations

The data path can be configured using one of the following connection types. For port and topology settings, use Device Manager - Storage Navigator or CCI commands. For a switch connection, you must set the port to Fabric on, Point-to-Point (F-port).

Create at least two independent data paths (one per cluster) between the primary storage system and the secondary storage system for hardware redundancy for this critical element. Configure the paths bidirectionally by using the same connection type for each path:

  • A path from the primary storage system to the secondary storage system
  • A path from the secondary storage system to the primary storage system
Direct connection

A direct connection (loop only) is a direct link between the primary and secondary arrays. NL-port (Node Loop) connections are supported for the data path and host-to-system path.

(Figure: direct connection between the primary and secondary storage systems)
  • Set the Fabric to OFF for the initiator port and the RCU target port.
  • Set the topology to FC-AL.

Fabric settings, topology settings, and available link speeds depend on the channel boards and protocols used for the connections between storage systems, as listed in the following table. Auto can be set regardless of the Fabric and topology settings.

Package name   Protocol   Fab setting   Topology         Available link speed
CHB (FC32G)    32GbpsFC   OFF           FC-AL            4 Gbps, 8 Gbps, Auto*
CHB (FC32G)    32GbpsFC   OFF           Point-to-Point   16 Gbps, 32 Gbps, Auto*
* When the link is up with the Auto setting, the link speed is determined automatically depending on the speed of the port to which the storage system is connected.
Switch connection

Switch connections push data from the local switch through a Fibre Channel link across a WAN to the remote switch and on to the Fibre Channel port of the secondary storage system. F-port (point-to-point) and FL-port (loop) switch connections are supported.

(Figure: switch connection between the primary and secondary storage systems)
  • Set the Fabric to ON for the initiator port and the RCU target port.
  • Set the topology to Point-to-Point.

Fabric settings, topology settings, and available link speeds depend on the channel boards and protocols used for the connections between storage systems, as listed in the following table. Auto can be set regardless of the Fabric and topology settings.

Switches from some vendors, McData ED5000 for example, require an F-port.

Package name   Protocol   Fab setting   Topology         Available link speed
CHB (FC32G)    32GbpsFC   ON            Point-to-Point   4 Gbps, 8 Gbps, 16 Gbps, 32 Gbps, Auto*
* When the link is up with the Auto setting, the link speed is determined automatically depending on the speed of the port to which the storage system is connected.
Extender connection

Make sure that the extender supports remote I/O. Contact customer support for details.

  • Set the Fabric to ON for the initiator port and the RCU target port.
  • Set the topology to Point-to-Point.
Note When the primary and secondary storage systems are connected using switches with a channel extender, and multiple data paths are configured, the transmitted data might concentrate on particular switches, depending on the configuration and the switch routing settings. For more information, contact customer support.

Fibre Channel data path requirements

Multimode or single-mode optical fiber cables are required on primary and secondary storage systems. The type of cable and number of switches depends on the distance between primary and secondary sites, as specified in the following table.

Distance                        Cable type                                          Data path relay
0 km to 1.5 km (4,920 feet)     Multimode shortwave Fibre Channel interface cables  Switch required between 0.5 km and 1.5 km
1.5 km to 10 km (6.2 miles)     Single-mode longwave optical fibre cables           Not required
10 km to 30 km (18.6 miles)     Single-mode longwave optical fibre cables           Switch required
Greater than 30 km (18.6 miles) Communications lines                                Approved third-party channel extender products

With Fibre Channel connections using switches, no special settings are required for the physical storage system. Direct connections up to 10 km with single-mode longwave Fibre Channel interface cables are supported.

Link speed determines the maximum distance you can transfer data and still achieve good performance. The following table shows maximum distances at which performance is maintained per link speed over single-mode longwave Fibre Channel.

Link speed   Maximum distance at which performance is maintained
4 Gbps       3 km
8 Gbps       2 km
16 Gbps      1 km
32 Gbps      0.6 km

This information is illustrated in the graphic in Additional switches. Note that the type of cable determines the type of SFP used for the port. Longwave cables must be connected to longwave SFPs in the storage system and switch. Shortwave cables must be connected to shortwave SFPs in the storage system and switch. The default Fibre Channel SFP type is shortwave.

Additional switches

When the initiator port on the primary storage system sends data to the secondary storage system, the Fibre Channel protocol allows a certain number of unacknowledged frames to be outstanding before the sender must stop sending. These are known as buffer credits. As Fibre Channel frames are sent, available buffer credits are consumed; as acknowledgments return, the supply is replenished. Because acknowledgments take longer to return as distance increases, exhausting the supply of buffer credits becomes increasingly likely as distance increases.

Adding Fibre Channel switches on either end of the replication network provides the additional credits necessary to overcome buffer shortages due to the network latency.
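As a rough sketch of why distance exhausts credits, the minimum number of buffer credits needed to keep a link full is the number of frames in flight during one round trip. The propagation delay (about 5 microseconds per km in fiber) and full-size 2112-byte frames are assumptions for illustration:

```python
import math

def buffer_credits_needed(distance_km, rate_mb_s, frame_bytes=2112):
    """Rough minimum BB credits to keep a longwave link full: the number
    of full-size frames that fit in one round trip."""
    round_trip_s = 2 * distance_km * 5e-6       # ~5 us/km each way in fiber
    bytes_in_flight = rate_mb_s * 1_000_000 * round_trip_s
    return math.ceil(bytes_in_flight / frame_bytes)

# Hypothetical example: 30 km at 800 MB/s (roughly 8 Gbps Fibre Channel)
credits = buffer_credits_needed(30, 800)   # about 114 credits
```

If the ports on either end advertise fewer credits than this, the link stalls waiting for acknowledgments, which is what the additional switches compensate for.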

The following figure shows data path types, switches, and distances.

(Figure: data path types, switches, and distances)

Fibre Channel used as remote paths

Before configuring a system using Fibre Channel, you need to consider the following restrictions.

For details about Fibre Channel, see the Provisioning Guide for your system.

  • When you use Fibre Channel as a remote path, if you specify Auto for Port Speed, specify 10 seconds or more for Blocked Path Monitoring. If you want to specify 9 seconds or less, do not set Auto for Port Speed.
  • If the time specified for Blocked Path Monitoring is not long enough, the network speed might be slowed down or the period for speed negotiation might be exceeded. As a result, paths might be blocked.

iSCSI data path requirements

For the iSCSI interface, direct, switch, and channel extender connections are supported. The following table lists the requirements and cautions for systems using iSCSI data paths. For details about the iSCSI interface, see the Provisioning Guide.

Remote paths

If iSCSI is used in a remote path, Blocked Path Monitoring must be at least 40 seconds (the default). If Blocked Path Monitoring is less than 40 seconds, the path could become blocked because of a delaying factor on the network, such as the spanning tree of a switch.

Physical paths

  • When adding remote paths to a path group, it is recommended that all remote paths use the same protocol. A configuration containing both Fibre Channel and iSCSI connections might degrade performance.
  • Before replacing Fibre Channel or iSCSI physical paths, remove the UR pair and the remote path that are using the physical path to be replaced.
  • Using the same protocol for the physical paths between the host and a storage system and between storage systems is recommended.

    As in the example below, if protocols are mixed, set the command timeout value between the host and a storage system to be equal to or greater than the value between storage systems.

    Example:

    - Physical path between the host and a storage system: Fibre Channel

    - Physical path between storage systems: iSCSI

Ports

  • When the parameter settings of an iSCSI port are changed, the iSCSI connection is temporarily disconnected and then reconnected. To minimize the impact on the system, change the parameter settings when the I/O load is low.
  • If you change the settings of an iSCSI port connected to the host, a log might be output on the host, but this does not indicate a problem. In a system that monitors system logs, an alert might be output. If an alert is output, change the iSCSI port settings, and then check if the host is reconnected.
  • When you use an iSCSI interface between storage systems, disable Delayed ACK in the Edit Ports window in HDvM - SN, or execute the raidcom modify port -delayed_ack_mode disable command in CCI. By default, Delayed ACK is enabled.

    If Delayed ACK is enabled, it might take time for the host to recognize the volume used by a UR pair. For example, when the number of volumes is 2,048, it takes up to 8 minutes.

  • Do not change the default setting (enabled) of Selective ACK for ports.
  • In an environment in which a delay occurs in a line between storage systems, such as long-distance connections, you must set an optimal window size of iSCSI ports in storage systems at the primary and secondary sites after verifying various sizes. The maximum value you can set is 1,024 KB. The default window size is 64 KB, so you must change this setting.
  • iSCSI ports do not support fragment processing (dividing a packet). When the maximum transmission unit (MTU) of a switch is smaller than that of an iSCSI port, packets might be lost, and data cannot be transferred correctly. Set the same MTU value (or greater) for the switch as the value used for the iSCSI port. For more information about the MTU setting and value, see the switch manual.

    Note, however, that you cannot set an MTU value of 1500 or smaller for iSCSI ports. In a WAN environment in which the MTU value is less than 1500, fragmented data cannot be transferred. In this case, lower the maximum segment size (MSS) of the WAN router according to the WAN environment, and then connect to an iSCSI port. Alternatively, use a WAN environment in which the MTU value is at least 1500.

  • When using a remote path on the iSCSI port for which virtual port mode is enabled, use the information about the iSCSI port that has virtual port ID (0). You cannot use virtual port IDs other than 0 as a virtual port.
  • A port can be used for connections to the host (target attribute) and to a storage system (initiator attribute). However, to minimize the impact on the system if a failure occurs either on the host or in a storage system, you should connect the port for the host and for the storage system to separate CHBs.

Network setting

  • Disable the spanning tree setting for a port on a switch connected to an iSCSI port. The spanning tree function prevents packets from looping through the network when a link goes up or down, but while the switch reconverges, packets might be blocked for about 30 seconds. If you need to enable the spanning tree setting, enable the Port Fast function of the switch.
  • In a network path between storage systems, if you use a line that has a slower transfer speed than the iSCSI port, packets are lost, and the line quality is degraded. Configure the system so that the transfer speed for the iSCSI ports and the lines is the same.
  • Delays in lines between storage systems vary depending on system environments. Validate the system to check the optimal window size of the iSCSI ports in advance. If the impact of the line delay is major, consider using devices for optimizing or accelerating the WAN.
  • When iSCSI is used, packets are sent or received using TCP/IP. Because of this, the amount of packets might exceed the capacity of a communication line, or packets might be resent. As a result, performance might be greatly affected. Use Fibre Channel data paths for critical systems that require high performance.
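The window-size guidance in the list above follows from the bandwidth-delay product: the TCP window must cover the data in flight during one round trip, or the line cannot be kept full. The line rate and delay below are hypothetical:

```python
def required_window_kb(bandwidth_mb_s, round_trip_ms):
    """TCP window needed to keep the line full: bandwidth x round-trip
    delay. Compare the result against the iSCSI port window limits
    (64 KB default, 1,024 KB maximum)."""
    window_bytes = bandwidth_mb_s * 1_000_000 * (round_trip_ms / 1000.0)
    return window_bytes / 1024

# Hypothetical example: 100 MB/s line with a 10 ms round trip needs a
# window of roughly 977 KB, close to the 1,024 KB port maximum
needed = required_window_kb(100, 10)
```

If the required window exceeds the 1,024 KB port maximum, the line cannot be fully utilized, which is one case where WAN optimization devices become worth considering.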

Restrictions when using NVMe-oF

The following restrictions apply when you configure the system by using NVMe over Fibre Channel (NVMe-oF) for the connection between hosts and the storage system. For details about the supported storage systems and DKCMAIN microcode versions for using NVMe-oF, see System requirements. For details about NVMe-oF, see the Provisioning Guide for Open Systems.

Restrictions on remote paths

Configurations using NVMe-oF for a remote path are not supported.

Restrictions on physical paths between hosts and the storage system

Set the command timeout value between hosts and the storage system to a larger value than the command timeout value between the storage systems.

Restrictions on a mixed use with LU path definition using Fibre Channel or iSCSI

A pair cannot be created from a combination of an NVMe-oF volume (a volume set as a namespace on an NVM subsystem for which NVM subsystem ports have been added) and a Fibre Channel or iSCSI volume (a volume to which an LU path is set). The following table shows which combinations can create a pair.

P-VOL                    S-VOL                    A pair can be created
NVMe-oF                  NVMe-oF                  Yes
NVMe-oF                  Fibre Channel or iSCSI   No
Fibre Channel or iSCSI   NVMe-oF                  No
Fibre Channel or iSCSI   Fibre Channel or iSCSI   Yes
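The rule behind the combination table can be captured as a small check (the interface labels are illustrative strings, not product identifiers):

```python
def pair_can_be_created(pvol_interface, svol_interface):
    """True when both volumes use NVMe-oF, or both use an LU-path
    interface (Fibre Channel or iSCSI); mixed combinations cannot
    form a UR pair."""
    lu_path = {"FC", "iSCSI"}
    if pvol_interface == "NVMe-oF" and svol_interface == "NVMe-oF":
        return True
    return pvol_interface in lu_path and svol_interface in lu_path
```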
Restrictions on a 3DC configuration

3DC configurations that use a volume connected to a host by using NVMe-oF are not supported.

Restrictions on connection with other storage systems

For details about the combinations of storage systems that support the P-VOL and S-VOL connected to a host by using NVMe-oF, contact customer support.

Restrictions on pair operations using CCI

When you use a volume connected to a host by using NVMe-oF as a pair volume, specify the pair volume as a dummy LU. For details, see the relevant topic in the Command Control Interface User and Reference Guide.