Remote Copy Performance Data Collection
Note: Please label all file names clearly to identify which MCU, RCU, HOST or SWITCH the LOG or DUMP applies to. Include the date and time it was extracted in the file name or accompanying documentation.
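One way to keep file labels consistent is to generate them from a small helper. The sketch below is only an illustration: the naming pattern, the function name label_file, and the sample identifier SN12345 are assumptions, not a Hitachi convention. The note above requires only that the component and the extraction date and time can be identified.

```python
from datetime import datetime

# Component roles named in the note above.
VALID_ROLES = {"MCU", "RCU", "HOST", "SWITCH"}


def label_file(role, identifier, kind, extracted, ext):
    """Build a file name that states the component and the extraction time.

    The ROLE_ID_KIND_YYYYMMDD_HHMMSS.ext pattern is a hypothetical convention.
    """
    role = role.upper()
    if role not in VALID_ROLES:
        raise ValueError(f"unknown role: {role}")
    stamp = extracted.strftime("%Y%m%d_%H%M%S")
    return f"{role}_{identifier}_{kind}_{stamp}.{ext}"


# SN12345 is a placeholder identifier, not a real serial number.
print(label_file("mcu", "SN12345", "DETAILDUMP", datetime(2007, 3, 14, 9, 30, 0), "zip"))
# prints: MCU_SN12345_DETAILDUMP_20070314_093000.zip
```

Any scheme works as long as it is applied to every LOG and DUMP in the same way.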
When a remote copy solution is deployed, the customer needs to work with the Hitachi Hardware Engineer to collect the following:
- A synchronized Type=DETAIL Auto Dump from the MCU and RCU, taken with System Option Mode 31 on while the performance problem is in progress. See Detail Dump - CE Required for more details. If MODE 31 has been turned on, remember to turn it off after the Detail Dump completes. MODE 31 can affect performance for some workload types and should not be left on unless specifically requested by GSC.
- Please detail exactly what was happening when the dumps were taken. For example, was there large TC Initial Copy or Resync activity being performed?
- The time difference between the clocks on the Hosts, the Switches and the SVPs of the MCU and RCU.
- Performance Monitor data from both the MCU and RCU. Please collect the MCU and RCU data within 6 hours of each other and within 23 hours of the problem; the sooner the better. See Data Collection from Performance Monitor for more details.
- GetConfig information from as many affected HOSTS as possible, collected as soon as possible after the problem has happened or been recreated. This should allow GSC to see any IO errors that the Hosts may be experiencing.
- A diagram showing the connectivity between the MCU and RCU, including switches, port numbers, link type and distance.
- Does the problem occur if the pairs are deleted?
- If this is a 9900V, please turn on the HRC Usage Monitor at 1 minute intervals for a while before taking the Detail Dump.
Data Collection needs to be properly synchronized between all the various components and collected within 23 hours of the problem.
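The timing rules above (MCU and RCU Performance Monitor data within 6 hours of each other, and all data within 23 hours of the problem) can be checked mechanically once the clock differences are known. The following is a minimal sketch under stated assumptions: the function names, the idea of converting every timestamp to one reference clock, and the reading of "within 23 hours" as "no more than 23 hours after the problem" are illustrative choices, not part of any Hitachi tool.

```python
from datetime import datetime, timedelta
from typing import List

MAX_MCU_RCU_GAP = timedelta(hours=6)         # limit stated in the text above
MAX_AGE_AFTER_PROBLEM = timedelta(hours=23)  # limit stated in the text above


def to_reference(local_time: datetime, clock_offset: timedelta) -> datetime:
    """Convert a timestamp read on one component to the reference clock.

    clock_offset is how far that component's clock runs ahead of the
    reference clock (negative if it runs behind).
    """
    return local_time - clock_offset


def check_collection(problem: datetime, mcu_collected: datetime,
                     rcu_collected: datetime) -> List[str]:
    """Return a list of timing issues; all inputs use the reference clock."""
    issues = []
    if abs(mcu_collected - rcu_collected) > MAX_MCU_RCU_GAP:
        issues.append("MCU and RCU data were collected more than 6 hours apart")
    for name, collected in (("MCU", mcu_collected), ("RCU", rcu_collected)):
        if collected - problem > MAX_AGE_AFTER_PROBLEM:
            issues.append(f"{name} data was collected more than 23 hours after the problem")
    return issues


# Hypothetical example: the MCU SVP clock runs 5 minutes ahead of the reference.
problem = datetime(2007, 3, 14, 9, 0)
mcu = to_reference(datetime(2007, 3, 14, 15, 5), timedelta(minutes=5))
rcu = to_reference(datetime(2007, 3, 14, 22, 30), timedelta(0))
for issue in check_collection(problem, mcu, rcu):
    print(issue)
```

In this example the two collections are 7.5 hours apart on the reference clock, so the 6-hour rule is reported as violated while both collections are still inside the 23-hour window.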
If there are switches in the TrueCopy configuration and there is a possibility of link issues anywhere in the configuration, gather the appropriate LOG data from all concerned switches ASAP after an incident.
- McData fabric - provide a "data capture" from EFCM if available for every switch. If EFCM is not available, then use the Web Explorer (or EFCM Lite) and export ALL the error logs from each switch directly.
- Brocade fabric - provide a "supportshow" from each switch in the Fabric ASAP after the incident. We recommend upgrading all capable switches to code level V5.x so that some of the LOG data carries proper date and time stamps.
- Cisco fabric - provide a "show tech-support details" from each switch in the fabric ASAP after the event.
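The per-vendor requirements above can be turned into a per-switch checklist so that no switch is missed after an incident. This is only a sketch: the function name build_checklist, the switch names, and the checklist format are hypothetical, and the action text simply restates the items listed above rather than running any switch command.

```python
# Collection action for each fabric vendor, restated from the list above.
SWITCH_LOG_ACTIONS = {
    "mcdata": 'run a "data capture" from EFCM; if EFCM is not available, '
              "export ALL error logs with Web Explorer or EFCM Lite",
    "brocade": 'capture "supportshow" output',
    "cisco": 'capture "show tech-support details" output',
}


def build_checklist(switches):
    """Build checklist lines from a list of (switch_name, vendor) tuples."""
    lines = []
    for name, vendor in switches:
        action = SWITCH_LOG_ACTIONS.get(vendor.lower())
        if action is None:
            # Vendors other than the three listed above are outside this sketch.
            action = "vendor not covered by this procedure"
        lines.append(f"[ ] {name}: {action}")
    return lines


# Hypothetical switch inventory for a TrueCopy SAN.
inventory = [("tc-sw-mcu-1", "Brocade"), ("tc-sw-rcu-1", "Cisco"), ("edge-1", "McData")]
for line in build_checklist(inventory):
    print(line)
```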
Also supply the following configuration data:
- Which LUNS, which CU:LDEVs and which C/T Group (if applicable) are affected by the problem.
Provide a Host SAN and TrueCopy SAN configuration diagram. If one is NOT available, draw a basic diagram and provide that. It must show the Hosts having problems, how many HBAs each Host has, the WWN of each HBA, which switch PORTS the HBAs plug into, which Ports in the RAID the Host attaches to and which switch ports the RAID ports plug into.
The TrueCopy SAN diagram must clearly identify which are the MCU Initiator Ports and the equivalent RCU Target Ports, where the Ports plug into the TrueCopy SAN switches, and what distance and type of LINK (WAN or FC) exists between the MCU and RCU.
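Where a diagram has to be drawn from scratch, it can help to first record each path as structured data and check that nothing required above is missing. The sketch below assumes hypothetical record types (HostPath, TrueCopyLink) and field names chosen to mirror the items listed above; the WWN and port values are placeholders.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class HostPath:
    """One HBA path from a Host to the RAID (hypothetical record layout)."""
    host: Optional[str] = None
    hba_wwn: Optional[str] = None
    host_switch_port: Optional[str] = None   # switch port the HBA plugs into
    raid_port: Optional[str] = None          # RAID port the Host attaches to
    raid_switch_port: Optional[str] = None   # switch port the RAID port plugs into


@dataclass
class TrueCopyLink:
    """One MCU Initiator to RCU Target path (hypothetical record layout)."""
    mcu_initiator_port: Optional[str] = None
    mcu_switch_port: Optional[str] = None
    rcu_target_port: Optional[str] = None
    rcu_switch_port: Optional[str] = None
    link_type: Optional[str] = None          # "WAN" or "FC"
    distance: Optional[str] = None


def missing_fields(record) -> List[str]:
    """List the fields of a record that have not been filled in yet."""
    return [f.name for f in fields(record) if getattr(record, f.name) in (None, "")]


# Placeholder values only; substitute the real Host, WWN and port names.
path = HostPath(host="hostA", hba_wwn="10:00:00:00:c9:00:00:01")
link = TrueCopyLink(mcu_initiator_port="CL1-A", rcu_target_port="CL1-A", link_type="FC")
print("HostPath is missing:", missing_fields(path))
print("TrueCopyLink is missing:", missing_fields(link))
```

A record with no missing fields carries everything the corresponding part of the diagram is required to show.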
Please detail any recent workload increase on any of the Hosts. Also, describe the impact to the customer's operations.