Content
Expedite the troubleshooting process for the most common issues by reviewing and answering the following questions for your product.
General Troubleshooting Questions
- What is the current impact?
- What error(s) is the customer reporting?
- When did this issue occur and is it ongoing?
- What troubleshooting steps have been taken so far?
- What is the currently installed version?
Potential Issue Areas
- Performance issues
- Security permissions issues
- Heap memory issues
- File (NDMP) replication problems
- HNAS resets / HNAS "boot loops"
- NFS/CIFS access issues
- File system down
- AV issues
- DM2C issues
- Gathering packet captures
Performance issues
If a customer is reporting a performance problem, refer them to this knowledgebase article:
If they are reporting that a node is "hung", refer them to this knowledgebase article:
Security permissions issues
If a customer is reporting a problem with security permissions, refer them to this knowledgebase article:
Heap memory issues
This covers events and issues such as:
- Failed-to-allocate panics ("start of "FailedToAllocate" panic")
- Heap fragmentation ("No big free heap blocks left")
- Heap-related enforced limits:
  - Insufficient free heap:
    - "There is insufficient heap: dropping new connection from"
    - "There is insufficient heap: dropping additional connection request from"
  - Insufficient free heap in big-sized heap blocks:
    - "There is insufficient heap: dropping new connection from"
    - "There is insufficient heap: dropping additional connection request from"
Refer to the knowledgebase article What Information Do I Need to Gather to Allow GSC to Diagnose Heap Issues on HNAS. Once you have the appropriate diagnostics, refer to the article How To Diagnose HNAS Heap Memory Issues to continue the diagnosis.
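When a customer sends a log excerpt, a quick scan for the message strings listed above can help confirm that the case belongs in this category before the full diagnostics arrive. The following is a minimal sketch, not an official tool: the function name, the labels, and the idea of matching on plain substrings are assumptions made for illustration, and real log lines may carry timestamps or other fields around these messages.

```python
from collections import Counter

# Message fragments quoted in this article. The dictionary keys are
# hypothetical labels chosen for this sketch, not HNAS event names.
HEAP_PATTERNS = {
    "failed_to_allocate": "FailedToAllocate",
    "fragmentation": "No big free heap blocks left",
    "dropped_new_connection":
        "There is insufficient heap: dropping new connection from",
    "dropped_additional_connection":
        "There is insufficient heap: dropping additional connection request from",
}


def count_heap_events(lines):
    """Count how many lines contain each heap-related message fragment."""
    counts = Counter()
    for line in lines:
        for label, fragment in HEAP_PATTERNS.items():
            if fragment in line:
                counts[label] += 1
    return counts


def count_heap_events_in_file(path):
    """Apply count_heap_events to a text log file, ignoring undecodable bytes."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return count_heap_events(handle)
```

Any non-zero count suggests the heap articles named above are the right next step. A zero count does not rule out a heap problem, because the excerpt may not cover the relevant time window.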
File (NDMP) replication problems
If a customer is reporting a problem with file (NDMP) replication, refer them to this knowledgebase article:
HNAS resets / HNAS "boot loops"
If a customer (or Remote Ops (Hi-Track)) reports an HNAS reset, you usually just need HNAS managed server diagnostics taken after the reset. If the server has been restarted since the reset, the diagnostics for the reset will no longer be in last_dblog.txt. In that case, ask the customer to get historical logs. For both diagnostics and historical logs, refer to Hitachi NAS (HNAS) Platform Data Collection.
If the HNAS is continually resetting (that is, it is stuck in a "boot loop"), refer the customer initially to:
Once you have the appropriate diagnostics, refer to How To Diagnose HNAS Resets to proceed with the diagnosis.
Remember to check that the system wasn't just being deliberately rebooted.
NFS/CIFS Access
- Which share/EVS is impacted?
- Are all clients impacted or just one?
- Is the EVS/share pingable by hostname and/or IP address?
- When did the share become inaccessible?
- Are there any AV servers attached to the impacted EVS?
- Any errors reported from the clients?
- What errors is the HNAS reporting?
- Which OS are the clients running (e.g. version and patch level)?
- Try migrating the impacted EVS to the other node. Does the connectivity issue follow the EVS or stay with one node?
- Collect and upload diags.
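The reachability question above can be answered from a client with standard tools. As a complement, the sketch below checks name resolution and whether the file-serving TCP ports accept connections. It is a TCP connect test rather than an ICMP ping, so it answers a slightly different question. The function names are made up for this example; ports 445 (SMB/CIFS) and 2049 (NFS) are the standard ones, but confirm what the customer's environment actually uses.

```python
import socket

SMB_PORT = 445   # standard SMB/CIFS port over TCP
NFS_PORT = 2049  # standard NFS port


def resolve(hostname):
    """Return the sorted addresses a name resolves to, or [] if lookup fails."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return []
    # Each entry's sockaddr tuple starts with the address string.
    return sorted({info[4][0] for info in infos})


def port_open(address, port, timeout=3.0):
    """Return True if a TCP connection to address:port succeeds in time."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_evs(hostname, ports=(SMB_PORT, NFS_PORT)):
    """Map each resolved address to {port: reachable} for the given ports."""
    return {
        address: {port: port_open(address, port) for port in ports}
        for address in resolve(hostname)
    }

# Example call (the hostname is a placeholder, not a real EVS):
#   check_evs("evs1.example.com")
```

An empty result means the name did not resolve, which points toward DNS rather than the HNAS. Running the same check from an affected and an unaffected client helps answer whether all clients are impacted or just one.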
File System Down
- What is the current impact?
- Which file system is currently down?
- What time did the file system go offline?
- Any recent changes in the environment (storage issues, power issues)?
- Was the customer able to remount the file system? If so, what was the result?
- What is the current state of the underlying storage array?
- Is the file system full?
- Any errors reported from the clients?
- What errors is the HNAS reporting?
- Please collect and upload diags
AV Server
- What is the current impact to the HNAS/AV server?
- Is CIFS access slow or impacted in some way?
- What is the make and version of the AV server(s)?
- How many AV servers have been dedicated to each EVS?
- Are the AV servers shared between EVSs or dedicated to individual EVSs?
- Which EVS is impacted?
- What is the status of the AV server (CPU maxed, hung)?
- What errors are being seen on the HNAS and/or clients?
- Are only zipped/archived files impacted, or are all file types impacted?
DM2C
- Which file system is impacted? If multiple, please provide a list.
- What is the name of the impacted namespace that data is being migrated to?
- Any errors reported from the clients?
- What errors is the HNAS reporting?
- What is the current HNAS version?
- What is the current HCP version?
- Please collect HCP logs and HNAS diags
- Which DM2C jobs are impacted? Only one or all?
- Which HCP account is the HNAS using to interact with the impacted namespace?
Gathering packet captures
Refer the customer to the knowledgebase article: