
Monitoring resources


You use the Resources page in the HCP System Management Console to monitor the use of system resources. The information on this page can help you determine the causes of system issues such as slowed responses to client read and write requests or abnormal conditions reported in the system log.

The Resources page uses graphs to show statistics about the use of these resources over time:

CPU

Local logical volumes

Memory

Front-end, back-end, and, if it is enabled, management networks

The graphs let you analyze trends across individual storage nodes and compare node performance to the performance of the system as a whole. The graphs are coordinated with each other, allowing you to easily view the use of multiple resources during the same time period. Additionally, the Resources page can display the HCP system log, so you can correlate resource usage with system events.

To diagnose issues, you should review all the graphs for the applicable time period. Some issues become apparent only when you compare graphs for multiple resources.

To display the Resources page, in the top-level menu of the System Management Console, select Monitoring > Resources.

Roles: To view the Resources page, you need the monitor or administrator role.

© 2015, 2019 Hitachi Vantara Corporation. All rights reserved.

About the resource usage graphs


HCP uses System Activity Reporter (SAR) data as the basis for resource usage reporting. SAR is a utility that runs on each node in the HCP system. Every ten minutes, SAR records statistics representing the average use of various resources in the node during the past ten-minute interval. The graphs on the Resources page in the System Management Console show these statistics for a subset of those resources.

For the CPU, memory, and network resources, the graphs can show either the average of the SAR statistics across all storage nodes or this average along with the SAR statistics for an individual storage node. For the logical volume resource, the graphs can show the SAR statistics only for an individual logical volume.
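As an illustration of the all-nodes view, the following minimal sketch averages per-node samples interval by interval. It assumes a plain arithmetic mean and one value per node per ten-minute interval; the function name, node names, and sample values are hypothetical, and HCP may compute its average differently.

```python
from statistics import mean

def system_average(samples_by_node):
    """Average per-interval values across nodes.

    samples_by_node maps a node name to a list of values, one value per
    ten-minute interval. All lists are assumed to have the same length.
    """
    series = list(samples_by_node.values())
    # zip(*series) groups the values that belong to the same interval.
    return [mean(interval) for interval in zip(*series)]

# Hypothetical CPU usage percentages for three nodes over three intervals.
cpu = {
    "node-1": [20.0, 40.0, 60.0],
    "node-2": [10.0, 20.0, 30.0],
    "node-3": [30.0, 30.0, 30.0],
}
all_nodes = system_average(cpu)
print(all_nodes)
```

Plotting `all_nodes` next to the series for a single node gives the kind of comparison the graphs provide when an individual node is selected.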

For information on managing the resource usage graphs, see Managing the resource usage graphs.

CPU


CPU statistics provide information about the processing load on the HCP system. HCP reports CPU statistics in these graphs:

CPU Usage — This graph shows both the percent of CPU capacity used by the operating system kernel (OS in the graph legend) and the percent of CPU capacity used by HCP processes (HCP in the graph legend).

CPU IO Wait — This graph shows the percent of CPU capacity spent waiting to access logical volumes that are in use by other processes.

These two statistics together equal the total processing load on the system.

If CPU usage is consistently high across all nodes and system performance is degraded, the namespace application workload may be too heavy for the system to handle efficiently. In this case, you may need to add nodes to the system or upgrade the existing nodes to nodes with greater CPU capacity.

If CPU usage is high on a recurring basis, check the system log to see whether the high CPU usage correlates with recurring events such as services running. If the high usage correlates with services running, you may want to change the service schedule. For information on doing this, see Scheduling services.

High CPU usage on only a small number of nodes may mean that applications are repeatedly using the same IP addresses to access the system. In this case, you may want to suggest to tenant administrators that their applications use DNS or some other mechanism to help balance the workload across all the nodes in the system.

Consistently high CPU IO wait with low CPU usage may mean that HCP cannot access the system storage fast enough to keep up with application demand. In this case, you may need to add storage to the system so that attempts to access storage are spread across a larger number of logical volumes.

In an HCP SAIN system with spindown storage, high IO wait on nodes with logical volumes that can be spun down may indicate that these volumes are being spun up frequently. Check the LUN Utilization graph for the load on spindown volumes. If the load is high, you may want to redefine service plans to have a longer wait time before objects are moved to spindown storage. To see which logical volumes are spindown volumes, check the Hardware page in the System Management Console. For information on service plans, see Working with service plans.

A brief period of high CPU IO wait that corresponds to increased workload does not necessarily indicate a problem.
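The guidance above can be summarized as a rough decision sketch. The thresholds used here (80% for high CPU usage, 30% for high IO wait) are assumptions chosen for illustration only; the documentation does not define what counts as high, and the function is hypothetical rather than part of HCP.

```python
def cpu_hint(usage, io_wait, high_usage=80.0, high_wait=30.0):
    """Suggest a likely cause from average CPU statistics per node.

    usage and io_wait map a node name to an average percent for the
    period being reviewed. Both thresholds are illustrative assumptions.
    """
    busy = [node for node, pct in usage.items() if pct >= high_usage]
    waiting = [node for node, pct in io_wait.items() if pct >= high_wait]

    if busy and len(busy) == len(usage):
        return "high CPU usage on all nodes: workload may be too heavy"
    if busy:
        return "high CPU usage on some nodes: check how clients spread requests"
    if waiting:
        # Reached only when no node has high CPU usage.
        return "high IO wait with low CPU usage: storage may be too slow"
    return "no CPU issue suggested"

print(cpu_hint({"node-1": 92.0, "node-2": 15.0}, {"node-1": 2.0, "node-2": 1.0}))
```

A result from a sketch like this is only a starting point. As noted above, recurring peaks should still be checked against the system log, and a brief period of high IO wait is not necessarily a problem.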

Logical volumes


Logical volume usage statistics provide information about the load on storage managed by HCP. These statistics are available only for individual logical volumes. HCP reports logical volume usage statistics in these graphs:

LUN Read/Write — This graph shows the number of blocks read from the logical volume per second and the number of blocks written to the logical volume per second.

LUN Utilization — This graph shows the usage of the communication channel between the operating system and the logical volume as a percent of the channel bandwidth.

The way logical volumes are used depends on the HCP system configuration. Some logical volumes can store only objects, some can store only the metadata query engine index, and some can store both. Additionally, in an HCP SAIN system, some logical volumes may be used for spindown storage. How you interpret the statistics in the logical volume usage graphs depends partly on these factors.

High read, write, and access values for all logical volumes along with low CPU usage may mean that the HCP system storage has insufficient bandwidth to support its workload. In this case, you may need to add storage to the system to spread read and write operations across more logical volumes.

High read and write rates for some but not all logical volumes may mean that the distribution of objects across the nodes in the system is uneven. To verify that this is the case, check the logical volume usage statistics on the Hardware page in the System Management Console. To resolve the issue, submit a request to your authorized HCP service provider to run the capacity balancing service to bring the object distribution to a more balanced state.

Memory


HCP reports on memory usage in the Memory Swap graph. This graph shows the number of pages swapped out of memory per second.

Typically, the page-swap rate for an HCP system is less than one page per second. A consistently high page-swap rate may indicate that the system has insufficient memory to handle its workload. In this case, you may need to add nodes to the system, add memory to the existing nodes, or upgrade existing nodes to nodes with more memory to resolve the issue.
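One way to make "consistently high" concrete is to look at the share of intervals in which the swap rate exceeds the typical value of one page per second. This is a minimal sketch: the 90% share and the function name are assumptions, not values defined by HCP.

```python
def swap_is_consistently_high(rates, limit=1.0, share=0.9):
    """Return True if most intervals show a swap rate above the limit.

    rates holds pages swapped out per second, one value per interval.
    limit reflects the typical rate mentioned above; share is an
    illustrative assumption for what "consistently" means.
    """
    if not rates:
        return False
    above = sum(1 for rate in rates if rate > limit)
    return above / len(rates) >= share

# A single spike is not flagged; a sustained high rate is.
print(swap_is_consistently_high([0.2, 0.1, 5.0, 0.3]))
print(swap_is_consistently_high([4.0, 6.0, 5.5, 3.2]))
```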

Networks


Network statistics provide information about bandwidth usage on the front-end, back-end, and, if it is enabled, management networks used by the HCP system. HCP reports network statistics in these graphs:

Front-end Network — This graph shows the number of bytes read from the node per second and the number of bytes written to the node per second over the front-end network. These are the total numbers of bytes across the [hcp_system] network and all user-defined networks.

Back-end Network — This graph shows the number of bytes read from the node per second and the number of bytes written to the node per second over the back-end network.

Management Network — This graph shows the number of bytes read from the node per second and the number of bytes written to the node per second over the management network. These are the total numbers of bytes across the [hcp_management] network. The management network graph is only shown if the network is enabled.

Note: The amount of back-end network traffic generated by any given namespace is directly related to the ingest tier DPL defined for the namespace by its service plan. The higher the ingest tier DPL, the more back-end network traffic the namespace creates.

Heavy traffic (greater than 120 MB per second) on both the front-end and back-end networks may mean that the HCP system has insufficient bandwidth to accommodate its workload. In this case, you may need to add nodes to the HCP system to increase the available bandwidth.
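Because the network graphs report bytes per second, the 120 MB per second guideline has to be converted before it can be compared with graph values. The sketch below assumes that 1 MB means 1,000,000 bytes and that the guideline applies to a single rate value for each network; neither point is stated in the text, and the names are hypothetical.

```python
MB = 1_000_000          # assumption: decimal megabytes
HEAVY = 120 * MB        # guideline from the text, in bytes per second

def both_networks_heavy(front_bps, back_bps):
    """Return True if front-end and back-end rates both exceed the guideline."""
    return front_bps > HEAVY and back_bps > HEAVY

# Hypothetical rates in bytes per second.
print(both_networks_heavy(130 * MB, 125 * MB))
print(both_networks_heavy(130 * MB, 40 * MB))
```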

Heavy front-end traffic on some nodes, but not all of them, may indicate one of the following problems:

The HCP subdomain is not correctly configured in your DNS. For information on configuring DNS for HCP, see Configuring DNS for HCP.

Applications are repeatedly using the same IP addresses to access the system. In this case, you may want to suggest to tenant administrators that their applications use DNS or some other mechanism to help balance the workload across all the nodes in the system.

Heavy back-end traffic on some but not all nodes may mean that the distribution of objects across the nodes in the system is uneven. To verify that this is the case, check the logical volume usage statistics on the Hardware page in the System Management Console. To resolve the issue, submit a request to your authorized HCP service provider to run the capacity balancing service to bring the object distribution to a more balanced state.

Managing the resource usage graphs


To manage the resource usage graphs on the Resources page, you can:

Choose which graphs to display

Choose whether the graphs show information about an individual node or all nodes

Zoom in to more easily see the individual data points in the graphs

Select a time for which you want to know the resource usage details

Specify the time period you want to see in the graph windows

Scroll left and right to change the time period shown in the graph windows

Switching graphs


The Resources page shows four graphs at a time. By default, the page shows these graphs: CPU Usage, CPU IO Wait, Back-end Network, and Front-end Network. At any time, you can change which graphs are shown.

To switch one graph for another, in the title bar of the graph you want to switch, click on the icon for the graph you want. The graph icons are:

ResourceCPUUsage.png (%) — CPU Usage

ResourceCPUIOWait.png (IO) — CPU IO Wait

ResourceVolumeReadWrite.png (r/w) — LUN Read/Write

ResourceVolumeUtilization.png (%) — LUN Utilization

ResourceMemory.png (M) — Memory Swap

ResourceFrontEndNetwork.png (F) — Front-end Network

ResourceBackEndNetwork.png (B) — Back-end Network

ResourceManagement.png (M) - Management Network

Setting the scope


The CPU, memory, and network graphs can show statistics for all nodes or for an individual node along with the statistics for all nodes. The logical volume graphs can show statistics only for an individual volume and only while a node is selected for display in the other graphs.

To set the scope of the information shown in the graphs, select either All Nodes or the node you want in the field above the top right graph. The selection applies to all the graphs.

To set the logical volume for a logical volume graph, while an individual node is selected for display, select the volume you want in the field in the graph title bar. The selection applies only to that graph.

Zooming


The graphs on the Resources page can show resource usage statistics for at most 30 days (or fewer if the system was installed less than 30 days ago). However, the longer the time period shown, the harder it is to see short-term changes in resource usage.

You can zoom in on the graphs to more easily see the individual data points in them. Zooming in changes the time period visible in each graph but does not change the height of the graphs. You can zoom in until the visible time period is one day. You can zoom out until the visible time period is 30 days.

Zooming affects all the graphs equally.

To zoom in, click on the plus control ( PlusControlOrange.png ) above the top left graph.

To zoom out, click on the minus control ( MinusControlOrange.png ) above the top left graph.

After each zoom action, the zoom controls are momentarily grayed.

You can set the amount by which zooming increases or decreases the visible time period each time you zoom. To do this:

1. On the Resources page, click on the edit control ( EditControlBlue.png ) above the top left graph.

2. In the Modify Zoom and Scroll Settings window, select one of these in the Zoom Settings section:

   - Incremental zoom to increase or decrease the visible time period by a factor of two each time you zoom in or out

   - Maximum zoom to decrease the visible time period to one day when you zoom in or to increase the visible time period to 30 days (or fewer if the system was installed less than 30 days ago) when you zoom out

3. Click on Update Settings.
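The effect of the two zoom settings on the visible time period can be sketched as follows. This is an illustration only: it assumes the result is clamped to the one-day and 30-day limits described above, and it does not model the shorter maximum for a system installed less than 30 days ago. The function and constant names are hypothetical.

```python
MIN_DAYS = 1.0
MAX_DAYS = 30.0

def zoom(visible_days, direction, mode="incremental"):
    """Return the visible time period, in days, after one zoom action.

    direction is "in" or "out"; mode is "incremental" or "maximum".
    """
    if mode == "maximum":
        return MIN_DAYS if direction == "in" else MAX_DAYS
    factor = 0.5 if direction == "in" else 2.0
    # Assumption: the console never goes below one day or above 30 days.
    return min(MAX_DAYS, max(MIN_DAYS, visible_days * factor))

# Zooming in repeatedly from the default 30-day view.
days = 30.0
steps = []
while days > MIN_DAYS:
    days = zoom(days, "in")
    steps.append(days)
print(steps)
```

Under these assumptions, incremental zoom takes several clicks to go from the 30-day view to the one-day view, while maximum zoom does it in one.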

Viewing details for a point in time


In each resource usage graph, a vertical line serves as a time marker. The marked time is displayed in the middle above the graphs. The graph legends show the detailed resource usage statistics for the marked time.

To show detailed statistics for a different time, you reposition the time marker. The time marker is always in the same location in all the graphs, so when you reposition it in one graph, it changes in all.

To reposition the time marker, click in the graph at the point in time to which you want the marker to move.

Specifying a time period


Above the resource usage graphs, the Resources page displays the start time of the currently visible time period on the left and the end time on the right. By default, when you open the page, the graphs show statistics for 30 days (or fewer if the system was installed less than 30 days ago), ending with the day of the most recently recorded SAR data.

You can change the time period shown in the graphs by selecting new start and end dates. The time period you specify can be at most 30 days. This time period applies to all the graphs.

HCP keeps SAR data for 180 days. As a result, the start date you specify cannot be more than 180 days in the past.

To change the time period shown in the resource usage graphs:

1. On the Resources page, click on the calendar control ( CalendarControlOrange.png ) above the top left graph.

2. In the Modify Date Range window:

   - In the From field, specify the start date for the time period you want the graphs to cover. If you leave this field blank, the graphs use a start date of 30 days before the date specified in the To field (or fewer if the system was installed less than 30 days ago).

   - In the To field, specify the end date for the time period you want the graphs to cover. If you leave this field blank, the graphs use an end date of 30 days after the date specified in the From field (or fewer if the From date is less than 30 days ago).

   In both fields, you can specify the date in either of these ways:

   - Click on the calendar control ( CalendarControlGrey.png ) next to the applicable field and select the date you want.

   - Type the date you want, in this format: m/d/y

     In this format, m is the one- or two-digit month, d is the one- or two-digit day, and y is the two- or four-digit year.

   You can specify values in either or both of the From and To fields. You cannot leave both fields empty.

3. Click on Update Settings.
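The date rules in this procedure can be sketched in code. The example below parses the m/d/y format and fills in a blank From or To field using the 30-day window and the 180-day retention limit described above. It is a sketch under stated assumptions, not the console's implementation: all names are hypothetical, two-digit years follow Python's own `%y` convention, and the system installation date is not modelled.

```python
from datetime import date, datetime, timedelta

WINDOW = timedelta(days=30)      # longest period the graphs can show
RETENTION = timedelta(days=180)  # how long SAR data is kept

def parse_mdy(text):
    """Parse m/d/y with a two- or four-digit year into a date."""
    month, day, year = text.strip().split("/")
    fmt = "%m/%d/%Y" if len(year) == 4 else "%m/%d/%y"
    return datetime.strptime(f"{month}/{day}/{year}", fmt).date()

def resolve_range(from_text, to_text, today):
    """Return (start, end) for the graphs, filling in one blank field."""
    if not from_text and not to_text:
        raise ValueError("specify a From date, a To date, or both")
    start = parse_mdy(from_text) if from_text else None
    end = parse_mdy(to_text) if to_text else None
    if start is None:
        start = end - WINDOW
    if end is None:
        # Fewer than 30 days if the From date is less than 30 days ago.
        end = min(start + WINDOW, today)
    if end < start:
        raise ValueError("the To date is before the From date")
    if end - start > WINDOW:
        raise ValueError("the time period can be at most 30 days")
    if today - start > RETENTION:
        raise ValueError("the start date is more than 180 days in the past")
    return start, end

today = date(2019, 6, 1)  # hypothetical current date
print(resolve_range("3/1/2019", "", today))
print(resolve_range("", "5/31/19", today))
```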

Scrolling


You can scroll left and right in the resource usage graphs to shift the time period that’s visible in the graphs. Scrolling affects all graphs equally.

To scroll left to see an earlier time period, click on the left arrow ( ResourceGraphLeft.png ) above the top left graph.

To scroll right to see a later time period, click on the right arrow ( ResourceGraphRight.png ) above the top left graph.

To scroll right or left to see the time period with the currently marked time in the middle, click on the vertical line ( ResourceGraphCenter.png ) above the top left graph.

After each scroll action, the scroll controls are momentarily grayed.

You can set the percent by which scrolling shifts the visible time period each time you scroll left or right. To do this:

1. On the Resources page, click on the edit control ( EditControlBlue.png ) above the top left graph.

2. In the Modify Zoom and Scroll Settings window, select one of these in the Scroll Settings section:

   - Move left/right by 25% to shift the visible time period by 25% each time you scroll left or right.

   - Move left/right by 50% to shift the visible time period by 50% each time you scroll left or right.

   - Move left/right by 100% to shift the visible time period by 100% each time you scroll left or right.

3. Click on Update Settings.
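To illustrate what the scroll percentages mean, the following sketch shifts a visible time period by a percentage of its own length. The function name and dates are hypothetical, and the sketch does not model any limit the console applies at the edges of the available SAR data.

```python
from datetime import datetime

def scroll(start, end, percent, direction):
    """Shift the visible period left or right by a percent of its length."""
    shift = (end - start) * (percent / 100)
    if direction == "left":
        shift = -shift
    return start + shift, end + shift

# A hypothetical 30-day view scrolled left by 50% moves back 15 days.
start, end = datetime(2019, 5, 1), datetime(2019, 5, 31)
print(scroll(start, end, 50, "left"))
```

With the 100% setting, each scroll action shows a period that does not overlap the previous one; with 25% or 50%, consecutive views share part of their range.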

Viewing system log messages on the Resources page


To display the HCP system log on the Resources page, click on Logs at the bottom of the page.

When the Logs section opens, the message at the top of the list is the most recently recorded message before the end of the time period shown in the graphs.
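The choice of the first message shown can be pictured as a search in a time-ordered list. This minimal sketch assumes message timestamps sorted in ascending order and treats a message recorded exactly at the end of the period as included; both points are assumptions, and the names are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime

def top_message_index(timestamps, period_end):
    """Return the index of the latest message at or before period_end.

    timestamps must be sorted in ascending order. Returns None if every
    message was recorded after period_end.
    """
    position = bisect_right(timestamps, period_end)
    return position - 1 if position else None

# Hypothetical log timestamps.
stamps = [
    datetime(2019, 5, 1, 8, 0),
    datetime(2019, 5, 10, 12, 30),
    datetime(2019, 5, 20, 23, 15),
    datetime(2019, 6, 2, 9, 45),
]
print(top_message_index(stamps, datetime(2019, 5, 31)))
```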

For information on displays of system log messages, see Understanding the HCP system log.
