Storage components
Within the Hitachi Content Platform for cloud scale (HCP for cloud scale) system, the Object Storage Management application lets you manage and monitor storage components.
The starting point for storage component management is the Storage page in the Object Storage Management application. The procedures in this module begin at this page.
Adding a storage component
You can use the Object Storage Management application or an API method to add a storage component to the HCP for cloud scale system.
The storage component must contain an HCP for cloud scale bucket before you can add the storage component to the HCP for cloud scale system.
Before you can add a storage component, it must be available and you need the following information about it:
- Storage component type
- Endpoint information (host name or IP address)
- If an HCP S Series Node storage component, the cluster name, management host name, and administrative user credentials
- If used, the proxy host and port and the proxy user name and password
- API port
- S3 credentials (the access key and secret key to use for access to the storage component bucket)
Object Storage Management application instructions
Procedure
From the Storage page, click Add storage component.
The ADD STORAGE COMPONENT page opens.
Specify the following:
Name (optional): Type the display name you choose for the storage component, up to 1024 alphanumeric characters.
If you leave this blank, the storage component is listed without a name.
Type: Select AMAZON_S3, HCP_S3, HCPS_S3 (HCP S Series Node), or GENERIC_S3.
Region (optional): Type a region name of up to 1024 characters.
Endpoint: Type either the IP address or the cluster host name of the storage component. Type as many as 255 URI unreserved characters using only A-Z, a-z, 0-9, hyphen (-), period (.), underscore (_), and tilde (~). The final segment of a host name must not begin with a number.
For an HCP S Series Node storage component, the host name must be hs3.cluster_name.
In the S3 CONNECTION section, specify the following:
Select the Protocol used, either HTTPS (the default) or HTTP.
If Use Default is selected, the applicable default port number is filled in. If you clear the Use Default selection, type the Port number.
In the PROXY section, specify the following:
If you select Use Proxy, type values in the Host and Port boxes, and if the proxy needs authentication, type the Username and Password.
In the BUCKET section, specify the following:
Bucket Name: Type the name of the bucket on the storage component. The name can be from 3 to 63 characters long and must contain only lowercase characters (a-z), numbers (0-9), periods (.), or hyphens (-).
Note: The bucket must already exist on the storage component and should be empty.
(Optional) To use path-style URLs to access buckets, select Use path style always (the default).
In the AUTHENTICATION section, specify the following:
Type: Select the AWS Signature version, either V2 or V4.
Type the Access Key.
Type the Secret Key.
When you are finished, click Save.
The storage component is added to the Storage components section of the Storage page with the state ACTIVE.
Results
If the storage component state is INACTIVE, a configuration value might be incorrect. Select Verify from the More menu for the storage component and click Activate from the window that appears. If configuration errors are detected, correct them and try again.
Related REST API methods
POST /storage_component/create
For information about specific API methods, see the MAPI Reference or, in the Object Storage Management application, click the profile icon and select REST API.
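If you are scripting against the MAPI, the following minimal Python sketch shows the general shape of a create request. The base URL, port, token handling, and JSON field names are illustrative assumptions; consult the MAPI Reference for the exact request schema.
# Minimal sketch (assumptions marked): add a storage component through the MAPI.
import requests

MAPI_BASE = "https://hcpcs.example.com:9099/mapi/v1"   # assumed base URL and port
TOKEN = "..."                                          # assumed OAuth bearer token

payload = {                                            # illustrative field names
    "name": "my-hcps-node",                            # optional display name
    "storageType": "HCPS_S3",                          # AMAZON_S3, HCP_S3, HCPS_S3, or GENERIC_S3
    "storageComponentConfig": {
        "host": "hs3.cluster1.example.com",
        "https": True,
        "port": 443,
        "bucketName": "cloudscale-bucket",             # must already exist and be empty
        "authType": "V4",                              # AWS Signature version
        "accessKey": "AKIA...",
        "secretKey": "...",
        "useProxy": False,
        "usePathStyleAlways": True,
    },
}

response = requests.post(f"{MAPI_BASE}/storage_component/create",
                         json=payload,
                         headers={"Authorization": f"Bearer {TOKEN}"},
                         verify=False)                 # or pass the path to the cluster CA bundle
response.raise_for_status()
print(response.json())                                 # typically returns the new component and its state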
Verifying a storage component
You can use the Object Storage Management application to verify a storage component.
The storage component must be in the state Available before it can be used by the HCP for cloud scale system.
Object Storage Management application instructions
The verification process checks for these possible configuration errors:
- The specified bucket is already in use.
- The specified bucket does not exist.
- The endpoint is incorrect.
- The secret key or access key is incorrect.
- Path style addressing is configured but the storage component cannot use it.
- The authorization type is incorrect.
If configuration errors are detected, edit the storage component configuration to correct them and try again.
Modifying a storage component
You can use the Object Storage Management application or an API method to modify the configuration of a storage component after defining it.
Object Storage Management application instructions
Procedure
From the Storage page, navigate to the storage component you want to edit.
Click the more icon by the storage component and select Edit.
The storage component's configuration page appears.
Edit the connection information as needed. When you're finished, click Save.
The Username field is blank, but the configured value is used unless you change it.
Results
If the storage component state becomes INACTIVE, a configuration value might be incorrect. Select Verify from the More menu for the storage component and click Activate from the window that appears. If configuration errors are detected, correct them and try again.
Related REST API methods
POST /storage_component/update
For information about specific API methods, see the MAPI Reference or, in the Object Storage Management application, click the profile icon and select REST API.
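A configuration change can be scripted the same way. The sketch below assumes the same illustrative base URL and field names, and that the component is identified by the ID returned when it was created.
# Minimal sketch (assumptions marked): update a storage component's connection settings.
import requests

MAPI_BASE = "https://hcpcs.example.com:9099/mapi/v1"   # assumed base URL and port
TOKEN = "..."                                          # assumed OAuth bearer token

changes = {
    "id": "0b46a6f3-...-uuid",                         # hypothetical component ID
    "storageComponentConfig": {                        # illustrative field names
        "useProxy": True,                              # example: route traffic through a proxy
        "proxyHost": "proxy.example.com",
        "proxyPort": 3128,
    },
}

response = requests.post(f"{MAPI_BASE}/storage_component/update",
                         json=changes,
                         headers={"Authorization": f"Bearer {TOKEN}"},
                         verify=False)
response.raise_for_status()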
Activating a storage component
You can use the Object Storage Management application or an API method to activate a storage component.
A storage component is displayed as UNVERIFIED if HCP for cloud scale cannot reach the storage component with the supplied parameters or if the storage component is misconfigured.
Object Storage Management application instructions
Procedure
From the Storage page, navigate to the storage component you want to activate.
Click the more icon of the storage component and then select Set active.
A message appears and prompts you to confirm your action.
Click Yes, activate.
The storage component state changes to ACTIVE.
Results
If the storage component state remains INACTIVE, a configuration value might be incorrect. Select Verify from the More menu for the storage component and click Activate from the window that appears. If configuration errors are detected, correct them and try again.
Related REST API methods
POST /storage_component/update_state
For information about specific API methods, see the MAPI Reference or, in the Object Storage Management application, click the profile icon and select REST API.
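For example, a minimal Python sketch of the activation call might look like this; the base URL, token handling, and request field names are assumptions, so check the MAPI Reference for the supported body.
# Minimal sketch (assumptions marked): activate a storage component.
import requests

MAPI_BASE = "https://hcpcs.example.com:9099/mapi/v1"   # assumed base URL and port
TOKEN = "..."                                          # assumed OAuth bearer token

response = requests.post(f"{MAPI_BASE}/storage_component/update_state",
                         json={"id": "0b46a6f3-...-uuid",   # hypothetical component ID
                               "state": "ACTIVE"},          # state name as shown in the UI
                         headers={"Authorization": f"Bearer {TOKEN}"},
                         verify=False)
response.raise_for_status()
print(response.json())                                 # expect the component reported as ACTIVE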
Deactivating a storage component
You can use the Object Storage Management application or an API method to deactivate a storage component.
You might deactivate a storage component for maintenance purposes.
After you mark a storage component as INACTIVE, read, write, and healthcheck requests are rejected.
Object Storage Management application instructions
Procedure
From the Storage page, navigate to the storage component you want to deactivate.
Click the more icon of the storage component and then select Set inactive.
A message appears and prompts you to confirm your action.
Click Yes, inactivate.
The storage component state changes to INACTIVE.
Related REST API methods
POST /storage_component/update_state
For information about specific API methods, see the MAPI Reference or, in the Object Storage Management application, click the profile icon and select REST API.
Marking a storage component as read-only
You can use the Object Storage Management application or API methods to mark a storage component as read-only.
Storage components are not automatically marked as read-only if they become completely full. You might mark a storage component as read-only if it is nearly full.
Once you mark a storage component as read-only, write requests are directed to different storage components.
You can only mark a storage component as read-only if it is marked read-write and in the state ACTIVE.
Object Storage Management application instructions
Procedure
From the Storage page, navigate to the storage component you want to mark.
Click the more icon of the storage component and then select Set read-only.
A message appears and prompts you to confirm your action.
Click Mark read-only.
The storage component is marked as read-only.
Related REST API methods
PATCH /storage_component/update
POST /storage_component/update_state
For information about specific API methods, see the MAPI Reference or, in the Object Storage Management application, click the profile icon and select REST API.
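As a sketch only: the read-only flag can be toggled with the methods above. Which method carries the flag and the exact field name are assumptions here; see the MAPI Reference for the supported request body.
# Minimal sketch (assumptions marked): mark a storage component read-only.
import requests

MAPI_BASE = "https://hcpcs.example.com:9099/mapi/v1"   # assumed base URL and port
TOKEN = "..."                                          # assumed OAuth bearer token

response = requests.post(f"{MAPI_BASE}/storage_component/update_state",
                         json={"id": "0b46a6f3-...-uuid",   # hypothetical component ID
                               "readOnly": True},           # hypothetical field name
                         headers={"Authorization": f"Bearer {TOKEN}"},
                         verify=False)
response.raise_for_status()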
Marking a storage component as read-write
You can use the Object Storage Management application or API methods to mark a storage component as read-write.
This makes the storage component available for writing new objects.
You can only mark a storage component as read-write if it is marked read-only and in the state ACTIVE.
Object Storage Management application instructions
Procedure
From the Storage page, navigate to the storage component you want to mark.
Click the more icon of the storage component and then select Open for writes.
A message appears and prompts you to confirm your action.
Click Open for writes.
The Read-only flag for the storage component is marked as No.
Related REST API methods
PATCH /storage_component/update
POST /storage_component/update_state
For information about specific API methods, see the MAPI Reference or, in the Object Storage Management application, click the profile icon and select REST API.
Viewing storage components
You can use the Object Storage Management application or an API method to view information about the storage components defined in the system.
For each storage component, you can get information about its name, type, region, and current state.
The storage component types are:
- AMAZON_S3: An Amazon Web Services S3 compatible node
- HCP_S3: A Hitachi Content Platform node
- HCPS_S3: An HCP S Series node
- GENERIC_S3: An S3 compatible node
The possible storage component states are:
- Active: Available to serve requests
- Inactive: Not available to serve requests (access is administratively paused)
- Inaccessible: Available to serve requests, but HCP for cloud scale is having access issues (for example, network, authentication, or certificate issues)
- Unverified: Not available to serve requests (unreachable by specified parameters, misconfigured, or awaiting administrative activation)
In addition, the storage component Read-only flag can be on or off.
Object Storage Management application instructions
The storage components defined in the HCP for cloud scale system are listed in the Storage components section of the Storage page.
Related REST API methods
POST /storage_component/list
For information about specific API methods, see the MAPI Reference or, in the Object Storage Management application, click the profile icon and select REST API.
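For example, a short Python sketch can list the components and print their name, type, and state; the base URL, token handling, and response field names are assumptions.
# Minimal sketch (assumptions marked): list storage components.
import requests

MAPI_BASE = "https://hcpcs.example.com:9099/mapi/v1"   # assumed base URL and port
TOKEN = "..."                                          # assumed OAuth bearer token

response = requests.post(f"{MAPI_BASE}/storage_component/list",
                         json={},
                         headers={"Authorization": f"Bearer {TOKEN}"},
                         verify=False)
response.raise_for_status()
for component in response.json():                      # assumed to return a JSON array
    print(component.get("name"), component.get("storageType"), component.get("state"))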
Displaying storage component analytics
The Storage page displays counts of active, inactive, and unverified storage components, the total count of active objects, and information about system-wide total, used, and estimated available storage capacity. The page also displays information about individual storage components and their current capacity.
The Storage page displays several areas of information.
The top area of the page displays the following rolled-up information for HCP S Series Node storage components configured in the system.

- Total capacity - the total number of bytes available, as well as the change over the past week
- Used capacity - the total number of bytes used, as well as the change over the past week
- Estimated available capacity - the total number of bytes unused, as well as the change over the past week
- Total objects - the total count of objects stored across all storage components, as well as the change over the past week
- Active storage - the number of storage components that can receive objects
- Inactive storage - the number of storage components that cannot receive objects
- Unverified storage - the number of storage components whose state can't be determined
The calculation of used capacity includes:
- HCP S Series Node storage components configured for capacity monitoring
- Storage components set to read-only status
- Storage components that are inactive
Metrics for capacity usage are for Metadata Gateway instances only, so adding used capacity to estimated available capacity will not equal the total capacity on the system. Also, multiple services are running on a system instance, all sharing the disk capacity. Therefore, the estimated available capacity for the Metadata Gateway service on one node can be consumed by a different service running on the same node.
The calculation of estimated available system capacity does not include:
- HCP S Series Node storage components not configured for capacity monitoring
- Storage components other than HCP S Series Node storage components
- Storage components set to read-only status
- Storage components that are inactive
The central area of the page displays information for each HCP S Series Node storage component configured for capacity monitoring in the system.

- User-defined name.
- Type (HCP S Series Node, displayed as HCPS_S3).
- AWS region.
- State:
- Active - Available to serve requests
- Inactive - Not available to serve requests (access is administratively paused)
- Unverified - Not available to serve requests (unreachable by specified parameters, or awaiting administrative activation)
- Whether or not the storage component is set to read-only status.
- Disk capacity: A graphical display of used capacity as a percentage of total capacity. You can configure a warning threshold, which is displayed as a red line. If the used capacity is below the threshold the bar is displayed in blue, and if the used capacity exceeds the threshold the bar is displayed in red. If no capacity is used the bar is displayed in gray.
- Total capacity (used plus free).
- Available (estimated) capacity.
Capacity alerts are generated by the MAPI Gateway service. Use the System Management application to configure the capacity alert threshold for individual storage components or the overall system.
A more button, to the right of each storage component, opens a menu of actions that you can perform on that storage component:
- Edit - edit the configuration of the storage component
- Set inactive | Set active - change the state of the storage component between active and inactive
- Set read-only | Set read-write - change the status of the storage component between read-only and read-write
The bottom area of the page displays a graph over time of the count of active objects stored in the system. The maximum time period is the previous week.

Displaying counts of storage components
You can use the Object Storage Management application or an API method to display counts of storage components in the system.
The page displays the following rolled-up information for HCP S Series Node storage components configured in the system:
- Active storage - the number of storage components that can receive objects
- Inactive storage - the number of storage components that cannot receive objects
- Unverified storage - the number of storage components that are misconfigured or whose state can't be determined
Object Storage Management application instructions
To display storage counts, select Storage.
The infographic displays the count of active, inactive, and unverified storage components.
Related REST API methods
POST /storage_component/list
For information about specific API methods, see the MAPI Reference or, in the Object Storage Management application, click the profile icon and select REST API.
Metrics
HCP for cloud scale uses a third-party, open-source software tool, running over HTTPS as a service, to provide storage component metrics through a browser.
The Metrics service collects metrics for these HCP for cloud scale services:
- S3 Gateway
- MAPI Gateway
- Policy Engine
- Metadata Coordination
- Metadata Gateway
By default the Metrics service collects all storage component metrics and you cannot disable collection. By default, the Metrics service collects data every ten seconds (the Scrape Interval) and retains data for 15 days (the Database Retention); you can configure these values in the service by using the System Management application.
Displaying the active object count
The Object Storage Management application displays a count of active objects stored in the system.
Object Storage Management application instructions
To display the Active objects report, select Storage. The Storage page opens.
The page displays a line graph showing the total number of active objects in the system over time. The maximum time period is one week.
Displaying metrics
You can use the metrics service to display or graph metrics, or use the service API to obtain metrics.
Object Storage Management application instructions
You can display and graph metrics using the metrics GUI.
To display metrics, click the app switcher menu and then select Prometheus. The metrics tool opens in a separate browser window.
The metrics tool is a third-party, open-source package. For information about using the metrics tool, see the documentation provided with the tool.
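You can also query the Metrics service programmatically through the standard Prometheus HTTP API rather than the GUI. The sketch below is an example only; the host and port of the Metrics service in your deployment are assumptions.
# Minimal sketch (assumptions marked): run an instant query against the Prometheus HTTP API.
import requests

METRICS = "https://hcpcs.example.com:9090"             # assumed address of the Metrics service

response = requests.get(f"{METRICS}/api/v1/query",
                        params={"query": "up"},        # 1 = instance reachable, 0 = scrape failed
                        verify=False)
response.raise_for_status()
for result in response.json()["data"]["result"]:
    print(result["metric"].get("job"), result["value"][1])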
Available metrics
Metrics provide information about the operation of a service. Metrics are collected while the service is active. If a service restarts, its metrics are reset.
The metrics described here fall into these categories:
- Counter - A numeric value that can only increase or be reset to zero. A counter tracks the number of times a specific event has occurred. An example is the number of S3 servlet operations.
- Gauge - A counter that can increase or decrease. An example of a gauge is the number of active connections.
- Histogram - A set of grouped samples. A histogram approximates the distribution of numerical data.
Note: If a metric is computed as an average (for example, http_s3_servlet_requests_latency_seconds) but doesn't have at least two data points, the value is reported as NaN.
The following metrics are available from all services.
Metric | Description |
http_healthcheck_requests_total | The total number of requests made to the health verification API. |
http_monitoring_requests_total | The total number of requests made to the monitoring API. |
scheduled_policy_work_items | The total number of work items processed by each scheduled policy. |
scrape_duration_seconds | The duration in seconds of the scrape (collection interval). |
scrape_samples_post_metric_relabeling | The count of samples remaining after metric relabeling was applied. |
scrape_samples_scraped | The count of samples the target exposed. |
up | 1 if the instance is healthy (reachable) or 0 if collection of metrics from the instance failed. |
The following metrics are available from the Data Lifecycle service. Metrics are recorded for the following policies. Not every metric applies to every policy.
- CHARGEBACK_POPULATION
- CLIENT_OBJECT_POLICY
- DELETE_BACKEND_OBJECTS
- EXPIRE_FAILED_WRITE
- INCOMPLETE_MPU_EXPIRATION
- TOMBSTONE_DELETION
- VERSION_EXPIRATION
Metric | Description |
lifecycle_policy_accept_latency_seconds | The lifecycle policy acceptance processing latency in seconds. |
lifecycle_policy_completed | The total number of lifecycle policies completed. |
lifecycle_policy_concurrency | The total number of threads currently running for the policy. |
lifecycle_policy_conflicts | The total number of lifecycle policy conflicts. |
lifecycle_policy_deleted_backend_objects_count | The total number of objects deleted from backend storage by the policy DELETE_BACKEND_OBJECTS. |
lifecycle_policy_errors | The total number of errors that occurred while executing lifecycle policy actions, grouped by error category. |
lifecycle_policy_examine_latency_seconds | The lifecycle policy examination processing latency in seconds. |
lifecycle_policy_expiration_completed_count | The total number of objects completely processed by the expiration policies (DELETE_MARKER and PERMANENT_DELETE). |
lifecycle_policy_list_latency_seconds | The lifecycle policy listing latency in seconds. |
lifecycle_policy_rekey_initiated_count | The number of times a rekey operation has been initiated. |
lifecycle_policy_rekeyed_objects_count | The total number of objects rekeyed. |
lifecycle_policy_splits | The total number of lifecycle policy splits. |
lifecycle_policy_started | The total number of lifecycle policies started. |
lifecycle_policy_submitted | The total number of lifecycle policies submitted. |
s3_operation_count | The count of S3 operations (READ, WRITE, DELETE, and HEAD) per storage component. |
s3_operation_error_count | The count of failed S3 operations (READ, WRITE, DELETE, and HEAD) per storage component. |
s3_operation_latency_seconds | The latency of storage component operations (READ, WRITE, DELETE, and HEAD) in seconds. |
The following metrics are available from the Key Management Server service. These metrics are collected every five minutes.
Metric | Description |
kmip_servers_offline | The count of KMS servers that are offline. Updated hourly. |
kmip_servers_online | The count of KMS servers that are online. Updated hourly. |
kmip_total_kek_count | The count of key encryption keys stored in the KMS server. This count increments when an HCP S Series Node is added or when a rekey occurs. |
lifecycle_policy_rekey_initiated_count | The count of how many times rekeying has been initiated through either the MAPI method or the Object Storage Management application. |
lifecycle_policy_rekeyed_objects_count | The total count of data encryption keys that are re-wrapped with key encryption keys. |
The following metrics are available from the MAPI Gateway service. These metrics are collected every five minutes.
Metric | Description |
storage_available_capacity_bytes | The number of bytes free on an HCP S Series Node. |
storage_total_capacity_bytes | The number of bytes total, available and used, on an HCP S Series Node. |
storage_total_objects | The number of objects on an HCP S Series Node. |
storage_used_capacity_bytes | The number of bytes used on an HCP S Series Node. |
Each metric is reported with a label, store, identifying it as being either from a specific HCP S Series Node or the aggregate total. You can also retrieve the metrics using this label. For example, to retrieve the used storage capacity of the storage component hcps10.company.com, you would specify:
storage_used_capacity_bytes{store="hcps10.company.com"}
To retrieve the number of objects stored on the HCP S Series Node storage component snode67.company.com, you would specify:
storage_total_objects{instance="hcpcs_cluster:9992",job="MAPI-Gateway",store="snode67.company.com"}
To retrieve the used storage capacity of all available storage components, you would specify:
storage_used_capacity_bytes{store="aggregate"}
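As a usage sketch, the two capacity metrics can be combined into a percent-used figure for the aggregate of all HCP S Series Node storage components. The Metrics service address is an assumption; the metric names and the store="aggregate" label come from the table above.
# Minimal sketch (assumptions marked): compute aggregate percent-used from MAPI Gateway metrics.
import requests

METRICS = "https://hcpcs.example.com:9090"             # assumed address of the Metrics service

def instant_value(query):
    """Run an instant query and return the first sample as a float (0.0 if no data)."""
    r = requests.get(f"{METRICS}/api/v1/query", params={"query": query}, verify=False)
    r.raise_for_status()
    results = r.json()["data"]["result"]
    return float(results[0]["value"][1]) if results else 0.0

used = instant_value('storage_used_capacity_bytes{store="aggregate"}')
total = instant_value('storage_total_capacity_bytes{store="aggregate"}')
if total:
    print(f"Aggregate S Series capacity used: {100 * used / total:.1f}%")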
The Message Queue service supports a large number of general metrics. Information on these metrics is available at https://github.com/rabbitmq/rabbitmq-prometheus/blob/master/metrics.md.
The following metrics are available from the Metadata Coordination service.
Metric | Description |
mcs_copies_per_partition | Gauge of the number of copies of each metadata partition per key space (to verify protection). Two copies means available but not fault tolerant; three copies means available and fault tolerant. |
mcs_disk_usage_per_instance | Gauge of the total disk usage of each metadata instance. |
mcs_disk_usage_per_partition | Gauge of the disk usage of each metadata partition per key space. |
mcs_failed_moves_per_keyspace | Counter of the number of unsuccessful requests for metadata partition moves per keyspace. |
mcs_failed_splits_per_keyspace | Counter of the number of unsuccessful requests for metadata partition splits per keyspace. |
mcs_moves_per_keyspace | Counter of the number of successful requests for metadata partition moves per keyspace. |
mcs_partitions_per_instance | Gauge of the total number of metadata partitions per metadata instance. This is useful to verify balance and determine when scaling might be necessary. |
mcs_splits_per_keyspace | Counter of the number of successful requests for metadata partition splits per keyspace. |
The following metrics are available from the Metadata Gateway service.
- Client count metrics are an approximation and might not correspond to the actual count.
- Depending on when garbage collection tasks run, the ratio of client objects size to stored objects size might show a discrepancy.
Metric | Description |
async_action_count | The count of actions performed. |
async_action_latency_seconds_bucket | A histogram for the duration, in seconds, of actions on buckets. For actions comprising multiple steps, this is the total of all steps. |
async_action_latency_seconds_count | The count of action latency measurements taken. |
async_action_latency_seconds_sum | The sum of action latency in seconds. |
async_concurrency | A gauge for the number of concurrent actions. |
async_duq_latency_seconds_bucket | A histogram for the duration, in seconds, of operations on the durable update queue. |
async_duq_latency_seconds_count | The count of durable update queue latency measurements. |
async_getwork_database_count | The number of driver work checks accessing the database. |
async_getwork_optimized_count | The number of driver work checks avoiding the database. |
async_duq_latency_seconds_sum | The sum of durable update queue operation latency in seconds. |
metadata_available_capacity_bytes | The free bytes per instance (node) for the Metadata Gateway service. The label store is either the instance or aggregate. Note: Because multiple service instances can run on a node, all consuming the same shared disk space, the value returned by this metric might be more than the actual capacity available. |
metadata_clientobject_active_count | The count of client objects in metadata that are in the ACTIVE state. |
metadata_clientobject_active_encrypted_count | The count of encrypted client objects in metadata that are in the ACTIVE state. |
metadata_clientobject_active_unencrypted_count | The count of unencrypted client objects in metadata that are in the ACTIVE state. |
metadata_clientobject_and_part_active_space | The space occupied by client objects and parts in metadata that are in the ACTIVE state. |
metadata_clientobject_part_active_count | The count of client object parts in metadata that are in the ACTIVE state. |
metadata_storedObject_active_space | The space occupied by stored objects on the back-end storage components. |
metadata_used_capacity_bytes | The used bytes per instance (node) for the Metadata Gateway service. The label store gives the domain name of the instance. Note: Because multiple service instances can run on a node, all consuming the same shared disk space, combining this value with the value of metadata_available_capacity_bytes might not reflect the actual capacity of the node. |
update_queue_inprogress | The count of update queue entries in progress. |
update_queue_size | The size of the update queue. |
The following metrics are available from the Mirror In service.
Metric | Description |
mirror_failed_total | The count of failed mirror operations, both whole objects and multipart uploads. The mirror (synchronization) type is IN. |
mirror_mpu_bytes | The number of bytes synchronized as part of multi-part uploads (using MultiPartUpload). This metric is updated as uploads proceed. The mirror (synchronization) type is IN. |
mirror_mpu_errors | The count of multi-part upload synchronization errors, grouped by client type and error category. The mirror (synchronization) type is IN. |
mirror_mpu_objects | The count of objects synchronized using multi-part uploads (using MultiPartUpload). The mirror (synchronization) type is IN. |
mirror_skipped | The count of skipped mirror operations, on both whole objects and multi-part uploads. The mirror (synchronization) type is IN. |
mirror_success_total | The count of objects successfully synchronized. The mirror (synchronization) type is IN. |
mirror_whole_bytes_total | The number of bytes synchronized as whole objects (using PutObject). The mirror (synchronization) type is IN. |
mirror_whole_errors_total | The count of non-multipart synchronization errors (using PutObject), grouped by client type and error category. The mirror (synchronization) type is IN. |
mirror_whole_objects_total | The count of objects synchronized as whole objects (using PutObject). The mirror (synchronization) type is IN. |
s3_operation_count_total | The count of S3 operations (READ, WRITE, DELETE, and HEAD) per storage component. The mirror (synchronization) type is IN. |
sync_from_bytes_copied | The number of bytes synchronized by full copy from external storage (sync-from) by this instance. This metric is updated as synchronization proceeds. |
sync_from_bytes_putcopied | The number of bytes synchronized by put-copy from external storage (sync-from) by this instance. This metric is updated as synchronization proceeds. |
sync_from_object_count_failed | The count of objects that failed to synchronize from external storage (sync-from) by this instance, grouped by class of error. The error classes are AUTHENTICATION, METADATA, OPERATION_ABORTED, RESOURCE_NOT_FOUND, S3, SERVICE_UNAVAILABLE, and UNKNOWN. |
sync_from_object_count_succeeded | The count of objects synchronized from external storage (sync-from) by this instance. |
sync_from_object_size_total | Total size of object data synchronized from external storage (sync-from) by this instance. This metric is updated as synchronization proceeds. |
sync_from_objects | Total number of objects synchronized from external storage (sync-from) by this instance. This metric is updated as synchronization proceeds. |
The following metrics are available from the Mirror Out service.
Metric | Description |
mirror_failed_total | The count of failed mirror operations, both whole objects and multi-part uploads. The mirror (synchronization) type is OUT. |
mirror_mpu_bytes | The number of bytes synchronized as part of multi-part uploads (using MultiPartUpload). This metric is updated as uploads proceed. The mirror (synchronization) type is OUT. |
mirror_mpu_errors | The count of multi-part upload synchronization errors, grouped by client type and error category. The mirror (synchronization) type is OUT. |
mirror_mpu_objects | The count of objects synchronized using multi-part uploads (using MultiPartUpload). The mirror (synchronization) type is OUT. |
mirror_skipped | The count of skipped mirror operations, on both whole objects and multi-part uploads. The mirror (synchronization) type is OUT. |
mirror_success_total | The count of objects successfully synchronized. The mirror (synchronization) type is OUT. |
mirror_whole_bytes_total | The number of bytes synchronized as whole objects (using PutObject). The mirror (synchronization) type is OUT. |
mirror_whole_errors_total | The count of non-multipart synchronization errors (using PutObject), grouped by client type and error category. The mirror (synchronization) type is OUT. |
mirror_whole_objects_total | The count of objects synchronized as whole objects (using PutObject). The mirror (synchronization) type is OUT. |
s3_operation_count_total | The count of S3 operations (READ, WRITE, DELETE, and HEAD) per storage component. The mirror (synchronization) type is OUT. |
sync_to_bytes_copied | The number of bytes synchronized by full copy to external storage (sync-to) by this instance. This metric is updated as synchronization proceeds. |
sync_to_bytes_putcopied | The number of bytes synchronized by put-copy (previously copied) to external storage (sync-to) by this instance. |
sync_to_objects | The count of objects synchronized to external storage (sync-to) by this instance. |
sync_to_object_size_total | The total size of object data synchronized to external storage (sync-to) by this instance. This metric is updated as synchronization proceeds. |
The following metrics are available from the Policy Engine service.
Metric | Description |
confirm_latency_seconds_created | The message queue publish confirmation latency in seconds. |
duq_query_latency | The time to get a response from a get_duq query. |
duq_query_latency_count | The number of times the durable update queue (DUQ) has been queried (for determining the average). |
duq_query_latency_sum | The aggregate sum of latencies for DUQ queries (for determining the average). |
mq_all_bucket_lookup_latency_seconds | Average latency from a lookup of all buckets. |
mq_all_mirror_count_total | The count of messages dispatched to mirror exchange. |
mq_all_mirror_drop_count_total | The count of messages filtered from mirror exchange. |
mq_all_notification_count_total | The count of messages dispatched to notification exchange. |
mq_all_notification_drop_count_total | The count of messages filtered from notification exchange. |
mq_queued_messages | Gauge of the queue depth (number of messages) being processed or waiting to be processed in the product queues. Note: A task represents a range of objects. Each range can have many thousands of objects. |
policy_engine_errors_total | The count of errors per error type per instance. |
policy_engine_operations_total | The count of how many times a policy ran per policy type per instance (similar to http_s3_servlet_operations_total). Operations include both asynchronous and scheduled operations, such as sync_to, sync_from, and sched_storage_component_healthchecks_examined. |
policy_engine_time_total | Total time spent processing requests per instance. This helps measure load balancing between instances of the Policy Engine service. |
sync_from_bytes | The number of bytes synchronized from external storage (sync-from) by this instance. This metric is updated as synchronization proceeds. |
sync_from_bytes_copied | The number of bytes synchronized by full copy from external storage (sync-from) by this instance. This metric is updated as synchronization proceeds. |
sync_from_bytes_putcopied | The number of bytes synchronized by put-copy from external storage (sync-from) by this instance. This metric is updated as synchronization proceeds. |
sync_from_objects | Total number of objects synchronized from external storage (sync-from) by this instance. This metric is updated as synchronization proceeds. |
sync_to_bytes | The number of bytes synchronized to external storage (sync-to) by this instance. This metric is updated as synchronization proceeds. |
sync_to_bytes_copied | The number of bytes synchronized by full copy to external storage (sync-to) by this instance. This metric is updated as synchronization proceeds. |
sync_to_bytes_putcopied | The number of bytes synchronized by put-copy (previously copied) to external storage (sync-to) by this instance. |
sync_to_object_count_failed | The count of objects that failed to synchronize to external storage (sync-to) by this instance, grouped by class of error. The error classes are AUTHENTICATION, METADATA, OPERATION_ABORTED, RESOURCE_NOT_FOUND, S3, SERVICE_UNAVAILABLE, and UNKNOWN. |
sync_to_object_count_succeeded | The count of objects synchronized to external storage (sync-to) by this instance. |
sync_to_objects | The count of objects synchronized to external storage (sync-to) by this instance. |
RabbitMQ is a third-party application that is used by HCP for cloud scale to coordinate tasks submitted to the Policy Engine service for asynchronous processing. You can log in to the RabbitMQ interface to observe queue health. The following metrics are available from RabbitMQ:
- The number of messages in the queue
- The number of confirmed messages
- The number of unconfirmed (unacknowledged) messages
- The number of consumed (delivered and acknowledged) messages
- The number of unroutable returned messages
- The number of nodes in the RabbitMQ cluster
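If you prefer to script this check, the standard RabbitMQ management HTTP API exposes the same queue information as the web interface. The host, port, and credentials below are assumptions for your deployment.
# Minimal sketch (assumptions marked): read queue depths from the RabbitMQ management API.
import requests

RABBITMQ = "https://hcpcs.example.com:15672"           # assumed management endpoint
AUTH = ("monitor", "password")                         # assumed credentials

response = requests.get(f"{RABBITMQ}/api/queues", auth=AUTH, verify=False)
response.raise_for_status()
for queue in response.json():
    print(queue["name"],
          "ready:", queue.get("messages_ready", 0),
          "unacknowledged:", queue.get("messages_unacknowledged", 0))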
The following metrics are available from the S3 Gateway service.
Metric | Description |
http_s3_monitoring_requests_created | The timestamp when the counter http_s3_monitoring_requests_total was created. |
http_s3_monitoring_requests_total | The total count of S3 monitoring requests. |
http_s3_servlet_errors_total | The total number of errors returned by the s3 servlet, grouped by error. |
http_s3_servlet_get_object_response_bytes_created | The timestamp when the counter http_s3_servlet_get_object_response_bytes_total was created. |
http_s3_servlet_get_object_response_bytes_per_bucket_created | The timestamp when the counter http_s3_servlet_get_object_response_bytes_per_bucket_total was created. |
http_s3_servlet_get_object_response_bytes_per_bucket_total | The total number of bytes in the body of S3 GET object responses per bucket. |
http_s3_servlet_get_object_response_bytes_total | The total number of bytes in the body of S3 GET object responses. |
http_s3_servlet_ingest_object_bytes_per_bucket_created | The timestamp when the counter http_s3_servlet_ingest_object_bytes_per_bucket_total was created. |
http_s3_servlet_ingest_object_bytes_per_bucket_total | The total number of bytes of objects ingested for the specified bucket. |
http_s3_servlet_operations_created | The timestamp when the counter http_s3_servlet_operations_total was created. |
http_s3_servlet_operations_total | The total number of S3 operations made to the s3 servlet for each method, grouped by operation. |
http_s3_servlet_post_object_bytes_created | The timestamp when the counter http_s3_servlet_post_object_bytes_total was created. |
http_s3_servlet_post_object_bytes_total | The total number of bytes of objects posted to S3. |
http_s3_servlet_put_copied_bytes_total | The total number of bytes of objects PUT copied (previously copied) to S3. |
http_s3_servlet_put_object_bytes_created | The timestamp when the counter http_s3_servlet_put_object_bytes_total was created. |
http_s3_servlet_put_object_bytes_total | The total number of bytes of objects PUT to S3. |
http_s3_servlet_put_object_part_bytes_total | The total number of bytes of PUT part operations to S3. |
http_s3_servlet_requests_histogram_latency_seconds | The latency in seconds as measured by a histogram timer, grouped by operation. |
http_s3_servlet_requests_histogram_latency_seconds_bucket | The latency in seconds as measured by a histogram timer, grouped by bucket. |
http_s3_servlet_requests_histogram_latency_seconds_count | The count of s3 servlet request observations; used with sum to determine average. |
http_s3_servlet_requests_histogram_latency_seconds_sum | Sum of s3 servlet request latency in seconds; used with count to determine average. |
http_s3_servlet_requests_latency_seconds | The latency in seconds as measured by a summary timer, grouped by operation. |
http_s3_servlet_requests_latency_seconds:hour_average | The latency in seconds over the last hour as measured by a summary timer. |
http_s3_servlet_requests_latency_seconds_count | The count of request latency measurements; used with the sum to determine the average. |
http_s3_servlet_requests_latency_seconds_sum | The sum of request latency in seconds. |
http_s3_servlet_requests_per_bucket_created | The timestamp when the counter http_s3_servlet_requests_per_bucket_total was created. |
http_s3_servlet_requests_per_bucket_total | The total count of PUT, GET, or DELETE requests made to the specified bucket. |
http_s3_servlet_requests_created | The timestamp when the counter http_s3_servlet_requests_total was created. |
http_s3_servlet_requests_total | The total number of requests made to the s3 servlet, grouped by method. |
http_s3_servlet_unimplemented_api_request_created | The timestamp when the counter http_s3_servlet_unimplemented_api_request_total was created. |
http_s3_servlet_unimplemented_api_request_total | The total number of requests made for unimplemented S3 methods. |
http_s3_servlet_unimplemented_bucket_api_request_total | The total number of requests made for unimplemented S3 methods per bucket, grouped by API. |
http_s3_servlet_unimplemented_object_api_request_total | The total number of requests made for unimplemented S3 methods per object, grouped by API. |
http_s3_servlet_unimplemented_service_api_request_total | The total number of requests made for unimplemented S3 methods per service, grouped by API. |
http_s3_servlet_unknown_api_requests_total | The total number of requests made for unknown S3 methods, grouped by API. |
s3_operation_error_count | The count of failed S3 operations (READ, WRITE, DELETE, and HEAD) per storage component. |
s3_operation_latency_seconds | The latency of storage component operations (READ, WRITE, DELETE, and HEAD) in seconds. |
s3select_total_bytes_scanned | The number of bytes scanned in the object. |
s3select_total_bytes_processed | The number of bytes processed by the request. |
s3select_total_bytes_returned | The number of bytes returned from the request. |
s3select_input_type | The count of requests by file type. |
s3select_output_type | The count of responses by file type. |
The following metrics are available from the S3 Notification service.
Metric | Description |
mq_publish_latency_seconds | The message queue publishing latency in seconds. |
notification_events_considered_total | The count of events considered that could lead to notifications. |
notification_events_notification_attempted_total | The count of events that had at least one notification message attempted. |
notification_message_failures_total | The count of notification messages that were attempted but failed. |
notification_message_parsing_failures_total | The count of candidate object events that could not be parsed. |
notification_messages_sent_total | The count of notification messages that were successfully sent. |
notification_message_target_generation_failures_total | The count of candidate objects for which a list of notification targets could not be generated. |
Examples of metric expressions
By using metrics in formulas, you can generate useful information about the behavior and performance of the HCP for cloud scale system.
The following expression graphs the total capacity of the storage component store54.company.com over time. Information is returned for HCP S Series Node storage components only. The output includes the label store, which identifies the storage component by domain name. The system collects data every five minutes.
storage_total_capacity_bytes{store="store54.company.com"}
The following expression graphs the used capacity of all HCP S Series Node storage components in the system over time. (This is similar to the information displayed on the Storage page.) Information is returned only if all storage components in the system are HCP S Series nodes. The output includes the label aggregate. The system collects data every five minutes.
storage_used_capacity_bytes{store="aggregate"}
The following expression graphs the count of active objects (metadata_clientobject_active_count) over time. (This is similar to the graph displayed on the Storage page.) You can use this formula to determine the growth in the number of active objects.
sum(metadata_clientobject_active_count)
The metric lifecycle_policy_deleted_backend_objects_count gives the total number of backend objects, including object versions, deleted by the policy DELETE_BACKEND_OBJECTS. You can graph this metric over time to monitor the rate of object deletion. In addition, the following expression graphs the count of deletion activities by the policy.
sum(lifecycle_policy_completed{policy="DELETE_BACKEND_OBJECTS"})
The following expression graphs the size of all update queues. You can use this formula to determine whether the system is keeping up with internal events that are processed asynchronously in response to S3 activity. If this graph increases over time, you might want to increase capacity.
sum(update_queue_size)
The following expression graphs the count of S3 put requests, summed across all nodes, at one-minute intervals. If you remove the specifier {operation="S3PutObjectOperation"}, the expression graphs all S3 requests.
sum(rate(http_s3_servlet_operations_total{operation="S3PutObjectOperation"}[1m]))
The following expression divides the latency of requests (async_duq_latency_seconds_bucket) in seconds by the number of requests (async_duq_latency_seconds_count), for the bucket getWork and requests less than or equal to 10 ms, and graphs it over time. You can use this formula to determine the percentage of requests completed in a given amount of time.
sum(rate(async_duq_latency_seconds_bucket{op="getWork",le="0.01"}[1m]))/ sum(rate(async_duq_latency_seconds_count{op="getWork"}[1m]))
Here is a sample graph of data from a lightly loaded system.
The following expression estimates the quantile for the latency of requests (async_duq_latency_seconds_bucket) in seconds for the bucket getWork. You can use this formula to estimate the percentage of requests completed in a given amount of time.
histogram_quantile(.9, sum(rate(async_duq_latency_seconds_bucket{op="getWork"}[1m])) by (le))
Here is a sample graph of data from a lightly loaded system.
Starting the dashboards
The dashboards are generated from metrics collected by HCP for cloud scale and displayed using a third-party, open-source package. For more information about the interface, go to: https://grafana.com/docs/grafana/latest/dashboards/.
Procedure
There are two ways to start the dashboards:
- Click the app switcher menu and then select Grafana.
- Open a browser window and go to https://cluster_name:3000/login
Enter the following initial credentials:
Username: admin
Password: admin
Keep or change the password. It's best to change the password on first login and retain it securely.
Results
The dashboards open.
Tracing requests and responses
HCP for cloud scale uses an open-source software tool, running over HTTPS as a service, for service tracing through a browser.
The Tracing service supports end-to-end, distributed tracing of S3 requests and responses by HCP for cloud scale services. Tracing helps you monitor performance and troubleshoot possible issues.
Tracing involves three service instances:
- Tracing Query: serves traces
- Tracing Agent: receives spans from tracers
- Tracing Collector: receives spans from the Tracing Agent service using TChannel
Displaying traces
You can display traces using the tracing service GUI.
To begin tracing, click the app switcher menu and then select Jaeger. The tracing tool opens in a separate browser window.
When tracing, you can specify:
- Service to trace
- Operation to trace (all or specific) for each service
- Tags
- Lookback period (by default, over the last hour)
- Minimum duration
- Number of results to display (by default, 20)
The service displays all found traces with a chart giving the time duration for each trace. You can select a trace to display how the trace is served by different services in cascade and the time spent on each service.
For information about the tracing tool, see the documentation provided with the tool.
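The Tracing Query service also exposes an HTTP API, so traces can be retrieved without the GUI. The host and port below are assumptions; the /api/services and /api/traces endpoints appear in the traceable-operations table that follows.
# Minimal sketch (assumptions marked): fetch recent traces from the Tracing Query (Jaeger) API.
import requests

TRACING = "https://hcpcs.example.com:16686"            # assumed address of the Tracing Query service

# List the services that have reported spans.
services = requests.get(f"{TRACING}/api/services", verify=False).json()["data"]
print("traced services:", services)

# Fetch up to 20 traces for one service over the last hour.
response = requests.get(f"{TRACING}/api/traces",
                        params={"service": "client-access-service",
                                "lookback": "1h",
                                "limit": 20},
                        verify=False)
response.raise_for_status()
for trace in response.json()["data"]:
    spans = trace["spans"]
    duration_us = max(s["startTime"] + s["duration"] for s in spans) - min(s["startTime"] for s in spans)
    print(trace["traceID"], f"{duration_us / 1000:.1f} ms", f"{len(spans)} spans")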
Traceable operations
The following operations are traceable.
Component | Operation |
async-policy-engine | Action Pipeline Action: BucketIdToNameMapAction |
Action Pipeline Action: BucketLookupForAsyncPolicyAction | |
Action Pipeline Action: BucketOwnerIdToNameMapAction | |
Action Pipeline Action: BucketUpdateSecondaryAction | |
Action Pipeline Action: ClientObjectDispatchRemoveBackReferencesAction | |
Action Pipeline Action: ClientObjectLookupAction | |
Action Pipeline Action: ClientObjectModifyInProgressListAction | |
Action Pipeline Action: ClientObjectModifyListAction | |
Action Pipeline Action: ClientObjectUpdateSecondaryAction | |
Action Pipeline Action: DequeueAction | |
Action Pipeline Action: MetadataAction | |
BUCKET | |
CLIENT_OBJECT | |
STORED_OBJECT_BACK_REFERENCE | |
balance-engine | BalanceCluster |
BalanceEngineOperation | |
controlApi.ControlApiService | |
RefreshCluster | |
client-access-service | Action Pipeline Action: BucketAuthorizationAction |
Action Pipeline Action: BucketCountLimitAction | |
Action Pipeline Action: BucketCreateAction | |
Action Pipeline Action: BucketRegionValidationAction | |
Action Pipeline Action: BucketUpdateAclAction | |
Action Pipeline Action: ClientObjectInitiateMultipartAction | |
Action Pipeline Action: ClientObjectListInProgressMultipartAction | |
Action Pipeline Action: ClientObjectListVersionsAction | |
Action Pipeline Action: ClientObjectSizeLimitAction | |
Action Pipeline Action: ClientObjectTableLookupAction | |
Action Pipeline Action: ClientObjectUpdateAclAction | |
Action Pipeline Action: CompleteMultipartUploadAction | |
Action Pipeline Action: DataContentAction | |
Action Pipeline Action: DataDeletionAction | |
Action Pipeline Action: NotAnonymousAuthorizationAction | |
Action Pipeline Action: ObjectAuthorizationAction | |
Action Pipeline Action: ObjectDataPlacementAction | |
Action Pipeline Action: ObjectGetCurrentExpirationAction | |
Action Pipeline Action: ObjectGetMultipartAbortDateAction | |
Action Pipeline Action: ObjectGetUndeterminedExpirationAction | |
Action Pipeline Action: ObjectLookupAction | |
Action Pipeline Action: PartDataPlacementAction | |
Action Pipeline Action: PutAclAction | |
Action Pipeline Action: RequestBucketLookupAction | |
Action Pipeline Action: RequestVersionIdValidationAction | |
Action Pipeline Action: UploadIdValidationAction | |
Action Pipeline Action: UserLookupBucketsAction | |
Action Pipeline Action: VersionIdNotEmptyValidationAction | |
expiration-rules-engine | EvaluateOperation |
foundry-auth-client | FoundryAuthorizeOperation |
FoundryValidateOperation | |
jaeger-query | /api/dependencies |
/api/services | |
/api/services/{service}/operations | |
/api/traces | |
mapi-service | GET |
POST | |
metadata-client | BucketService/Create |
BucketService/List | |
BucketService/ListBucketOwnerListing | |
BucketService/LookupBucketNameById | |
BucketService/LookupByName | |
BucketService/UpdateACL | |
ClientObjectService/CloseNew | |
ClientObjectService/ClosePart | |
ClientObjectService/DeleteSpecific | |
ClientObjectService/List | |
ClientObjectService/LookupLatest | |
ClientObjectService/LookupSpecific | |
ClientObjectService/OpenNew | |
ClientObjectService/OpenPart | |
ClientObjectService/setACLOnLatest | |
ClientObjectService/Delete | |
ConfigService/List | |
ConfigService/LookupById | |
ConfigService/Set | |
StoredObjectService/Close | |
StoredObjectService/Delete | |
StoredObjectService/List | |
StoredObjectService/Lookup | |
StoredObjectService/MarkForCleanup | |
StoredObjectService/Open | |
UpdateQueueService/SecondaryEnqueue | |
UserService/LookupById | |
UserService/LookupOrCreate | |
UserService/UpdateAddAuthToken | |
metadata-coordination-service | Status.Service/GetStatus |
metadata-gateway-service | Status.Service/GetStatus |
BucketService/Create | |
BucketService/List | |
BucketService/ListBucketOwnerListing | |
BucketService/LookupBucketNameById | |
BucketService/LookupByName | |
BucketService/UpdateACL | |
ClientObjectService/CloseNew | |
ClientObjectService/ClosePart | |
ClientObjectService/DeleteSpecific | |
ClientObjectService/List | |
ClientObjectService/LookupLatest | |
ClientObjectService/LookupSpecific | |
ClientObjectService/OpenNew | |
ClientObjectService/OpenPart | |
ClientObjectService/setACLOnLatest | |
ConfigService/Delete | |
ConfigService/List | |
ConfigService/LookupById | |
ConfigService/Set | |
StoredObjectService/Close | |
StoredObjectService/Delete | |
StoredObjectService/List | |
StoredObjectService/Lookup | |
StoredObjectService/MarkForCleanup | |
StoredObjectService/Open | |
UpdateQueueService/SecondaryEnqueue | |
UserService/LookupById | |
UserService/LookupOrCreate | |
UserService/UpdateAddAuthToken | |
metadata-policy-client | PolicyService/ExecutePolicy |
metadata-policy-service | ServiceStatus/GetStatus |
PolicyService/ExecutePolicy | |
ScheduledDeleteBackendObjectsJob | |
ScheduledDeleteFailedWritesJob | |
ScheduledExpirationJob | |
ScheduledIncompleteMultipartExpirationJob | |
storage-component-client | InMemoryStorageComponentVerifyOperation |
InMemoryStorageDeleteOperation | |
InMemoryStorageReadOperation | |
InMemoryStorageWriteOperation | |
storage-component-manager | StorageComponentManager Operation: Create |
StorageComponentManager Operation: List | |
StorageComponentManager Operation: Lookup | |
StorageComponentManager Operation: Update | |
tomcat-servlet | S3 Operation |