Services
Services perform functions essential to the health and operation of the Hitachi Content Platform for cloud scale (HCP for cloud scale) system. You manage services through the System Management application.
For example, the S3 Gateway service serves S3 API methods and communicates with storage components, while the Watchdog service ensures that other services remain running.
Services provide cluster management and coordination, metadata coordination and caching, and external gateways.
Internally, services run in Docker containers on the instances of the system. The container orchestration framework supports cloud or on-premises deployment.
HCP for cloud scale supports an adaptive service deployment model that can change based on workload.
Service categories
Services are grouped into these categories, depending on the actions they perform:
- Product services enable HCP for cloud scale functions. For example, the S3 Gateway service serves S3 API methods and communicates with storage components. You can scale, move, and reconfigure product services.
- System services maintain the health and availability of the HCP for cloud scale system. For example, the Watchdog service ensures that other services remain running. You cannot scale, move, or reconfigure system services.
HCP for cloud scale services
This table describes the services that HCP for cloud scale runs. Each service runs within its own Docker container. For each service, the table lists:
- Configuration settings: The settings you can configure for the service.
- RAM needed per instance: The amount of RAM that, by default, the service needs on each instance on which it's deployed. For all services except for System services, this value is also the default Docker value of Container Memory for the service.
- Number of instances: Shows both:
  - The minimum number of instances on which a service must run to function properly.
  - The best number of instances on which a service should run. If your system includes more than the minimum number of instances, you should take advantage of the instances by running services on them.
- Whether the service is persistent (that is, it must run on a specific instance) or supports floating (that is, it can run on any instance).
- Whether the service is scalable or not.
Service name and description | Configuration settings (changes cause the service to redeploy) | RAM needed per instance | Number of instances | Service unit cost | Persistent or floating? | Supports volume configuration? | Single or multiple types? | Scalable?
Product services: These services perform HCP for cloud scale functions. You can move and reconfigure these services.
Cassandra: Decentralized database that can be scaled across large numbers of hardware nodes. | Container Options: Default. Service Options. Advanced Options: Compaction Frequency: How often the database is compacted. The options are Weekly (default) and Daily. Caution: Changing this setting can negatively affect the service. Use with caution. | 2.4 GB | Minimum: 3, Best: All | 10 | Persistent | No | Single | Yes
Chronos: Job scheduling. | Container Options: Default. Service Options | 712 MB | Minimum: 1, Best: 1 | 1 | Floating | Yes | Single | No
Elasticsearch: Data indexing and search platform. | Container Options: Default. Service Options | 2 GB | Minimum: 3, Best: All | 25 | Persistent | Yes | Single | Yes
Kafka: Stream processing platform for handling real-time data streams. | Container Options: Default. Service Options | 2 GB | Minimum: 3, Best: All | 5 | Persistent | Yes | Single | Yes
Key Management Server: Manages storage component encryption keys. | Container Options: Default | 2 GB | Minimum: 1, Best: 2 or more | 10 | Floating | Yes | Single | Yes
Logstash: Collection engine for event data. Can perform transformations on the data it collects and then send that data to a number of outputs. | Container Options: Default. Service Options | 700 MB | Minimum: 1, Best: 1 | 10 | Floating | No | Single | No
MAPI Gateway: Serves MAPI methods. | Container Options: Default. Service Options | 768 MB | Minimum: 1, Best: All | 5 | Floating | No | Single | Yes
Message Queue: Provides system queueing services. | Container Options: Default. Service Options: None. | 2048 MB | Minimum: 3, Best: 3 | 10 | Floating | No | Single | No
Metadata Cache: Cache for HCP for cloud scale metadata. | Container Options: Default. Service Options | 1024 MB | Minimum: 1, Best: 1 | 10 | Persistent | No | Single | No
Metadata Coordination: Coordinates Metadata Gateway service instances and coordinates scaling and balancing. | Container Options: Default. Service Options | 768 MB | Minimum: 1, Best: 1 | 5 | Floating | No | Single | No
Metadata Gateway: Stores and protects metadata and serves it to other services. | Container Options: Default. Service Options | 768 MB | Minimum: 3, Best: All | 50 | Persistent | No | Single | Yes
Metrics | Container Options: Default. Service Options | 768 MB | Minimum: 1, Best: 1 | 10 | Persistent | No | Single | Yes
Policy Engine: Executes system policy operations and asynchronous metadata updates. | Container Options: Default. Service Options | 768 MB | Minimum: 1, Best: All | 25 | Floating | No | Single | Yes
S3 Gateway: Serves S3 API methods and communicates with storage components. | Container Options: Default. Service Options. HTTP Options. HTTPS Options | 768 MB | Minimum: 1, Best: All | 25 | Floating | No | Single | Yes
Tracing Agent: Supports end-to-end distributed tracing for S3 API calls and MAPI calls. | Container Options: Default. Service Options | 768 MB | Minimum: All | 1 | Floating | No | Single | Yes
Tracing Collector: Supports end-to-end distributed tracing for S3 API calls and MAPI calls. | Container Options: Default. Service Options | 768 MB | Minimum: 1, Best: All | 10 | Floating | No | Single | Yes
Tracing Query: Supports end-to-end distributed tracing for S3 API calls and MAPI calls. | Container Options: Default. Service Options | 768 MB | Minimum: 1, Best: All | 5 | Floating | No | Single | Yes
System services: These services manage system resources and ensure that the HCP for cloud scale system remains available and accessible. These services cannot be moved or reconfigured.
Admin App: The system management application. | Service Options | N/A | N/A | | Persistent | Yes | Single | No
Cluster Coordination: Manages hardware resource allocation. | N/A | N/A | N/A | | Persistent | No | Single | No
Cluster Worker: Receives and performs work from other services. | N/A | N/A | N/A | 5 | Persistent | Yes | Single | No
Network Proxy: Network request load balancer. | Security Protocol: Select which Transport Layer Security (TLS) versions to use. SSL Ciphers: To use your own cipher suite, type it here. Custom Global Configuration: Select Enable Advanced Global Configuration to enable adding your own parameters to the HAProxy "global" section. Custom Defaults Configuration: Select Enable Defaults Configuration to enable adding your own parameters to the HAProxy "defaults" section. | N/A | N/A | 1 | Persistent | Yes | Single | No
Sentinel: Runs internal system processes and monitors the health of the other services. | Service Options | N/A | N/A | | Persistent | Yes | Single | No
Service Deployment: Handles deployment of high-level services (that is, the services that you can configure). | N/A | N/A | N/A | | Persistent | Yes | Single | No
Synchronization: Coordinates service configuration settings and other information across instances. | Service Options | N/A | N/A | | Persistent | Yes | Single | No
Watchdog: Monitors the other System services and restarts them if necessary. Also responsible for initial system startup. | Service Options | N/A | N/A | 5 | Persistent | Yes | Single | No
Viewing services
You can use the Admin App, the CLI, or the REST API to view the status of all services in the system.
Viewing all services
Procedure
To view the status of all services, in the Admin App, click Services.
For each service, the page shows:
- The service name
- The service state:
  - Healthy: The service is running normally.
  - Unconfigured: The service has yet to be configured and deployed.
  - Deploying: The system is currently starting or restarting the service. This can happen when:
    - You move the service to run on a completely different set of instances.
    - You repair a service.
  - Balancing: The service is running normally, but performing background maintenance.
  - Under-protected: In a multi-instance system, one or more of the instances on which a service is configured to run are offline.
  - Failed: The service is not running or the system cannot communicate with the service.
- CPU Usage: The current percentage CPU usage for the service across all instances on which it's running.
- Memory: The current RAM usage for the service across all instances on which it's running.
- Disk Used: The current total amount of disk space that the service is using across all instances on which it's running.
Viewing individual service status
Procedure
To view the detailed status for an individual service, select the service in the Services window.
In addition to status information, the window shows:
- Instances: A list of all instances on which the service is running.
- Volumes: To view a list of volumes used by the service, click the row for an instance in the Instances section.
- Network: [Internal|External]: Which network type this service uses to receive communications.
  This section also displays a list of the ports that the service uses.
- Configuration settings: The settings you can configure for the service.
- Service Units: The total number of service units currently being spent to run this service. This value is equal to the service's service unit cost times the number of instances on which the service is running.
- Service unit cost: The number of service units required to run the service on one instance.
- Service Instance Types: For services that have multiple types, the types that are currently running.
- Instance Pool: For floating services, the instances that this service is eligible to run on.
- Events: A list of all system events for the service.
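The Service Units value in the list above is the service unit cost multiplied by the number of instances running the service. A minimal sketch of that arithmetic follows. The unit cost of 50 is the Metadata Gateway value from the services table; the instance count of 4 and the function name are illustrative assumptions.

```python
def service_units(unit_cost: int, running_instances: int) -> int:
    """Total service units = service unit cost x instances running the service."""
    if unit_cost < 0 or running_instances < 0:
        raise ValueError("unit cost and instance count must be non-negative")
    return unit_cost * running_instances


# Metadata Gateway has a unit cost of 50; 4 instances is an assumed deployment.
print(service_units(50, 4))  # 200
```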
Related CLI commands
getService
listServices
Related REST API methods
GET /services
GET /services/{id}
You can get help on specific REST API methods for the Admin App at REST API - Admin.
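As a rough illustration of calling the two REST methods listed above, the following sketch builds the requests with the Python standard library without sending them. The base URL, the bearer-token authentication scheme, and the function name are assumptions made for this example and are not taken from the product documentation; check the Admin App REST API help for the actual endpoint prefix and authentication.

```python
import urllib.parse
import urllib.request
from typing import Optional


def build_services_request(base_url: str, token: str,
                           service_id: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) GET /services or GET /services/{id}."""
    path = "/services"
    if service_id is not None:
        # Percent-encode the ID so it is safe to place in the URL path.
        path += "/" + urllib.parse.quote(service_id, safe="")
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        method="GET",
        # Bearer-token authentication is an assumption for this sketch.
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )


# Hypothetical base URL and token, for illustration only.
req = build_services_request("https://admin.example.com/api", "TOKEN")
print(req.get_method(), req.full_url)  # GET https://admin.example.com/api/services
```

To send the request, pass the returned object to `urllib.request.urlopen` in an environment that can reach the system.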
Listing service ports
You can list service port information for ports available for customer use.
POST /public/discovery/get_service_port
For information about specific API methods, see the MAPI Reference or, in the Object Storage Management application, click the profile icon and select REST API.
Managing services
This section describes how you can reconfigure, restart, and otherwise manage the services running on your system.
Moving and scaling services
You can change a service to run on:
- Additional instances (for example, to improve service performance and availability)
- Fewer instances (for example, to free up resources on an instance for running other services)
- A different set of instances (for example, to retire the piece of hardware on which an instance is installed)
For floating services, instead of specifying the specific instances on which the service runs, you can specify a pool of eligible instances, any of which can run the service.
When moving or scaling a service that has multiple types, you can simultaneously configure separate rebalancing for each type.
When you move or scale services, keep these guidelines and constraints in mind:
- Avoid running multiple services with high service unit costs together on the same instance.
- On master instances, avoid running any services besides those classified as System services.
- To use your instances evenly, try to deploy a comparable number of service units on each instance.
- You cannot remove a service from an instance if doing so would cause or risk causing data loss.
- Service relocations can take a long time to complete and can impact system performance while they are running.
- Instance needs vary from service to service. Each service defines the minimum and maximum number of instances on which it can run.
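The guidance above on spreading service units evenly and keeping high-cost services apart can be checked with a small planning script. The sketch below is illustrative only: the unit costs come from the services table, while the instance names, the placement, and the threshold of 25 for a "high" unit cost are assumptions.

```python
from typing import Dict, List

# Unit costs from the services table.
UNIT_COST = {
    "Metadata Gateway": 50, "S3 Gateway": 25, "Policy Engine": 25,
    "Elasticsearch": 25, "Kafka": 5, "MAPI Gateway": 5,
}
HIGH_COST = 25  # assumed threshold for a "high" service unit cost


def units_per_instance(placement: Dict[str, List[str]]) -> Dict[str, int]:
    """Sum the service unit costs of the services placed on each instance."""
    return {inst: sum(UNIT_COST[s] for s in services)
            for inst, services in placement.items()}


def crowded_instances(placement: Dict[str, List[str]]) -> List[str]:
    """Return instances that run more than one high-cost service."""
    return sorted(
        inst for inst, services in placement.items()
        if sum(1 for s in services if UNIT_COST[s] >= HIGH_COST) > 1
    )


# Hypothetical placement of services on three instances.
placement = {
    "node-1": ["Metadata Gateway", "S3 Gateway"],
    "node-2": ["Policy Engine", "Kafka"],
    "node-3": ["Elasticsearch", "MAPI Gateway"],
}
print(units_per_instance(placement))  # {'node-1': 75, 'node-2': 30, 'node-3': 30}
print(crowded_instances(placement))   # ['node-1']
```

In this example, node-1 carries 75 service units and two high-cost services, so moving the S3 Gateway to another instance would give a more even spread.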
Relocating services
Procedure
Select Services.
The Services page opens, displaying the services and system services.
Click the service that you want to scale or move.
Configuration information for the service is displayed.
Click Scale, and if the service has more than one type, select the instance type that you want to scale.
The next step depends on whether the service is floating or persistent (non-floating).
If the service is a floating service, you are presented with options for configuring an instance pool:
In the field Service Instances, specify the number of instances on which the service should be running at any time.
Configure the instance pool:
- For the service to run on any instance in the system, select All Available Instances.
  With this option, the service can be restarted on any instance in the instance pool, including instances that were added to the system after the service was configured.
- For the service to run on a specific set of instances, deselect All Available Instances. Then:
  - To remove an instance from the pool, select it from the list Instance Pool, on the left, and then click Remove Instances.
  - To add an instance to the pool, select it from the list Available Instances, on the right, and then click Add Instances.
If the service is a persistent (non-floating) service, you are presented with options for selecting the specific instances that the service should run on. Do one or both of these, then click Next:
- To remove the service from the instances it's currently on, select one or more instances from the list Selected Instances, on the left, and then click Remove Instances.
- To add the service to other instances, select one or more instances from the list Available Instances, on the right, and then click Add Instances.
Click Update.
The Processes page opens, and the Service Operations tab displays the progress of the service update as "Running." When the update finishes, the service shows "Complete."
Next steps
Related CLI commands
updateServiceConfig
Related REST API methods
POST /services/configure
You can get help on specific REST API methods for the Admin App at REST API - Admin.
Scaling Metadata Gateway instances
The HCP for cloud scale software lets you deploy an instance of the Metadata Gateway service on every node in your system. You can scale the number of instances up or down as needed.
The Metadata Coordination service manages Metadata Gateway scaling. The service does the following:
- Constantly monitors the Metadata Gateway service and balances data among Metadata Gateway instances as needed
- Moves data into new Metadata Gateway instances
- Moves data out of a Metadata Gateway instance set for removal
Use the System Management application to add new Metadata Gateway instances. You can add more than one instance at a time.
Use the System Management application to remove a Metadata Gateway instance. Before you scale down Metadata Gateway instances, consider the following:
- You can remove a Metadata Gateway instance from the system only when at most one Metadata Gateway instance is down.
  Note: If more than one instance is down, call Support to remove a Metadata Gateway instance.
- You cannot remove a Metadata Gateway instance when there are only three instances. You first need to add a new Metadata Gateway instance.
- You can only remove one Metadata Gateway instance at a time.
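The scale-down conditions listed above can be expressed as a simple pre-check. This is a sketch of the rules as stated in this section, not a product API; the function and parameter names are hypothetical.

```python
def can_remove_metadata_gateway(total_instances: int, down_instances: int,
                                removing: int = 1) -> bool:
    """Check the documented preconditions for removing a Metadata Gateway instance."""
    if removing != 1:
        return False  # only one instance can be removed at a time
    if down_instances > 1:
        return False  # more than one instance down: Support must do the removal
    if total_instances <= 3:
        return False  # with only three instances, add a new instance first
    return True


print(can_remove_metadata_gateway(total_instances=4, down_instances=1))  # True
print(can_remove_metadata_gateway(total_instances=3, down_instances=0))  # False
```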
If a Metadata Gateway instance is down, the data in that instance becomes under-protected. To resolve this, remove the instance that is down so that the Metadata Gateway service can restore data protection. Before you remove the instance that is down, add a new Metadata Gateway instance. This keeps system performance and capacity usage unchanged and ensures that a suitable target instance is available for restoring data protection. When you remove the Metadata Gateway instance, the considerations for scaling down services apply.
A snapshot conveys the current state of the state machine from a leader node (service instance) to any follower service instance that is out of sync. If a leader node runs out of space to store snapshots and can't send out its latest snapshot, the follower node cannot resynchronize. If this happens, bring down the leader service instance, increase its storage space, and restart the service.
Configuring service settings
You can configure settings for some of the services that the system runs.
Procedure
Select the Services window.
Select the service you want to configure.
On the Configuration tab, configure the service.
Click Update.
Related CLI commands
updateServiceConfig
Related REST API methods
POST /services/configure
You can get help on specific REST API methods for the Admin App at REST API - Admin.
Repairing services
If a service becomes slow or unresponsive, or shows a status of Failed, you can repair it. If you change the configuration of a service, you use the same process to restart it.
Repairing a service stops and restarts the service on each instance on which it's running.
If you change the cluster name (cluster hostname), you must repair the S3 Gateway services for the change to take effect.
If you regenerate or upload an SSL certificate, you must repair the S3 Gateway and MAPI services for the change to take effect.
If you upload an SSL certificate for access to a remote system for bucket synchronization, you must repair the Policy Engine and MAPI services for the change to take effect.
To repair a service:
Procedure
Click the Services window.
Select the service you want to repair.
Click Repair.
The Processes window opens, displaying a progress bar for the repair process.
Configuring TLS cipher suite
To configure cipher suites:
Procedure
Select the Services window.
Select the service S3-Gateway.
On the Configuration tab, in the HTTPS Options section, enter the new cipher suite in the field SSL Ciphers.
Click Update.
The service redeploys.
Avoiding Message Queue shutdown
If two of the three Message Queue service instances fail, the service shuts down. To avoid the possible loss of queued messages, resolve any situation in which only two service instances are running.
To protect messaging consistency, the Message Queue service always has three service instances. To prevent being split into disconnected parts, the service shuts down if half of the service instances fail. In practice, messaging stops if two of the three instances fail.
Do not let the service run with only two instances, because in that scenario if one of the remaining instances fails, the service shuts down. However, when one of the failed instances restarts, messaging services recover and resume.
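The availability rule described above amounts to a majority check: the service keeps running only while more than half of its instances are up. The following sketch models that rule for illustration; the function names are hypothetical and the code does not reflect how the product implements the check.

```python
def message_queue_available(total_instances: int, running_instances: int) -> bool:
    """The service stays up only while more than half of its instances are running."""
    return running_instances * 2 > total_instances


def failures_tolerated(total_instances: int, running_instances: int) -> int:
    """How many more instances can fail before the service shuts down."""
    majority = total_instances // 2 + 1
    return max(0, running_instances - majority)


print(message_queue_available(3, 2))  # True
print(message_queue_available(3, 1))  # False
print(failures_tolerated(3, 3))       # 1
print(failures_tolerated(3, 2))       # 0
```

With three instances, the service tolerates one failure. Once only two instances are running, no further failure can be tolerated, which is why that state should be resolved promptly.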
To protect the Message Queue service, immediately address any node failure in which an instance cannot be restarted. If two service instances are lost and cannot be recovered, the service cannot recover its previous state. You can still add new instances to form a new cluster, but messages that were queued are lost.
In case of such a multi-node failure, the best practice is to restart the Policy Engine service instances one at a time after the Message Queue service cluster is re-formed. This forces the service instances to recover configurations that might have been missed while the Message Queue service was down. Additionally, after the Message Queue service cluster is re-formed, bucket sync-to events that were in the messaging queues are lost, so you might need to regenerate bucket sync-to events for such objects.
The cluster forms based on instance names, including the IP address of the node on which an instance runs. Therefore, changing node configurations such as IP addresses can cause nodes to be permanently removed from the cluster, possibly triggering a shutdown. If this happens, first add instances to the messaging service. Ensure the instances synchronize with the cluster before taking nodes offline or changing node configurations such as IP addresses. This way, the cluster can keep over half of its instances running at all times.