This module describes the Hitachi Content Platform for cloud scale (HCP for cloud scale) system and its main use cases.
About HCP for cloud scale
HCP for cloud scale is a software-only data storage platform that sits on top of physical or cloud-based data storage systems, such as Hitachi Content Platform (HCP) and Amazon Web Services (AWS). It acts as a common interface across those storage systems and manages all storage objects, including buckets, objects, and metadata. HCP for cloud scale can scale to accommodate any number of storage systems, and its data storage limits are defined only by its underlying technologies.
You manage how the system scales by adding instances to or removing instances from the system, and by specifying which services run on those instances.
An instance is a server or virtual machine on which the software is running. A system can have either a single instance or multiple instances. Multi-instance systems have a minimum of four instances.
A system with multiple instances maintains higher availability in the event of instance failures. Additionally, a system with more instances can run tasks concurrently and can typically process tasks faster than a system with fewer instances or with only one.
A multi-instance system has two types of instances: master instances, which run an essential set of services, and non-master instances, which are called workers.
Each instance runs a configurable set of services, each of which performs a specific function. For example, the Metadata Gateway service stores metadata persistently.
In a single-instance system, that instance runs all services. In a multi-instance system, services can be distributed across all instances.
Single-instance systems vs. multi-instance systems
An HCP for cloud scale system can have a single instance or can have multiple instances (four or more).
A single-instance system is useful for testing and demonstration purposes. It needs only a single server or virtual machine and can perform all product functionality.
However, a single-instance system has these drawbacks:
- A single-instance system has a single point of failure. If the instance hardware fails, you lose access to the system.
- With no additional instances, you cannot choose where to run services. All services run on the single instance.
A multi-instance system is suitable for use in a production environment because it offers these advantages over a single-instance system:
- You can control how services are distributed across the multiple instances, providing improved service redundancy, scale out, and availability.
- A multi-instance system can survive instance outages. For example, with a four-instance system running the default distribution of services, the system can lose one instance and still remain available.
- Performance is improved as work can be performed in parallel across instances.
- You can add instances to the system at any time.
Note: You cannot change a single-instance system into a production-ready multi-instance system by adding new instances. This is because you cannot add master instances. Master instances are special instances that run a particular set of HCP for cloud scale services. Single-instance systems have one master instance. Multi-instance systems have at least three.
If you add instances to a single-instance system, the system still has only one master instance, so there is still a single point of failure for the essential services that only a master instance can run.
For information about adding instances to an existing HCP for cloud scale system, see the HCP for cloud scale online help.
The minimum HCP for cloud scale configuration has four instances. Four-instance systems should have three master instances.
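The sizing rules above can be expressed as a small planning check. This is only an illustrative sketch: the `Topology` class and `check_topology` function are hypothetical names, not part of HCP for cloud scale, and the check encodes only the rules stated in this section (one master for a single instance; at least four instances and three masters for a production-ready multi-instance system).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topology:
    """Hypothetical description of a planned deployment."""
    total_instances: int
    master_instances: int


def check_topology(t: Topology) -> list:
    """Return a list of problems; an empty list means the plan fits the rules above."""
    problems = []
    if t.total_instances < 1:
        problems.append("a system needs at least one instance")
    elif t.total_instances == 1:
        # A single-instance system runs everything on its one master.
        if t.master_instances != 1:
            problems.append("a single-instance system has exactly one master instance")
    else:
        if t.total_instances < 4:
            problems.append("a multi-instance system needs at least four instances")
        if t.master_instances < 3:
            # Masters cannot be added later, so this must be decided at installation.
            problems.append("fewer than three master instances leaves a single point of failure")
    return problems


print(check_topology(Topology(total_instances=4, master_instances=3)))
print(check_topology(Topology(total_instances=5, master_instances=1)))
```

The second call models a single-instance system that was grown by adding workers: it has enough instances but still reports the single-master problem.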
About master and worker instances
Master instances are special instances that run an essential set of services, including:
- Admin-App service
- Cluster-Coordination service
- Synchronization service
- Service-Deployment service
Non-master instances are called workers. Workers can run any services except for those listed previously.
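As a rough illustration of the rule above, the following sketch checks whether a service may be placed on a given instance. The set `MASTER_ONLY_SERVICES` and the function `can_run` are hypothetical helpers written for this example. The set contains only the four services listed here, and the source says the essential set includes these services, so the real set may be larger.

```python
# Services listed in this section as running only on master instances.
# Assumption: the list may not be exhaustive.
MASTER_ONLY_SERVICES = frozenset({
    "Admin-App",
    "Cluster-Coordination",
    "Synchronization",
    "Service-Deployment",
})


def can_run(service: str, is_master: bool) -> bool:
    """Return True if the service may run on an instance of the given kind."""
    # Workers can run anything except the master-only services.
    return is_master or service not in MASTER_ONLY_SERVICES


print(can_run("Synchronization", is_master=False))
print(can_run("Metadata Gateway", is_master=False))
```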
Single-instance systems have one master instance while multi-instance systems have either one or three master instances.
Services perform functions essential to the health or functionality of the system. For example, the Cluster Coordination service manages hardware resource allocation, while the Policy Engine service runs synchronous and asynchronous policies triggered by S3 API requests. Internally, services run in Docker containers on the instances in the system.
Depending on what actions they perform, services are grouped into these categories:
- Services: Enable product functionality. You can scale, move, and reconfigure these services.
- System services: Maintain the health and availability of the system. You cannot scale, move, or reconfigure these services.
Some system services run only on master instances.
Some services are classified as applications. These are the services with which users interact. Services that are not applications typically interact only with other services.
Services run on instances in the system. Most services can run simultaneously on multiple instances. That is, several instances of a service can run at the same time, each on a different system instance. Some services run on only one instance.
Each service has a best and required number of instances on which it should run.
You can configure where HCP for cloud scale services run, but not where system services run.
If a service supports floating, you have flexibility in configuring where new instances of that service are started when service instances fail.
Non-floating (or persistent) services run on the specific instances that you specify. If one of those service instances fails, the system does not automatically bring up a new instance of that service on another system instance.
With a service that supports floating, you specify a pool of eligible system instances and the number of service instances that should be running at any time. If a service instance fails, the system brings up another one on one of the system instances in the pool that doesn't already have an instance of that service running.
For services with multiple types, the ability to float can be supported on a per-type basis.
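The floating behavior described above can be sketched as a selection over the pool. This is a simplified model, not the product's scheduler: the function `pick_replacements` and its parameters are hypothetical, and choosing candidates in pool order is an assumption made only to keep the example deterministic.

```python
def pick_replacements(pool, desired, running, unavailable):
    """Choose system instances on which to start replacement service instances.

    pool:        ordered list of system instances eligible to run the service
    desired:     number of service instances that should be running
    running:     set of system instances currently running the service
    unavailable: set of system instances that cannot host a new service instance
    """
    missing = max(desired - len(running), 0)
    # Only pool members that are available and not already running the service qualify.
    candidates = [i for i in pool if i not in running and i not in unavailable]
    return candidates[:missing]


pool = ["instance-1", "instance-2", "instance-3", "instance-4"]
# The service should run on two instances; the one on instance-2 has failed.
print(pick_replacements(pool, desired=2, running={"instance-1"}, unavailable={"instance-2"}))
```

For a non-floating service there is no pool to choose from: the system does not start a replacement on another system instance, so no comparable selection takes place.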
Each service binds to a number of ports and to one type of network, either internal or external. Networking for each service is configured during system installation and cannot be changed after a system is running.
Services can use volumes for storing data.
Volumes are properties of services that specify where and how a service stores its data.
You can use volumes to configure services to store their data in external storage systems, outside of the system instances. This allows data to be more easily backed up or migrated.
Volumes can also allow services to store different types of data in different locations. For example, a service might use two separate volumes, one for storing its logs and the other for storing all other data.
In this example, service A runs on instance 101. The service's Log volume stores data in a folder on the system instance and the service's Data volume stores data in an NFS mount.
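The same example can be written down as a small data model. The classes and the `external_volumes` helper below are hypothetical and exist only to illustrate the relationship between a service, its instance, and its volumes. The driver labels and the location strings are placeholders, not real product paths or settings.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Volume:
    name: str
    driver: str    # placeholder label such as "bind-mount" or "nfs"
    location: str  # placeholder description, not a real path


@dataclass
class ServicePlacement:
    service: str
    instance: str
    volumes: dict = field(default_factory=dict)


def external_volumes(placement: ServicePlacement) -> list:
    """Return the names of volumes whose data lives outside the system instance."""
    return [name for name, v in placement.volumes.items() if v.driver != "bind-mount"]


service_a = ServicePlacement(
    service="service A",
    instance="101",
    volumes={
        "Log": Volume("Log", "bind-mount", "<folder on instance 101>"),
        "Data": Volume("Data", "nfs", "<NFS mount>"),
    },
)

print(external_volumes(service_a))
```

Here only the Data volume is reported as external, which is the data that can be backed up or migrated independently of instance 101.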
Depending on how they are created and managed, volumes are separated into these groups:
- System-managed volumes are created and managed by the system. When you deploy the system, you can specify the volume driver and options that the system should use when creating these volumes.
After the system is deployed, you cannot change the configuration settings for these volumes.
- User-managed volumes can be added to services and job types after the system has been deployed. These are volumes that you manage; you need to create them on your system instances before you can configure a service or job to use them.
Note: As of release 1.3.0, none of the built-in services support adding user-managed volumes.
When configuring a volume, you specify the volume driver that it should use. The volume driver determines how and where data is stored.
Because services run in Docker containers on instances in the system, volume drivers are provided by Docker and other third-party developers, not by the system itself. For information about volume drivers you can use, see the applicable Docker or third-party developer's documentation.
By default, services do not use volume drivers; instead, they use the bind-mount setting. With this setting, data for each service is stored within the system installation folder on each instance where the service runs.
For more information on volume drivers, see the Docker documentation.