Introducing Hitachi Content Platform for cloud scale

This section introduces Hitachi Content Platform for cloud scale and its major features.

Hitachi Content Platform for cloud scale (HCP for cloud scale) is a software-defined object storage solution that is based on a massively parallel microservice architecture, and is compatible with the Amazon S3 application programming interface (API). HCP for cloud scale is especially well suited to service applications requiring high bandwidth and compatibility with Amazon S3 APIs.

HCP for cloud scale can federate S3-compatible storage from virtually any private or public source and present the combined capacity in a single, centrally managed, global namespace.

You can install HCP for cloud scale on any server, in the cloud or on premises, that meets the minimum requirements.

HCP for cloud scale lets you manage and scale storage components. You can add storage components, monitor their states, and take them online or offline for purposes of maintenance and repair. The HCP for cloud scale system provides functions to send notification of alerts, track and monitor throughput and performance, and trace actions through the system.

Storage components, buckets, and objects

A storage component is an Amazon S3-compatible storage system, running independently, that is manageable by HCP for cloud scale as a back end to store object data. To an S3 client, the existence, type, and state of storage components are transparent.

HCP for cloud scale supports the following storage systems:

  • Amazon S3
  • Hitachi Content Platform (HCP)
  • HCP S Series Nodes
  • Any Amazon S3-compatible storage service

An HCP for cloud scale bucket is modeled on a storage service bucket. A bucket is a logical collection of secure data objects that is created and managed by a client application. HCP for cloud scale uses buckets to manage storage components, and an HCP for cloud scale site can be thought of as a logical collection of secure buckets. Buckets have associated metadata such as ownership and lifecycle status. HCP for cloud scale buckets are owned by an HCP for cloud scale user, and access is controlled on a per-bucket basis through Amazon S3 ACL support. Buckets are contained in a specific region; HCP for cloud scale supports one region.

Note
  1. HCP for cloud scale buckets are not stored in storage components, so HCP for cloud scale clients can create buckets even before adding storage components.
  2. Storage component buckets are created by storage component administrators, and are not visible to HCP for cloud scale clients.

An object consists of data and associated metadata. The metadata is a set of name-value pairs that describe the object. Every object is contained in a bucket. An object is handled as a single unit by all HCP for cloud scale transactions, services, and internal processes.
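
Because HCP for cloud scale is S3 compatible, a standard S3 client can store an object together with user-defined metadata and read that metadata back without retrieving the data. The following is a minimal sketch using the Python boto3 library; the endpoint URL, credentials, bucket, and key are hypothetical placeholders, not values defined by this document.

```python
# Minimal sketch; endpoint, credentials, bucket, and key are placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://hcpcs.example.com",  # assumed HCP for cloud scale S3 endpoint
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# Store an object with user-defined metadata (name-value pairs).
s3.put_object(
    Bucket="my-bucket",
    Key="reports/q1.csv",
    Body=b"quarter,revenue\nQ1,100\n",
    Metadata={"department": "finance", "retention": "7y"},
)

# Read the metadata back without downloading the object data.
head = s3.head_object(Bucket="my-bucket", Key="reports/q1.csv")
print(head["Metadata"])  # {'department': 'finance', 'retention': '7y'}
```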

For information about Amazon S3, see Introduction to Amazon S3.

Data access

HCP for cloud scale supports the Amazon Simple Storage Service (S3) application programming interface (API), which allows client applications to store and retrieve unlimited amounts of data from configured storage services.
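
As a minimal illustration, the same hypothetical client configuration can retrieve stored data with a standard S3 download call:

```python
# Minimal retrieval sketch; endpoint, credentials, and names are placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://hcpcs.example.com",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# Stream a stored object to a local file.
s3.download_file("my-bucket", "reports/q1.csv", "/tmp/q1.csv")
```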

Data access control

HCP for cloud scale uses ownership and access control lists (ACLs) as data access control mechanisms in S3 APIs.

Ownership is implemented as follows:

  • An HCP for cloud scale bucket is owned by the user who creates the bucket, and the owner cannot be changed
  • A user has full control of the buckets that user owns
  • A user has full control of the objects that user creates
  • A user can only list the buckets that user owns

ACLs allow the owner to grant other user accounts privileges (read, write, or full control) on buckets and objects.
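
For example, a bucket owner can use the S3 ACL APIs to grant access to another account. This boto3 sketch is illustrative; the endpoint, credentials, and the grantee's canonical user ID are placeholders.

```python
# Illustrative ACL grants; all identifiers are placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://hcpcs.example.com",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

grantee = 'id="72ab5d3f..."'  # canonical user ID of the other account (placeholder)

# Grant read access on a bucket the caller owns.
s3.put_bucket_acl(Bucket="my-bucket", GrantRead=grantee)

# Grant full control of a single object the caller created.
s3.put_object_acl(Bucket="my-bucket", Key="reports/q1.csv", GrantFullControl=grantee)
```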

Data security

HCP for cloud scale supports encryption of data sent between systems ("in flight") and data stored persistently within the system ("at rest").

Certificate management

HCP for cloud scale uses Secure Sockets Layer (SSL) to provide security for both incoming and outgoing communications. To enable SSL security, two types of certificates are required:

  • System certificate: the certificate HCP for cloud scale uses for its GUI and APIs (incoming communications)
  • Client certificates: the certificates of IDPs, storage components, and SMTP servers (outgoing communications)

For a system certificate, HCP for cloud scale comes with its own self-signed SSL server certificate, which is generated and installed automatically during system installation. This certificate is not automatically trusted by web browsers. You can choose to trust this self-signed certificate or replace it by using one of three options:
  1. Upload a PKCS12 certificate chain and password and apply it as the active system certificate.
  2. Download a certificate signing request (CSR), then use it to obtain, upload, and apply a certificate signed by a certificate authority (CA).
  3. Generate a new self-signed certificate and apply it as the active system certificate.

For a client certificate, upload the certificates of the clients that HCP for cloud scale needs to access using SSL.

You can manage certificates, as well as view the installed certificates and their details, using the System Management application.
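
For instance, you can retrieve the certificate that the system currently presents by connecting to its HTTPS port with the Python standard library. This is a sketch; the host name is a placeholder.

```python
# Fetch the PEM text of the certificate presented by the (hypothetical) endpoint.
import ssl

pem = ssl.get_server_certificate(("hcpcs.example.com", 443))
print(pem)  # compare with the certificate details shown in System Management
```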

Data-in-flight encryption

HCP for cloud scale supports data-in-flight encryption (HTTPS) for all external communications. Data-in-flight encryption is always enabled for these data paths:

  • S3 API (HTTP is also enabled on a different port)
  • Management API
  • System Management App user interface (GUI)
  • Object Storage Management App GUI

You can enable or disable data-in-flight encryption for these data paths:
  • Between HCP for cloud scale and an identity provider (IDP) server
  • Between HCP for cloud scale and each application using TLS or SSL
  • Between HCP for cloud scale and each managed storage component
  • Between HCP for cloud scale and each SMTP server using SSL or STARTTLS
Communication among HCP for cloud scale instances is not encrypted in flight. Depending on your security requirements, you might need to set up an isolated internal network for your HCP for cloud scale site.
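
To illustrate the two S3 data paths, a client can connect over HTTPS, or over plain HTTP on the separate port where it is enabled. This sketch is illustrative; the host name and HTTP port number are placeholders, and `verify=False` is appropriate only while the system still presents its self-signed certificate.

```python
# Illustrative only; host and port are placeholders.
import boto3

# HTTPS (always enabled); disable verification only for a self-signed certificate.
secure = boto3.client(
    "s3",
    endpoint_url="https://hcpcs.example.com",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
    verify=False,
)

# Plain HTTP on its own port: no data-in-flight encryption.
plain = boto3.client(
    "s3",
    endpoint_url="http://hcpcs.example.com:8080",  # placeholder port
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)
```
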
Data-at-rest encryption

HCP for cloud scale stores three kinds of data persistently:

  1. HCP for cloud scale services data
  2. HCP for cloud scale metadata and user-defined metadata
  3. User data (object data)

The first two kinds of data are handled by the hardware on which HCP for cloud scale instances are installed. If needed, you can install HCP for cloud scale on servers with encrypted disks. The third kind of data is handled by storage components. If needed, you can use storage components that support data-at-rest encryption. Storage components can manage their own encryption keys, or HCP for cloud scale can pass customer-supplied keys through to them following the S3 API specification.
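
For example, the S3 API defines customer-supplied encryption keys (SSE-C) on object operations. The following boto3 sketch shows the call pattern; the endpoint and names are placeholders, and support depends on how your storage components are configured.

```python
# SSE-C call pattern sketch; endpoint and names are placeholders.
import os
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://hcpcs.example.com",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

key = os.urandom(32)  # 256-bit key supplied and retained by the customer

# Write an object encrypted with the customer-supplied key.
s3.put_object(Bucket="my-bucket", Key="secret.bin", Body=b"payload",
              SSECustomerAlgorithm="AES256", SSECustomerKey=key)

# The same key must be presented to read the object back.
obj = s3.get_object(Bucket="my-bucket", Key="secret.bin",
                    SSECustomerAlgorithm="AES256", SSECustomerKey=key)
```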

Scalability of instances, service instances, and storage components

You can increase or decrease the capacity, performance, and availability of an HCP for cloud scale site by adding or removing the following:

  • Instances: Additional physical computer nodes or virtual machines
  • Service instances: Copies of services running on additional instances
  • Storage components: S3-compatible object storage used to store object data

In a multi-instance site, you might add instances to improve system performance or because you are running out of disk space on one or more instances. You might remove instances if you are retiring hardware, if an instance is down and cannot be recovered, or if you want to run a site with fewer instances.

In a multi-instance site, you can change where a service instance runs:

  • You can configure it to run on additional instances. For example, you can increase the number of instances of the S3-Gateway service to improve throughput of S3 API transactions without having to add a compute instance.
  • You can configure it to run on fewer instances. For example, you can free up resources on an instance to run other services.
  • You can configure it to run on different instances. For example, you can move the service instances off a hardware instance to retire it.
  • For a floating service, instead of specifying a specific instance on which it runs, you can specify a pool of eligible instances, any of which can run the service.

Some services have a fixed number of instances and therefore cannot be scaled. These include:

  • Metadata-Coordination
  • Metadata-Gateway
  • Metadata-Cache

You might add additional storage components to a site under these circumstances:

  • The existing storage components are running out of available capacity
  • The existing storage components do not provide the performance you require
  • The existing storage components do not provide the functionality you require

Supported limits

HCP for cloud scale limits the number of instances (nodes) in a system to 160.

HCP for cloud scale does not limit the number of the following entities:

| Entity | Minimum | Maximum | Notes |
| --- | --- | --- | --- |
| Buckets | None | Unlimited | |
| Users (external) | None | Unlimited | The local user can perform all operations, including MAPI calls and S3 API calls. However, the best practice is to configure HCP for cloud scale with an identity provider (IdP) and IdP users to enforce role-based access control. |
| Groups (external) | | Unlimited | |
| Roles | | Unlimited | |
| Objects | None | Unlimited | |
| Storage components | 1 | Unlimited | |

High availability

HCP for cloud scale provides high availability for multi-instance sites. High availability requires at least four instances. The best practice is to run at least three master instances, which run essential services, on separate physical hardware (or, if running on virtual machines, on at least three separate physical hosts), and to run HCP for cloud scale services on more than one instance.

Site availability

An HCP for cloud scale site has three master instances and can tolerate the failure of one master instance without interruption of service. Even if two or all three master instances fail, HCP for cloud scale services might still be functional, but you cannot move or scale service instances until the master instances are restored.

Service availability

HCP for cloud scale services provide high availability as follows:

  • The Metadata Gateway service always has three service instances. When the system starts up, the nodes elect a leader using the Raft consensus algorithm. The leader processes all GET and PUT requests. If the followers cannot identify the leader, they elect a new leader. The Metadata Gateway service can tolerate one service instance failure; service remains available without loss of data as long as at least two service instances are healthy (see the quorum sketch following this list).
  • The Metadata Coordination service always has one service instance. If that instance fails, HCP for cloud scale automatically starts another instance. Until startup is complete, the Metadata Gateway service cannot scale.
  • The Metadata Cache service always has one service instance. If that instance fails, HCP for cloud scale automatically starts another instance. Until startup is complete, performance decreases.

The rest of the HCP for cloud scale services remain available if HCP for cloud scale instances or service instances fail, as long as at least one service instance remains healthy. Even if a service that has only one service instance fails, HCP for cloud scale automatically starts a new service instance.
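
The quorum arithmetic behind the Metadata Gateway behavior is simple majority voting. The following sketch is an explanatory model, not HCP for cloud scale source code.

```python
# Quorum arithmetic for a three-instance Raft group (explanatory model only).
TOTAL = 3                # fixed number of Metadata Gateway service instances
QUORUM = TOTAL // 2 + 1  # majority needed to elect a leader: 2 of 3

def service_available(healthy: int) -> bool:
    """A leader can be elected, and requests served, only with a majority."""
    return healthy >= QUORUM

print(service_available(3))  # True: all instances healthy
print(service_available(2))  # True: tolerates one service instance failure
print(service_available(1))  # False: no majority, so the service is unavailable
```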

Metadata availability

Metadata is available as long as these two services are available:

  • S3 Gateway
  • Metadata Gateway

Object data availability

Object data is available as long as these items are available:

  • S3 Gateway service (at least one instance)
  • The storage component containing the requested data
  • At least two functioning Metadata Gateway service instances (of the required three)

The availability of object data depends on the storage component. For high availability of object data, use a storage component with high availability, such as HCP, HCP S Series, or Amazon S3. The same applies to data protection.

Network availability

You can install each HCP for cloud scale instance with an internal and an external network interface. If you want to avoid networking single points of failure, you can:

  • Configure two external network interfaces in each HCP for cloud scale instance
  • Use two switches, and connect each network interface to one of them
  • Bind the two network interfaces into one virtual network interface (for example, in an active-passive configuration)
  • Install HCP for cloud scale using the virtual network interface

Failure recovery

HCP for cloud scale actively monitors the health and performance of the system and its resources, provides real-time visual health representations, issues alert messages when needed, and can automatically take action to recover from the following types of failures:

  • Instances (nodes)
  • Product services (software processes)
  • System services (software processes)
  • Storage components

Instance failure recovery

If an instance (a compute node) fails, HCP for cloud scale automatically adds new service instances to other available instances (compute nodes) to maintain the recommended minimum number of service instances. Data on the failed instance is not lost and remains consistent. However, while the instance is down, data redundancy may degrade.

HCP for cloud scale adds new service instances automatically only for floating services. Depending on the number of instances and service instances still running, you might need to add new service instances or deploy a new instance.

Service failure recovery

HCP for cloud scale monitors service instances and automatically restarts them if they are not healthy.

For floating services, you can configure a pool of eligible HCP for cloud scale instances and the number of service instances that should be running at any time. You can also set the minimum and maximum number of instances running each service. If a service instance failure causes the number of service instances to fall below the minimum, HCP for cloud scale starts another service instance on one of the HCP for cloud scale instances in the pool that is not already running that service instance.
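
The following sketch models that placement rule for floating services. It is an illustration of the behavior described above, not product source code; the function and instance names are hypothetical.

```python
# Illustrative placement rule: start new service instances on pool members that
# are not already running one, until the configured minimum is met again.
def place_floating(pool: list[str], running_on: set[str], minimum: int) -> list[str]:
    placements: list[str] = []
    for instance in pool:
        if len(running_on) + len(placements) >= minimum:
            break  # the minimum count is satisfied
        if instance not in running_on:
            placements.append(instance)
    return placements

# One service instance left on node2 after a failure, with a minimum of two:
print(place_floating(["node1", "node2", "node3"], {"node2"}, minimum=2))  # ['node1']
```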

Persistent services run on the specific instances that you specify. If one of those service instances fails, HCP for cloud scale restarts the service instance in the same HCP for cloud scale instance. HCP for cloud scale does not automatically bring up a new service instance on a different HCP for cloud scale instance.

Storage component failure recovery

HCP for cloud scale performs regular health checks to detect storage component failures.

If HCP for cloud scale detects a failure, it sets the storage component state to INACCESSIBLE, so that HCP for cloud scale will not try to write new objects to it. You can configure HCP for cloud scale to send an alert when this event happens. While a storage component is down, the data in it is not accessible.

HCP for cloud scale keeps checking a failed storage component and, when it detects that the storage component is healthy again, automatically sets its state to ACTIVE. You can also configure HCP for cloud scale to send an alert when this event happens. After the storage component is repaired and brought back online, the data it contains is again accessible, and you can write new objects to it.
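
The following sketch models this monitoring cycle. The check and alert hooks are hypothetical stand-ins, not a documented HCP for cloud scale API.

```python
# Explanatory model of the health-check cycle described above.
import time

ACTIVE, INACCESSIBLE = "ACTIVE", "INACCESSIBLE"

def monitor(component: str, check, alert, interval: int = 60) -> None:
    state = ACTIVE
    while True:
        healthy = check(component)   # hypothetical probe, e.g. an S3 HEAD request
        if state == ACTIVE and not healthy:
            state = INACCESSIBLE     # stop writing new objects to the component
            alert(f"{component} is INACCESSIBLE")
        elif state == INACCESSIBLE and healthy:
            state = ACTIVE           # repaired: data and new writes available again
            alert(f"{component} is ACTIVE again")
        time.sleep(interval)
```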
