About the Hitachi Content Platform

Hitachi Content Platform is the distributed, fixed-content, data storage system from Hitachi Vantara. HCP provides a cost-effective, scalable, easy-to-use repository that can accommodate all types of data, from simple text files to medical images to multigigabyte database images.

A fixed-content storage system is one in which the data cannot be modified. HCP uses write-once, read-many (WORM) storage technology and a variety of policies and services to ensure the integrity of the stored data and the efficient use of storage capacity. HCP also provides easy access to the repository for adding, retrieving, and deleting or shredding data.

HCP has an open architecture that insulates stored data from technology changes, as well as from changes in HCP itself due to product enhancements. This open architecture ensures that users will have access to their data long after it’s been added to the repository.

HCP runs on a networked redundant array of independent nodes (RAIN) or a SAN-attached array of independent nodes (SAIN). SAN stands for storage area network.

RAIN systems use only the internal storage in each node. SAIN systems use both the internal storage in each node and the storage in Fibre Channel SAN arrays.

HCP-VM systems run on virtual machines in a VMware® environment or in a KVM environment. An HCP-VM system functions mostly like a RAIN system, with the virtual storage emulating internal storage.

Object-based storage

HCP stores objects in a repository. Each object permanently associates data HCP receives (for example, a document, an image, or a movie) with information about that data, called metadata.

An object encapsulates:

  • Fixed-content data

    An exact digital reproduction of data as it existed before it was stored in HCP. After it’s in the repository, this fixed-content data cannot be modified.

  • System metadata

    System-managed properties that describe the fixed-content data (for example, its size and creation date). System metadata includes policies, such as retention and data protection level, that influence how transactions and services affect the object.

  • Custom metadata

    Optional metadata that a user or application provides to further describe the object. Custom metadata is specified as one or more annotations, where each annotation is a discrete unit of information about the object. Annotations are typically specified in XML format.

    You can use custom metadata to create self-describing objects. Users and applications can use this metadata to understand and repurpose object content.

  • Access control list (ACL)

    Optional metadata consisting of a set of grants of permissions to perform various operations on the object. Permissions can be granted to individual users or to groups of users.

    ACLs are provided by users or applications and are specified in either XML or JSON format.
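
As an illustration only, the four parts of an object listed above can be modeled as a small data structure. This is a sketch under stated assumptions, not HCP's internal representation: the class names, field names, and the retention value "0" are hypothetical and chosen for readability.

```python
from dataclasses import dataclass, field, FrozenInstanceError
from typing import Dict, Optional, Tuple


@dataclass
class SystemMetadata:
    size: int                    # size of the fixed-content data, in bytes
    created: str                 # creation date (string format chosen for this sketch)
    retention: str               # retention setting (value syntax is hypothetical)
    data_protection_level: int   # data protection level policy


@dataclass(frozen=True)
class Grant:
    grantee: str                 # an individual user or a group of users
    permissions: Tuple[str, ...]


@dataclass(frozen=True)
class StoredObject:
    # frozen=True blocks rebinding of these attributes, which models the rule
    # that fixed-content data cannot be modified once stored. The metadata
    # containers themselves stay mutable in this sketch.
    data: bytes
    system: SystemMetadata
    annotations: Dict[str, str] = field(default_factory=dict)  # annotation name -> XML text
    acl: Optional[Tuple[Grant, ...]] = None                    # optional ACL


def make_object(data: bytes, created: str) -> StoredObject:
    """Build an object with system metadata derived from the data."""
    system = SystemMetadata(size=len(data), created=created,
                            retention="0", data_protection_level=1)
    return StoredObject(data=data, system=system)
```

An object built with make_object starts with no custom metadata and no ACL, matching the description of both as optional.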

HCP also stores directories and symbolic links. These items have system metadata but no fixed-content data, custom metadata, or ACLs.

HCP can store multiple versions of an object, thus providing a history of how the data has changed over time. Each version is a separate object, with its own system metadata and, optionally, its custom metadata and ACL.

HCP supports multipart uploads, but only through the Hitachi API for Amazon S3. A multipart upload is the process of writing the data for an object to HCP in multiple parts that are written independently of each other. Even though the data is written in multiple parts, the result of a multipart upload is a single object. An object for which the data is stored in multiple parts is called a multipart object.
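
The idea of writing parts independently and ending up with a single object can be sketched locally, without any network calls. The function names and the part size below are assumptions made for this illustration; the sketch does not call the S3 compatible API.

```python
from typing import List


def split_into_parts(data: bytes, part_size: int) -> List[bytes]:
    """Break object data into consecutive parts of at most part_size bytes."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]


def assemble(parts: List[bytes]) -> bytes:
    """Join the parts, in order, back into the data for a single object."""
    return b"".join(parts)
```

Because each part is identified by its position, the parts can be written in any order as long as they are assembled in sequence.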

Namespaces and tenants

An HCP repository is partitioned into namespaces. A namespace is a logical grouping of objects such that the objects in one namespace are not visible in any other namespace.

Namespaces provide a mechanism for separating the data stored for different applications, business units, or customers. For example, you could have one namespace for accounts receivable and another for accounts payable.

Namespaces also enable operations to work against selected subsets of objects. For example, you could perform a query that targets the accounts receivable and accounts payable namespaces but not the employees namespace.
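
A toy model may help picture this. The sketch below treats a repository as a mapping from namespace names to object names and runs a query against a chosen subset of namespaces. The namespace names, object names, and the query function are all hypothetical.

```python
from typing import Dict, Iterable, List

# Hypothetical repository contents: namespace name -> object names.
repository: Dict[str, List[str]] = {
    "accounts-receivable": ["invoice-001.pdf", "invoice-002.pdf"],
    "accounts-payable": ["bill-001.pdf"],
    "employees": ["roster.csv"],
}


def query(repo: Dict[str, List[str]], namespaces: Iterable[str], suffix: str) -> List[str]:
    """Return matching objects from the selected namespaces only."""
    return [
        f"{ns}/{name}"
        for ns in namespaces
        for name in repo.get(ns, [])
        if name.endswith(suffix)
    ]
```

Calling query with only the two accounting namespaces never touches the objects in the employees namespace.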

Namespaces share the same underlying physical storage. This, together with the multitenancy feature described under "Tenants" below, enables HCP to provide support for cloud storage services.

HCP and default namespaces

An HCP system can have a maximum of 10,000 locally defined namespaces, including one special namespace called the default namespace. Applications are typically written against namespaces other than the default; these namespaces are called HCP namespaces. The default namespace is most often used with applications that existed before release 3.0 of HCP.

Note: Replication can cause an HCP system to have more than 10,000 namespaces.

The table below outlines the major differences between HCP namespaces and the default namespace.

Feature    HCP namespaces    Default namespace
Storage usage quotas
Object ownership (not related to POSIX UID)
Access control lists (ACLs) for objects
Object versioning
Multiple custom metadata annotations
Namespace ownership by users
RESTful HTTP/HTTPS API for data access
Non-RESTful HTTP/HTTPS protocol for data access
Data access authentication with HTTP/HTTPS
Hitachi API for Amazon S3, a RESTful, HTTP-based S3 compatible API for data access
NDMP for backup and restore

Tenants

Namespaces are owned and managed by administrative entities called tenants. A tenant typically corresponds to an organization, such as a company or a division or department within a company.

HCP supports two types of tenants:

  • HCP tenants

    Own HCP namespaces. An HCP system can have multiple HCP tenants, each of which can own multiple namespaces. You can limit the number of namespaces each HCP tenant can own. In addition to being owned by a tenant, each HCP namespace can have an owner that corresponds to an individual HCP user. The owner of a namespace automatically has permission to perform certain operations on that namespace.

  • default tenant

    Owns the default namespace and only that namespace. An HCP system can have only one default tenant.

An HCP system can have a maximum of 1,000 locally defined tenants, including the default tenant.

Note: Replication can cause an HCP system to have more than 1,000 tenants.

An HCP system has both system-level and tenant-level administrators:

  • System-level administrators

    Concerned with monitoring the HCP system hardware and software, monitoring overall repository usage, configuring features that apply across the HCP system, and managing system-level users.

  • Tenant-level administrators

    Concerned with monitoring namespace usage at the tenant and namespace level, configuring individual tenants and namespaces, managing tenant-level users, and controlling access to namespaces.

System-level administrators create tenants. Tenant-level administrators create HCP namespaces. The default namespace is created automatically when the default tenant is created.

You can create the default tenant and namespace only if allowed to do so by the system configuration.

Data access methods

HCP supports access to namespace content through namespace access protocols, the HCP Namespace Browser, the HCP metadata query API, the HCP Search Console, and HCP Data Migrator.

Namespace access protocols

HCP supports access to namespace content through several industry-standard protocols:

  • For HCP namespaces only:
    • A RESTful HTTP API (simply referred to as HTTP in the HCP documentation).
    • Hitachi API for Amazon S3, which is a RESTful, HTTP-based S3 compatible API. With the S3 compatible API, namespaces are called buckets.
    • HSwift, which is a RESTful, HTTP-based API that’s compatible with OpenStack Swift. With HSwift, namespaces are called containers.
  • For the default namespace only, a non-RESTful implementation of HTTP.
  • For all namespaces:
    • WebDAV
    • CIFS
    • NFS

These protocols support various operations: storing data, creating directories, viewing object data and metadata, viewing directories, modifying certain metadata, and deleting objects. You can use these protocols to access data with a web browser, third-party applications, Windows® Explorer, and other native Windows and Unix tools.

HCP allows special-purpose access to namespaces through SMTP. This protocol is used only for storing email.

For backup of the default namespace, HCP supports NDMP. Objects are backed up in OpenPGP format, which uses a tar file to package the files that represent an object. This standard format, which can be both signed and encrypted, allows backup objects to be restored to other storage systems.

HCP tenant administrators can create secure namespaces by enabling only protocols that authenticate users: HTTP and CIFS, each configured to require authentication, and the S3 compatible and HSwift APIs, which always require authentication.

HCP Namespace Browser

The HCP Namespace Browser lets you manage content in and view information about HCP namespaces. With the Namespace Browser, you can:

  • List, view, and retrieve objects, including old versions of objects
  • View custom metadata and ACLs for objects, including old versions of objects
  • Store and delete objects
  • Create empty directories
  • Display namespace information, including:
    • The namespaces that you own or can access
    • Retention classes available for a given namespace
    • Permissions for namespace access
    • Namespace statistics such as the number of objects in a given namespace or the total capacity of the namespace

The Namespace Browser is not available for the default namespace. However, you can use a web browser to view the contents of that namespace.

HCP metadata query API

The HCP metadata query API lets you search HCP for objects that meet specified criteria. The API supports two types of queries:

  • Object-based queries search for objects based on object metadata. This includes both system metadata and the content of custom metadata and ACLs. The query criteria can also include the object location (that is, the namespace and/or directory that contains the object). These queries use a robust query language that lets you combine search criteria in multiple ways.

    Object-based queries search only for objects that currently exist in the repository. For objects with multiple versions, object-based queries return only the current version.

  • Operation-based queries search not only for objects currently in the repository but also for information about objects that have been deleted by a user or application, deleted through disposition, purged, or pruned. For namespaces that support versioning, operation-based queries can return both current and old versions of objects.

    Criteria for operation-based queries can include object status (for example, created or deleted), change time, index setting, and location.

The metadata query API returns object metadata only, not object data. The metadata is returned either in XML format, with each object represented by a separate element, or in JSON format, with each object represented by a separate name/value pair. For queries that return large numbers of objects, you can use paged requests.
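
The paged-request pattern can be sketched generically as follows. This is not the metadata query API itself: fetch_page, its offset and count arguments, and the in-memory stand-in for the server are assumptions made for illustration, since the actual paging parameters are not described here.

```python
from typing import Callable, Dict, Iterator, List

# A hypothetical page fetcher: takes (offset, count) and returns up to
# `count` metadata records starting at `offset`.
FetchPage = Callable[[int, int], List[Dict[str, str]]]


def iter_results(fetch_page: FetchPage, page_size: int) -> Iterator[Dict[str, str]]:
    """Yield every record by requesting one page at a time."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        yield from page
        if len(page) < page_size:   # a short page means no more results
            return
        offset += len(page)


def make_fake_fetcher(records: List[Dict[str, str]]) -> FetchPage:
    """Stand-in for a server: serves slices of an in-memory result list."""
    def fetch(offset: int, count: int) -> List[Dict[str, str]]:
        return records[offset:offset + count]
    return fetch
```

The caller sees one continuous stream of metadata records while holding at most one page in memory at a time.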

HCP Search Console

The HCP Search Console is an easy-to-use web application that lets you search for and manage objects based on specified criteria. For example, you can search for objects that were stored before a certain date or that are larger than a specified size. You can then delete the objects listed in the search results or prevent those objects from being deleted. Similar to the metadata query API, the Search Console returns only object metadata, not object data.

By offering a structured environment for performing searches, the Search Console facilitates e-discovery, namespace analysis, and other activities that require the user to examine the contents of namespaces. From the Search Console, you can:

  • Open objects
  • Perform bulk operations on objects
  • Export search results in standard file formats for use as input to other applications
  • Publish feeds to make search results available to web users

The Search Console works with either of these two search facilities:

  • HCP metadata query engine

    This facility is integrated with HCP and works internally to perform searches and return results to the Search Console. The metadata query engine is also used by the metadata query API. When working with the metadata query engine, the Search Console is called the Metadata Query Engine Console.

  • Hitachi Data Discovery Suite (HDDS) search facility

    This facility interacts with HDDS, which performs searches and returns results to the HCP Search Console. To use the HDDS search facility, you need to first install and configure HDDS, which is a separate product from HCP. The HDDS search facility works only with version 2.x of HDDS.

    Note: Currently, HDDS does not support the use of IPv6 networks for communication with HCP.

The Search Console can use only one search facility at any given time. The search facility is selected at the HCP system level. If no facility is selected, the HCP system does not support use of the Search Console to search namespaces.

Each search facility maintains its own index of objects in each search-enabled namespace and uses this index for fast retrieval of search results. The search facilities automatically update their indexes to account for new and deleted objects and changes to object metadata.

Not all namespaces support search. To find out whether a namespace is search-enabled, see your tenant administrator.

HCP Data Migrator

HCP Data Migrator (HCP-DM) is a high-performance, multithreaded, client-side utility for viewing, copying, and deleting data.

With HCP-DM, you can:

  • Copy objects, files, and directories between the local file system, HCP namespaces, default namespaces, and earlier HCAP archives
  • Delete individual objects, files, and directories and perform bulk delete operations
  • View the content of current and old versions of objects and the content of files
  • Purge all versions of an object
  • Rename files and directories on the local file system
  • View object, file, and directory properties
  • Change system metadata for multiple objects in a single operation
  • Add, replace, or delete custom metadata for objects
  • Add, replace, or delete ACLs for objects
  • Create empty directories

HCP-DM has both a graphical user interface (GUI) and a command-line interface (CLI).

Object representation

HCP represents objects differently based on the namespace access protocol the client is using.

Object representation with HTTP

With HTTP, HCP represents each object as a URL. The root of the object path in the URL is always rest.

Here’s an example of the URL for an object named wind.jpg in the images directory in a namespace named climate in a tenant named geo in an HCP system named hcp.example.com:

 http://climate.geo.hcp.example.com/rest/images/wind.jpg

Users and applications represent system metadata and identify custom metadata by using query parameters appended to the URLs. HCP returns system metadata in HTTP response headers and returns custom metadata in the format in which it was originally specified.
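
The URL layout in the example above can be expressed as a pair of helper functions. This is a minimal sketch that assumes the namespace.tenant.system host pattern shown in the example holds in general; the function names are hypothetical and query parameters are not handled.

```python
from typing import Dict
from urllib.parse import quote, unquote, urlsplit


def rest_url(system: str, tenant: str, namespace: str,
             object_path: str, scheme: str = "http") -> str:
    """Build an object URL whose path is rooted at /rest."""
    host = f"{namespace}.{tenant}.{system}"
    # quote() leaves "/" unescaped, so directory separators are preserved.
    return f"{scheme}://{host}/rest/{quote(object_path.lstrip('/'))}"


def parse_rest_url(url: str) -> Dict[str, str]:
    """Split an object URL back into namespace, tenant, system, and path."""
    parts = urlsplit(url)
    prefix = "/rest/"
    if parts.hostname is None or not parts.path.startswith(prefix):
        raise ValueError(f"not a /rest object URL: {url}")
    namespace, tenant, system = parts.hostname.split(".", 2)
    return {
        "namespace": namespace,
        "tenant": tenant,
        "system": system,
        "path": unquote(parts.path[len(prefix):]),
    }
```

For the example above, rest_url("hcp.example.com", "geo", "climate", "images/wind.jpg") reproduces the URL shown.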

Object representation with the S3 compatible API

With the S3 compatible API, HCP represents each object as a URL. The exact format of this URL depends on how the application used to access the object handles user authentication.

The S3 compatible API does not have the concept of directories. Slashes in object names are simply part of the name and are not directory separators. Thus, with the S3 compatible API, objects do not have paths.

Here’s an example of one of the possible URLs for an object named images/wind.jpg in a namespace named climate in a tenant named geo in an HCP system named hcp.example.com:

http://climate.geo.hcp.example.com/hs3/images/wind.jpg

Users and applications represent system and custom metadata by using HTTP request headers. HCP returns system and custom metadata in HTTP response headers.

Object representation with the HSwift API

With the HSwift API, HCP represents each object as a URL. The exact format of this URL depends on how the application used to access the object handles user authentication.

HSwift does not have the concept of directories. Slashes in object names are simply part of the name and are not directory separators. Thus, objects in HSwift do not have paths.

Here’s an example of one of the possible URLs for an object named images/fire.jpg in a namespace named climate in a tenant named geo in an HCP system named hcp.example.com:

 http://api.climate.geo.hcp.example.com/swift/v1/geo/climate/images/fire.jpg

Users and applications represent system and custom metadata by using HTTP request headers. HCP returns system and custom metadata in HTTP response headers.
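
For comparison, the sketch below builds URLs in the shape of the S3 compatible and HSwift examples given in this document. Each example is described as only one of the possible URL formats, so these helpers cover just that one form. The function names are hypothetical.

```python
from urllib.parse import quote


def s3_url(system: str, tenant: str, namespace: str,
           object_name: str, scheme: str = "http") -> str:
    """URL in the form of the S3 compatible example (bucket = namespace)."""
    # The object name is used whole; any slashes are part of the name.
    return f"{scheme}://{namespace}.{tenant}.{system}/hs3/{quote(object_name)}"


def hswift_url(system: str, tenant: str, namespace: str,
               object_name: str, scheme: str = "http") -> str:
    """URL in the form of the HSwift example (container = namespace)."""
    host = f"api.{namespace}.{tenant}.{system}"
    return f"{scheme}://{host}/swift/v1/{tenant}/{namespace}/{quote(object_name)}"
```

In both helpers the object name is a single string. A name such as images/fire.jpg does not imply that an images directory exists.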

Object representation with other namespace access protocols

For namespace access protocols other than HTTP, the S3 compatible API, and the HSwift API, HCP includes a standard POSIX file system called HCP-FS that represents each object as a set of files. One of these files has the same name as the object. This file contains the fixed-content data. When downloaded or opened, this file has the same content as the originally stored item.

The other files that HCP-FS presents contain object metadata. These files, most of which are plain text, are called metafiles.

All files containing fixed-content data are in a directory hierarchy headed by data. All metafiles are in a directory hierarchy headed by metadata.
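
The two hierarchies can be illustrated with a small path helper. This sketch assumes that the metadata hierarchy mirrors the data hierarchy for each object; that mirroring, the function name, and the example path are assumptions for illustration, and no metafile names are implied.

```python
from pathlib import PurePosixPath
from typing import Tuple


def hcpfs_paths(object_path: str) -> Tuple[str, str]:
    """Return (data path, metadata location) for an object path.

    Assumes the metadata tree mirrors the data tree for each object.
    """
    rel = PurePosixPath(object_path.lstrip("/"))
    data_path = PurePosixPath("data") / rel
    metadata_path = PurePosixPath("metadata") / rel
    return str(data_path), str(metadata_path)
```

Under that assumption, an object stored as images/wind.jpg would have its fixed-content data at data/images/wind.jpg and its metafiles under metadata/images/wind.jpg.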

With this view of objects as conventional files and directories, HCP supports routine file-level calls and enables users and applications to find fixed-content data in familiar ways.

 
