Introduction to Hitachi Content Platform


Hitachi Content Platform (HCP) is a distributed storage system designed to support large, growing repositories of fixed-content data. HCP stores objects that include both data and metadata that describes the data.

HCP provides access to stored objects through a variety of industry-standard protocols, as well as through various HCP-specific interfaces.

This chapter introduces basic HCP concepts and includes information on what you can do with an HCP namespace.

© 2015, 2019 Hitachi Vantara Corporation. All rights reserved.

About Hitachi Content Platform


Hitachi Content Platform is a combination of hardware and software that provides an object-based data storage environment. An HCP repository stores all types of data, from simple text files to medical images to multigigabyte database images.

HCP provides easy access to the repository for adding, retrieving, and deleting data. HCP uses write-once, read-many (WORM) storage technology and a variety of policies and internal processes to ensure the integrity of the stored data and the efficient use of storage capacity.

Object-based storage


HCP stores objects in the repository. Each object permanently associates data HCP receives (for example, a document, an image, or a movie) with information about that data, called metadata.

An object encapsulates:

Fixed-content data — An exact digital reproduction of data as it existed before it was stored. Once it’s in the repository, this fixed-content data cannot be modified.

System metadata — System-managed properties that describe the fixed-content data (for example, its size and creation date). System metadata consists of POSIX metadata as well as HCP-specific settings, such as retention and data protection level, that influence how transactions and internal processes affect the object.

Custom metadata — Optional metadata that a user or application provides to further describe the object. Custom metadata is specified as one or more annotations, where each annotation is a discrete unit of information about the object. Annotations are typically specified in XML format.

You can use custom metadata to create self-describing objects. Users and applications can use this metadata to understand and repurpose the object content.

Access control list (ACL) — Optional metadata consisting of a set of grants of permissions to perform various operations on the object. Permissions can be granted to individual users or to groups of users.

ACLs are provided by users or applications and are specified in either XML or JSON format.
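
As a rough illustration of a self-describing object, the sketch below builds a small XML annotation with Python's standard library. The root tag `record` and the field names `patient_id` and `modality` are hypothetical; HCP does not prescribe them, and an application would choose its own schema.

```python
import xml.etree.ElementTree as ET


def build_annotation(fields: dict[str, str], root_tag: str = "record") -> bytes:
    """Serialize flat key/value pairs as a small XML document.

    The tag names are illustrative only; any well-formed XML could
    serve as an annotation.
    """
    root = ET.Element(root_tag)
    for name, value in fields.items():
        child = ET.SubElement(root, name)
        child.text = value
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


# Hypothetical annotation for a medical image.
annotation = build_annotation({"patient_id": "12345", "modality": "MRI"})

# Any consumer can parse the annotation without knowing the image format.
parsed = ET.fromstring(annotation)
```

A reader of the object can inspect `parsed` to learn what the fixed-content data represents without opening the data itself.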

HCP can store multiple versions of an object, thus providing a history of how the data has changed over time. Each version is a separate object, with its own system metadata and, optionally, its own custom metadata and ACL.

HCP supports multipart uploads with the Hitachi API for Amazon S3. With a multipart upload, the data for an object is broken into multiple parts that are written to HCP independently of each other. Even though the data is written in multiple parts, the result of a multipart upload is a single object. An object for which the data is stored in multiple parts is called a multipart object.

Note: Multipart uploads are possible only with the S3 compatible API, but objects created by multipart uploads can be managed and retrieved with the REST and HSwift namespace access protocols. Such objects cannot be managed or retrieved with the WebDAV, CIFS, and NFS protocols.
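
The idea behind a multipart upload can be sketched without any network calls. The example below only models the concept: it splits data into numbered parts and reassembles them in part-number order, so the result is the same single object regardless of the order in which parts arrive. The function names and the part size are assumptions for illustration, not part of the S3 compatible API.

```python
def split_into_parts(data: bytes, part_size: int) -> dict[int, bytes]:
    """Break data into numbered parts (part numbers start at 1)."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    offsets = range(0, len(data), part_size)
    return {number + 1: data[start:start + part_size]
            for number, start in enumerate(offsets)}


def assemble(parts: dict[int, bytes]) -> bytes:
    """Join parts by part number, ignoring the order they were written in."""
    return b"".join(parts[number] for number in sorted(parts))


data = bytes(range(256)) * 10                    # 2560 bytes of sample data
parts = split_into_parts(data, part_size=1000)   # three parts

# Simulate parts being written independently, in reverse order.
arrived = dict(reversed(list(parts.items())))
result = assemble(arrived)
```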

In addition to objects, HCP also stores directories and symbolic links in the repository. Directories and symbolic links have POSIX metadata but no fixed-content data, HCP-specific metadata, custom metadata, or ACLs.

Note: An object is equivalent to a WebDAV resource. A directory is equivalent to a WebDAV collection.

Namespaces and tenants


An HCP repository is partitioned into namespaces. A namespace is a logical grouping of objects such that the objects in one namespace are not visible in any other namespace.

Namespaces provide a mechanism for separating the data stored for different applications, business units, or customers. For example, you could have one namespace for accounts receivable and another for accounts payable.

Namespaces also enable operations to work against selected subsets of repository objects. For example, you could perform a query that targets the accounts receivable and accounts payable namespaces but not the employees namespace.

Namespaces are owned and managed by administrative entities called tenants. A tenant typically corresponds to an organization such as a company or a division or department within a company.

Namespace access


HCP supports access to namespace content through:

Several industry-standard protocols

The HCP Namespace Browser

The HCP metadata query API

The HCP Search Console

HCP Data Migrator

Namespace access protocols


HCP supports access to namespace content through several industry-standard protocols:

A RESTful HTTP API (called REST in HCP)

Hitachi API for Amazon S3, which is a RESTful, HTTP-based S3 compatible API

HSwift, which is a RESTful, HTTP-based API that’s compatible with OpenStack Swift

WebDAV

CIFS

NFS

These protocols support various operations: storing data, creating directories, viewing object data and metadata, viewing directories, modifying certain metadata, and deleting objects. You can use these protocols to access data with a web browser, third-party applications, Windows Explorer, and other native Windows and Unix tools.

HCP can be configured to allow special-purpose access to a namespace through the SMTP protocol. This protocol is used only for storing email. HCP automatically generates directory paths and object names for email objects stored this way. For information on how HCP names email objects, see Naming conventions for email objects.

Objects added to the namespace through any protocol are immediately accessible through any other protocol.

The namespace access protocols are configured separately for each namespace and are enabled or disabled independently of each other. If you cannot access a namespace through a given protocol, you can ask your tenant administrator to enable that protocol.

The REST, S3 compatible, and CIFS protocols can be configured to require authentication. To use a protocol that requires authentication, users and applications must present valid credentials for access to the namespace.

If the REST, S3 compatible, or CIFS protocol is enabled but is not configured to require authentication, users and applications can access the namespace anonymously, without presenting any credentials.

The WebDAV and NFS protocols do not support authenticated access; when you use these protocols, you always access the namespace anonymously. For the WebDAV protocol, this is true even when WebDAV basic authentication is enabled for the namespace.

To find out the authentication requirements for the namespace you want to access, contact your tenant administrator.
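
The authentication rules above can be summarized as a small decision function. This is a sketch of the rules as described here, not HCP code: the function name and return values are hypothetical, and it assumes that a client presenting valid credentials to a protocol that does not require them is still treated as authenticated. HSwift is left out because this section does not state its authentication behavior.

```python
AUTH_CAPABLE = {"REST", "S3", "CIFS"}   # can be configured to require authentication
ANONYMOUS_ONLY = {"WebDAV", "NFS"}      # access is always anonymous


def access_mode(protocol: str, enabled: bool,
                requires_auth: bool = False,
                has_credentials: bool = False) -> str:
    """Return 'authenticated', 'anonymous', or 'denied' for one request."""
    if not enabled:
        return "denied"                 # protocol is disabled for the namespace
    if protocol in ANONYMOUS_ONLY:
        return "anonymous"              # credentials, if any, are not used
    if protocol in AUTH_CAPABLE:
        if has_credentials:
            return "authenticated"      # assumption: valid credentials are honored
        return "denied" if requires_auth else "anonymous"
    raise ValueError(f"protocol not covered by this sketch: {protocol}")
```

For example, `access_mode("REST", enabled=True, requires_auth=True)` yields `"denied"` because no credentials were presented.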

For more information on:

The REST API, see HTTP

The S3 compatible API, see Using the Hitachi API for Amazon S3

The HSwift API, see Using the HCP OpenStack HSwift API

The WebDAV API, see WebDAV

WebDAV basic authentication, see Basic authentication with WebDAV

The CIFS protocol, see CIFS

The NFS protocol, see NFS

The figure below shows the relationship between original data, objects in a namespace, and the supported namespace access protocols.

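[Figure: relationship between original data, objects in a namespace, and the supported namespace access protocols]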

HCP Namespace Browser


The HCP Namespace Browser lets you manage content in and view information about namespaces. With the Namespace Browser, you can:

List, view, and retrieve objects, including old versions of objects

View custom metadata and ACLs for objects, including old versions of objects

Store and delete objects

Create empty directories

Display namespace information, including:

    - The namespaces that you own or can access

    - Retention classes available for a given namespace

    - Permissions for namespace access

    - Namespace statistics such as the number of objects in a given namespace or the total capacity of the namespace

For information on using the Namespace Browser, see Namespace Browser.

HCP metadata query API


The HCP metadata query API lets you search HCP for objects that meet specified criteria. The API supports two types of queries:

Object-based queries search for objects based on object metadata. This includes both system metadata and the content of custom metadata and ACLs. The query criteria can also include the object location (that is, the namespace and/or directory that contains the object). These queries use a robust query language that lets you combine search criteria in multiple ways.

Object-based queries search only for objects that currently exist in the repository. For objects with multiple versions, object-based queries return only the current version.

Operation-based queries search not only for objects currently in the repository but also for information about objects that have been deleted by a user or application, deleted through disposition, purged, or pruned. For namespaces that support versioning, operation-based queries can return both current and old versions of objects.

Criteria for operation-based queries can include object status (for example, created or deleted), change time, index setting, and location.

The metadata query API returns object metadata only, not object data. The metadata is returned either in XML format, with each object represented by a separate element, or in JSON format, with each object represented by a separate name/value pair. For queries that return large numbers of objects, you can use paged requests.
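
A paged request loop might look like the sketch below. It deliberately avoids the real request format: `fetch_page` is a hypothetical callable standing in for one query request, and `fake_fetch` is an in-memory stand-in for the service. Only the paging pattern itself is the point.

```python
from typing import Callable, Iterator

# One page of results: (metadata records, whether more pages remain).
Page = tuple[list[dict], bool]


def iter_results(fetch_page: Callable[[int, int], Page],
                 page_size: int = 100) -> Iterator[dict]:
    """Yield every record by requesting one page at a time."""
    offset = 0
    while True:
        records, more = fetch_page(offset, page_size)
        yield from records
        offset += len(records)
        if not more or not records:
            break


# Stand-in for the query service: 250 metadata records, no object data.
_INDEX = [{"name": f"file{i}.txt", "size": i} for i in range(250)]


def fake_fetch(offset: int, count: int) -> Page:
    chunk = _INDEX[offset:offset + count]
    return chunk, offset + len(chunk) < len(_INDEX)


all_records = list(iter_results(fake_fetch, page_size=100))
```

Because `iter_results` is a generator, a caller can stop early without requesting the remaining pages.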

HCP Search Console


The HCP Search Console is an easy-to-use web application that lets you search for and manage objects based on specified criteria. For example, you can search for objects that were stored before a certain date or that are larger than a specified size. You can then delete the objects listed in the search results or prevent those objects from being deleted. Like the metadata query API, the Search Console returns only object metadata, not object data.

By offering a structured environment for performing searches, the Search Console facilitates e-discovery, namespace analysis, and other activities that require the user to examine the contents of namespaces. From the Search Console, you can:

Open objects

Perform bulk operations on objects

Export search results in standard file formats for use as input to other applications

Publish feeds to make search results available to web users

The Search Console works with either of these two search facilities:

The HCP metadata query engine — This facility is integrated with HCP and works internally to perform searches and return results to the Search Console. The metadata query engine is also used by the metadata query API.

Note: When working with the metadata query engine, the Search Console is called the Metadata Query Engine Console.

The Hitachi Data Discovery Suite (HDDS) search facility — This facility interacts with HDDS, which performs searches and returns results to the HCP Search Console. HDDS is a separate product from HCP.

The Search Console can use only one search facility at any given time. The search facility is selected at the HCP system level. If no facility is selected, the HCP system does not support use of the Search Console to search namespaces.

Each search facility maintains its own index of objects in each search-enabled namespace and uses this index for fast retrieval of search results. The search facilities automatically update their indexes to account for new and deleted objects and changes to object metadata.

Note: Not all namespaces support search. To find out whether a namespace is search-enabled, contact your tenant administrator.

HCP Data Migrator


HCP Data Migrator (HCP-DM) is a high-performance, multithreaded, client-side utility for viewing, copying, and deleting data. With HCP-DM, you can:

Copy objects, files, and directories between the local file system, HCP namespaces, default namespaces, and earlier HCAP archives

Delete individual objects, files, and directories and perform bulk delete operations

View the content of current and old versions of objects and the content of files

Purge all versions of an object

Rename files and directories on the local file system

View object, file, and directory properties

Change system metadata for multiple objects in a single operation

Add, replace, or delete custom metadata for objects

Add, replace, or delete ACLs for objects

Create empty directories

HCP-DM has both a graphical user interface (GUI) and a command-line interface (CLI).

For information on installing and using HCP-DM, see Using HCP Data Migrator.

HCP nodes


The core hardware for an HCP system consists of servers that are networked together. These servers are called nodes.

When you access an HCP system, your point of access is an individual node. To identify the system, however, you can use either the domain name of the system or the IP address of an individual node. When you use the domain name, HCP selects the access node for you. This helps ensure an even distribution of the processing load.

For information on when to use an IP address instead of the domain name, see Hostname and IP address considerations.

Permissions


To access a namespace and take action in it, clients must have the necessary permissions. The table below describes the possible permissions and the operations they allow.

Permission      Operations

Browse          List directory contents.
                Check for directory existence.

Read            Retrieve objects and system metadata.
                Check for object existence.
                List annotations.
                Check for and retrieve annotations.
                Read operations also require browse permission.

Read ACL        Check for and retrieve ACLs.

Write           Store objects.
                Create directories.
                Modify system metadata.
                Add and replace annotations.

Write ACL       Add, replace, and delete ACLs.

Delete          Delete objects, empty directories, annotations, and ACLs.

Purge           Delete objects and their old versions (also requires delete permission).

Privileged      Delete or purge objects regardless of retention (also requires delete or purge permissions).
                Place objects on hold or release objects from hold (also requires write permission).

Change owner    Change object owners.

Search          Search for objects (also requires browse and read permissions). For information on searching for objects, see HCP Metadata Query API Reference and Searching Namespaces.

Note: When using the CIFS protocol with a Windows client, you need both read and write permissions to store objects.
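
Several permissions in the table depend on others. The sketch below encodes the simple dependencies (read needs browse, purge needs delete, search needs browse and read) and reports which granted permissions are actually usable. The lowercase names and the function are hypothetical. The privileged permission is left out because its prerequisites differ per operation.

```python
# Prerequisites taken from the permissions table. Privileged is not
# modeled: it needs delete or purge for some operations, write for others.
PREREQUISITES = {
    "read": {"browse"},
    "purge": {"delete"},
    "search": {"browse", "read"},
}


def usable_permissions(granted: set[str]) -> set[str]:
    """Return the granted permissions whose prerequisites are also granted."""
    return {perm for perm in granted
            if PREREQUISITES.get(perm, set()) <= granted}
```

For example, granting only read and write leaves just write usable, because read also requires browse.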

Data access permission mask

The operations allowed in a namespace are determined by a data access permission mask for the namespace. Data access permission masks are set at the system, tenant, and namespace levels.

The effective permissions for a namespace are the operations that are allowed by the mask at all three levels. That is, to be in effect for a namespace, a permission must be included in the system-level permission mask, the tenant-level permission mask, and the namespace-level permission mask.
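
In other words, the effective mask is the intersection of the three masks. A minimal sketch, using hypothetical lowercase permission names:

```python
def effective_permissions(system_mask: set[str],
                          tenant_mask: set[str],
                          namespace_mask: set[str]) -> set[str]:
    """A permission is in effect only if all three levels include it."""
    return system_mask & tenant_mask & namespace_mask


system = {"browse", "read", "write", "delete", "purge"}
tenant = {"browse", "read", "write", "delete"}
namespace = {"browse", "read", "write", "purge"}

effective = effective_permissions(system, tenant, namespace)
```

Here delete is excluded by the namespace-level mask and purge by the tenant-level mask, so neither is in effect even though the system-level mask allows both.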

User permissions

To perform an operation in a namespace, the operation must be allowed by the effective permission mask and by your user permissions. The permissions for what you can do in a namespace come from your user account (if you’re an authenticated user), the namespace configuration, and, for individual objects, the object ACL.

For information on the permissions that can be granted by an ACL, see ACL permissions.

Note: ACLs are enabled on a per-namespace basis. In namespaces where ACLs are enabled, the namespace can be configured to either enforce or ignore the permissions granted by ACLs. To find out the ACL settings for a namespace, contact your tenant administrator.

Replication


Replication is a process that supports configurations in which selected tenants and namespaces are maintained on two or more HCP systems and the objects in those namespaces are managed across those systems. This cross-system management helps ensure that data is well-protected against the unavailability or catastrophic failure of a system.

A replication topology is a configuration of HCP systems that are related to each other through replication. Typically, the systems in a replication topology are in separate geographic locations and are connected by a high-speed wide area network. This arrangement provides geographically distributed data protection (called geo-protection).

You can read from namespaces on all systems where those namespaces are replicated. The replication topology, which is configured at the system level, determines the systems on which you can write to namespaces.

Replication has several purposes, including:

If a system in a replication topology becomes unavailable (for example, due to network issues), another system in the topology can provide continued data availability.

If a system in a replication topology suffers irreparable damage, another system in the topology can serve as a source for disaster recovery.

If multiple HCP systems are widely separated geographically, each system may be able to provide faster data access for some applications than the other systems can, depending on where the applications are running.

If an object cannot be read from one system in a replication topology (for example, because a node is unavailable), HCP can try to read it from another system in the topology. Whether HCP tries to do this depends on the namespace configuration.

If a system in a replication topology is unavailable, HTTP requests to that system can be automatically serviced by another system in the topology. Whether HCP tries to do this depends on the namespace configuration.
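
The read-failover behavior can be pictured with the sketch below. It is a conceptual model only: each system is represented by a hypothetical callable, `allow_failover` stands in for the namespace configuration, and the real mechanism inside HCP is not described in this section.

```python
from typing import Callable, Sequence


class SystemUnavailable(Exception):
    """Raised by a stand-in system that cannot be reached."""


def read_with_failover(systems: Sequence[Callable[[str], bytes]],
                       key: str, allow_failover: bool = True) -> bytes:
    """Try each system in order and return the first successful read."""
    candidates = systems if allow_failover else systems[:1]
    last_error = None
    for read in candidates:
        try:
            return read(key)
        except (SystemUnavailable, KeyError) as exc:
            last_error = exc            # remember why this system failed
    raise LookupError(f"could not read {key!r}") from last_error


def make_system(objects: dict[str, bytes], available: bool = True):
    """Build a stand-in system backed by an in-memory dictionary."""
    def read(key: str) -> bytes:
        if not available:
            raise SystemUnavailable("system is unreachable")
        return objects[key]
    return read


primary = make_system({}, available=False)
replica = make_system({"doc.txt": b"hello"})
data = read_with_failover([primary, replica], "doc.txt")
```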

Operations


You use namespace access protocols to perform operations in a namespace. The operations you can perform depend on the protocol you’re using.

Operation restrictions


The operations you can perform in a namespace are subject to these restrictions:

The namespace access protocol must support the operation.

If the namespace access protocol requires client authentication, you must provide valid user credentials.

The namespace access protocol must be configured to allow access to the namespace from your client IP address.

Both the effective namespace permission mask and your permissions must allow the operation.

For information on user permissions, see Permissions.
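
The four restrictions can be read as a checklist that every request must pass. The sketch below models that checklist. The class, field, and function names are hypothetical, and expressing the allowed client addresses as CIDR networks is an assumption made for the example.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class ProtocolConfig:
    supported_operations: set   # operations this protocol can perform
    requires_auth: bool         # whether credentials are mandatory
    allowed_networks: list      # CIDR strings (assumed representation)


def blocked_reasons(operation: str, permission: str, config: ProtocolConfig,
                    client_ip: str, has_valid_credentials: bool,
                    effective_mask: set, user_permissions: set) -> list:
    """Return the restrictions a request violates (empty list = allowed)."""
    reasons = []
    if operation not in config.supported_operations:
        reasons.append("protocol does not support the operation")
    if config.requires_auth and not has_valid_credentials:
        reasons.append("valid credentials are required")
    address = ipaddress.ip_address(client_ip)
    if not any(address in ipaddress.ip_network(net)
               for net in config.allowed_networks):
        reasons.append("client IP address is not allowed")
    if permission not in effective_mask or permission not in user_permissions:
        reasons.append("permission is not granted")
    return reasons


rest = ProtocolConfig({"store", "retrieve", "delete"}, True, ["10.0.0.0/8"])
```

Calling `blocked_reasons("delete", "delete", rest, "10.1.2.3", True, {"read", "delete"}, {"read"})` reports only the missing permission, because the user does not hold delete.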

Supported operations


The table below lists the operations HCP supports and indicates which protocols you can use to perform those operations. Some operations relate to specific types of metadata. For more information on this metadata, see Object properties.

Operation (protocol columns: HTTP, WebDAV, CIFS, NFS)

[Table: the per-protocol support marks were images and are not reproduced here; only the operations are listed.]

Write data from files or memory to a namespace to create an object

Transmit data to and from HCP in gzip-compressed format

Check for object existence

View the content of an object

Copy an object

Store new versions of existing objects

Append to existing objects

Delete an object that’s not under retention

Delete an object that’s under retention if the namespace configuration allows it

View object metadata

View a metafile

Override default index, retention, and shred settings when storing an object

Change the retention setting for an object

Hold or release an object

Enable shredding for an object

Change the index setting for an object

Change object ownership (not related to POSIX UID)

Change the POSIX UID and GID for an object

Change the POSIX permission settings for an object

Change the POSIX atime or mtime value for an object

Store, replace, or delete an annotation for an object

Store, replace, or delete the default annotation for an object

Store or retrieve object data and an annotation in a single operation

Store or retrieve object data and the default annotation in a single operation

Check for annotation existence

Check for existence of the default annotation

List annotations

Read an annotation

Read the default annotation

Store, replace, or delete an ACL for an object

Check for ACL existence

Read ACLs

Create an empty directory in a namespace

View the namespace directory structure, not including metadirectories

View the namespace directory structure, including both directories and metadirectories

Rename an empty directory (unless atime synchronization is enabled)

Delete an empty directory

Create a symbolic link

Read through a symbolic link to an object

Delete a symbolic link

List the namespaces accessible to you

List namespace statistics

List namespace permissions for the user

List the retention classes available in the namespace

These considerations apply to symbolic links:

If you use CIFS to create a symbolic link, you can read through the link only with CIFS. You cannot use CIFS to read through a symbolic link created with NFS.

HTTP and WebDAV support for reading through symbolic links is limited to retrieving object data. Other HTTP and WebDAV operations on symbolic links may produce unexpected results.

Tip: You can use the HCP Search Console to delete, purge, hold, release, change ownership, or modify ACLs on multiple objects with a single operation.

Prohibited operations


HCP never lets you:

Rename an object.

Rename a nonempty directory.

Overwrite a successfully stored object. However, if versioning is enabled, you can write new versions of an object using the REST, S3 compatible, or HSwift API.

Modify the fixed-content portion of an object.

Delete a directory that contains one or more objects.

Shorten the retention period of an object.

Store a file (other than a file containing custom metadata), directory, or symbolic link anywhere in the metadata structure.

Delete a metafile (other than a metafile containing custom metadata) or metadirectory.

Create a hard link.
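
Two of these rules, no overwriting without versioning and no shortening of retention, can be modeled with a toy in-memory store. This is a sketch under simplifying assumptions: the class and method names are hypothetical, retention is tracked per path as a plain integer timestamp, and none of it reflects how HCP is implemented.

```python
class WormNamespace:
    """Toy write-once store illustrating two prohibited operations."""

    def __init__(self, versioning: bool = False):
        self.versioning = versioning
        self._versions = {}     # path -> list of stored versions
        self._retention = {}    # path -> retain-until timestamp

    def put(self, path: str, data: bytes, retain_until: int = 0) -> None:
        # A stored object cannot be overwritten; with versioning enabled,
        # the write becomes a new version instead.
        if path in self._versions and not self.versioning:
            raise PermissionError("object exists and versioning is disabled")
        self._versions.setdefault(path, []).append(bytes(data))
        self._retention[path] = max(self._retention.get(path, 0), retain_until)

    def set_retention(self, path: str, retain_until: int) -> None:
        # Retention can be extended but never shortened.
        if retain_until < self._retention[path]:
            raise PermissionError("retention period cannot be shortened")
        self._retention[path] = retain_until

    def get(self, path: str, version: int = -1) -> bytes:
        """Return a stored version (the current one by default)."""
        return self._versions[path][version]
```

With `WormNamespace(versioning=True)`, a second `put` to the same path adds a version, and `get(path, 0)` still returns the original data unchanged.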
