
Introduction to Hitachi Content Platform


Hitachi Content Platform (HCP) is a robust storage system designed to support large, growing repositories of fixed-content data. HCP stores objects that include both data and metadata that describes that data. Objects exist in buckets, which are logical partitions of the repository.

HCP provides access to the repository through a variety of industry-standard protocols, as well as through various HCP-specific interfaces. One of these interfaces is the Hitachi API for Amazon S3 — a RESTful, HTTP-based API that is compatible with Amazon S3.

This section of the Help introduces basic HCP concepts and includes information on what you can do with this S3 compatible API.

© 2015, 2019 Hitachi Vantara Corporation. All rights reserved.

About Hitachi Content Platform


Hitachi Content Platform is a combination of hardware and software that provides an object-based data storage environment. An HCP repository stores all types of data, from simple text files to medical images to multigigabyte database images.

HCP provides easy access to the repository for adding, retrieving, and deleting data. HCP uses write-once, read-many (WORM) storage technology and a variety of policies and internal processes to ensure the integrity and availability of the stored data.


Object-based storage


HCP stores objects in a repository. Each object permanently associates data HCP receives (for example, a document, an image, or a movie) with information about that data, called metadata.

An object encapsulates:

Fixed-content data — An exact digital reproduction of data as it existed before it was stored in HCP. Once it’s in the repository, this fixed-content data cannot be modified.

System metadata — System-managed properties that describe the fixed-content data (for example, its size and creation date). System metadata includes policies, such as retention, that influence how transactions and internal processes affect the object.

Custom metadata — Optional metadata that a user or application provides to further describe the object. Custom metadata is specified as one or more annotations, where each annotation is a discrete unit of information about the object.

You can use custom metadata to create self-describing objects. Users and applications can use this metadata to understand and repurpose object content.

Access control list (ACL) — Optional metadata consisting of a set of grants of permissions to perform various operations on the object. Permissions can be granted to individual users or to groups of users.

Like custom metadata, ACLs are provided by users or applications.

HCP can store multiple versions of an object, thus providing a history of how the data has changed over time. Each version is a separate object, with its own system metadata and, optionally, its own custom metadata and ACL.

HCP supports multipart uploads with the Hitachi API for Amazon S3. With a multipart upload, the data for an object is broken into multiple parts that are written to HCP independently of each other. Even though the data is written in multiple parts, the result of a multipart upload is a single object. An object for which the data is stored in multiple parts is called a multipart object.
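As a sketch of the arithmetic involved, the following shows how an object's data might be divided into byte ranges for upload as separate parts. The 5 MB part size is an assumption borrowed from the Amazon S3 convention, not a stated HCP limit.

```python
# Sketch: split an object's size into byte ranges for a multipart upload.
# The 5 MB part size is an assumption (the Amazon S3 minimum), not an
# HCP-documented value.
PART_SIZE = 5 * 1024 * 1024

def part_ranges(total_size, part_size=PART_SIZE):
    """Yield (part_number, start, end) byte ranges, 1-indexed like S3 parts."""
    ranges = []
    start = 0
    part_number = 1
    while start < total_size:
        end = min(start + part_size, total_size)
        ranges.append((part_number, start, end))
        start = end
        part_number += 1
    return ranges

# A 12 MB object becomes three parts: two full 5 MB parts and one 2 MB part.
print(part_ranges(12 * 1024 * 1024))
```

Each range would then be written with a separate PUT object upload part request, and the upload completed with a single POST object complete multipart upload request.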


Buckets and tenants


An HCP repository is partitioned into buckets. A bucket is a logical grouping of objects such that the objects in one bucket are not visible in any other bucket. Buckets are also called namespaces.

Buckets provide a mechanism for separating the data stored for different applications, business units, or customers. For example, you could have one bucket for accounts receivable and another for accounts payable.

Buckets also enable operations to work against selected subsets of objects. For example, you could perform a query that targets the accounts receivable and accounts payable buckets but not the employees bucket.

Buckets are owned and managed by administrative entities called tenants. A tenant typically corresponds to an organization, such as a company or a division or department within a company.

In addition to being owned by a tenant, each bucket can have an owner that corresponds to an individual HCP user. The owner of a bucket automatically has permission to perform certain operations on that bucket.


HCP nodes


The core hardware for an HCP system consists of servers that are networked together. These servers are called nodes.

When you access an HCP system, your point of access is an individual node. To identify the system, however, you can use either the domain name of the system or the IP address of an individual node. When you use the domain name, HCP selects the access node for you. This helps ensure an even distribution of the processing load.

For information on the URLs you can use to access an HCP system, see URLs for access to HCP. For information on when to use an IP address instead of a domain name, see Hostname and IP address considerations.
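To illustrate the two points of access, the sketch below assembles a bucket URL from either the system domain name or a node IP address. The bucket.tenant.system-domain hostname pattern and all names here are assumptions for illustration; see URLs for access to HCP for the authoritative forms.

```python
# Sketch: assemble a bucket URL from bucket, tenant, and system domain name.
# The bucket.tenant.system-domain pattern and all names are illustrative
# assumptions, not guaranteed HCP URL forms.
def bucket_url(bucket, tenant, system_domain, use_ssl=True):
    scheme = "https" if use_ssl else "http"
    return f"{scheme}://{bucket}.{tenant}.{system_domain}"

print(bucket_url("finance", "europe", "hcp.example.com"))
```

Using the domain-name form lets HCP choose the access node for you, which helps balance the processing load across nodes; substituting a node's IP address for the hostname pins the request to that node.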


Replication


Replication is a process that supports configurations in which selected tenants and buckets are maintained on two or more HCP systems and the objects in those buckets are managed across those systems. This cross-system management helps ensure that data is well-protected against the unavailability or catastrophic failure of a system.

A replication topology is a configuration of HCP systems that are related to each other through replication. Typically, the systems in a replication topology are in separate geographic locations and are connected by a high-speed wide area network. This arrangement provides geographically distributed data protection (called geo-protection).

You can read from buckets on all systems where those buckets are replicated. The replication topology, which is configured at the system level, determines the systems on which you can write to buckets.

Replication has several purposes, including:

If a system in a replication topology becomes unavailable (for example, due to network issues), another system in the topology can provide continued data availability.

If a system in a replication topology suffers irreparable damage, another system in the topology can serve as a source for disaster recovery.

If multiple HCP systems are widely separated geographically, each system may be able to provide faster data access for some applications than the other systems can, depending on where the applications are running.

If an object cannot be read from one system in a replication topology (for example, because a node is unavailable), HCP can try to read it from another system in the topology. Whether HCP tries to do this depends on the bucket configuration.

If a system in a replication topology is unavailable, HTTP requests to that system can be automatically serviced by another system in the topology. Whether HCP tries to do this depends on the bucket configuration.


About the Hitachi API for Amazon S3


The Hitachi API for Amazon S3 is a RESTful, HTTP-based API that is compatible with Amazon S3. Using this API, you can:

Create buckets (PUT bucket)

List the buckets you own (GET service)

Check the existence of a bucket (HEAD bucket)

Add ACLs to existing buckets (PUT bucket ACL)

Retrieve ACLs for buckets (GET bucket ACL)

Enable or suspend object versioning for buckets you own (PUT bucket versioning)

Check the status of object versioning for buckets you own (GET bucket versioning)

List objects that are in a bucket (GET bucket)

List versions of objects that are in a bucket (GET bucket versions)

List in-progress multipart uploads in a bucket (GET bucket list multipart uploads)

Delete buckets you own, as long as the buckets don’t have any objects in them (DELETE bucket)

Store objects in a bucket (PUT object)

Create folders in a bucket (PUT object)

Add custom metadata to existing objects, where the custom metadata is specified as property/value pairs (PUT object copy)

Check the existence of an object or folder (HEAD object)

Retrieve custom metadata for objects (HEAD object)

Add ACLs to existing objects (PUT object ACL)

Retrieve ACLs for objects (GET object ACL)

Copy objects (PUT object copy)

Retrieve objects (GET object)

Delete objects (DELETE object)

Perform multipart uploads (POST object initiate multipart upload, PUT object upload part, PUT object upload part copy, and POST object complete multipart upload)

List the parts of in-progress multipart uploads (GET object list parts)

Abort multipart uploads (DELETE object abort multipart upload)

To use the S3 compatible API to perform the operations listed above, you can write applications that use any standard HTTP client library. The S3 compatible API is also compatible with many third-party tools that support Amazon S3. For information on configuring third-party tools for use with the S3 compatible API, see Using third-party tools with the Hitachi API for Amazon S3.
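For example, an application using a plain HTTP client library must compute the request signature itself. The following sketches the AWS signature version 2 calculation that tools such as s3curl implement; whether your HCP release accepts this authentication form is an assumption to verify against the Authentication topic.

```python
# Sketch of the AWS signature version 2 scheme used by tools such as s3curl.
# Acceptance of this auth form by a given HCP release, and the key values
# shown, are assumptions for illustration.
import base64
import hashlib
import hmac

def sign_v2(secret_key, method, date, resource,
            content_md5="", content_type=""):
    """Return the base64 HMAC-SHA1 signature for an S3 v2 request."""
    string_to_sign = "\n".join(
        [method, content_md5, content_type, date, resource])
    digest = hmac.new(secret_key.encode(), string_to_sign.encode(),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode()

# Example: sign a GET service (list buckets) request.
sig = sign_v2("my-secret-key", "GET",
              "Thu, 17 Nov 2019 20:14:27 +0000", "/")
print("Authorization: AWS my-access-key:" + sig)
```

The resulting header accompanies the request along with a matching Date header; an incorrect signature causes the request to be rejected as unauthenticated.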


Other bucket access methods


HCP allows access to bucket (namespace) content through:

Several namespace access protocols

The Namespace Browser

A metadata query API

The Search Console

HCP Data Migrator


Namespace access protocols


Along with the S3 compatible API, HCP supports access to namespace content through several industry-standard protocols: a RESTful, HTTP-based API (called simply REST), as well as WebDAV, CIFS, and NFS. HCP also supports access to namespace content through an OpenStack® Swift-compatible API called HSwift.

Using the supported protocols, you can access namespaces programmatically with applications, interactively with a command-line tool, or through a GUI. You can use these protocols to perform actions such as storing objects in a namespace, viewing and retrieving objects, changing object metadata, and deleting objects.

HCP allows special-purpose access to namespaces through the SMTP protocol. This protocol is used only for storing email.

The namespace access protocols are configured separately for each namespace and are enabled or disabled independently of each other.

When you use the S3 compatible API to create a namespace (bucket), both the S3 compatible API and the REST API are automatically enabled for that namespace. Additionally, both the HTTP and HTTPS ports are open for both protocols (that is, the namespace can be accessed with or without SSL security).

Tenant administrators can enable and disable namespace access protocols for any namespace. This includes enabling the S3 compatible API for namespaces created through other HCP interfaces and disabling the S3 compatible API for namespaces created using the S3 compatible API.


Tip: You can ask your tenant administrator to close the HTTP port for the namespaces you create, thereby allowing only secure access to those namespaces.

Objects added to a namespace through any protocol, including the S3 compatible API, are immediately accessible through any other protocol that’s enabled for the namespace.

For information on using namespace access protocols other than the S3 compatible API, see Namespace access protocols and About the HCP HSwift API.


HCP Namespace Browser


The Namespace Browser lets you manage content in and view information about HCP namespaces. With the Namespace Browser, you can:

Store objects

List, view, retrieve, and delete objects, including old versions of objects

View custom metadata and ACLs for objects, including old versions of objects

Create empty directories

Display namespace information

For more information on the Namespace Browser, see About the Namespace Browser.


HCP metadata query API


The HCP metadata query API lets you search HCP for objects that meet specified criteria. The API supports two types of queries:

Object-based queries search for objects based on object metadata. This includes both system metadata and the content of custom metadata and ACLs. The query criteria can also include the object location (that is, the namespace and/or directory that contains the object). These queries use a robust query language that lets you combine search criteria in multiple ways.

Object-based queries search only for objects that currently exist in the repository. For objects with multiple versions, object-based queries return only the current version.

Operation-based queries search not only for objects currently in the repository but also for information about objects that have been deleted. For namespaces that support versioning, operation-based queries can return both current and old versions of objects.

Criteria for operation-based queries can include object status (for example, created or deleted), change time, index setting, and location.

The metadata query API returns object metadata only, not object data. The metadata is returned either in XML format, with each object represented by a separate element, or in JSON format, with each object represented by a separate name/value pair. For queries that return large numbers of objects, you can use paged requests.

For information on using the metadata query API, see Introduction to the HCP metadata query API.
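As an illustration of working with JSON-format results, a client might unpack one page of a response like this. The response structure and field names shown are assumptions for illustration, not the documented query API schema.

```python
# Sketch: read object names out of one page of JSON query results.
# The payload shape and field names here are illustrative assumptions,
# not the documented HCP metadata query API schema.
import json

sample_page = json.loads("""
{
  "queryResult": {
    "resultSet": [
      {"urlName": "quarterly_rpts/Q2_2018.ppt", "size": 52224},
      {"urlName": "quarterly_rpts/Q3_2018.ppt", "size": 61440}
    ],
    "status": {"results": 2, "code": "COMPLETE"}
  }
}
""")

def object_names(page):
    """Collect object names from one page of query results."""
    return [obj["urlName"] for obj in page["queryResult"]["resultSet"]]

print(object_names(sample_page))
```

For large result sets, a paged client would re-issue the query for the next page until the response indicates the result set is complete.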


HCP Search Console


The HCP Search Console is an easy-to-use web application that lets you search for and manage objects based on specified criteria. For example, you can search for objects that were stored before a certain date or that are larger than a specified size. You can then delete the objects listed in the search results or prevent those objects from being deleted. Similar to the metadata query API, the Search Console returns only object metadata, not object data.

By offering a structured environment for performing searches, the Search Console facilitates e-discovery, namespace analysis, and other activities that require the user to examine the contents of namespaces. From the Search Console, you can:

Open objects

Perform bulk operations on objects

Export search results in standard file formats for use as input to other applications

Publish feeds to make search results available to web users

The Search Console works with either of these two search facilities:

The HCP metadata query engine — This facility is integrated with HCP and works internally to perform searches and return results to the Search Console. The metadata query engine is also used by the metadata query API.


Note: When working with the metadata query engine, the Search Console is called the Metadata Query Engine Console.

The Hitachi Data Discovery Suite (DDS) search facility — This facility interacts with HDDS, which performs searches and returns results to the HCP Search Console. HDDS is a separate product from HCP.

The Search Console can use only one search facility at any given time. The search facility is selected at the HCP system level. If no facility is selected, the HCP system does not support use of the Search Console to search namespaces.

Each search facility maintains its own index of objects in each search-enabled namespace and uses this index for fast retrieval of search results. The search facilities automatically update their indexes to account for new and deleted objects and changes to object metadata.

For more information on using the Search Console, see About searching namespaces.


Note: Not all namespaces support search. To learn whether a namespace is search-enabled, see your tenant administrator.


HCP Data Migrator


HCP Data Migrator (HCP-DM) is a high-performance, multithreaded, client-side utility for viewing, copying, and deleting data. With HCP-DM, you can:

Copy objects, files, and directories between the local file system, HCP namespaces, default namespaces, and earlier HCAP archives

Delete individual objects, files, and directories and perform bulk delete operations

View the content of current and old versions of objects and the content of files

Purge all versions of an object

Rename files and directories on the local file system

View object, file, and directory properties

Change system metadata for multiple objects in a single operation

Add, replace, or delete custom metadata for objects

Add, replace, or delete ACLs for objects

Create empty directories

HCP-DM has both a graphical user interface (GUI) and a command-line interface (CLI).

For information on installing and using HCP-DM, see Using HCP Data Migrator.


User accounts


To use the S3 compatible API to create and manage buckets, you need a user account that’s configured to allow you to take those actions. To work with objects in a bucket, you may or may not need a user account. This depends on how the S3 compatible API is configured for the bucket.

By default, when you create a bucket, both the S3 compatible API and the REST API are configured to require users to have user accounts in order to work with objects in that bucket. You cannot use the S3 compatible API to change this configuration. However, tenant administrators can change this configuration for the buckets you create.

A user account can be either an account created by a tenant administrator in HCP or, if the tenant is configured to support Active Directory® (AD) authentication, an AD user account that HCP recognizes. (With an AD user account, you cannot create buckets.)

When you use the S3 compatible API with a user account, you provide credentials that are based on the username and password for your account. HCP checks these credentials to ensure that they are valid. The process of checking credentials is called user authentication. If the credentials you supply are valid, you are an authenticated user.
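One commonly described derivation of those credentials is an access key formed from the base64-encoded username and a secret key formed from the MD5 hash of the password. Treat this derivation as an assumption to confirm for your HCP release and tenant configuration.

```python
# Sketch of one commonly described HCP credential derivation: access key as
# the base64-encoded username, secret key as the MD5 hash of the password.
# This scheme is an assumption to verify against your HCP release's docs.
import base64
import hashlib

def hcp_s3_credentials(username, password):
    access_key = base64.b64encode(username.encode()).decode()
    secret_key = hashlib.md5(password.encode()).hexdigest()
    return access_key, secret_key

# Hypothetical account, for illustration only.
ak, sk = hcp_s3_credentials("lgreen", "p4ssw0rd")
print(ak, sk)
```

Whatever the derivation, the resulting pair is used wherever an S3 tool asks for an access key and secret key, and HCP validates it during user authentication.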

When you use the S3 compatible API without a user account, you are an anonymous user.

For more information on using the S3 compatible API with or without a user account, see Authentication.


Note: If the S3 compatible API is not working for you, the reason may be either that the tenant is not configured to support the API or that your user account is not configured to allow the operation you’re trying to perform. To resolve the problem, contact your tenant administrator.


Data access permissions


Data access permissions allow you to access bucket content through the various HCP interfaces. You get these permissions either from your user account or from the bucket configuration.

Data access permissions are bucket specific. That is, they are granted separately for individual buckets.

Each data access permission allows you to perform certain operations. However, not all operations allowed by data access permissions apply to every HCP interface. For example, you can view and retrieve ACLs through the REST API and the S3 compatible API but not through any other namespace access protocol.

Although many of the operations allowed by data access permissions are not supported by the S3 compatible API, a tenant administrator can give you permission for those operations. You can then perform them through other HCP interfaces that support them.

The data access permissions that you can have for a bucket are:

Browse — Lets you list bucket contents.

Read — Lets you:

- View and retrieve objects in the bucket, including the system and custom metadata for objects

- View and retrieve previous versions of objects

- List annotations for objects

- Check the existence of objects

Users with read permission also have browse permission.

Read ACL — Lets you view and retrieve bucket and object ACLs.

Write — Lets you:

- Add objects to the bucket

- Modify system metadata (except retention hold) for objects in the bucket

- Add or replace custom metadata for objects in the bucket

Write ACL — Lets you add, replace, and delete bucket and object ACLs.

Change owner — Lets you change the bucket owner and the owners of objects in the bucket.

Delete — Lets you delete objects, custom metadata, and bucket and object ACLs.

Purge — Lets you delete all versions of an object with a single operation. Users with purge permission also have delete permission.

Privileged — Lets you:

- Delete or purge objects that are under retention, provided that you also have delete or purge permission for the bucket

- Hold or release objects, provided that you also have write permission for the bucket

Search — Lets you use the HCP metadata query API and the HCP Search Console to query or search the bucket for objects that meet specified criteria. Users with search permission also have read permission.

If you have any data access permissions for a bucket, you can view information about that bucket through the HTTP protocol and the Namespace Browser.

For more information on:

Bucket and object ACLs, see Access control lists

Object versions, see Versioning

Object owners, see Object owners

Object retention and hold, see Retention

The HCP Search Console, see HCP Search Console


Examples in this book


This book contains instructions and examples for using the S3 compatible API to perform the operations listed in About the Hitachi API for Amazon S3. The examples use a command-line tool called s3curl. s3curl is freely available open-source software. You can download it from http://aws.amazon.com/code/128.

After downloading s3curl, you need to configure it to work with HCP. For information on doing this, see Using third-party tools with the Hitachi API for Amazon S3.

The examples in this book are based on a bucket named finance in which these objects are stored:

AcctgBestPractices.doc (four versions stored and one deleted)
acctg/AcctgRR-Summary
acctg/budget_proposals/BudgProp-2019
hum_res/budget_proposals/BudgProp-2019
mktg/budget_proposals/BudgProp-2019
mktg/campaign_GoGetEm_expenses.xls (two versions stored)
mktg/campaign_LiveIt_expenses.xls
quarterly_rpts/Q2_2018.ppt
quarterly_rpts/Q3_2018.ppt
quarterly_rpts/Q4_2018.ppt
sales/budget_proposals/BudgProp-2019
sales_quotas_2019.pdf

The finance bucket also contains in-progress multipart uploads for these objects:

acctg/AcctgAtExampleCorp-Advanced.mov
acctg/AcctgAtExampleCorp-Introduction.mov
acctg/RulesAndRegulations.pdf
sales/RulesAndRegulations.pdf

For information on multipart uploads, see Working with multipart uploads.
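The folder-style names in the lists above illustrate how an S3-style listing treats "/" as a delimiter, separating top-level objects from folder-like common prefixes. A minimal sketch of that grouping, using a few of the finance bucket's object names:

```python
# Sketch: group example object names by their top-level "folder", mirroring
# how an S3-style bucket listing uses "/" as a delimiter. The delimiter
# handling shown is a simplified illustration of that listing behavior.
keys = [
    "AcctgBestPractices.doc",
    "acctg/AcctgRR-Summary",
    "acctg/budget_proposals/BudgProp-2019",
    "quarterly_rpts/Q2_2018.ppt",
    "sales_quotas_2019.pdf",
]

def common_prefixes(keys, delimiter="/"):
    """Return (top-level objects, folder-like common prefixes)."""
    objects, prefixes = [], set()
    for key in keys:
        if delimiter in key:
            prefixes.add(key.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(prefixes)

print(common_prefixes(keys))
```

A GET bucket request with a delimiter parameter performs this separation on the server side, so a client can browse a bucket folder by folder instead of fetching every key at once.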
