Introduction to Hitachi Content Platform
Hitachi Content Platform (HCP) is a distributed storage system designed to support large, growing repositories of fixed-content data. HCP stores objects that include both data and metadata that describes the data.
HCP provides access to stored objects through a variety of industry-standard protocols, as well as through various HCP-specific interfaces.
This chapter introduces basic HCP concepts and includes information on what you can do with an HCP namespace.
Trademarks and Legal Disclaimer
© 2015, 2019 Hitachi Vantara Corporation. All rights reserved.
About Hitachi Content Platform
Hitachi Content Platform is a combination of hardware and software that provides an object-based data storage environment. An HCP repository stores all types of data, from simple text files to medical images to multigigabyte database images.
HCP provides easy access to the repository for adding, retrieving, and deleting data. HCP uses write-once, read-many (WORM) storage technology and a variety of policies and internal processes to ensure the integrity of the stored data and the efficient use of storage capacity.
Object-based storage
HCP stores objects in the repository. Each object permanently associates data HCP receives (for example, a document, an image, or a movie) with information about that data, called metadata.
An object encapsulates:
•Fixed-content data — An exact digital reproduction of data as it existed before it was stored. Once it’s in the repository, this fixed-content data cannot be modified.
•System metadata — System-managed properties that describe the fixed-content data (for example, its size and creation date). System metadata consists of POSIX metadata as well as HCP-specific settings, such as retention and data protection level, that influence how transactions and internal processes affect the object.
•Custom metadata — Optional metadata that a user or application provides to further describe the object. Custom metadata is specified as one or more annotations, where each annotation is a discrete unit of information about the object. Annotations are typically specified in XML format.
You can use custom metadata to create self-describing objects. Users and applications can use this metadata to understand and repurpose the object content.
•Access control list (ACL) — Optional metadata consisting of a set of grants of permissions to perform various operations on the object. Permissions can be granted to individual users or to groups of users.
ACLs are provided by users or applications and are specified in either XML or JSON format.
HCP can store multiple versions of an object, thus providing a history of how the data has changed over time. Each version is a separate object, with its own system metadata and, optionally, its own custom metadata and ACL.
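As a rough illustration of the object model described above, the following Python sketch models an object as fixed-content data plus its system metadata, custom metadata, and ACL, and models versioning as a list of separate objects. The class and field names (`StoredObject`, `VersionHistory`, `annotations`, and so on) are hypothetical and are not part of any HCP interface.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class StoredObject:
    """Illustrative model of one object version (all names are hypothetical)."""
    data: bytes                                   # fixed-content data
    system_metadata: Dict[str, str]               # for example, size and retention
    annotations: Dict[str, str] = field(default_factory=dict)  # custom metadata
    acl: Optional[List[dict]] = None              # optional permission grants


@dataclass
class VersionHistory:
    """Each stored version is a separate object with its own metadata."""
    versions: List[StoredObject] = field(default_factory=list)

    def store(self, obj: StoredObject) -> None:
        # A new version is appended; earlier versions are left untouched.
        self.versions.append(obj)

    def current(self) -> StoredObject:
        return self.versions[-1]


history = VersionHistory()
history.store(StoredObject(b"draft", {"size": "5"}))
history.store(StoredObject(b"final", {"size": "5"},
                           {"default": "<note>approved</note>"}))
```

The frozen dataclass mirrors the rule that fixed-content data cannot be modified once stored: a change is expressed as a new version, not as an update to an existing one.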
HCP supports multipart uploads with the Hitachi API for Amazon S3. With a multipart upload, the data for an object is broken into multiple parts that are written to HCP independently of each other. Even though the data is written in multiple parts, the result of a multipart upload is a single object. An object for which the data is stored in multiple parts is called a multipart object.
Note: Multipart uploads are possible only with the S3 compatible API, but objects created by multipart uploads can be managed and retrieved with the REST and HSwift namespace access protocols. Such objects cannot be managed or retrieved with the WebDAV, CIFS, and NFS protocols.
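The idea behind a multipart upload can be sketched in a few lines of Python. This is a local illustration only: it does not call the S3 compatible API, and the function names and the tiny part size are assumptions chosen for readability.

```python
from typing import List


def split_into_parts(data: bytes, part_size: int) -> List[bytes]:
    """Break object data into consecutive parts of at most part_size bytes."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]


def assemble(parts: List[bytes]) -> bytes:
    """Join the parts, in order, into the data of a single object."""
    return b"".join(parts)


payload = b"0123456789"
parts = split_into_parts(payload, part_size=4)  # three parts: 4, 4, and 2 bytes
```

Each part could be written independently of the others, but `assemble(parts)` yields the original data, which corresponds to the single object that results from the upload.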
In addition to objects, HCP also stores directories and symbolic links in the repository. Directories and symbolic links have POSIX metadata but no fixed-content data, HCP-specific metadata, custom metadata, or ACLs.
Note: An object is equivalent to a WebDAV resource. A directory is equivalent to a WebDAV collection.
Namespaces and tenants
An HCP repository is partitioned into namespaces. A namespace is a logical grouping of objects such that the objects in one namespace are not visible in any other namespace.
Namespaces provide a mechanism for separating the data stored for different applications, business units, or customers. For example, you could have one namespace for accounts receivable and another for accounts payable.
Namespaces also enable operations to work against selected subsets of repository objects. For example, you could perform a query that targets the accounts receivable and accounts payable namespaces but not the employees namespace.
Namespaces are owned and managed by administrative entities called tenants. A tenant typically corresponds to an organization such as a company or a division or department within a company.
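The following Python sketch illustrates how namespaces let an operation target a selected subset of repository objects. The in-memory `repository` dictionary, the namespace names, and the `query` function are hypothetical stand-ins, not HCP interfaces.

```python
from typing import Dict, Iterable, List

# Hypothetical in-memory repository: namespace name -> object names.
repository: Dict[str, List[str]] = {
    "accounts-receivable": ["invoice-001.pdf", "invoice-002.pdf"],
    "accounts-payable": ["bill-001.pdf"],
    "employees": ["handbook.pdf"],
}


def query(repo: Dict[str, List[str]], namespaces: Iterable[str],
          suffix: str) -> List[str]:
    """Return matching objects from the selected namespaces only."""
    results = []
    for ns in namespaces:
        for name in repo.get(ns, []):
            if name.endswith(suffix):
                results.append(f"{ns}/{name}")
    return results


finance_pdfs = query(repository,
                     ["accounts-receivable", "accounts-payable"], ".pdf")
```

Because the employees namespace is not among the targets, none of its objects appear in `finance_pdfs`, even though one of them matches the search criterion.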
Namespace access
HCP supports access to namespace content through:
•Several industry-standard protocols
•The HCP Namespace Browser
•The HCP metadata query API
•The HCP Search Console
•HCP Data Migrator
Namespace access protocols
HCP supports access to namespace content through several industry-standard protocols:
•A RESTful HTTP API (called REST in HCP)
•Hitachi API for Amazon S3, which is a RESTful, HTTP-based S3 compatible API
•HSwift, which is a RESTful, HTTP-based API that’s compatible with OpenStack Swift
•WebDAV
•CIFS
•NFS
These protocols support various operations: storing data, creating directories, viewing object data and metadata, viewing directories, modifying certain metadata, and deleting objects. You can use these protocols to access data with a web browser, third-party applications, Windows Explorer, and other native Windows and Unix tools.
HCP can be configured to allow special-purpose access to a namespace through the SMTP protocol. This protocol is used only for storing email. HCP automatically generates directory paths and object names for email objects stored this way. For information on how HCP names email objects, see Naming conventions for email objects.
Objects added to the namespace through any protocol are immediately accessible through any other protocol.
The namespace access protocols are configured separately for each namespace and are enabled or disabled independently of each other. If you cannot access a namespace through a given protocol, you can ask your tenant administrator to enable that protocol.
The REST, S3 compatible, and CIFS protocols can be configured to require authentication. To use a protocol that requires authentication, users and applications must present valid credentials for access to the namespace.
If the REST, S3 compatible, or CIFS protocol is enabled but is not configured to require authentication, users and applications can access the namespace anonymously, without presenting any credentials.
The WebDAV and NFS protocols do not support authenticated access; when you use these protocols, you always access the namespace anonymously. For the WebDAV protocol, this is true even when WebDAV basic authentication is enabled for the namespace.
To find out the authentication requirements for the namespace you want to access, contact your tenant administrator.
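The authentication rules above can be summarized as a small decision function. This is a minimal sketch that covers only the protocols whose authentication behavior is described here; the function name and the protocol labels are illustrative.

```python
# Protocols that can be configured to require authentication.
AUTH_CAPABLE = {"REST", "S3", "CIFS"}
# Protocols that are always used anonymously.
ANONYMOUS_ONLY = {"WebDAV", "NFS"}


def credentials_required(protocol: str, configured_to_require_auth: bool) -> bool:
    """Return True if a client must present valid credentials."""
    if protocol in ANONYMOUS_ONLY:
        # Access is anonymous regardless of configuration, including
        # WebDAV basic authentication.
        return False
    if protocol in AUTH_CAPABLE:
        return configured_to_require_auth
    raise ValueError(f"protocol not covered by this sketch: {protocol}")
```

For example, `credentials_required("CIFS", False)` is `False` (anonymous access), while `credentials_required("REST", True)` is `True`.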
For more information on:
•The REST API, see HTTP
•The S3 compatible API, see Using the Hitachi API for Amazon S3
•The HSwift API, see Using the HCP OpenStack HSwift API
•The WebDAV protocol, see WebDAV
•WebDAV basic authentication, see Basic authentication with WebDAV
•The CIFS protocol, see CIFS
•The NFS protocol, see NFS
The figure below shows the relationship between original data, objects in a namespace, and the supported namespace access protocols.
HCP Namespace Browser
The HCP Namespace Browser lets you manage content in and view information about namespaces. With the Namespace Browser, you can:
•List, view, and retrieve objects, including old versions of objects
•View custom metadata and ACLs for objects, including old versions of objects
•Store and delete objects
•Create empty directories
•Display namespace information, including:
oThe namespaces that you own or can access
oRetention classes available for a given namespace
oPermissions for namespace access
oNamespace statistics such as the number of objects in a given namespace or the total capacity of the namespace
For information on using the Namespace Browser, see Namespace Browser.
HCP metadata query API
The HCP metadata query API lets you search HCP for objects that meet specified criteria. The API supports two types of queries:
•Object-based queries search for objects based on object metadata. This includes both system metadata and the content of custom metadata and ACLs. The query criteria can also include the object location (that is, the namespace and/or directory that contains the object). These queries use a robust query language that lets you combine search criteria in multiple ways.
Object-based queries search only for objects that currently exist in the repository. For objects with multiple versions, object-based queries return only the current version.
•Operation-based queries search not only for objects currently in the repository but also for information about objects that have been deleted by a user or application, deleted through disposition, purged, or pruned. For namespaces that support versioning, operation-based queries can return both current and old versions of objects.
Criteria for operation-based queries can include object status (for example, created or deleted), change time, index setting, and location.
The metadata query API returns object metadata only, not object data. The metadata is returned either in XML format, with each object represented by a separate element, or in JSON format, with each object represented by a separate name/value pair. For queries that return large numbers of objects, you can use paged requests.
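To make the difference between the two query types concrete, the following Python sketch runs both against a small in-memory list of records and then pages through the results. It does not use the metadata query API or its request syntax; the record layout and the function names are assumptions made for illustration.

```python
from typing import Dict, Iterator, List

# Hypothetical operation records, assumed to be in change-time order.
RECORDS: List[Dict[str, str]] = [
    {"path": "/docs/a.txt", "version": "1", "status": "created"},
    {"path": "/docs/a.txt", "version": "2", "status": "created"},
    {"path": "/docs/b.txt", "version": "1", "status": "created"},
    {"path": "/docs/b.txt", "version": "1", "status": "deleted"},
]


def operation_based(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return every record, including old versions and deletions."""
    return list(records)


def object_based(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return only the current version of objects that still exist."""
    latest: Dict[str, Dict[str, str]] = {}
    for rec in records:
        latest[rec["path"]] = rec  # a later record replaces an earlier one
    return [rec for rec in latest.values() if rec["status"] != "deleted"]


def pages(results: List[Dict[str, str]],
          page_size: int) -> Iterator[List[Dict[str, str]]]:
    """Yield results in fixed-size pages, as a paged request would."""
    for start in range(0, len(results), page_size):
        yield results[start:start + page_size]
```

With this data, the object-based query returns only version 2 of `/docs/a.txt`, while the operation-based query returns all four records, including the deletion of `/docs/b.txt`.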
HCP Search Console
The HCP Search Console is an easy-to-use web application that lets you search for and manage objects based on specified criteria. For example, you can search for objects that were stored before a certain date or that are larger than a specified size. You can then delete the objects listed in the search results or prevent those objects from being deleted. Similar to the metadata query API, the Search Console returns only object metadata, not object data.
By offering a structured environment for performing searches, the Search Console facilitates e-discovery, namespace analysis, and other activities that require the user to examine the contents of namespaces. From the Search Console, you can:
•Open objects
•Perform bulk operations on objects
•Export search results in standard file formats for use as input to other applications
•Publish feeds to make search results available to web users
The Search Console works with either of these two search facilities:
•The HCP metadata query engine — This facility is integrated with HCP and works internally to perform searches and return results to the Search Console. The metadata query engine is also used by the metadata query API.
Note: When working with the metadata query engine, the Search Console is called the Metadata Query Engine Console.
•The Hitachi Data Discovery Suite (HDDS) search facility — This facility interacts with HDDS, which performs searches and returns results to the HCP Search Console. HDDS is a separate product from HCP.
The Search Console can use only one search facility at any given time. The search facility is selected at the HCP system level. If no facility is selected, the HCP system does not support use of the Search Console to search namespaces.
Each search facility maintains its own index of objects in each search-enabled namespace and uses this index for fast retrieval of search results. The search facilities automatically update their indexes to account for new and deleted objects and changes to object metadata.
Note: Not all namespaces support search. To find out whether a namespace is search-enabled, see your tenant administrator.
HCP Data Migrator
HCP Data Migrator (HCP-DM) is a high-performance, multithreaded, client-side utility for viewing, copying, and deleting data. With HCP-DM, you can:
•Copy objects, files, and directories between the local file system, HCP namespaces, default namespaces, and earlier HCAP archives
•Delete individual objects, files, and directories and perform bulk delete operations
•View the content of current and old versions of objects and the content of files
•Purge all versions of an object
•Rename files and directories on the local file system
•View object, file, and directory properties
•Change system metadata for multiple objects in a single operation
•Add, replace, or delete custom metadata for objects
•Add, replace, or delete ACLs for objects
•Create empty directories
HCP-DM has both a graphical user interface (GUI) and a command-line interface (CLI).
For information on installing and using HCP-DM, see Using HCP Data Migrator.
HCP nodes
The core hardware for an HCP system consists of servers that are networked together. These servers are called nodes.
When you access an HCP system, your point of access is an individual node. To identify the system, however, you can use either the domain name of the system or the IP address of an individual node. When you use the domain name, HCP selects the access node for you. This helps ensure an even distribution of the processing load.
For information on when to use an IP address instead of the domain name, see Hostname and IP address considerations.
Permissions
To access a namespace and take action in it, clients must have the necessary permissions. The table below describes the possible permissions and the operations they allow.
Permission | Operations |
---|---|
Browse | •List directory contents. •Check for directory existence. |
Read | •Retrieve objects and system metadata. •Check for object existence. •List annotations. •Check for and retrieve annotations. Read operations also require browse permission. |
Read ACL | •Check for and retrieve ACLs. |
Write | •Store objects. •Create directories. •Modify system metadata. •Add and replace annotations. |
Write ACL | •Add, replace, and delete ACLs. |
Delete | •Delete objects, empty directories, annotations, and ACLs. |
Purge | •Delete objects and their old versions (also requires delete permission). |
Privileged | •Delete or purge objects regardless of retention (also requires delete or purge permissions). •Place objects on hold or release objects from hold (also requires write permission). |
Change owner | •Change object owners. |
Search | •Search for objects (also requires browse and read permissions). For information on searching for objects, see HCP Metadata Query API Reference and Searching Namespaces. |
Note: When using the CIFS protocol with a Windows client, you need both read and write permissions to store objects.
Data access permission mask
The operations allowed in a namespace are determined by a data access permission mask for the namespace. Data access permission masks are set at the system, tenant, and namespace levels.
The effective permissions for a namespace are the operations that are allowed by the mask at all three levels. That is, to be in effect for a namespace, a permission must be included in the system-level permission mask, the tenant-level permission mask, and the namespace-level permission mask.
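In set terms, the effective permissions are the intersection of the three masks. The following Python sketch shows this with hypothetical mask contents; the lowercase permission names and the function name are illustrative.

```python
from typing import FrozenSet, Iterable


def effective_permissions(system_mask: Iterable[str],
                          tenant_mask: Iterable[str],
                          namespace_mask: Iterable[str]) -> FrozenSet[str]:
    """Keep only the permissions that appear in the mask at all three levels."""
    return frozenset(system_mask) & frozenset(tenant_mask) & frozenset(namespace_mask)


# Hypothetical example masks.
system_mask = {"browse", "read", "write", "delete", "purge", "search"}
tenant_mask = {"browse", "read", "write", "delete"}
namespace_mask = {"browse", "read", "write", "search"}

effective = effective_permissions(system_mask, tenant_mask, namespace_mask)
```

Here `delete` is dropped because the namespace-level mask omits it, and `search` is dropped because the tenant-level mask omits it, leaving browse, read, and write in effect.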
User permissions
To perform an operation in a namespace, the operation must be allowed by the effective permission mask and by your user permissions. The permissions for what you can do in a namespace come from your user account (if you’re an authenticated user), the namespace configuration, and, for individual objects, the object ACL.
For information on the permissions that can be granted by an ACL, see ACL permissions.
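The combined check can be sketched as follows. The prerequisite map is taken from the permissions table (read also requires browse, purge also requires delete, search also requires browse and read). The sketch assumes that prerequisite permissions must pass both checks as well, which this section does not state explicitly, and it leaves out the privileged permission because its prerequisites depend on the operation. All names are illustrative.

```python
from typing import Dict, Iterable, Set

# Additional permissions that an operation also requires.
PREREQUISITES: Dict[str, Set[str]] = {
    "read": {"browse"},
    "purge": {"delete"},
    "search": {"browse", "read"},
}


def can_perform(permission: str,
                effective_mask: Iterable[str],
                user_permissions: Iterable[str]) -> bool:
    """Check a permission against both the effective mask and the user."""
    needed = {permission} | PREREQUISITES.get(permission, set())
    # Assumption: every needed permission must be allowed by both sets.
    return needed <= set(effective_mask) and needed <= set(user_permissions)
```

For example, a user who holds only read permission cannot retrieve objects in this model, because read operations also require browse permission.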
Replication
Replication is a process that supports configurations in which selected tenants and namespaces are maintained on two or more HCP systems and the objects in those namespaces are managed across those systems. This cross-system management helps ensure that data is well-protected against the unavailability or catastrophic failure of a system.
A replication topology is a configuration of HCP systems that are related to each other through replication. Typically, the systems in a replication topology are in separate geographic locations and are connected by a high-speed wide area network. This arrangement provides geographically distributed data protection (called geo-protection).
You can read from namespaces on all systems where those namespaces are replicated. The replication topology, which is configured at the system level, determines the systems on which you can write to namespaces.
Replication has several purposes, including:
•If a system in a replication topology becomes unavailable (for example, due to network issues), another system in the topology can provide continued data availability.
•If a system in a replication topology suffers irreparable damage, another system in the topology can serve as a source for disaster recovery.
•If multiple HCP systems are widely separated geographically, each system may be able to provide faster data access for some applications than the other systems can, depending on where the applications are running.
•If an object cannot be read from one system in a replication topology (for example, because a node is unavailable), HCP can try to read it from another system in the topology. Whether HCP tries to do this depends on the namespace configuration.
•If a system in a replication topology is unavailable, HTTP requests to that system can be automatically serviced by another system in the topology. Whether HCP tries to do this depends on the namespace configuration.
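The read behavior described in the last two items can be pictured as a simple failover loop. The following Python sketch is a conceptual model only: the dictionaries standing in for HCP systems, the `allow_failover` flag standing in for the namespace configuration, and the function name are all assumptions.

```python
from typing import Dict, List


def read_with_failover(systems: List[Dict], path: str,
                       allow_failover: bool) -> bytes:
    """Try systems in order; use later systems only if failover is allowed."""
    candidates = systems if allow_failover else systems[:1]
    for system in candidates:
        if system["available"] and path in system["objects"]:
            return system["objects"][path]
    raise LookupError(f"{path} could not be read from any permitted system")


# Two hypothetical systems in a replication topology holding the same object.
primary = {"name": "site-a", "available": False,
           "objects": {"/reports/q1.pdf": b"data"}}
replica = {"name": "site-b", "available": True,
           "objects": {"/reports/q1.pdf": b"data"}}
```

With `allow_failover=True`, the read is served by `site-b` even though `site-a` is unavailable. With `allow_failover=False`, the same read raises `LookupError`.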
Operations
You use namespace access protocols to perform operations in a namespace. The operations you can perform depend on the protocol you’re using.
Operation restrictions
The operations you can perform in a namespace are subject to these restrictions:
•The namespace access protocol must support the operation.
•If the namespace access protocol requires client authentication, you need to provide valid user credentials.
•The namespace access protocol must be configured to allow access to the namespace from your client IP address.
•Both the effective namespace permission mask and your permissions must allow the operation.
For information on user permissions, see Permissions.
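The restrictions above can be read as a checklist that every request must pass. The following Python sketch expresses that checklist; the `Request` fields and the messages are hypothetical and simply mirror the four list items.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    """Facts about one attempted operation (all fields are illustrative)."""
    protocol_supports_operation: bool
    protocol_requires_auth: bool
    has_valid_credentials: bool
    client_ip_allowed: bool
    allowed_by_mask: bool
    allowed_by_user_permissions: bool


def blocking_restrictions(req: Request) -> List[str]:
    """Return the restrictions that block the request (empty if none)."""
    problems = []
    if not req.protocol_supports_operation:
        problems.append("protocol does not support the operation")
    if req.protocol_requires_auth and not req.has_valid_credentials:
        problems.append("valid credentials are required")
    if not req.client_ip_allowed:
        problems.append("client IP address is not allowed")
    if not (req.allowed_by_mask and req.allowed_by_user_permissions):
        problems.append("permissions do not allow the operation")
    return problems
```

An empty list means the operation can proceed; otherwise each entry names a restriction that has to be resolved, typically by the tenant administrator.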
Supported operations
The table below lists the operations HCP supports and indicates which protocols you can use to perform those operations. Some operations relate to specific types of metadata. For more information on this metadata, see Object properties.
Operation | HTTP | WebDAV | CIFS | NFS |
---|---|---|---|---|
Write data from files or memory to a namespace to create an object |
✓ | ✓ | ✓ | ✓ |
Transmit data to and from HCP in gzip-compressed format |
✓ | |||
Check for object existence |
✓ | ✓ | ||
View the content of an object |
✓ | ✓ | ✓ | ✓ |
Copy an object |
✓ | ✓ | ✓ | ✓ |
Store new versions of existing objects |
✓ | |||
Append to existing objects |
✓ | ✓ | ||
Delete an object that’s not under retention |
✓ | ✓ | ✓ | ✓ |
Delete an object that’s under retention if the namespace configuration allows it |
✓ | |||
View object metadata |
✓ | ✓ | ✓ | ✓ |
View a metafile |
✓ | ✓ | ✓ | |
Override default index, retention, and shred settings when storing an object |
✓ | |||
Change the retention setting for an object |
✓ | ✓ | ✓ | ✓ |
Hold or release an object |
✓ | ✓ | ||
Enable shredding for an object |
✓ | ✓ | ✓ | ✓ |
Change the index setting for an object |
✓ | ✓ | ✓ | ✓ |
Change object ownership (not related to POSIX UID) |
✓ | |||
Change the POSIX UID and GID for an object |
✓ | ✓ | ||
Change the POSIX permission settings for an object |
✓ | ✓ | ||
Change the POSIX atime or mtime value for an object |
✓ | ✓ | ||
Store, replace, or delete an annotation for an object |
✓ | |||
Store, replace, or delete the default annotation for an object |
✓ | ✓ | ||
Store or retrieve object data and an annotation in a single operation |
✓ | |||
Store or retrieve object data and the default annotation in a single operation |
✓ | |||
Check for annotation existence |
✓ | |||
Check for existence of the default annotation |
✓ | ✓ | ||
List annotations |
✓ | |||
Read an annotation |
✓ | |||
Read the default annotation |
✓ | ✓ | ✓ | ✓ |
Store, replace, or delete an ACL for an object |
✓ | |||
Check for ACL existence |
✓ | ✓ | ||
Read ACLs |
✓ | ✓ | ✓ | ✓ |
Create an empty directory in a namespace |
✓ | ✓ | ✓ | ✓ |
View the namespace directory structure, not including metadirectories |
✓ | ✓ | ✓ | ✓ |
View the namespace directory structure, including both directories and metadirectories |
✓ | ✓ | ✓ | |
Rename an empty directory (unless atime synchronization is enabled) |
✓ | ✓ | ✓ | |
Delete an empty directory |
✓ | ✓ | ✓ | ✓ |
Create a symbolic link |
✓ | ✓ | ✓ | ✓ |
Read through a symbolic link to an object |
✓ | ✓ | ✓ | ✓ |
Delete a symbolic link |
✓ | ✓ | ✓ | ✓ |
List the namespaces accessible to you |
✓ | |||
List namespace statistics |
✓ | |||
List namespace permissions for the user |
✓ | |||
List the retention classes available in the namespace |
✓ |
These considerations apply to symbolic links:
•If you use CIFS to create a symbolic link, you can read through the link only with CIFS. You cannot use CIFS to read through a symbolic link created with NFS.
•HTTP and WebDAV support for reading through symbolic links is limited to retrieving object data. Other HTTP and WebDAV operations on symbolic links may produce unexpected results.
Tip: You can use the HCP Search Console to delete, purge, hold, release, change ownership, or modify ACLs on multiple objects with a single operation.
Prohibited operations
You cannot perform these operations in a namespace:
•Rename an object.
•Rename a nonempty directory.
•Overwrite a successfully stored object. However, if versioning is enabled, you can write new versions of an object using the REST, S3 compatible, or HSwift API.
•Modify the fixed-content portion of an object.
•Delete a directory that contains one or more objects.
•Shorten the retention period of an object.
•Store a file (other than a file containing custom metadata), directory, or symbolic link anywhere in the metadata structure.
•Delete a metafile (other than a metafile containing custom metadata) or metadirectory.
•Create a hard link.
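Two of the rules in this list lend themselves to a short illustration. The following Python sketch models the retention rule and the overwrite rule as simple checks. It assumes that a retention setting can be represented as an end date, which is a simplification, and the function and parameter names are hypothetical.

```python
from datetime import datetime

# APIs through which new versions can be written when versioning is enabled.
VERSIONING_APIS = {"REST", "S3", "HSwift"}


def retention_change_allowed(current_end: datetime, new_end: datetime) -> bool:
    """A retention period may be kept or extended, never shortened."""
    return new_end >= current_end


def store_allowed(object_exists: bool, versioning_enabled: bool,
                  protocol: str) -> bool:
    """Storing over an existing object is allowed only as a new version."""
    if not object_exists:
        return True
    return versioning_enabled and protocol in VERSIONING_APIS
```

For example, `store_allowed(True, False, "REST")` is `False` because it would overwrite a stored object, while `store_allowed(True, True, "REST")` is `True` because the write creates a new version.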