Introduction to searching in HCP


Hitachi Content Platform (HCP) is a distributed storage system designed to support large amounts of data. HCP provides access to the stored data through a variety of industry-standard protocols, as well as through an integrated Search Console.

The Search Console enables you to search for objects stored in HCP using either of two search facilities: the metadata query engine or the HDDS search facility.

Note: When working with the metadata query engine, the Search Console is called the Metadata Query Engine Console.

This chapter provides an introduction to HCP and searching namespaces, including how to use the Search Console, the types of queries you can construct, and what search results look like.

© 2015, 2019 Hitachi Vantara Corporation. All rights reserved.

About Hitachi Content Platform


Hitachi Content Platform is the distributed, fixed-content, data storage system from Hitachi Vantara. HCP provides a cost-effective, scalable, easy-to-use repository that can accommodate all types of data, from simple text files to medical image files to multigigabyte database images.

A fixed-content storage system is one in which the data cannot be modified. HCP uses write-once, read-many (WORM) storage technology, and a variety of policies and internal processes to ensure the integrity of the stored data.

Objects


HCP stores objects in a repository. Each object permanently associates data HCP receives (for example, a document, an image, or a movie) with information about that data. This information is called metadata.

Namespaces and tenants


An HCP repository is partitioned into namespaces. A namespace is a logical grouping of objects such that the objects in one namespace are not visible in any other namespace.

Namespaces provide a mechanism for separating the data stored for different applications. For example, one namespace could store accounts-receivable data while another stores accounts-payable data.

Namespaces are owned and managed by administrative entities called tenants. A tenant typically corresponds to an organization, such as a company or a division or department within a company. A tenant can also correspond to an individual person.

A tenant can own multiple namespaces. However, one special tenant, named default, owns only one namespace, named default. This namespace has some different properties from other namespaces. These differences are pointed out, where applicable, in this book.

Note: This book refers to all tenants except the default tenant as HCP tenants. The namespaces owned by HCP tenants are called HCP namespaces.

Object metadata


HCP automatically generates metadata for each object. Some of this metadata is specific to HCP. Examples of this type of metadata are the retention setting, object creation date, and cryptographic hash value.

Objects also have POSIX metadata. POSIX is a set of standards that defines an application programming interface (API) for software designed to run under heterogeneous operating systems. These standards include specific types of metadata, such as permissions and ownership.

Users and applications can override the defaults for some HCP-specific and POSIX metadata when they add an object to a namespace. They can also change certain metadata values for existing objects.

Users can create their own custom metadata to associate additional descriptive information with an object. Custom metadata is specified as annotations, where each annotation is a discrete unit of information about the object.

Custom metadata enables the creation of self-describing objects. Future users and applications can use this metadata to understand and repurpose object content.

When added to a namespace, custom metadata becomes part of the target object. Custom metadata is typically but not necessarily formatted as XML.

Users can also associate access control lists (ACLs) with objects. An ACL grants permissions for an individual object to specified users or groups of users.

When added to a namespace, an ACL becomes part of the target object. When viewed or specified in the Search Console, ACLs are formatted as XML.

ACLs are enabled on a per-namespace basis. In namespaces where ACLs are enabled, the namespace can be configured to either enforce or ignore the permissions granted by ACLs.

For more information on metadata, see Understanding returned metadata, Showing results details, Using a Namespace, and Using the Default Namespace.

Retention settings


Each object has a retention setting that specifies how long the object must remain in its namespace before it can be deleted; this duration is called the retention period. While an object cannot be deleted due to its retention setting, it is said to be under retention.

The retention setting for an object can be:

A specific date and time: This is the time before which the object cannot be deleted. If this is a date in the past, this setting is displayed as Expired in the Search Console.

One of these special values:

o Deletion Allowed: The object can be deleted at any time. This value is displayed as Expired in the Search Console.

o Deletion Prohibited: The object can never be deleted.

o Initial Unspecified: The object does not yet have a specific retention setting and cannot be deleted until it has a setting that allows deletion.

A retention class: This is a named retention setting. It can be a duration (such as seven years) or one of the special values listed above.

Retention classes are namespace specific. That is, an object in one namespace cannot be assigned a retention class that’s defined in a different namespace.
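
The retention rules above can be sketched in a few lines of Python. This is only an illustrative model, not HCP code: the RetentionSetting class, the function names, and the string spellings of the special values are assumptions made for the example, and retention classes are left out because they resolve to one of the other kinds of setting.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Special values named in the text; the string spellings are illustrative.
DELETION_ALLOWED = "Deletion Allowed"
DELETION_PROHIBITED = "Deletion Prohibited"
INITIAL_UNSPECIFIED = "Initial Unspecified"


@dataclass
class RetentionSetting:
    """Hypothetical model: either a special value or a specific date and time."""
    special: Optional[str] = None
    until: Optional[datetime] = None


def is_under_retention(setting: RetentionSetting, now: datetime) -> bool:
    """Return True if the retention setting currently blocks deletion."""
    if setting.special == DELETION_ALLOWED:
        return False
    if setting.special in (DELETION_PROHIBITED, INITIAL_UNSPECIFIED):
        return True
    if setting.until is None:
        raise ValueError("setting needs a special value or a date and time")
    # A specific date and time: deletion is blocked until that moment passes.
    return now < setting.until


def displays_as_expired(setting: RetentionSetting, now: datetime) -> bool:
    """Return True for the two cases the text says are shown as Expired."""
    if setting.special == DELETION_ALLOWED:
        return True
    return setting.until is not None and setting.until <= now


now = datetime(2020, 1, 1, tzinfo=timezone.utc)
past = RetentionSetting(until=datetime(2019, 1, 1, tzinfo=timezone.utc))
print(is_under_retention(past, now), displays_as_expired(past, now))
```

In this model, an object whose retention date has passed is no longer under retention and is one of the cases shown as Expired.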

Retention mode


Retention mode is a property of a namespace that affects which operations are allowed on objects under retention. A namespace can be in either of two retention modes:

In compliance mode, objects that are under retention cannot be deleted through any mechanism. Additionally, the duration of a retention class cannot be shortened, and retention classes cannot be deleted.

In enterprise mode, users and applications can delete objects under retention if they have specific permission to do so. This is called privileged delete.

Also in enterprise mode, the duration of a retention class can be shortened, and retention classes can be deleted.
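
As a rough illustration of how the two modes differ for delete operations, the following Python sketch encodes the rules described above. The function name, the mode strings, and the has_privileged_delete flag are hypothetical, and the sketch assumes the caller otherwise has permission to delete the object.

```python
def can_delete(mode: str, under_retention: bool,
               has_privileged_delete: bool = False) -> bool:
    """Decide whether a delete is allowed under the rules in the text.

    Assumes the caller already holds ordinary delete permission.
    """
    if mode not in ("compliance", "enterprise"):
        raise ValueError(f"unknown retention mode: {mode!r}")
    if not under_retention:
        # Retention mode only affects objects that are under retention.
        return True
    # Compliance mode: no mechanism can delete an object under retention.
    # Enterprise mode: only a privileged delete can.
    return mode == "enterprise" and has_privileged_delete


print(can_delete("compliance", under_retention=True, has_privileged_delete=True))
print(can_delete("enterprise", under_retention=True, has_privileged_delete=True))
```

The first call prints False and the second prints True, which mirrors the distinction between the two modes.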

About searching namespaces


HCP lets you search namespaces for objects that meet specified criteria. This capability supports search and discovery to satisfy government requirements and provides support for audits and litigation. You can use the results of a search to analyze namespace contents and manipulate groups of objects.

Search Console


HCP provides an interactive interface for searching namespaces. This interface, called the Search Console, is a web application that offers a structured environment for creating and executing queries. You can also use the Search Console to perform these operations on groups of objects: hold, release, delete, purge, privileged delete, privileged purge, change owner, and set ACLs.

A query is a request you submit that contains a collection of criteria that each object in the search results must satisfy. The response to a query is metadata about the objects that meet the query criteria. You can use this metadata to retrieve objects of interest. Additionally, from the Search Console, you can export the metadata for use as input to other applications.

Search facilities


The Search Console works with either of these search facilities:

The metadata query engine — This facility is integrated with HCP and is also used by the metadata query API, which is a programmatic interface for querying namespaces.

The Hitachi Data Discovery Suite (HDDS) search facility — This facility interacts with HDDS, which performs searches and returns results to the HCP Search Console. HDDS is a separate product from HCP.

This book covers aspects of HDDS that are specific to HCP. For more information on HDDS, see the HDDS documentation.

Only one search facility can be selected for use with the Search Console at any given time. This facility, called the active search facility, is selected at the HCP system level. If no search facility is selected, the HCP system does not support searching namespaces.

Indexes


Each search facility maintains an index of objects. The index maintained by the metadata query engine resides in HCP. The index maintained by the HDDS search facility resides in HDDS.

The metadata query engine index is based on system metadata, custom metadata that is well-formed XML, and ACLs. The index maintained by the HDDS search facility is based on object data and metadata.

Indexing is enabled on a per-namespace basis. If a namespace is not indexed, searches do not return any results for objects in the namespace.

Indexing of custom metadata is also enabled on a per-namespace basis. If indexing of custom metadata is disabled for a namespace, the index associated with the metadata query engine does not include custom metadata for objects in the namespace.

HCP namespaces can be configured to store multiple versions of objects. Each index, however, includes only the most current version of an object.

To maintain its index, each search facility periodically checks indexable namespaces for new objects and for objects with metadata that has changed since the last check. When it finds new or changed information, it updates its index. The amount of time a search facility takes to update its index depends on the amount of information to be indexed.

Note: If an index update includes a large amount of information, new objects or objects with changed metadata may be unavailable to searches until the update is complete.

Metadata query engine indexing of custom metadata can be configured as follows:

Specific content properties can be indexed. For information on content properties see Content properties.

Specific annotations in a namespace can be excluded from indexing.

Indexing can be enabled or disabled for the full text of custom metadata.

Content properties


Custom metadata in a namespace can be indexed based on content properties. A content property is a named construct used to extract an element or attribute value from custom metadata that's well-formed XML. Each content property has a data type that determines how the property values are treated by the metadata query engine. Additionally, a content property is defined as either single-valued or multivalued. A multivalued property can extract the values of multiple occurrences of the same element or attribute from the XML.

Content properties are grouped into content classes, and each namespace can be associated with a set of content classes. The content properties that belong to a content class associated with the namespace are indexed for the namespace. Content classes are defined at the tenant level, so multiple namespaces can be associated with the same content class.

For example, if the namespace Personnel is associated with the content class MedInfo, and the content property DrName is a member of the content class, the query engine will use the DrName content property to index the custom metadata in the Personnel namespace.
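
To make the idea of a content property more concrete, here is a minimal Python sketch that extracts element values from well-formed XML. It is not how the metadata query engine is implemented. The ContentProperty class, the path expression, and the sample annotation are all invented for illustration; only the DrName property name comes from the example above. Data types are omitted.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List


@dataclass
class ContentProperty:
    """Hypothetical stand-in for a content property definition."""
    name: str
    expression: str          # ElementTree path to the element (assumed form)
    multivalued: bool = False


def extract(prop: ContentProperty, custom_metadata_xml: str) -> List[str]:
    """Return the value(s) the property would pull from one annotation."""
    try:
        root = ET.fromstring(custom_metadata_xml)
    except ET.ParseError:
        # Only well-formed XML can yield property values.
        return []
    values = [el.text.strip() for el in root.findall(prop.expression) if el.text]
    # A single-valued property keeps only the first occurrence.
    return values if prop.multivalued else values[:1]


# Invented annotation for an object in the Personnel namespace.
ANNOTATION = """
<record>
  <doctor><name>Lee</name></doctor>
  <doctor><name>Patel</name></doctor>
</record>
"""

dr_name = ContentProperty("DrName", "./doctor/name", multivalued=True)
print(extract(dr_name, ANNOTATION))  # ['Lee', 'Patel']
```

Defining DrName as single-valued in this sketch would return only the first name, which shows the practical difference between the two kinds of property.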

Index settings


Each object has an index setting that the metadata query engine uses to determine whether to index custom metadata for the object. The metadata query engine always indexes object metadata and ACLs regardless of the index setting on an object.

Index settings do not affect HDDS search facility indexing.

Extracted metadata


In addition to object data and metadata, the index maintained by the HDDS search facility includes extracted metadata. Extracted metadata is metadata that’s specific to a document format. Examples of this type of metadata are the author and title of a stored document.

Searchable namespaces


For a namespace to be searchable in the Search Console:

The namespace must be indexed by the active search facility.

The namespace must be configured to allow searches. This property of a namespace is separate from whether the namespace is indexed.
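
The two conditions can be expressed as a simple check. The Python sketch below is a hypothetical model only: the Namespace class, its field names, and the facility labels "mqe" and "hdds" are assumptions, not HCP settings.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Namespace:
    """Hypothetical view of the two namespace properties described above."""
    name: str
    indexed_by: Set[str] = field(default_factory=set)  # e.g. {"mqe", "hdds"}
    search_enabled: bool = False


def is_searchable(ns: Namespace, active_facility: Optional[str]) -> bool:
    """Both conditions must hold, and a search facility must be active."""
    if active_facility is None:
        # With no active search facility, no namespace can be searched.
        return False
    return active_facility in ns.indexed_by and ns.search_enabled


finance = Namespace("finance", indexed_by={"mqe"}, search_enabled=True)
print(is_searchable(finance, "mqe"))   # True
print(is_searchable(finance, "hdds"))  # False
```

The second call returns False because, in this model, the namespace is indexed only by a facility that is not the active one.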

Using the Search Console


To use the Search Console, you need one of these:

A tenant-level user account that is defined in HCP

If HCP is configured to support Active Directory® (AD), an AD user account that is recognized at the tenant level

A system-level user account that is defined in HCP and has the search role

An AD user account that is recognized at the system level and has the search role

Additionally, to perform searches while the HDDS search facility is active, you need an HDDS username and password. After logging into the Search Console, you need to set your HDDS username and password in the Console. To get an HDDS username and password, see your HCP or HDDS administrator.

When you log into the Search Console with a tenant-level user account that’s defined in HCP or an AD account recognized at the tenant level, you access the Console for a specific HCP tenant. You can search only searchable namespaces owned by that tenant. Your user account specifies which of those namespaces you have permission to search. If you don’t have permission to search a given namespace, search results don’t include any objects from that namespace.

When you log into the Search Console with a system-level user account that’s defined in HCP or an AD user account recognized at the system level, you can search the default namespace and you may also be able to search the searchable namespaces belonging to one or more HCP tenants. This depends on the configuration of those tenants.

Search Console URL


Tenant-level users and system-level users specify different URLs to access the Search Console. In either case, access to the Console requires the use of SSL security with HTTP (HTTPS).

Note: If you inadvertently use http instead of https in the URL, the browser prompts you to open or save a file. Cancel out of the prompt and try again, this time using https.

Search Console URL for tenant-level users

To access the Search Console with a tenant-level user account, you use a URL with this format:

https://tenant-url-name.hcp-domain-name:8888

For example, to access the Search Console for the tenant named Finance in the HCP system named hcp.example.com, you use this URL:

https://finance.hcp.example.com:8888

Typically, HCP relies on DNS for hostname resolution. If this is not the case, you need to provide a mapping of the tenant hostname to an IP address for the HCP system. You specify this mapping in the c:\windows\system32\drivers\etc\hosts file (Windows®), the /etc/hosts file (Unix), or the /private/etc/hosts file (Mac OS® X) on the client.

Each line in a hosts file is a mapping of a hostname to an IP address. So, for example, if one of the IP addresses for the HCP system is 192.168.210.16, you would add this line to the hosts file on the client to enable access to the Search Console for the Finance tenant:

192.168.210.16 finance.hcp.example.com

For the IP addresses for the HCP system, see your HCP tenant administrator.

Search Console URL for system-level users

To access the Search Console with a system-level user account, you use a URL with this format:

https://search.hcp-domain-name:8888

For example, to access the Search Console for the HCP system named hcp.example.com, you use this URL:

https://search.hcp.example.com:8888

You can also use an IP address for the HCP system to access the Search Console with a system-level user account:

https://hcp-ip-address:8888

For example, to access the Search Console for an HCP system that has the IP address 192.168.210.16, you use this URL:

https://192.168.210.16:8888
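
The URL formats above are regular enough to generate with a small script. The following Python sketch builds each form, plus the hosts-file line for a tenant. The function names are made up for this example, and lowercasing the tenant name is an assumption based on the Finance example.

```python
import ipaddress

SEARCH_CONSOLE_PORT = 8888  # port used in every URL format in the text


def tenant_console_url(tenant_url_name: str, hcp_domain_name: str) -> str:
    """URL for tenant-level users: https://tenant-url-name.hcp-domain-name:8888"""
    # Lowercasing is an assumption; the text shows Finance as "finance".
    return f"https://{tenant_url_name.lower()}.{hcp_domain_name}:{SEARCH_CONSOLE_PORT}"


def system_console_url(hcp_domain_name: str) -> str:
    """URL for system-level users: https://search.hcp-domain-name:8888"""
    return f"https://search.{hcp_domain_name}:{SEARCH_CONSOLE_PORT}"


def system_console_url_by_ip(hcp_ip_address: str) -> str:
    """URL for system-level users that addresses the system by IP."""
    ip = ipaddress.ip_address(hcp_ip_address)  # raises ValueError if invalid
    host = f"[{ip}]" if ip.version == 6 else str(ip)
    return f"https://{host}:{SEARCH_CONSOLE_PORT}"


def hosts_line(hcp_ip_address: str, tenant_url_name: str, hcp_domain_name: str) -> str:
    """Hosts-file entry mapping the tenant hostname to an HCP IP address."""
    ip = ipaddress.ip_address(hcp_ip_address)
    return f"{ip} {tenant_url_name.lower()}.{hcp_domain_name}"


print(tenant_console_url("Finance", "hcp.example.com"))
print(system_console_url("hcp.example.com"))
print(system_console_url_by_ip("192.168.210.16"))
print(hosts_line("192.168.210.16", "Finance", "hcp.example.com"))
```

Run with the values from the examples in this section, the script prints the same three URLs and the same hosts-file line shown above.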

Search Console sessions


A Search Console session begins when you take one of these actions:

Log into the Console using an HCP user account.

Access a Console page while logged into Windows with an AD user account that HCP recognizes. This is called single sign-on.

For single sign-on to work, your web browser must be configured to support it. For more information on this, see Browser configuration for single sign-on with Active Directory.

Log into the Console using a recognized AD user account other than the one with which you’re currently logged into Windows.

A session ends when you log out.

During a session, if you don’t take any action for a certain amount of time, the Console responds in one of two ways. If you explicitly logged in, it automatically logs you out. If you used single sign-on, it returns you to the Simple Search page. The exact amount of idle time allowed is determined by the tenant configuration.

Logging in

To log into the Search Console:

1. Open a web browser window.

2. In the address field, enter the URL for the Search Console.

One of these happens:

o If all of these are true, you are automatically logged into the Search Console, and the Simple Search page appears:

The tenant is configured to support AD authentication.

Your web browser is configured to support single sign-on with AD. For information on this, see Browser configuration for single sign-on with Active Directory.

You are currently logged into Windows with a recognized AD user account.

In this case, no further action is required.

o If the tenant is configured to support AD authentication but any of the following apply, a message appears indicating that single sign-on was not possible:

Your web browser is not configured to support single sign-on.

You are not currently logged into Windows with a recognized AD user account.

You are not on a Windows computer.

In these cases, you need to click on Console login page in the message to display the Search Console login page.

o If the tenant is not configured to support AD authentication, the Search Console login page appears.

3. On the Search Console login page, if HCP is configured to support AD, take either of these actions in the Domain field:

o If you’re using an HCP user account, select the fully qualified name of the HCP system.

o If you’re using a recognized AD user account, select the AD domain in which your user account is defined.

If HCP is not configured to support AD, the login page does not display the Domain field.

4. In the Username field, type your username.

5. In the Password field, type your case-sensitive password.

Important: If you’re using an HCP user account, you should change your password as soon as possible the first time you log into the Search Console.

6. Click on Log In.

The Search Console displays the Simple Search page (shown below) or, if you logged in using a user account that’s defined in HCP and you’re required to change your password, the Account Management page. You also set your HDDS credentials on the Account Management page if the HDDS search facility is active.

[Figure: Simple Search page]

For information on:

o Using the Simple Search page, see Working with simple searches.

o Changing your password, see Changing your password.

o Setting your HDDS credentials, see Setting HDDS credentials.

Logging out

To log out of the Search Console:

1. Click on Log out in the top right corner of the page.

The Console returns to the login page. To continue using the Search Console, you need to log in again.

2. If you explicitly logged in, close the browser window to ensure that other users cannot go back into the Search Console using the credentials you used to log in.

Search Console pages and navigation


Each page in the Search Console lets you perform a specific activity. To navigate among the pages, you can use the tabs at the top of the page. You can also use shortcut keys for navigation. Each link that has a shortcut key has one letter underlined. To use the shortcut key, follow the convention for the browser you’re using.

While the metadata query engine is active, on each page, the Search Console indicates how current the index is by showing the date and time before which eligible objects are guaranteed to be indexed. That is, any eligible object that was added to a namespace or that had a metadata change before the indicated date and time is guaranteed to be indexed. Objects that were added or had metadata changes after that date and time may or may not be indexed.
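
The guarantee described above amounts to a simple comparison of timestamps. A minimal Python sketch follows; the function name and return strings are hypothetical, and the object is assumed to be eligible for indexing.

```python
from datetime import datetime


def index_guarantee(changed_at: datetime, indexed_before: datetime) -> str:
    """Classify an eligible object against the date and time the Console shows.

    changed_at: when the object was added or last had a metadata change.
    indexed_before: the date and time displayed by the Search Console.
    """
    if changed_at < indexed_before:
        return "guaranteed indexed"
    return "may or may not be indexed"


shown = datetime(2019, 6, 1, 12, 0)
print(index_guarantee(datetime(2019, 5, 31, 9, 30), shown))
print(index_guarantee(datetime(2019, 6, 1, 12, 5), shown))
```

The first object changed before the displayed time, so it falls under the guarantee. The second changed afterward, so a search may or may not find it yet.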

Search Console pages do not automatically refresh themselves while they remain open. To see the latest results for the current search, use your browser refresh button.

When you switch from one page to another, the Console does not retain the search on the original page (neither the query nor the search results). To see that search again, you can either recreate it or use the browser back button to return to it. Alternatively, you can save it before you switch pages. For information on saving a search, see Working with saved queries.

Viewing search documentation


HCP documentation is available online in PDF format. To view a document from the Search Console, take either of these actions:

In the top right corner of the Search Console window, place the cursor on Documentation. Then, in the dropdown menu, click on the document you want.

In the top right corner of the Search Console window, click on Documentation. Then, on the Documentation page, click on the document you want.

Changing your password


If you used a locally authenticated HCP user account to log into the Search Console, you can change your password in this Console. When you change your password in the Search Console, the password also changes for any other HCP interfaces to which your user account gives you access.

To change your password in the Search Console:

1. Click on the Account Management tab.

2. On the Account Management page, in the Change Password for User section:

o In the Existing Password field, type your current password.

o In the New Password field, type your new password. Passwords can be up to 64 characters long, are case sensitive, and can contain any valid UTF-8 characters, including white space.

To be valid, a password must include at least one character from two of these three groups: alphabetic, numeric, and other.

The minimum length for passwords is site specific. Typically, it’s six or eight characters.

o In the Confirm New Password field, type the new password again.

3. Click on Submit.
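
The password rules in step 2 can be checked before you submit a new password. The Python sketch below is an approximation, not the validation HCP performs: the function name is invented, the default minimum length of six is just one of the typical values mentioned above, and the use of Python's isalpha() and isdigit() to decide character groups is an assumption.

```python
from typing import List

MAX_LENGTH = 64  # upper limit stated in the text


def password_problems(password: str, min_length: int = 6) -> List[str]:
    """Return a list of rule violations; an empty list means none were found.

    min_length is site specific, so pass the value that applies to your site.
    """
    problems = []
    if len(password) > MAX_LENGTH:
        problems.append(f"longer than {MAX_LENGTH} characters")
    if len(password) < min_length:
        problems.append(f"shorter than {min_length} characters")

    # Assumed classification of the three groups named in the text.
    has_alpha = any(c.isalpha() for c in password)
    has_digit = any(c.isdigit() for c in password)
    has_other = any(not (c.isalpha() or c.isdigit()) for c in password)
    if sum([has_alpha, has_digit, has_other]) < 2:
        problems.append("needs characters from at least two of: "
                        "alphabetic, numeric, other")
    return problems


print(password_problems("abc123"))   # []
print(password_problems("abcdef"))
```

The second call reports one problem because the password uses characters from only the alphabetic group.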

Setting HDDS credentials


To set your HDDS credentials while the HDDS search facility is active:

1. In the HCP Search Console, click on the Account Management tab.

2. On the Account Management page, in the Set HDDS Credentials section, enter your case-sensitive HDDS username and password.

3. Click on Test to ensure that your username and password are valid.

If the test fails, contact your HCP tenant administrator. HDDS may be unavailable, or your credentials may be invalid.

4. Click on Save.

Notes: 

Once set, your HDDS credentials are saved for future Search Console sessions.

If the test fails, you can still set your HDDS credentials by clicking on Save. However, your queries will not return any results until you provide valid HDDS credentials.


 
