The HCP metadata query API is a RESTful HTTP API that lets you query HCP for objects that meet specific criteria. In response to a query, HCP returns metadata for the matching objects. With the metadata query API, you can query not only for objects currently in the repository but also for information about objects that have been deleted from the repository.
About the metadata query API
The HCP metadata query API lets you query namespaces for objects that match criteria you specify. Query criteria can be based on system metadata, custom metadata, ACLs, and operations performed on objects. The API does not support queries based on object content.
In response to a query, HCP returns metadata for objects that match query criteria. It does not return object data.
The metadata query API supports two types of queries, object-based queries and operation-based queries.
A single query can return metadata for objects in multiple namespaces, including a combination of HCP namespaces and the default namespace. For HCP namespaces that support versioning, operation-based queries can return metadata for both current and old versions of objects.
To support object-based queries, HCP maintains an index of objects in the repository.
To access HCP through the metadata query API, you use the HTTP POST method. With this method, you specify query criteria in the request body. In the request body you also specify what information you want in the query results.
The API accepts query criteria in XML or JSON format and can return results in either format. For example, you could use XML to specify the query criteria and request that the response be JSON.
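As a rough sketch of the request shape described above, the following Python builds (but does not send) a POST request whose body is JSON and whose requested response format is XML. The host name, the /query path, and the entries inside the body are illustrative assumptions rather than values taken from this document, and authentication is omitted.

```python
import json
import urllib.request

# Hypothetical endpoint; substitute the query URL for your own HCP system.
QUERY_URL = "https://tenant.hcp.example.com/query"


def build_query_request(criteria: dict) -> urllib.request.Request:
    """Build a POST request with JSON criteria that asks for an XML response."""
    body = json.dumps(criteria).encode("utf-8")
    headers = {
        "Content-Type": "application/json",  # format of the request body
        "Accept": "application/xml",         # format requested for the results
    }
    return urllib.request.Request(QUERY_URL, data=body, headers=headers, method="POST")


# Placeholder criteria; the entry names here are assumed for illustration only.
request = build_query_request({"object": {"query": "example-criteria"}})
print(request.get_method(), request.get_header("Accept"))
```

Swapping the two header values would send XML criteria and request JSON results instead.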
Because a large number of matching objects can result in a very large response, the metadata query API lets you limit the number of results returned for a single request. You can retrieve metadata for all the matching objects by using multiple requests. This process is called using a paged query.
Types of queries
The metadata query API supports two types of queries: object-based queries and operation-based queries. These query types have different request formats and return different information about objects in the result set. However, they have similar response formats.
Object-based queries search for objects currently in the repository based on any combination of system metadata, object paths, custom metadata that’s well-formed XML, ACLs, and content properties. With object-based queries, you use a robust query language to construct query criteria.
In response to an object-based query, HCP returns a set of results, each of which identifies an object and contains metadata for the object. With object-based queries, you can specify sort criteria to manage the order in which results are returned. You can specify facet criteria to return summary information about object properties that appear in the result set.
Operation-based queries search for objects based on any combination of create, delete, and disposition operations and, for HCP namespaces that support versioning, purge and prune operations. Operation-based queries are useful for applications that need to track changes to namespace content.
In response to an operation-based query, HCP returns a set of operation records, each of which identifies an object and an operation on the object and contains additional metadata for the object.
By default, for both types of queries, HCP returns only basic information about the objects that meet the query criteria. This information includes the object URL, the version ID, the operation type, and the change time.
If you specify a verbose entry with a value of true in the request body, HCP returns complete system metadata for the object or operation. If you aren’t interested in the complete system metadata, you can specify the objectProperties entry with only the system metadata you want.
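The choice between the two entries can be sketched as a small helper that builds a JSON request body. Only the verbose and objectProperties entry names come from the text above. The enclosing object and query entries, the comma-separated list format, and the property names in the usage lines are assumptions made for illustration.

```python
import json


def object_query_body(query, verbose=False, object_properties=None):
    """Return a JSON request body for an object-based query (assumed layout)."""
    entries = {"query": query}
    if object_properties:
        # Ask only for the named system metadata properties.
        entries["objectProperties"] = ",".join(object_properties)
    elif verbose:
        # Ask for the complete system metadata.
        entries["verbose"] = True
    return json.dumps({"object": entries})


# Hypothetical criteria and property names, for illustration only.
full_body = object_query_body("example-criteria", verbose=True)
trimmed_body = object_query_body("example-criteria", object_properties=["size", "retention"])
print(full_body)
print(trimmed_body)
```

In this sketch a property list takes precedence over verbose, on the reasoning that asking for specific properties is the narrower request; that precedence is a design choice of the example, not documented behavior.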
Object-based query results
Object-based queries return information about objects that currently exist in the repository. For objects with multiple versions, these queries return information only for the current version.
Object-based queries return information only about objects that have been indexed.
Operation-based query results
HCP maintains records of object creation, deletion, disposition, prune, and purge operations (also called transactions). These records can be retrieved through operation-based queries. The HCP system configuration determines how long HCP keeps deletion, disposition, prune, and purge records. HCP keeps creation records for as long as the object exists in the repository.
Each record has a change time. For creation records, this is the time the object was last modified. For deletion, disposition, prune, and purge records, the change time identifies the time of the operation.
If versioning is enabled for an HCP namespace, the types of records that are returned by an operation-based query depend on the query request parameters. However, the following rules determine which operation records can be returned:
- HCP returns a creation record for the current version of an object, as long as this version is not a delete marker.
- HCP returns creation records for old versions of an object.
- HCP returns creation records for versions of both deleted objects and disposed objects.
- HCP returns a single purge record for each purge operation. It does not return records for the individual versions of the purged object.
- HCP returns deletion, disposition, prune, and purge records until it removes them from the system.
If you create and then delete an object while versioning is disabled, HCP keeps only the deletion record and not the creation record. Operation-based queries return the deletion record until HCP removes that record from the system.
If you create an object and then HCP disposes of that object while versioning is disabled, HCP keeps only the disposition record and not the creation record. Operation-based queries return the disposition record until HCP removes that record from the system.
If versioning was enabled at an earlier time but is no longer enabled, operation-based queries continue to return records of all operations performed during that time. If you delete an object while versioning is disabled or if HCP disposes of an object while versioning is disabled, operation-based queries do not return any creation records for that object, regardless of whether versioning was enabled when it was created.
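The interaction between versioning and record visibility can be illustrated with a toy model. This is a simplified sketch of the rules above for a single object, not HCP's implementation: it ignores prune and purge operations, delete markers, and the eventual removal of old records, and the function and record names are hypothetical.

```python
def returned_records(events):
    """Model which operation records a query can return for one object.

    events: list of (operation, versioning_enabled) tuples in time order,
    where operation is "create", "delete", or "dispose".
    """
    record_name = {"delete": "deletion", "dispose": "disposition"}
    records = []
    for operation, versioning_enabled in events:
        if operation == "create":
            records.append("creation")
            continue
        if not versioning_enabled:
            # Removal while versioning is disabled: creation records are no
            # longer returned, regardless of when the versions were created.
            records = [r for r in records if r != "creation"]
        records.append(record_name[operation])
    return records


# Created and deleted while versioning is disabled: only the deletion remains.
print(returned_records([("create", False), ("delete", False)]))
# Created and deleted while versioning is enabled: both records remain.
print(returned_records([("create", True), ("delete", True)]))
```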
Paged queries
With paged queries, you issue multiple requests that each retrieve a limited number of results. You would use a paged query, for example, if:
- The size of the response to a single request would reduce the efficiency of the client. In this situation, you can use a paged query to prevent overloading the client. The client can process the results in each response before requesting additional data.
- The application issuing the query handles a limited number of objects at a time. For example, an application that lists a given number of objects at a time on a web page would use a paged query in which each request returned that number of results.
The criteria for paged queries differ between object-based queries and operation-based queries.
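The general pattern of a paged query can be sketched as a loop that keeps requesting fixed-size pages until a short page arrives. The fetch_page callable and its offset-and-count signature are assumptions for this example; because paging criteria differ between the two query types, the real continuation parameters may look different. An in-memory list stands in for the repository so the sketch runs without a server.

```python
def iter_all_results(fetch_page, page_size):
    """Yield every result by requesting one page at a time."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        yield from page
        if len(page) < page_size:
            break  # a short (or empty) page means nothing is left
        offset += len(page)


# Stand-in for a real request: serves slices of an in-memory list.
matching = [f"object-{i}" for i in range(7)]


def fake_fetch(offset, count):
    return matching[offset:offset + count]


results = list(iter_all_results(fake_fetch, page_size=3))
print(len(results))
```

Because the loop is a generator, the client can process each page before the next request is issued, which matches the first use case listed above.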
Object index
To support object-based queries, HCP maintains an index of objects in the repository. This index is based on object paths, system metadata, custom metadata that’s well-formed XML, and ACLs.
Indexing is enabled on a per-namespace basis. If a namespace is not indexed, object-based queries do not return results for objects in the namespace.
HCP periodically checks indexable namespaces for new objects and for objects with metadata that has changed since the last check. When it finds new or changed information, it updates the index. The amount of time HCP takes to update the index depends on the amount of information to be indexed. New or changed information is not reflected in the results of object-based queries until the information is indexed.
Indexing of custom metadata can be configured in these ways:
- Specific content properties can be indexed.
- Specific annotations can be excluded from being indexed. An annotation is a discrete unit of custom metadata.
- Custom metadata contents can be optionally indexed for full-text searching.
If indexing of custom metadata is enabled for a namespace, these rules determine whether custom metadata is indexed for an object:
- The custom metadata must be well-formed XML.
- The custom metadata must be smaller than one MB.
- The object must have an index setting of
If custom metadata is not indexed for an object, object-based queries that are based on custom metadata do not return results for that object.
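A client can approximate these checks before storing custom metadata, for example to warn that an annotation will not be searchable. The sketch below is a client-side approximation, not HCP's own logic. It assumes that "one MB" means 2**20 bytes, and it takes the result of the index-setting check as a caller-supplied boolean because the required setting value is not stated above. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

ONE_MB = 1024 * 1024  # assumption: "one MB" read as 2**20 bytes


def custom_metadata_is_indexable(metadata: bytes, index_setting_ok: bool) -> bool:
    """Approximate the three indexing rules for one object's custom metadata."""
    if not index_setting_ok:
        return False  # the object's index setting does not allow indexing
    if len(metadata) >= ONE_MB:
        return False  # must be smaller than one MB
    try:
        ET.fromstring(metadata)  # raises ParseError unless well-formed XML
    except ET.ParseError:
        return False
    return True


print(custom_metadata_is_indexable(b"<record><rank>1</rank></record>", True))
print(custom_metadata_is_indexable(b"<record><rank>1</record>", True))
```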
Content properties
A content property is a named construct used to extract an element or attribute value from custom metadata that's well-formed XML. Each content property has a data type that determines how the property values are treated when indexing and searching.
A content property is defined as either single-valued or multivalued. A multivalued property can extract the values of multiple occurrences of the same element or attribute from the XML.
The XML below shows multiple occurrences of two elements, date and rank, within the element weeklyRank:
<record>
  <weeklyRank>
    <date>dd/MM/yyyy</date>
    <rank>(rank)</rank>
  </weeklyRank>
  <weeklyRank>
    <date>dd/MM/yyyy</date>
    <rank>(rank)</rank>
  </weeklyRank>
  <weeklyRank>
    <date>dd/MM/yyyy</date>
    <rank>(rank)</rank>
  </weeklyRank>
</record>
If the WeekyRank object property specifies the record/weeklyRank/rank entry in the XML, the property is multivalued.
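To make the multivalued case concrete, the following sketch extracts every value found at the record/weeklyRank/rank path from a sample document. It only mimics the idea of a content property on the client side using Python's standard XML parser; it does not show how HCP evaluates content properties. The dates and ranks in the sample are made-up values, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample custom metadata with made-up dates and ranks.
SAMPLE = """<record>
  <weeklyRank><date>01/01/2024</date><rank>3</rank></weeklyRank>
  <weeklyRank><date>08/01/2024</date><rank>1</rank></weeklyRank>
  <weeklyRank><date>15/01/2024</date><rank>2</rank></weeklyRank>
</record>"""


def extract_values(xml_text, path):
    """Return the text of every element matching a root-anchored path."""
    root = ET.fromstring(xml_text)
    root_name, _, relative_path = path.partition("/")
    if root.tag != root_name:
        return []  # the path does not start at this document's root
    return [el.text.strip() for el in root.findall(relative_path) if el.text]


ranks = extract_values(SAMPLE, "record/weeklyRank/rank")
print(ranks)           # one value per weeklyRank occurrence
print(len(ranks) > 1)  # more than one value: a multivalued property
```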