Examples
This chapter contains examples of both object-based and operation-based queries. The examples show some of the ways you can use the metadata query API to get information about namespace content.
Object-based query examples
This section contains examples of object-based queries.
Example: Querying for custom metadata content
Here’s a sample metadata query API request that retrieves metadata for all objects that:
- Are in namespaces owned by the europe tenant
- Have custom metadata that contains an element named department with a value of Accounting
The query uses an XML request body and requests results in JSON format.
In addition to the basic information about the objects in the result set, this request returns the shred and retention settings for each object. The request also specifies that the objects be listed in reverse chronological order by change time.
<queryRequest>
    <object>
        <query>customMetadataContent: "department.Accounting.department"</query>
        <objectProperties>shred,retention</objectProperties>
        <sort>changeTimeMilliseconds+desc</sort>
    </object>
</queryRequest>
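The request body above can also be generated in code rather than kept in a static file. The following is a minimal sketch that uses Python's standard xml.etree.ElementTree module. The helper name build_object_query is hypothetical and is not part of the metadata query API; it covers only the three entries used in this example.

```python
import xml.etree.ElementTree as ET


def build_object_query(query, object_properties=None, sort=None):
    """Return an object-based queryRequest body as a string.

    Hypothetical helper for illustration; it handles only the query,
    objectProperties, and sort entries used in this example.
    """
    root = ET.Element("queryRequest")
    obj = ET.SubElement(root, "object")
    ET.SubElement(obj, "query").text = query
    if object_properties:
        # Property names are sent as a comma-separated list.
        ET.SubElement(obj, "objectProperties").text = ",".join(object_properties)
    if sort:
        ET.SubElement(obj, "sort").text = sort
    return ET.tostring(root, encoding="unicode")


body = build_object_query(
    'customMetadataContent: "department.Accounting.department"',
    object_properties=["shred", "retention"],
    sort="changeTimeMilliseconds+desc",
)
print(body)
```

The resulting string could be written to a file such as Accounting.xml or passed directly as the POST body.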
curl -k -H "Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d" -H "Content-Type: application/xml" -H "Accept: application/json" -d @Accounting.xml "https://europe.hcp.example.com/query?prettyprint"
import pycurl
import os

curl = pycurl.Curl()

# Set the URL, command, and headers
curl.setopt(pycurl.URL, "https://europe.hcp.example.com/" +
            "query?prettyprint")
curl.setopt(pycurl.SSL_VERIFYPEER, 0)
curl.setopt(pycurl.SSL_VERIFYHOST, 0)
curl.setopt(pycurl.POST, 1)
curl.setopt(pycurl.HTTPHEADER, [
    "Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d",
    "Content-Type: application/xml",
    "Accept: application/json"])

# Set the request body from an XML file
filehandle = open("Accounting.xml", 'rb')
curl.setopt(pycurl.UPLOAD, 1)
curl.setopt(pycurl.CUSTOMREQUEST, "POST")
curl.setopt(pycurl.INFILESIZE, os.path.getsize("Accounting.xml"))
curl.setopt(pycurl.READFUNCTION, filehandle.read)

# Send the request and print the HTTP status code
curl.perform()
print(curl.getinfo(pycurl.RESPONSE_CODE))
curl.close()
filehandle.close()
POST /query?prettyprint HTTP/1.1
Host: europe.hcp.example.com
Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d
Content-Type: application/xml
Accept: application/json
Content-Length: 192

HTTP/1.1 200 OK
Transfer-Encoding: chunked
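The Authorization header in the request above has the form HCP followed by a token. Judging from the helper calls in the Java example later in this chapter (HCPUtils.toBase64Encoding for the username and HCPUtils.toMD5Digest for the password), the token appears to be the Base64-encoded username, a colon, and an MD5 hash of the password. The sketch below assumes exactly that, and also assumes a hex-encoded digest. The function name and the sample password are hypothetical.

```python
import base64
import hashlib


def hcp_auth_header(username, password):
    """Build the value of an 'Authorization: HCP ...' header.

    Assumes the token is base64(username) + ":" + hex MD5(password),
    as suggested by the helper names in the Java example.
    """
    user_part = base64.b64encode(username.encode("utf-8")).decode("ascii")
    pass_part = hashlib.md5(password.encode("utf-8")).hexdigest()
    return "HCP " + user_part + ":" + pass_part


# "myuser" encodes to the username part used throughout these examples.
header = hcp_auth_header("myuser", "example-password")
print(header)
```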
To limit the example size, the JSON below shows only one object in the result set.
{"queryResult":
    {"query":
        {"expression":"customMetadataContent: \"department.Accounting.department\""},
    "resultSet":[
        {"version":84689494804123,
         "operation":"CREATED",
         "urlName":"https://finance.europe.hcp.example.com/rest/presentations/Q1_2012.ppt",
         "changeTimeMilliseconds":"1334244924615.00",
         "retention":0,
         "shred":false},
        . . .
    ],
    "status":{
        "message":"",
        "results":12,
        "code":"COMPLETE"}
    }
}
<?xml version="1.0"?>
<presentation>
    <presentedBy>Lee Green</presentedBy>
    <department>Accounting</department>
    <slides>23</slides>
    <date>04-01-2012</date>
</presentation>
Example: Using a paged query to retrieve a list of all objects in a namespace
The Java® example below implements a paged query that uses multiple requests to retrieve all objects in a namespace. The example returns metadata for fifty objects per request and also returns information about the size and ingest time of each object in the result set.
This example uses the com.hds.hcp.apihelpers.query Java class infrastructure, which uses the Jackson JSON processor to produce a JSON query request body and consume a JSON query response. To limit the example size, the example does not include the source code for this infrastructure. To view the full source code, see http://community.hitachivantara.com/groups/developer-network-for-hitachi-content-platform and reference the sample code section.
The Jackson JSON processor serializes and deserializes JSON formatted content with Java Objects. For more information about the Jackson JSON processor, see http://jackson.codehaus.org.
package com.hds.hcp.examples;

import java.util.List;
import java.io.BufferedReader;
import java.io.InputStreamReader;

import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.HttpResponseException;
import org.apache.http.client.methods.*;
import org.apache.http.entity.StringEntity;
import org.apache.http.util.EntityUtils;

/* General purpose helper routines for samples */
import com.hds.hcp.apihelpers.HCPUtils;

/* Provide for helper routines to encapsulate the queryRequest and queryResults. */
import com.hds.hcp.apihelpers.query.request.Object;
import com.hds.hcp.apihelpers.query.request.QueryRequest;
import com.hds.hcp.apihelpers.query.result.Status;
import com.hds.hcp.apihelpers.query.result.QueryResult;
import com.hds.hcp.apihelpers.query.result.ResultSetRecord;

public class PagedObjectQuery {

    // Local member variables
    private Boolean bIsInitialized = false;
    private String sQueryTenant;
    private String sQueryNamespace;
    private String sEncodedUserName, sEncodedPassword;
    private String sHTTPQueryURL;
    private HttpClient mHttpClient;

    /**
     * Initialize the object by setting up internal data and establishing the HTTP
     * client connection.
     *
     * This routine is called by the ReadFromHCP and WriteToHCP routines, so calling it
     * by the consumer of this class is unnecessary.
     */
    void initialize(String inNamespace, String inUsername, String inPassword)
            throws Exception {

        if (! bIsInitialized)  // Initialize only if we haven't already
        {
            // Break up the namespace specification to get the namespace and tenant parts.
            String parts[] = inNamespace.split("\\.");
            sQueryNamespace = parts[0];
            sQueryTenant = parts[1];

            // Now extract just the tenant part of the URL and use it to create the
            // HTTPQueryURL.
            parts = inNamespace.split(sQueryNamespace + "\\.");
            sHTTPQueryURL = "https://" + parts[1] + "/query";

            // Encode both the username and password for the authentication string.
            sEncodedUserName = HCPUtils.toBase64Encoding(inUsername);
            sEncodedPassword = HCPUtils.toMD5Digest(inPassword);

            // Set up an HTTP client for sample usage.
            mHttpClient = HCPUtils.initHttpClient();

            bIsInitialized = true;
        }
    }

    /**
     * This method performs an orderly shutdown of the HTTP connection manager.
     */
    void shutdown() throws Exception {
        // Clean up open connections by shutting down the connection manager.
        mHttpClient.getConnectionManager().shutdown();
    }

    /**
     * This routine issues a query to an HCP namespace requesting information about
     * objects in it. The query requests 50 results at a time. If there are more,
     * the routine performs paged queries to retrieve all the results.
     *
     * While processing the query results, the routine displays the name of the first
     * and last object of the result set to system output.
     */
    protected void runQuery() {

        // Statistics counters
        Long TotalRecordsProcessed = 0L;
        Integer HTTPCalls = 0;

        try {
            /*
             * Set up the query request.
             */

            // Set up for an object query by calling the
            // com.hds.hcp.apihelpers.query.request.Object constructor.
            Object mObjQuery = new Object();

            // Get only 50 objects at a time.
            mObjQuery.setCount(50);

            // Retrieve only those that reside in the namespace specified in the command.
            mObjQuery.setQuery("+namespace:" + sQueryNamespace + "." + sQueryTenant);

            // Retrieve the "size" and "ingestTimeString" properties for the object.
            mObjQuery.setObjectProperties("size,ingestTimeString");

            // Set up the query request.
            QueryRequest mQuery = new QueryRequest(mObjQuery);

            /*
             * Loop through and process all the objects one response at a time or until
             * an error occurs.
             */
            QueryResult mQueryResult = null;
            do {
                System.out.println("Issuing query: \n" + mQuery.toString(true));

                /*
                 * Execute the query using the HTTP POST method.
                 */
                HttpPost httpRequest = new HttpPost(sHTTPQueryURL);

                // Add the body of the POST request.
                httpRequest.setEntity(new StringEntity(mQuery.toString()));

                // Set the Authorization header (header name and value are
                // separate arguments).
                httpRequest.setHeader("Authorization",
                        "HCP " + sEncodedUserName + ":" + sEncodedPassword);

                // Execute the query.
                HttpResponse httpResponse = mHttpClient.execute(httpRequest);

                // For debugging purposes, dump out the HTTP response.
                HCPUtils.dumpHttpResponse(httpResponse);

                // If the status code is anything BUT in the 200 range indicating success,
                // throw an exception.
                if (2 != (int)(httpResponse.getStatusLine().getStatusCode() / 100))
                {
                    // Clean up after ourselves and release the HTTP connection to the
                    // connection manager.
                    EntityUtils.consume(httpResponse.getEntity());

                    throw new HttpResponseException(
                            httpResponse.getStatusLine().getStatusCode(),
                            "Unexpected status returned from "
                            + httpRequest.getMethod() + " ("
                            + httpResponse.getStatusLine().getStatusCode() + ": "
                            + httpResponse.getStatusLine().getReasonPhrase() + ")");
                }

                /*
                 * Process the response from the query request.
                 */

                // Put the response in a buffered reader.
                BufferedReader bodyReader = new BufferedReader(
                        new InputStreamReader(httpResponse.getEntity().getContent()));

                HTTPCalls += 1;

                // Parse the response into the QueryResult object.
                mQueryResult = QueryResult.parse(bodyReader);

                // Get a copy of the query status from the query result.
                Status mStatus = mQueryResult.getStatus();

                // Display the status of what we just accomplished.
                System.out.println();
                System.out.println("Batch " + HTTPCalls + " Status: "
                        + mStatus.getCode() + " Record Count:" + mStatus.getResults());

                // Display the first and last object of the result set.
                List<ResultSetRecord> mResultSet = mQueryResult.getResultSet();

                ResultSetRecord mFirstRecord = mResultSet.get(0);
                System.out.println(" First Record (" + (TotalRecordsProcessed+1) + ") "
                        + mFirstRecord.getUrlName());
                System.out.println(" Size: " + mFirstRecord.getSize());

                TotalRecordsProcessed += mStatus.getResults();

                ResultSetRecord mLastRecord = mResultSet.get(mResultSet.size()-1);
                System.out.println(" Last Record (" + TotalRecordsProcessed + ") "
                        + mLastRecord.getUrlName());
                System.out.println(" Size: " + mLastRecord.getSize());
                System.out.println();

                // Now we need to see whether the query is complete or whether there are more
                // objects. If INCOMPLETE, it is a successful paged query.
                if (Status.Code.INCOMPLETE == mStatus.getCode()) {
                    // We have more, so update the offset for the next query to be the previous
                    // offset plus the number we just read.
                    mObjQuery.setOffset(
                        (null == mObjQuery.getOffset() ? 0 : mObjQuery.getOffset())
                        + mStatus.getResults()
                    );
                }

                // Clean up after ourselves and release the HTTP connection to the connection
                // manager.
                EntityUtils.consume(httpResponse.getEntity());

            } // Keep doing this while we have more results.
            while (Status.Code.INCOMPLETE == mQueryResult.getStatus().getCode());

            /*
             * Print out the final statistics.
             */
            System.out.println("Total Records Processed: " + TotalRecordsProcessed);
            System.out.println("HTTP Calls: " + HTTPCalls);

        } catch(Exception e) {
            e.printStackTrace();
        }
    }

    /*
     * @param args
     */
    public static void main(String[] args) {
        PagedObjectQuery myClass = new PagedObjectQuery();

        if (args.length != 3) {
            System.out.println();
            System.out.println("Usage: " + myClass.getClass().getSimpleName()
                    + " <DNS-Namespace> <Username> <Password>\n");
            System.out.println(" where ");
            System.out.println(" <DNS-Namespace> is the fully qualified domain name"
                    + " of the HCP Namespace.");
            System.out.println(" For example: \"ns1.ten1.myhcp.example.com\"");
            System.out.println(" <Username> and <Password> are the credentials of the"
                    + " HCP user with data access permissions for the namespace");
            System.out.println();
            System.exit(-1);
        }

        try {
            // Initialize the class with the input parameters
            myClass.initialize(args[0], args[1], args[2]);

            // Issue the query and process the results
            myClass.runQuery();

            // Clean up before object destruction
            myClass.shutdown();

        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
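The core of the Java example is its paging loop: request a fixed number of objects and, while the status code is INCOMPLETE, advance the offset by the number of results just received. The sketch below isolates that loop in Python. The fetch_page function is a stand-in that serves slices of an in-memory list instead of calling HCP, so its name and the shape of the dictionary it returns are illustrative assumptions.

```python
def fetch_page(all_objects, offset, count):
    """Stand-in for one query request; returns a response-like dict."""
    page = all_objects[offset:offset + count]
    more = offset + len(page) < len(all_objects)
    return {
        "resultSet": page,
        "status": {
            "results": len(page),
            "code": "INCOMPLETE" if more else "COMPLETE",
        },
    }


def query_all(all_objects, count=50):
    """Collect every result by advancing the offset page by page."""
    collected = []
    offset = 0
    calls = 0
    while True:
        response = fetch_page(all_objects, offset, count)
        calls += 1
        collected.extend(response["resultSet"])
        if response["status"]["code"] != "INCOMPLETE":
            break
        # More results remain: the next offset is the previous offset
        # plus the number of results just read.
        offset += response["status"]["results"]
    return collected, calls


objects = ["/rest/file%03d" % i for i in range(120)]
results, http_calls = query_all(objects, count=50)
print(len(results), http_calls)
```

With 120 objects and a page size of 50, the loop issues three requests: two INCOMPLETE pages of 50 and a final COMPLETE page of 20.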
Example: Using a faceted query to retrieve object information
Here is a sample metadata query API request that retrieves metadata for all objects added to namespaces owned by the europe tenant between March 1, 2012, and March 31, 2012, inclusive. The verbose entry specifies true to request all metadata for each object in the result set. This request also retrieves namespace facet information for objects in the result set. The query uses an XML request body and requests results in XML format.
<queryRequest>
    <object>
        <query>ingestTime:[1330560000 TO 1333238399]</query>
        <facets>namespace</facets>
        <verbose>true</verbose>
    </object>
</queryRequest>
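The bounds of the ingestTime range are expressed in seconds since January 1, 1970. As a rough illustration, the sketch below computes inclusive bounds for a calendar month, assuming the month boundaries are taken in UTC. The helper name month_bounds is hypothetical. For March 2012 it yields the two values used in the request above.

```python
import calendar
from datetime import datetime, timezone


def month_bounds(year, month):
    """Return (first_second, last_second) of a month as UTC epoch seconds."""
    start = datetime(year, month, 1, tzinfo=timezone.utc)
    days = calendar.monthrange(year, month)[1]
    first = int(start.timestamp())
    # The last second of the month is one second before the next month starts.
    last = first + days * 86400 - 1
    return first, last


first, last = month_bounds(2012, 3)
query = "ingestTime:[%d TO %d]" % (first, last)
print(query)
```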
curl -k -H "Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d" -H "Content-Type: application/xml" -H "Accept: application/xml" -d @March.xml "https://europe.hcp.example.com/query?prettyprint"
import pycurl
import os

curl = pycurl.Curl()

# Set the URL, command, and headers
curl.setopt(pycurl.URL, "https://europe.hcp.example.com/" +
            "query?prettyprint")
curl.setopt(pycurl.SSL_VERIFYPEER, 0)
curl.setopt(pycurl.SSL_VERIFYHOST, 0)
curl.setopt(pycurl.POST, 1)
curl.setopt(pycurl.HTTPHEADER, [
    "Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d",
    "Content-Type: application/xml",
    "Accept: application/xml"])

# Set the request body from an XML file
filehandle = open("March.xml", 'rb')
curl.setopt(pycurl.UPLOAD, 1)
curl.setopt(pycurl.CUSTOMREQUEST, "POST")
curl.setopt(pycurl.INFILESIZE, os.path.getsize("March.xml"))
curl.setopt(pycurl.READFUNCTION, filehandle.read)

# Send the request and print the HTTP status code
curl.perform()
print(curl.getinfo(pycurl.RESPONSE_CODE))
curl.close()
filehandle.close()
POST /query?prettyprint HTTP/1.1
Host: europe.hcp.example.com
Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d
Content-Type: application/xml
Accept: application/xml
Content-Length: 134

HTTP/1.1 200 OK
Transfer-Encoding: chunked
To limit the example size, the XML below shows only one object entry in the response body.
<?xml version='1.0' encoding='UTF-8'?>
<queryResult xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="/static/xsd/query-result-7.0.xsd">
    <query>
        <expression>ingestTime:[1330560000 TO 1333238399]</expression>
    </query>
    <resultSet>
        <object
            version="84689595801123"
            utf8Name="Q1_2020.ppt"
            urlName="https://marketing.europe.hcp.example.com/rest/presentations/Q1_2020.ppt"
            updateTimeString="2012-03-31T15:41:35-0400"
            updateTime="1333222895"
            uid="0"
            type="object"
            size="6628"
            shred="false"
            retentionString="Deletion Allowed"
            retentionClass=""
            retention="0"
            replicated="true"
            permissions="555"
            owner="USER,europe,lgreen"
            operation="CREATED"
            namespace="marketing.europe"
            ingestTimeString="2012-03-31T15:41:35-0400"
            ingestTime="1333222895"
            index="true"
            hold="false"
            hashScheme="SHA-256"
            hash="SHA-256 0662D2A2DEF74EF02A8DF5A4F16BF4D55FEE582..."
            gid="0"
            objectPath="/presentations/Q1_2020.ppt"
            dpl="2"
            customMetadata="false"
            changeTimeString="2012-03-31T15:41:35-0400"
            changeTimeMilliseconds="1333222895615.00"
            accessTimeString="2012-03-31T15:41:35-0400"
            accessTime="1333222895"
            acl="false" />
        . . .
    </resultSet>
    <status results="7" message="" code="COMPLETE" />
    <facets>
        <facet property="namespace">
            <frequency count="4" value="finance.europe" />
            <frequency count="3" value="marketing.europe" />
        </facet>
    </facets>
</queryResult>
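Once the response is received, the facet information can be read with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree module on a trimmed-down, well-formed fragment modeled on the response above. The function name facet_counts is hypothetical.

```python
import xml.etree.ElementTree as ET

SAMPLE_RESPONSE = """<queryResult>
  <resultSet />
  <status results="7" message="" code="COMPLETE" />
  <facets>
    <facet property="namespace">
      <frequency count="4" value="finance.europe" />
      <frequency count="3" value="marketing.europe" />
    </facet>
  </facets>
</queryResult>"""


def facet_counts(response_xml, facet_property):
    """Map each facet value to its object count for one facet property."""
    root = ET.fromstring(response_xml)
    counts = {}
    for facet in root.iter("facet"):
        if facet.get("property") == facet_property:
            for freq in facet.findall("frequency"):
                counts[freq.get("value")] = int(freq.get("count"))
    return counts


namespace_counts = facet_counts(SAMPLE_RESPONSE, "namespace")
print(namespace_counts)
```

In this sample the per-namespace counts add up to the results value in the status element.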
Example: Querying for replication collisions in a namespace
Here’s a sample metadata query API request that retrieves metadata for all objects that are:
- Flagged as replication collisions
- In the finance namespace owned by the europe tenant
The query uses an XML request body and requests results in XML format.
In addition to the URL, version ID, operation type, and change time, which are returned for every object in the result set, this request returns the object path. The request specifies that the result set be sorted by object path in ascending order.
<queryRequest>
    <object>
        <query>
            +namespace:finance.europe
            +replicationCollision:true
        </query>
        <objectProperties>objectPath</objectProperties>
        <sort>objectPath+asc</sort>
    </object>
</queryRequest>
curl -k -H "Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d" -H "Content-Type: application/xml" -H "Accept: application/xml" -d @FinanceCollisions.xml "https://europe.hcp.example.com/query?prettyprint"
import pycurl
import os

curl = pycurl.Curl()

# Set the URL, command, and headers
curl.setopt(pycurl.URL, "https://europe.hcp.example.com/" +
            "query?prettyprint")
curl.setopt(pycurl.SSL_VERIFYPEER, 0)
curl.setopt(pycurl.SSL_VERIFYHOST, 0)
curl.setopt(pycurl.POST, 1)
curl.setopt(pycurl.HTTPHEADER, [
    "Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d",
    "Content-Type: application/xml",
    "Accept: application/xml"])

# Set the request body from an XML file
filehandle = open("FinanceCollisions.xml", 'rb')
curl.setopt(pycurl.UPLOAD, 1)
curl.setopt(pycurl.CUSTOMREQUEST, "POST")
curl.setopt(pycurl.INFILESIZE, os.path.getsize("FinanceCollisions.xml"))
curl.setopt(pycurl.READFUNCTION, filehandle.read)

# Send the request and print the HTTP status code
curl.perform()
print(curl.getinfo(pycurl.RESPONSE_CODE))
curl.close()
filehandle.close()
POST /query?prettyprint HTTP/1.1
Host: europe.hcp.example.com
Content-Type: application/xml
Accept: application/xml
Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d
Content-Length: 205

HTTP/1.1 200 OK
Transfer-Encoding: chunked
<?xml version='1.0' encoding='UTF-8'?>
<queryResult xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="/static/xsd/query-result-9.0.xsd">
    <query>
        <expression>+namespace:finance.europe +replicationCollision:true</expression>
    </query>
    <resultSet>
        <object
            version="89322738450881"
            urlName="https://finance.europe.hcp.example.com/rest/budgets/2020/sales_budget_2020.xlsx.collision"
            operation="CREATED"
            objectPath="/budgets/2020/sales_budget_2020.xlsx.collision"
            changeTimeMilliseconds="1395668086005.00" />
        <object
            version="89322749144130"
            urlName="https://finance.europe.hcp.example.com/rest/quarterly_rpts/Q1_2020.ppt.collision"
            operation="CREATED"
            objectPath="/quarterly_rpts/Q1_2020.ppt.collision"
            changeTimeMilliseconds="1395668327386.00" />
    </resultSet>
    <status totalResults="2" results="2" message="" code="COMPLETE" />
</queryResult>
Example: Listing content properties
Here is a sample metadata query API request that lists the content properties for all indexed objects in the medical namespace owned by the employees tenant. The query uses an XML request body and requests results in XML format.
<queryRequest>
    <object>
        <query>namespace:medical.employees</query>
        <count>0</count>
        <contentProperties>true</contentProperties>
    </object>
</queryRequest>
curl -i -k -H "Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d" -H "Content-Type: application/xml" -H "Accept: application/xml" -d @MedicalQuery.xml "https://employees.hcp.example.com/query?prettyprint"
import pycurl
import os

curl = pycurl.Curl()

# Set the URL, command, and headers
curl.setopt(pycurl.URL, "https://employees.hcp.example.com/" +
            "query?prettyprint")
curl.setopt(pycurl.SSL_VERIFYPEER, 0)
curl.setopt(pycurl.SSL_VERIFYHOST, 0)
curl.setopt(pycurl.POST, 1)
curl.setopt(pycurl.HTTPHEADER, [
    "Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d",
    "Content-Type: application/xml",
    "Accept: application/xml"])

# Set the request body from an XML file
filehandle = open("MedicalQuery.xml", 'rb')
curl.setopt(pycurl.UPLOAD, 1)
curl.setopt(pycurl.CUSTOMREQUEST, "POST")
curl.setopt(pycurl.INFILESIZE, os.path.getsize("MedicalQuery.xml"))
curl.setopt(pycurl.READFUNCTION, filehandle.read)

# Send the request and print the HTTP status code
curl.perform()
print(curl.getinfo(pycurl.RESPONSE_CODE))
curl.close()
filehandle.close()
POST /query?prettyprint HTTP/1.1
Host: employees.hcp.example.com
Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d
Content-Type: application/xml
Accept: application/xml
Content-Length: 155

HTTP/1.1 200 OK
Transfer-Encoding: chunked
To limit the example size, the XML below shows only two contentProperty entries in the response body.
<?xml version='1.0' encoding='UTF-8'?>
<queryResult xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="/static/xsd/query-result-7.0.xsd">
    <query>
        <expression>namespace:medical.employees</expression>
    </query>
    <resultSet />
    <status totalResults="0" results="0" message="" code="COMPLETE" />
    <contentProperties>
        <contentProperty>
            <name>DocDateOfBirth</name>
            <expression>/record/doctor/dob</expression>
            <type>DATE</type>
            <multivalued>false</multivalued>
            <format>MM/dd/yyy</format>
        </contentProperty>
        <contentProperty>
            <name>DocLastName</name>
            <expression>/record/doctor/name/lastName</expression>
            <type>STRING</type>
            <multivalued>false</multivalued>
            <format></format>
        </contentProperty>
    </contentProperties>
</queryResult>
Operation-based query examples
This section contains examples of operation-based queries.
Example: Retrieving all operation records for all existing and deleted objects in a namespace
Here’s a sample metadata query API request that retrieves operation records for all objects currently in or deleted from the sales namespace owned by the midwest tenant. The query uses an XML request body and requests results in JSON format.
The verbose entry is set to true to request detailed information for all operation records in the result set.
The response body includes records for all create, delete, and purge operations that occurred from the time the namespace was created up to one minute before the request was made on March 14, 2012, at 14:59:37 EST.
<queryRequest>
    <operation>
        <count>-1</count>
        <systemMetadata>
            <namespaces>
                <namespace>sales.midwest</namespace>
            </namespaces>
        </systemMetadata>
        <verbose>true</verbose>
    </operation>
</queryRequest>
curl -i -k -H "Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d" -H "Content-Type: application/xml" -H "Accept: application/json" -d @AllSales.xml "https://midwest.hcp.example.com/query?prettyprint"
import pycurl
import os

curl = pycurl.Curl()

# Set the URL, command, and headers
curl.setopt(pycurl.URL, "https://midwest.hcp.example.com/" +
            "query?prettyprint")
curl.setopt(pycurl.SSL_VERIFYPEER, 0)
curl.setopt(pycurl.SSL_VERIFYHOST, 0)
curl.setopt(pycurl.POST, 1)
curl.setopt(pycurl.HTTPHEADER, [
    "Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d",
    "Content-Type: application/xml",
    "Accept: application/json"])

# Set the request body from an XML file
filehandle = open("AllSales.xml", 'rb')
curl.setopt(pycurl.UPLOAD, 1)
curl.setopt(pycurl.CUSTOMREQUEST, "POST")
curl.setopt(pycurl.INFILESIZE, os.path.getsize("AllSales.xml"))
curl.setopt(pycurl.READFUNCTION, filehandle.read)

# Send the request and print the HTTP status code
curl.perform()
print(curl.getinfo(pycurl.RESPONSE_CODE))
curl.close()
filehandle.close()
POST /query?prettyprint HTTP/1.1
Host: midwest.hcp.example.com
Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d
Content-Type: application/xml
Accept: application/json
Content-Length: 258

HTTP/1.1 200 OK
Transfer-Encoding: chunked
To limit the example size, the JSON below shows only one object entry in the response body.
{"queryResult":
    {"query":{"start":0,"end":1331751577658},
    "resultSet":[
        {"version":81787144560449,
         "utf8Name":"C346527",
         "urlName":"https://sales.midwest.hcp.example.com/rest/customers/widgetco/orders/C346527",
         "updateTimeString":"2012-03-10T14:55:33-0500",
         "updateTime":1331409333,
         "uid":0,
         "type":"object",
         "size":4985,
         "shred":false,
         "retentionString":"Deletion Allowed",
         "retentionClass":"",
         "retention":"0",
         "replicated":true,
         "permissions":"256",
         "owner":"USER,midwest,rblack",
         "operation":"CREATED",
         "namespace":"sales.midwest",
         "ingestTimeString":"2012-03-10T14:55:33-0500",
         "ingestTime":1331409333,
         "index":true,
         "hold":false,
         "hashScheme":"SHA-256",
         "hash":"SHA-256 C67EF26C0E5EDB102A2DEF74EF02A8DF5A4F16BF4D...",
         "gid":0,
         "objectPath":"/customers/widgetco/orders/C346527",
         "dpl":2,
         "customMetadata":false,
         "changeTimeString":"2012-03-10T14:55:33-0500",
         "changeTimeMilliseconds":"1331409333948.00",
         "accessTimeString":"2012-03-10T14:55:33-0500",
         "accessTime":1331409333,
         "acl":false},
        . . .
    ],
    "status":{"results":7,"message":"","code":"COMPLETE"}
    }
}
Example: Retrieving metadata for changed objects
Here’s a sample metadata query API request that uses a JSON body specified directly in the cURL command line and Python code to retrieve operation records for objects that:
- Are in the finance namespace, which is owned by the europe tenant
- Were modified during 2011
The start entry specifies 12:00:00.00 a.m. on January 1, 2011, and the end entry specifies 12:00:00.00 a.m. on January 1, 2012.
The response body is XML. The information returned for each operation record that meets the query criteria consists of the object URL, version ID, operation, and change time.
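Unlike the ingestTime values used in the object-based examples, the start and end entries here are expressed in milliseconds since January 1, 1970. The following sketch builds the JSON request body for this example, assuming the year boundaries are taken in UTC. Both function names are hypothetical. Calling to_millis(2011, 1, 1) and to_millis(2012, 1, 1) produces the start and end values that appear in the request.

```python
import json
from datetime import datetime, timezone


def to_millis(year, month, day):
    """Midnight UTC on the given date, in milliseconds since the epoch."""
    moment = datetime(year, month, day, tzinfo=timezone.utc)
    return int(moment.timestamp()) * 1000


def changed_objects_query(namespace, start_ms, end_ms):
    """Build an operation-based query body limited to one namespace."""
    return {
        "operation": {
            "systemMetadata": {
                "changeTime": {"start": start_ms, "end": end_ms},
                "namespaces": {"namespace": [namespace]},
            }
        }
    }


body = changed_objects_query(
    "finance.europe", to_millis(2011, 1, 1), to_millis(2012, 1, 1))
print(json.dumps(body))
```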
curl -k -H "Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d" -H "Content-Type: application/json" -H "Accept: application/xml" -d '{"operation":{"systemMetadata":{"changeTime": {"start":1293840000000,"end":1325376000000},"namespaces": {"namespace":["finance.europe"]}}}}' "https://europe.hcp.example.com/query?prettyprint"
import pycurl

curl = pycurl.Curl()

# Set the URL, command, and headers
curl.setopt(pycurl.URL, "https://europe.hcp.example.com/" +
            "query?prettyprint")
curl.setopt(pycurl.SSL_VERIFYPEER, 0)
curl.setopt(pycurl.SSL_VERIFYHOST, 0)
curl.setopt(pycurl.POST, 1)
curl.setopt(pycurl.HTTPHEADER, [
    "Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d",
    "Content-Type: application/json",
    "Accept: application/xml"])

# Set the request body
theFields = '{"operation":{"systemMetadata":{"changeTime": \
{"start":1293840000000,"end":1325376000000},"namespaces": \
{"namespace":["finance.europe"]}}}}'
curl.setopt(pycurl.POSTFIELDS, theFields)

# Send the request and print the HTTP status code
curl.perform()
print(curl.getinfo(pycurl.RESPONSE_CODE))
curl.close()
POST /query?prettyprint HTTP/1.1
Host: europe.hcp.example.com
Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d
Content-Type: application/json
Accept: application/xml
Content-Length: 81

HTTP/1.1 200 OK
Transfer-Encoding: chunked
To limit the example size, the XML below shows only two object entries in the response body.
<?xml version='1.0' encoding='UTF-8'?>
<queryResult xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="/static/xsd/query-result-7.0.xsd">
    <query start="1293840000000" end="1325376000000" />
    <resultSet>
        <object
            version="81787101672577"
            urlName="https://finance.europe.hcp.example.com/rest/Presentations/Q2_2019.ppt"
            operation="CREATED"
            changeTimeMilliseconds="1310392057456.00" />
        <object
            version="81787102472129"
            urlName="https://finance.europe.hcp.example.com/rest/Presentations/Images/thankYou.jpg"
            operation="CREATED"
            changeTimeMilliseconds="1310392336286.00" />
        . . .
    </resultSet>
    <status results="11" message="" code="COMPLETE" />
</queryResult>
Example: Using a paged query to retrieve a large number of records
The Python example below implements a paged query that uses multiple requests to retrieve a large number of operation records in batches of 50 per request. This query retrieves records for all create operations on objects in the /customers/widgetco/orders directory in the default namespace and returns basic information for each record.
The query uses a JSON request body and requests results in JSON format.
#!/usr/bin/env python
# encoding: utf-8

import io
import json
import time

import pycurl


class OperationBasedQueryTool():
    queryArguments = {'operation': {'count': 1, 'verbose': 'false',
        'objectProperties': 'utf8Name, type, size',
        'systemMetadata': {'changeTime': {},
            'directories': {'directory': []},
            'namespaces': {'namespace': []},
            'transactions': {'transaction': []}}}}

    def __init__(self):
        self.complete = False

    def setConnectionInfo(self, authToken, hostName, urlName):
        """
        Set all connection info for subsequent query requests.
        @param authToken: authorization token
        @param hostName: Hostname of the target cluster
        @param urlName: Full URL for the query interface
        """
        self.curl = pycurl.Curl()
        requestHeaders = {pycurl.HTTPHEADER: [
            "Authorization: HCP " + authToken,
            "Accept: application/json",
            "Content-Type: application/json",
            "Host: admin.%s" % (hostName)]}
        self.curl.setopt(pycurl.FAILONERROR, 1)
        self.curl.setopt(pycurl.HTTPHEADER, requestHeaders[pycurl.HTTPHEADER])
        self.curl.setopt(pycurl.URL, urlName)
        for header, value in requestHeaders.items():
            self.curl.setopt(header, value)
        self.curl.setopt(pycurl.CUSTOMREQUEST, 'POST')
        self.curl.setopt(pycurl.SSL_VERIFYPEER, 0)
        self.curl.setopt(pycurl.SSL_VERIFYHOST, 0)
        self.curl.setopt(pycurl.VERBOSE, 0)

    def setQueryParameters(self, count, verbose, directories, namespaces,
                           transactions,
                           objectProperties='utf8Name, type, size',
                           startTimeMillis=0, endTimeMillis=None):
        """
        Set all parameters related to the query.
        @param count: The number of results to return for each query.
        @param verbose: Indication to return all object property values.
            Value is either true or false.
        @param directories: Dictionary containing list of directory paths.
        @param namespaces: Dictionary containing list of namespaces.
        @param transactions: Dictionary containing list of transaction types.
        @param objectProperties: String containing comma-separated list of
            object properties to return for each operation record.
        @param startTimeMillis: The starting timestamp in milliseconds of
            the query window. Default is 0 (zero).
        @param endTimeMillis: The ending timestamp in milliseconds of the
            query window. Default is the time at which this method is called.
        """
        if endTimeMillis is None:
            # Compute the default at call time, not at class definition time.
            endTimeMillis = int(round(time.time() * 1000))
        operation = self.queryArguments['operation']
        operation['count'] = count
        operation['objectProperties'] = objectProperties
        operation['verbose'] = verbose
        operation['systemMetadata']['directories'] = directories
        operation['systemMetadata']['namespaces'] = namespaces
        operation['systemMetadata']['transactions'] = transactions
        operation['systemMetadata']['changeTime']['start'] = startTimeMillis
        operation['systemMetadata']['changeTime']['end'] = endTimeMillis

    def issueQuery(self):
        """
        Issue an operation-based query request.
        """
        self.curl.setopt(pycurl.POSTFIELDS, json.dumps(self.queryArguments))
        cout = io.BytesIO()
        self.curl.setopt(pycurl.WRITEFUNCTION, cout.write)
        print("Performing query with the following arguments: %s"
              % json.dumps(self.queryArguments))
        self.curl.perform()
        responseCode = self.curl.getinfo(pycurl.RESPONSE_CODE)
        if responseCode == 200:
            # Parse the body as JSON; eval() fails on true/false/null.
            queryResult = json.loads(cout.getvalue().decode('utf-8'))
            if queryResult['queryResult']['status']['code'] == "COMPLETE":
                self.complete = True
            cout.close()
            return queryResult
        else:
            raise Exception("Error: Expected result code 200, but received %s"
                            % responseCode)

    def setLastResult(self, lastResult):
        """
        Sets the last result we received as the starting point for the
        next query we issue.
        @param lastResult: The dictionary containing the last result
            returned by the previous query.
        """
        operation = self.queryArguments['operation']
        operation['lastResult'] = dict()
        operation['lastResult']['urlName'] = lastResult['urlName']
        operation['lastResult']['changeTimeMilliseconds'] = \
            lastResult['changeTimeMilliseconds']
        operation['lastResult']['version'] = str(lastResult['version'])

    def closeConnection(self):
        """
        Cleanup the curl connection after we are finished with it.
        """
        self.curl.close()


if __name__ == '__main__':
    authToken = "bXl1c2Vy:3f3c6784e97531774380db177774ac8d"
    hostName = "clusterName.com"
    urlName = "https://admin.%s/query" % hostName
    resultsPerQuery = 50
    objectUrls = []

    queryTool = OperationBasedQueryTool()
    queryTool.setConnectionInfo(authToken, hostName, urlName)
    queryTool.setQueryParameters(resultsPerQuery, "false",
        {'directory': ['/customers/widgetco/orders']},
        {'namespace': ['Default.Default']},
        {'transaction': ['create']})
    try:
        while not queryTool.complete:
            queryResults = queryTool.issueQuery()
            resultSet = queryResults['queryResult']['resultSet']
            for result in resultSet:
                objectUrls.append(result['urlName'])
            if not resultSet:
                # Nothing returned, so there is no last result to page from.
                break
            queryTool.setLastResult(resultSet[-1])
        print("Query completed. Total objects found: %d" % len(objectUrls))
    finally:
        queryTool.closeConnection()
Example: Checking for replication collisions
Here is a sample metadata query API request that checks whether any namespaces owned by the europe tenant currently contain objects that are flagged as replication collisions. The response to the query does not include operation records for any of those objects, but the status of INCOMPLETE indicates that records for such objects exist.
The query uses an XML request body and requests results in XML format.
<queryRequest>
    <operation>
        <count>0</count>
        <systemMetadata>
            <replicationCollision>true</replicationCollision>
            <transactions>
                <transaction>create</transaction>
            </transactions>
        </systemMetadata>
    </operation>
</queryRequest>
curl -i -k -H "Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d" -H "Content-Type: application/xml" -H "Accept: application/xml" -d @ReplicationCollisions.xml "https://europe.hcp.example.com/query?prettyprint"
import pycurl
import os

curl = pycurl.Curl()

# Set the URL, command, and headers
curl.setopt(pycurl.URL, "https://europe.hcp.example.com/" +
            "query?prettyprint")
curl.setopt(pycurl.SSL_VERIFYPEER, 0)
curl.setopt(pycurl.SSL_VERIFYHOST, 0)
curl.setopt(pycurl.POST, 1)
curl.setopt(pycurl.HTTPHEADER, [
    "Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d",
    "Content-Type: application/xml",
    "Accept: application/xml"])

# Set the request body from an XML file
filehandle = open("ReplicationCollisions.xml", 'rb')
curl.setopt(pycurl.UPLOAD, 1)
curl.setopt(pycurl.CUSTOMREQUEST, "POST")
curl.setopt(pycurl.INFILESIZE, os.path.getsize("ReplicationCollisions.xml"))
curl.setopt(pycurl.READFUNCTION, filehandle.read)

# Send the request and print the HTTP status code
curl.perform()
print(curl.getinfo(pycurl.RESPONSE_CODE))
curl.close()
filehandle.close()
POST /query?prettyprint HTTP/1.1
Host: europe.hcp.example.com
Content-Type: application/xml
Accept: application/xml
Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d
Content-Length: 233

HTTP/1.1 200 OK
Transfer-Encoding: chunked
<?xml version='1.0' encoding='UTF-8'?>
<queryResult xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="/static/xsd/query-result-9.0.xsd">
    <query start="0" end="1395694699683" />
    <resultSet />
    <status results="0" message="" code="INCOMPLETE" />
</queryResult>
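The check described in this example reduces to reading the code attribute of the status element: with a count of 0, a status of INCOMPLETE means that matching records exist and, presumably, COMPLETE means that there are none. A minimal sketch, using Python's standard XML parser and a hypothetical function name:

```python
import xml.etree.ElementTree as ET


def has_more_records(response_xml):
    """Return True when the status code of a query response is INCOMPLETE."""
    status = ET.fromstring(response_xml).find("status")
    return status is not None and status.get("code") == "INCOMPLETE"


sample = """<queryResult>
  <query start="0" end="1395694699683" />
  <resultSet />
  <status results="0" message="" code="INCOMPLETE" />
</queryResult>"""

print(has_more_records(sample))
```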