Examples

This chapter contains examples of both object-based and operation-based queries. The examples show some of the ways you can use the metadata query API to get information about namespace content.

Object-based query examples

This section contains examples of object-based queries.

Example: Querying for custom metadata content

Here’s a sample metadata query API request that retrieves metadata for all objects that:

  • Are in namespaces owned by the europe tenant
  • Have custom metadata that contains an element named department with a value of Accounting

The query uses an XML request body and requests results in JSON format.

In addition to the basic information about the objects in the result set, this request returns the shred and retention settings for each object in the result set. The request also specifies that objects in the result set be listed in reverse chronological order based on change time.

Request body in the XML file named Accounting.xml
<queryRequest>
    <object>
        <query>customMetadataContent:
            "department.Accounting.department"
        </query>
        <objectProperties>shred,retention</objectProperties>
        <sort>changeTimeMilliseconds+desc</sort>
    </object>
</queryRequest>
Request with cURL command line
curl -k -H "Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d" \
    -H "Content-Type: application/xml" -H "Accept: application/json" \
    -d @Accounting.xml "https://europe.hcp.example.com/query?prettyprint"
Request in Python using PycURL
import pycurl
import os
curl = pycurl.Curl()

# Set the URL, command, and headers
curl.setopt(pycurl.URL, "https://europe.hcp.example.com/" +
    "query?prettyprint")
curl.setopt(pycurl.SSL_VERIFYPEER, 0)
curl.setopt(pycurl.SSL_VERIFYHOST, 0)
curl.setopt(pycurl.POST, 1)
curl.setopt(pycurl.HTTPHEADER,
    ["Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d",
     "Content-Type: application/xml", "Accept: application/json"])

# Set the request body from an XML file
filehandle = open("Accounting.xml", 'rb')
curl.setopt(pycurl.UPLOAD, 1)
curl.setopt(pycurl.CUSTOMREQUEST, "POST")
curl.setopt(pycurl.INFILESIZE,
        os.path.getsize("Accounting.xml"))
curl.setopt(pycurl.READFUNCTION, filehandle.read)

curl.perform()
print curl.getinfo(pycurl.RESPONSE_CODE)
curl.close()
filehandle.close()
Request headers
POST /query?prettyprint HTTP/1.1
Host: europe.hcp.example.com
Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d
Content-Type: application/xml
Accept: application/json
Content-Length: 192
Response headers
HTTP/1.1 200 OK
Transfer-Encoding: chunked
JSON response body

To limit the example size, the JSON below shows only one object in the result set.

{"queryResult":
    {"query":
        {"expression":"customMetadataContent:
            \"department.Accounting.department\""},
    "resultSet":[
        {"version":84689494804123,
        "operation":"CREATED",
        "urlName":"https://finance.europe.hcp.example.com/rest/presentations/
            Q1_2012.ppt",
        "changeTimeMilliseconds":"1334244924615.00",
         "retention":0,
         "shred":false},
    .
    .
    .
    ],
    "status":{
         "message":"",
         "results":12,
         "code":"COMPLETE"}
    }
}
Custom metadata file for the Q1_2012.ppt object
<?xml version="1.0"?>
<presentation>
    <presentedBy>Lee Green</presentedBy>
    <department>Accounting</department>
    <slides>23</slides>
    <date>04-01-2012</date>
</presentation>
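
The Authorization header in the requests above carries a token of the form HCP <username>:<password>. The following minimal Python sketch shows one way to build such a token. It assumes, based on the toBase64Encoding and toMD5Digest helper calls in the Java example later in this chapter, that the username part is Base64-encoded and the password part is a hexadecimal MD5 digest. The function name build_hcp_authorization and the sample password are hypothetical.

```python
import base64
import hashlib


def build_hcp_authorization(username: str, password: str) -> str:
    """Return an Authorization header value of the form 'HCP <user>:<password>'.

    Assumption: <user> is the Base64-encoded username and <password> is the
    hexadecimal MD5 digest of the password.
    """
    encoded_user = base64.b64encode(username.encode("utf-8")).decode("ascii")
    hashed_password = hashlib.md5(password.encode("utf-8")).hexdigest()
    return "HCP {}:{}".format(encoded_user, hashed_password)


# "myuser" is Base64-encoded as bXl1c2Vy, the username part of the sample token.
header_value = build_hcp_authorization("myuser", "example-password")
print("Authorization: " + header_value)
```

The resulting string can be passed as the Authorization entry of the pycurl.HTTPHEADER list in place of the hard-coded token.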

Example: Using a paged query to retrieve a list of all objects in a namespace

The Java® example below implements a paged query that uses multiple requests to retrieve all objects in a namespace. The example returns metadata for fifty objects per request and also returns information about the size and ingest time of each object in the result set.

This example uses the com.hds.hcp.apihelpers.query Java class infrastructure, which uses the Jackson JSON processor to produce a JSON query request body and consume a JSON query response. To limit the example size, the example does not include the source code for this infrastructure. To view the full source code, see http://community.hitachivantara.com/groups/developer-network-for-hitachi-content-platform and reference the sample code section.

The Jackson JSON processor serializes and deserializes JSON formatted content with Java Objects. For more information about the Jackson JSON processor, see http://jackson.codehaus.org.

package com.hds.hcp.examples;


import java.util.List;
import java.io.BufferedReader;
import java.io.InputStreamReader;


import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.HttpResponseException;
import org.apache.http.client.methods.*;
import org.apache.http.entity.StringEntity;
import org.apache.http.util.EntityUtils;


/* General purpose helper routines for samples */
import com.hds.hcp.apihelpers.HCPUtils;

 

/* Provide for helper routines to encapsulate the queryRequest and queryResults. */
import com.hds.hcp.apihelpers.query.request.Object;
import com.hds.hcp.apihelpers.query.request.QueryRequest;
import com.hds.hcp.apihelpers.query.result.Status;
import com.hds.hcp.apihelpers.query.result.QueryResult;
import com.hds.hcp.apihelpers.query.result.ResultSetRecord;

 

public class PagedObjectQuery {

    // Local member variables
    private Boolean bIsInitialized = false;
    private String sQueryTenant;
    private String sQueryNamespace;
    private String sEncodedUserName, sEncodedPassword;
    private String sHTTPQueryURL;
    private HttpClient mHttpClient;

    /**
    * Initialize the object by setting up internal data and establishing the HTTP
    * client connection.
    *
    * This routine must be called once before runQuery. In this example, main
    * calls it.
    */
    void initialize(String inNamespace, String inUsername, String inPassword)
        throws Exception {

        if (! bIsInitialized) // Initialize only if we haven't already
        {
            // Break up the namespace specification to get the namespace and tenant parts.
            String parts[] = inNamespace.split("\\.");

            sQueryNamespace = parts[0];
            sQueryTenant = parts[1];

            // Now extract just the tenant part of the URL and use it to create the
            // HTTPQueryURL.
            parts = inNamespace.split(sQueryNamespace + "\\.");

            sHTTPQueryURL = "https://" + parts[1] + "/query";

            // Encode both the username and password for the authentication string.
            sEncodedUserName = HCPUtils.toBase64Encoding(inUsername);
            sEncodedPassword = HCPUtils.toMD5Digest(inPassword);

            // Set up an HTTP client for sample usage.
            mHttpClient = HCPUtils.initHttpClient();

            bIsInitialized = true;
        }
    }

 

 /**
   * This method performs an orderly shutdown of the HTTP connection manager.
   */
  void shutdown() throws Exception {
    // Clean up open connections by shutting down the connection manager.
    mHttpClient.getConnectionManager().shutdown();
  }

 

  /**
    * This routine issues a query to an HCP namespace requesting information about
    * objects in it. The query requests 50 results at a time. If there are more,
    * the routine performs paged queries to retrieve all the results.
    *
    * While processing the query results, the routine displays the name of the first
    * and last object of each result set to system output.
    */
  protected void runQuery() {

 

   // Statistics counters
    Long TotalRecordsProcessed = 0L;
    Integer HTTPCalls = 0;

 

 try {
      /*
       * Set up the query request.
       */

     // Set up for an object query by calling the
      // com.hds.hcp.apihelpers.query.request.Object constructor.
      Object mObjQuery = new Object();

 

     // Get only 50 objects at a time.
      mObjQuery.setCount(50);

 

      // Retrieve only those that reside in the namespace specified in the command.
      mObjQuery.setQuery("+namespace:" + sQueryNamespace + "." + sQueryTenant);

     

     // Retrieve the "size" and "ingestTimeString" properties for the object.
      mObjQuery.setObjectProperties("size,ingestTimeString");

 

      // Set up the query request.
      QueryRequest mQuery = new QueryRequest(mObjQuery);

     /*
       * Loop through and process all the objects one response at a time or until
       * an error occurs.
       */
      QueryResult mQueryResult = null;
      do {
        System.out.println("Issuing query: \n" + mQuery.toString(true));

       /*
         * Execute the query using the HTTP POST method.
         */
        HttpPost httpRequest = new HttpPost(sHTTPQueryURL);

 

       // Add the body of the POST request.
        httpRequest.setEntity(new StringEntity(mQuery.toString()));

 

       // Set the Authorization header. setHeader takes the header name and the
       // header value as separate arguments.
        httpRequest.setHeader("Authorization", "HCP " + sEncodedUserName + ":"
          + sEncodedPassword);

 

       // Execute the query.
        HttpResponse httpResponse = mHttpClient.execute(httpRequest);

 

       // For debugging purposes, dump out the HTTP response.
       HCPUtils.dumpHttpResponse(httpResponse);

 

        // If the status code is anything BUT in the 200 range indicating success,
       // throw an exception.
       if (2 != (int)(httpResponse.getStatusLine().getStatusCode() / 100))
       {
           // Clean up after ourselves and release the HTTP connection to the
           // connection manager.
           EntityUtils.consume(httpResponse.getEntity());

 

          throw new HttpResponseException(httpResponse.getStatusLine()
          .getStatusCode(),
            "Unexpected status returned from " + httpRequest.getMethod() + " ("
             + httpResponse.getStatusLine().getStatusCode() + ": "
             + httpResponse.getStatusLine().getReasonPhrase() + ")");
        }

 

       /*
         *  Process the response from the query request.
        */

       // Put the response in a buffered reader.
       BufferedReader bodyReader = new BufferedReader(new InputStreamReader
         (httpResponse.getEntity().getContent()));
       HTTPCalls += 1;

 

       // Parse the response into the QueryResult object.
        mQueryResult = QueryResult.parse(bodyReader);

 

       // Get a copy of the query status from the query result.
       Status mStatus = mQueryResult.getStatus();

 

       // Display the status of what we just accomplished.
       System.out.println();
       System.out.println("Batch " + HTTPCalls + " Status: " + mStatus.getCode()
         + " Record Count:" + mStatus.getResults());

 

       // Display the first and last object of the result set.
       List<ResultSetRecord> mResultSet = mQueryResult.getResultSet();
        ResultSetRecord mFirstRecord = mResultSet.get(0);

 

       System.out.println(" First Record (" + (TotalRecordsProcessed+1) + ") "
          + mFirstRecord.getUrlName());
        System.out.println(" Size: " + mFirstRecord.getSize());

 

       TotalRecordsProcessed += mStatus.getResults();

 

        ResultSetRecord mLastRecord = mResultSet.get(mResultSet.size()-1);
       System.out.println(" Last Record (" + TotalRecordsProcessed
         + ") "+ mLastRecord.getUrlName());
       System.out.println(" Size: " + mLastRecord.getSize());
       System.out.println();

 

       // Now we need to see whether the query is complete or whether there are more
        // objects. If INCOMPLETE, it is a successful paged query.
        if (Status.Code.INCOMPLETE == mStatus.getCode())
        {

 

        // We have more, so update the offset for the next query to be the previous
        // offset plus the number we just read.
        mObjQuery.setOffset(
          (null == mObjQuery.getOffset() ? 0 : mObjQuery.getOffset())
          + mStatus.getResults()
          );
        }

 

        // Clean up after ourselves and release the HTTP connection to the connection
        // manager.
        EntityUtils.consume(httpResponse.getEntity());

 

    } // Keep doing this while we have more results.

 

    while (Status.Code.INCOMPLETE == mQueryResult.getStatus().getCode());

    /*
     * Print out the final statistics.
     */
    System.out.println("Total Records Processed: " + TotalRecordsProcessed);
    System.out.println("HTTP Calls: " + HTTPCalls);
     } catch(Exception e) {
       e.printStackTrace();
     }
}

 

/*
 * @param args
 */

    public static void main(String[] args) {

        PagedObjectQuery myClass = new PagedObjectQuery();

        if (args.length != 3) {
            System.out.println();
            System.out.println("Usage: " + myClass.getClass().getSimpleName()
                + " <DNS-Namespace> <Username> <Password>\n");
            System.out.println(" where ");
            System.out.println(" <DNS-Namespace> is the fully qualified domain name"
                + " of the HCP Namespace.");
            System.out.println(" For example: \"ns1.ten1.myhcp.example.com\"");
            System.out.println(" <Username> and <Password> are the credentials of the"
                + " HCP user with data access permissions for the namespace");
            System.out.println();

            System.exit(-1);
        }

        try {
            // Initialize the class with the input parameters
            myClass.initialize(args[0], args[1], args[2]);

            // Issue the query and process the results
            myClass.runQuery();

            // Clean up before object destruction
            myClass.shutdown();

        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
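
The paging logic of the Java example reduces to a short loop: issue the query, collect the records, and, while the status code is INCOMPLETE, advance the offset by the number of results just read. The following Python sketch illustrates that loop only. The send_query callable is a hypothetical stand-in for the HTTP POST to the /query URL, and the JSON request layout with a top-level object entry is an assumption made by analogy with the operation-based JSON request body shown later in this chapter.

```python
from typing import Callable, Dict, List


def fetch_all_objects(send_query: Callable[[Dict], Dict],
                      namespace: str, count: int = 50) -> List[Dict]:
    """Collect every record by re-issuing the query with a growing offset.

    send_query is a hypothetical callable that posts the request body and
    returns the decoded JSON response as a dictionary.
    """
    request = {"object": {"query": "+namespace:" + namespace,
                          "count": count,
                          "objectProperties": "size,ingestTimeString"}}
    records: List[Dict] = []
    offset = 0
    while True:
        request["object"]["offset"] = offset
        result = send_query(request)["queryResult"]
        records.extend(result["resultSet"])
        status = result["status"]
        # INCOMPLETE means more records remain beyond this page.
        if status["code"] != "INCOMPLETE" or status["results"] == 0:
            return records
        offset += status["results"]


def fake_send_query(request: Dict) -> Dict:
    """Simulate a namespace that holds 120 objects (test data only)."""
    offset = request["object"]["offset"]
    count = request["object"]["count"]
    page = [{"urlName": "object-%d" % i}
            for i in range(offset, min(offset + count, 120))]
    code = "INCOMPLETE" if offset + count < 120 else "COMPLETE"
    return {"queryResult": {"resultSet": page,
                            "status": {"results": len(page),
                                       "message": "", "code": code}}}


all_records = fetch_all_objects(fake_send_query, "ns1.ten1")
print(len(all_records))
```

With the simulated namespace, the loop issues three requests (50, 50, and 20 records) before the status code changes to COMPLETE.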

Example: Using a faceted query to retrieve object information

Here is a sample metadata query API request that retrieves metadata for all objects added to namespaces owned by the europe tenant between March 1, 2012, and March 31, 2012, inclusive. The ingestTime values in the query are expressed in seconds since January 1, 1970, at 00:00:00 UTC. The verbose entry specifies true to request all metadata for each object in the result set. This request also retrieves namespace facet information for objects in the result set. The query uses an XML request body and requests results in XML format.

Request body in the XML file named March.xml
<queryRequest>
    <object>
        <query>ingestTime:[1330560000 TO 1333238399]</query>
        <facets>namespace</facets>
        <verbose>true</verbose>
    </object>
</queryRequest>
Request with cURL command line
curl -k -H "Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d" \
    -H "Content-Type: application/xml" -H "Accept: application/xml" \
    -d @March.xml "https://europe.hcp.example.com/query?prettyprint"
Request in Python using PycURL
import pycurl
import os
curl = pycurl.Curl()

# Set the URL, command, and headers
curl.setopt(pycurl.URL, "https://europe.hcp.example.com/" +
     "query?prettyprint")
curl.setopt(pycurl.SSL_VERIFYPEER, 0)
curl.setopt(pycurl.SSL_VERIFYHOST, 0)
curl.setopt(pycurl.POST, 1)
curl.setopt(pycurl.HTTPHEADER,
     ["Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d",
     "Content-Type: application/xml", "Accept: application/xml"])

# Set the request body from an XML file
filehandle = open("March.xml", 'rb')
curl.setopt(pycurl.UPLOAD, 1)
curl.setopt(pycurl.CUSTOMREQUEST, "POST")
curl.setopt(pycurl.INFILESIZE,
          os.path.getsize("March.xml"))
curl.setopt(pycurl.READFUNCTION, filehandle.read)

curl.perform()
print curl.getinfo(pycurl.RESPONSE_CODE)
curl.close()
filehandle.close()
Request headers
POST /query?prettyprint HTTP/1.1
Host: europe.hcp.example.com
Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d
Content-Type: application/xml
Accept: application/xml
Content-Length: 134
Response headers
HTTP/1.1 200 OK
Transfer-Encoding: chunked
XML response body

To limit the example size, the XML below shows only one object entry in the response body.

<?xml version='1.0' encoding='UTF-8'?>
<queryResult xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="/static/xsd/query-result-7.0.xsd">
    <query>
        <expression>ingestTime:[1330560000 TO 1333238399]</expression>
    </query>
    <resultSet>
        <object
            version="84689595801123"
            utf8Name="Q1_2020.ppt"
            urlName="https://marketing.europe.hcp.example.com/rest/
                 presentations/Q1_2020.ppt"
            updateTimeString="2012-03-31T15:41:35-0400"
            updateTime="1333222895"
            uid="0"
            type="object"
            size="6628"
            shred="false"
            retentionString="Deletion Allowed"
            retentionClass=" "
            retention="0"
            replicated="true"
            permissions="555"
            owner="USER,europe,lgreen"
            operation="CREATED"
            namespace="marketing.europe"
            ingestTimeString="2012-03-31T15:41:35-0400"
            ingestTime="1333222895"
            index="true"
            hold="false"
            hashScheme="SHA-256"
            hash="SHA-256 0662D2A2DEF74EF02A8DF5A4F16BF4D55FEE582..."
            gid="0"
            objectPath="/presentations/Q1_2020.ppt"
            dpl="2"
            customMetadata="false"
            changeTimeString="2012-03-31T15:41:35-0400"
            changeTimeMilliseconds="1333222895615.00"
            accessTimeString="2012-03-31T15:41:35-0400"
            accessTime="1333222895"
            acl="false" />
        .
        .
        .
    </resultSet>
    <status
         results="7"
        message=""
        code="COMPLETE" />
    <facets>
        <facet
            property="namespace">
            <frequency
                count="4"
                value="finance.europe" />
            <frequency
                 count="3"
                value="marketing.europe" />
        </facet>
    </facets>
</queryResult>
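
The facet information in the response lists, for each value of the requested property, how many objects in the result set have that value. The following Python sketch shows one way to read those counts with the standard library XML parser. The function name facet_counts is hypothetical, and the sample document is a trimmed stand-in for a real response that keeps only the elements the function reads.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Trimmed sample response (illustrative only).
SAMPLE_FACET_RESPONSE = """
<queryResult>
    <resultSet />
    <status results="7" message="" code="COMPLETE" />
    <facets>
        <facet property="namespace">
            <frequency count="4" value="finance.europe" />
            <frequency count="3" value="marketing.europe" />
        </facet>
    </facets>
</queryResult>
"""


def facet_counts(response_xml: str, facet_property: str) -> Dict[str, int]:
    """Map each facet value to its object count for one facet property."""
    root = ET.fromstring(response_xml)
    counts: Dict[str, int] = {}
    for facet in root.iter("facet"):
        if facet.get("property") != facet_property:
            continue
        for frequency in facet.iter("frequency"):
            counts[frequency.get("value")] = int(frequency.get("count"))
    return counts


for value, count in sorted(facet_counts(SAMPLE_FACET_RESPONSE, "namespace").items()):
    print(value, count)
```

In this sample, the counts add up to the results value in the status element because each of the seven objects belongs to exactly one namespace.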

Example: Querying for replication collisions in a namespace

Here’s a sample metadata query API request that retrieves metadata for all objects that are:

  • Flagged as replication collisions
  • In the finance namespace owned by the europe tenant

The query uses an XML request body and requests results in XML format.

In addition to the basic information about each object (URL, version ID, operation type, and change time), this request returns the object path. The request specifies that the result set be sorted by object path in ascending order.

Request body in the XML file named FinanceCollisions.xml
<queryRequest>
    <object>
         <query>
             +namespace:finance.europe
             +replicationCollision:true
         </query>
         <objectProperties>objectPath</objectProperties>
         <sort>objectPath+asc</sort>
    </object>
</queryRequest>
Request with cURL command line
curl -k -H "Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d" \
    -H "Content-Type: application/xml" -H "Accept: application/xml" \
    -d @FinanceCollisions.xml "https://europe.hcp.example.com/query?prettyprint"
Request in Python using PycURL
import pycurl
import os
curl = pycurl.Curl()

# Set the URL, command, and headers
curl.setopt(pycurl.URL, "https://europe.hcp.example.com/" +
     "query?prettyprint")
curl.setopt(pycurl.SSL_VERIFYPEER, 0)
curl.setopt(pycurl.SSL_VERIFYHOST, 0)
curl.setopt(pycurl.POST, 1)
curl.setopt(pycurl.HTTPHEADER,
    ["Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d",
     "Content-Type: application/xml", "Accept: application/xml"])

# Set the request body from an XML file
filehandle = open("FinanceCollisions.xml", 'rb')
curl.setopt(pycurl.UPLOAD, 1)
curl.setopt(pycurl.CUSTOMREQUEST, "POST")
curl.setopt(pycurl.INFILESIZE,
         os.path.getsize("FinanceCollisions.xml"))
curl.setopt(pycurl.READFUNCTION, filehandle.read)

curl.perform()
print curl.getinfo(pycurl.RESPONSE_CODE)
curl.close()
filehandle.close()
Request headers
POST /query?prettyprint HTTP/1.1
Host: europe.hcp.example.com
Content-Type: application/xml
Accept: application/xml
Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d
Content-Length: 205
Response headers
HTTP/1.1 200 OK
Transfer-Encoding: chunked
XML response body
<?xml version='1.0' encoding='UTF-8'?>
<queryResult xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="/static/xsd/query-result-9.0.xsd">
<query>
    <expression>+namespace:finance.europe +replicationCollision:true </expression>
</query>
<resultSet>
    <object
        version="89322738450881"
        urlName="https://finance.europe.hcp.example.com/rest/budgets/2020/
            sales_budget_2020.xlsx.collision"
        operation="CREATED"
        objectPath="/budgets/2020/sales_budget_2020.xlsx.collision"
        changeTimeMilliseconds="1395668086005.00" />
    <object
        version="89322749144130"
        urlName="https://finance.europe.hcp.example.com/rest/quarterly_rpts/
           Q1_2020.ppt.collision"
        operation="CREATED"
        objectPath="/quarterly_rpts/Q1_2020.ppt.collision"
        changeTimeMilliseconds="1395668327386.00" />
</resultSet>
<status
    totalResults="2"
    results="2"
    message=""
    code="COMPLETE" />
</queryResult>
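
Request bodies such as FinanceCollisions.xml can also be generated rather than written by hand. The following Python sketch builds an object-based query request with the standard library. The function name build_object_query is hypothetical, and the sketch covers only the query, objectProperties, and sort entries used in the examples above.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def build_object_query(query: str,
                       object_properties: Optional[str] = None,
                       sort: Optional[str] = None) -> bytes:
    """Return a queryRequest XML document for an object-based query."""
    root = ET.Element("queryRequest")
    obj = ET.SubElement(root, "object")
    ET.SubElement(obj, "query").text = query
    # Optional entries are added only when the caller supplies them.
    if object_properties:
        ET.SubElement(obj, "objectProperties").text = object_properties
    if sort:
        ET.SubElement(obj, "sort").text = sort
    return ET.tostring(root, encoding="utf-8")


body = build_object_query(
    "+namespace:finance.europe +replicationCollision:true",
    object_properties="objectPath",
    sort="objectPath+asc")
print(body.decode("utf-8"))
```

The returned bytes can be written to a file for use with the -d @file option of cURL or supplied directly as the request body.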

Example: Listing content properties

Here is a sample metadata query API request that lists the content properties for all indexed objects in the medical namespace owned by the employees tenant. The query uses an XML request body and requests results in XML format.

Request body in the XML file named MedicalQuery.xml
<queryRequest>
    <object>
         <query>namespace:medical.employees</query>
         <count>0</count>
         <contentProperties>true</contentProperties>
    </object>
</queryRequest>
Request with cURL command line
curl -i -k -H "Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d" \
     -H "Content-Type: application/xml" -H "Accept: application/xml" \
     -d @MedicalQuery.xml "https://employees.hcp.example.com/query?prettyprint"
Request in Python using PycURL
import pycurl
import os
curl = pycurl.Curl()

# Set the URL, command, and headers
curl.setopt(pycurl.URL, "https://employees.hcp.example.com/" +
   "query?prettyprint")
curl.setopt(pycurl.SSL_VERIFYPEER, 0)
curl.setopt(pycurl.SSL_VERIFYHOST, 0)
curl.setopt(pycurl.POST, 1)
curl.setopt(pycurl.HTTPHEADER,
    ["Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d",
    "Content-Type: application/xml", "Accept: application/xml"])

# Set the request body from an XML file
filehandle = open("MedicalQuery.xml", 'rb')
curl.setopt(pycurl.UPLOAD, 1)
curl.setopt(pycurl.CUSTOMREQUEST, "POST")
curl.setopt(pycurl.INFILESIZE,
          os.path.getsize("MedicalQuery.xml"))
curl.setopt(pycurl.READFUNCTION, filehandle.read)

curl.perform()
print curl.getinfo(pycurl.RESPONSE_CODE)
curl.close()
filehandle.close()
Request headers
POST /query?prettyprint HTTP/1.1
Host: employees.hcp.example.com
Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d
Content-Type: application/xml
Accept: application/xml
Content-Length: 155
Response headers
HTTP/1.1 200 OK
Transfer-Encoding: chunked
XML response body

To limit the example size, the XML below shows only two contentProperty entries in the response body.

<?xml version='1.0' encoding='UTF-8'?>
<queryResult xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="/static/xsd/query-result-7.0.xsd">
<query>
    <expression>namespace:medical.employees</expression>
</query>
<resultSet />
<status
    totalResults="0"
    results="0"
    message=""
    code="COMPLETE" />
<contentProperties>
    <contentProperty>
        <name>DocDateOfBirth</name>
        <expression>/record/doctor/dob</expression>
        <type>DATE</type>
        <multivalued>false</multivalued>
        <format>MM/dd/yyyy</format>
    </contentProperty>
    <contentProperty>
        <name>DocLastName</name>
        <expression>/record/doctor/name/lastName</expression>
        <type>STRING</type>
        <multivalued>false</multivalued>
        <format></format>
    </contentProperty>
</contentProperties>
</queryResult>
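
A client that needs the list of content properties, for example to decide which properties it can use in later queries, can read the contentProperty entries from the response. The following Python sketch is one way to do that with the standard library. The function name content_properties is hypothetical, and the sample document is a trimmed stand-in for a real response.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Trimmed sample response (illustrative only).
SAMPLE_PROPERTIES_RESPONSE = """
<queryResult>
    <resultSet />
    <status totalResults="0" results="0" message="" code="COMPLETE" />
    <contentProperties>
        <contentProperty>
            <name>DocDateOfBirth</name>
            <expression>/record/doctor/dob</expression>
            <type>DATE</type>
            <multivalued>false</multivalued>
        </contentProperty>
        <contentProperty>
            <name>DocLastName</name>
            <expression>/record/doctor/name/lastName</expression>
            <type>STRING</type>
            <multivalued>false</multivalued>
            <format></format>
        </contentProperty>
    </contentProperties>
</queryResult>
"""


def content_properties(response_xml: str) -> List[Dict[str, str]]:
    """Return one dictionary per contentProperty entry, keyed by child tag."""
    root = ET.fromstring(response_xml)
    properties: List[Dict[str, str]] = []
    for prop in root.iter("contentProperty"):
        # Empty elements such as <format></format> become empty strings.
        properties.append({child.tag: (child.text or "") for child in prop})
    return properties


for prop in content_properties(SAMPLE_PROPERTIES_RESPONSE):
    print(prop["name"], prop["type"], prop["expression"])
```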

Operation-based query examples

This section contains examples of operation-based queries.

Example: Retrieving all operation records for all existing and deleted objects in a namespace

Here’s a sample metadata query API request that retrieves operation records for all objects currently in or deleted from the sales namespace owned by the midwest tenant. The query uses an XML request body and requests results in JSON format.

The verbose entry is set to true to request detailed information for all operation records in the result set.

The response body includes records for all create, delete, and purge operations that occurred from the time the namespace was created up to one minute before the request was made on March 14, 2012, at 14:59:37 EST.

Request body in the XML file named AllSales.xml
<queryRequest>
    <operation>
         <count>-1</count>
         <systemMetadata>
             <namespaces>
                 <namespace>sales.midwest</namespace>
             </namespaces>
         </systemMetadata>
         <verbose>true</verbose>
    </operation>
</queryRequest>
Request with cURL command line
curl -i -k -H "Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d" \
    -H "Content-Type: application/xml" -H "Accept: application/json" \
    -d @AllSales.xml "https://midwest.hcp.example.com/query?prettyprint"
Request in Python using PycURL
import pycurl
import os
curl = pycurl.Curl()

# Set the URL, command, and headers
curl.setopt(pycurl.URL, "https://midwest.hcp.example.com/" +
     "query?prettyprint")
curl.setopt(pycurl.SSL_VERIFYPEER, 0)
curl.setopt(pycurl.SSL_VERIFYHOST, 0)
curl.setopt(pycurl.POST, 1)
curl.setopt(pycurl.HTTPHEADER,
     ["Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d",
     "Content-Type: application/xml", "Accept: application/json"])

# Set the request body from an XML file
filehandle = open("AllSales.xml", 'rb')
curl.setopt(pycurl.UPLOAD, 1)
curl.setopt(pycurl.CUSTOMREQUEST, "POST")
curl.setopt(pycurl.INFILESIZE,
         os.path.getsize("AllSales.xml"))
curl.setopt(pycurl.READFUNCTION, filehandle.read)

curl.perform()
print curl.getinfo(pycurl.RESPONSE_CODE)
curl.close()
filehandle.close()
Request headers
POST /query?prettyprint HTTP/1.1
Host: midwest.hcp.example.com
Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d
Content-Type: application/xml
Accept: application/json
Content-Length: 258
Response headers
HTTP/1.1 200 OK
Transfer-Encoding: chunked
JSON response body

To limit the example size, the JSON below shows only one object entry in the response body.

{"queryResult":
    {"query":{"start":0,"end":1331751577658},
      "resultSet":[
            {"version":81787144560449,
             "utf8Name":"C346527",
             "urlName":"https://sales.midwest.hcp.example.com/rest/
                           customers/widgetco/orders/C346527",
             "updateTimeString":"2012-03-10T14:55:33-0500",
             "updateTime":1331409333,
             "uid":0,
             "type":"object",
             "size":4985,
             "shred":false,
             "retentionString":"Deletion Allowed",
             "retentionClass":"",
             "retention":"0",
             "replicated":true,
             "permissions":"256",
             "owner":"USER,midwest,rblack",
             "operation":"CREATED",
             "namespace":"sales.midwest",
             "ingestTimeString":"2012-03-10T14:55:33-0500",
             "ingestTime":1331409333,
             "index":true,
             "hold":false,
             "hashScheme":"SHA-256",
             "hash":"SHA-256 C67EF26C0E5EDB102A2DEF74EF02A8DF5A4F16BF4D...",
             "gid":0,
             "objectPath":"/customers/widgetco/orders/C346527",
             "dpl":2,
             "customMetadata":false,
             "changeTimeString":"2012-03-10T14:55:33-0500",
             "changeTimeMilliseconds":"1331409333948.00",
             "accessTimeString":"2012-03-10T14:55:33-0500",
             "accessTime":1331409333,
             "acl":false},
          .
          .
          .
          ],
       "status":{"results":7,"message":"","code":"COMPLETE"}
    }
}
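
Because the response mixes records for different operation types, a client often starts by tallying them. The following Python sketch counts the records in a JSON response by the value of their operation entry. The function name count_operations is hypothetical. The sample body is a trimmed stand-in for a real response, and the DELETED value in it is an assumption about how delete operations are reported; only CREATED appears in the response shown above.

```python
import json
from collections import Counter

# Trimmed sample response (illustrative only). DELETED is an assumed value.
SAMPLE_OPERATION_RESPONSE = """
{"queryResult":
    {"query":{"start":0,"end":1331751577658},
     "resultSet":[
        {"operation":"CREATED","objectPath":"/customers/widgetco/orders/C346527"},
        {"operation":"CREATED","objectPath":"/customers/widgetco/orders/C346528"},
        {"operation":"DELETED","objectPath":"/customers/widgetco/orders/C346528"}
     ],
     "status":{"results":3,"message":"","code":"COMPLETE"}
    }
}
"""


def count_operations(response_body: str) -> Counter:
    """Count the operation records in a JSON response by operation type."""
    result = json.loads(response_body)["queryResult"]
    return Counter(record["operation"] for record in result["resultSet"])


for operation, total in sorted(count_operations(SAMPLE_OPERATION_RESPONSE).items()):
    print(operation, total)
```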

Example: Retrieving metadata for changed objects

Here’s a sample metadata query API request that uses a JSON body specified directly in the cURL command line and Python code to retrieve operation records for objects that:

  • Are in the finance namespace, which is owned by the europe tenant
  • Were modified during 2011

The start entry specifies 12:00:00.00 a.m. on January 1, 2011, and the end entry specifies 12:00:00.00 a.m. on January 1, 2012. Both values are expressed in milliseconds since January 1, 1970, at 00:00:00 UTC.

The response body is XML. The information returned for each operation record that meets the query criteria consists of the object URL, version ID, operation, and change time.

Request with cURL command line
curl -k -H "Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d" \
         -H "Content-Type: application/json" -H "Accept: application/xml" \
         -d '{"operation":{"systemMetadata":{"changeTime":
         {"start":1293840000000,"end":1325376000000},"namespaces":
         {"namespace":["finance.europe"]}}}}' \
         "https://europe.hcp.example.com/query?prettyprint"
Request in Python using PycURL
import pycurl
curl = pycurl.Curl()

# Set the URL, command, and headers
curl.setopt(pycurl.URL, "https://europe.hcp.example.com/" +
     "query?prettyprint")
curl.setopt(pycurl.SSL_VERIFYPEER, 0)
curl.setopt(pycurl.SSL_VERIFYHOST, 0)
curl.setopt(pycurl.POST, 1)
curl.setopt(pycurl.HTTPHEADER,
     ["Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d",
     "Content-Type: application/json", "Accept: application/xml"])

# Set the request body
theFields = '{"operation":{"systemMetadata":{"changeTime": \
  {"start":1293840000000,"end":1325376000000},"namespaces": \
  {"namespace":["finance.europe"]}}}}'
curl.setopt(pycurl.POSTFIELDS, theFields)

curl.perform()
print curl.getinfo(pycurl.RESPONSE_CODE)
curl.close()
Request headers
POST /query?prettyprint HTTP/1.1
Host: europe.hcp.example.com
Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d
Content-Type: application/json
Accept: application/xml
Content-Length: 81
Response headers
HTTP/1.1 200 OK
Transfer-Encoding: chunked
XML response body

To limit the example size, the XML below shows only two object entries in the response body.

<?xml version='1.0' encoding='UTF-8'?>
<queryResult xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="/static/xsd/query-result-7.0.xsd">
    <query start="1293840000000" end="1325376000000" />
    <resultSet>
         <object
             version="81787101672577"
            urlName="https://finance.europe.hcp.example.com/rest/
                Presentations/Q2_2019.ppt"
            operation="CREATED"
            changeTimeMilliseconds="1310392057456.00" />
        <object
            version="81787102472129"
            urlName="https://finance.europe.hcp.example.com/rest/
                Presentations/Images/thankYou.jpg"
            operation="CREATED"
            changeTimeMilliseconds="1310392336286.00" />
         .
         .
         .
    </resultSet>
    <status results="11" message="" code="COMPLETE" />
</queryResult>
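
The start and end values of the changeTime entry are millisecond timestamps. The following Python sketch shows one way to compute such bounds for a calendar year and to assemble the JSON request body from them. The function names year_range_millis and build_changed_objects_query are hypothetical, and the sketch assumes that the bounds are measured from January 1, 1970, at 00:00:00 UTC, which is consistent with the values in the request above.

```python
import calendar
import json
from typing import Tuple


def year_range_millis(year: int) -> Tuple[int, int]:
    """Return (start, end) in milliseconds for midnight UTC on January 1
    of the given year and of the following year."""
    start = calendar.timegm((year, 1, 1, 0, 0, 0)) * 1000
    end = calendar.timegm((year + 1, 1, 1, 0, 0, 0)) * 1000
    return start, end


def build_changed_objects_query(namespace: str, year: int) -> str:
    """Return a JSON operation-based query for changes made during one year."""
    start, end = year_range_millis(year)
    request = {"operation": {"systemMetadata": {
        "changeTime": {"start": start, "end": end},
        "namespaces": {"namespace": [namespace]}}}}
    return json.dumps(request)


print(year_range_millis(2011))
print(build_changed_objects_query("finance.europe", 2011))
```

For 2011, the function returns 1293840000000 and 1325376000000, the same bounds that appear in the request above.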

Example: Using a paged query to retrieve a large number of records

The Python example below implements a paged query that uses multiple requests to retrieve a large number of operation records in batches of 50 per request. This query retrieves records for all create operations on objects in the /customers/widgetco/orders directory in the default namespace and returns basic information for each record.

The query uses a JSON request body and requests results in JSON format.

#!/usr/bin/env python
# encoding: utf-8

import json
import time
from io import BytesIO

import pycurl


class OperationBasedQueryTool(object):
    """ Issues paged operation-based queries through the metadata query API. """

    def __init__(self):
        self.complete = False
        self.curl = None
        # Request body; setQueryParameters and setLastResult fill it in
        self.queryArguments = {'operation': {'count': 1, 'verbose': 'false',
            'objectProperties': 'utf8Name, type, size',
            'systemMetadata': {'changeTime': {},
                'directories': {'directory': []},
                'namespaces': {'namespace': []},
                'transactions': {'transaction': []}}}}

    def setConnectionInfo(self, authToken, hostName, urlName):
        """ Set all connection info for subsequent query requests.
        @param authToken: authorization token
        @param hostName: Hostname of the target cluster
        @param urlName: Full URL for the query interface """
        self.curl = pycurl.Curl()
        requestHeaders = ["Authorization: HCP " + authToken,
            "Accept: application/json",
            "Content-Type: application/json",
            "Host: admin.%s" % hostName]
        self.curl.setopt(pycurl.FAILONERROR, 1)
        self.curl.setopt(pycurl.HTTPHEADER, requestHeaders)
        self.curl.setopt(pycurl.URL, urlName)
        self.curl.setopt(pycurl.CUSTOMREQUEST, 'POST')
        self.curl.setopt(pycurl.SSL_VERIFYPEER, 0)
        self.curl.setopt(pycurl.SSL_VERIFYHOST, 0)
        self.curl.setopt(pycurl.VERBOSE, 0)

    def setQueryParameters(self, count, verbose, directories, namespaces,
            transactions, objectProperties=None, startTimeMillis=0,
            endTimeMillis=None):
        """ Set all parameters related to the query.
        @param count: The number of results to return for each query.
        @param verbose: Indication to return all object property values.
          Value is either true or false.
        @param directories: Dictionary containing list of directory paths.
        @param namespaces: Dictionary containing list of namespaces.
        @param transactions: Dictionary containing list of transaction
          types.
        @param objectProperties: String containing comma-separated list of
          object properties to return for each operation record. If
          omitted, the current value is kept.
        @param startTimeMillis: The starting timestamp in milliseconds of
          the query window. Default is 0 (zero).
        @param endTimeMillis: The ending timestamp in milliseconds of the
          query window. Default is one minute before time of request. """
        if endTimeMillis is None:
            # Computed on each call; a default expression in the signature
            # would be evaluated only once, when the method is defined
            endTimeMillis = int(round(time.time() * 1000)) - 60 * 1000
        operation = self.queryArguments['operation']
        systemMetadata = operation['systemMetadata']
        operation['count'] = count
        operation['verbose'] = verbose
        if objectProperties is not None:
            operation['objectProperties'] = objectProperties
        systemMetadata['directories'] = directories
        systemMetadata['namespaces'] = namespaces
        systemMetadata['transactions'] = transactions
        systemMetadata['changeTime']['start'] = startTimeMillis
        systemMetadata['changeTime']['end'] = endTimeMillis

    def issueQuery(self):
        """ Issue an operation-based query request. """
        requestBody = json.dumps(self.queryArguments)
        cout = BytesIO()
        self.curl.setopt(pycurl.POSTFIELDS, requestBody)
        self.curl.setopt(pycurl.WRITEFUNCTION, cout.write)
        print("Performing query with the following arguments: %s"
            % requestBody)
        self.curl.perform()
        responseCode = self.curl.getinfo(pycurl.RESPONSE_CODE)
        if responseCode != 200:
            raise Exception("Error: Expected result code 200, but received %s"
                % responseCode)
        # Parse the body as JSON; eval() would run the response as Python code
        queryResult = json.loads(cout.getvalue().decode('utf-8'))
        cout.close()
        if queryResult['queryResult']['status']['code'] == "COMPLETE":
            self.complete = True
        return queryResult

    def setLastResult(self, lastResult):
        """ Sets the last result we received as the starting point for the
            next query we issue.
        @param lastResult: The dictionary containing the last result
            returned by the previous query. """
        self.queryArguments['operation']['lastResult'] = {
            'urlName': lastResult['urlName'],
            'changeTimeMilliseconds': lastResult['changeTimeMilliseconds'],
            'version': str(lastResult['version'])}

    def closeConnection(self):
        """ Cleanup the curl connection after we are finished with it. """
        self.curl.close()


if __name__ == '__main__':
    authToken = "bXl1c2Vy:3f3c6784e97531774380db177774ac8d"
    hostName = "clusterName.com"
    urlName = "https://admin.%s/query" % hostName
    resultsPerQuery = 50
    objectUrls = []
    queryTool = OperationBasedQueryTool()
    queryTool.setConnectionInfo(authToken, hostName, urlName)
    queryTool.setQueryParameters(resultsPerQuery, "false",
        {'directory': ['/customers/widgetco/orders']},
        {'namespace': ['Default.Default']},
        {'transaction': ['create']})
    try:
        while not queryTool.complete:
            queryResults = queryTool.issueQuery()
            resultSet = queryResults['queryResult']['resultSet']
            for result in resultSet:
                objectUrls.append(result['urlName'])
            if not resultSet:
                # Nothing to page from; avoid indexing an empty result set
                break
            queryTool.setLastResult(resultSet[-1])
        print("Query completed. Total objects found: %d" % len(objectUrls))
    finally:
        queryTool.closeConnection()
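
The paging logic in the script above can also be viewed in isolation from the HTTP layer. The following sketch is an assumption-laden illustration, not part of the product: iter_operation_records and make_fake_server are hypothetical names, and the fake server only imitates the COMPLETE and INCOMPLETE status codes so that the loop can be run without an HCP system.

```python
import copy


def iter_operation_records(fetch_page, base_request):
    """Yield every record, resending the query with lastResult until COMPLETE.

    fetch_page is any callable that takes the request dictionary and returns
    the parsed JSON response (hypothetical; the script above uses PycURL).
    """
    request = copy.deepcopy(base_request)
    while True:
        result = fetch_page(request)["queryResult"]
        records = result["resultSet"]
        for record in records:
            yield record
        if result["status"]["code"] == "COMPLETE" or not records:
            return
        # Page forward from the last record received, as setLastResult does.
        last = records[-1]
        request["operation"]["lastResult"] = {
            "urlName": last["urlName"],
            "changeTimeMilliseconds": last["changeTimeMilliseconds"],
            "version": str(last["version"]),
        }


def make_fake_server(total):
    """Hypothetical stand-in for the query endpoint, for demonstration only."""
    all_records = [
        {"urlName": "https://example.invalid/rest/obj%d" % i,
         "changeTimeMilliseconds": "%d.00" % (1000 + i),
         "version": i}
        for i in range(total)
    ]

    def fetch_page(request):
        count = request["operation"]["count"]
        last = request["operation"].get("lastResult")
        start = 0 if last is None else int(last["version"]) + 1
        page = all_records[start:start + count]
        done = start + count >= total
        status = {"code": "COMPLETE" if done else "INCOMPLETE"}
        return {"queryResult": {"resultSet": page, "status": status}}

    return fetch_page


if __name__ == "__main__":
    fetch = make_fake_server(120)
    urls = [r["urlName"] for r in
            iter_operation_records(fetch, {"operation": {"count": 50}})]
    print("Total objects found: %d" % len(urls))
```

Writing the loop as a generator keeps the caller free of paging details, and the check for an empty result set guards against indexing the last element of an empty list.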

Example: Checking for replication collisions

Here is a sample metadata query API request that checks whether any namespaces owned by the europe tenant currently contain objects that are flagged as replication collisions. Because the request specifies a count of 0, the response does not include operation records for any of those objects, but the status of INCOMPLETE indicates that records for such objects exist.

The query uses an XML request body and requests results in XML format.

Request body in the XML file named ReplicationCollisions.xml
<queryRequest>
    <operation>
        <count>0</count>
        <systemMetadata>
            <replicationCollision>true</replicationCollision>
            <transactions>
                <transaction>create</transaction>
            </transactions>
        </systemMetadata>
    </operation>
</queryRequest>
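
If the request body needs to be generated rather than kept in a static file, it can be built programmatically. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The function name build_collision_query is a hypothetical helper, and the element names are taken directly from the request body above.

```python
import xml.etree.ElementTree as ET


def build_collision_query(transactions=("create",)):
    """Return the replication-collision request body as UTF-8 bytes."""
    request = ET.Element("queryRequest")
    operation = ET.SubElement(request, "operation")
    # A count of 0 asks for no operation records in the response.
    ET.SubElement(operation, "count").text = "0"
    system_metadata = ET.SubElement(operation, "systemMetadata")
    ET.SubElement(system_metadata, "replicationCollision").text = "true"
    transactions_element = ET.SubElement(system_metadata, "transactions")
    for name in transactions:
        ET.SubElement(transactions_element, "transaction").text = name
    return ET.tostring(request, encoding="utf-8")


if __name__ == "__main__":
    print(build_collision_query().decode("utf-8"))
```

The returned bytes can be written to ReplicationCollisions.xml in binary mode or passed directly as the POST body.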
Request with cURL command line
curl -i -k -H "Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d"
        -H "Content-Type: application/xml" -H "Accept: application/xml"
        -d @ReplicationCollisions.xml
        "https://europe.hcp.example.com/query?prettyprint"
Request in Python using PycURL
import pycurl
import os
curl = pycurl.Curl()

# Set the URL, command, and headers
curl.setopt(pycurl.URL, "https://europe.hcp.example.com/" +
     "query?prettyprint")
curl.setopt(pycurl.SSL_VERIFYPEER, 0)
curl.setopt(pycurl.SSL_VERIFYHOST, 0)
curl.setopt(pycurl.POST, 1)
curl.setopt(pycurl.HTTPHEADER,
    ["Authorization: HCP bXl1c2Vy:3f3c6784e97531774380db177774ac8d",
    "Content-Type: application/xml", "Accept: application/xml"])

# Set the request body from an XML file
filehandle = open("ReplicationCollisions.xml", 'rb')
curl.setopt(pycurl.UPLOAD, 1)
curl.setopt(pycurl.CUSTOMREQUEST, "POST")
curl.setopt(pycurl.INFILESIZE,
          os.path.getsize("ReplicationCollisions.xml"))
curl.setopt(pycurl.READFUNCTION, filehandle.read)

curl.perform()
print(curl.getinfo(pycurl.RESPONSE_CODE))
curl.close()
filehandle.close()
Request headers
POST /query?prettyprint HTTP/1.1
Host: europe.hcp.example.com
Content-Type: application/xml
Accept: application/xml
Authorization: HCP YWxscm9sZXM=:04EC9F614D89FF5C7126D32ACB448382
Content-Length: 233
Response headers
HTTP/1.1 200 OK
Transfer-Encoding: chunked
XML response body
<?xml version='1.0' encoding='UTF-8'?>
<queryResult xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="/static/xsd/query-result-9.0.xsd">
<query
    start="0"
    end="1395694699683" />
<resultSet />
<status
    results="0"
    message=""
    code="INCOMPLETE" />
</queryResult>
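
A script that runs this check needs only the code attribute of the status element. The sketch below shows one way to interpret the response. The function name has_replication_collisions is hypothetical, and treating a COMPLETE status as "no collisions" is an assumption inferred from the description above rather than a documented guarantee.

```python
import xml.etree.ElementTree as ET

# Condensed copy of the response body above (illustrative).
SAMPLE_RESPONSE = """<?xml version='1.0' encoding='UTF-8'?>
<queryResult>
<query start="0" end="1395694699683" />
<resultSet />
<status results="0" message="" code="INCOMPLETE" />
</queryResult>"""


def has_replication_collisions(xml_text):
    """Return True if a count-0 collision query reports matching records."""
    status = ET.fromstring(xml_text).find("status")
    if status is None:
        raise ValueError("response has no <status> element")
    # With a count of 0 no records are returned, so INCOMPLETE signals that
    # matching records exist. COMPLETE is assumed here to mean none exist.
    return status.get("code") == "INCOMPLETE"


if __name__ == "__main__":
    print(has_replication_collisions(SAMPLE_RESPONSE))
```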

 
