Hitachi Vantara Knowledge

Working with advanced searches

Advanced searches are one of the three types of searches you can perform from the HCP Search Console. For an advanced search, you can specify queries equivalent to those for simple and structured searches. However, in an advanced query, you can combine criteria with various operators to refine your searches in more ways.

This chapter provides instructions for performing advanced searches. It provides an introduction to specifying advanced queries and contains many examples.

Once you have the results of an advanced search, you can filter and export them.

Note: While the metadata query engine is active, advanced searches are called advanced queries.

About advanced searches

Advanced searches provide more flexibility than both simple and structured searches. They enable you not only to search for text and metadata with multiple search criteria but also to nest and combine those criteria by using parentheses and Boolean and other operators.

To perform an advanced search, you use the Advanced Query page of the Search Console. On this page, you write your own query (called an advanced query) using a query language. The language you use depends on the active search facility.

You can generate advanced queries from the Structured Search page. To do this, construct the query on the Structured Search page and then click Show as advanced search. The Advanced Search page opens and shows the specified search criteria translated into the applicable query language.

Note: While the metadata query engine is active, the Advanced Search page is called the Advanced Query page and the Show as advanced search link is called the Show as advanced query link.

Criteria for advanced queries with the metadata query engine

This section describes the formats for the criteria you use to construct advanced queries while the metadata query engine is active. For more information about the query language used with the metadata query engine, see the applicable Apache Solr documentation at http://lucene.apache.org/solr.

Query expressions

With the metadata query engine, query expressions have this format:

[+|-]criterion [[+|-]criterion]...

In this expression, [+|-] is an optional Boolean operator and criterion is one of:

  • A simple criterion
  • One or more criteria in parentheses, in this format:
    [+|-]criterion [[+|-]criterion]...

    In this expression, criterion can be a simple criterion or one or more criteria in parentheses.

For example, here is one possible query expression:

-(namespace:"finance.europe") +(retention:0 index:1)

Query expressions can contain only valid UTF-8 characters.

Criterion format

The format for a simple advanced criterion is:

property:value

For example, this expression finds objects that are on hold:

hold:true

When querying for a value that’s a negative number, enclose the value in double quotation marks (").

For example, this query expression finds objects with the retention setting -2:

retention:"-2"

The special criterion *:* finds all objects in all namespaces searchable by the user.

Most of the properties for advanced queries correspond to those for structured queries, but the property names differ, and in some cases, the values are expressed differently.

Boolean operators

You can precede a criterion or an individual property value with one of these Boolean operators:

  • Plus sign (+)

    Objects in the result set must contain the criterion or value following the plus sign.

  • Minus sign (-)

    Objects in the result set must not contain the criterion or value following the minus sign.

For example, this query expression finds objects that are not on hold:

-hold:true
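
The criterion format and Boolean operators described above can be sketched as a small string builder. This is only an illustration: the function names format_value and criterion are hypothetical and are not part of HCP or the Search Console. The sketch assumes that only negative numbers need the double quotation marks described earlier.

```python
import re


def format_value(value):
    """Render a value for use after `property:` (hypothetical helper)."""
    # Booleans are written in lowercase, as in hold:true.
    text = str(value).lower() if isinstance(value, bool) else str(value)
    # Negative numbers are enclosed in double quotation marks.
    if re.fullmatch(r"-\d+(\.\d+)?", text):
        return f'"{text}"'
    return text


def criterion(prop, value, op=""):
    """Build `[+|-]property:value`; op is '', '+', or '-'."""
    if op not in ("", "+", "-"):
        raise ValueError("op must be '', '+', or '-'")
    return f"{op}{prop}:{format_value(value)}"


print(criterion("hold", True))           # hold:true
print(criterion("retention", -2))        # retention:"-2"
print(criterion("hold", True, op="-"))   # -hold:true
```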

Text-based property values

The objectPath and customMetadataContent properties and tokenized content properties match text in the object path or custom metadata.

Text-based property values are text strings consisting of one or more UTF-8 characters. The string is interpreted as one or more search terms, where each search term is a sequence of either alphabetic or numeric characters. All other characters, except wildcards, are treated as term separators.

For example, the string product123 contains two search terms — product and 123. A query based on this string finds objects for which the specified property contains at least one of product and 123.

Search terms match only complete alphabetic or numeric strings in paths or custom metadata. For example, the text strings AnnualReport, 2012, and AnnualReport_2012 match the object named AnnualReport_2012.pdf. A query expression with a text string such as Annual or 201 does not match this object.

Similarly, to query for objects with a path or custom metadata that contains the word product, you need to use the complete word product as the text string. A query expression with a text string such as prod does not match objects with a path or custom metadata containing product.

Search terms are not case sensitive. Therefore, the strings AnnualReport, Annualreport, and annualreport are equivalent.

Common words such as a and is are valid search terms. For example, a query containing the text string A3534 matches all objects with paths and custom metadata that contain the word a. To prevent such a match, use a phrase as described below.

To specify a negative number as a criterion, enclose the value in double quotation marks, for example, "-3121".

To specify a phrase as a criterion, put the text string in double quotation marks. A phrase matches paths and custom metadata that contain each of the alphabetic or numeric search terms within the quotation marks in the specified order, but any special characters or white space between the individual strings is ignored. For example, the phrase "product 123" matches custom metadata that contains any of these strings:

product 123
product123
product_123
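
The term-splitting and phrase rules above can be approximated in a few lines of Python. This is a rough model for reasoning about which strings match, not the metadata query engine's actual analyzer. The function names are hypothetical, and wildcard characters are not handled.

```python
import re

# A search term is a run of letters or a run of digits.
# Every other character acts as a term separator.
TERM = re.compile(r"[^\W\d_]+|\d+")


def search_terms(text):
    """Split text into lowercase terms (search terms are not case sensitive)."""
    return [term.lower() for term in TERM.findall(text)]


def matches_any_term(query, indexed_text):
    """True if at least one complete query term occurs in the indexed text."""
    indexed = set(search_terms(indexed_text))
    return any(term in indexed for term in search_terms(query))


def matches_phrase(phrase, indexed_text):
    """True if the phrase terms occur consecutively and in order."""
    wanted = search_terms(phrase)
    found = search_terms(indexed_text)
    n = len(wanted)
    return n > 0 and any(found[i:i + n] == wanted
                         for i in range(len(found) - n + 1))


print(search_terms("product123"))                             # ['product', '123']
print(matches_any_term("Annual", "AnnualReport_2012.pdf"))    # False
print(matches_phrase("product 123", "product_123"))           # True
```

Under this model, the string A3534 splits into the terms a and 3534, which is why it matches any text containing the word a unless it is written as a phrase.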

Multiple values for a single property

A criterion can specify multiple values for a single property. To specify multiple values, use this format:

property:([+|-]value [[+|-]value]...)

In this format, the parentheses are required.

For example, this query expression finds objects in either the HlthReg-107 or HlthReg-224 retention class:

retentionClass:(HlthReg-107 HlthReg-224)

This query expression finds objects with custom metadata that contains the string finance but not the string foreign.

customMetadataContent:(+finance -foreign)

When you specify multiple values for a single property, you can precede each value with a Boolean operator. For example, this query expression finds objects whose paths contain sales but not 2012:

objectPath:(+sales -2012)

When you specify multiple values for a single property, you can combine values that are preceded by Boolean operators with values that do not have Boolean operators. In this case, objects that match the property values that are not preceded by Boolean operators may or may not appear in the result set, but objects that match the terms without Boolean operators are sorted higher in the query results than objects that don’t match those terms.

For example, this query expression finds objects that have custom metadata that contains both the terms quarterly report and accounting department or only the term quarterly report:

customMetadataContent:(+"quarterly report" "accounting department")

Objects that contain both terms are sorted higher in the query results.
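
As an illustration of the multiple-value format, the following sketch assembles a criterion from required (+), excluded (-), and optional values. The helper names are hypothetical. Quoting values that contain a space is an assumption based on the phrase examples above.

```python
def quote_if_phrase(value):
    """Put multi-word values in double quotation marks so they form a phrase."""
    return f'"{value}"' if " " in value else value


def multi_value(prop, required=(), excluded=(), optional=()):
    """Build property:([+|-]value [[+|-]value]...) with the required parentheses."""
    parts = [f"+{quote_if_phrase(v)}" for v in required]
    parts += [f"-{quote_if_phrase(v)}" for v in excluded]
    parts += [quote_if_phrase(v) for v in optional]
    if not parts:
        raise ValueError("at least one value is required")
    return f"{prop}:({' '.join(parts)})"


print(multi_value("retentionClass", optional=["HlthReg-107", "HlthReg-224"]))
print(multi_value("customMetadataContent",
                  required=["quarterly report"],
                  optional=["accounting department"]))
```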

Value ranges

You can query based on ranges of values for properties with numeric, string, or date data types. These properties are accessTime, accessTimeString, changeTimeString, dpl, hash, hashScheme, ingestTime, ingestTimeString, retention, retentionClass, retentionString, size, updateTime, updateTimeString, and utf8Name. You can also query based on ranges for content properties with numeric, string or date data types.

Criteria that query for a range of values for a single property can have either of these formats:

  • For a range that includes the start and end values:
    property:[start-value TO end-value]

    In this format, the square brackets are required.

    For example, this query expression finds objects that were ingested from 0800 through 0900 UTC on March 1, 2012, inclusive:

    ingestTimeString:[2012-03-01T08:00:00-0000 TO 2012-03-01T09:00:00-0000]
  • For a range that does not include the start or end values:
    property:{start-value TO end-value}

    In this format, the curly braces are required.

    For example, this query expression finds objects that have names that occur alphabetically between Brown_Lee.xls and Green_Chris.xls, exclusive of those values:

    utf8Name:{Brown_Lee.xls TO Green_Chris.xls}
    Note: utf8Name property values are case sensitive and are ordered according to the positions of characters in the UTF-8 character table.

You can mix square brackets and curly braces in an expression. For example, this query expression finds objects that were ingested from 0800 to 0900 UTC on March 1, 2012, including objects that were ingested at 0800 but excluding objects that were ingested at 0900:

ingestTimeString:[2012-03-01T08:00:00-0000 TO 2012-03-01T09:00:00-0000}

When querying for a range of property values, you can precede the whole criterion with a Boolean operator but you cannot precede an individual value with a Boolean operator. For example, the query expression on the first line below is valid; the criterion on the second line is not:

Valid: +retentionString:[2013-07-01T00:00:00 TO 2013-07-31T00:00:00]
Invalid: retentionString:[+2013-07-01T00:00:00 TO 2013-07-31T00:00:00]

When querying for a range of values, you can replace a value with an asterisk (*) to specify an unlimited range. For example, this query expression finds objects with a size equal to or greater than two thousand bytes:

size:[2000 TO *]

This query expression finds objects with change times before 9:00 AM, March 1, 2012 in the local time zone of the HCP system:

changeTimeString:[* TO 2012-03-01T09:00:00]
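
The range formats above differ only in the bracket used at each end and in whether an end is replaced by an asterisk. A minimal sketch follows; the function name range_criterion and its parameters are hypothetical. A Boolean operator, if any, goes in front of the whole criterion, never in front of an individual value.

```python
def range_criterion(prop, start=None, end=None,
                    include_start=True, include_end=True, op=""):
    """Build a range criterion; None for start or end means unlimited (*)."""
    left = "[" if include_start else "{"
    right = "]" if include_end else "}"
    low = "*" if start is None else str(start)
    high = "*" if end is None else str(end)
    return f"{op}{prop}:{left}{low} TO {high}{right}"


print(range_criterion("size", 2000))
# size:[2000 TO *]
print(range_criterion("utf8Name", "Brown_Lee.xls", "Green_Chris.xls",
                      include_start=False, include_end=False))
# utf8Name:{Brown_Lee.xls TO Green_Chris.xls}
```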

Wildcard characters

You can use the question mark (?) and asterisk (*) wildcard characters when specifying values for these object properties:

  • customMetadataContent
  • hash
  • hashScheme
  • retentionClass
  • objectPath
  • utf8Name
  • Content properties

For example, this criterion finds objects assigned to any retention class starting with HlthReg, such as HlthReg-107 or HlthReg-224:

retentionClass:HlthReg*
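
To see which values a wildcard pattern would cover, the pattern can be translated into a regular expression. The sketch below assumes the usual Solr meaning of the wildcards (a question mark stands for exactly one character, an asterisk for zero or more). It is a local approximation with hypothetical function names, not the engine's own matching code.

```python
import re


def wildcard_to_regex(pattern):
    """Translate ? and * into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # zero or more characters
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def wildcard_match(pattern, value):
    """True if the whole value matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(value) is not None


print(wildcard_match("HlthReg*", "HlthReg-107"))      # True
print(wildcard_match("HlthReg-10?", "HlthReg-1077"))  # False
```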

Query expression considerations

The metadata query engine interprets query expressions as follows:

  • If the query expression consists of a single criterion without a Boolean operator, objects in the result set must meet the criterion. For example, this query expression finds objects with custom metadata that contains the string accounting:
    customMetadataContent:accounting

    The expression above is equivalent to this expression that uses the plus sign (+):

    +customMetadataContent:accounting
  • If a query expression consists of multiple criteria without Boolean operators, objects in the result set must meet at least one of the criteria. For example, this query expression finds objects that have a retention setting of Deletion Allowed or are on hold or will be shredded on deletion:
    retention:0 hold:true shred:true
  • The greater the number of criteria an object meets, the higher the object is in the default sort order. For example, with this query expression, objects that match all three criteria are sorted higher than those that match only two, and those that match only two are sorted higher than those that match only one:
    retention:0 hold:true shred:true
  • If a plus sign precedes some search criteria but not others, the criteria that are not preceded by a plus sign have no effect on which objects are returned. For example, this query expression finds objects that have a utf8Name property with the value Q1_2012.ppt, regardless of whether they are in the finance namespace owned by the europe tenant:
    +utf8Name:"Q1_2012.ppt" namespace:"finance.europe"

    Objects that match the namespace criterion are sorted higher in the result set than those that do not match it.

  • If a minus sign precedes some search criteria but not others and no criteria have plus signs, the query expression finds objects that do not match the criteria preceded by the minus signs and do match at least one of the criteria without a Boolean operator. For example, this query expression finds objects that are not in the finance namespace owned by the europe tenant and can be deleted:
    -(namespace:"finance.europe") retention:0

    This query finds objects that are not in the finance namespace owned by the tenant named europe and either can be deleted or can be indexed (or both):

    -namespace:"finance.europe" retention:0 index:1
  • If a Boolean operator precedes an opening parenthesis, that operator applies to the entire set of criteria inside the parentheses, not the individual criteria. For example, this query expression finds objects that are on hold or have a retention setting of Deletion Prohibited:
    +(hold:true retention:"-1")
  • These characters have special meaning when specified in query expressions:
    ? * + - ( ) [ ] { } " :

    To specify one of these characters without special meaning in a query expression, precede the character with a backslash (\). To specify a backslash in a query expression, precede the backslash with another backslash.
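
The rules in this list can be summarized in a short model. The sketch below has two hypothetical helpers: escape, which adds the backslashes described in the last item, and in_result_set, which decides whether an object is returned given the operator on each criterion and whether the object matches it. It is a simplified reading of the rules above and does not model sort order.

```python
# Characters with special meaning in query expressions, plus the backslash.
SPECIAL = set('?*+-()[]{}":\\')


def escape(text):
    """Precede each special character with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in text)


def in_result_set(criteria):
    """criteria is a list of (operator, matched) pairs; operator is '', '+', or '-'."""
    # An object that matches any criterion preceded by a minus sign is excluded.
    if any(matched for op, matched in criteria if op == "-"):
        return False
    # If any criteria have plus signs, only those decide membership.
    required = [matched for op, matched in criteria if op == "+"]
    if required:
        return all(required)
    # Otherwise at least one criterion without an operator must match.
    unsigned = [matched for op, matched in criteria if op == ""]
    return any(unsigned) if unsigned else True


print(escape("Q1(2012).ppt"))                        # Q1\(2012\).ppt
print(in_result_set([("+", True), ("", False)]))     # True
print(in_result_set([("-", False), ("", False)]))    # False
```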

Criteria for advanced queries with the Data Discovery Suite search facility

This section describes the criteria you use to construct advanced queries while the Data Discovery Suite search facility is active. For more information about the query language used with this search facility, see the applicable Data Discovery Suite documentation.

Basic criteria

The basic formats for criteria for advanced queries with the Data Discovery Suite search facility are:

property:value

[property:](int32|float|double|datetime|string|phrase|starts-with|ends-with)(value[(,option)...])

property:(and|or|not|andnot|any|range|rank|near|onear)(value[(,value)...][(,option)...])

You can also precede any of these formats with the not operator followed by the rest of the criterion in parentheses.

To search for object content, omit the property: entry.

One of the options you can specify is mode. For object content searches this can be any, all, or phrase, for example, mode="all".

Here are some examples of basic criteria:

  • This advanced query returns objects for which the POSIX user ID of the owner is 54:
    uid:54
  • This advanced query returns all email objects:
    contenttype:string("message/rfc822")
  • This advanced query returns all objects that are equal to or larger than 25,000 bytes:
    size:range(25000, max, from="GE")
  • This advanced query returns objects that are not email from rsilver@example.com or pcornflower@example.com:
    not(emailfrom:or(rsilver@example.com, pcornflower@example.com))
  • This advanced query returns objects with content that includes the exact phrase “account value”:
    string("account value", mode="phrase")

Most of the properties for advanced queries correspond to those for structured searches, but the property names differ, and in some cases, the values are expressed differently.
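
As a sketch of the range form used in the examples above, the following helper assembles a range criterion with optional from and to options. The name dds_range and its parameters are hypothetical. The option values GE, LE, and LT are the ones that appear in this chapter; the helper does not check them.

```python
def dds_range(prop, low="min", high="max", from_=None, to=None):
    """Build property:range(low, high[, from="..."][, to="..."])."""
    args = [str(low), str(high)]
    if from_ is not None:
        args.append(f'from="{from_}"')
    if to is not None:
        args.append(f'to="{to}"')
    return f"{prop}:range({', '.join(args)})"


print(dds_range("size", 25000, from_="GE"))
# size:range(25000, max, from="GE")
print(dds_range("size", high=5000, to="LT"))
# size:range(min, 5000, to="LT")
```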

Complex criteria

For complex criteria, you can use these formats:

criterion [(and|or|andnot|any|rank|near|onear) criterion]...

criterion [(and|or|andnot|any|rank|near|onear)(criterion)]...

In these formats, criterion is any basic or complex criterion.

As with basic criteria, you can precede these formats with the not operator followed by the rest of the complex criterion in parentheses.

Here are some examples of complex criteria:

  • This advanced query returns only email objects that are not from rsilver@example.com or pcornflower@example.com:
    contenttype:string("message/rfc822") and
    not(emailfrom:(rsilver@example.com or pcornflower@example.com))
  • This advanced query returns objects that expire before February 1, 2015, and for which either the UID is less than or equal to 56 or the UID is greater than 56 and the GID is not less than 30:
    expirationtime:range(1970-01-01T00:00:10, 2015-02-01T00:00:00) and
    or(uid:range(min, 56, to="LE"),
    andnot(uid:range(56, max), gid:range(min,30, to="LT")))
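
The infix form of complex criteria can be composed programmatically, as in the sketch below. The helper names join_criteria and negate are hypothetical. The operator list is the one given in the formats above.

```python
INFIX_OPERATORS = {"and", "or", "andnot", "any", "rank", "near", "onear"}


def join_criteria(operator, *criteria):
    """Combine criteria as: criterion operator criterion ..."""
    if operator not in INFIX_OPERATORS:
        raise ValueError(f"unsupported operator: {operator}")
    if len(criteria) < 2:
        raise ValueError("at least two criteria are required")
    return f" {operator} ".join(criteria)


def negate(criterion):
    """Wrap a basic or complex criterion in not(...)."""
    return f"not({criterion})"


senders = "emailfrom:(" + join_criteria(
    "or", "rsilver@example.com", "pcornflower@example.com") + ")"
print(join_criteria("and", 'contenttype:string("message/rfc822")', negate(senders)))
```

Run as written, this prints the first example above on a single line.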

Properties for advanced queries

The properties you can use for advanced queries differ depending on the active search facility.

Properties for advanced queries with the metadata query engine

This section describes properties you can use in advanced queries while the metadata query engine is active.

customMetadataContent property

To search for objects based on the full-text content of custom metadata, you specify the customMetadataContent property in an advanced query. Criteria that use this property find objects only in namespaces that have full-text indexing of custom metadata enabled.

When custom metadata is indexed for full-text searching, the XML is treated as text, not as structured content. Similarly, the customMetadataContent property value is treated as text.

aclGrant property

To query for objects based on the content of ACLs, you specify the aclGrant property in an advanced query. Valid values for this property have these formats:

"permissions"

"permissions,USER[,location,username]"

"permissions,GROUP,location,(ad-group-name|all_users|authenticated)"

In these formats:

  • permissions

    One or more of these with no space between them:

    • R

      Read_ACL

    • r

      Read

    • W

      Write_ACL

    • w

      Write

    • d

      Delete

    If you specify only permissions as the aclGrant property value, the advanced query finds objects with ACLs that grant the specified permissions to any user or group.

  • USER

    Required when querying for objects with ACLs that grant permissions to a specified user.

    If you are accessing the Metadata Query Engine Console with a tenant-level user account that’s defined in HCP, you can find objects that have ACLs that grant the specified permissions to that user account by specifying only a permissions value and USER.

  • GROUP

    Required when querying for objects with ACLs that grant permissions to a specific group of users.

  • location

    The location in which the specified user or group is defined. Valid values are either:

    • The name of an HCP tenant
    • The name of an AD domain preceded by an at sign (@)

    If the value for the aclGrant property includes all_users or authenticated, location must be the name of an HCP tenant.

  • username

    The name of a user to which the matching ACLs grant the specified permissions. Valid values are:

    • The username for a user account that’s defined in HCP.
    • The username for an AD user account. This can be either the user principal name or the Security Accounts Manager (SAM) account name for the AD user account.
  • ad-group-name

    The name of an AD group to which the matching ACLs grant the specified permissions.

  • all_users

    Represents all users.

  • authenticated

    Represents all authenticated users.

Specifying permissions

The permissions in an aclGrant property value must be specified in this order:

R, r, W, w, d

For example, to find objects that have ACLs that grant write and write_ACL permissions, and only those permissions, to the user rsilver who is defined in the europe tenant, specify this advanced query:

aclGrant:"Ww,USER,europe,rsilver"

You can replace one or more permissions with the asterisk (*) wildcard character. When you do so, you still need to specify permissions in the correct order.

When you specify both an asterisk and one or more permission values, the Console returns objects with ACLs that grant only the permissions you explicitly specify or that grant the permissions you explicitly specify and any permissions represented by the asterisk. For example, this advanced query returns objects with ACLs that grant read, read_ACL, write, and write_ACL permissions and may also grant delete permission:

aclGrant:"RrWw*"

A single asterisk represents all the missing permissions in the location where it appears. For example, in this advanced query, the wildcard character represents any combination of write, write_ACL, and delete permissions:

aclGrant:"r*"

In this advanced query, the wildcard character represents any combination of read and write_ACL permissions:

aclGrant:"R*w"

In this advanced query, the wildcard character represents only read_ACL permission:

aclGrant:"*r"

You can specify multiple asterisks in an advanced query. For example, this advanced query returns objects with ACLs that grant read permission and any combination of other permissions to the AD group named managers that is defined in the corp.widgetco.com domain:

aclGrant:"*r*,GROUP,@corp.widgetco.com,managers"

By replacing all permission values with a single asterisk, you query for objects that have ACLs that grant any combination of permissions. For example, if you are accessing the Console with a tenant-level user account, this advanced query finds objects with ACLs that grant any combination of permissions to that user account:

aclGrant:"*,USER"

aclGrant considerations

These considerations apply when you specify the aclGrant property in an advanced query:

  • The entire value for this property must be enclosed in double quotation marks (").
  • The locations and usernames you specify are not case sensitive.
  • The group names you specify, except all_users and authenticated, are not case sensitive.
  • The permission values you specify and the values USER and GROUP are case sensitive.
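
The ordering and wildcard rules for aclGrant permissions can be modeled as follows. This is an illustrative sketch with hypothetical function names. It treats each asterisk as "any combination of the permissions that could appear at this position", which matches the examples above, but it is not the Console's implementation.

```python
import re

# Required order: Read_ACL, Read, Write_ACL, Write, Delete.
ORDER = "RrWwd"


def canonical(perms):
    """Return the permission letters in the required order (case sensitive)."""
    if not perms or set(perms) - set(ORDER):
        raise ValueError(f"invalid permissions: {perms!r}")
    return "".join(p for p in ORDER if p in perms)


def acl_grant_user(perms, location, username):
    """Build an aclGrant criterion for a specific user."""
    return f'aclGrant:"{canonical(perms)},USER,{location},{username}"'


def pattern_matches(pattern, granted):
    """True if the permissions granted by an ACL fit a pattern such as R*w."""
    regex = "".join("[RrWwd]*" if ch == "*" else re.escape(ch) for ch in pattern)
    return re.fullmatch(regex, canonical(granted)) is not None


print(acl_grant_user("wW", "europe", "rsilver"))   # aclGrant:"Ww,USER,europe,rsilver"
print(pattern_matches("RrWw*", "RrWwd"))           # True
print(pattern_matches("r*", "Rr"))                 # False
```

In the last call, the ACL also grants read_ACL permission, which comes before r in the required order, so the asterisk after r cannot stand for it.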

Content properties

When you search using a content property, the operators you can use depend on the data type of the property. Content properties can have these types:

  • Boolean
  • Datetime
  • Float
  • Integer
  • Text
  • Tokenized

Searches using text content properties find an element or attribute only if it exactly matches the search string.

Tokenized content properties are indexed for full-text search. Therefore, you can search for any of several words, words that occur in any order, or a specific phrase.

To learn the valid operations on a content property, select the property on the Structured Query page. The middle dropdown list then shows the operations you can use.

Properties for advanced queries with the Data Discovery Suite search facility

The table below presents the properties you can use in advanced queries while the Data Discovery Suite search facility is active. For each property, the table shows the case-sensitive property name, the data type, a description, an example, and the equivalent structured search property, if any. Values for properties with a data type of string follow the rules for search terms in simple searches.

Note: You cannot use $now for datetime values in advanced queries.

Advanced search property | Data type | Description | Example | Structured search equivalent
accesstime | Datetime | The POSIX access time (atime) attribute for the object. Users and applications can change this metadata. | accesstime:range(2011-11-12T05:00:00Z,max, from="GE") | Access Time
archivedtime | Datetime | The date and time the object was created in the namespace (that is, when the data was added to the namespace). | archivedtime:range(min,2011-11-01T05:00:00Z, to="LT") | Ingest Time
attachment | String | The name of a file attached to the email. | attachment:"Sales Quotas 2012" | Email Attachment Name
author | String | The value of the Author metadata property that occurs in many Microsoft® Office and Adobe PDF documents. | author:"lgreen" | Author
body | Composite | The data content, name, title, and email subject of the object. This property is the equivalent of a simple search. | content:"medic*" | Object Content
category | String | The value of the Category metadata property that occurs in many Microsoft® Office and Adobe PDF documents. | category:"Sales Minutes" | Category
changetime | Datetime | The POSIX change time (ctime) attribute for the object. This is the last time the object metadata changed. | changetime:range(2011-11-01T04:00:00Z,max, from="GE") | Change Time
charset | String | The character set or encoding used in the document. Use unknown for documents for which HCP cannot determine the character set. | charset:string("utf-8") | Character Set
contenttype | String | The object MIME type. | contenttype:string("application/zip") | Object Type
dataindexed | Int32 |

An indication of whether the object content and/or custom metadata is indexed and whether the object name is valid UTF-8 encoding. Valid values are the sum of any combination of these:

  • 0

    Object name is percent encoded, if necessary.

    Content is indexed.

  • 2

    Content is not indexed because HCP could not determine the MIME type.

  • 3

    HDDS: Content is not indexed because it exceeds ten MB.

    HCP: Content is not indexed because it exceeds 50 MB.

  • 10

    Custom metadata is indexed.

  • 20

    Custom metadata is not indexed due to invalid XML.

  • 30

    HDDS: Content is not indexed because it exceeds ten MB.

    HCP: Content is not indexed because it exceeds 50 MB.

  • 90

    Object has no custom metadata.

  • 100

    Object name is valid UTF-8 encoding.

| dataindexed:range(100,190,from="GE",to="LT") | N/A
emailbcc | String | The email address of one blind-copied email recipient. | emailbcc:or("pcornflower@example.com") | Email BCC
emailcc | String | The email address of one copied email recipient. | emailcc:rsilver@example.com | Email CC
emaildate | Datetime | The date the email was sent. | emaildate:range(2011-11-14T04:00:00Z,2011-11-15T04:00:00Z,from="GE", to="LT") | Email Sent Date
emailfrom | String | The email address of an email sender. | emailfrom:"lgreen@example.com" | Email From
emailmessageid | String | The ID of the email in the namespace. | emailmessageid:"73495B59-04A3-59FC-573D-8380897A78BB...e.com-mbox.eml" | Email Message ID
emailsubject | String | The text in the email subject line. | emailsubject:"Weekly SalesDepartment Meeting,Minutes -- 2/2/12" | Email Subject
emailto | String | The email address of one email recipient. | emailto:rsilver@example.com | Email To
expirationtime | Datetime |

The retention setting for the object. Valid values are:

  • For objects that can never be deleted:
    1970-01-01T00:00:03Z
  • For objects that can be deleted at any time:
    range(current-datetime,max)
  • For objects that expire at a specific date and time:
    range(datetime,datetime,from="GE",to="LE")

    In this criterion, the two variables specify the same date and time.

  • For objects that expire on a specific date:
    range(dateT00:00:00Z,date-plus-oneT00:00:00Z,from="GE", to="LT")

    In this criterion, the second date is one day later than the first.

  • For objects that do not yet have a retention setting:
    1970-01-01T00:00:02Z
  • For objects that have expired:
    range(1970-01-01T00:00:10Z,current-datetime)

Any of these values can return objects for which the retention setting is a retention class. This happens for objects that had a retention setting before being assigned to a class.

| expirationtime:range(2015-01-13T00:00:00Z,2015-01-14T00:00:00Z,from="GE", to="LT") | Retention
filename | String | All or part of a path and object name, starting after fcfs_data, data, or rest. | filename:"french/news_f/pres03_f/mou_16feb03_f.doc" | Object Path
format | String | The format of the object content. This is typically the name of the application used to create the content. | format:"Adobe Photoshop" | Content Format
gid | Int32 | The POSIX group ID of the owning group for the object. | gid:24 | GID
hash | String | The cryptographic hash value of the object. | hash:"9B6D8A603659B447DA4..." | Hash
hold | Int32 |

An indication of whether the object is currently on hold. Valid values are:

  • 1

    The object is on hold.

  • 0

    The object is not on hold.

| hold:"1" | Retention Hold
modtime | Datetime | The POSIX modify time (mtime) attribute for the object. Users and applications can change this metadata. | modtime:not(range(2011-11-04T04:00:00Z,2011-11-10T04:00:00Z,from="GE", to="LE")) | Modification Time
permissions | Int32 | The decimal equivalent of the octal value of the POSIX permissions for the object. | permissions:420 (This is equal to octal 644.) | Permissions
retentionclass | String | The retention class specified as the retention setting for the object. | retentionclass:"HlthReg-107a" | Retention Class
shred | Int32 |

An indication of whether the object will be shredded when it’s deleted. Valid values are:

  • 1

    The object will be shredded.

  • 0

    The object will not be shredded.

| shred:"0" | Shredding
size | Float | The object size, in bytes. This is the exact size of the object content. For example, to search for a two KB object, you need to specify 2048, not 2000. | size:range(min, 5000, to="LT") | Size
subject | String | The value of the Subject metadata property that occurs in many Microsoft Office and Adobe PDF documents. | subject:quotas | Subject
title | String | The value of the Title metadata property that occurs in many Microsoft Office and Adobe PDF documents. | title:"Monthly Sales Statistics --February 2012" | Title
uid | Int32 | The POSIX user ID of the object owner. | uid:72 | UID
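
Because the permissions property takes the decimal equivalent of the octal POSIX mode, a small conversion step is needed before building the criterion. The sketch below uses hypothetical helper names and shows the conversion for the example value in the table.

```python
def permissions_value(octal_mode):
    """Convert an octal mode string such as '644' to its decimal value."""
    return int(octal_mode, 8)


def permissions_criterion(octal_mode):
    """Build the permissions criterion for an octal mode string."""
    return f"permissions:{permissions_value(octal_mode)}"


print(permissions_criterion("644"))   # permissions:420
print(format(420, "o"))               # 644
```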

Performing an advanced search

Procedure

  1. In the Search Console, click the Advanced Search tab.

    While the metadata query engine is active, the Advanced Search tab is called the Advanced Query tab and the Search button is called the Query button.
  2. In the entry field on the Advanced Search page, type the query for your search.

  3. Click Search.

Next steps

While you’re working on a query specification, you can click Reset to return to the most recently submitted query.