Aggregations

Aggregations track information across all the documents that a workflow processes. For example, you can use aggregations to:

  • Track the average size of all processed documents.
  • Keep a list of all unique fields discovered in processed documents.
  • Tally the number of processed documents by document type.
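
The three examples above can be sketched in a few lines of Python. This is only an illustration of what such aggregations compute, not HCI code: the dictionary-based documents, the field names `size` and `doc_type`, and the function name `aggregate` are all assumptions made for this sketch.

```python
from collections import Counter


def aggregate(documents):
    """Compute three illustrative aggregations over dict-like documents."""
    total_size = 0
    count = 0
    unique_fields = set()
    by_type = Counter()

    for doc in documents:
        count += 1
        total_size += doc.get("size", 0)           # feeds the average size
        unique_fields.update(doc.keys())           # every field name seen so far
        by_type[doc.get("doc_type", "unknown")] += 1  # tally per document type

    return {
        "average_size": total_size / count if count else 0.0,
        "unique_fields": sorted(unique_fields),
        "count_by_type": dict(by_type),
    }


# Hypothetical documents, used only to exercise the sketch.
docs = [
    {"size": 100, "doc_type": "pdf", "author": "a"},
    {"size": 300, "doc_type": "txt"},
    {"size": 200, "doc_type": "pdf", "title": "t"},
]
result = aggregate(docs)
print(result)
```

Each value is updated as documents pass through, so no document needs to be kept in memory after it has been counted.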

As a workflow task runs, it updates its aggregations once per minute.

Each workflow can have multiple aggregations.

Viewing aggregation values

Aggregation values can be downloaded as part of a workflow report. Also, the Admin App includes graphs and tabular views for some aggregation types. For information, see Viewing aggregations for a workflow.

Aggregation types

Each entry below gives the type of aggregate, followed by its description and its configuration settings.

Average Aggregate

Calculates the average (arithmetic mean) value for a numeric field across all documents processed.

Field Name: The field the aggregation should track.

Document Field Aggregate

Keeps track of all fields that the workflow discovered in the documents it processed. For each field, the aggregation shows:

  • Name
  • Type: The type of data that the field contains. For information, see Field types.
  • Count: The number of documents in which the field appears.
  • An Example Value for the field.

You can use this aggregation type to import fields into an index collection. For information, see Working with fields discovered by a workflow task.

Configuration settings: None

Long Sum Aggregate

Calculates the sum of all values for a numeric field.

Field Name: The field the aggregation should track.

File Extension Aggregate

Tallies the unique file extensions for all documents processed.

Note: Avoid using this aggregation type if your workflow processes a document set with a large number of unique file extensions (for example, if your documents use timestamps as extensions). This can cause task discovery metrics to consume large amounts of memory in a short amount of time.
File Name Field: Document field that contains file names. The default is HCI_filename.

Maximum Aggregate

Calculates the largest value for a numeric field across all documents processed.

Field Name: The field the aggregation should track.

String Count Aggregate

Tallies the unique values for a string field across all documents processed.

Field Name: The field the aggregation should track.

Minimum Aggregate

Calculates the smallest value for a numeric field across all documents processed.

Field Name: The field the aggregation should track.

Standard Deviation Aggregate

Calculates the standard deviation of all values for a specified field across all documents processed.

Field Name: The field the aggregation should track.

Standard Deviation Type: One of these:

  • Population
  • Sample

Variance Aggregate

Calculates the variance of all values for a specified field across all documents processed.

Field Name: The field the aggregation should track.

Variance Type: One of these:

  • Population
  • Sample
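
The numeric aggregation types above can all be maintained incrementally, one value at a time. The following Python sketch shows one common way to do that (Welford's method for the variance) and makes the difference between the Population and Sample settings explicit: the population form divides by n, the sample form by n - 1. The class name `RunningStats` and its methods are hypothetical; the source does not describe how the product computes these values internally.

```python
import math


class RunningStats:
    """Streaming sum, min, max, mean, and variance for one numeric field."""

    def __init__(self):
        self.count = 0
        self.total = 0
        self.minimum = None
        self.maximum = None
        self._mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, value):
        self.count += 1
        self.total += value
        self.minimum = value if self.minimum is None else min(self.minimum, value)
        self.maximum = value if self.maximum is None else max(self.maximum, value)
        # Welford's update: adjust the mean, then accumulate squared deviation.
        delta = value - self._mean
        self._mean += delta / self.count
        self._m2 += delta * (value - self._mean)

    @property
    def mean(self):
        return self._mean if self.count else None

    def variance(self, kind="population"):
        if kind not in ("population", "sample"):
            raise ValueError("kind must be 'population' or 'sample'")
        divisor = self.count if kind == "population" else self.count - 1
        if divisor <= 0:
            return None  # not enough values for this variance type
        return self._m2 / divisor

    def std_dev(self, kind="population"):
        var = self.variance(kind)
        return None if var is None else math.sqrt(var)


# Hypothetical field values, used only to exercise the sketch.
stats = RunningStats()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.add(value)

print(stats.total, stats.minimum, stats.maximum, stats.mean)
print(stats.variance("population"), stats.variance("sample"))
```

For these eight values the squared deviations sum to 32, so the population variance is 32 / 8 = 4 and the sample variance is 32 / 7. As a general rule, the sample form is the usual choice when the processed documents are treated as a subset of a larger collection.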

Default aggregations

Each workflow you create includes these default aggregations. You can delete any of them from a workflow.

Aggregation: Discovered Fields
Type: Document Field Aggregate
Description: See Document Field Aggregate in Aggregation types.

Aggregation: MIME Type
Type: String Count Aggregate
Description: Tallies the unique values for the Content_Type field, which contains MIME types.

Aggregation: Extensions
Type: File Extension Aggregate
Description: Tallies the unique file extensions of the file names in the HCI_filename field.
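
As a rough illustration of what the three default aggregations collect, the Python sketch below processes a few dictionary-based documents. Only the field names `Content_Type` and `HCI_filename` come from this page; the function name, the `size` field, the sample documents, and the use of Python type names in place of HCI field types are assumptions made for the sketch.

```python
import os
from collections import Counter


def default_aggregations(documents):
    """Mimic Discovered Fields, MIME Type, and Extensions for dict documents."""
    discovered = {}          # field name -> type, count, and one example value
    mime_types = Counter()   # tally of Content_Type values
    extensions = Counter()   # tally of extensions taken from HCI_filename

    for doc in documents:
        for name, value in doc.items():
            entry = discovered.setdefault(
                name,
                # Python type names stand in for HCI field types here.
                {"type": type(value).__name__, "count": 0, "example": value},
            )
            entry["count"] += 1

        if "Content_Type" in doc:
            mime_types[doc["Content_Type"]] += 1

        if "HCI_filename" in doc:
            _, ext = os.path.splitext(doc["HCI_filename"])
            extensions[ext.lstrip(".") or "(none)"] += 1

    return discovered, mime_types, extensions


# Hypothetical documents, used only to exercise the sketch.
docs = [
    {"HCI_filename": "report.pdf", "Content_Type": "application/pdf"},
    {"HCI_filename": "notes.txt", "Content_Type": "text/plain", "size": 42},
    {"HCI_filename": "summary.pdf", "Content_Type": "application/pdf"},
]
discovered, mime_types, extensions = default_aggregations(docs)
print(discovered)
print(mime_types)
print(extensions)
```

The `extensions` counter gains one key per distinct extension, which is why a document set with many unique extensions makes this kind of tally grow quickly.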

Aggregations, triggers, and workflow notifications

Aggregations are used as criteria for triggers. When an aggregation meets a condition that you specify, a trigger is activated. You can configure triggers to send you notifications, by email, for example, when they are activated.
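
The relationship between an aggregation value, a trigger condition, and a notification can be sketched as follows. This is a conceptual Python illustration only: the trigger dictionaries, the comparator strings, the aggregation names, and the `notify` callback are hypothetical and do not reflect the product's actual trigger configuration or API.

```python
import operator

# Comparators a hypothetical trigger condition might support.
COMPARATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
}


def evaluate_triggers(aggregations, triggers, notify):
    """Call notify for every trigger whose condition is met; return their names."""
    fired = []
    for trigger in triggers:
        value = aggregations.get(trigger["aggregation"])
        if value is None:
            continue  # the aggregation has no value yet
        compare = COMPARATORS[trigger["comparator"]]
        if compare(value, trigger["threshold"]):
            notify(f"{trigger['name']}: {trigger['aggregation']} is {value}")
            fired.append(trigger["name"])
    return fired


# Hypothetical aggregation values and trigger definitions.
aggregation_values = {"Average Size": 2048.0, "Max Size": 900}
trigger_definitions = [
    {"name": "large-average", "aggregation": "Average Size",
     "comparator": ">", "threshold": 1024},
    {"name": "huge-file", "aggregation": "Max Size",
     "comparator": ">=", "threshold": 5000},
]

messages = []  # stands in for an email notification channel
fired = evaluate_triggers(aggregation_values, trigger_definitions, messages.append)
print(fired, messages)
```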

For more information, see Triggers and Using aggregations and triggers to monitor a workflow.

Adding aggregations to workflows

Each workflow contains a number of default aggregations, but you can also add new ones.

For information on aggregations, see Aggregations.

Note: For an aggregation to collect, process, and display information about a workflow:
  • You need to run the workflow after adding a new aggregation to it.
  • The Collect Aggregation Metrics setting must be enabled for the workflow task. See Task settings.

Workflow Designer instructions

Procedure

  1. Click the Workflow Designer window.

  2. Select the workflow that you want.

  3. Click the Task window.

  4. Click the Aggregations window.

  5. Click the Add Aggregation + tab.

  6. Select an aggregation type.

  7. Name the aggregation, and, optionally, write a description for it.

  8. Click Add Aggregation to add it to your workflow. Your aggregation appears as a new tab.

    The aggregation displays information about the workflow the next time the workflow is run.

    Related CLI commands

    editWorkflow

    Related REST API methods

    PUT /workflows/{uuid}

Viewing aggregations for a workflow

While a workflow is running, or after it has finished running, you can view its aggregation values.

In the Admin App, you can view aggregation values. Some aggregation types are also displayed as graphs and tables.

You can also retrieve aggregation values as part of a workflow report.

Workflow Designer instructions

Procedure

  1. Click the Workflow Designer window.

  2. Select the workflow that you want.

  3. Click the Task window.

  4. Click the Aggregations window.

  5. Select the aggregation you want to view.

    Related CLI commands

    See Downloading task reports.

    Related REST API methods

    See Downloading task reports.

 
