Aggregations
Aggregations track information across all the documents that a workflow processes. For example, you can use aggregations to:
- Track the average size of all processed documents.
- Keep a list of all unique fields discovered in processed documents.
- Tally the number of processed documents by document type.
As a workflow task runs, it updates its aggregations once per minute.
Each workflow can have multiple aggregations.
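The three example uses listed above can be sketched in a few lines of Python. This is only an illustration of the idea, not how the product computes aggregations; the sample documents and the field names `name`, `size`, `type`, and `author` are hypothetical.

```python
from collections import Counter

# Hypothetical processed documents: each is a dict of field name -> value.
documents = [
    {"name": "a.pdf", "size": 1200, "type": "pdf"},
    {"name": "b.txt", "size": 300, "type": "text", "author": "kim"},
    {"name": "c.pdf", "size": 1500, "type": "pdf"},
]

total_size = 0
doc_count = 0
unique_fields = set()      # every field name seen in any document
type_tally = Counter()     # number of documents per document type

for doc in documents:
    doc_count += 1
    total_size += doc["size"]
    unique_fields.update(doc.keys())
    type_tally[doc["type"]] += 1

# Average size across all processed documents.
average_size = total_size / doc_count
```

Each value is updated as documents arrive, so nothing needs to be recomputed from the full document set.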
Viewing aggregation values
Aggregation values can be downloaded as part of a workflow report. Also, the Admin App includes graphs and tabular views for some aggregation types. For information, see Viewing aggregations for a workflow.
Aggregation types
Type of aggregate | Description | Configuration Settings |
Average Aggregate | Calculates the average (arithmetic mean) value for a numeric field across all documents processed. | Field Name: The field the aggregation should track. |
Document Field Aggregate | Keeps track of all fields that the workflow discovered in the documents it processed. For each field, the aggregation shows: You can use this aggregation type to import fields into an index collection. For information, see Working with fields discovered by a workflow task. | None |
Long Sum Aggregate | Calculates the sum of all values for a numeric field. | Field Name: The field the aggregation should track. |
File Extension Aggregate | Tallies the unique file extensions for all documents processed. Note: Avoid using this aggregation type if your workflow processes a document set with a large number of unique file extensions (for example, if your documents use timestamps as extensions). This can cause task discovery metrics to consume large amounts of memory in a short amount of time. | File Name Field: Document field that contains file names. The default is HCI_filename. |
Maximum Aggregate | Calculates the largest value for a numeric field across all documents processed. | Field Name: The field the aggregation should track. |
String Count Aggregate | Tallies the unique values for a string field across all documents processed. | Field Name: The field the aggregation should track. |
Minimum Aggregate | Calculates the smallest value for a numeric field across all documents processed. | Field Name: The field the aggregation should track. |
Standard Deviation Aggregate | Calculates the standard deviation of all values for a specified field across all documents processed. | Field Name: The field the aggregation should track. Standard Deviation Type: One of these: |
Variance Aggregate | Calculates the variance of all values for a specified field across all documents processed. | Field Name: The field the aggregation should track. Variance Type: One of these: |
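As a rough illustration of what the numeric aggregation types in the table compute, the following Python sketch keeps a running minimum, maximum, sum, average, variance, and standard deviation over a stream of values. It uses Welford's method, chosen here for illustration; the product's internal method is not documented in this section. The `RunningStats` class and the `sample` flag are hypothetical: the flag shows the two common statistical definitions (population and sample) and is not taken from the product's type settings.

```python
import math


class RunningStats:
    """Streaming min, max, sum, mean, and variance (Welford's method)."""

    def __init__(self):
        self.count = 0
        self.total = 0
        self.minimum = None
        self.maximum = None
        self._mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, value):
        self.count += 1
        self.total += value
        self.minimum = value if self.minimum is None else min(self.minimum, value)
        self.maximum = value if self.maximum is None else max(self.maximum, value)
        delta = value - self._mean
        self._mean += delta / self.count
        self._m2 += delta * (value - self._mean)

    @property
    def average(self):
        return self.total / self.count

    def variance(self, sample=False):
        # Assumption: sample=True divides by n - 1, otherwise by n.
        divisor = self.count - 1 if sample else self.count
        return self._m2 / divisor

    def std_dev(self, sample=False):
        return math.sqrt(self.variance(sample))


stats = RunningStats()
for field_value in [2, 4, 4, 4, 5, 5, 7, 9]:  # hypothetical field values
    stats.add(field_value)
```

Because each value is folded into the running totals as it arrives, the sketch never stores the individual values, which is the property that makes this kind of aggregate cheap to maintain over a large document set.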
Default aggregations
Each workflow you create includes these default aggregations. You can delete any of them from a workflow.
Aggregation | Type | Description |
Discovered Fields | Document Field Aggregate | See Document Field Aggregate in Aggregation types. |
MIME Type | String Count Aggregate | Tallies the unique values for the Content_Type field, which contains MIME types. |
Extensions | File Extension Aggregate | Tallies the unique file extensions found in the HCI_filename field. |
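To make the Extensions aggregation and the memory warning for the File Extension Aggregate more concrete, here is a minimal Python sketch of an extension tally. It is an illustration only: the helper names `extension_of` and `tally_extensions` are hypothetical, and lowercasing extensions is an assumption rather than documented product behavior.

```python
import os
from collections import Counter


def extension_of(filename):
    # os.path.splitext returns (root, ".ext"); drop the dot.
    # Lowercasing is an assumption made for this sketch.
    return os.path.splitext(filename)[1][1:].lower()


def tally_extensions(docs, field="HCI_filename"):
    """Count documents per file extension found in the given field."""
    tally = Counter()
    for doc in docs:
        name = doc.get(field)
        if name:
            ext = extension_of(name)
            if ext:
                tally[ext] += 1
    return tally


# A typical document set yields only a handful of distinct keys.
normal_docs = [{"HCI_filename": n} for n in ["a.pdf", "b.PDF", "c.txt"]]
normal = tally_extensions(normal_docs)

# Timestamps used as extensions yield one key per document, so the
# tally grows as fast as the document set itself.
timestamped_docs = [{"HCI_filename": f"log.{1700000000 + i}"} for i in range(1000)]
timestamped = tally_extensions(timestamped_docs)
```

In the second case the tally holds 1000 distinct keys for 1000 documents, which is the growth pattern the note in Aggregation types warns about.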
Aggregations, triggers, and workflow notifications
Aggregations are used as criteria for triggers. When an aggregation meets a condition that you specify, the trigger is activated. You can configure triggers to send you notifications (for example, by email) when they are activated.
For more information, see Triggers and Using aggregations and triggers to monitor a workflow.
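The relationship between aggregations and triggers can be pictured with a short Python sketch. It is a conceptual model, not the product's trigger configuration: the `Trigger` class, the `activated_triggers` function, and the aggregation names `Average Size` and `Failed Docs` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Trigger:
    name: str
    aggregation: str                    # name of the aggregation to watch
    condition: Callable[[float], bool]  # returns True when the trigger should fire


def activated_triggers(values: Dict[str, float], triggers: List[Trigger]) -> List[str]:
    """Return the names of triggers whose condition is met by the current values."""
    fired = []
    for trigger in triggers:
        value = values.get(trigger.aggregation)
        if value is not None and trigger.condition(value):
            fired.append(trigger.name)
    return fired


# Hypothetical current aggregation values for a workflow.
aggregation_values = {"Average Size": 2_500_000.0, "Failed Docs": 3.0}

triggers = [
    Trigger("large-docs", "Average Size", lambda v: v > 1_000_000),
    Trigger("many-failures", "Failed Docs", lambda v: v >= 10),
]

fired = activated_triggers(aggregation_values, triggers)
```

Here only `large-docs` is activated; a notification step (for example, sending an email) would then act on the names in `fired`.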
Adding aggregations to workflows
Each workflow contains a number of default aggregations, but you can also add new ones.
For information on aggregations, see Aggregations.
- After you add a new aggregation to a workflow, you need to run the workflow before the aggregation displays any information.
- The Collect Aggregation Metrics setting must be enabled for the workflow task. See Task settings.
Procedure
Click the Workflow Designer window.
Select the workflow that you want.
Click the Task window.
Click the Aggregations window.
Click the Add Aggregation + tab.
Select an aggregation type.
Name the aggregation, and, optionally, write a description for it.
Click Add Aggregation to add it to your workflow. Your aggregation appears as a new tab.
The aggregation displays information about the workflow the next time the workflow is run.
Related CLI commands
editWorkflow
Related REST API methods
PUT /workflows/{uuid}
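As a hedged sketch of calling the REST method above from Python, the following builds (but does not send) a PUT request for `/workflows/{uuid}` using only the standard library. The base URL, the example UUID, and the request body are placeholders: the workflow body schema and the authentication scheme are not described in this section, so the caller supplies both.

```python
import json
import urllib.request


def build_update_request(base_url, workflow_uuid, workflow_body, extra_headers=None):
    """Build a PUT request for /workflows/{uuid}; the caller supplies the body."""
    url = f"{base_url.rstrip('/')}/workflows/{workflow_uuid}"
    data = json.dumps(workflow_body).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if extra_headers:
        # For example, whatever authentication header your system requires.
        headers.update(extra_headers)
    return urllib.request.Request(url, data=data, headers=headers, method="PUT")


# Placeholder values: replace with your system's API base URL, a real
# workflow UUID, and a body that follows the product's workflow schema.
request = build_update_request(
    "https://hci.example.com/api",
    "1234-abcd",
    {"name": "example"},
)
```

Passing `request` to `urllib.request.urlopen` would send it; consult the product's REST API reference for the actual body format before doing so.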
Viewing aggregations for a workflow
While a workflow is running, or after it has finished running, you can view the aggregation values for the workflow.
In the Admin App, you can view aggregation values. Some aggregation types are also displayed as graphs and tables.
You can also retrieve aggregation values as part of a workflow report.
Workflow Designer instructions
Procedure
Click the Workflow Designer window.
Select the workflow that you want.
Click the Task window.
Click the Aggregations window.
Select the aggregation you want to view.
Related CLI commands
Related REST API methods