Hitachi Vantara Knowledge

Testing pipelines

You can test an individual pipeline or a workflow pipeline by sending a single test document through it. This lets you view detailed test results for each stage, including:

  • How long it took the stage to process the document.
  • The fields and streams that were added and removed by the stage.
Note: This topic describes how to view which streams are added to your documents, but not the contents of those streams. For that information, see Viewing stream contents.

After a test finishes, you can also view a list of all the fields it discovered and add them to an existing index collection. For information, see Working with fields discovered by a workflow task.

Testing pipelines using archive files

You can use archive files, such as .zip or .tar files, to test your pipelines. This is useful if your pipeline includes one or more archive expansion stages.

When you test a pipeline using an archive file, the test results include the Expanded metric, which shows the number of documents extracted from the archive file.

When you test a workflow pipeline using an archive file and the Workflow Recursion enabled setting is off for the workflow, the test reports results only for the archive file, not for any of the archive's contents.
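Before testing with an archive, it can help to know roughly how many documents it holds, so that you have a number to compare against the Expanded metric. The following is a minimal sketch using Python's standard zipfile module. The function name count_archive_documents and the sample file names are hypothetical, and the count covers only top-level .zip entries. It does not look inside nested archives, and the product may count expanded documents differently.

```python
import io
import zipfile


def count_archive_documents(archive) -> int:
    """Count the regular files (not directory entries) in a .zip archive.

    `archive` can be a path or a binary file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        return sum(1 for info in zf.infolist() if not info.is_dir())


# Build a small archive in memory so the example is self-contained.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("reports/q1.txt", "first quarter")
    zf.writestr("reports/q2.txt", "second quarter")
    zf.writestr("readme.md", "sample archive")

buffer.seek(0)
print(count_archive_documents(buffer))
```

To check a real file, pass its path instead of the in-memory buffer.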

Tip: To get an idea of how your processing pipeline will handle the files in a particular data source, use a test document that is representative of all data in that data source.

If you get document failures when running a workflow task, you can test your pipelines with those failed documents to try to diagnose the failure. For more information, see Testing individual document failures.

If your test ends in the Halted state, verify that you've specified the test document URI correctly.
Note
  • When you test a pipeline that has Execute Action stages, the actions for those stages are performed during the test.
  • When you test a pipeline using an archive file:
    • The test halts after the first document failure it encounters.
    • The test might take a long time to complete.

Procedure

  1. To test an individual pipeline:

    1. In the Admin App, click Workflows.

    2. Click the Processing Pipelines window.

      Note: Skip this step to test a workflow pipeline instead of an individual pipeline.
    3. Click the pipeline you want to test.

    4. Click the Test Pipeline window.

  2. Click Select Document.

  3. Select the data connection for where the test document is stored. Then click Next.

  4. To select a test document, either click Browse and use the file browser to find the document you want, or click Enter URI and type the full path to the document in the data source.

    Note: The page shows up to 5000 files and folders. If the file you want isn't listed, use the Can't find what you're looking for? link, which lets you search the data connection directly. You cannot browse for objects using an HCP MQE data connection that isn't configured to connect to a specific namespace.

    To use the Enter URI option with the Amazon S3 and S3 Compatible data connections, the test document must have public read permissions in the data source. Otherwise, you need to provide a pre-signed URL for the document.

  5. Click Begin Test.

    The test document travels through each stage in the pipeline. A status icon is displayed on each stage as the test progresses. You can cancel the test at any time by clicking Cancel Test.

    Icon | Description
    GUID-B927583E-258E-4477-8BA7-5E29747BBF36-low.png | The test document skipped the stage.
    GUID-07C8169B-4CAB-4ED9-B3DF-07E73DE7B20C-low.png | The test failed for the stage.
    GUID-4F5310C9-06CF-4FDC-9BE0-9053E3BE5C8B-low.png | The test finished successfully for the stage.
  6. To view the results of the test for a stage, click the View Results link for the stage.

    If the test failed for the stage, the page displays error information for the stage.

    If the test completed successfully, the Changes tab displays only the fields that were affected by the stage and the Everything tab shows the complete document.

    On these tabs:

    • Removed fields are highlighted in red.
    • Added fields are highlighted in green.
    • Unchanged fields are not highlighted.
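The added, removed, and unchanged groupings on the results tabs amount to a comparison of the document's fields before and after a stage. The following is a minimal sketch of that comparison, not the product's implementation. The function name diff_fields and the sample field names are hypothetical. The "changed" group is an assumption, because this topic does not describe how a field with a modified value is displayed.

```python
def diff_fields(before: dict, after: dict) -> dict:
    """Group field names by how a stage affected them."""
    return {
        # Present only after the stage ran.
        "added": sorted(name for name in after if name not in before),
        # Present only before the stage ran.
        "removed": sorted(name for name in before if name not in after),
        # Present in both, but with a different value (assumed category).
        "changed": sorted(
            name for name in after if name in before and before[name] != after[name]
        ),
        # Present in both with the same value.
        "unchanged": sorted(
            name for name in after if name in before and before[name] == after[name]
        ),
    }


# Hypothetical document fields before and after one pipeline stage.
before_stage = {"title": "Q1 report", "author": "jdoe", "tmp_flag": "true"}
after_stage = {"title": "Q1 report", "author": "J. Doe", "language": "en"}

result = diff_fields(before_stage, after_stage)
for group, names in result.items():
    print(group, names)
```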

Viewing stream contents

You can use the Snippet Extraction stage to view stream contents.

Procedure

  1. Add a Snippet Extraction stage to your pipeline. See Adding stages to a pipeline.

  2. Configure the Snippet Extraction stage with these settings:

    • For Text Input Stream, specify the stream that holds the document contents. Typically, this is HCI_content or HCI_text.
    • For Snippet Output Field, specify $TestContents.
    Tip: Beginning a field name with a dollar sign ($) causes the field to be deleted when the workflow pipeline finishes processing the associated document. That is, the field is not indexed.

    Use this technique to prevent unnecessary fields from being indexed. For example, you can add a field called $meetsCondition to a document to satisfy a conditional statement later on in the pipeline, but the field might not include any valuable information for your users to search on.

  3. Test your pipeline. See Testing pipelines.

  4. When the test finishes, view the results for the Snippet Extraction stage and examine the value for the $TestContents field.

    For more information on this stage, see Snippet Extraction stage.
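The tip in step 2 says that fields whose names begin with a dollar sign ($) are deleted when the pipeline finishes with a document, so they are never indexed. The following is a minimal sketch of that rule, assuming a document can be modeled as a dictionary of field names to values. The function name strip_temporary_fields and the title and author fields are hypothetical, and this is not the product's own code.

```python
def strip_temporary_fields(document: dict) -> dict:
    """Return a copy of the document without fields whose names start with '$'."""
    return {
        name: value
        for name, value in document.items()
        if not name.startswith("$")
    }


# A document as it might look at the end of a pipeline run.
processed = {
    "title": "Q1 report",
    "author": "jdoe",
    "$TestContents": "First 100 characters of the document text",
    "$meetsCondition": "true",
}

indexed = strip_temporary_fields(processed)
print(sorted(indexed))
```

Because $TestContents is dropped in this way, it appears in the test results for the Snippet Extraction stage but not in the index.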

Working with fields discovered by a test

After you run a test on an individual pipeline or a workflow pipeline, you can view the fields that the pipeline produced from the test document. You can then take those fields and add them directly to an index collection.

To add discovered fields to an index collection:

Procedure

  1. Run a test for a pipeline. See Testing pipelines.

  2. When the test finishes, click the Discovered Fields tab.

    The Discovered Fields tab lists the fields output by the test. For each field, the list shows the field name and type.
  3. Optionally, to add these fields to an index collection:

    1. Click Add Fields to Index.

    2. Select the index collection that you want to add the fields to.

      For information on configuring these fields in the index collection, see Adding and editing fields in an index collection schema.

 
