Each content class contains a list of content properties. Each content property has these settings:
- Type: The type of content that the property can extract field values from:
- XML: The property uses XPath to extract values from XML.
- JSON: The property uses JSONPath to extract values from JSON.
- PATTERN: The property uses regular expressions to extract values from any type of text-based file.
- Name: The name to assign to the resulting field. Add this field to an index collection so that users can search on it.
For a field to be indexed, its name:
- Cannot contain hyphens (-) or any other special characters.
- Cannot start with underscores (_) or numbers.
- Expression: An XPath, JSONPath, or regular expression used to match the contents of a document.
- For XPath syntax and functions, see https://developer.mozilla.org/en-US/docs/Web/XPath
- For JSONPath syntax, see https://github.com/jayway/JsonPath#operators
- For regular expression syntax, see https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html
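The field naming rules above can be checked before you create a property. The following is a minimal sketch, not the product's own validation: the function name `is_indexable_field_name` is hypothetical, and it assumes that "special characters" means anything other than ASCII letters, digits, and underscores.

```python
import re

# Assumption: a valid name starts with a letter and then contains only
# letters, digits, or underscores. The product may accept or reject
# other characters; check its behavior before relying on this.
_FIELD_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_indexable_field_name(name: str) -> bool:
    """Return True if the name follows the rules listed above."""
    return _FIELD_NAME.fullmatch(name) is not None


candidates = ["author", "doc_title2", "doc-title", "_id", "2nd_author"]
results = {name: is_indexable_field_name(name) for name in candidates}
```

Under these assumptions, `author` and `doc_title2` pass, while `doc-title` (hyphen), `_id` (leading underscore), and `2nd_author` (leading number) are rejected.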
Click the Content Classes window.
Click Create Content Class.
Enter a name and, optionally, a description for the class. Then click Create.
To add fields manually:
On the Content Properties tab, click Edit Properties.
Click Add Content Property to add a new field.
In the new field, select a Type.
In the Name field, enter a name for the field.
In the Expression field, enter an expression that matches the type you selected:
- For XML, enter an XPath expression.
- For JSON, enter a JSONPath expression.
- For PATTERN, enter a regular expression.
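To see what each expression type selects, it can help to try an expression against a small sample before entering it. The sketch below uses only the Python standard library and made-up sample documents; it is an illustration, not how the product evaluates expressions. Two caveats are assumptions of this sketch: Python's `xml.etree.ElementTree` supports only a subset of XPath, and the JSON case uses plain dictionary access as a stand-in for the JSONPath `$.book.title`, because the standard library has no JSONPath implementation. Python's regular expression syntax also differs in places from the Java syntax linked above.

```python
import json
import re
import xml.etree.ElementTree as ET

# Made-up sample documents carrying the same value in three formats.
xml_doc = "<book><title>Dune</title><year>1965</year></book>"
json_doc = '{"book": {"title": "Dune", "year": 1965}}'
text_doc = "Title: Dune\nYear: 1965\n"

# XML: "./title" selects the <title> child of the root <book> element.
root = ET.fromstring(xml_doc)
xml_title = root.findtext("./title")

# JSON: dictionary access standing in for the JSONPath $.book.title.
json_title = json.loads(json_doc)["book"]["title"]

# PATTERN: capture the rest of the line that follows "Title: ".
match = re.search(r"^Title: (.+)$", text_doc, flags=re.MULTILINE)
pattern_title = match.group(1) if match else None

print(xml_title, json_title, pattern_title)
```

All three approaches extract the same title from their respective samples.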
To create fields through the system:
Click the Extract Fields tab.
Paste some sample XML or JSON into the box on the XML or JSON tab.
Use a sample that's representative of the XML or JSON files that you want to index.
The extracted fields appear at the bottom of the page.
Select the fields you want.
Click Add Selected Properties.
The fields you selected are added to the content class.
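The idea behind extracting fields from a sample can be sketched as a walk over the sample document that proposes one property per leaf value. This is a hypothetical illustration only: the product's actual extraction logic is not described here, and the function name `suggest_properties`, the naming scheme (keys joined with underscores), and the output shape are all assumptions.

```python
import json


def suggest_properties(sample):
    """Propose one JSON content property per leaf value in the sample.

    Hypothetical sketch: assumes a JSON object at the top level and keys
    that are safe to use in dotted JSONPath notation.
    """
    found = {}  # expression -> suggested field name, in discovery order

    def walk(node, path, name_parts):
        if isinstance(node, dict):
            for key, value in node.items():
                walk(value, f"{path}.{key}", name_parts + [key])
        elif isinstance(node, list):
            # All array items share one wildcard expression.
            for item in node:
                walk(item, f"{path}[*]", name_parts)
        else:
            found.setdefault(path, "_".join(name_parts))

    walk(json.loads(sample), "$", [])
    return [
        {"type": "JSON", "name": name, "expression": expression}
        for expression, name in found.items()
    ]


sample = '{"title": "Dune", "author": {"name": "Frank Herbert"}, "tags": ["sci-fi", "classic"]}'
properties = suggest_properties(sample)
for prop in properties:
    print(prop["name"], prop["expression"])
```

For this sample, the sketch proposes the fields `title`, `author_name`, and `tags`, with the expressions `$.title`, `$.author.name`, and `$.tags[*]`. A representative sample matters for the same reason it does in the steps above: a field that never appears in the sample cannot be proposed.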