
System requirements and sizing

The hardware, networking, and operating system requirements for running an HCI system with one or more instances.

Sizing guidance for Hitachi Content Search

Simple sizing

This table shows the minimum and recommended hardware requirements for each instance in an HCI system running Hitachi Content Search.

Resource                Minimum    Recommended
RAM                     16 GB      32 GB
CPU                     4-core     8-core
Available disk space    50 GB      500 GB

Important
  • A large number of factors determine how many documents your system can index and how fast it can process them, including: the number of documents to be indexed; the contents of those documents; what search features (such as sorting) the index supports; the number of fields in the index; the number of users querying the system; and so on.

    Depending on how you use your system, you might require additional hardware resources to index all the documents you want and at the rate you require.

  • Each instance uses all available RAM and CPU resources on the server or virtual machine on which it's installed.

Detailed sizing

If you are installing HCI to run Hitachi Content Search, you should size your system based on the number of documents you need to index and the rate at which you need documents to be processed and indexed.
Important: This sizing guide details the resources required for a system with an Index Protection Level (IPL) of 1. To scale your system accordingly, double the recommended values for IPL 2, triple them for IPL 3, and so on.

To determine the system size that you need:

Procedure

  1. Determine how many documents you need to index.

  2. Based on the number of documents you want to index, use the following tables to determine:

    • How many instances you need
    • How much RAM each instance needs
    • The Index service configuration needed to support indexing the number of documents you want
    Total documents to be indexed    Instance RAM needed (per Index service instance)
    15 million                       16 GB
    25 million                       32 GB
    50 million [a]                   64 GB

    System configuration:

    • Total instances required: 1 [b]
    • Instances running the Index service: 1
    • Index service configuration required:
      • Shards per index: 1
      • Index Protection Level per index: 1
      • Container memory: 200 MB greater than the heap setting
      • Heap setting, by instance RAM: 16 GB: 1800m; 32 GB: 9800m; 64 GB: 25800m

    [a] Contact Hitachi Vantara for guidance before trying to index this many documents on this number of instances. At this scale, your documents and required configuration settings can greatly affect the number of documents you can index.

    [b] Single-instance systems are suitable for testing and development, but not for production use.
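The container memory rule above is simple arithmetic: the heap value from the table plus 200 MB. A quick sketch using the 32 GB row:

```shell
#!/bin/sh
# Container memory = heap setting + 200 MB.
# The heap value (9800m) is taken from the 32 GB RAM row in the table above.
heap_mb=9800
container_mb=$((heap_mb + 200))
echo "Container memory: ${container_mb}m"   # Container memory: 10000m
```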

    Total documents to be indexed    Instance RAM needed (per Index service instance)
    45 million                       16 GB
    75 million                       32 GB
    150 million [a]                  64 GB

    System configuration:

    • Total instances required: 4
    • Instances running the Index service: 3
    • Index service configuration required:
      • Shards per index: 3
      • Index Protection Level per index: 1
      • Container memory: 200 MB greater than the heap setting
      • Heap setting, by instance RAM: 16 GB: 1800m; 32 GB: 9800m; 64 GB: 25800m

    [a] Contact Hitachi Vantara for guidance before trying to index this many documents on this number of instances. At this scale, your documents and required configuration settings can greatly affect the number of documents you can index.

    Total documents to be indexed    Instance RAM needed (per Index service instance)
    75 million                       16 GB
    125 million                      32 GB
    250 million [a]                  64 GB

    System configuration:

    • Total instances required: 8
    • Instances running the Index service: 5
    • Index service configuration required:
      • Shards per index: 5
      • Index Protection Level per index: 1
      • Container memory: 200 MB greater than the heap setting
      • Heap setting [b], by instance RAM: 16 GB: 7800m; 32 GB: 15800m; 64 GB: 31000m

    [a] Contact Hitachi Vantara for guidance before trying to index this many documents on this number of instances. At this scale, your documents and required configuration settings can greatly affect the number of documents you can index.

    [b] With an 8-instance system, the Index service should be the only service running on each of its 5 instances. With the Index service isolated this way, you can allocate more heap space to the service than you can on a single or 4-instance system.

    Total documents to be indexed    Instance RAM needed (per Index service instance)
    195 million                      16 GB
    325 million                      32 GB
    650 million [a]                  64 GB

    System configuration:

    • Total instances required: 16
    • Instances running the Index service: 13
    • Index service configuration required:
      • Shards per index: 13
      • Index Protection Level per index: 1
      • Container memory: 200 MB greater than the heap setting
      • Heap setting [b], by instance RAM: 16 GB: 7800m; 32 GB: 15800m; 64 GB: 31000m

    [a] Contact Hitachi Vantara for guidance before trying to index this many documents on this number of instances. At this scale, your documents and required configuration settings can greatly affect the number of documents you can index.

    [b] With a 16-instance system, the Index service should be the only service running on each of its 13 instances. With the Index service isolated this way, you can allocate more heap space to the service than you can on a single or 4-instance system.

    For example, if you need to index up to 150 million documents, you need at minimum a 4-instance system with 64 GB RAM per instance.

  3. Determine how fast you need to index documents, in documents per second.

    For example:

    • To index 100 million documents in 2 days, you need an indexing rate of 578 documents per second.
    • To continuously index 1 million documents every day, you need an indexing rate of 12 documents per second.
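The arithmetic behind the first example can be sketched as a quick shell check:

```shell
#!/bin/sh
# Required indexing rate = total documents / seconds available.
docs=100000000                 # 100 million documents
seconds=$((2 * 24 * 3600))     # 2 days = 172,800 seconds
echo "$((docs / seconds)) documents per second"   # 578 documents per second
```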
  4. Determine the base indexing rate for your particular dataset and processing pipelines:

    1. Install a single-instance HCI system that has the minimum required hardware resources.
    2. Run a workflow with the pipelines you want and on a representative subset of your data.
    3. Use the workflow task details to determine the rate of documents processed per second.
  5. To determine the number of cores you need per instance, replace Base rate in this table with the rate you determined in step 4.

    Number of instances you need    4 cores per instance (minimum required)    8 cores per instance (recommended)
    1                               Base rate                                  170% Base rate
    4                               300% Base rate                             500% Base rate
    8                               600% Base rate                             900% Base rate
    More than 8                     Contact Hitachi Vantara for guidance

    For example, if you had previously determined that:

    • You need a 4-instance system.
    • You need to process 500 documents per second.
    • The base processing rate for your data and pipelines is 100 documents per second.

    You need 8 cores per instance.

  6. Multiply the number of instances you need by the number of cores per instance to determine the total number of cores your system needs.
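Putting steps 5 and 6 together with the example values above (4-instance system, 500 documents per second required, base rate of 100 documents per second):

```shell
#!/bin/sh
# Express the required throughput as a percentage of the measured base rate,
# then look that percentage up in the table. Values are from the worked example.
base_rate=100      # docs/sec measured in step 4
target_rate=500    # docs/sec required
echo "Need $((target_rate * 100 / base_rate))% of base rate"   # Need 500% of base rate
# Per the table, a 4-instance system reaches 500% of base rate with 8 cores per instance.
instances=4
cores_per_instance=8
echo "Total cores: $((instances * cores_per_instance))"        # Total cores: 32
```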

  7. After your system is installed, configure it with the index settings you determined in step 2.

    For information on index shards, Index Protection Level, and moving the Index service, see the Administrator Help, which is available from the Admin App.

Sizing guidance for HCM

Minimum hardware requirements

If you are installing HCI to run HCM, each instance in the system must meet these minimum hardware requirements:

Documents per second    Cores    RAM (GB)    Disk (GB)
Up to 1200              8        28          600
1200-1600               12       32          800
1600-2000               16       40          1000
2000-2400               18       48          1400
2400-2800               20       56          1700
2800-3200               24       64          2000

Determining number of instances

The number of instances that your HCM system needs is based on:
  • Whether you need the system to remain highly available.
  • The number of documents being produced by the HCP system you want to monitor. In this case, each document represents a single piece of data about the HCP system. A more active HCP system will produce more documents than a less active one.
  • The total number of documents you want HCM to store.

Number of instances: simple procedure

If you're monitoring a typically active HCP system (roughly 75 operations per second per node), use this table to determine the number of HCM instances you need, based on the number of nodes in your HCP system and the number of days you want your HCM system to retain the data it receives from HCP.

If your system is more active, see Number of instances: detailed procedure.

HCP nodes    Data retention time on HCM    Instances needed
Up to 8      Up to 30 days                 1 [*]
Up to 8      Up to 60 days                 3 [*]
Up to 16     Up to 30 days                 4
Up to 24     Up to 60 days                 8

[*] An HCM system must have a minimum of 4 instances to maintain high system availability.

Number of instances: detailed procedure

  1. Determine whether you need your HCM system to maintain high availability. If so, you need a minimum of 4 instances. For more information, see Single-instance systems versus multi-instance systems.

  2. Determine the number of documents per second being produced by the HCP system you want to monitor. You can easily do this if you already have an HCM system up and running:

    1. Go to the Monitor App: https://system-hostname:6162

    2. Add the HCP system as a source. For information, see the help that's available from the Monitor App.

    3. Go to the HCI Admin App: https://system-hostname:8000

    4. Go to Workflows > Monitor App Workflow > Task > Metrics.

    5. View the value for the Average DPS field.

      Tip: Let the workflow run for a while to get a more accurate measure for the Average DPS field.
    Otherwise, you can get an average documents per second value by doing this:
    1. Select a time period.

    2. Download the HCP Internal Logs for this time period. For more information, see the help that's accessible from the HCP System Management Console.

    3. In the downloaded logs for each node, count the number of lines logged during the selected time period.

    4. Add the line counts for all nodes, then divide the sum by the number of seconds in the time window you selected.
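The counting step can be sketched as follows. The log file names (node1.log, node2.log, and so on) and the 10-minute (600-second) window are assumptions for this example, not names HCP actually produces:

```shell
#!/bin/sh
# Count all lines across the per-node log files, then divide by the
# window length in seconds to get an average documents-per-second value.
window_seconds=600
total_lines=$(cat node*.log | wc -l)
echo "Average DPS: $((total_lines / window_seconds))"
```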

  3. Use this table to determine the number of instances needed based on the number of documents per second produced by your HCP system.

    Documents per second    Instances needed
    ≤ 3,200                 1
    3,201 to 7,200          3
    7,201 to 10,500 [*]     4

    [*] This is the maximum number of documents per second that HCM currently supports.
  4. Based on your data availability requirements, determine the number of instances you need.

    Data availability requirement    Index replicas needed    Instances needed    Impact on total documents stored
    No failure tolerance             1                        1                   None
    Survive 2 failed replicas        3                        3                   3x
    Survive 3 failed replicas        4                        4                   4x

    An index with multiple copies remains available in the event of an instance outage. For example, if an index has two copies stored on two instances and one of those instances fails, one copy of the index remains available for servicing requests.

  5. Use this formula to determine the total number of documents your HCM system must be able to store:

    documents per second (from step 2)
    × 3,600 seconds per hour
    × 24 hours per day
    × number of days you want to store data (default is 30)
    × impact from the data availability table in step 4
    = total document count

    For example, if your HCP system produces 1500 documents per second, you want to store data for 30 days, and you want to maintain two copies of each index containing the stored data, your system must have enough instances to be able to store roughly 8 billion documents:

    1500 × 3600 × 24 × 30 × 2 = 7,776,000,000
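The same calculation as a quick shell check:

```shell
#!/bin/sh
# 1500 docs/sec, stored for 30 days, with 2 copies of each index.
echo $((1500 * 3600 * 24 * 30 * 2))   # 7776000000
```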

  6. Use this table to determine the number of instances needed based on the total number of documents your HCM must store.

    Total document count    Instances needed
    2 billion or less       1
    6 billion or less       3
    8 billion or less       4

  7. Take the highest number of instances from steps 3, 4, and 6. That's the number of instances you need.

Operating system and Docker requirements

Each server or virtual machine that you provide as an HCI instance:

  • Must run a 64-bit Linux distribution
  • Must have Docker version 1.13.1 or later installed
  • Must be configured with IP and DNS addresses

Additionally, you should install all relevant patches on the operating system and perform appropriate security hardening tasks.

Suggested Docker version

Important: Install the current Docker version suggested by your operating system, unless that version is earlier than 1.13.1. The system cannot run with Docker versions earlier than 1.13.1.

This table shows the operating systems, Docker versions, and SELinux configurations on which HCI has been qualified:

Operating system                Docker version       Docker storage configuration    SELinux setting
CentOS 7.6                      Docker 18.03.1-ce    device-mapper                   Enforcing
CentOS 8.1.1911                 Docker 19.03.9-ce    overlay2                        Enforcing and Disabled
Red Hat Enterprise Linux 8.1    Docker 19.03.11-ce   overlay2                        Enforcing
Ubuntu 18.04.4 LTS              Docker 18.03.1-ce    overlay2                        Enforcing

Docker considerations

The Docker installation folder on each instance must have at least 20 GB available for storing the Docker images.

Make sure that the Docker storage driver is configured correctly on each instance before installing the product; changing the storage driver after installation requires reinstalling the product. To view the current Docker storage driver on an instance, run:

docker info
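To pull just the driver name out of that output, you can filter it. A sketch that assumes the standard text layout of `docker info`, where the driver appears on a "Storage Driver:" line:

```shell
#!/bin/sh
# Print only the value of the "Storage Driver" line from `docker info`.
docker info 2>/dev/null | awk -F': ' '/Storage Driver/ {print $2}'
```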

Core dumps can fill a host's file system, which can result in host or container instability. Also, if your system uses the data at rest encryption (DARE) feature, encryption keys are written to the dump file. It's best to disable core dumps.

To enable SELinux on the system instances, you need to use a Docker storage driver that SELinux supports. The storage drivers that SELinux supports differ depending on the Linux distribution you're using. For more information, see the Docker documentation.

If you are using the Docker devicemapper storage driver:

  • Make sure that there's at least 40 GB of Docker metadata storage space available on each instance. The product needs 20 GB to install successfully and an additional 20 GB to successfully update to a later version.

    To view Docker metadata storage usage on an instance, run:

    docker info

  • On a production system, do not run devicemapper in loop-lvm mode. Doing so can cause slow performance or, on certain Linux distributions, leave the product without enough space to run.

SELinux considerations

  • You should decide whether you want to run SELinux on system instances and enable or disable it before installing additional software on the instance.

    Enabling or disabling SELinux on an instance requires restarting the instance.

    To view whether SELinux is enabled on an instance, run: sestatus

  • To enable SELinux on the system instances, you need to use a Docker storage driver that SELinux supports.

    The storage drivers that SELinux supports differ depending on the Linux distribution you're using. For more information, see the Docker documentation.

Networking

This topic describes the network usage and requirements for both system instances and services.

You can configure the network settings for each service when you install the system. You cannot change these settings after the system is up and running. If your networking environment changes such that the system can no longer function with its current networking configuration, you need to reinstall the system. See Handling network changes.

WARNING

The HCI product uses both internal and external ports to operate its services, and the system-internal ports do not have authentication or Transport Layer Security (TLS). At a minimum, use your firewall to make these ports accessible only to other instances in the system. If any users have root access to your system, your network and its systems are vulnerable to unauthorized use.

To secure your data and HCI system, you need to manually use iptables or firewalld to restrict the ports that the HCI installer otherwise leaves open to local communications only. See System-internal ports and Example HCI firewall setup.
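For example, restricting one system-internal port (2181, used by the Synchronization service) so that only other instances can reach it could look roughly like this. This is a sketch only: the subnet 10.0.0.0/24 is a placeholder for your cluster's own internal network, and in practice you would repeat the pattern for each system-internal port:

```shell
#!/bin/sh
# Sketch (requires root): allow port 2181 from the placeholder cluster
# subnet, then drop it from everywhere else.
iptables -A INPUT -p tcp --dport 2181 -s 10.0.0.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 2181 -j DROP
```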

Additionally, you can use Internet Protocol Security (IPSec) or an equivalent to secure internode communications. Consult with your system administrator to configure your network with this added security.

Instance IP address requirements

All instance IP addresses must be static. This includes both internal and external network IP addresses, if applicable to your system.

ImportantIf the IP address of any instance changes, see Handling network changes.

Network types

Each HCI service can bind to one type of network, either internal or external, for receiving incoming traffic. If your network infrastructure supports having two networks, you might want to isolate the traffic for most system services to a secured, limited-access internal network, avoiding critical security risks to your data and system. You can then leave only the Search-App and Admin-App services on your external network for user access.

You can use either a single network type for all services or a mix of both types. To use both types, every instance in your system must be addressable by two IP addresses: one on your internal network and one on your external network. If you use only one network type, each instance needs only one IP address.

Allowing access to external resources

Regardless of whether you're using a single network type or a mix of types, you need to configure your network environment to ensure that all instances have outgoing access to the external resources you want to use.

This includes:

  • The data sources where your data is stored.
  • Identity providers for user authentication.
  • Email servers that you want to use for sending email notifications.
  • Any external search indexes (for example, HDDS indexes) that you want to make accessible through HCI.

Ports

Each service binds to a number of ports for receiving incoming traffic. Before installing HCI, you can configure the services to use different ports, or use the default values shown in the following tables.

Port values can be reconfigured during system installation, so your system might not use the default values. You cannot change service port values after the system is up and running.

To view the ports that your system is using, view the Network tab for each service your system runs (Services > service-name > Network).

WARNING

The HCI product uses both internal and external ports to operate its services, and the system-internal ports do not have authentication or Transport Layer Security (TLS). At a minimum, use your firewall to make these ports accessible only to other instances in the system. If any users have root access to your system, your network and its systems are vulnerable to unauthorized use.

To secure your data and HCI system, you need to manually use iptables or firewalld to restrict the ports that the HCI installer otherwise leaves open to local communications only. See System-internal ports and Example HCI firewall setup.

Additionally, you can use Internet Protocol Security (IPSec) or an equivalent to secure internode communications. Consult with your system administrator to configure your network with this added security.

System-external ports

Important: To keep your system secure, HCI system-external ports require user authentication and use Transport Layer Security (TLS).

The following table contains information about the service ports that are used to interact with the system.

On every instance in the system, each of these ports:

  • Must be accessible from any network that needs administrative or search access to the system.
  • Must be accessible from every other instance in the system.
Note: Debug ports are accessible only when debug is set to true in /<installation-directory>/config/cluster.config.

Default port    Service        Purpose
6162            Monitor-App    Access to the HCM application, which is used to monitor the health of HCP systems.
                               WARNING: The Monitor-App service does not function properly if it is assigned a port value lower than 1024.
8000            Admin-App      Access to administrative interfaces: the Administration App, the administrative REST API, and the administrative CLI.
8888            Search-App     Access to search interfaces: the Search App, the Workflow Designer, the Search REST API, the Workflow Designer REST API, the Search CLI, and the Workflow Designer CLI.

System-internal ports

This table lists the ports used for intra-system communication by the services. On every instance in the system, each of these ports:

  • Must be accessible from every other instance in the system.
  • Should not be accessible from outside the system.

You can find more information on how these ports are used in the documentation for the third-party software underlying each service.

Note: For a secure and recommended firewall setup using these internal ports, see Example HCI firewall setup.

Default port    Used by                          Purpose
2181            Synchronization service          Synchronization service client port.
2888            Synchronization service          Synchronization service internal communication.
3888            Synchronization service          Synchronization service leader election.
4040            Workflow jobs                    Spark UI port.
5001            Admin-App service                Debug port for the Admin-App service.
5002            Search-App service               Debug port for the Search-App service.
5003            Index service                    Debug port for the Index service.
5005            Workflow jobs                    Debug port for the job driver.
5007            Sentinel service                 Debug port for the Sentinel service.
5008            Workflow jobs                    Debug port for the job executor.
5050            Cluster-Coordination service     Primary port for communicating with Cluster-Coordination.
5051            Cluster-Worker service           Primary port for communicating with Cluster-Worker.
5123            Monitor-App service              Debug port for the Monitor App.
5555            Watchdog service                 Port for JMX connections to the Watchdog service.
6175            Monitor-App service              Port used by the Monitor App for graceful shutdowns.
7000            Database service                 TCP port for commands and data.
7199            Database service                 Port for JMX connections to the Database service.
7203            Message Queue service            Port for JMX connections to the Message Queue service.
8005            Admin-App service                Port used by Admin-App for graceful shutdowns.
8006            Search-App service               Port used by Search-App for graceful shutdowns.
8007            Sentinel service                 Port used by the Sentinel service for graceful shutdowns.
8080            Service-Deployment service       Primary port for communicating with Service-Deployment.
8081            Scheduling service               Primary port for communicating with the Scheduling service.
                                                 WARNING: If you change the port number for the Scheduling service, you must restart HCI.service on all system nodes for the change to take effect.
8889            Sentinel service                 Primary port for communicating with Sentinel.
8893            Monitor-App service              Port used for the Monitor App Analytics functionality.
8983            Index service                    Primary port for communicating with the Index service.
                                                 WARNING: The port assigned to the Index service should not be below 1024.
9042            Database service                 Primary port for communicating with the Database service.
9091            Network-Proxy service            Primary port for communicating with Network-Proxy.
9092            Message Queue service            Primary port for communicating with the Message Queue service.
9200            Metrics service                  Port used to communicate with the Metrics service cluster.
9201            Metrics service                  Port used to communicate with an individual Metrics service node.
9301            Metrics service                  Port that nodes in the Metrics service cluster use to communicate with each other.
9600            Logging service                  Primary port for communicating with the Logging service.
9601            Logging service                  Port used to receive syslog messages.
10000           Index service                    Port used by the Index service for graceful shutdowns.
15050           Cluster-Coordination service     Cluster-Coordination internal communication.
18000           Admin-App service                Admin-App internal communication.
18080           Service-Deployment service       Service-Deployment internal communication.
18889           Sentinel service                 Sentinel service internal communication.
31000-34000     Cluster-Coordination and Cluster-Worker services    High ports used by both Mesos and Docker.

System ports for Monitor-App

This table lists the ports used by Monitor-App during the Configuration and Deployment phases. Each signal needs the following port information to function properly:
Monitor-App signal    Port type    Port number
Node Status           TCP          443 (or 80 if not using SSL), inbound to HCP
MAPI                  TCP          9090, inbound to HCP
SNMP                  TCP/UDP      161, inbound to HCP
Syslog                UDP          9601 (the default listener port of Monitor-App), inbound to the HCM node

Time source

If you are installing a multi-instance system, each instance should run NTP (network time protocol) and use the same external time source. For information, see support.ntp.org.
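For example, on a distribution using ntpd, each instance's /etc/ntp.conf could point at the same source. The server name time.example.com is a placeholder for the external time source you actually use:

```shell
# /etc/ntp.conf fragment (identical on every instance)
server time.example.com iburst
```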

Supported browsers

The HCI web applications support these web browsers:

  • The latest version of Google Chrome
  • The latest version of Mozilla Firefox
  • The latest version of Microsoft Edge

File ownership considerations

Within some of the Docker containers on each system instance, file ownership is assigned to this user and group:

  • User: hci, UID: 10001
  • Group: hci, GID: 10001

When you view such files in the instance operating system (for example, by running ls -l), the files appear to be owned by an unknown or undefined user and group. Typically, this causes no issues.

However, if you run applications that change file ownership on the system instances (for example, security hardening scripts), and those applications change the ownership of files owned by the hci user and group, the system can become unresponsive.

To avoid these issues:

  1. Create the expected user and group on each instance:

    sudo groupadd hci -g 10001

    sudo useradd hci -u 10001 -g 10001

  2. Configure your applications to not change the ownership of files owned by the hci user and group.