Bucket synchronization

Hitachi Content Platform for cloud scale (HCP for cloud scale) lets you configure and manage bucket synchronization.

To configure bucket synchronization, use S3 put bucket replication API requests. Scripts are available to simplify the process.

About bucket synchronization

HCP for cloud scale can synchronize the following kinds of data in buckets:

  • Object data
  • All user metadata (that is, anything that can be returned in the header x-amz-meta-*)
  • Tags
  • Content-Type system metadata
  • Objects that the owner of the source bucket doesn't have permission to read

This diagram illustrates the concept of bucket synchronization.

[Figure: conceptual diagram of an HCP for cloud scale system with sync-to and sync-from buckets, each synchronized with a different bucket in the cloud]
Limitations on bucket synchronization

Objects that existed before synchronization functions are configured are not synchronized.

HCP for cloud scale verifies the rules that are valid at the time an object is synchronized, not at the time the object is ingested.

Objects that are marked as deleted are not synchronized.

Most system metadata is not synchronized, specifically:

  • Owner ID and Name
  • Timestamps (when last modified)
  • Metadata returned in x-amz-grant-*
  • Metadata returned in x-amz-acl
  • Metadata returned in x-amz-storage-class
  • Metadata returned in x-amz-replication-status
  • Metadata returned in x-amz-server-side-encryption-*
  • Metadata returned in x-amz-restore-*
  • Metadata returned in x-amz-version-id-*
  • Metadata returned in x-amz-website-redirect-location
  • Metadata returned in x-amz-object-lock-*

The bucket sync-from function only supports one rule for the same external SQS queue and external bucket. If a bucket has multiple sync-from rules for the same external queue, objects might not be synchronized. To use multiple rules for an external bucket, use one SQS queue for each rule.

Comparing synchronization to replication

Unlike AWS replication, HCP for cloud scale can synchronize with buckets on storage systems outside of AWS.

AWS determines the destination bucket using rules, but only applies one rule to each new object. In contrast, HCP for cloud scale can apply multiple rules to each new object so long as the destination buckets are different. This is how one-to-many synchronization is implemented.

AWS replication does not copy objects that the owner of the source bucket doesn't have permission to read; HCP for cloud scale synchronizes them.

In contrast with AWS replication, HCP for cloud scale does not synchronize the following:

  • Access control lists (ACLs)
  • Lock retention information
  • Objects that are encrypted using Amazon S3 managed keys (SSE-S3) and AWS KMS managed keys (SSE-KMS)

If an object being synchronized has the same name as an object in the target bucket, the result depends on whether the target bucket uses versioning:

  • If versioning is used, the old object is kept as an old version.
  • If versioning is not used, the old object is replaced by the new object.

HCP for cloud scale buckets always use versioning. The best practice is to use versioning in all target buckets.

Best-effort ordering

HCP for cloud scale guarantees that operations are applied in the order of their arrival (strong consistency). However, synchronizing multiple operations applied in a short period of time to the same object presents the following difficulties:

  • In a distributed system, especially when many systems are involved, synchronizing all operations in correct order is complex.
  • Even if HCP for cloud scale synchronizes all operations in correct order to an external storage component, that component might not guarantee that the operations are applied with strong consistency. In particular, AWS guarantees only "eventual consistency."
  • For bucket sync-from, the external queue service might not guarantee that messages are provided in correct order. In particular, AWS Simple Queue Service (SQS) does not support first-in, first-out (FIFO) queues for S3 notifications.

Therefore, HCP for cloud scale makes its best effort to synchronize only the latest state of an object, not each version or operation for the object. For example:

  • Assume that a client sends three operations to an object and that they are all committed: (1) PUT, (2) PUT, (3) DEL. The latest state of the object is (3) DEL. HCP for cloud scale only synchronizes DEL.
  • Assume that a client sends three operations to an object and that they are all committed: (1) PUT, (2) DEL, (3) PUT. The latest state of the object is (3) PUT. HCP for cloud scale only synchronizes (3) PUT.

This approach does not guarantee that the latest state of an object will be in the external storage in all situations. Partly because of the "eventual consistency" offered by the AWS S3 API, corner cases still exist.
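The two committed-operation examples above reduce to a simple rule: of the operations committed against an object, only the final state is propagated. A minimal sketch (a simplification, not the actual HCP for cloud scale policy engine):

```python
# Minimal sketch of best-effort "latest state" synchronization: of a
# committed operation sequence on one object, only the final state is
# propagated to the external bucket. This simplification is for
# illustration only.
def state_to_sync(ops):
    """ops: committed operations ("PUT" or "DEL") in arrival order."""
    return ops[-1] if ops else None

print(state_to_sync(["PUT", "PUT", "DEL"]))  # DEL is synchronized
print(state_to_sync(["PUT", "DEL", "PUT"]))  # the final PUT is synchronized
```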

Synchronization to an external bucket: high-level tasks

Synchronization to an external bucket involves assigning roles and permissions to users, creating and synchronizing the buckets, and then reading from and writing to the buckets.

This description of high-level tasks assumes three classes of user:

  1. An HCP for cloud scale system administrator to create roles and assign them to users using an IdP
  2. An HCP for cloud scale bucket administrator, who could be a tenant administrator, to create and configure an HCP for cloud scale bucket
  3. An Amazon Web Services (AWS) user, who could be a customer, to create a remote bucket using AWS S3 and then read and write data
Note: The default HCP for cloud scale account has full permissions and can perform the tasks assigned to the first two user classes.

Procedure

  1. The system administrator assigns permissions to the bucket administrator to configure bucket synchronization.

    1. In the System Management application, create a role with the permission group bucket_sync.

    2. In the IdP server, set up two groups: bucket administrators and bucket users.

    3. In the IdP server, register users in these groups.

    4. In the System Management application, assign the role to the bucket administrator group.

  2. The bucket administrator creates local and remote buckets.

    1. In the S3 User Credentials application, generate S3 credentials.

      Tip: Use the base64 utility to encode S3 credentials.
    2. Using the S3 credentials, use an S3 API to create an HCP for cloud scale (local) bucket.

    3. Use an AWS S3 API to create an S3 (remote) bucket.

  3. The bucket administrator configures bucket synchronization between the HCP for cloud scale bucket and the S3 bucket using an S3 PUT Bucket Replication method, replacing the bucket ARN with configuration settings. By using multiple rules and filters, the bucket administrator can specify what objects are synchronized to the S3 bucket.

  4. The bucket administrator sets access control lists to let the bucket user write data to the HCP for cloud scale bucket.

    1. Using a management API, get the user ID of the bucket user.

    2. Using an S3 API, assign write permission to the bucket user for the HCP for cloud scale bucket.

  5. The AWS user is now free to write objects to the HCP for cloud scale bucket, which is now synchronized with the remote bucket.
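The Tip in step 2 (Base64-encoding S3 credentials) can also be done in a few lines of Python instead of the base64 utility. A hedged sketch; the key value below is made up:

```python
# Base64-encode an S3 credential for use in bucket synchronization
# settings (equivalent to the base64 utility mentioned in the Tip).
# The credential value below is illustrative only.
import base64

def encode_credential(value: str) -> str:
    """Return the Base64 text form of an S3 credential string."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")

print(encode_credential("A1234567890"))  # QTEyMzQ1Njc4OTA=
```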

Synchronization from an external bucket: high-level tasks

Synchronization from an external bucket involves assigning roles and permissions to users, creating and synchronizing buckets, and then reading from and writing to the buckets.

This description of high-level tasks assumes three classes of user:

  1. An HCP for cloud scale system administrator to create roles and assign them to users using an IdP
  2. An HCP for cloud scale bucket administrator, who could be a tenant administrator, to create and configure an HCP for cloud scale bucket
  3. An AWS user, who could be a customer, to create a remote bucket using AWS S3, create an AWS SQS queue, and then configure S3 notifications to SQS
Note: The default HCP for cloud scale account has full permissions and can perform the tasks assigned to the first two user classes.

Procedure

  1. The system administrator assigns permissions to the bucket administrator to configure bucket synchronization.

    1. In the System Management application, create a role with the permission group bucket_sync.

    2. In the IdP server, set up two groups: bucket administrators and bucket users.

    3. In the IdP server, register users in these groups.

    4. In the System Management application, assign the role to the bucket administrator group.

  2. The bucket administrator creates local and remote buckets.

    1. In the S3 User Credentials application, generate S3 credentials.

      Tip: Use the base64 utility to encode S3 credentials.
    2. Using the S3 credentials, use an S3 API to create an HCP for cloud scale (local) bucket.

    3. Use an AWS S3 API to create an S3 (remote) bucket.

  3. The AWS user creates a standard queue in SQS.

    1. Using an AWS account, create a queue of the type Standard Queue.

    2. Create a policy document.

  4. The AWS user configures the remote bucket to send S3 notifications to the AWS SQS queue.

    1. Add a notification for all object creation events to the remote bucket.

  5. The bucket administrator configures bucket synchronization between the S3 bucket and the HCP for cloud scale bucket using an S3 PUT Bucket Replication method, replacing the bucket ARN with configuration settings. By using multiple rules and filters, the bucket administrator can specify what objects are synchronized to the local bucket.

  6. The bucket administrator sets access control lists to let the bucket user read data from the HCP for cloud scale bucket.

    1. Using a management API, get the user ID of the bucket user.

    2. Using an S3 API, assign read permission to the bucket user for the HCP for cloud scale bucket.

  7. The AWS user is now free to read objects from the HCP for cloud scale bucket, which is now synchronized with the remote bucket.
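Step 3.2 above ("Create a policy document") typically means an SQS access policy that allows S3 to deliver event notifications to the queue. A hedged sketch of such a policy, built in Python; the queue ARN, account ID, and bucket name are placeholders:

```python
# Hedged sketch of an SQS access policy allowing the remote bucket's
# S3 notifications to be delivered to the queue. The ARNs and account
# ID below are placeholders, not values from this document.
import json

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "s3.amazonaws.com"},
        "Action": "SQS:SendMessage",
        "Resource": "arn:aws:sqs:us-east-1:111122223333:bucketevents",
        "Condition": {
            "ArnLike": {"aws:SourceArn": "arn:aws:s3:::remote-bucket"}
        },
    }],
}

print(json.dumps(policy, indent=2))
```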

Bucket synchronization configuration

Bucket synchronization is configured using S3 PUT bucket replication API requests that define rules. Each bucket can have up to 1,000 rules, but all of a bucket's rules must be of the same type, either sync-to or sync-from. Each rule defines the following:

  • External bucket settings
  • A set of one or more prefixes; an object with one of the prefixes is mirrored
  • A set of one or more tags; an object with all, or any, of the tags is mirrored
  • For sync-from, external queue settings

Because you can configure multiple rules with multiple tags, you have flexibility in selecting objects to mirror. For example:

  • To mirror all objects that contain Tag1 and Tag2, you can configure one rule that includes both tags.
  • To mirror all objects that contain Tag1 or Tag2, you can configure two rules, one for each tag.
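The two strategies above can be sketched with the filter shapes used by PUT bucket replication. The rule IDs and tag values here are illustrative, and the Destination settings are omitted for brevity:

```python
# Sketch of the two tag-selection strategies. Rule IDs and tag values
# are made up; Destination settings are omitted for brevity.
import json

# One rule with an "And" filter: only objects carrying BOTH tags match.
rule_and = {
    "ID": "mirror_tag1_and_tag2",
    "Filter": {"And": {"Tags": [
        {"Key": "Tag1", "Value": "v1"},
        {"Key": "Tag2", "Value": "v2"},
    ]}},
    "Status": "Enabled",
}

# Two rules, one tag each: objects carrying Tag1 OR Tag2 match a rule.
rules_or = [
    {"ID": "mirror_tag1", "Filter": {"Tag": {"Key": "Tag1", "Value": "v1"}}, "Status": "Enabled"},
    {"ID": "mirror_tag2", "Filter": {"Tag": {"Key": "Tag2", "Value": "v2"}}, "Status": "Enabled"},
]

print(json.dumps({"Role": "", "Rules": [rule_and]}, indent=2))
```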

For information on PUT bucket replication see Configure bucket synchronization (PUT bucket replication).

Visibility of new buckets and objects

Newly created buckets and objects are not immediately visible. Some client applications (such as CloudBerry Explorer) retrieve the list of buckets immediately, before a new bucket or object becomes visible, and so do not display it. If you create a bucket or object and it is not displayed, refresh the list manually.

Rule collisions

HCP for cloud scale can apply multiple bucket synchronization rules to each new object so long as the destination buckets are different. This is how one-to-many synchronization is implemented.

A rule collision is when two or more rules that apply to an object have the same destination (that is, the same external host, port, and bucket). HCP for cloud scale does not allow rule collisions, so PUT bucket replication requests are rejected if they contain rule collisions. To avoid rule collisions, you can define as many tags in a rule as necessary, so that multiple rules with the same destination are not needed.
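The collision check described above can be sketched in a few lines: two rules collide when they resolve to the same external host, port, and bucket. The simplified rule shape below is an assumption for illustration:

```python
# Minimal sketch of the rule-collision check: two rules collide when
# they target the same external host, port, and bucket. The simplified
# rule shape here is an assumption for illustration.
def find_collisions(rules):
    """Return the (host, port, bucket) destinations used by more than one rule."""
    seen, collisions = set(), set()
    for rule in rules:
        dest = rule["destination"]
        key = (dest["host"], dest["port"], dest["remoteBucketName"])
        if key in seen:
            collisions.add(key)
        seen.add(key)
    return collisions

rules = [
    {"ID": "rule1", "destination": {"host": "s3.example.com", "port": 443, "remoteBucketName": "bucket1"}},
    {"ID": "rule2", "destination": {"host": "s3.example.com", "port": 443, "remoteBucketName": "bucket1"}},
]
print(find_collisions(rules))  # both rules share one destination
```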

Effect of configuration changes

After an object operation is performed, the policy engine asynchronously checks if that object needs to be copied according to the sync-to rules. When bucket synchronization rules are created, updated, or deleted, the changes only apply to new objects, object operations, and to objects that have not been yet processed by the policy engine. Objects that existed before the rules were configured are not synchronized. If an object exists in the PENDING state when a rule is created, updated, or deleted, the rule change might not be applied.

Synchronizing to the same source and destination

You cannot set up bucket synchronization with the same bucket as both the source and the destination.

Configure bucket synchronization (PUT bucket replication)

You can configure S3 bucket sync-to and sync-from settings.

Notes
  • If you use the AWS command-line interface to configure bucket synchronization, use at least aws-cli v1.16.211 and aws-sdk 1.11.610.
  • Configuration rules should be provided to the AWS CLI from a file rather than inline, to avoid problems with double quote characters in some terminals.
HTTP request syntax (URI)
aws --endpoint-url https://10.08.1019 s3api put-bucket-replication --bucket "hcpcs_bucket" --replication-configuration file://rules.json
Request structure

A rule consists of up to 1000 prefixes and tag-value pairs. You can configure up to 1000 rules per bucket. Separate tag-value pairs in the rule using the keywords "And": or "Or":.

The content of the configuration JSON file is:

{
  "Role": "",
  "Rules": [{
    "ID": "string",
    "Filter": {
       "Prefix": "string",
       "Tag": {
         "Key": "string",
         "Value": "string"
      }
    },
    "Status": "boolean",
    "Destination": {
      "Bucket": "json"
    }
   },
   .
   .
   .
  ]
}
Note: S3 parameters not shown are not required, not supported, and if specified should be left empty.
Parameter | Required | Type | Description
Role | Yes | N/A | Not supported; leave empty.
ID | No | String | Unique identifier for the rule, up to 255 characters. All rules must specify the same bucket.
Priority | Yes | Integer | Not supported; ignored.
DeleteMarkerReplication.Status | No | String | Not supported; if provided, leave as Disabled.
Prefix | No | String | Prefix (one per rule). Up to 1024 characters.
Key | No | String | Tag key (up to 1000 per rule). Up to 128 characters.
Value | No | String | Tag value. Up to 256 characters.
Rules.Status | Yes | Boolean | Enabled or Disabled. If Disabled, the rule is ignored.
Rules.Destination.Bucket | Yes | Base64-encoded JSON | External S3 bucket access settings. For bucket sync-to, the settings to access the external bucket; for bucket sync-from, the settings to access the external bucket and the SQS queue settings. You can't specify the same bucket name and host as both source and destination.
Rules.Destination.Account | No | N/A | Not supported; leave empty.
Bucket sync-to structure

Bucket sync-to settings are defined by a set of parameters and passed in the value of Rules.Destination.Bucket as a Base64-encoded JSON structure.

The syntax inside the bucket parameter for the sync-to setting is:

{
  'version': 'version', 
  'action': 'sync-to', 
  'externalBucket': {
    'host': 'host', 
    'type': 'type', 
    'region': 'region', 
    'remoteBucketName': 'bucket_name', 
    'accessKey': 'B64_key', 
    'secretKey': 'B64_key', 
    'port': 'port', 
    'authVersion': 'auth_version', 
    'usePathStyleAlways': '[true|false]'
    }
}
Parameter | Required | Type | Description
version | Yes | String | Enter 1.0.
host | Yes | IP address | Host IP address.
type | Yes | String | Destination storage class: AMAZON_S3 or GENERIC_S3.
region | Yes | String | The S3 region.
remoteBucketName | Yes | String | The name of the bucket, from 3 to 63 characters long, containing only lowercase letters (a-z), numbers (0-9), periods (.), or hyphens (-). The bucket must already exist.
accessKey | Yes | Base64-encoded string | The S3 access key credentials for the external S3 bucket.
secretKey | Yes | Base64-encoded string | The S3 secret key credentials for the external S3 bucket.
port | Yes | Integer | Host port.
authVersion | Yes | String | AWS Signature version: V2 or V4.
usePathStyleAlways | Yes | Boolean | Path-style URLs for bucket access: true or false.
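As a hedged sketch of how these settings come together, the following Python builds the sync-to structure and Base64-encodes it for the Rules.Destination.Bucket field. The host, bucket name, and keys are placeholders; in practice, the included SyncToBucketJsonGenerator.py script generates this value for you:

```python
# Build the sync-to settings and Base64-encode them for the
# Rules.Destination.Bucket field. All values below are placeholders.
import base64
import json

sync_to = {
    "version": "1.0",
    "action": "sync-to",
    "externalBucket": {
        "host": "s3.us-east-1.amazonaws.com",
        "type": "AMAZON_S3",
        "region": "us-east-1",
        "remoteBucketName": "bluebucket",
        "accessKey": base64.b64encode(b"access_key").decode(),
        "secretKey": base64.b64encode(b"secret_key").decode(),
        "port": "443",
        "authVersion": "V4",
        "usePathStyleAlways": "true",
    },
}

# Base64-encoded JSON, as required by Rules.Destination.Bucket
bucket_value = base64.b64encode(json.dumps(sync_to).encode()).decode()
print(bucket_value)
```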
Bucket sync-from structure

Bucket sync-from settings include both a bucket address and a notification queue. The settings are defined by a set of parameters and passed in the value of Rules.Destination.Bucket as a Base64-encoded string.

The syntax inside the bucket parameter for sync-from setting is:

{
  'version': 'version', 
  'action': 'sync-from', 
  'externalBucket': {
    'host': 'host', 
    'type': 'type', 
    'region': 'region', 
    'remoteBucketName': 'bucket_name', 
    'accessKey': 'B64_key', 
    'secretKey': 'B64_key', 
    'port': 'port', 
    'authVersion': 'auth_version', 
    'usePathStyleAlways': '[true|false]'
    },
  'notifications': {
    'type': 'type', 
    'region': 'region', 
    'queue': 'queue', 
    'accessKey': 'B64_key', 
    'secretKey': 'B64_key'
    }
}
Parameter | Required | Type | Description
version | Yes | String | Enter 1.0.
host | Yes | IP address | Host IP address.
type | Yes | String | Destination storage class: AMAZON_S3 or GENERIC_S3.
region | Yes | String | The S3 region.
remoteBucketName | Yes | String | The name of the bucket, from 3 to 63 characters long, containing only lowercase letters (a-z), numbers (0-9), periods (.), or hyphens (-). The bucket must already exist.
accessKey | Yes | Base64-encoded string | The S3 access key credentials for the external S3 bucket.
secretKey | Yes | Base64-encoded string | The S3 secret key credentials for the external S3 bucket.
port | Yes | Integer | Host port.
authVersion | Yes | String | AWS Signature version: V2 or V4.
usePathStyleAlways | Yes | Boolean | Path-style URLs for bucket access: true or false.
notifications.type | Yes | String | Always set to AWS_SQS.
notifications.region | Yes | String | Region of the AWS SQS queue.
notifications.queue | Yes | String | Name of the AWS SQS queue.
notifications.accessKey | Yes | Base64-encoded string | Access key with permission to read from the AWS SQS queue.
notifications.secretKey | Yes | Base64-encoded string | Secret key with permission to read from the AWS SQS queue.
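A hedged sketch of the corresponding sync-from value, which adds the notifications block for the SQS queue. The queue name, region, and keys are placeholders; in practice, the included SyncFromBucketJsonGenerator.py script generates this value for you:

```python
# Build sync-from settings, including the SQS notifications block,
# for Rules.Destination.Bucket. All values below are placeholders.
import base64
import json

sync_from = {
    "version": "1.0",
    "action": "sync-from",
    "externalBucket": {
        "host": "s3.us-east-2.amazonaws.com",
        "type": "AMAZON_S3",
        "region": "us-east-2",
        "remoteBucketName": "bluebucket",
        "accessKey": base64.b64encode(b"access_key").decode(),
        "secretKey": base64.b64encode(b"secret_key").decode(),
        "port": "443",
        "authVersion": "V4",
        "usePathStyleAlways": "true",
    },
    "notifications": {
        "type": "AWS_SQS",
        "region": "us-east-2",
        "queue": "bucketevents",
        "accessKey": base64.b64encode(b"access_key").decode(),
        "secretKey": base64.b64encode(b"secret_key").decode(),
    },
}

bucket_value = base64.b64encode(json.dumps(sync_from).encode()).decode()
print(bucket_value)
```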
Response structure

None.

Example

Request example:

aws --endpoint-url https://10.08.1019 s3api put-bucket-replication --bucket "hcpcs_bucket" --replication-configuration file://rules.json

Configuration rules.json:

{
  "Role": "",
  "Rules": [{
    "ID": "sync_rule2_for_music",
    "Filter": {
      "Prefix": "/music/october/",
      "Tag": {
        "Key": "target",
        "Value": "cloud"
        }
      },
    "Status": "Enabled",
    "Destination": {
      "Bucket": "{
        'version' : '1.0',
        'action' : 'sync-from',
        'externalBucket' : {
          'type' : 'AMAZON_S3',
          'region' : 'us-east-1',
          'remoteBucketName' : 'bluebucket',
          'authVersion' : 'V4',
          'usePathStyleAlways' : 'true',
          'accessKey' : 'access_key',
          'secretKey' : 'secret_key'
          },
        'notifications' : {
          'type' : 'AWS_SQS',
          'region' : 'us-east-1',
          'queue' : 'testQueue',
          'accessKey' : 'access_key',
          'secretKey' : 'secret_key'
          }
        }"
      }
    }]
}

Script to generate bucket sync-to JSON

HCP for cloud scale includes a script to generate the JSON needed to configure bucket synchronization to an external bucket (sync-to).

The script is written in Python and located in the folder install_path/product/bin (for example, /opt/hcpcs/bin).

The script generates the JSON string that you can use for the field destination.bucket in the AWS S3 command put-bucket-replication. Optionally, the script verifies whether the destination bucket exists. If you omit the secret key, the script prompts you for it, which lets you create a script that calls this script without storing the secret key. You can mix the short and full form of arguments.

Note: The script produces JSON using single quotes.
Syntax
SyncToBucketJsonGenerator.py
  [--help]
  --s3host host
  --region region
  --bucket bucket
  --accessKey access_key
  [--secretKey secret_key]
  [--s3type { GENERIC_S3 | AMAZON_S3 }]
  [--port port]
  [--authVersion { v2 | v4 }]
  [--usePathStyleAlways {true | false}]
  [--jsonSample output_file.json]
  [--verifyTarget]
  [--http]
  [--quietMode]
Options and parameters
  • -h, --help

    Optional. Displays a help message and exits.

  • --s3host host, -s3 host

    Host name of the remote S3 storage component.

  • --region region, -r region

    Region of the remote bucket.

  • --bucket bucket, -b bucket

    Name of the remote bucket.

  • --accessKey access_key, -ak access_key

    Access key for the remote bucket.

  • --secretKey secret_key, -sk secret_key

    Secret key for the remote bucket. The script prompts for the key if you don't specify it.

  • --s3type { GENERIC_S3 | AMAZON_S3 }, -s3t { GENERIC_S3 | AMAZON_S3 }

    Optional. The remote bucket type:

    • GENERIC_S3: An S3-compatible node
    • AMAZON_S3: An Amazon Web Services S3-compatible node
    If not specified, the default bucket type is AMAZON_S3.
  • --port port, -p port

    Optional. Port of the remote bucket. If not specified, the default port is 443.

  • --authVersion { v2 | v4 }, -av { v2 | v4 }

    Optional. The Auth Version of the remote bucket. If not specified, the default version is v4.

  • --usePathStyleAlways {true | false}, -upsa {true | false}

    Optional. Sets the Use Path Style Always flag for the remote bucket. If not specified, the default is true.

  • --jsonSample output_file.json, -json output_file.json

    Optional. Creates a file named output_file.json with a sample JSON structure for bucket replication configuration. If not specified, no sample file is created.

  • --verifyTarget, -verify

    Optional. Verifies that the remote bucket exists. SSL certificates are not validated. This option requires python3 and boto3. If not specified, the bucket's existence isn't verified.

  • --http, -http

    Optional. Use HTTP when verifying the remote bucket. If not specified, the default is to use HTTPS.

  • --quietMode, -qm

    Optional. Displays only the Destination.Bucket element.

    Note: You can't specify both --quietMode and --verifyTarget together.
Example
$ SyncToBucketJsonGenerator.py -s3 s3.us-east-2.amazonaws.com -b hcpcs-bucket-5 -r us-east-2 -ak A1234567890 -sk S1234567890  -verify -json testto.json

This example can produce the following output:

Verifying that a remote bucket "hcpcs-bucket-5" exists...
Verification successfully completed: remote bucket "hcpcs-bucket-5" is FOUND

Generated a JSON string for the Destination->Bucket element for bucket replication sync-to configuration:

{'action': 'sync-to', 'version': '1.0', 'externalBucket': {'host': 's3.us-east-2.amazonaws.com', 'type': 'AMAZON_S3', 'region': 'us-east-2', 
'remoteBucketName': 'hcpcs-bucket-5', 'accessKey': 'A1234567890=', 'secretKey': 'S1234567890==', 'port': 443, 
'authVersion': 'v4', 'usePathStyleAlways': 'true'}}

Saved sample JSON file for bucket replication sync-to configuration in 'testto.json'

You can use 'testto.json' sample JSON file as an input to put-bucket-replication S3 API. For example, using aws s3api command:
aws s3api put-bucket-replication --no-verify-ssl --endpoint-url https://cloudscale-hostname --bucket cloudscale-bucket --replication-configuration file://testto.json

Script to generate bucket sync-from JSON

A script is included to generate the JSON needed to configure bucket synchronization from an external bucket (sync-from).

The script is written in Python and located in the folder install_path/product/bin (for example, /opt/hcpcs/bin).

The script generates the JSON string that you can use for the field destination.bucket in the AWS S3 command put-bucket-replication. Optionally, the script verifies whether the destination bucket or the target AWS SQS queue exist. If you omit the secret key, the script prompts you for it, which lets you create a script that calls this script without storing the secret key. If you omit the access key for a queue, the script uses the access key and secret key for the bucket. You can mix the short and full form of arguments.

Note: The script produces JSON using single quotes.
Syntax
SyncFromBucketJsonGenerator.py
  [--help]
  --s3host host
  --region region
  --bucket bucket
  --accessKey access_key
  [--secretKey secret_key]
  [--s3type { GENERIC_S3 | AMAZON_S3 }]
  [--port port]
  [--authVersion { v2 | v4 }]
  [--usePathStyleAlways {true | false}]
  [--jsonSample output_file.json]
  [--verifyTarget]
  [--http]
  --notificationsQueue queue
  [--notificationsRegion region]
  [--notificationsAccessKey access_key]
  [--notificationsSecretKey secret_key]
  [--quietMode]
Options and parameters
  • -h, --help

    Optional. Displays a help message and exits.

  • --s3host host, -s3 host

    Host name of the remote S3 storage component.

  • --region region, -r region

    Region of the remote bucket.

  • --bucket bucket, -b bucket

    Name of the remote bucket.

  • --accessKey access_key, -ak access_key

    Access key for the remote bucket.

  • --secretKey secret_key, -sk secret_key

    Secret key for the remote bucket. The script prompts for the key if you don't specify it.

  • --s3type { GENERIC_S3 | AMAZON_S3 }, -s3t { GENERIC_S3 | AMAZON_S3 }

    Optional. The remote bucket type:

    • GENERIC_S3: An S3-compatible node
    • AMAZON_S3: An Amazon Web Services S3-compatible node
    If not specified, the default bucket type is AMAZON_S3.
  • --port port, -p port

    Optional. Port of the remote bucket. If not specified, the default port is 443.

  • --authVersion { v2 | v4 }, -av { v2 | v4 }

    Optional. The Auth Version of the remote bucket. If not specified, the default version is v4.

  • --usePathStyleAlways {true | false}, -upsa {true | false}

    Optional. Sets the Use Path Style Always flag for the remote bucket. If not specified, the default is true.

  • --jsonSample output_file.json, -json output_file.json

    Optional. Creates a file named output_file.json with a sample JSON structure for bucket replication configuration. If not specified, no sample file is created.

  • --verifyTarget, -verify

    Optional. Verifies that the remote bucket exists. SSL certificates are not validated. This option requires python3 and boto3. If not specified, the bucket's existence isn't verified.

  • --http, -http

    Optional. Use HTTP when verifying the remote bucket. If not specified, the default is to use HTTPS.

  • --notificationsQueue queue, -nq queue

    Name of the notifications queue.

  • --notificationsRegion region, -nr region

    Optional. Name of the notifications region. If not specified, the default is the region of the remote bucket.

  • --notificationsAccessKey access_key, -nak access_key

    Optional. The notifications access key. If not specified, the default is the access key of the remote bucket.

  • --notificationsSecretKey secret_key, -nsk secret_key

    Optional. The notifications secret key. If not specified, the default is the secret key of the remote bucket.

  • --quietMode, -qm

    Optional. Displays only the JSON for QueueArn.

    Note: You can't specify both --quietMode and --verifyTarget together.
Example
$ SyncFromBucketJsonGenerator.py -s3 s3.us-east-2.amazonaws.com -b hcpcs-bucket-5 -r us-east-2 -ak A1234567890 -sk S1234567890 -nq 'bucketevents2' -verify -json testfrom.json

This example can produce the following output:

Verifying that a remote bucket "hcpcs-bucket-5" exists...
Verification successfully completed: remote bucket "hcpcs-bucket-5" is found

Verifying that a remote notification queue with a prefix "bucketevents2" exists...
Verification successfully completed: found "bucketevents2" queue.

Generated a JSON string for the Destination->Bucket element for bucket replication sync-from configuration:

{'action': 'sync-from', 'version': '1.0', 'externalBucket': {'host': 's3.us-east-2.amazonaws.com', 'type': 'AMAZON_S3', 'region': 'us-east-2', 
'remoteBucketName': 'hcpcs-bucket-5', 'accessKey': 'A1234567890=', 'secretKey': 'S1234567890==', 'port': 443, 
'authVersion': 'v4', 'usePathStyleAlways': 'true'}, 'notifications': {'type': 'AWS_SQS', 'queue': 'bucketevents2', 'region': 'us-east-2', 
'accessKey': 'A1234567890=', 'secretKey': 'S1234567890=='}}

Saved sample JSON file for bucket replication sync-from configuration in 'testfrom.json'

You can use 'testfrom.json' sample JSON file as an input to put-bucket-replication S3 API. For example, using aws s3api command:
aws s3api put-bucket-replication --no-verify-ssl --endpoint-url https://cloudscale-hostname --bucket cloudscale-bucket --replication-configuration file://testfrom.json

Get bucket synchronization rules (GET bucket replication)

You can retrieve the synchronization rules for a bucket.

HTTP request syntax (URI)
aws --endpoint-url https://host_ip s3api get-bucket-replication --bucket "bucket"
Request structure

Not applicable.

Response structure

The response body is shown below:

{
  "ReplicationConfiguration": {
    "Role": "",
    "Rules": [
      {
        "Filter": {
          "And": {
            "Prefix": "string",
            "Tags": [
              {
                "Key": "string",
                "Value": "string"
              }
              .
              .
              .
            ]
          }
        },
        "Status": "boolean",
        "Destination": {
          "Bucket": "access_settings"
        },
        "ID": "string"
      }
    ]
  }
}
Parameter | Required | Type | Description
Role | Yes | N/A | Not supported; empty.
Prefix | No | String | Prefix.
Key | No | String | Tag key.
Value | No | String | Tag value. Sets of prefixes and key-value pairs.
Status | Yes | Boolean | If false, the rule is ignored.
Bucket | Yes | Base64-encoded JSON | Bucket access settings. S3 access and secret keys are masked.
ID | No | String | Unique identifier for the rule, up to 255 characters.
HTTP status codes

Status code | HTTP name | Description
200 | OK | The request was executed successfully.
401 | Unauthorized | Access was denied due to invalid credentials.
Example

Request example:

aws --endpoint-url https://10.08.1019 s3api get-bucket-replication --bucket "hcpcs_bucket"

JSON response:

{
  "ReplicationConfiguration": {
    "Role": "",
    "Rules": [
      {
        "Filter": {
          "And": {
            "Prefix": "SQS",
            "Tags": [
              {
                "Value": "cloud",
                "Key": "target"
              }
            ]
          }
        },
        "Status": "Enabled",
        "Destination": {
          "Bucket": "{
            'version': 'version', 
            'action': 'sync-from', 
            'externalBucket': {
              'host': 'host', 
              'type': 'type', 
              'region': 'region', 
              'remoteBucketName': 'bucket_name', 
              'port': 'port', 
              'authVersion': 'auth_version', 
              'usePathStyleAlways': '[true|false]'
              }
            }"
        },
        "ID": "mirrorBack_rule_for_images"
      }
    ]
  }
}

Get object synchronization status

The synchronization status of an object is returned in metadata as part of the response to a GET object or HEAD object request.

For a GET object or HEAD object request, the synchronization functions return a replication status header in addition to the standard response metadata. This information is useful before deletion from a source bucket to verify synchronization.

When an object is created, HCP for cloud scale evaluates the sync-to rules for the bucket. If the object matches the rules, it sets the object's sync state as PENDING. Most of the time, this sync state is accurate. However, it is never definitive because users may change the sync-to rules for the bucket before the policy engine starts processing the object, which happens asynchronously. The policy engine evaluates the sync-to rules again when processing an object to act according to the latest sync rules.

For example:

  • An object was ingested that matches the sync-to rules, so its sync state is set as PENDING. Then, a user changes the sync-to rules. The object does not match the rules anymore so the object is actually not synced and that sync state is removed.
  • An object was ingested that does not match the sync-to rules, so its sync state is not set. Then, a user changes the sync rules. The object now matches the rules so the object is actually synced and the sync state is set to COMPLETED.

Response header | Description
x-amz-replication-status | Status of synchronization:

  • COMPLETED: For sync-to, all rules were successfully executed and the object was successfully synchronized.

    Note: This status is also returned for objects that match a sync-to rule but were skipped because they are not the most recent version.

  • PENDING: For sync-to, one of the following: (1) a check is pending to see if the object needs synchronization; (2) the object needs synchronization, but the process is not complete.
  • FAILED: For sync-to, the process has failed multiple times. To be synchronized, the object must be reloaded.
  • REPLICA: For sync-from, the object is a replica created by Amazon S3.
  • (Header not in response): The object did not match any rules.
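Before deleting a source object, the header above can be checked to confirm synchronization. A minimal helper sketch; representing the response headers as a plain dict is an assumption for illustration:

```python
# Decide whether a source object has been synchronized, based on the
# x-amz-replication-status header of a HEAD object response. Treating
# the headers as a plain dict is an assumption for illustration.
def is_synchronized(headers):
    """True when the object was synchronized or matched no sync rule."""
    status = headers.get("x-amz-replication-status")
    # No header: the object matched no rule, so nothing is pending.
    # REPLICA applies to sync-from objects created from the remote bucket.
    return status in (None, "COMPLETED", "REPLICA")

print(is_synchronized({"x-amz-replication-status": "PENDING"}))  # False
```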

Delete bucket synchronization rules (DELETE bucket replication)

You can delete S3 synchronization settings for buckets. This function is the same as in AWS S3.

HTTP request syntax (URI)
aws --endpoint-url https://host_ip s3api delete-bucket-replication --bucket "bucket"
Request structure

None.

Response structure

None.

Example

Request example:

aws --endpoint-url https://10.08.1019 s3api delete-bucket-replication --bucket "hcpcs_bucket"
Note: If a sync-from action fails, it is retried and the SQS message about the failure is retained. To avoid a possible accumulation of SQS failure messages, the best practice is to define a suitable retention policy for SQS and to delete the sync-from rule once the desired results are obtained.