from_aws_s3

Purpose

Imports data from Amazon S3. If Amazon Cloud is configured as your `client_cloud`, you should probably use the do task instead.

Method of use

Before you can download data, you have to give Workflows access to your Amazon S3 buckets. Follow the steps below:

Have the Amazon Cloud administrator create an IAM account.
Give the administrator the list of buckets and folders you want to access.
Make sure one of the following policies is attached to the account:
1. `AmazonS3ReadOnlyAccess` - Minimum required access. You can read data, but you cannot set `delete_after: yes` to delete source files after downloading them.
2. `AmazonS3FullAccess` - Required if you want to delete source files after reading them, or if you want to upload files to AWS S3 with the to_aws_s3 task.
Make sure to activate `Programmatic access`.
Send the Access Key ID and Secret Access Key to Onesecondbefore staff. They will add them to Workflows.
You should now be able to download objects from Amazon S3.

Configuration

Example usage

extract:
    conn_id: aws_s3_readonly
    bucket: onesecondbefore-demo
    # Download files in folder `my/folder` with file prefix `part-`
    prefix: my/folder/part-

Properties

property	type	required	description
`conn_id`	string	no	Connection string as provided by the Onesecondbefore team. Default is `aws_s3`.
`bucket`	string	yes	Name of the Amazon S3 bucket.
`prefix`	string	no	Prefix of the blobs to download. Default is no prefix (all files in the bucket). For example, `prefix: my/folder/part-` downloads only blobs from folder `my/folder` whose filename starts with `part-`.
`delete_after`	yesno (boolean)	no	Default is `no`. Set to `yes` if you want each blob to be deleted after it has been imported. This action cannot be undone. The account needs the `AmazonS3FullAccess` policy for this (see Method of use).
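
As an illustration of how the properties above combine, the sketch below extends the example usage with `delete_after`. The connection name `aws_s3_full` is hypothetical and only stands in for a connection whose IAM account has the `AmazonS3FullAccess` policy; use the name given to you by the Onesecondbefore team.

```yaml
extract:
    # Hypothetical connection name; assumed to be backed by AmazonS3FullAccess
    conn_id: aws_s3_full
    bucket: onesecondbefore-demo
    # Download files in folder `my/folder` with file prefix `part-`
    prefix: my/folder/part-
    # Delete each source file after it has been imported (cannot be undone)
    delete_after: yes
```

With a read-only connection, leave `delete_after` out or set it to `no`, because the `AmazonS3ReadOnlyAccess` policy does not allow deleting source files.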

Details

item	description
`API`	Amazon S3 REST API
`Pre-formatted schema`	No. Does not come with a pre-formatted schema.