Snowflake AWS Storage Integration Setup Guide
    Overview

    This topic covers how to set up a storage integration for use in Matillion ETL for Snowflake instances on Amazon Web Services (AWS). Matillion ETL for Snowflake on AWS requires users to select a storage integration when configuring data staging components.

    A storage integration is a named Snowflake object that removes the need to pass explicit cloud provider credentials such as secret keys or access tokens. Instead, the integration references an AWS IAM entity: a role you create in your AWS account, assumed by an IAM user that Snowflake provisions for your Snowflake account.


    Prerequisites

    • Completing the instructions below requires permissions in AWS to create and manage IAM policies and roles. If you aren't an AWS administrator, ask your AWS administrator to perform these tasks.
    • Only users with the Snowflake role ACCOUNTADMIN, or a role with the global CREATE INTEGRATION privilege, can execute the SQL commands mentioned in this topic.
    • Any identifiers not enclosed in double quotes ( " " ) aren't case-sensitive; unquoted identifiers are stored in uppercase. For example, if you create an integration called DocsTeam, Snowflake stores and resolves it as DOCSTEAM (see the short example after this list).
    • As a best practice, Snowflake recommends creating an IAM policy for Snowflake access to the S3 bucket. You can then attach the policy to the role and use the security credentials generated by AWS for the role to access files in the bucket.
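
    As a brief illustration of the identifier rule, here is a sketch with hypothetical names (the DocsTeam integration, role ARN, and bucket below are placeholders, not part of the setup that follows):

    -- Unquoted identifier: stored and resolved as DOCSTEAM.
    CREATE STORAGE INTEGRATION DocsTeam
        TYPE = EXTERNAL_STAGE
        STORAGE_PROVIDER = S3
        ENABLED = TRUE
        STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/snowflake_access_role'
        STORAGE_ALLOWED_LOCATIONS = ('s3://mybucket/mypath/');

    -- Both of the following resolve to the same object:
    DESC STORAGE INTEGRATION DocsTeam;
    DESC STORAGE INTEGRATION DOCSTEAM;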

    Create an IAM policy in AWS

    The following instructions describe how to configure access permissions for Snowflake in your AWS Management Console so that you can use an S3 bucket to load and unload data.

    1. Log in to the AWS Management Console.
    2. On the console dashboard, click IAM (Identity & Access Management).
    3. Click Account settings on the left-hand menu.
    4. Expand the Security Token Service Regions list, find the AWS region where your Snowflake account is hosted, and click Activate if the status is "Inactive".
    5. Click Policies on the left-hand menu.
    6. Click Create Policy.
    7. Click the JSON tab and paste the following into the policy editor:
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:PutObject",
                    "s3:GetObject",
                    "s3:GetObjectVersion",
                    "s3:DeleteObject",
                    "s3:DeleteObjectVersion"
                ],
                "Resource": "arn:aws:s3:::<bucket>/<prefix>/*"
            },
            {
                "Effect": "Allow",
                "Action": [
                    "s3:ListBucket",
                    "s3:GetBucketLocation"
                ],
                "Resource": "arn:aws:s3:::<bucket>",
                "Condition": {
                    "StringLike": {
                        "s3:prefix": [
                            "<prefix>/*"
                        ]
                    }
                }
            }
        ]
    }
    

    This policy provides Snowflake with the required permissions to load and unload data using a single S3 bucket and folder path. For an alternative policy that provides Snowflake with the required permissions to only load data from a single read-only S3 bucket and folder path, use the following:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:GetObjectVersion"
                ],
                "Resource": "arn:aws:s3:::<bucket>/<prefix>/*"
            },
            {
                "Effect": "Allow",
                "Action": [
                    "s3:ListBucket",
                    "s3:GetBucketLocation"
                ],
                "Resource": "arn:aws:s3:::<bucket>",
                "Condition": {
                    "StringLike": {
                        "s3:prefix": [
                            "<prefix>/*"
                        ]
                    }
                }
            }
        ]
    }
    
    8. Replace <bucket> and <prefix> in the pasted policy text with your actual bucket name and folder path prefix.
    9. Click Review policy.
    10. Enter a Name for the policy (for example, snowflake_access), and an optional Description.
    11. Click Create policy.

    Create an IAM role in AWS

    In the AWS Management Console, create an AWS IAM role to grant privileges on the S3 bucket containing your data files, using the policy you created above.

    1. Log into the AWS Management Console.
    2. On the console dashboard, click IAM (Identity & Access Management).
    3. Click Roles on the left-hand menu.
    4. Click Create role.
    5. On the Create role page, click Another AWS account.
    6. In the Account ID field, enter your own AWS account ID temporarily. Later, you will modify the trusted relationship and grant access to Snowflake.
    7. Select the Require external ID option and enter a placeholder External ID such as 0000. Later, you will modify the trust relationship and specify the external ID for your Snowflake stage. An external ID is required to grant Snowflake access to your AWS S3 resources.
    8. Click Next: Permissions.
    9. Locate the policy you created in Create an IAM policy in AWS and select this policy.
    10. Click Next.
    11. On the Create role page, enter a Role name and an optional Role description.
    12. Click Create role.
    13. On the Summary page, copy the Role ARN value and keep it, as you will need it later.

    Create a storage integration in Snowflake

    1. Log in to your Snowflake account.

    2. In either an existing Snowflake worksheet or a new worksheet, use the CREATE STORAGE INTEGRATION command.

    The form of the command is as follows:

    CREATE STORAGE INTEGRATION <integration_name>
        TYPE = EXTERNAL_STAGE
        STORAGE_PROVIDER = S3
        ENABLED = TRUE
        STORAGE_AWS_ROLE_ARN = '<iam_role>'
        [ STORAGE_AWS_OBJECT_ACL = 'bucket-owner-full-control' ]
        STORAGE_ALLOWED_LOCATIONS = ('s3://<bucket>/<path>/', 's3://<bucket>/<path>/')
        [ STORAGE_BLOCKED_LOCATIONS = ('s3://<bucket>/<path>/', 's3://<bucket>/<path>/') ]
    

    This uses the following variables:

    • <integration_name> The name of the new integration.
    • <iam_role> The Role ARN value of the role you created in Create an IAM role in AWS.
    • <bucket> The name of the S3 bucket that stores your data files.
    • <path> An optional file path used to provide granular control over objects in the S3 bucket.

    The STORAGE_ALLOWED_LOCATIONS and STORAGE_BLOCKED_LOCATIONS parameters specify the bucket locations that the integration may access and those it must not access, respectively. As shown in the template above, each location is enclosed in single quotation marks, and multiple locations are separated by commas. The STORAGE_BLOCKED_LOCATIONS parameter is optional.

    Setting the STORAGE_AWS_OBJECT_ACL parameter enables support for AWS access control lists (ACLs), granting the bucket owner full control over objects that Snowflake writes. This is not required in our current use case and can therefore be omitted.

    Additional stages that also use this integration can reference the permitted bucket.
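
    For example, a completed command with hypothetical values (the s3_int name, account ID, role name, bucket, and paths below are placeholders) might look like this:

    CREATE STORAGE INTEGRATION s3_int
        TYPE = EXTERNAL_STAGE
        STORAGE_PROVIDER = S3
        ENABLED = TRUE
        STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/snowflake_access_role'
        STORAGE_ALLOWED_LOCATIONS = ('s3://mybucket/load/', 's3://mybucket/unload/')
        STORAGE_BLOCKED_LOCATIONS = ('s3://mybucket/private/');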

    3. In your Snowflake worksheet, execute the DESCRIBE INTEGRATION command to retrieve the ARN of the AWS IAM user that was automatically created for your Snowflake account.

    You can abbreviate DESCRIBE to DESC when writing this command, and the term STORAGE is optional:

    DESC [STORAGE] INTEGRATION <integration_name>
    
    4. After you run the DESC command, locate the STORAGE_AWS_IAM_USER_ARN and STORAGE_AWS_EXTERNAL_ID entries in the property column, and copy their values from the property_value column. You will need these values in Granting IAM user permissions to access bucket objects.
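
    Continuing the hypothetical s3_int example, the command and the rows to look for would be:

    DESC STORAGE INTEGRATION s3_int;

    -- In the results, copy the property_value of these two rows:
    --   STORAGE_AWS_IAM_USER_ARN  (the ARN of the IAM user Snowflake provisioned)
    --   STORAGE_AWS_EXTERNAL_ID   (the external ID for the trust relationship)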

    As the Snowflake documentation explains, Snowflake provisions a single IAM user for your entire Snowflake account; all S3 storage integrations in the account use that IAM user.

    The next section focuses on granting permissions for that IAM user within the AWS console.


    Granting IAM user permissions to access bucket objects

    The following instructions describe how to configure IAM access permissions for Snowflake in your AWS Management Console so that you can use an S3 bucket to load and unload data.

    1. Log in to the AWS Management Console.
    2. On the console dashboard, click IAM (Identity & Access Management).
    3. Click Roles on the left-hand menu.
    4. Click the role you created in Create an IAM role in AWS.
    5. Click the Trust relationships tab.
    6. Click Edit trust policy.
    7. Paste in the following policy:
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "",
                "Effect": "Allow",
                "Principal": {
                    "AWS": "<snowflake_user_arn>"
                },
                "Action": "sts:AssumeRole",
                "Condition": {
                    "StringEquals": {
                        "sts:ExternalId": "<snowflake_external_id>"
                    }
                }
            }
        ]
    }
    

    8. Edit the policy, replacing <snowflake_user_arn> with the STORAGE_AWS_IAM_USER_ARN value and <snowflake_external_id> with the STORAGE_AWS_EXTERNAL_ID value that you copied in Create a storage integration in Snowflake.

    This trust policy allows a single external stage in your Snowflake account to assume your IAM role. It's the most restrictive trust policy and is therefore the most secure.

    9. Click Update policy to save your changes.

    Create an external stage

    Note

    Creating a stage that uses a storage integration requires a role that has the CREATE STAGE privilege for the schema as well as the USAGE privilege on the storage integration.
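
    As a sketch, granting those privileges to a hypothetical role named etl_role (the database, schema, integration, and role names below are placeholders) might look like:

    GRANT CREATE STAGE ON SCHEMA mydb.public TO ROLE etl_role;
    GRANT USAGE ON INTEGRATION s3_int TO ROLE etl_role;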

    Create an external S3 stage that references the storage integration you created in Create a storage integration in Snowflake using the following commands:

    USE SCHEMA <schema>;

    CREATE STAGE <my_s3_stage>
        STORAGE_INTEGRATION = <integration_name>
        URL = 's3://<bucket>/<path>/'
        FILE_FORMAT = <format>;

    Where:

    • <schema> is the current database and schema for the user session.
    • <my_s3_stage> is the name of the stage you are creating.
    • <integration_name> is the name of the integration you created in Create a storage integration in Snowflake.
    • <bucket> is the name of the S3 bucket that stores your data files.
    • <path> is an optional file path used to provide granular control over objects in the S3 bucket.
    • <format> is a named file format object.
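
    Putting this together with the hypothetical names used earlier, a complete example might read as follows (my_csv_format is a placeholder file format, created here only so the example is self-contained):

    USE SCHEMA mydb.public;

    CREATE FILE FORMAT my_csv_format TYPE = CSV FIELD_DELIMITER = ',' SKIP_HEADER = 1;

    CREATE STAGE my_s3_stage
        STORAGE_INTEGRATION = s3_int
        URL = 's3://mybucket/load/files/'
        FILE_FORMAT = my_csv_format;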

    Using the integration in Matillion ETL

    1. In Matillion ETL, create a new orchestration job, and drag your chosen data staging component onto the canvas.
    2. Click on the data staging component, then click its Properties tab and set the following properties:
      • Stage Platform: Select Existing Amazon S3 Location.
      • S3 Staging Area: Select your S3 external stage.
      • Stage Authentication: Select Storage Integration.

    Contact support

    If you require assistance creating a storage integration in Snowflake, or with any other aspect of Matillion ETL, please visit our Getting Support page.

    For more depth on these topics, such as how to modify an existing stage to use a storage integration, please consult the Snowflake documentation.
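
    As a hedged sketch of that modification (reusing the hypothetical my_s3_stage and s3_int names from above), the command could look like the following; consult the Snowflake ALTER STAGE documentation for the exact options your stage requires:

    ALTER STAGE my_s3_stage SET
        STORAGE_INTEGRATION = s3_int
        URL = 's3://mybucket/load/files/';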