Amplitude Extract

    Article Summary

    This article is specific to the following platforms: Snowflake, Redshift, and BigQuery.

    Amplitude Extract

    The Amplitude Extract component calls the Amplitude API to retrieve and store data to be either referenced by an external table or loaded into a table, depending on the user's cloud data warehouse. Users can then transform their data with the Matillion ETL library of transformation components.
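
    As a rough illustration of the kind of call involved, the sketch below builds (without sending) a request for the "Events" data source. It assumes Amplitude's public Export API endpoint and HTTP Basic authentication using the API Key and Secret Key. Neither detail is stated in this article, the helper name build_export_request and the sample keys are hypothetical, and the component's actual internals may differ.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: Amplitude's Export API endpoint. It is not named in this article.
EXPORT_URL = "https://amplitude.com/api/2/export"


def build_export_request(api_key: str, secret_key: str, start: str, end: str) -> Request:
    """Build, but do not send, a GET request for one export window."""
    query = urlencode({"start": start, "end": end})
    # Assumption: HTTP Basic auth with "api_key:secret_key", base64-encoded.
    token = base64.b64encode(f"{api_key}:{secret_key}".encode("utf-8")).decode("ascii")
    return Request(f"{EXPORT_URL}?{query}", headers={"Authorization": f"Basic {token}"})


# Placeholder credentials and hours, for illustration only.
request = build_export_request("my-api-key", "my-secret-key", "20240101T00", "20240101T05")
print(request.full_url)
```

    Nothing is sent over the network here; passing the request to urllib.request.urlopen would perform the call.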

    This component may return structured (nested) data that requires flattening. For help with flattening such data, we recommend the Nested Data Load Component for Amazon Redshift and the Extract Nested Data Component for Snowflake or Google BigQuery.
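
    To make "flattening" concrete, here is a minimal sketch that turns a nested record into a single-level one. It only illustrates the idea and does not describe how the components named above work; the sample event, its field names, and the underscore separator are all hypothetical.

```python
from typing import Any, Dict


def flatten(record: Dict[str, Any], parent: str = "", sep: str = "_") -> Dict[str, Any]:
    """Recursively collapse nested dictionaries into one level of keys."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            # Descend into the nested object, carrying the parent key as a prefix.
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# Hypothetical nested event record.
event = {"event_type": "login", "user_properties": {"plan": "pro", "geo": {"country": "GB"}}}
print(flatten(event))
# {'event_type': 'login', 'user_properties_plan': 'pro', 'user_properties_geo_country': 'GB'}
```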


    Properties

    Snowflake Properties

    Property | Setting | Description
    Name | String | A human-readable name for the component.
    Data Source | Select | Select an Amplitude data source. Available options are "Event Types" and "Events".
    API Key | String | Provide your Amplitude API Key. For help acquiring your Amplitude API Key, please read our Amplitude Authentication Guide.
    Secret Key | String | Provide your Amplitude Secret Key. Secret Keys can be stored inside the component; however, it is highly recommended to use the Password Manager feature instead. For help acquiring your Amplitude Secret Key, please read our Amplitude Authentication Guide.
    Start time | Datetime String | Designate a start time. The format, without whitespace, is YYYYMMDDT00, where the T00 part indicates the hour and can be T00, T01, T02 ... T23 (there is no T24). Please select only one hour. Note: This parameter is only available when the Data Source parameter is set to "Events".
    End time | Datetime String | Designate an end time. The format, without whitespace, is YYYYMMDDT00, where the T00 part indicates the hour and can be T00, T01, T02 ... T23 (there is no T24). Please select only one hour. Note: This parameter is only available when the Data Source parameter is set to "Events".
    Location | Storage Location | Provide an S3 bucket path, GCS bucket path, or Azure Blob Storage path that will be used to store the data. Once on an S3 bucket, GCS bucket, or Azure Blob, the data can be referenced by an external table. A folder will be created at this location with the same name as the Target Table.
    Integration | Select | (GCP only) Choose your Google Cloud Storage Integration. Integrations are required to permit Snowflake to read data from and write to a Google Cloud Storage bucket. Integrations must be set up in advance of selecting them in Matillion ETL. To learn more about setting up a storage integration, read our Storage Integration Setup Guide.
    Warehouse | Select | Choose a Snowflake warehouse that will run the load.
    Database | Select | Choose a database to create the new table in.
    Schema | Select | Select the table schema. The special value, [Environment Default], will use the schema defined in the environment. For more information on using multiple schemas, please refer to this article.
    Target Table | String | Provide a new table name. Warning: Upon running the job, this table will be recreated, and any existing table of the same name will be dropped.
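
    The Start time and End time values in the table above use an hour-granular format. A minimal sketch, assuming you want to derive such a string from a Python datetime and check a hand-typed value; the helper names are hypothetical and not part of Matillion ETL.

```python
from datetime import datetime

# Date, a literal "T", then a two-digit hour (00 to 23).
HOUR_FORMAT = "%Y%m%dT%H"


def to_amplitude_hour(moment: datetime) -> str:
    """Format a datetime as YYYYMMDD followed by T and the hour."""
    return moment.strftime(HOUR_FORMAT)


def is_valid_hour_string(value: str) -> bool:
    """Return True if value is exactly 11 characters and parses as a real date and hour."""
    if len(value) != 11:
        return False
    try:
        datetime.strptime(value, HOUR_FORMAT)
    except ValueError:
        return False
    return True


print(to_amplitude_hour(datetime(2024, 3, 9, 7, 45)))  # 20240309T07
print(is_valid_hour_string("20240309T24"))  # False, because there is no T24
```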

    Redshift Properties

    Property | Setting | Description
    Name | String | A human-readable name for the component.
    Data Source | Select | Select an Amplitude data source. Available options are "Event Types" and "Events".
    API Key | String | Provide your Amplitude API Key. For help acquiring your Amplitude API Key, please read our Amplitude Authentication Guide.
    Secret Key | String | Provide your Amplitude Secret Key. Secret Keys can be stored inside the component; however, it is highly recommended to use the Password Manager feature instead. For help acquiring your Amplitude Secret Key, please read our Amplitude Authentication Guide.
    Start time | Datetime String | Designate a start time. The format, without whitespace, is YYYYMMDDT00, where the T00 part indicates the hour and can be T00, T01, T02 ... T23 (there is no T24). Please select only one hour. Note: This parameter is only available when the Data Source parameter is set to "Events".
    End time | Datetime String | Designate an end time. The format, without whitespace, is YYYYMMDDT00, where the T00 part indicates the hour and can be T00, T01, T02 ... T23 (there is no T24). Please select only one hour. Note: This parameter is only available when the Data Source parameter is set to "Events".
    Location | Storage Location | Provide an S3 bucket path that will be used to store the data. Once on an S3 bucket, the data can be referenced by an external table. A folder will be created at this location with the same name as the Target Table.
    Type | Dropdown | Select between a standard table and an external table.
    Standard Schema | Dropdown | Select the Redshift schema. The special value, [Environment Default], will use the schema defined in the Matillion ETL environment.
    External Schema | Select | Select the table's external schema. To learn more about external schemas, please read our support documentation, Getting Started With Amazon Redshift Spectrum.
    Target Table | String | Provide a name for the external table to be used. Warning: Upon running the job, this table will be recreated, and any existing table of the same name will be dropped.
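
    The Location property above notes that a folder named after the target table is created at the given path. A minimal sketch of how that folder path could be derived, assuming a simple join of the two values; the function name, bucket, and table name are hypothetical, and the component's exact naming or casing rules are not documented here.

```python
def staging_folder(location: str, target_table: str) -> str:
    """Join a storage location and a table name into a folder path."""
    # Strip any trailing slash so the result has exactly one separator.
    return f"{location.rstrip('/')}/{target_table}/"


# Hypothetical bucket path and table name.
print(staging_folder("s3://my-bucket/amplitude/", "amp_events"))
# s3://my-bucket/amplitude/amp_events/
```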

    BigQuery Properties

    Property | Setting | Description
    Name | String | A human-readable name for the component.
    Data Source | Select | Select an Amplitude data source. Available options are "Event Types" and "Events".
    API Key | String | Provide your Amplitude API Key. For help acquiring your Amplitude API Key, please read our Amplitude Authentication Guide.
    Secret Key | String | Provide your Amplitude Secret Key. Secret Keys can be stored inside the component; however, it is highly recommended to use the Password Manager feature instead. For help acquiring your Amplitude Secret Key, please read our Amplitude Authentication Guide.
    Start time | Datetime String | Designate a start time. The format, without whitespace, is YYYYMMDDT00, where the T00 part indicates the hour and can be T00, T01, T02 ... T23 (there is no T24). Please select only one hour. Note: This parameter is only available when the Data Source parameter is set to "Events".
    End time | Datetime String | Designate an end time. The format, without whitespace, is YYYYMMDDT00, where the T00 part indicates the hour and can be T00, T01, T02 ... T23 (there is no T24). Please select only one hour. Note: This parameter is only available when the Data Source parameter is set to "Events".
    Table Type | Select | Select whether the table is Native (the default in BigQuery) or External.
    Project | Select | Select the Google BigQuery project. The special value, [Environment Default], will use the project defined in the environment. For more information, refer to the BigQuery documentation.
    Dataset | Select | Select the Google BigQuery dataset to load data into. The special value, [Environment Default], will use the dataset defined in the environment. For more information, refer to the BigQuery documentation.
    Target Table | String | A name for the table. Warning: This table will be recreated, and any existing table of the same name will be dropped. Only available when the table type is Native.
    New Target Table | String | A name for the new external table. Only available when the table type is External.
    Cloud Storage Staging Area | Cloud Storage Bucket | Specify the target Google Cloud Storage bucket to be used for staging the queried data. Users can either (1) input the URL string of the Cloud Storage bucket following the template gs://<bucket>/<path>, or (2) navigate through the file structure to select the target bucket. Only available when the table type is External.
    Location | Cloud Storage Bucket | Specify the target Google Cloud Storage bucket to be used for staging the queried data. Users can either (1) input the URL string of the Cloud Storage bucket following the template gs://<bucket>/<path>, or (2) navigate through the file structure to select the target bucket. Only available when the table type is Native.
    Load Options | Multiple Select | Clean Cloud Storage Files: Delete staged files on Cloud Storage after loading data. Default is On. Cloud Storage File Prefix: Give staged file names a prefix of your choice. The default setting is an empty field. Recreate Target Table: Choose whether the component recreates its target table before the data load. If Off, the component will use an existing table or create one if it does not exist. Default is On. Use Grid Variable: Check this checkbox to use a grid variable. This box is unchecked by default.
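
    The Cloud Storage properties above expect a URL matching the template gs://<bucket>/<path>. A minimal sketch for checking a value against that template before pasting it into the component; the function name and sample URL are hypothetical, and the check covers only the overall shape, not Google's bucket naming rules.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_gcs_url(url: str) -> Tuple[str, str]:
    """Split a gs://<bucket>/<path> URL into (bucket, path)."""
    parsed = urlparse(url)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"Expected gs://<bucket>/<path>, got: {url!r}")
    # The path component starts with "/", which is not part of the object path.
    return parsed.netloc, parsed.path.lstrip("/")


# Hypothetical bucket and path.
print(split_gcs_url("gs://my-bucket/staging/amplitude"))
# ('my-bucket', 'staging/amplitude')
```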