Bash Script

    Bash Script Component

    Run a Bash script.

    The script is executed in an external Bash process hosted by the Matillion ETL instance. Any errors encountered while running the script will immediately halt it.
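    The halt-on-error behaviour described above can be reproduced in any Bash process with `set -e`. The sketch below is only an illustration: it assumes nothing about how the component itself is configured, and it runs the demonstration in child processes so that the failure does not end the outer shell.

```shell
#!/bin/bash
# Child process 1: with `set -e`, the failing `false` stops the script
# before "step 2" is ever echoed.
output=$(bash -c 'set -e; echo "step 1"; false; echo "step 2"' || true)

# Child process 2: capture the non-zero exit status left by the halted script.
status=0
bash -c 'set -e; false; echo "not reached"' || status=$?

echo "output: ${output}"        # prints "output: step 1"
echo "exit status: ${status}"   # prints "exit status: 1"
```

    Putting `set -e` at the top of your own script makes the intent explicit, whatever the surrounding process does.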

    Please Note

    To ensure that access to instance credentials is managed correctly at all times, we advise customers to limit scopes (permissions) where applicable.

    Since Matillion ETL is based on the latest Linux, the command line tools are all installed. Furthermore, the credentials stored in your current environment are exported into the shell, so you may (and indeed should!) omit security keys from your scripts when calling the APIs.

    All the usual variables are made available in the Bash environment, but any changes made to them are not visible outside of the current script execution.
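    The scoping rule above matches ordinary process behaviour: a child process receives a copy of each exported variable and cannot change the parent's copy. A minimal sketch follows, in which `data_dir` is a hypothetical variable name and a plain child Bash process stands in for the component's script execution.

```shell
#!/bin/bash
# Hypothetical variable standing in for a job or environment variable.
export data_dir="staging"

# The child process changes only its own copy of the variable.
inside=$(bash -c 'data_dir="changed"; echo "$data_dir"')

echo "inside the script: ${inside}"     # prints "inside the script: changed"
echo "after the script:  ${data_dir}"   # prints "after the script:  staging"
```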

    Matillion ETL runs as the tomcat user. Take care to ensure that this user has sufficient access to the resources your script needs, and that it does not uninstall any customer-installed Bash libraries.

    Cancellation and Timeout

    If you cancel a task while a Bash script is running, then it is killed. If the timeout is exceeded, the script is also killed. The purpose of the timeout is to ensure scripts will never run forever even if they enter an infinite loop, or are blocked by an external resource.
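    To get a feel for how a script behaves when it is killed on timeout, the GNU coreutils `timeout` command offers a rough local analogue. This is a sketch only: it is not how the component enforces its own Timeout property, and the one-second limit is a hypothetical value chosen to keep the demonstration short.

```shell
#!/bin/bash
limit_seconds=1   # hypothetical short limit; the component's default is 300

# The child script loops forever, so only the timeout can end it.
status=0
timeout "${limit_seconds}" bash -c 'while true; do sleep 1; done' || status=$?

# GNU timeout exits with status 124 when the time limit was reached.
echo "exit status: ${status}"
```

    Because a killed script gets no chance to finish its work, it helps to write scripts so that a rerun after a timeout is safe.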

    Properties

    Property   Setting       Description
    Name       String        A human-readable name for the component.
    Script     Bash Script   The Bash script to execute. Output from commands should be brief, as it is sent into the Task Status message.
    Timeout    Integer       The number of seconds to wait for script termination. After the set number of seconds has elapsed, the script is forcibly terminated. The default is 300 seconds (5 minutes).
    User       Select        Set the user type. For legacy Jobs, this property will not be available. The default setting is Restricted to prevent accidental or inexperienced coding errors that could damage the Matillion ETL instance.
                             Privileged: a privileged user will have access to credentials and Tomcat folders.
                             Restricted: restricted users do not have access to credentials stored in the instance, nor any Tomcat folders. When creating new Jobs, this is the default setting.

    Strategy

    Runs the Bash script, redirecting any output it produces into the task message.
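    Since everything the script prints ends up in the task message, one way to keep that message brief is to send verbose output to a file and echo only a summary. The sketch below assumes the script's user can write to a temporary file; `seq` is a stand-in for any command with lengthy output.

```shell
#!/bin/bash
# Hypothetical location for the full output.
log_file=$(mktemp)

# Stand-in for a verbose command: 1000 lines go to the file, not to the task message.
seq 1 1000 > "${log_file}"

# Echo a one-line summary only.
line_count=$(wc -l < "${log_file}" | tr -d ' ')
echo "Wrote ${line_count} lines to ${log_file}"
```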


    Enabling the User Property

    To enable the User property, follow these steps:

    1. SSH into the Matillion ETL instance.

    2. Create a .sh file and paste the following script into it:


    #!/bin/bash
    useradd -r restricteduser
    usermod -a -G restricteduser tomcat
    cat <<HEREDOC > /etc/sudoers.d/matillion-sudo
    # User rules for tomcat
    Cmnd_Alias EMD = /sbin/service tomcat restart, \
        /sbin/service tomcat restart, \
        /bin/systemctl restart tomcat.service, \
        /bin/systemctl restart tomcat.service, \
        /sbin/sv restart tomcat, \
        /sbin/sv restart tomcat, \
        /bin/mkdir /usr/share/*, \
        /usr/bin/yum -y check-update matillion-*, \
        /usr/bin/yum -y update matillion-*, \
        /bin/touch /usr/share/*, \
        /bin/chmod 755 /usr/share/*, \
        /bin/chown tomcat\\:tomcat /usr/share/emerald/oom, \
        /bin/chown tomcat\\:tomcat /usr/share/emerald/oom/*, \
        /usr/sbin/logrotate --force /etc/logrotate.d/tomcat, \
        /usr/sbin/logrotate --force /etc/logrotate.d/tomcat, \
        /usr/bin/su restricteduser -c /bin/bash /tmp/interpreter-input-*.tmp, \
        /usr/bin/su restricteduser -c /usr/bin/python /tmp/interpreter-input-*.tmp, \
        /usr/bin/su restricteduser -c /usr/bin/python3 /tmp/interpreter-input-*.tmp
    tomcat ALL=(ALL) NOPASSWD:EMD
    HEREDOC

    3. Save and close the .sh file, then run the following command to make it executable:

    chmod +x {FILE_NAME}

    4. Run the file with the following command:

    ./{FILE_NAME}
    

    5. The newly created .sh file should run successfully.

    6. In Matillion ETL, click Admin and then click Restart Server. Once the server restarts, the User property should be selectable on the Bash Script component.
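    After the steps above, a quick check can confirm that the setup script did what it was meant to do. The sketch below is not part of the official procedure; `check_user` and `check_group` are hypothetical helper names, and the user, group, and file names are taken from the setup script. Run it as root, because /etc/sudoers.d is often unreadable to other users and the file would then be reported as missing.

```shell
#!/bin/bash
# Report whether a local user account exists.
check_user() {
  if id "$1" >/dev/null 2>&1; then
    echo "user $1: present"
  else
    echo "user $1: missing"
  fi
}

# Report whether a user belongs to a group (id -nG prints the user's group names).
check_group() {
  local user=$1 group=$2
  if id -nG "$user" 2>/dev/null | tr ' ' '\n' | grep -qx "$group"; then
    echo "$user in group $group: yes"
  else
    echo "$user in group $group: no"
  fi
}

check_user restricteduser
check_group tomcat restricteduser

# Check that the sudoers rules file written by the setup script exists.
if [ -e /etc/sudoers.d/matillion-sudo ]; then
  echo "sudoers file: present"
else
  echo "sudoers file: missing"
fi
```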


    Restricting Component Availability

    You can, if required, configure the Bash Script component to be unavailable on your Matillion ETL instance. To do this, follow these steps:

    1. SSH into the instance.

    2. Open the file emerald.properties as a root user (sudo).

    3. Locate MTLN_ALLOW_BASH_COMPONENTS and set the value to false.

    4. Save and close the file and restart the server.

    Please Note

    By default, MTLN_ALLOW_BASH_COMPONENTS is set to true.
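    The property change in step 3 can also be scripted rather than made by hand. The sketch below assumes the file uses plain KEY=value lines and relies on GNU `sed -i`; `set_property` is a hypothetical helper, and the demonstration edits a temporary file rather than the real emerald.properties, whose location is not given here.

```shell
#!/bin/bash
# Set KEY=value in a properties file, replacing an existing line or appending a new one.
set_property() {
  local file=$1 key=$2 value=$3
  if grep -q "^${key}=" "$file"; then
    sed -i "s|^${key}=.*|${key}=${value}|" "$file"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Demonstration on a temporary stand-in for emerald.properties.
props=$(mktemp)
printf 'MTLN_ALLOW_BASH_COMPONENTS=true\n' > "$props"

set_property "$props" MTLN_ALLOW_BASH_COMPONENTS false
cat "$props"   # prints "MTLN_ALLOW_BASH_COMPONENTS=false"
```

    On the instance itself, the same call would need root privileges and a server restart afterwards, as in step 4.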


    Example

    In this example, we are staging a small amount of data to form part of a monthly report. We want to back up this data at the end of the run, so we use a Bash Script component to create a copy of the staged data in the S3 bucket. The job is shown below.

    The Bash Script component properties are simple. The component is optionally given a name, then the script is written and a timeout duration is specified.

    The script used is shown below. This simply archives the data staged on S3 into a separate "backups" directory. In this case, the name of the data directory is given by a variable that is set at the beginning of the job (variables may be declared by clicking Project and then clicking Manage Environment Variables).

    As an embellishment, the rowcount of the data staging component is exported to another variable, set up in advance.

    When run, the Job will end with a backup of the data being created. Due to the echo command, the name of the file and the rowcount will be printed in the Tasks console.
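    A backup script along the lines described in this example might look like the sketch below. It is not the script from the original job: `bucket`, `data_dir`, and `row_count` are hypothetical variable names that would be defined in Matillion ETL, and `DRY_RUN` is an illustrative switch that prints the AWS CLI command instead of running it. No keys appear in the script, since credentials come from the environment.

```shell
#!/bin/bash
# Hypothetical variables; in a real job these would come from Matillion ETL.
bucket="${bucket:-example-bucket}"
data_dir="${data_dir:-monthly_report}"
row_count="${row_count:-0}"

src="s3://${bucket}/${data_dir}/"
dst="s3://${bucket}/backups/${data_dir}/"

# Copy the staged data into the backups directory.
# DRY_RUN=1 (the default in this sketch) only prints the command.
backup() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "aws s3 cp ${src} ${dst} --recursive"
  else
    aws s3 cp "${src}" "${dst}" --recursive
  fi
}

backup
echo "Backed up ${data_dir} (${row_count} rows)"
```

    Setting `DRY_RUN=0` performs the copy, provided the AWS CLI is installed and the instance credentials allow access to the bucket.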

