Filter
    • Dark
      Light

    Filter

    • Dark
      Light

    Article Summary

    Filter Component

    Filter rows from the input to pass a subset of rows to the next component based on a set of conditions.

    The function of this component is similar to the Data Filter property found in data staging (Query) orchestration components.

    Matillion ETL for Delta Lake on Databricks users may experience errors if using binary values. It is currently impossible to filter against a binary value with this component.

    Properties

    Snowflake Properties

    PropertySettingDescription
    NameTextA human-readable name for the component.
    Filter ConditionsInput ColumnThe name of the column from the input component to which the filter is applied.
    QualifierIs or Not — select whether matching rows in the data should be included or excluded. Default is Is.
    ComparatorLess than (default): Value in the Input Column must be less than that specified in the Value Column.
    Less than or equal to: Value in the Input Column must be less than or equal to that specified in the Value Column.
    Equal to: Value in the Input Column must be equal to that specified in the Value Column.
    Greater than (default): Value in the Input Column must be greater than that specified in the Value Column.
    Greater than or equal to: Value in the Input Column must be greater than or equal to that specified in the Value Column.
    Like: Case-sensitive match of the value in the Input Column and the pattern in the Value Column. See the Snowflake documentation.
    ILike: Case-insensitive match of the value in the Input Column and the pattern in the Value Column. See the Snowflake documentation.
    Similar to: Matches the value in the Input Column with a SQL standard regular expression pattern in the Value column.
    Null: Checks whether the value of the Input Column is the SQL "null" value.
    Blank: Checks whether the value of the Input Column is an empty string.
    Null or blank: Checks whether the value of the Input Column is an empty string or the SQL "null" value.
    Value ColumnThe value entered by the user to use for comparison. If this value is enclosed in double quotes, then it will be treated as a column in the generated SQL, which will allow for comparing columns.
    Combine ConditionsSelectWhen multiple Filter Conditions are present, they can be separated by AND or OR. AND means all the Filter Conditions must be true, OR means any of the Filter Conditions must be true.
    Note: For more complex conditions involving grouping several AND and OR conditions together, consider creating a new field using a Calculator component that specifies your complex condition and then use that new field in a simple Filter Condition.

    Redshift Properties

    PropertySettingDescription
    NameTextA human-readable name for the component.
    Filter ConditionsInput ColumnThe name of the column from the input component to which the filter is applied.
    QualifierIs or Not — select whether matching rows in the data should be included or excluded. Default is Is.
    ComparatorLess than (default): Value in the Input Column must be less than that specified in the Value Column.
    Less than or equal to: Value in the Input Column must be less than or equal to that specified in the Value Column.
    Equal to: Value in the Input Column must be equal to that specified in the Value Column.
    Greater than (default): Value in the Input Column must be greater than that specified in the Value Column.
    Greater than or equal to: Value in the Input Column must be greater than or equal to that specified in the Value Column.
    Like: Case-sensitive match of the value in the Input Column and the pattern in the Value Column. See the Redshift documentation.
    ILike: Case-insensitive match of the value in the Input Column and the pattern in the Value Column. See the Redshift documentation.
    Similar to: Matches the value in the Input Column with a SQL standard regular expression pattern in the Value column.
    See the Redshift documentation for expression syntax.
    Null: Checks whether the value of the Input Column is the SQL "null" value.
    Blank: Checks whether the value of the Input Column is an empty string.
    Null or blank: Checks whether the value of the Input Column is an empty string or the SQL "null" value.
    Value ColumnThe value entered by the user to use for comparison. If this value is enclosed in double quotes, then it will be treated as a column in the generated SQL, which will allow for comparing columns.
    Combine ConditionsSelectWhen multiple Filter Conditions are present, they can be separated by AND or OR. AND means all the Filter Conditions must be true, OR means any of the Filter Conditions must be true.
    Note: For more complex conditions involving grouping several AND and OR conditions together, consider creating a new field using a Calculator component that specifies your complex condition and then use that new field in a simple Filter Condition.

    BigQuery Properties

    PropertySettingDescription
    NameTextA human-readable name for the component.
    Filter ConditionsInput ColumnThe name of the column from the input component to which the filter is applied.
    QualifierIs or Not — select whether matching rows in the data should be included or excluded. Default is Is.
    ComparatorLess than (default): Value in the Input Column must be less than that specified in the Value Column.
    Less than or equal to: Value in the Input Column must be less than or equal to that specified in the Value Column.
    Equal to: Value in the Input Column must be equal to that specified in the Value Column.
    Greater than (default): Value in the Input Column must be greater than that specified in the Value Column.
    Greater than or equal to: Value in the Input Column must be greater than or equal to that specified in the Value Column.
    Like: Case-sensitive match of the value in the Input Column and the pattern in the Value Column.
    ILike: Case-insensitive match of the value in the Input Column and the pattern in the Value Column.
    Similar to: Matches the value in the Input Column with an SQL standard regular expression pattern in the Value column.
    Null: Checks whether the value of the Input Column is the SQL "null" value.
    Blank: Checks whether the value of the Input Column is an empty string.
    Null or blank: Checks whether the value of the Input Column is an empty string or the SQL "null" value.
    Value ColumnThe value entered by the user to use for comparison. If this value is enclosed in backticks, then it will be treated as a column in the generated SQL, which will allow for comparing columns.
    Combine ConditionsSelectWhen multiple Filter Conditions are present, they can be separated by AND or OR. AND means all the Filter Conditions must be true, OR means any of the Filter Conditions must be true.
    Note: For more complex conditions involving grouping several AND and OR conditions together, consider creating a new field using a Calculator component that specifies your complex condition and then use that new field in a simple Filter Condition.

    Synapse Properties

    PropertySettingDescription
    NameTextA human-readable name for the component.
    Filter ConditionsInput ColumnThe name of the column from the input component to which the filter is applied.
    QualifierIs or Not — select whether matching rows in the data should be included or excluded. Default is Is.
    ComparatorLess than (default): Value in the Input Column must be less than that specified in the Value Column.
    Less than or equal to: Value in the Input Column must be less than or equal to that specified in the Value Column.
    Equal to: Value in the Input Column must be equal to that specified in the Value Column.
    Greater than (default): Value in the Input Column must be greater than that specified in the Value Column.
    Greater than or equal to: Value in the Input Column must be greater than or equal to that specified in the Value Column.
    Like: Case-sensitive match of the value in the Input Column and the pattern in the Value Column.
    ILike: Case-insensitive match of the value in the Input Column and the pattern in the Value Column.
    Similar to: Matches the value in the Input Column with an SQL standard regular expression pattern in the Value column.
    Null: Checks whether the value of the Input Column is the SQL "null" value.
    Blank: Checks whether the value of the Input Column is an empty string.
    Null or blank: Checks whether the value of the Input Column is an empty string or the SQL "null" value.
    Value ColumnThe value entered by the user to use for comparison. If this value is enclosed in double quotes, then it will be treated as a column in the generated SQL, which will allow for comparing columns.
    Combine ConditionsSelectWhen multiple Filter Conditions are present, they can be separated by AND or OR. AND means all the Filter Conditions must be true, OR means any of the Filter Conditions must be true.
    Note: For more complex conditions involving grouping several AND and OR conditions together, consider creating a new field using a Calculator component that specifies your complex condition and then use that new field in a simple Filter Condition.

    Delta Lake Properties

    PropertySettingDescription
    NameTextA human-readable name for the component.
    Filter ConditionsInput ColumnThe name of the column from the input component to which the filter is applied.
    QualifierIs or Not — select whether matching rows in the data should be included or excluded. Default is Is.
    ComparatorLess than (default): Value in the Input Column must be less than that specified in the Value Column.
    Less than or equal to: Value in the Input Column must be less than or equal to that specified in the Value Column.
    Equal to: Value in the Input Column must be equal to that specified in the Value Column.
    Greater than (default): Value in the Input Column must be greater than that specified in the Value Column.
    Greater than or equal to: Value in the Input Column must be greater than or equal to that specified in the Value Column.
    Like: Case-sensitive match of the value in the Input Column and the pattern in the Value Column.
    ILike: Case-insensitive match of the value in the Input Column and the pattern in the Value Column.
    Similar to: Matches the value in the Input Column with an SQL standard regular expression pattern in the Value column.
    Null: Checks whether the value of the Input Column is the SQL "null" value.
    Blank: Checks whether the value of the Input Column is an empty string.
    Null or blank: Checks whether the value of the Input Column is an empty string or the SQL "null" value.
    Value ColumnThe value entered by the user to use for comparison. If this value is enclosed in backticks, then it will be treated as a column in the generated SQL, which will allow for comparing columns.
    Combine ConditionsSelectWhen multiple Filter Conditions are present, they can be separated by AND or OR. AND means all the Filter Conditions must be true, OR means any of the Filter Conditions must be true.
    Note: For more complex conditions involving grouping several AND and OR conditions together, consider creating a new field using a Calculator component that specifies your complex condition and then use that new field in a simple Filter Condition.

    Strategy

    Generates a WHERE clause.

    Quoting Date/Time values in Snowflake

    Values representing a constant date/time need to be single-quoted so that they are recognised by Snowflake. Matillion ETL recognises the following formats, and will quote them automatically:

    • yyyy-MM-dd
    • yyyy-MM-dd HH:mm:ss
    • yyyy-MM-dd HH:mm:ss +HH:mm
    • yyyy-MM-dd HH:mm:ss.SSSSSS
    • yyyy-MM-dd hh:mm:ss.SSSSSS +HH:mm
    • yyyy-MM-ddThh:mm:ss.SSSSSSZ

    Other date/time formats recognised by Snowflake will work if they are single-quoted in the Filter property, such as RFC format 'Thu, 21 Dec 2000 04:01:07 PM'.

    Read Supported Formats for AUTO Detection for a complete list of what Snowflake will recognize automatically.

    Functions and column references can also be used in this field. Values such as below will work as expected:

    • CURRENT_TIMESTAMP()
    • "anotherDateColumn" + 365

    Example

    In this example, we have used the Table Input component to load a large volume of 'flights' data, and we wish to take a small subset of this data for transformation. For this example, our subset of data will only include rows of data for flights that occurred: during the year 1999; in the second half of each month; with an airport 'origin' of RSW; and with an airport 'destination' of DFW. To drill down to this specific data, it will be advantageous to reduce the size of the dataset to include only rows of data that satisfy our conditions.

    Data is loaded from a table using the Table Input component. Its properties are defined in the image below.

    When we sample the input data (below), we see that the raw 'flights' data offers 123,534,969 rows of data.

    With the criteria of our subset of data in mind, we use the Filter component to filter out all irrelevant rows. We have four filters set up. The first filter ensures that we only take flights dated during the year 1999. The second and third filters clarify our chosen value for "origin" and "destination". Finally, the fourth filter ensures that our data is from days of the month greater than or equal to 16.

    You can see that we have set the Filter component's "Combine Conditions" property to "AND", so that the multiple filter conditions set up must all be satisfied by a given row. Applying this filter should drastically reduce the number of rows in our dataset.

    Sampling the data through the Filter component, we can see that the dataset has been reduced from 123,534,969 rows to a mere 76 rows.



    Video


    What's Next