Experts,

We have our JSON files stored in the below folder structure in S3 as/appname/lob/2020/07/24/12,/appname/lob/2020/07/24/13,/appname/lob/2020/07/24/14

stage @SFSTG = /appname/lob/

We need to create a external table with partition based on the hours. We can derive the partition part from the metadata$filename. However question here is should the partition column should be created as timestamp or varchar?

Which partition datatype helps us in better performance when accessing the file from snowflake using External table.

1

Best Answer


Snowflake's recommendation is the following:

date_part date as to_date(substr(metadata$filename, 14, 10), 'YYYY/MM/DD'),

*Double check 14 is the correct start of your partition in your stage url I may have it incorrect here.

Full example:

CREATE OR REPLACE EXTERNAL TABLE Database.Schema.ExternalTableName(date_part date as to_date(substr(metadata$filename, 14, 10), 'YYYY/MM/DD'),col1 varchar AS (value:col1::varchar)),col2 varchar AS (value:col2::varchar))PARTITION BY (date_part)INTEGRATION = 'YourIntegration'LOCATION=@SFSTG/appname/lob/AUTO_REFRESH = trueFILE_FORMAT = (TYPE = JSON);