简体   繁体   中英

External table not refreshing data Snowflake

Say that I have an external table in Hive and the csv file in the external table's S3 location looks like below.

+----+------+
| ID | Name |
+----+------+
|  1 | A    |
|  2 | B    |
+----+------+

If I change the data in the file like below, I am able to see the changed value when I query the external table in Hive.

+----+------+
| ID | Name |
+----+------+
|  1 | A    |
|  2 | C    |
+----+------+

The same scenario when I tested in Snowflake, I am not able to see the new data rather I can still see the old data, though I have added the auto_refresh = true while creating external table in Snowflake.

create or replace external table schema.table_name 
(
    ID INT as  (value:c1::int), 
    Name varchar(20) as ( value:c2::varchar)
)
with location = @ext_stage_test
file_format = pipeformat
auto_refresh = true

Is this the behavior of Snowflake or am I missing anything?

Any help is highly appreciated.

I'll place this as an answer, but it doesn't sound like you have setup the SNS notifications on AWS S3. The AUTO_REFRESH allows Snowflake to refresh an External Table when a notification is sent to a queue that a file has either been placed in your S3 bucket or removed. This is the only truly auto-refresh property available. Otherwise, there is a manual refresh of the external table that needs to be run to refresh the external table:

alter external table exttable_part refresh;

If you are unable to set the SNS configuration noted in the documentation ( https://docs.snowflake.com/en/sql-reference/sql/create-external-table.html#simple-external-table-auto-refresh-using-amazon-sns ), then I would recommend adding the command above to a task that executes on a frequency that makes sense for you.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM