I am using Python Snowflake connector to extract data from tables in Snowflake. Here is my file structure:
sql
a.sql
b.sql
c.sql
configurations.py
data_extract.py
main.py
Here the sql folder contains all my sql queries in .sql files. I put these sql files separately because they are handreds of lines long each and looks messy if I put them into python files. configuration.py contains datetime parameters I want to change every time I run the code. It looks like this:
START_TIME = '2018-10-01 00:00:00'
END_TIME = '2019-04-01 00:00:00'
I want to add these parameters into the .sql files. For example, a.sql includes the following content:
DECLARE
@START_PICKUP_DATE DATE,
@END_PICKUP_DATE DATE,
SET
@START_PICKUP_DATE = '2018-10-01'
SET
@END_PICKUP_DATE = '2019-04-01'
select supplier_confirmation_id, pickup_datetime, dropoff_datetime, pickup_station_distance
from SANDBOX.ZQIAN.V_PDL
where pickup_datetime >= START_PICKUP_DATE and pickup_datetime < END_PICKUP_DATE
and supplier_confirmation_id is not null;
I use a.sql in my python code in the following way:
def executeSQLScriptsFromFile(filepath):
# snowflake credentials, replace SECRET with your own
ctx = snowflake.connector.connect(
user='S_ANALYTICS_USER',
account=SECRET_A,
region='us-east-1',
warehouse=SECRET_B,
database=SECRET_C,
role=SECRET_D,
password=SECRET_E)
fd = open(filepath, 'r')
query = fd.read()
fd.close()
cs = ctx.cursor()
try:
cur = cs.execute(query)
df = pd.DataFrame.from_records(iter(cur), columns=[x[0] for x in cur.description])
finally:
cs.close()
ctx.close()
return df
def extract_data():
a_sqlpath = os.path.join(os.getcwd(), 'sql\a.sql')
a_df = executeSQLScriptsFromFile(a_sqlpath)
return a_df
The problem is I want START_PICKUP_DATE and END_PICKUP_DATE in a.sql file to be synced and equal to START_TIME and END_TIME in configurations.py file so that I only need to change START_TIME and END_TIME in configurations.py and extract data in different timeframe using a.sql in Snowflake.
I've been looking for solutions online for quite a long time, but still not able to find a good solution that is specific to my problem. Many thanks to anyone who can provide a hint!
To accomplish this, I would take your .sql files and extract the queries into triple-quoted python strings with format specifiers for your variables. Then import the queries into your main script just like you import your configuration:
sql_queries.py:
sql_a = """
DECLARE
@START_PICKUP_DATE DATE,
@END_PICKUP_DATE DATE,
SET
@START_PICKUP_DATE = {START_TIME}
SET
@END_PICKUP_DATE = {END_TIME}
select supplier_confirmation_id, pickup_datetime, dropoff_datetime, pickup_station_distance
from SANDBOX.ZQIAN.V_PDL
where pickup_datetime >= START_PICKUP_DATE and pickup_datetime < END_PICKUP_DATE
and supplier_confirmation_id is not null;
"""
main:
from sql_queries import sql_a
print(sql_a.format(configuration.START_TIME, configuration.END_TIME))
You should be able to parameterize the sql statements so that instead of declaring in the SQL file you can just make it a parameter passed during execution.
select supplier_confirmation_id, pickup_datetime, dropoff_datetime, pickup_station_distance
from SANDBOX.ZQIAN.V_PDL
where pickup_datetime >= %(START_PICKUP_DATE)s and pickup_datetime < %(END_PICKUP_DATE)s and supplier_confirmation_id is not null;
Then when calling the function, just send the parameters START_PICKUP_DATE
and END_PICKUP_DATE
as parameters to the execute statement. One way to do this is to do a mapping from the parameter name to the value of the parameter. (In this example I'm assuming you have a function that will get the parameter value).
cur = cs.execute(query, {'START_PICKUP_DATE':get_value_from_config('start_pickup'), 'END_PICKUP_DATE':get_value_from_config('end_pickup')})
Or you can pass them by location
cur = cs.execute(query, [get_value_from_config('start_pickup'), get_value_from_config('end_pickup')])
Which in essense becomes
cur = cs.execute(query, ['2018-10-01 00:00:00','2019-04-01 00:00:00'])
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.