read_sql query returns an empty dataframe after I pass parameters as a dict in python pandas
I am trying to parameterize some parts of a SQL query using the following dictionary:
query_params = dict(
{'target':'status',
'date_from':'201712',
'date_to':'201805',
'drform_target':'NPA'
})
sql_data_sample = str("""select *
from table_name
where dt = %(date_to)s
and %(target)s in (%(drform_target)s)
----------------------------------------------------
union all
----------------------------------------------------
(select *
from table_name
where dt = %(date_from)s
and %(target)s in ('ACT')
order by random() limit 50000);""")
df_data_sample = pd.read_sql(sql_data_sample,con = cnxn,params = query_params)
However, this returns a dataframe with no records at all. I am not sure what is wrong, because no error is raised.
df_data_sample.shape
Out[7]: (0, 1211)
The final PostgreSQL query would be:
select *
from table_name
where dt = '201805'
and status in ('NPA')
----------------------------------------------------
union all
----------------------------------------------------
(select *
from table_name
where dt = '201712'
and status in ('ACT')
order by random() limit 50000);-- This part of random() is only for running it on my local and not on server.
Below is a small sample of the data for reproduction. The original data has more than a million records and 1211 columns.
service_change_3m  service_change_6m  dt      grp_m2     status
0                  -2                 201805  $50-$75    NPA
0                  0                  201805  < $25      NPA
0                  -1                 201805  $175-$200  ACT
0                  0                  201712  $150-$175  ACT
0                  0                  201712  $125-$150  ACT
-1                 1                  201805  $50-$75    NPA
Can someone help me with this?
Update: following @shmee's suggestion, I ended up using:
target = 'status'
query_params = dict(
{
'date_from':'201712',
'date_to':'201805',
'drform_target':'NPA'
})
sql_data_sample = str("""select *
from table_name
where dt = %(date_to)s
and {0} in (%(drform_target)s)
----------------------------------------------------
union all
----------------------------------------------------
(select *
from table_name
where dt = %(date_from)s
and {0} in ('ACT')
order by random() limit 50000);""").format(target)
df_data_sample = pd.read_sql(sql_data_sample,con = cnxn,params = query_params)
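One caveat with str.format is that whatever ends up in target becomes part of the SQL text itself. If that name can ever come from outside the script, a check against a fixed list of column names before formatting is one possible safeguard. The sketch below is only an illustration: checked_column and ALLOWED_COLUMNS are hypothetical names that do not appear in the original code, and the allow-list contents are assumed from the sample data.

```python
def checked_column(name, allowed):
    """Return name unchanged if it is in the allow-list, else raise ValueError."""
    if name not in allowed:
        raise ValueError(f"unexpected column name: {name!r}")
    return name


# Hypothetical allow-list; in practice it would hold the table's real column names.
ALLOWED_COLUMNS = {"status", "grp_m2"}

# Shortened version of the query: the column goes in via {0},
# the values stay as %(name)s placeholders for the driver to bind.
template = ("select * from table_name "
            "where dt = %(date_to)s and {0} in (%(drform_target)s)")

sql_text = template.format(checked_column("status", ALLOWED_COLUMNS))
print(sql_text)
```

The resulting string still contains the %(date_to)s and %(drform_target)s placeholders, so it can be passed to pd.read_sql together with query_params exactly as above.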
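Yes, I am quite confident that your problem comes from trying to set a column name in the query through parameter binding ( and %(target)s in ('ACT') ), as mentioned in the comments.
Bound parameters are always sent as values, never as identifiers. The query therefore restricts the result set to records where 'status' in ('ACT') holds (i.e. is the string 'status' an element of a list containing only the string 'ACT'?). That is of course false, so no record is selected and the result is empty.
This should work as expected: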
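from psycopg2 import sql

col_name = 'status'
# Schema and table are separate identifiers; Identifier('public.churn_data')
# would be quoted as a single name, "public.churn_data".
schema_name, table_name = 'public', 'churn_data'
query_params = {'date_from':'201712',
'date_to':'201805',
'drform_target':'NPA'
}
sql_data_sample = """select *
from {0}
where dt = %(date_to)s
and {1} in (%(drform_target)s)
----------------------------------------------------
union all
----------------------------------------------------
(select *
from {0}
where dt = %(date_from)s
and {1} in ('ACT')
order by random() limit 50000);"""
# Identifier() with more than one part needs psycopg2 >= 2.8.
sql_data_sample = sql.SQL(sql_data_sample).format(sql.Identifier(schema_name, table_name),
sql.Identifier(col_name))
# as_string() renders the composed query as plain text for pandas;
# this assumes cnxn is a psycopg2 connection.
df_data_sample = pd.read_sql(sql_data_sample.as_string(cnxn),con = cnxn,params = query_params)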
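The effect described in this answer can be reproduced without a PostgreSQL server. The following is a minimal sketch that swaps in the standard-library sqlite3 module purely so that it runs anywhere; sqlite3 uses :name placeholders instead of %(name)s, and the three-row churn_data table is made up from the sample data in the question. The binding behaviour it shows (a bound column name arrives as a string value) is the same idea.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("create table churn_data (dt text, status text)")
conn.executemany(
    "insert into churn_data values (?, ?)",
    [("201805", "NPA"), ("201805", "ACT"), ("201712", "ACT")],
)

# Column name passed as a bound parameter: the driver sends the *string*
# 'status', so the condition is 'status' in ('NPA'), which is never true.
bound = conn.execute(
    "select * from churn_data where dt = :date_to and :target in (:drform_target)",
    {"date_to": "201805", "target": "status", "drform_target": "NPA"},
).fetchall()

# Column name written into the SQL text, values still bound.
# The name is hard-coded here, so nothing untrusted reaches the query text.
col_name = "status"
query = (
    f'select * from churn_data where dt = :date_to '
    f'and "{col_name}" in (:drform_target)'
)
fixed = conn.execute(
    query, {"date_to": "201805", "drform_target": "NPA"}
).fetchall()

print(len(bound), len(fixed))
```

The first query matches nothing even though a matching row exists, while the second returns the single 201805 / NPA row.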