Dynamic variable in a PySpark DataFrame
I created a DataFrame with PySpark and tried to query it using a dynamic variable, but the query returns no rows. How can I pass a dynamic variable into the query below?
start_dt = '2022-1-15'
df.printSchema()
# |-- state
# |-- county
# |-- population
# |-- pdate: string
df = df.filter((df.state == 'CA') & (df.pdate == start_dt))
df.show()
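Since pdate is a string column, the equality check is an exact text comparison. One possible cause of the empty result, which is an assumption because the data is not shown, is that the column stores zero-padded dates such as '2022-01-15' while the variable holds '2022-1-15'. A minimal sketch that normalizes the variable before filtering; normalize_date is a hypothetical helper name, not part of PySpark:

```python
from datetime import datetime

def normalize_date(value, fmt="%Y-%m-%d"):
    """Hypothetical helper: return the date as zero-padded YYYY-MM-DD text."""
    # strptime accepts months and days with or without a leading zero;
    # strftime always writes them zero-padded.
    return datetime.strptime(value, fmt).strftime("%Y-%m-%d")

start_dt = normalize_date("2022-1-15")
print(start_dt)  # 2022-01-15

# The normalized value can then be used in the original filter:
# df = df.filter((df.state == 'CA') & (df.pdate == start_dt))
```

If the column itself is stored without padding, the same idea applies in reverse: the variable has to match whatever text format the column actually uses.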
Use PySpark's lit function to pass the value explicitly. The code is below:
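from pyspark.sql import SparkSession
from pyspark.sql.functions import lit

spark = SparkSession.builder.getOrCreate()

df = spark.createDataFrame([
    ('https:john', 'john', 1.1, 'httpsasd'),
    ('https:john', 'john', 1.1, 'kafka'),
    ('https:john', 'john', 1.2, 'httpsasd')
], ['website', 'name', 'value', 'other'])
df.show(truncate=False)

# Value supplied at runtime
selection = 'httpsasd'
# lit() wraps the Python value in a literal Column for the comparison
df = df.filter((df.value == 1.1) & (df.other == lit(selection)))
df.show()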
Result:
+----------+----+-----+--------+
|   website|name|value|   other|
+----------+----+-----+--------+
|https:john|john|  1.1|httpsasd|
+----------+----+-----+--------+