py4j.Py4JException: Method and([class java.lang.String]) does not exist
I have a Spark DataFrame with the following schema.
root
|-- ME_KE: string (nullable = true)
|-- CSPD_CAT: string (nullable = true)
|-- EFF_DT: string (nullable = true)
|-- TER_DT: string (nullable = true)
|-- CREATE_DTM: string (nullable = true)
|-- ELIG_IND: string (nullable = true)
Basically, I am trying to convert the Spark SQL code below into equivalent operations applied directly to the DataFrame.
df=spark.read.format('csv').load(SourceFilesPath+"\\cutdetl.csv",inferSchema=True,header=True)
df.createOrReplaceTempView("cutdetl")
spark.sql(f"""select
me_ke,
eff_dt,
ter_dt,
create_dtm
from
cutdetl
where
(elig_ind = 'Y') and
((to_date({start_dt},'dd-mon-yyyy') between eff_dt and ter_dt) or
(eff_dt between to_date({start_dt},'dd-mon-yyyy') and to_date({end_dt},'dd-mon-yyyy')))
""")
Below is the code I tried.
df1=(df.select("me_ke","eff_dt","ter_dt","elig_ind")
.where(col("elig_ind")=="Y" & (F.to_date('31-SEP-2022', 'dd-mon-yyyy')
.between(col("mepe_eff_dt"),col("mepe_term_dt"))) |
(F.to_date(col("eff_dt"))
.between(F.to_date('31-DEC-2022'),F.to_date('31-DEC-2022')))))
I get the following error:
py4j.Py4JException: Method and([class java.lang.String]) does not exist
Could anyone help with converting the above SQL into DataFrame-level operations?
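On the error itself: in Python, `&` binds more tightly than `==`, so `col("elig_ind")=="Y" & (...)` is grouped as `col("elig_ind") == ("Y" & (...))`. The string `"Y"` is then combined with a Column, which matches the `and([class java.lang.String])` in the message. A minimal sketch using only the standard library `ast` module shows the grouping; `elig` and `cond` are placeholder names, not Spark objects, and `outer_and_inner` is a hypothetical helper written for this illustration.

```python
import ast


def outer_and_inner(expression):
    """Return (outermost node type, type of the operand that holds the other operator)."""
    node = ast.parse(expression, mode="eval").body
    if isinstance(node, ast.Compare):
        # For a comparison, look at its right-hand side.
        return "Compare", type(node.comparators[0]).__name__
    if isinstance(node, ast.BinOp):
        # For a binary operator such as &, look at its left-hand side.
        return "BinOp", type(node.left).__name__
    return type(node).__name__, None


# Without parentheses, == is outermost and "Y" & cond is evaluated first.
print(outer_and_inner('elig == "Y" & cond'))    # ('Compare', 'BinOp')

# With parentheses, & is outermost and combines two complete conditions.
print(outer_and_inner('(elig == "Y") & cond'))  # ('BinOp', 'Compare')
```

In practice this means wrapping every comparison in its own parentheses before combining conditions with `&` or `|`.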
I would go about it like this:
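from pyspark.sql import functions as F
from pyspark.sql.functions import col

df = spark.read.format('csv').load(SourceFilesPath + "\\cutdetl.csv", inferSchema=True, header=True)
# Build the bounds as date Columns: a Python string has no .between(), and a string passed to
# .between() is treated as a literal, not as SQL. 'dd-MMM-yyyy' assumes values like 31-Dec-2022.
start = F.to_date(F.lit(start_dt), "dd-MMM-yyyy")
end = F.to_date(F.lit(end_dt), "dd-MMM-yyyy")

df1 = df.filter(col("elig_ind") == "Y")
# eff_dt and ter_dt are strings in the schema; this assumes Spark can cast them to dates as stored.
df1 = df1.filter(col("eff_dt").between(start, end) |
                 start.between(col("eff_dt"), col("ter_dt")))
df1 = df1.select("me_ke", "eff_dt", "ter_dt", "create_dtm")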
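If the goal is mainly to reuse the existing SQL predicate, another option is to pass it as a string, since `DataFrame.filter` also accepts a SQL expression. The sketch below rests on several assumptions: `build_condition` and `filter_cutdetl` are hypothetical helper names, the pattern `dd-MMM-yyyy` assumes inputs such as `31-Dec-2022`, and `eff_dt`/`ter_dt` are assumed to hold values Spark can compare with a date (otherwise they would need their own `to_date` call).

```python
def build_condition(start_dt: str, end_dt: str, fmt: str = "dd-MMM-yyyy") -> str:
    """Build the WHERE clause from the question as a SQL expression string.

    Assumes start_dt and end_dt are trusted values without quote characters.
    """
    start = f"to_date('{start_dt}', '{fmt}')"
    end = f"to_date('{end_dt}', '{fmt}')"
    return (
        "elig_ind = 'Y' and "
        f"(({start} between eff_dt and ter_dt) or "
        f"(eff_dt between {start} and {end}))"
    )


def filter_cutdetl(df, start_dt: str, end_dt: str):
    """Apply the condition to a Spark DataFrame and keep the four output columns."""
    return df.filter(build_condition(start_dt, end_dt)).select(
        "me_ke", "eff_dt", "ter_dt", "create_dtm"
    )


# The condition can be inspected before it is handed to Spark.
print(build_condition("01-Jan-2022", "31-Dec-2022"))
```

With a DataFrame loaded as above, the call would look like `df1 = filter_cutdetl(df, start_dt, end_dt)`. Keeping the predicate as SQL sidesteps the Python operator-precedence issue entirely, at the cost of building the string by hand.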