
How to process a Table.column which contains SQL logic in PySpark

E.g.:

Table - MappingTable

Col1  Col2  MappingLogic
One   Two   SELECT * FROM TableX
One   Two   SELECT * FROM TableX X LEFT OUTER JOIN TableY Y ON X.id = Y.ID

Other Tables - TableX and TableY

How can I use this mapping table in a PySpark DataFrame and build my logic from the MappingLogic column?

Not sure what kind of answer you are expecting, but in general you can use SQL expressions in your PySpark code. You just have to create views on your tables first:

# Register the source table as a temp view so it can be referenced by name in SQL.
spark.read \
    .jdbc("jdbc:postgresql:dbserver", "tableX",
          properties={"user": "username", "password": "password"}) \
    .createOrReplaceTempView("tableX")

# Later, fetch the SQL expression from your mapping-logic table and execute it:
s = "SELECT * FROM TableX"
df = spark.sql(s)
