How to combine two columns' values into another column using PySpark?
This is the code I'm using in AWS Glue to map values from a CSV to a SQL table.
mappings=[
    ("houseA", "string", "villa", "string"),
    ("houseB", "string", "small_house", "string"),
    ("houseA"+"houseB", "string", "combined_key", "string"),
],
Mapping houseA and houseB to the "villa" and "small_house" columns respectively works fine. But when I try to put houseA + houseB into the "combined_key" column, it gives me this error:
An error occurred while calling o128.pyWriteDynamicFrame. Cannot insert the value NULL into column 'combined_key', table 'dbo.Buildings'; column does not allow nulls. INSERT fails.
I couldn't quite figure out why it is giving back a NULL error. Any ideas on how the code can be modified? Thanks in advance.
I actually found that Glue Studio offers a custom transform node where this can be achieved with PySpark code.