This article gives a great overview of how to change column names: "How to change dataframe column names in pyspark?"
Nonetheless, I need something slightly different that I am not able to do myself. Can anybody help me remove spaces from all column names? It's needed for e.g. join commands, and a systematic approach reduces the effort of dealing with 30 columns. I suppose a combination of regex and a UDF would work best.
Example:
root
 |-- CLIENT: string (nullable = true)
 |-- Branch Number: string (nullable = true)
There is a real simple solution:
for name in df.schema.names:
    df = df.withColumnRenamed(name, name.replace(' ', ''))
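Since the question mentions regex, here is a minimal sketch of a variant that computes all the new names first and renames in one call. The helper names `strip_whitespace` and `rename_without_spaces` are made up for illustration; the only Spark calls assumed are `df.columns` and `DataFrame.toDF(*names)`. No UDF is needed, because column names are plain Python strings on the driver.

```python
import re


def strip_whitespace(names):
    """Return the names with all whitespace removed (hypothetical helper)."""
    cleaned = [re.sub(r"\s+", "", name) for name in names]
    # Guard against two columns collapsing into the same name, e.g. "a b" and "ab".
    if len(set(cleaned)) != len(cleaned):
        raise ValueError(f"renaming would create duplicate columns: {cleaned}")
    return cleaned


def rename_without_spaces(df):
    """Rename every column of a PySpark DataFrame in a single call."""
    # toDF(*names) returns a new DataFrame with columns renamed by position.
    return df.toDF(*strip_whitespace(df.columns))


# The name logic can be checked without a Spark session:
new_names = strip_whitespace(["CLIENT", "Branch Number"])
print(new_names)
```

With a real DataFrame this would be used as `df = rename_without_spaces(df)`. Unlike `replace(' ', '')`, the pattern `\s+` also removes tabs and other whitespace, which may or may not be what you want.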
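If you want to rename multiple columns by joining the same column names with a prefix (or suffix), this should work:
from pyspark.sql import functions as f
df.select([f.col(c).alias(PREFIX + c) for c in columns])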