
PySpark: convert array columns to string in a loop

I have a PySpark DataFrame with string, int, and array type columns. I am trying to loop over all columns, check whether any of them is an array type, and convert it to a string.

The resulting DataFrame should still hold the int and string columns.

The code below returns only the columns that were converted from array to string. How do I add an else branch so that the remaining non-array columns of the DataFrame are kept as well?

dfstring = df.select([(F.col(c).cast('String')).alias(c) for c in df.columns if dict(df.dtypes)[c] == 'array<string>'])

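You can move the type check out of the filter clause and into a conditional expression, so that non-array columns pass through unchanged:

from pyspark.sql import functions as F

dtypes = dict(df.dtypes)  # column name -> type string, e.g. 'array<string>'
dfstring = df.select([
    # startswith avoids matching nested types such as 'map<string,array<int>>'
    F.col(c).cast('string').alias(c)
    if dtypes[c].startswith('array')
    else F.col(c)
    for c in df.columns
])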
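The same idea can be wrapped in a small reusable helper. This is only a sketch: the function names `is_array_type`, `array_column_names`, and `stringify_arrays` are made up for illustration, and it assumes that `df.dtypes` reports array columns with type strings beginning with `array<`.

```python
from typing import List, Tuple


def is_array_type(dtype: str) -> bool:
    """Return True when a Spark type string such as 'array<string>' is a top-level array."""
    return dtype.startswith("array<")


def array_column_names(dtypes: List[Tuple[str, str]]) -> List[str]:
    """Pick the array-typed column names out of (name, type) pairs shaped like df.dtypes."""
    return [name for name, dtype in dtypes if is_array_type(dtype)]


def stringify_arrays(df):
    """Cast every top-level array column of a PySpark DataFrame to string, keep the rest."""
    # Imported here so the two helpers above can be used without a Spark installation.
    from pyspark.sql import functions as F

    to_cast = set(array_column_names(df.dtypes))
    return df.select([
        F.col(c).cast("string").alias(c) if c in to_cast else F.col(c)
        for c in df.columns
    ])
```

Calling `dfstring = stringify_arrays(df)` should then keep the column order of `df` and change only the array columns. Keeping the type check in a plain function makes it easy to verify against a list of `(name, type)` pairs before running it on a real DataFrame.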
