
PySpark dataframe drop columns issue

I am trying to drop two columns from a dataframe, but I am getting this error:

**Error:**
drop() takes 2 positional arguments but 3 were given

***Code:***
excl_columns = row['exclude_columns'].split(',')
df = df.drop(*excl_columns)

# print(excl_columns)
# ['year_of_birth', 'ethnicity']

This error usually means you are running an older Spark version whose `DataFrame.drop()` accepts only a single column name; support for `drop(*cols)` with multiple columns was added later. Here's one way which should work on any version. Note that `select` returns a new DataFrame, so assign the result back:

excl_columns = row['exclude_columns'].split(',')
df = df.select([c for c in df.columns if c not in excl_columns])
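The core of the fix is plain list filtering over `df.columns`. A minimal standalone sketch of that logic, with a hypothetical column list standing in for a real DataFrame (no Spark needed):

```python
# Hypothetical column list standing in for df.columns
columns = ['id', 'name', 'year_of_birth', 'ethnicity', 'city']

# Same parsing as in the question: a comma-separated exclude string
excl_columns = 'year_of_birth,ethnicity'.split(',')

# Keep every column that is not in the exclusion list
kept = [c for c in columns if c not in excl_columns]
print(kept)  # → ['id', 'name', 'city']
```

Passing `kept` to `df.select(kept)` then produces a DataFrame without the excluded columns.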
