I just started using pandas and I would like to reduce the amount of data I get by merging my DataFrames this way:
    def merge_df(in_df):
        alist = []
        for col in in_df.columns:
            if len(in_df[col].unique()) == 1:
                alist.append(col)
        return in_df[alist].T.squeeze()[1]
Is there a more elegant way to do this, e.g. without looping through all the columns?
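For comparison, the same constant-column selection can be written without a Python-level loop by using `DataFrame.nunique`, which counts distinct values per column. This is a sketch assuming a generic DataFrame, not the asker's actual data:

```python
import pandas as pd

# Sample DataFrame: columns "a" and "c" each hold a single repeated value.
df = pd.DataFrame({"a": [1, 1, 1], "b": [1, 2, 3], "c": ["x", "x", "x"]})

# Columns whose number of unique values is exactly 1.
constant_cols = df.columns[df.nunique() == 1]

# One representative value per constant column, as a Series.
result = df[constant_cols].iloc[0]
print(result)
```

Note that `nunique` excludes NaN by default; pass `dropna=False` if a column of all-NaN should also count as constant.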
Yeah, you can remove duplicate data with a simple pandas function: df.drop_duplicates()
You can refer to the documentation here.
To remove redundant data in particular columns, you can pass the column names via the "subset" parameter. It will drop the whole row for duplicate data.
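A minimal sketch of the `subset` behavior described above, with a hypothetical DataFrame:

```python
import pandas as pd

# Hypothetical DataFrame with a duplicated value in column "id".
df = pd.DataFrame({"id": [1, 1, 2], "val": ["a", "b", "c"]})

# Consider only "id" when detecting duplicates; by default the first
# occurrence is kept and the whole duplicate row is dropped.
deduped = df.drop_duplicates(subset=["id"])
print(deduped)
```

Note that the row `(1, "b")` is removed entirely, even though its "val" entry is unique.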