Combine rows of a single column over multiple .csv files in pandas
I have a bunch of .csv files with the same column headers and data types in the columns.
c1 c2 c3
1 5 words
2 6 words
3 7 words
4 8 words
Is there a way to combine all the text in c3 within each .csv file, then combine the results into one csv?
I combined them this way:
from glob import iglob
import pandas as pd

path = r'C:\Users\...\**\*.csv'
all_rec = iglob(path, recursive=True)
dataframes = (pd.read_csv(f) for f in all_rec)
big_dataframe = pd.concat(dataframes, ignore_index=True)
I'm not sure how to combine the text rows first and then bring them together.
There are many ways to do it. One way:
from glob import iglob
import pandas as pd

path = r'C:\Users\...\**\*.csv'
all_rec = iglob(path, recursive=True)
# Read only the c3 column from each file, keyed by file path
dataframes = {f: pd.read_csv(f, usecols=['c3']) for f in all_rec}
# Concatenate (the file path becomes index level 0), then join each file's c3 text into one row
big_dataframe = pd.concat(dataframes).groupby(level=0)['c3'] \
    .apply(lambda x: ' '.join(x)).reset_index(drop=True)
Output:
>>> big_dataframe
0 words words words words
1 words2 words2 words2 words2
2 words3 words3 words3 words3
Name: c3, dtype: object
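
For anyone who wants to try this without real data, here is a minimal self-contained sketch of the same idea that also writes the result to a single CSV, as the question asks. The helper name combine_text_column, the extra "file" column, and the demo file names and contents are made up for illustration. It also assumes blank c3 cells should simply be skipped, which the original answer does not address.

```python
import tempfile
from glob import iglob
from pathlib import Path

import pandas as pd


def combine_text_column(pattern, column="c3", sep=" "):
    """Return one row per CSV file: its path and the joined text of `column`."""
    rows = []
    # sorted() makes the row order deterministic across platforms
    for f in sorted(iglob(pattern, recursive=True)):
        # Read only the text column as strings and skip blank cells
        text = pd.read_csv(f, usecols=[column], dtype=str)[column].dropna()
        rows.append({"file": f, column: sep.join(text)})
    return pd.DataFrame(rows, columns=["file", column])


# Demo on throwaway files (hypothetical names: a.csv and sub/b.csv)
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    pd.DataFrame({"c1": [1, 2], "c2": [5, 6], "c3": ["foo", "bar"]}).to_csv(
        root / "a.csv", index=False
    )
    pd.DataFrame({"c1": [3, 4], "c2": [7, 8], "c3": ["baz", "qux"]}).to_csv(
        root / "sub" / "b.csv", index=False
    )

    combined = combine_text_column(str(root / "**" / "*.csv"))
    result = combined["c3"].tolist()

    # Write the combined rows to one CSV and read it back as a sanity check
    out_path = root / "combined.csv"
    combined.to_csv(out_path, index=False)
    round_trip = pd.read_csv(out_path)["c3"].tolist()

print(result)  # ['foo bar', 'baz qux']
```

Keeping the file path next to the combined text makes it easy to tell which row came from which file. If you only need the text, drop the "file" column before calling to_csv.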