Split pandas dataframe rows up to searched column value into new dataframes
I have a dataframe that contains multiple header rows (it is a combination of multiple CSVs). Is there a way to split the dataframe back into individual dataframes without using .iloc? .iloc works, but it will be time-consuming for my workflow.
import pandas as pd

data = {'A': [1, 2, 3, 'A', 4, 5, 6, 'A', 7, 8, 9],
        'B': [9, 8, 7, 'B', 6, 5, 4, 'B', 3, 2, 1]}
df = pd.DataFrame(data, columns=['A', 'B'])
## My current approach:
df1 = df.iloc[:3,]
df2 = df.iloc[4:7,]
df3 = df.iloc[8:,]
Is there a better way to split the dataframe by searching for the values in the columns? That is, something like df1, df2, df3 = df.split(df['A'] == 'A')
You can use eq to find the header rows (rows where every cell equals its column label), then groupby on the cumsum of that mask:
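# True for rows in which every cell equals its own column label
header_rows = df.eq(df.columns).all(axis=1)
# The running count of header rows labels each block; header rows are dropped first
dfs = {k: v for k, v in df[~header_rows].groupby(header_rows.cumsum())}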
Then, for example, dfs[0] gives:
   A  B
0  1  9
1  2  8
2  3  7
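
To get the df1, df2, df3 unpacking asked for in the question, the same idea can be wrapped in a small function. This is only a sketch: split_on_header_rows is a made-up helper name, not a pandas method, and it assumes a header row is one where every cell equals its column label. It also resets each block's index and calls infer_objects(), since the mixed int/str columns start out with object dtype.

```python
import pandas as pd

data = {'A': [1, 2, 3, 'A', 4, 5, 6, 'A', 7, 8, 9],
        'B': [9, 8, 7, 'B', 6, 5, 4, 'B', 3, 2, 1]}
df = pd.DataFrame(data, columns=['A', 'B'])


def split_on_header_rows(frame):
    """Hypothetical helper: split `frame` at rows that repeat the column labels."""
    # True for rows in which every cell equals its own column label.
    header_rows = frame.eq(frame.columns).all(axis=1)
    # The running count of header rows gives each block its own integer id.
    block_ids = header_rows.cumsum()
    return [
        # Renumber rows from 0 and let pandas re-infer dtypes from object.
        block.reset_index(drop=True).infer_objects()
        for _, block in frame[~header_rows].groupby(block_ids)
    ]


# Unpacking works here because the sample data contains exactly three blocks.
df1, df2, df3 = split_on_header_rows(df)
```

Returning a list rather than a dict keeps the blocks in their original order even if the combined frame happens to start with a header row, in which case the first groupby key would be 1 instead of 0.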