How to split dataframe by specific string in rows

I have a dataframe like this:

df = pd.DataFrame({"a":["x1", 12, 14, "x2", 32, 9]})

df
Out[10]: 
    a
0  x1
1  12
2  14
3  x2
4  32
5   9

I would like to split it into multiple dataframes (in this case, two) whenever a row begins with "x". That row should then become the column name. Maybe the resulting dataframes could be stored in a dictionary?

The output should be like this:

x1
Out[12]: 
   x1
0  12
1  14

x2
Out[13]: 
   x2
0  32
1   9

Could anyone help me?

You can try cumsum on str.startswith, then groupby on that:

for k, d in df.groupby(df['a'].str.startswith('x', na=False).cumsum()):
    # first row of each group becomes the header, the remaining rows are the data
    sub_df = pd.DataFrame(d.iloc[1:].to_numpy(), columns=d.iloc[0].to_numpy())

    # do something with it
    print(sub_df)
    print('-'*10)

Output:

   x1
0  12
1  14
----------
   x2
0  32
1   9
----------
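
To see why the grouping key works, it can help to look at the intermediate values. This is a minimal sketch on the same df as in the question; the names is_marker and group_id are only illustrative, and na=False is used so that non-string rows count as non-matches:

```python
import pandas as pd

df = pd.DataFrame({"a": ["x1", 12, 14, "x2", 32, 9]})

# True on rows whose value is a string starting with "x"; non-strings count as False
is_marker = df["a"].str.startswith("x", na=False)

# Each True bumps the running total, so every marker row opens a new group
group_id = is_marker.cumsum()

print(group_id.tolist())  # [1, 1, 1, 2, 2, 2]
```

Rows that share a group_id end up in the same sub-frame, with the marker row first.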

Something like this should work:

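import pandas as pd

df = pd.DataFrame({"a": ["x1", 12, 14, "x2", 32, 9]})

# Collect the index labels of rows whose value is a string starting with "x"
ixs = []
for j in df.index:
    value = df.loc[j, 'a']
    if isinstance(value, str) and value.startswith('x'):
        ixs.append(j)

dicto = {}
for i, start_ix in enumerate(ixs):
    # Each block runs up to the row before the next marker (or to the last row).
    # Note: this arithmetic assumes the default integer index.
    if i == len(ixs) - 1:
        end_ix = df.index[-1]
    else:
        end_ix = ixs[i + 1] - 1
    # Select with a list of columns so the result is a DataFrame, not a Series
    new_df = df.loc[start_ix:end_ix, ['a']].reset_index(drop=True)
    new_df.columns = [new_df.iloc[0, 0]]  # the marker row becomes the header
    new_df = new_df.drop(new_df.index[0]).reset_index(drop=True)
    dicto[i] = new_df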

A groupby is like a dictionary, so we can explicitly make it one:

dfs = {f'x{k}':d for k, d in df.groupby(df['a'].str.startswith('x', na=False).cumsum())}
for k in dfs:
    dfs[k].columns = dfs[k].iloc[0].values # Make x row the header.
    dfs[k] = dfs[k].iloc[1:] # drop x row.
    print(dfs[k], '\n')

Output:

   x1
1  12
2  14

   x2
4  32
5   9
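
The frames printed above keep their original row labels (1, 2 and 4, 5), and the keys x1/x2 only match the headers because the markers happen to be numbered consecutively. Here is a sketch of one way to key the dictionary by the marker value itself and renumber the rows from 0, as in the question's expected output. split_on_marker is a hypothetical helper name, not part of pandas, and it assumes the first row of the column is a marker:

```python
import pandas as pd


def split_on_marker(df, col="a", prefix="x"):
    """Split df[col] into a dict of one-column DataFrames keyed by marker value."""
    # na=False treats non-string rows as non-markers
    is_marker = df[col].str.startswith(prefix, na=False)
    out = {}
    for _, block in df.groupby(is_marker.cumsum()):
        header = block[col].iloc[0]   # marker row, e.g. "x1"
        body = block[col].iloc[1:]    # rows below the marker
        out[header] = body.reset_index(drop=True).to_frame(name=header)
    return out


df = pd.DataFrame({"a": ["x1", 12, 14, "x2", 32, 9]})
dfs = split_on_marker(df)
print(dfs["x2"])
```

Each sub-frame is then available by name, for example dfs["x1"] or dfs["x2"].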
