Pandas: iterate rows into a new DataFrame

Question

How do I separate rows and build a new DataFrame from the resulting Series?

Suppose I have a DataFrame df. I iterate over it as shown below and try to append each row to an empty DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(low=0, high=10, size=(5, 5)),
                    columns=['a', 'b', 'c', 'd', 'e'])

df1 = pd.DataFrame()
df2 = pd.DataFrame()

for index,row in df.iterrows():
    if (few conditions goes here):
        df1.append(row)
    else:
        df2.append(row)

Each row yielded by the iteration is a Series, but when I append it to an empty DataFrame, the rows are added as columns and the columns as rows. Is there a fix for this?

Answer 1

I think it is best to avoid iterating and use boolean indexing, with conditions chained by & for AND, | for OR, ~ for NOT and ^ for XOR:

#define all conditions
mask = (df['a'] > 2) & (df['b'] > 3)
#filter
df1 = df[mask]
#invert condition by ~
df2 = df[~mask]

Sample:

np.random.seed(125)
df = pd.DataFrame(np.random.randint(low=0, high=10, size=(5, 5)),
                    columns=['a', 'b', 'c', 'd', 'e'])
print (df)
   a  b  c  d  e
0  2  7  3  6  0
1  5  6  2  5  0
2  4  2  9  0  7
3  2  7  9  5  3
4  5  7  9  9  1

mask = (df['a'] > 2) & (df['b'] > 3)
print (mask)
0    False
1     True
2    False
3    False
4     True


df1 = df[mask]
print (df1)
   a  b  c  d  e
1  5  6  2  5  0
4  5  7  9  9  1

df2 = df[~mask]
print (df2)
   a  b  c  d  e
0  2  7  3  6  0
2  4  2  9  0  7
3  2  7  9  5  3
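The sample only chains conditions with & and inverts with ~. As a rough sketch of the other operators mentioned, here is the same data typed in by hand (so it does not depend on the random seed); the names cond_a, cond_b, either, exactly_one and neither are made up for illustration:

```python
import pandas as pd

# Same values as the printed sample, entered by hand for reproducibility.
df = pd.DataFrame(
    [[2, 7, 3, 6, 0],
     [5, 6, 2, 5, 0],
     [4, 2, 9, 0, 7],
     [2, 7, 9, 5, 3],
     [5, 7, 9, 9, 1]],
    columns=['a', 'b', 'c', 'd', 'e'])

# Each comparison needs its own parentheses (or its own variable),
# because &, | and ^ bind more tightly than > in Python.
cond_a = df['a'] > 2
cond_b = df['b'] > 3

either = df[cond_a | cond_b]        # OR: at least one condition holds
exactly_one = df[cond_a ^ cond_b]   # XOR: exactly one condition holds
neither = df[~(cond_a | cond_b)]    # NOT: no condition holds

print(exactly_one)
```

With this data every row satisfies at least one condition, so neither comes back empty.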

EDIT:

Loop version. If possible, don't use it, because it is slow:

df1 = pd.DataFrame(columns=df.columns)
df2 = pd.DataFrame(columns=df.columns)

for index,row in df.iterrows():
    if (row['a'] > 2) and (row['b'] > 3):
       df1.loc[index] = row
    else:
       df2.loc[index] = row


print (df1)
   a  b  c  d  e
1  5  6  2  5  0
4  5  7  9  9  1

print (df2)
   a  b  c  d  e
0  2  7  3  6  0
2  4  2  9  0  7
3  2  7  9  5  3
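If a loop really is needed, another option is to collect the rows in plain Python lists and build each DataFrame once at the end. This also sidesteps DataFrame.append, which returned a new object instead of modifying the frame in place and was removed in pandas 2.0. A minimal sketch, assuming the same sample values typed in by hand; rows1 and rows2 are illustrative names:

```python
import pandas as pd

# Same values as the printed sample, entered by hand for reproducibility.
df = pd.DataFrame(
    [[2, 7, 3, 6, 0],
     [5, 6, 2, 5, 0],
     [4, 2, 9, 0, 7],
     [2, 7, 9, 5, 3],
     [5, 7, 9, 9, 1]],
    columns=['a', 'b', 'c', 'd', 'e'])

rows1, rows2 = [], []   # list.append works in place, unlike the old DataFrame.append
for index, row in df.iterrows():
    if (row['a'] > 2) and (row['b'] > 3):
        rows1.append(row)
    else:
        rows2.append(row)

# Each collected Series becomes one row; its name (the original index label)
# becomes the row label. Passing columns keeps the layout even if a list is empty.
df1 = pd.DataFrame(rows1, columns=df.columns)
df2 = pd.DataFrame(rows2, columns=df.columns)

print(df1)
print(df2)
```

This is still much slower than boolean indexing on large frames, but it avoids growing a DataFrame one row at a time.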

Answer 2

Try the query method:

df2 = df1.query('conditions go here')
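As a hedged illustration, here is how query might look with the condition from the first answer filled in, applied to the sample values typed in by hand. Taking the remaining rows with drop is my own assumption for getting the second frame, and it only works when the index labels are unique:

```python
import pandas as pd

# Same values as the sample in the first answer, entered by hand.
df = pd.DataFrame(
    [[2, 7, 3, 6, 0],
     [5, 6, 2, 5, 0],
     [4, 2, 9, 0, 7],
     [2, 7, 9, 5, 3],
     [5, 7, 9, 9, 1]],
    columns=['a', 'b', 'c', 'd', 'e'])

# Column names can be used directly inside the query string.
df1 = df.query('a > 2 and b > 3')

# Everything that did not match: drop the matched labels (assumes a unique index).
df2 = df.drop(df1.index)

print(df1)
print(df2)
```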

2 answers

Answer 1: 1, accepted, 2018-01-04 16:00:54
Answer 2: 1, 2018-01-04 16:01:51
