Adding a column based on index to a data frame in Pandas
I have a data frame where I want to add a new column with values based on the index.
This is my fake df:
import numpy as np
import pandas as pd
my = pd.DataFrame({'fruit': [
'Apple', 'Kiwi', 'Clementine', 'Kiwi', 'Banana', 'Clementine', 'Apple', 'Kiwi'],
'bites': [1, 2, 3, 1, 2, 3, 1, 2]})
I have found a similar question and tried the solution there, but I get error messages. This is what I tried:
conds = [(my.index >= 0) & (my.index <= row_2),
(my.index > row_2) & (my.index<=row_5),
(my.index > row_5) & (my.index<=row_6),
(my.index > row_6)]
names = ['Donna', 'Kelly', 'Andrea','Brenda']
my['names'] = np.select(conds, names)
For me it works fine (with the variables changed to numeric values). I also added alternative solutions: cut with the include_lowest=True parameter, so that the value 0 is matched, and selection by DataFrame.loc:
conds = [(my.index >= 0) & (my.index <= 2),
(my.index > 2) & (my.index<=5),
(my.index > 5) & (my.index<=6),
(my.index > 6)]
names = ['Donna', 'Kelly', 'Andrea','Brenda']
my['names'] = np.select(conds, names)
my['names1'] = pd.cut(my.index, [0,2,5,6,np.inf], labels=names, include_lowest=True)
my.loc[:2, 'names2'] = 'Donna'
my.loc[3:5, 'names2'] = 'Kelly'
my.loc[6:6, 'names2'] = 'Andrea'
my.loc[7:, 'names2'] = 'Brenda'
print(my)
        fruit  bites   names  names1  names2
0       Apple      1   Donna   Donna   Donna
1        Kiwi      2   Donna   Donna   Donna
2  Clementine      3   Donna   Donna   Donna
3        Kiwi      1   Kelly   Kelly   Kelly
4      Banana      2   Kelly   Kelly   Kelly
5  Clementine      3   Kelly   Kelly   Kelly
6       Apple      1  Andrea  Andrea  Andrea
7        Kiwi      2  Brenda  Brenda  Brenda
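To see why include_lowest=True is needed for index 0, here is a minimal sketch that bins the same 0 to 7 index with and without the parameter. The names idx, bins, without_lowest and with_lowest are only illustrative and are not part of the code above.

```python
import numpy as np
import pandas as pd

names = ['Donna', 'Kelly', 'Andrea', 'Brenda']
idx = pd.RangeIndex(8)          # same 0..7 index as the example frame
bins = [0, 2, 5, 6, np.inf]

# By default the bins are right-closed: (0, 2], (2, 5], (5, 6], (6, inf],
# so the value 0 does not fall into any bin and becomes NaN.
without_lowest = pd.cut(idx, bins, labels=names)

# include_lowest=True also closes the first bin on the left, so 0 is matched.
with_lowest = pd.cut(idx, bins, labels=names, include_lowest=True)

print(pd.isna(without_lowest[0]))
print(list(with_lowest))
```

Only the first element differs between the two results; all other index values land in the same bins either way.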
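You can try pd.cut. Without include_lowest, index 0 falls outside the first bin (0, 2] and comes back as NaN, so it is filled with the first name (here df is the data frame and names is the list of labels from the question):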
df['names'] = (pd.cut(df.index,
[0, 2, 5, 6, np.inf],
labels=names)
.fillna(names[0])
)
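As a quick check of this approach, the sketch below rebuilds a small frame and counts how many values are missing before fillna is applied. The frame contents, binned and n_missing are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd

names = ['Donna', 'Kelly', 'Andrea', 'Brenda']
# Hypothetical stand-in for the question's frame (only the index matters here).
df = pd.DataFrame({'bites': [1, 2, 3, 1, 2, 3, 1, 2]})

# pd.cut on an Index returns a Categorical; 0 is outside (0, 2] and becomes NaN.
binned = pd.cut(df.index, [0, 2, 5, 6, np.inf], labels=names)
n_missing = int(pd.isna(binned).sum())

# names[0] is already one of the categories, so it is a valid fill value.
df['names'] = binned.fillna(names[0])

print(n_missing)
print(df['names'].tolist())
```

The fill value has to be an existing category, which is why names[0] works here; filling with an arbitrary new string would need that category to be added first.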