How to add a new column to a dataframe based on the values of an external list?
I have a dataframe df with two columns: col1 and col2.
col1 contains the ids of my users. users is a list of names (i.e. strings). So id=0 corresponds to the name at index 0 in my users list.
I want to add a new column to my dataframe containing the names that correspond to the ids.
If the id column only has unique values (meaning there aren't multiple rows with the same id) and those ids cover every index of the list, you can sort the dataframe by the id column and then assign the list to a new column.
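import pandas
data = {'id': [2, 1, 0, 3]}
df = pandas.DataFrame(data=data)
users = ['dave', 'sandy', 'will', 'arthur']
# Sort rows by id so the row order matches the order of the users list
df.sort_values(by=['id'], inplace=True)
# Assign names by position: the first row gets users[0], and so on
df['user'] = users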
Output:
id user
0 dave
1 sandy
2 will
3 arthur
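The sort-then-assign approach above only lines up correctly when the sorted ids are exactly 0, 1, ..., len(users) - 1. A minimal sketch of a guard for that, assuming the same example data; the names ids_sorted and safe are illustrative and not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({"id": [2, 1, 0, 3]})
users = ["dave", "sandy", "will", "arthur"]

# Assumption: positional assignment is only safe when the ids are
# exactly 0..len(users)-1, each appearing once.
ids_sorted = sorted(df["id"].tolist())
safe = ids_sorted == list(range(len(users)))

if safe:
    # Sort by id, then assign the names by position
    df = df.sort_values(by="id")
    df["user"] = users
```

If the check fails (duplicate or missing ids), a per-row lookup is the safer choice.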
If the id column has multiple instances of the same id, you can use a lambda function:
import pandas
data = {'id': [3, 1, 0, 3]}
df = pandas.DataFrame(data=data)
users = ['dave', 'sandy', 'will', 'arthur']
# For each row, use its id as an index into the users list
df['user'] = df.apply(lambda row: users[row['id']], axis=1)
Output:
id user
3 arthur
1 sandy
0 dave
3 arthur
The lambda says that, for every row, the value of the new 'user' column is the entry of the users list at the index given by that row's 'id' value.
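The same per-row lookup can also be sketched with Series.map instead of DataFrame.apply, which avoids building a row object for every row. This is an alternative to the lambda above, not the answer's own method; add_user_column and id_to_name are hypothetical names introduced for illustration:

```python
import pandas as pd

def add_user_column(df, users, id_col="id", new_col="user"):
    """Return a copy of df with names looked up from users by id.

    add_user_column is an illustrative helper, not a pandas API.
    """
    # Build an id -> name mapping from list positions: {0: users[0], 1: users[1], ...}
    id_to_name = dict(enumerate(users))
    out = df.copy()
    # Series.map looks each id up in the dict; ids not in the dict become NaN
    out[new_col] = out[id_col].map(id_to_name)
    return out

df = pd.DataFrame({"id": [3, 1, 0, 3]})
users = ["dave", "sandy", "will", "arthur"]
result = add_user_column(df, users)
print(result)
```

Unlike indexing the list directly, an id with no matching position yields NaN here rather than raising IndexError, which may or may not be what you want.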
user_list = ['user_1', 'user_2', 'user_3']
Adding the column to the dataframe (the list must have one entry per row):
df['UserName'] = user_list