Python: applying a function to each row of a DataFrame

Question

I have a DataFrame with two columns: Type and Name. The values in each cell are lists of equal length, i.e. each row holds a sequence of (Type, Name) pairs. I want to:

Group the Name values by their Type
Create one column per Type holding the matching Name values

My current code is a for loop:

for idx, row in df.iterrows():
    for t in list(set(row["Type"])):
        df.at[idx, t] = [row["Name"][i] for i in range(len(row["Name"])) if row["Type"][i] == t]

but it runs very slowly. How can I speed up this code?

EDIT: Here is a code example that illustrates what I want to obtain, but in a faster way:

import pandas as pd
df = pd.DataFrame({"Type": [["1", "1", "2", "3"], ["2","3"]], "Name": [["A", "B", "C", "D"], ["E", "F"]]})

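# Collect every distinct Type across all rows (row is not defined yet at this point)
unique = sorted({t for types in df["Type"] for t in types})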
for t in unique:
    df[t] = None
    df[t] = df[t].astype('object')

for idx, row in df.iterrows():
    for t in unique:
        df.at[idx, t] = [row["Name"][i] for i in range(len(row["Name"])) if row["Type"][i] == t]

Answer 1

You could write a function my_function(param) and then do something like this:

df['type'] = df['name'].apply(lambda x: my_function(x))

There are likely better alternatives to lambda functions, but lambdas are what I remember. If you post a simplified mock of your original data and what the desired output should look like, it may help you find the best answer to your question. I'm not certain I understand what you're trying to do. A literal group-by should be done using the DataFrame groupby method.
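As a rough sketch of what such a function could look like for the sample data in the question's EDIT: the helper name group_names_by_type is hypothetical, and it takes the two lists of a row directly rather than going through apply. This is one possible approach, not necessarily the fastest.

```python
from collections import defaultdict

import pandas as pd


def group_names_by_type(types, names):
    """Hypothetical helper: map each type in one row to the list of its names."""
    groups = defaultdict(list)
    for t, name in zip(types, names):
        groups[t].append(name)
    return dict(groups)


df = pd.DataFrame({
    "Type": [["1", "1", "2", "3"], ["2", "3"]],
    "Name": [["A", "B", "C", "D"], ["E", "F"]],
})

# One dict per row; a plain comprehension avoids the overhead of iterrows().
grouped = [group_names_by_type(t, n) for t, n in zip(df["Type"], df["Name"])]

# Types missing from a row become NaN when the dicts are aligned into columns.
wide = pd.DataFrame(grouped, index=df.index)
res = df.join(wide)
```

Each row is scanned once, instead of once per distinct type as in the original loop.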

Answer 2

If I understand correctly, your dataframe looks something like this:

df = pd.DataFrame({'Name':['a,b,c','d,e,f,g'], 'Type':['3,3,2','1,2,2,1']})

      Name     Type
0    a,b,c    3,3,2
1  d,e,f,g  1,2,2,1

where each element is a comma-separated string of values. Start by running:

df['Name:Type'] = (df['Name']+":"+df['Type']).map(process)

using (define this function first):

def process(x):
    x_,y_ = x.split(':')
    x_ = x_.split(','); y_ = y_.split(',')
    s = zip(x_,y_)
    str_ = ','.join(':'.join(y) for y in s)
    return str_

Then you will get a single Name:Type column of comma-separated name:type pairs, such as a:3,b:3,c:2 for the first row.

This reduces the problem to a single column. Finally, produce the required dataframe with:

l = ','.join(df['Name:Type'].to_list()).split(',')
pd.DataFrame([i.split(':') for i in l], columns=['Name','Type'])

This gives a dataframe with one (Name, Type) pair per row.

Answer 3

Is this the result you want? (If not, please add an example of the desired output to your question.)

res = df.explode(['Name','Type']).groupby('Type')['Name'].agg(list)

print(res)
'''
Type
1    [A, B]
2    [C, E]
3    [D, F]
Name: Name, dtype: object
'''

UPD

df1 = df.apply(lambda x: pd.Series(x['Name'],x['Type']).groupby(level=0).agg(list).T,1)
res = pd.concat([df,df1],axis=1)

print(res)
'''
           Type          Name       1    2    3
0  [1, 1, 2, 3]  [A, B, C, D]  [A, B]  [C]  [D]
1        [2, 3]        [E, F]     NaN  [E]  [F]
'''
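The same wide layout can probably also be reached without a row-wise apply, by exploding once and grouping on the original row label. The following is a sketch under assumptions: it uses the sample data from the question and needs pandas 1.3 or later for multi-column explode. The variable names long and wide are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "Type": [["1", "1", "2", "3"], ["2", "3"]],
    "Name": [["A", "B", "C", "D"], ["E", "F"]],
})

# One (Type, Name) pair per row; the original row label moves to an "index" column.
long = df.explode(["Type", "Name"]).reset_index()

# Collect names per (original row, Type), then pivot Type into columns.
wide = long.groupby(["index", "Type"])["Name"].agg(list).unstack("Type")

# Align the per-type columns back onto the original rows.
res = df.join(wide)
```

Whether this beats the apply version depends on the data, so it is worth timing both on a realistic sample.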
