How to apply a function to each row in a DataFrame and get a Series of dicts?

I want to apply a function to each row of a DataFrame and get a Series back. This works when the function returns a number, but not when it returns a dict.

In [31]: d
Out[31]: 
                a         b
bar one -0.185677 -0.554356
    two -0.457943 -1.094836
baz one -0.731338 -0.027821
    two -1.061098  0.258291
foo one -1.392160  2.287989
    two  2.010208 -1.350581
qux one -0.792229 -0.323397
    two -1.063265  0.048641

In [32]: d.apply(lambda x: x.a+x.b, axis=1)
Out[32]: 
bar  one   -0.740034
     two   -1.552779
baz  one   -0.759159
     two   -0.802806
foo  one    0.895829
     two    0.659626
qux  one   -1.115627
     two   -1.014624
dtype: float64

In [33]: d.apply(lambda x: {"boo": x.a}, axis=1)  # I want a Series of dicts
Out[33]: 
          a   b
bar one NaN NaN
    two NaN NaN
baz one NaN NaN
    two NaN NaN
foo one NaN NaN
    two NaN NaN
qux one NaN NaN
    two NaN NaN

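It does not seem to matter whether the reduce argument of apply is None, True, or False. pandas seems to be too clever here: it apparently tries to look up the keys "a" and "b" in the returned dict {"boo": x.a}, which would explain the all-NaN frame.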

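The answer is not to try to do the calculation and the data type conversion in a single step. Compute a Series of values first, then restructure it into dicts.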

import pandas
import random

mi1 = ['bar','baz']
mi2 = ['one', 'two']
data = [[s,t, random.random(), random.random()] for s in mi1 for t in mi2]
df = pandas.DataFrame(data, columns=['i1', 'i2', 'a', 'b'])
df.set_index(['i1', 'i2'], inplace=True)
print(df)

            a         b
i1  i2                     
bar one  0.438596  0.734058
    two  0.183522  0.272922
baz one  0.581694  0.522173
    two  0.776081  0.941120

# calculate the series
# the data type is still floats
sr = df.apply(lambda row: row.a + row.b, axis=1)

# construct the series of objects
result = sr.apply(lambda x: {"boo":x})
print(result)

i1   i2 
bar  one     {'boo': 1.17265404527}
     two    {'boo': 0.456443829892}
baz  one     {'boo': 1.10386719117}
     two     {'boo': 1.71720149706}
dtype: object
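
As a variation on the same two-step idea, the dicts can be built in plain Python and then wrapped in a Series, which avoids depending on how DataFrame.apply decides the shape of its result. This is only a minimal sketch: the sample values and the names index, result, and records are made up for illustration and are not from the original post.

```python
import pandas as pd

# Hypothetical sample frame with the same shape as the one above
# (fixed values instead of random ones, so the result is predictable).
index = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["i1", "i2"]
)
df = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0, 4.0], "b": [0.5, 0.25, 0.125, 0.0]},
    index=index,
)

# Step 1 and 2 combined outside of apply: build one dict per row in a
# list comprehension, then attach the original index.
result = pd.Series(
    [{"boo": a + b} for a, b in zip(df["a"], df["b"])],
    index=df.index,
)

# If every column should end up in the dict, to_dict(orient="records")
# returns one {column: value} dict per row.
records = pd.Series(df.to_dict(orient="records"), index=df.index)

print(result)
print(records)
```

Both result and records have dtype object and keep the (i1, i2) MultiIndex, so they can be aligned with df like any other Series.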

