Pandas row-specific apply

Question

Similar to this R question, I'd like to apply a function to each item in a Series (or each row in a DataFrame) using Pandas, but I want to pass the index or id of that row to the function as an argument. As a trivial example, suppose one wants to create a list of tuples of the form [(index_1, value_1), ..., (index_n, value_n)]. Using a simple Python for loop, I can do:

In [1]: from pandas import Series
In [2]: L = []
In [3]: s = Series(['six', 'seven', 'six', 'seven', 'six'],
   ...:            index=['a', 'b', 'c', 'd', 'e'])
In [4]: for i, item in enumerate(s):
   ...:     L.append((i, item))
In [5]: L
Out[5]: [(0, 'six'), (1, 'seven'), (2, 'six'), (3, 'seven'), (4, 'six')]

But there must be a more efficient way to do this? Perhaps something more Pandas-like, such as Series.apply? In reality, I'm not worried (in this case) about returning anything meaningful; I'm more interested in the efficiency of something like 'apply'. Any ideas?

Answer 1

If you use the apply method with a function, every item in the Series is mapped through that function. For example:

>>> s.apply(enumerate)
a    <enumerate object at 0x13cf910>
b    <enumerate object at 0x13cf870>
c    <enumerate object at 0x13cf820>
d    <enumerate object at 0x13cf7d0>
e    <enumerate object at 0x13ecdc0>

What you want to do is simply enumerate the series itself.

>>> list(enumerate(s))
[(0, 'six'), (1, 'seven'), (2, 'six'), (3, 'seven'), (4, 'six')]
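Note that enumerate yields integer positions, not the index labels 'a' to 'e'. If the labels are what is wanted, a minimal sketch (assuming a reasonably recent pandas, where Series.items() is available) could look like this:

```python
import pandas as pd

s = pd.Series(['six', 'seven', 'six', 'seven', 'six'],
              index=['a', 'b', 'c', 'd', 'e'])

# enumerate pairs each value with its integer position
positional = list(enumerate(s))

# Series.items() pairs each value with its index label instead
labelled = list(s.items())

print(positional)  # [(0, 'six'), (1, 'seven'), (2, 'six'), (3, 'seven'), (4, 'six')]
print(labelled)    # [('a', 'six'), ('b', 'seven'), ('c', 'six'), ('d', 'seven'), ('e', 'six')]
```

The two results only coincide when the Series has a default 0..n-1 index.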

What if, for example, you wanted to join the strings of all the entries?

>>> ",".join(s)
'six,seven,six,seven,six'

A more complex use of apply would be this one:

>>> # On Python 3, map returns a lazy iterator, so build each list explicitly.
>>> s.apply(lambda word: [ch * 2 for ch in word])
a                ['ss', 'ii', 'xx']
b    ['ss', 'ee', 'vv', 'ee', 'nn']
c                ['ss', 'ii', 'xx']
d    ['ss', 'ee', 'vv', 'ee', 'nn']
e                ['ss', 'ii', 'xx']

[Edit]

Following the OP's request for clarification: don't confuse Series (1D) with DataFrames (2D) http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dataframe - as I don't really see how you can talk about rows in a Series. However, you can include indices in your function by creating a new series (apply won't give you any information about the current index):

>>> Series([s[x] + " index  " + x for x in s.keys()], index=s.keys())
a      six index  a
b    seven index  b
c      six index  c
d    seven index  d
e      six index  e
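The same idea can be sketched without indexing by hand, and it extends to DataFrame rows as well. The helper name `describe` and the `word`/`length` columns below are hypothetical, introduced only for illustration. The sketch relies on the fact that with `axis=1`, DataFrame.apply passes each row as a Series whose `name` attribute is that row's index label:

```python
import pandas as pd

s = pd.Series(['six', 'seven', 'six', 'seven', 'six'],
              index=['a', 'b', 'c', 'd', 'e'])

def describe(label, value):
    # Hypothetical helper: any function that needs both label and value.
    return f"{value} index  {label}"

# Series: iterate over (label, value) pairs and rebuild a Series.
labelled = pd.Series([describe(k, v) for k, v in s.items()], index=s.index)

# DataFrame: with axis=1 each row arrives as a Series, and row.name
# holds the index label of that row.
df = pd.DataFrame({'word': s, 'length': s.str.len()})
row_result = df.apply(lambda row: describe(row.name, row['word']), axis=1)

print(labelled)
print(row_result)
```

Both results carry the original index, so they can be aligned with `s` or assigned back as a new column.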

Anyhow, I would suggest that you switch to other data types to avoid huge memory leaks.

Answer 2

Here's a neat way, using itertools's count and zip:

import pandas as pd
from itertools import count

s = pd.Series(['six', 'seven', 'six', 'seven', 'six'],
                  index=['a', 'b', 'c', 'd', 'e'])

In [4]: list(zip(count(), s))  # list() is needed on Python 3, where zip is lazy
Out[4]: [(0, 'six'), (1, 'seven'), (2, 'six'), (3, 'seven'), (4, 'six')]

Unfortunately, this is only as efficient as enumerate(list(s))!
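To check that claim on your own machine, a rough timing sketch with timeit might look like the following. The Series size and repeat count are arbitrary assumptions, and the timings will vary with hardware and pandas version, so no expected numbers are given:

```python
import timeit
from itertools import count

import pandas as pd

# Arbitrary test data: 20,000 short strings (size chosen only for illustration).
s = pd.Series(['six', 'seven'] * 10_000)

def with_enumerate():
    return list(enumerate(s))

def with_zip_count():
    return list(zip(count(), s))

# Time each approach over a few runs.
t_enum = timeit.timeit(with_enumerate, number=5)
t_zip = timeit.timeit(with_zip_count, number=5)

print(f"enumerate:  {t_enum:.4f} s")
print(f"zip(count): {t_zip:.4f} s")
```

Both functions build the same list of (position, value) tuples, so any difference in the timings reflects only the iteration mechanism.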

Answer 1: 7, accepted, 2012-06-23 16:00:18

Answer 2: 3, 2012-12-11 20:47:51
