
Manipulate lists in a pandas data frame column (e.g. divide by another column)

I have a pandas data frame with one column containing lists. I wish to divide each list element in each row by a scalar value in another column. In the following example, I wish to divide each element in a by b:

              a   b
0  [11, 22, 33]  11
1  [12, 24, 36]   2
2  [33, 66, 99]   3

Thus yielding the following result:

              a   b                   c
0  [11, 22, 33]  11     [1.0, 2.0, 3.0]
1  [12, 24, 36]   2   [6.0, 12.0, 18.0]
2  [33, 66, 99]   3  [11.0, 22.0, 33.0]

I can achieve this with the following code:

import pandas as pd

df = pd.DataFrame({"a":[[11,22,33],[12,24,36],[33,66,99]], "b" : [11,2,3]})

result = {"c":[]}
for _, row in df.iterrows():
    result["c"].append([x / row["b"] for x in row["a"]])

df_c = pd.DataFrame(result)
df = pd.concat([df,df_c], axis="columns")

But explicitly iterating over rows, collecting the result in a dictionary, converting it to a data frame, and then concatenating it to the original data frame seems very inefficient and inelegant.

Does anyone have a better solution?

Thanks in advance and cheers!


PS: In case you are wondering why I would store lists in a column: these are the resulting amplitudes of a Fourier transform.

Why don't I use one column for each frequency?

  1. Creating a new column for each frequency is horribly slow.
  2. With different sampling rates and FFT window sizes in my project, there are multiple sets of frequencies.

Zip the two columns and divide each entry in column a by its corresponding entry in column b, using a combination of product and starmap, then convert the iterator back into a list. Note that floordiv performs floor (integer) division, so the results here are integers; operator.truediv would give the floats shown in the question.

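from itertools import product, starmap
from operator import floordiv

# product(num, [denom]) pairs every list element with the row's scalar;
# starmap(floordiv, ...) then floor-divides each pair.
df['c'] = [list(starmap(floordiv, product(num, [denom])))
           for num, denom in zip(df.a, df.b)]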


              a   b             c
0  [11, 22, 33]  11     [1, 2, 3]
1  [12, 24, 36]   2   [6, 12, 18]
2  [33, 66, 99]   3  [11, 22, 33]
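
If the float output from the question is wanted, the same product/starmap idea can be sketched with operator.truediv in place of floordiv. This is a minimal, self-contained variant; the data frame is rebuilt from the question's example, and swapping in truediv is an assumption about what the asker needs, not part of the original answer.

```python
from itertools import product, starmap
from operator import truediv

import pandas as pd

# Example data taken from the question.
df = pd.DataFrame({"a": [[11, 22, 33], [12, 24, 36], [33, 66, 99]],
                   "b": [11, 2, 3]})

# For each row, pair every element of the list with the row's scalar,
# then apply true division to each (element, scalar) pair.
df["c"] = [list(starmap(truediv, product(num, [denom])))
           for num, denom in zip(df["a"], df["b"])]

print(df)
```

Column c then holds lists of floats rather than integers, matching the result asked for in the question.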

Alternatively, you could just use a numpy array within the iteration:

import numpy as np

df['c'] = [list(np.array(num) / denom) for num, denom in zip(df.a, df.b)]

Thanks to @jezrael for the suggestion: all of this might be unnecessary, as scipy has functionality for FFT. Have a look at the link and see if it helps.

I would convert the lists to numpy arrays:

import numpy as np

df['c'] = df['a'].apply(np.array) / df['b']

You will get np.arrays in column c. If you really need lists, you will have to convert them back:

df['c'] = df['c'].apply(list)
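
Putting the two steps together, a self-contained sketch of this approach might look as follows. The data frame is the one from the question. Using ndarray.tolist() through a lambda for the final conversion is a substitution made here for illustration (it yields plain Python floats), not something the answer itself prescribes.

```python
import numpy as np
import pandas as pd

# Example data taken from the question.
df = pd.DataFrame({"a": [[11, 22, 33], [12, 24, 36], [33, 66, 99]],
                   "b": [11, 2, 3]})

# Turn each list into a numpy array, then divide row by row:
# every array in column a is divided by the scalar in column b of the same row.
df["c"] = df["a"].apply(np.array) / df["b"]

# Optional: convert the arrays back to plain Python lists.
# (Assumption: tolist() is used here instead of list() to get Python floats.)
df["c"] = df["c"].apply(lambda arr: arr.tolist())

print(df)
```

Whether to keep the arrays or convert back to lists depends on what happens next; arrays are usually more convenient if further element-wise arithmetic follows.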
