Apply a function to each column of a dataframe

I have a dataframe of numbers from 1 to 13, where each number is a location. The index is a timeline of 2-minute timesteps covering 24 hours (720 rows), and each column represents a single person. So every column holds one person's locations over 24 hours in 2-minute steps.

I am trying to convert these numbers to binary: a 13 should become 1, and anything else 0. But when I apply the function, I get an error:

The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Here's the code:

import pandas as pd
from datetime import timedelta
df = pd.read_csv("dataset_belgium/all_patterns_2MINS.csv", encoding="utf-8")
df = df.transpose()

df.reset_index(drop=True, inplace=True)


timeline = []
for timestep in range(len(df.index)):
    time = timedelta(seconds=timestep*2*60)
    time = str(time)
    timeline.append(time)


tl = pd.DataFrame(timeline)
tl.columns = ['timeline']

df=df.join(tl, how='left')

df = df.set_index('timeline')
#df.drop(['0:00:00'])

def to_binary(element):
    if element == 13:
        element = 1
    else:
        element = 0
    return element

binary_df = df.apply(to_binary)

I would also like to remove the first row, the one with index '0:00:00', since it doesn't contain numbers from 1 to 13. Thanks in advance!

As you say in the title, you apply the function to each column of the dataframe. So what you call element inside the function is actually a whole column. That's why the line if element == 13: raises an error: comparing a whole column with a number gives a column of True/False values, and Python cannot reduce that to the single truth value an if statement needs. One straightforward attempt would be to use a for loop:

def to_binary(column):
    for element in column:
        if element == 13:
            element = 1
        else:
            element = 0
    return column

However, this still does not solve the more basic problem that the function has no lasting effect: assigning to element only rebinds a local variable, so the column is returned unchanged.
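A per-column function that does return new values could be sketched as follows. It keeps the element-wise rule from the question and applies it to every value of a column through Series.map. The helper name column_to_binary and the small sample frame are made up for illustration; they are not part of the original code.

```python
import pandas as pd

def to_binary(element):
    # Element-wise rule from the question: 13 -> 1, anything else -> 0
    return 1 if element == 13 else 0

def column_to_binary(column):
    # apply() hands over a whole column (a Series);
    # map() runs the element-wise rule on each value and returns a new Series
    return column.map(to_binary)

# Hypothetical sample data standing in for the real location table
df = pd.DataFrame({"person_a": [13, 2, 13], "person_b": [1, 13, 5]})

binary_df = df.apply(column_to_binary)
print(binary_df)
```

The original df is left untouched here; the result lives in binary_df.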

An easy alternative is the pandas replace method, which lets you explicitly replace arbitrary values with other ones:

df.replace([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13], 
           [0, 0, 0, 0, 0, 0, 0, 0, 0,  0,  0,  0,  1], 
           inplace=True)
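Since the goal is only "13 or not 13", a vectorized comparison is another option worth considering. The sketch below compares the whole dataframe with 13 and casts the resulting True/False values to integers. The sample frame is hypothetical.

```python
import pandas as pd

# Hypothetical sample data standing in for the real location table
df = pd.DataFrame({"person_a": [13, 2, 13], "person_b": [1, 13, 5]})

# df == 13 gives a dataframe of booleans; astype(int) turns True/False into 1/0
binary_df = (df == 13).astype(int)
print(binary_df)
```

Unlike the replace call, this maps every value other than 13 to 0, including values outside the range 1 to 12.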

To delete the first row, you can use df = df[1:].
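As a sketch, the same row can also be removed by position with iloc or by its index label with drop. The sample frame below is hypothetical; its first row mimics a non-numeric row at '0:00:00'.

```python
import pandas as pd

# Hypothetical frame whose first row holds identifiers instead of locations
df = pd.DataFrame(
    {"person_a": ["id_a", 13, 2], "person_b": ["id_b", 1, 13]},
    index=pd.Index(["0:00:00", "0:02:00", "0:04:00"], name="timeline"),
)

# Drop by position: keep everything from the second row onward
by_position = df.iloc[1:]

# Drop by label: remove the row whose index value is '0:00:00'
by_label = df.drop(index="0:00:00")

print(by_label)
```

Both calls return a new dataframe and leave df as it was. If the columns end up with object dtype because of that first row, a conversion such as astype(int) after dropping it may be needed; whether it is depends on the CSV contents, which are not shown in the question.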
