How do you get two functions to return when using a user-defined function?

I am just starting to use user-defined functions, so forgive me if this is a simple question.

I have a few dataframes that all have a column named 'interval_time' (for example). I would like to rename this column to 'Timestamp' and then make the renamed column the index.

I know that I can do this manually:

df = df.rename(index=str, columns={'interval_time': 'Timestamp'})
df = df.set_index('Timestamp')

But now I would like to define a function called rename that does this for me. I have seen that this works:

def rename_col(data, col_in='tempus_interval_time', col_out='Timestamp'):
    return data.rename(index=str, columns={col_in: col_out}, inplace=True)

But when I add the second call to the same function, it does not seem to do anything. If I define the second part as its own function and run it, it does seem to work.

This is what I am trying:

def rename_n_index(data, col_in='tempus_interval_time', col_out='Timestamp'):
    return data.rename(index=str, columns={col_in: col_out}, inplace=True)
    return data.set_index('Timestamp', inplace=True)

The dataframes that I am using have the following form:

df_scada
              interval_time                 A         ...             X                 Y 
0       2010-11-01 00:00:00                0.0        ...                396.36710         381.68860
1       2010-11-01 00:05:00                0.0        ...                392.97974         381.40634
2       2010-11-01 00:10:00                0.0        ...                390.15695         379.99493
3       2010-11-01 00:15:00                0.0        ...                389.02786         379.14810

A function stops at the first return statement it executes, so the second line in your function never runs. You don't need to return anything, because your operations are done in place. You can make the in-place changes in your function:

def rename_n_index(data, col_in='tempus_interval_time', col_out='Timestamp'):
    data.rename(index=str, columns={col_in: col_out}, inplace=True)
    data.set_index('Timestamp', inplace=True)

Any other reference to the dataframe you pass into the function will then see the changes:

>>> import pandas as pd
>>> df = pd.DataFrame({'interval_time': pd.to_datetime(['2010-11-01 00:00:00', '2010-11-01 00:05:00', '2010-11-01 00:10:00', '2010-11-01 00:15:00']),
...     'A': [0.0] * 4}, index=range(4))
>>> df
     A       interval_time
0  0.0 2010-11-01 00:00:00
1  0.0 2010-11-01 00:05:00
2  0.0 2010-11-01 00:10:00
3  0.0 2010-11-01 00:15:00
>>> def rename_n_index(data, col_in='tempus_interval_time', col_out='Timestamp'):
...     data.rename(index=str, columns={col_in: col_out}, inplace=True)
...     data.set_index('Timestamp', inplace=True)
...
>>> rename_n_index(df, 'interval_time')
>>> df
                       A
Timestamp
2010-11-01 00:00:00  0.0
2010-11-01 00:05:00  0.0
2010-11-01 00:10:00  0.0
2010-11-01 00:15:00  0.0

In the above example, the df reference to the dataframe shows the changes made by the function.
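To see why the two-return version from the question appears to do only half the job, here is a minimal sketch. The name two_returns and the two-row sample frame are made up for illustration, and index=str is left out to keep the example short. The first return exits the function, so set_index() is never called; in the pandas versions I am aware of (1.x and 2.x), the in-place rename() also returns None, which is what the caller receives.

```python
import pandas as pd

def two_returns(data, col_in='interval_time', col_out='Timestamp'):
    # rename() modifies `data` in place, then the function exits here.
    return data.rename(columns={col_in: col_out}, inplace=True)
    # Unreachable: execution never gets past the first return.
    return data.set_index(col_out, inplace=True)

df = pd.DataFrame({
    'interval_time': pd.to_datetime(['2010-11-01 00:00:00',
                                     '2010-11-01 00:05:00']),
    'A': [0.0, 0.0],
})

result = two_returns(df)
print(result)            # return value of the in-place rename()
print(list(df.columns))  # the column has been renamed
print(df.index.name)     # but the index was never set
```

The column ends up renamed, yet the index is untouched, which matches the behaviour described in the question.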

If you remove the inplace=True arguments, the method calls return a new dataframe object. You can store the intermediate result in a local variable, then apply the second method to the dataframe referenced by that variable:

def rename_n_index(data, col_in='tempus_interval_time', col_out='Timestamp'):
    renamed = data.rename(index=str, columns={col_in: col_out})
    return renamed.set_index('Timestamp')

Or you can chain the method calls directly on the returned object:

def rename_n_index(data, col_in='tempus_interval_time', col_out='Timestamp'):
    return data.rename(index=str, columns={col_in: col_out})\
               .set_index('Timestamp')

Because renamed (in the local-variable version) is already a new dataframe, you can also apply the set_index() call in place to that object and then return renamed:

def rename_n_index(data, col_in='tempus_interval_time', col_out='Timestamp'):
    renamed = data.rename(index=str, columns={col_in: col_out})
    renamed.set_index('Timestamp', inplace=True)
    return renamed

Either way, this returns a new dataframe object and leaves the original dataframe unchanged:

>>> def rename_n_index(data, col_in='tempus_interval_time', col_out='Timestamp'):
...     renamed = data.rename(index=str, columns={col_in: col_out})
...     return renamed.set_index('Timestamp')
...
>>> df = pd.DataFrame({'interval_time': pd.to_datetime(['2010-11-01 00:00:00', '2010-11-01 00:05:00', '2010-11-01 00:10:00', '2010-11-01 00:15:00']),
...     'A': [0.0] * 4}, index=range(4))
>>> rename_n_index(df, 'interval_time')
                       A
Timestamp
2010-11-01 00:00:00  0.0
2010-11-01 00:05:00  0.0
2010-11-01 00:10:00  0.0
2010-11-01 00:15:00  0.0
>>> df
     A       interval_time
0  0.0 2010-11-01 00:00:00
1  0.0 2010-11-01 00:05:00
2  0.0 2010-11-01 00:10:00
3  0.0 2010-11-01 00:15:00
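One detail worth noting: the functions above accept a col_out parameter but pass the literal 'Timestamp' to set_index(), so a caller who picks a different output name would get a KeyError. A possible variant, sketched here as a suggestion rather than as part of the original answer, passes col_out to both calls. The output name 'When' is made up for the demonstration, and index=str is omitted for brevity.

```python
import pandas as pd

def rename_n_index(data, col_in='tempus_interval_time', col_out='Timestamp'):
    # Use col_out for both steps so the two names cannot drift apart.
    return data.rename(columns={col_in: col_out}).set_index(col_out)

df = pd.DataFrame({
    'interval_time': pd.to_datetime(['2010-11-01 00:00:00',
                                     '2010-11-01 00:05:00']),
    'A': [0.0, 0.0],
})

# 'When' is a hypothetical output name chosen only for this demonstration.
indexed = rename_n_index(df, 'interval_time', col_out='When')
print(indexed.index.name)
print(list(df.columns))  # the original dataframe is left unchanged
```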

See @MartijnPieters' explanation for resolving the errors in your code.

However, note that the "Pandorable" approach is to use method chaining. Some find it aesthetically pleasing to see the method names visually aligned. Here's an example:

def rename_n_index(data, col_in='tempus_interval_time', col_out='Timestamp'):

    renamed = data.rename(index=str, columns={col_in: col_out})\
                  .set_index('Timestamp')

    return renamed

Then, to apply this to a sequence of dataframes as in your previous question:

dfs = [df.pipe(rename_n_index) for df in (df1, df2, df3)]
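As a self-contained sketch of the same idea: df1, df2 and df3 are not shown in the question, so the make_frame helper and the frames dictionary below are stand-ins invented for illustration. DataFrame.pipe passes the dataframe as the first argument and forwards any extra keyword arguments, which is how col_in is supplied here.

```python
import pandas as pd

def rename_n_index(data, col_in='tempus_interval_time', col_out='Timestamp'):
    renamed = data.rename(index=str, columns={col_in: col_out})\
                  .set_index('Timestamp')
    return renamed

def make_frame(values):
    # Hypothetical helper that builds a small frame shaped like the question's data.
    times = pd.date_range('2010-11-01', periods=len(values), freq='5min')
    return pd.DataFrame({'interval_time': times, 'A': values})

# Stand-ins for df1, df2, df3 from the question.
frames = {'df1': make_frame([0.0, 1.0]), 'df2': make_frame([2.0, 3.0])}

# pipe() calls rename_n_index(frame, col_in='interval_time') for each frame.
indexed = {name: frame.pipe(rename_n_index, col_in='interval_time')
           for name, frame in frames.items()}

for name, frame in indexed.items():
    print(name, frame.index.name, list(frame.columns))
```

Keeping the results in a dictionary keyed by name, rather than a list, is just one option; it makes it easier to tell the frames apart afterwards.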
