
What does .transform('first') do?

Somebody helped me with some code. I understood everything in it except the very last line, .transform('first'). I can see what it does from the output, but I'd like to know precisely what it is doing behind the scenes to obtain this result.

This is the part of the code I understand:

df['Date'] = pd.to_datetime(df['Date'])
df['YEP'] = ( df[::-1].loc[df['Type'].eq('Budget')]
                     .groupby(df['Date'].dt.year)
                     .Value
                     .cumsum()
                     .sub(df['Value'])
                     .add(df['YTD'])
)

This is the output of this first part:

    Value    Type       Date    YTD     YEP
0     100  Budget 2019-01-01  101.0   974.0
1      50  Budget 2019-02-01  199.0  1022.0
2      20  Budget 2019-03-01  275.0  1078.0
3     123  Budget 2019-04-01  332.0  1012.0
4      56  Budget 2019-05-01    NaN     NaN
5      76  Budget 2019-06-01    NaN     NaN
6      98  Budget 2019-07-01    NaN     NaN
7     126  Budget 2019-08-01    NaN     NaN
8      90  Budget 2019-09-01    NaN     NaN
9      80  Budget 2019-10-01    NaN     NaN
10     67  Budget 2019-11-01    NaN     NaN
11     87  Budget 2019-12-01    NaN     NaN
12    101  Actual 2019-01-01  101.0     NaN
13     98  Actual 2019-02-01  199.0     NaN
14     76  Actual 2019-03-01  275.0     NaN
15     57  Actual 2019-04-01  332.0     NaN

This is the entire code:

df['Date'] = pd.to_datetime(df['Date'])
df['YEP'] = ( df[::-1].loc[df['Type'].eq('Budget')]
                     .groupby(df['Date'].dt.year)
                     .Value
                     .cumsum()
                     .sub(df['Value'])
                     .add(df['YTD'])
                     .groupby(df['Date'])
                     .transform('first') )

I got this after running the entire code:

    Value    Type       Date    YTD     YEP
0     100  Budget 2019-01-01  101.0   974.0
1      50  Budget 2019-02-01  199.0  1022.0
2      20  Budget 2019-03-01  275.0  1078.0
3     123  Budget 2019-04-01  332.0  1012.0
4      56  Budget 2019-05-01    NaN     NaN
5      76  Budget 2019-06-01    NaN     NaN
6      98  Budget 2019-07-01    NaN     NaN
7     126  Budget 2019-08-01    NaN     NaN
8      90  Budget 2019-09-01    NaN     NaN
9      80  Budget 2019-10-01    NaN     NaN
10     67  Budget 2019-11-01    NaN     NaN
11     87  Budget 2019-12-01    NaN     NaN
12    101  Actual 2019-01-01  101.0   974.0
13     98  Actual 2019-02-01  199.0  1022.0
14     76  Actual 2019-03-01  275.0  1078.0
15     57  Actual 2019-04-01  332.0  1012.0

I know that "transform" is like "apply". But I don't get what it means to apply, or transform, with the parameter 'first'. What does 'first' do here when combined with transform?

Thank you

  1. What does 'first' mean?

    The parameter of the .transform() method may be a NumPy function, a string function name, or a user-defined function. This means that in the line

    .transform('first')

    it's a string function name, so it represents the function first().


  1. Where does the function first() come from?

    It's the GroupBy method .first().


  1. What does the function first() return?

    It returns the first non-NaN value in a series, or NaN if there is none.
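A minimal sketch of this NaN-skipping behavior, using a made-up frame (the column names `key` and `val` are assumptions for illustration, not from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical data: group "x" starts with NaN, group "y" is all NaN.
df = pd.DataFrame({
    "key": ["x", "x", "x", "y", "y"],
    "val": [np.nan, 10.0, 20.0, np.nan, np.nan],
})

# GroupBy.first() skips NaN within each group by default.
firsts = df.groupby("key")["val"].first()
print(firsts)
```

Group "x" yields 10.0 (the leading NaN is skipped), while group "y" has no non-NaN value and so yields NaN.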


  1. What does the method .transform() do?

    It applies its parameter function to every column (i.e. every series) of each group of the dataframe to obtain a new (transformed) column. Then it returns a dataframe consisting of these transformed columns.

    In the case of a series it returns, of course, a transformed series.


  1. Does this mean that the function passed to .transform() must return a series of the same size?

    No, that is only one possibility.
    The other is a scalar: it will be broadcast (repeated) to make a series of the same size.

    The function used here (the GroupBy method first()) is a good example of such a function.
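To make the broadcasting concrete, here is a small sketch on invented data (the names `key` and `val` are assumptions) that compares aggregating with first() against transforming with 'first', and also passes a user-defined function that returns a scalar:

```python
import pandas as pd

# Hypothetical data with two groups of different sizes.
df = pd.DataFrame({
    "key": ["a", "a", "b", "b", "b"],
    "val": [1.0, 2.0, 3.0, 4.0, 5.0],
})
grouped = df.groupby("key")["val"]

# Aggregation: one value per group.
aggregated = grouped.first()

# Transformation: the per-group scalar is repeated for every row of the group.
broadcast = grouped.transform("first")

# A user-defined function returning a scalar is broadcast the same way.
custom = grouped.transform(lambda s: s.max())

print(aggregated)
print(broadcast)
print(custom)
```

The aggregated result has one row per group, whereas both transformed results keep the original five-row index.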


  1. So what does .transform('first') return?

    It returns a series / dataframe with the same shape as the source group chunk, in which all values in every individual column are replaced with the first non-NaN value in that column, or with NaN if there is none.
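As a sketch of the dataframe case, the following uses invented columns `p` and `q` (assumptions, not from the question) to show that each column is handled independently within each group:

```python
import numpy as np
import pandas as pd

# Hypothetical data: in group "a" the first non-NaN value of p and of q
# sit in different rows; in group "b" column q has no non-NaN value at all.
df = pd.DataFrame({
    "key": ["a", "a", "b", "b"],
    "p": [np.nan, 1.0, 2.0, 3.0],
    "q": [5.0, np.nan, np.nan, np.nan],
})

# The result keeps the original row count; the grouping column is not included.
out = df.groupby("key").transform("first")
print(out)
```

Here `p` becomes 1.0 for both rows of group "a" and 2.0 for group "b", while `q` becomes 5.0 for group "a" and stays NaN for group "b".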


Conclusion:

The lines

                 .groupby(df['Date'])
                 .transform('first') 

first split your (intermediate) series into groups by individual date and then, just before recombination, apply the first() function to every series in every group.

This effectively replaces every value in every group with the first non-NaN value in its series, if such a value exists.

This means that in the resulting series (your new column), every value of the intermediate series is replaced with the first non-NaN value for the same date (if such a value exists for that date).
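A reduced sketch of this last step, assuming a four-row version of the question's data (only January and February, with the intermediate YEP values 974.0 and 1022.0 copied from the output shown in the question):

```python
import numpy as np
import pandas as pd

# Cut-down stand-in for the question's frame: two Budget rows, two Actual rows.
df = pd.DataFrame({
    "Type": ["Budget", "Budget", "Actual", "Actual"],
    "Date": pd.to_datetime(
        ["2019-01-01", "2019-02-01", "2019-01-01", "2019-02-01"]
    ),
})

# Intermediate series before the final groupby: Actual rows are still NaN.
intermediate = pd.Series([974.0, 1022.0, np.nan, np.nan], index=df.index)

# Group by exact date, then broadcast the first non-NaN value of each group.
df["YEP"] = intermediate.groupby(df["Date"]).transform("first")
print(df)
```

Each 'Actual' row ends up with the value of the 'Budget' row that shares its date, which matches the pattern in the question's final output.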

After grouping by date ( groupby(df['Date']) ), each value is changed to the first non-NaN value among the rows sharing that date, which here is the value of the row where the date first appears. This changes the YEP values of the 'Actual' rows, which were NaN, to the values from the 'Budget' rows with the same date.
