How do I create a Pandas dataframe column by looping over two other columns?

Question

I have a Pandas dataframe and am at a bit of a loss as to how to do what I am hoping to. This is a snippet of the dataframe, and I am uploading a screenshot as well. Effectively, I would like to create a new column defined as the number of pitches where count is '3--2'.

To do this, I would like to loop through all rows. For a given row (which I'll refer to as the original row), if prev_count == '3--2', I then want to:

1) step down the dataframe rows to where prev_count != '3--2'
2) confirm that row has the same batter-pitcher identifier as the original row
3) once in a row that satisfies both conditions, prev_count != '3--2' AND batter-pitcher (original row) == batter-pitcher (new row), extract pitch_number of the new row
4) calculate a value for the new column in the original row using the formula:

pitch_number (original row) + 1 - pitch_number (new row)

To use the existing dataframe as an example: indices 62, 4, 186, 87, and 252 would have a value of 1 for the new column. Index 171 would have a value of 3; 177 a value of 2; and 192 a value of 1. Likewise, 191 would have a value of 5; 229, 10, and 57 would also have values of 1 for this new column.

           player_name   batter-pitcher  pitch_number count prev_count
62   Graveman, Kendall  501303---608665             6  3--2       3--1
4          Smyly, Drew  608665---592767             6  3--2       2--2
186  Graveman, Kendall  592696---608665             8  3--2       2--2
87         Maton, Phil  621020---664208             6  3--2       3--1
252      Martin, Chris  514888---455119             6  3--2       2--2
171      Urquidy, José  624585---664353             8  3--2       3--2
177      Urquidy, José  624585---664353             7  3--2       3--2
192      Urquidy, José  624585---664353             6  3--2       3--1
191       García, Yimi  594807---554340            12  3--2       3--2
198       García, Yimi  594807---554340            11  3--2       3--2
209       García, Yimi  594807---554340            10  3--2       3--2
219       García, Yimi  594807---554340             9  3--2       3--2
229       García, Yimi  594807---554340             8  3--2       2--2
10     Valdez, Framber  592696---664285             6  3--2       2--2
57     Valdez, Framber  518692---664285             6  3--2       2--2

I am a bit at a loss as to how to 1) loop through the rows of a dataframe, 2) within each iteration of the loop, step down rows, and 3) reference other columns of the dataframe in another row, so I would really appreciate some guidance here. Thanks so much!
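For reference, the three steps asked about above (loop over rows, step down, read other columns from another row) can be written out literally. This is only a rough sketch, not code from the original post: the function name `pitches_at_full_count` and the column name `pitches_3_2` are made up for illustration, the dataframe is a subset of the rows shown above, and it assumes the rows are already ordered as in the question (latest pitch first within each matchup).

```python
import pandas as pd

# Subset of the question's rows, with the original index values kept.
df = pd.DataFrame(
    {
        "batter-pitcher": ["624585---664353"] * 3
        + ["594807---554340"] * 5
        + ["501303---608665"],
        "pitch_number": [8, 7, 6, 12, 11, 10, 9, 8, 6],
        "prev_count": ["3--2", "3--2", "3--1",
                       "3--2", "3--2", "3--2", "3--2", "2--2",
                       "3--1"],
    },
    index=[171, 177, 192, 191, 198, 209, 219, 229, 62],
)


def pitches_at_full_count(frame):
    """Hypothetical helper: apply the question's steps row by row."""
    prev = frame["prev_count"].to_numpy()
    matchup = frame["batter-pitcher"].to_numpy()
    pitch = frame["pitch_number"].to_numpy()

    result = []
    for i in range(len(frame)):
        # Step down from the original row until prev_count != '3--2'.
        j = i
        while j < len(frame) and prev[j] == "3--2":
            j += 1
        # Only use the new row if it exists and is the same matchup.
        if j < len(frame) and matchup[j] == matchup[i]:
            result.append(pitch[i] + 1 - pitch[j])
        else:
            result.append(None)
    return pd.Series(result, index=frame.index)


df["pitches_3_2"] = pitches_at_full_count(df)
print(df)
```

On these rows, index 171 gets 3, 177 gets 2, and 191 gets 5, which matches the values listed in the question. An explicit Python loop like this is slow on large dataframes, so it is mainly useful for checking a vectorized version.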

Answer 1

For your given dataset, I think this works. But it assumes your pitch numbers always increment by one and you're not missing any data; otherwise this won't work. I'd suggest looking into cumcount(), cummax(), and cummin(), grouping on batter-pitcher.

Column 'new1' is the final answer; column 'new' is just an intermediate step.

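import numpy as np

# get dataframe into right order: one matchup at a time, latest pitch first
df.sort_values(by=['batter-pitcher', 'pitch_number'], ascending=[True, False], inplace=True)

# intermediate step: within each (batter-pitcher, prev_count) group,
# number the rows counting up from the last row of the group
df['new'] = df.groupby(['batter-pitcher', 'prev_count'])['count'].cumcount(ascending=False) + 1

# final answer: 1 where new == 1 and prev_count is not '3--2', otherwise new + 1
df['new1'] = np.where((df['new']==1) & (df['prev_count']!='3--2'), 1, df['new']+1)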

           player_name   batter-pitcher  pitch_number count prev_count  new  new1
62   Graveman, Kendall  501303---608665             6  3--2       3--1    1     1
4          Smyly, Drew  608665---592767             6  3--2       2--2    1     1
186  Graveman, Kendall  592696---608665             8  3--2       2--2    1     1
87         Maton, Phil  621020---664208             6  3--2       3--1    1     1
252      Martin, Chris  514888---455119             6  3--2       2--2    1     1
171      Urquidy, José  624585---664353             8  3--2       3--2    2     3
177      Urquidy, José  624585---664353             7  3--2       3--2    1     2
192      Urquidy, José  624585---664353             6  3--2       3--1    1     1
191       García, Yimi  594807---554340            12  3--2       3--2    4     5
198       García, Yimi  594807---554340            11  3--2       3--2    3     4
209       García, Yimi  594807---554340            10  3--2       3--2    2     3
219       García, Yimi  594807---554340             9  3--2       3--2    1     2
229       García, Yimi  594807---554340             8  3--2       2--2    1     1
10     Valdez, Framber  592696---664285             6  3--2       2--2    1     1
57     Valdez, Framber  518692---664285             6  3--2       2--2    1     1
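
If the pitch numbers might have gaps, one possible variant is to apply the question's formula to pitch_number directly instead of counting rows. The following is a sketch under assumptions, not part of the original answer: the column name `pitches_3_2` is made up, the data is a subset of the rows above, and it assumes each batter-pitcher value covers a single plate appearance (otherwise a game or at-bat identifier would have to be added to the grouping).

```python
import pandas as pd

# Subset of the question's rows, with the original index values kept.
df = pd.DataFrame(
    {
        "batter-pitcher": ["624585---664353"] * 3
        + ["594807---554340"] * 5
        + ["501303---608665"],
        "pitch_number": [8, 7, 6, 12, 11, 10, 9, 8, 6],
        "prev_count": ["3--2", "3--2", "3--1",
                       "3--2", "3--2", "3--2", "3--2", "2--2",
                       "3--1"],
    },
    index=[171, 177, 192, 191, 198, 209, 219, 229, 62],
)

# Ascending pitch order, so the pitch that reached 3--2 comes first in each matchup.
ordered = df.sort_values(["batter-pitcher", "pitch_number"])

# pitch_number of the rows that entered the full count; NaN for rows already in it.
entry = ordered["pitch_number"].where(ordered["prev_count"] != "3--2")

# Carry that entry pitch forward to the later rows of the same matchup.
entry = entry.groupby(ordered["batter-pitcher"]).ffill()

# Formula from the question; the assignment aligns on the index,
# so df keeps its original row order.
df["pitches_3_2"] = ordered["pitch_number"] + 1 - entry
print(df)
```

The values come back as floats because of the intermediate NaNs. A row whose matchup has no earlier row with prev_count != '3--2' stays NaN rather than getting a guessed value.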

Answer 1
0 2021-11-12 02:49:55