Numbering subsequences in a Pandas DataFrame

Question

I've got a readings DataFrame that consists of two columns, experiment and value. experiment keys into an experiments DataFrame; there are 500 consecutive rows with the same experiment and different value, representing 500 readings on the same experiment, where the order in the DataFrame is the order in which the data was taken. Then come 500 rows for the next experiment, and so on.

I want to look for time-based trends in the experiments, so I assume that I want to label each point with a pos in 0-499 and then groupby('pos'). How do I create that pos column, an incrementing value that resets to 0 every time experiment changes? That is, I guess, the same as the number of rows for which experiment has been constant.

Answer 1

If I understand you correctly...

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'Experiment': [1, 1, 1, 2, 2, 2, 2, 3, 3, 3],
...                    'Value': np.random.randn(10)})
>>> df

   Experiment     Value
0           1 -0.924851
1           1 -0.599875
2           1  0.069982
3           2 -1.106909
4           2  0.463922
5           2  0.210568
6           2 -0.171456
7           3 -0.768618
8           3 -0.269928
9           3  0.055613

Use groupby followed by cumcount() to get the desired effect:

>>> df['Position'] = df.groupby('Experiment').cumcount()
>>> df

   Experiment     Value  Position
0           1 -0.924851         0
1           1 -0.599875         1
2           1  0.069982         2
3           2 -1.106909         0
4           2  0.463922         1
5           2  0.210568         2
6           2 -0.171456         3
7           3 -0.768618         0
8           3 -0.269928         1
9           3  0.055613         2
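
To connect this back to the question's goal of grouping by position, here is a minimal sketch. It assumes the question's lowercase column names (experiment, value, pos) and uses made-up data with 4 readings per experiment instead of 500. The run-based variant is not part of the answer above; it is an assumption about how one might handle an experiment id that reappears later in the frame, following the question's remark about counting the rows for which experiment has been constant.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 3 experiments with 4 readings each (the question has 500).
rng = np.random.default_rng(0)
readings = pd.DataFrame({
    "experiment": np.repeat([10, 11, 12], 4),
    "value": rng.normal(size=12),
})

# Position within each experiment, counting from 0 in row order.
readings["pos"] = readings.groupby("experiment").cumcount()

# Variant, assuming an experiment id might reappear later in the frame:
# start a new run whenever the id changes, then count within each run.
run_id = (readings["experiment"] != readings["experiment"].shift()).cumsum()
readings["pos_in_run"] = readings.groupby(run_id).cumcount()

# Average reading at each position across all experiments.
trend = readings.groupby("pos")["value"].mean()
print(trend)
```

When every experiment occupies a single contiguous block of rows, as described in the question, pos and pos_in_run are identical, so the plain groupby('experiment').cumcount() is enough.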

Answer 1 (accepted, 2017-10-09 17:33:09)
