Increment number in this list comprehension
Consider the following DataFrame:
ID Name Week Course Hours
0 1 John A 1922 Bike Tech 5.5
1 2 John B 1922 Auto Tech 3.2
2 3 John C 1922 Prison 3.5
3 4 John D 1922 Comp 6.5
4 5 John E 1922 Awareness 7.0
5 6 John F 1922 First Aid 7.2
6 7 John G 1922 BasketBall 2.5
7 8 John H 1922 Tech 5.4
I'm using the following code to duplicate rows:
duplicate = [3 if val == 'Prison' else 1 for val in df.Course]
which is great, but I need to increment the Week number for every duplication, so John C would have 3 rows with Weeks 1922, 1923 and 1924.
I've tried
[3 if val == 'Prison' and df.Week +1 else 1 for val in df.Course]
and a few other basic chains, but I can't figure this out. The desired output is:
ID Name Week Course Hours
0 1 John A 1922 Bike Tech 5.5
1 2 John B 1922 Auto Tech 3.2
2 3 John C 1922 Prison 3.5
2 3 John C 1923 Prison 3.5
2 3 John C 1924 Prison 3.5
3 4 John D 1922 Comp 6.5
4 5 John E 1922 Awareness 7.0
5 6 John F 1922 First Aid 7.2
6 7 John G 1922 BasketBall 2.5
7 8 John H 1922 Tech 5.4
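A note on why the attempted comprehension cannot work, as a minimal sketch on a cut-down, hypothetical version of the frame: `df.Week + 1` is a whole Series, and using a Series as a condition in `if ... else` raises a `ValueError` as soon as a 'Prison' row is reached. A list of repeat counts also has no place to carry the new Week values, so the increment has to happen after the rows are repeated.

```python
import pandas as pd

# Cut-down stand-in for the question's frame (assumed data, two columns only).
df = pd.DataFrame({
    "Course": ["Bike Tech", "Prison", "Comp"],
    "Week": [1922, 1922, 1922],
})

error_message = None
try:
    # `val == "Prison" and df.Week + 1` evaluates to a Series for the Prison
    # row, and the conditional expression then asks for that Series' truth value.
    counts = [3 if val == "Prison" and df.Week + 1 else 1 for val in df.Course]
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```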
If I understand correctly, you could create a helper DataFrame of the rows you want duplicated, increment the Week number on that helper DataFrame, and then concatenate it to your original:
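import pandas as pd
# Two extra copies of the 'Prison' row
helper = pd.concat([df.loc[df.Course == 'Prison']] * 2)
# Offset the copies by 1, 2, ... weeks
helper['Week'] += helper.reset_index().index + 1
df = pd.concat((df, helper)).sort_values(['ID', 'Week'])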
>>> df
ID Name Week Course Hours
0 1 John A 1922 Bike Tech 5.5
1 2 John B 1922 Auto Tech 3.2
2 3 John C 1922 Prison 3.5
2 3 John C 1923 Prison 3.5
2 3 John C 1924 Prison 3.5
3 4 John D 1922 Comp 6.5
4 5 John E 1922 Awareness 7.0
5 6 John F 1922 First Aid 7.2
6 7 John G 1922 BasketBall 2.5
7 8 John H 1922 Tech 5.4
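The helper approach above appears to rely on a single matching row, because the offsets run 1, 2, ... across the whole helper frame. If several rows could match, one possible alternative (a sketch, not the answerer's method) is to reuse the repeat counts from the question with `Index.repeat` and number the copies of each original row with `groupby(level=0).cumcount()`. The small frame and the name `expanded` are assumptions made for illustration.

```python
import pandas as pd

# Shortened version of the question's frame (assumed sample data).
df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Name": ["John A", "John B", "John C", "John D"],
    "Week": [1922, 1922, 1922, 1922],
    "Course": ["Bike Tech", "Auto Tech", "Prison", "Comp"],
    "Hours": [5.5, 3.2, 3.5, 6.5],
})

# Repeat counts, exactly as in the question's list comprehension.
counts = [3 if val == "Prison" else 1 for val in df.Course]

# Repeat each index label the requested number of times and select those rows.
expanded = df.loc[df.index.repeat(counts)].copy()

# cumcount() numbers the copies of each original row 0, 1, 2, ...;
# to_numpy() adds the offsets positionally, avoiding index alignment.
expanded["Week"] += expanded.groupby(level=0).cumcount().to_numpy()

print(expanded)
```

Because the offsets restart at 0 for every original index label, each repeated row gets its own 0, 1, 2 sequence, and rows with a count of 1 are left unchanged.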
You can pass a row, which is a pd.Series with values compatible with your df, to a small function that appends the copies. For example, take
>>> row = df.loc[df.Course.eq('Prison'), :].iloc[0, :].copy()
>>> row
ID 3
Name John C
Week 1922
Course Prison
Hours 3.5
Name: 2, dtype: object
Then
def duplicate(n, row, df):
    week = row['Week']
    for i in range(1, n+1):
        # Bump the week, then append the row under a new negative label
        row['Week'] = week + i
        df.loc[-i, :] = row
    return df.sort_values('ID').reset_index(drop=True)
>>> duplicate(3, row, df)
ID Name Week Course Hours
0 1.0 John A 1922.0 Bike Tech 5.5
1 2.0 John B 1922.0 Auto Tech 3.2
2 3.0 John C 1922.0 Prison 3.5
3 3.0 John C 1923.0 Prison 3.5
4 3.0 John C 1924.0 Prison 3.5
5 3.0 John C 1925.0 Prison 3.5
6 4.0 John D 1922.0 Comp 6.5
7 5.0 John E 1922.0 Awareness 7.0
8 6.0 John F 1922.0 First Aid 7.2
9 7.0 John G 1922.0 BasketBall 2.5
10 8.0 John H 1922.0 Tech 5.4
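Two things are visible in the output above: the integer columns have become floats, and `n=3` adds three extra rows (Weeks 1923 to 1925), one more than the question asked for. The function also modifies `row` and `df` in place. A hedged variant that avoids these side effects might look like the sketch below; `duplicate_row` and the sample frame are hypothetical names and data, and `n=2` is used to get three Prison rows in total.

```python
import pandas as pd

def duplicate_row(n, row, df):
    """Return a new frame with n extra copies of `row`, Week advanced by 1..n."""
    # Build the copies in a separate frame and restore the original column
    # dtypes (a row Series taken from a mixed-type frame has dtype object).
    copies = pd.DataFrame([row] * n).astype(df.dtypes.to_dict())
    copies["Week"] = [row["Week"] + i for i in range(1, n + 1)]
    # Neither `row` nor `df` is modified; the result is a new frame.
    combined = pd.concat([df, copies])
    return combined.sort_values(["ID", "Week"]).reset_index(drop=True)

# Shortened version of the question's frame (assumed sample data).
df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Name": ["John A", "John B", "John C", "John D"],
    "Week": [1922, 1922, 1922, 1922],
    "Course": ["Bike Tech", "Auto Tech", "Prison", "Comp"],
    "Hours": [5.5, 3.2, 3.5, 6.5],
})

row = df.loc[df.Course.eq("Prison")].iloc[0]
result = duplicate_row(2, row, df)
print(result)
```

Sorting by both `ID` and `Week` keeps the copies in week order without depending on how ties are broken in the sort.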