Add the same row multiple times from a pandas dataframe to a new one, each time altering a value in a specific column
I have a df like this:
MEMBER_ID FirstName LastName I MONTH
0 1 John Doe 10 0
1 2 Mary Jones 15 0
2 3 Andy Right 8 0
I need to create a new df (df_new) in which each row of the original df (one per unique MEMBER_ID) is replicated according to the value in its 'I' column, with the 'MONTH' column filled from 0 up to and including that value of 'I'. For example: the first row (MEMBER_ID==1) has I == 10, so it has to appear 11 times (I + 1), and the only difference between the copies is the 'MONTH' column, which goes from 0 to 10. After that, the rows continue with the next unique value in the 'MEMBER_ID' column.
So I need df_new to look like this:
MEMBER_ID FirstName LastName I MONTH
0 1 John Doe 10 0
1 1 John Doe 10 1
2 1 John Doe 10 2
3 1 John Doe 10 3
...
10 1 John Doe 10 10
11 2 Mary Jones 15 0
12 2 Mary Jones 15 1
13 2 Mary Jones 15 2
...
N-1 3 Andy Right 8 7
N 3 Andy Right 8 8
I have tried this but it gives me gibberish:
df_new=pd.DataFrame(columns=['MEMBER_ID','FirstName','LastName','I','MONTH'])
for i in range(len(df)):
    max_i=df.iloc[i]["I"] #this gets the value in the "I" column
    for j in range(0,max_i+1): #to append same row max_i+1 times since I need MONTH to start with 0
        df_new.loc[i]=df.iloc[i] #this picks the whole row from the original df
        df_new["MONTH"]=j #this assigns the value of each iteration to the MONTH column
        df_new=df_new.append(df_new.loc[i],ignore_index=True)
Thank you for your help, dear community!
I was able to fix the SettingWithCopyWarning with this:
index = 0
for i in range(len(df)):
    for j in range(df.iloc[i]["I"]+1):
        row=df.iloc[i]
        df_new=df_new.append(row,ignore_index=True)
        df_new.at[index,'MONTH']=j
        index+=1
df.head()
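DataFrame.append was removed in pandas 2.0, so the loop above only runs on older versions. A minimal sketch of the same row-by-row idea that avoids append by collecting the rows in a plain list first; the sample df is rebuilt from the table in the question, and the names rows and new_row are just illustrative:

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "MEMBER_ID": [1, 2, 3],
    "FirstName": ["John", "Mary", "Andy"],
    "LastName": ["Doe", "Jones", "Right"],
    "I": [10, 15, 8],
    "MONTH": [0, 0, 0],
})

rows = []  # hypothetical helper list: one dict per output row
for _, row in df.iterrows():
    # MONTH runs from 0 through I inclusive, i.e. I + 1 copies per member.
    for month in range(int(row["I"]) + 1):
        new_row = row.to_dict()
        new_row["MONTH"] = month
        rows.append(new_row)

# Build the DataFrame once at the end instead of growing it inside the loop.
df_new = pd.DataFrame(rows, columns=df.columns)
print(df_new.head())
```

Building the frame once from a list is usually much faster than growing a DataFrame one row at a time, because each append call copied the whole frame.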
The problem is that you overwrite df_new many times. This should work (df is the old DataFrame):
df_new = pd.DataFrame()
for member in range(len(df)):  # iterate over every member
    for count in range(df.iloc[member]['I'] + 1):  # you want to add 'I'+1 rows
        row = df.iloc[member].copy()  # copy the row you want to add, so df itself is not touched
        row['MONTH'] = count  # change the month value of the row to add
        # note: DataFrame.append was removed in pandas 2.0, so this needs an older pandas
        df_new = df_new.append(row, ignore_index=True)  # add the row to the new DataFrame
df_new
Otherwise, please show what is wrong with the output.
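As an alternative to looping, here is a sketch of a vectorized version. It is not from the answer above; it assumes the original df has a unique index (the default 0, 1, 2 works), repeats each index label I + 1 times with Index.repeat, and then numbers the copies with groupby(...).cumcount(). The sample df is rebuilt from the table in the question.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "MEMBER_ID": [1, 2, 3],
    "FirstName": ["John", "Mary", "Andy"],
    "LastName": ["Doe", "Jones", "Right"],
    "I": [10, 15, 8],
    "MONTH": [0, 0, 0],
})

# Repeat every row I + 1 times (MONTH goes from 0 through I inclusive).
# Assumes df.index is unique, so each label identifies one original row.
df_new = df.loc[df.index.repeat(df["I"] + 1)].copy()

# The repeated rows still carry their original index label, so grouping on
# the index and counting within each group yields 0, 1, 2, ... per member.
df_new["MONTH"] = df_new.groupby(level=0).cumcount()

# Replace the repeated labels with a fresh 0..N-1 index.
df_new = df_new.reset_index(drop=True)
print(df_new)
```

If MEMBER_ID is guaranteed unique, grouping on "MEMBER_ID" instead of the index level should give the same numbering.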