Repeating rows in a Pandas DataFrame, but with a different ID

Question

I have a pandas DataFrame that looks like this:

id    c1    c2    c3
100    2    7     4
100    3    4     1 
100    4    0     10
105    2    3     4
105    3    6     8
105    4    9     2
115    2    1     0
115    3    7     14
115    4    0     20

Now I want to repeat the rows of this DataFrame, but with new_id = id + 10. If this new_id already exists in the original DataFrame, then new_id = new_id (the repeated one) + 10.

Sample:

id    c1    c2    c3
100    2    7     4
100    3    4     1 
100    4    0     10
105    2    3     4
105    3    6     8
105    4    9     2    
115    2    1     0
115    3    7     14
115    4    0     20
## Repeated data
110    2    7     4
110    3    4     1 
110    4    0     10
##Since 115 already exists it shall now be 125, if 125 exists it shall be 135
125    2    3     4
125    3    6     8
125    4    9     2 
.
.
.

Answer 1

If I have understood your question correctly, take a look at this.

import pandas as pd

d = {'id': [100, 100, 100, 105, 105, 105, 115, 115, 115],
     'c1': [2, 3, 4, 2, 3, 4, 2, 3, 4],
     'c2': [7, 4, 0, 3, 6, 9, 1, 7, 0],
     'c3': [4, 1, 10, 4, 8, 2, 0, 14, 20]}

df = pd.DataFrame(data=d)


def IDcheck(uniqueID, ID):
    # Keep increasing the ID by 10 until it is not one of the original IDs
    while True:
        ID += 10
        if ID not in uniqueID:
            return ID


def updateRow(df):
    # Selecting unique values from the 'id' column
    uniqueID = df['id'].unique().tolist()

    # Start with the original frame, then collect one relabelled copy per ID
    frames = [df]
    for ID in uniqueID:
        # Copy all rows with the same 'id' so the original is left untouched
        temp = df.loc[df['id'] == ID].copy()

        # Updating the IDs in temp to the new_id value
        temp['id'] = IDcheck(uniqueID, ID)
        frames.append(temp)

    # Unsorted (pd.concat is used because DataFrame.append was removed in pandas 2.0)
    return pd.concat(frames, ignore_index=True)

    # For a sorted result, return this instead:
    # return pd.concat(frames, ignore_index=True).sort_values(by=['id'])


updateRow(df)
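
As a variation on the loop above, here is a minimal sketch that builds an old-to-new ID lookup once and applies it with Series.map. The names next_free_id and repeat_with_new_ids are hypothetical helpers made up for this illustration, and the sketch assumes (as the question states) that a new ID only has to avoid the IDs of the original DataFrame, so two groups may still end up with the same new ID.

```python
import pandas as pd


def next_free_id(existing_ids, start, step=10):
    """Return the first value start + k*step (k >= 1) that is not in existing_ids."""
    candidate = start + step
    while candidate in existing_ids:
        candidate += step
    return candidate


def repeat_with_new_ids(df, id_col="id", step=10):
    """Append a relabelled copy of df; new IDs avoid the original IDs only."""
    existing_ids = set(df[id_col])
    # One lookup per distinct original ID
    mapping = {old: next_free_id(existing_ids, old, step) for old in existing_ids}
    repeated = df.assign(**{id_col: df[id_col].map(mapping)})
    return pd.concat([df, repeated], ignore_index=True)


df = pd.DataFrame({
    "id": [100, 100, 100, 105, 105, 105, 115, 115, 115],
    "c1": [2, 3, 4, 2, 3, 4, 2, 3, 4],
    "c2": [7, 4, 0, 3, 6, 9, 1, 7, 0],
    "c3": [4, 1, 10, 4, 8, 2, 0, 14, 20],
})
result = repeat_with_new_ids(df)
print(result["id"].tolist())
```

With the sample data, 100 maps to 110 while both 105 and 115 map to 125, because 125 is not among the original IDs.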

Answer 2

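You can first add 10 to the id column and, where the resulting ID already exists, add another 10. Note that this resolves only one level of collision: the second candidate is not checked against the existing IDs again.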

(
    df.assign(id=df.id.add(10).add(df.id.add(10).isin(df.id).mul(10)))
    .pipe(lambda x: pd.concat([df, x]))
)

    id  c1  c2  c3
0  100   2   7   4
1  100   3   4   1
2  100   4   0  10
3  105   2   3   4
4  105   3   6   8
5  105   4   9   2
6  115   2   1   0
7  115   3   7  14
8  115   4   0  20
0  110   2   7   4
1  110   3   4   1
2  110   4   0  10
3  125   2   3   4
4  125   3   6   8
5  125   4   9   2
6  125   2   1   0
7  125   3   7  14
8  125   4   0  20
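
If chained collisions are possible (for example 100, 110 and 120 all present), the one-shot expression is not enough, since 100 would become 120, which already exists. A rough sketch of one way to extend the same vectorized idea is to keep bumping only the rows that still collide. The small DataFrame below is made up for illustration, and the check is still only against the original IDs, as in the question.

```python
import pandas as pd

# Illustrative data with a chain of existing IDs: 100 -> 110 -> 120
df = pd.DataFrame({
    "id": [100, 100, 110, 110, 120, 120],
    "c1": [1, 2, 3, 4, 5, 6],
})

new_id = df["id"] + 10
# Keep adding 10 only to the rows whose candidate is still an original ID
collides = new_id.isin(df["id"])
while collides.any():
    new_id = new_id.where(~collides, new_id + 10)
    collides = new_id.isin(df["id"])

result = pd.concat([df, df.assign(id=new_id)], ignore_index=True)
print(result["id"].tolist())
```

The loop stops once no candidate matches an original ID, which must happen eventually because the set of original IDs is finite.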
