How to expand DataFrame rows based on a column's value?

Question

I have a DataFrame like this:

import pandas as pd

df = pd.DataFrame({'Column 1': ['a', 'a', 'b', 'c'],
                   'Column 2': [2, 2, 3, 4],
                   'Column 3': [100, 110, 120, 130]})

  Column 1  Column 2  Column 3
0        a         2       100
1        a         2       110
2        b         3       120
3        c         4       130

I need a new DataFrame like this:

df = pd.DataFrame({'Column 1': ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c', 'c'],
                  'New Column': ['a1', 'a2', 'a3', 'a4', 'b1', 'b2', 'b3', 'c1', 'c2', 'c3', 'c4'],
                  'Column 3': [100, 100, 110, 110, 120, 120, 120, 130, 130, 130, 130]}
                  )

   Column 1 New Column  Column 3
0         a         a1       100
1         a         a2       100
2         a         a3       110
3         a         a4       110
4         b         b1       120
5         b         b2       120
6         b         b3       120
7         c         c1       130
8         c         c2       130
9         c         c3       130
10        c         c4       130

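I did this with 2 loops using iterrows, grouping by the "key" Column 1 to Column 3, but it takes a long time to run and is probably not the best solution, so I would like to know whether there is a better way.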

Answer 1

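Use index.repeat + loc to scale up the DataFrame based on the number in Column 2 (df.pop removes Column 2 and returns it as the repeat counts), then reset_index to convert to a unique range index. Then insert the New Column into df using groupby cumcount: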

# Scale up the DataFrame
df = df.loc[df.index.repeat(df.pop('Column 2'))].reset_index(drop=True)
# Insert new column in the correct place
df.insert(
    1, 'New Column',
    # Create New Column based on new Column 1 Values
    df['Column 1'] + df.groupby('Column 1').cumcount().add(1).astype(str)
)

df:

   Column 1 New Column  Column 3
0         a         a1       100
1         a         a2       100
2         a         a3       110
3         a         a4       110
4         b         b1       120
5         b         b2       120
6         b         b3       120
7         c         c1       130
8         c         c2       130
9         c         c3       130
10        c         c4       130
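
To make the two steps easier to follow, here is a minimal self-contained sketch that wraps the same approach in a function and prints the intermediate repeated index. The function name expand_rows and its parameters (count_col, key_col, new_col) are hypothetical names chosen for illustration and are not part of the original answer. Unlike the snippet above, it works on a copy, so the input frame keeps its Column 2.

```python
import pandas as pd


def expand_rows(df, count_col="Column 2", key_col="Column 1", new_col="New Column"):
    """Repeat each row count_col times and number the repeats within each key_col group.

    Hypothetical helper for illustration; the names are assumptions.
    """
    out = df.copy()              # work on a copy, because pop() mutates the frame
    counts = out.pop(count_col)  # remove the count column and keep its values
    # Repeat each index label counts[i] times, then select those labels with loc
    out = out.loc[out.index.repeat(counts)].reset_index(drop=True)
    # cumcount numbers the rows 0, 1, 2, ... within each key group; add 1 to start at 1
    labels = out[key_col] + out.groupby(key_col).cumcount().add(1).astype(str)
    out.insert(1, new_col, labels)
    return out


df = pd.DataFrame({'Column 1': ['a', 'a', 'b', 'c'],
                   'Column 2': [2, 2, 3, 4],
                   'Column 3': [100, 110, 120, 130]})

# Intermediate step: each index label appears as many times as its Column 2 value
repeated_index = df.index.repeat(df['Column 2'])
print(list(repeated_index))  # [0, 0, 1, 1, 2, 2, 2, 3, 3, 3, 3]

result = expand_rows(df)
print(result)
```

Because both 'a' rows carry the value 2, group 'a' ends up with four rows, which cumcount labels a1 through a4 across the two original rows.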

Solution 1
1 Accepted 2021-10-21 13:49:11
