Merge two rows pandas dataframe
Here is one way to solve your problem:
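# Copy each Power State / Recommended Solution value down onto the row below it
df[['State_new', 'Solution_new']] = df[['Power State', 'Recommended Solution']].shift()
# Rows that received a shifted value
mask = ~df['State_new'].isna()
df.loc[mask, 'State'] = df.loc[mask, 'State_new']
df.loc[mask, 'Recommended Solutuin'] = df.loc[mask, 'Solution_new']
# Drop the helper and source columns, discard rows with no State, and renumber the index
df = df.drop(columns=['State_new', 'Solution_new', 'Power State', 'Recommended Solution'])[~df['State'].isna()].reset_index(drop=True)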
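Explanation:
shift() creates two helper columns, State_new and Solution_new, holding the Power State and Recommended Solution values moved down one row.
mask selects the rows where a shifted value landed.
For those rows, the contents of the State and Recommended Solutuin columns (note: these are the original column labels, used verbatim from the OP's question) are overwritten with the updated data contained in the shifted columns.
drop() removes the helper and source columns, and rows whose State is still NaN are filtered out.
reset_index creates a new integer range index without gaps.
In case it helps, here is sample code for pulling the dataframe in from Excel: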
import pandas as pd
df = pd.read_excel('TestBook.xlsx', sheet_name='TestSheet', usecols='AD:AM')
Here is the input dataframe:
         MAC       RLC     RLC 2  PDCCH Down  PDCCH Uplink  Unnamed: 34  Recommended Solutuin                         State  Power State  Recommended Solution
 0  122.9822  7119.503  125.7017    1186.507      784.9464          NaN     Downtitlt antenna  serving cell is overshooting          NaN                   NaN
 1    4.1000  7119.503   24.0000      11.000       51.0000          NaN     Downtitlt antenna  serving cell is overshooting          NaN                   NaN
 2  121.8900  2127.740  101.3300    1621.000      822.0000          NaN        uptilt antenna                  bad coverage          NaN                   NaN
 3   86.5800  2085.250   94.6400    1650.000      880.0000          NaN        uptilt antenna                  bad coverage          NaN                   NaN
 4   64.7500  1873.540   63.8600    1259.000      841.0000          NaN        uptilt antenna                  bad coverage          NaN                   NaN
 5   84.8700  1735.070   60.3800    1423.000      474.0000          NaN        uptilt antenna                  bad coverage          NaN                   NaN
 6   49.3400  1276.190   59.9600    1372.000      450.0000          NaN        uptilt antenna                  bad coverage          NaN                   NaN
 7  135.0200  2359.840  164.1300    1224.000      704.0000          NaN                   NaN                           NaN    Bad Power   Check hardware etc.
 8  135.0200  2359.840  164.1300    1224.000      704.0000          NaN        uptilt antenna                  bad coverage          NaN                   NaN
 9  163.7200  1893.940   90.0300    1244.000      753.0000          NaN                   NaN                           NaN    Bad Power   Check hardware etc.
10  163.7200  1893.940   90.0300    1244.000      753.0000          NaN        uptilt antenna                  bad coverage          NaN                   NaN
11  129.6400  1163.140  154.3200     663.000      798.0000          NaN                   NaN                           NaN    Bad Power   Check hardware etc.
12  129.6400  1163.140  154.3200     663.000      798.0000          NaN        uptilt antenna                  bad coverage          NaN                   NaN
Here is the sample output:
         MAC       RLC     RLC 2  PDCCH Down  PDCCH Uplink  Unnamed: 34  Recommended Solutuin                         State
 0  122.9822  7119.503  125.7017    1186.507      784.9464          NaN     Downtitlt antenna  serving cell is overshooting
 1    4.1000  7119.503   24.0000      11.000       51.0000          NaN     Downtitlt antenna  serving cell is overshooting
 2  121.8900  2127.740  101.3300    1621.000      822.0000          NaN        uptilt antenna                  bad coverage
 3   86.5800  2085.250   94.6400    1650.000      880.0000          NaN        uptilt antenna                  bad coverage
 4   64.7500  1873.540   63.8600    1259.000      841.0000          NaN        uptilt antenna                  bad coverage
 5   84.8700  1735.070   60.3800    1423.000      474.0000          NaN        uptilt antenna                  bad coverage
 6   49.3400  1276.190   59.9600    1372.000      450.0000          NaN        uptilt antenna                  bad coverage
 7  135.0200  2359.840  164.1300    1224.000      704.0000          NaN   Check hardware etc.                     Bad Power
 8  163.7200  1893.940   90.0300    1244.000      753.0000          NaN   Check hardware etc.                     Bad Power
 9  129.6400  1163.140  154.3200     663.000      798.0000          NaN   Check hardware etc.                     Bad Power
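To try the shift-based steps without the Excel file, here is a minimal self-contained sketch. It rebuilds three rows of the input table by hand (rows 6 to 8, keeping only the MAC and RLC measurements for brevity). The helper name merge_shifted_rows is hypothetical and not part of the original answer, and the sketch assumes the row carrying Power State always sits directly above the row it should be merged into.

```python
import numpy as np
import pandas as pd


def merge_shifted_rows(df):
    """Fold each 'Power State' row into the row that follows it.

    Hypothetical helper wrapping the answer's steps; it assumes the row
    carrying 'Power State' sits directly above its duplicate.
    """
    out = df.copy()
    src = ['Power State', 'Recommended Solution']
    tmp = ['State_new', 'Solution_new']
    # Move the source values down one row into helper columns
    out[tmp] = out[src].shift()
    # Rows that received a shifted value
    mask = out['State_new'].notna()
    out.loc[mask, 'State'] = out.loc[mask, 'State_new']
    out.loc[mask, 'Recommended Solutuin'] = out.loc[mask, 'Solution_new']
    # Keep only rows that have a State, then drop helper and source columns
    out = out[out['State'].notna()].drop(columns=tmp + src)
    return out.reset_index(drop=True)


# Rows 6, 7 and 8 of the input table, reduced to two numeric columns
sample = pd.DataFrame({
    'MAC': [49.34, 135.02, 135.02],
    'RLC': [1276.19, 2359.84, 2359.84],
    'Recommended Solutuin': ['uptilt antenna', np.nan, 'uptilt antenna'],
    'State': ['bad coverage', np.nan, 'bad coverage'],
    'Power State': [np.nan, 'Bad Power', np.nan],
    'Recommended Solution': [np.nan, 'Check hardware etc.', np.nan],
})

result = merge_shifted_rows(sample)
print(result)
```

The duplicated pair collapses into one row whose State is 'Bad Power', which matches rows 7 to 9 of the sample output. If the row order in the real sheet differs, the shift direction would need to change accordingly.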
You can use groupby to combine rows by column:
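df = pd.DataFrame(data)  # data: the rows from the question
new_df = df.groupby(['MAC', 'RLC1', 'RLC2', 'POCCH', 'POCCH Up']).sum()
new_df = new_df.reset_index()  # reset_index returns a new frame, so assign the result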
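As a variation on the groupby idea (not the answer's exact code), the sketch below uses first() instead of sum(). first() takes the first non-null value of each column within a group, which avoids summing text columns. The sample rows are copied from the input table in the question, and the choice of MAC and RLC as key columns is an assumption made for illustration.

```python
import numpy as np
import pandas as pd

# Rows 6, 7 and 8 of the question's input table, reduced to two numeric columns
df = pd.DataFrame({
    'MAC': [49.34, 135.02, 135.02],
    'RLC': [1276.19, 2359.84, 2359.84],
    'Recommended Solutuin': ['uptilt antenna', np.nan, 'uptilt antenna'],
    'State': ['bad coverage', np.nan, 'bad coverage'],
    'Power State': [np.nan, 'Bad Power', np.nan],
    'Recommended Solution': [np.nan, 'Check hardware etc.', np.nan],
})

# Assumed key: rows with identical measurements belong together
key_cols = ['MAC', 'RLC']

# first() skips nulls, so each group keeps whichever row supplied a value;
# sort=False preserves the original group order
merged = df.groupby(key_cols, sort=False, as_index=False).first()
print(merged)
```

Unlike the shift approach, this keeps all four text columns side by side rather than overwriting State with Power State, so a further step would be needed to reproduce the sample output exactly.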
You can do the following:
import numpy as np
fill_cols = ['Power State', 'Recommended Solution 2']
dup_cols = ['MAC_UL','RLC_Through_1','RLC_Through_2','PDCCH Down', 'PDCCH Up']
# Mark every row whose key columns occur more than once
m = df.duplicated(subset=dup_cols, keep=False)
# Treat empty strings as missing, then forward-fill within the duplicated rows
df_fill = df.loc[m, fill_cols].replace('', np.nan)
df.loc[m, fill_cols] = df_fill.ffill()
duplicated gets the duplicated rows, and ffill forward-fills the missing values in them.
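The following is a small runnable sketch of the duplicated plus ffill idea. It assumes, as the answer's comparisons against '' imply, that missing cells are empty strings, and it uses only two of the key columns to stay short. The sample values are taken from the question's table.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'MAC_UL': [49.34, 135.02, 135.02],
    'PDCCH Down': [1372.0, 1224.0, 1224.0],
    'Power State': ['', 'Bad Power', ''],
    'Recommended Solution 2': ['', 'Check hardware etc.', ''],
})

fill_cols = ['Power State', 'Recommended Solution 2']
dup_cols = ['MAC_UL', 'PDCCH Down']

# keep=False flags every member of a duplicated group, not just the repeats
m = df.duplicated(subset=dup_cols, keep=False)
print(m.tolist())  # [False, True, True]

# Empty strings become NaN so that ffill can copy the value from the row above
df_fill = df.loc[m, fill_cols].replace('', np.nan)
df.loc[m, fill_cols] = df_fill.ffill()
print(df)
```

Note that ffill here runs across all flagged rows in order, so it relies on the row that carries the value appearing immediately before its duplicate.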