[英]Merge two rows pandas dataframe
这是一种解决您的问题的方法:
df[['State_new', 'Solution_new']] = df[['Power State', 'Recommended Solution']].shift()
mask = ~df['State_new'].isna()
df.loc[mask, 'State'] = df.loc[mask, 'State_new']
df.loc[mask, 'Recommended Solutuin'] = df.loc[mask, 'Solution_new']
df = df.drop(columns=['State_new', 'Solution_new', 'Power State', 'Recommended Solution'])[~df['State'].isna()].reset_index(drop=True)
解释:
State
和Recommended Solutuin
列的内容(注意:使用 OP 问题中的原始列标签逐字逐句地使用来自您的代码的更新数据包含在移位列中reset_index
创建一个没有间隙的新整数范围索引。如果有帮助,这里是从 Excel 中提取数据框的示例代码:
import pandas as pd
df = pd.read_excel('TestBook.xlsx', sheet_name='TestSheet', usecols='AD:AM')
这是输入数据框:
MAC RLC RLC 2 PDCCH Down PDCCH Uplink Unnamed: 34 Recommended Solutuin State Power State Recommended Solution
0 122.9822 7119.503 125.7017 1186.507 784.9464 NaN Downtitlt antenna serving cell is overshooting NaN NaN
1 4.1000 7119.503 24.0000 11.000 51.0000 NaN Downtitlt antenna serving cell is overshooting NaN NaN
2 121.8900 2127.740 101.3300 1621.000 822.0000 NaN uptilt antenna bad coverage NaN NaN
3 86.5800 2085.250 94.6400 1650.000 880.0000 NaN uptilt antenna bad coverage NaN NaN
4 64.7500 1873.540 63.8600 1259.000 841.0000 NaN uptilt antenna bad coverage NaN NaN
5 84.8700 1735.070 60.3800 1423.000 474.0000 NaN uptilt antenna bad coverage NaN NaN
6 49.3400 1276.190 59.9600 1372.000 450.0000 NaN uptilt antenna bad coverage NaN NaN
7 135.0200 2359.840 164.1300 1224.000 704.0000 NaN NaN NaN Bad Power Check hardware etc.
8 135.0200 2359.840 164.1300 1224.000 704.0000 NaN uptilt antenna bad coverage NaN NaN
9 163.7200 1893.940 90.0300 1244.000 753.0000 NaN NaN NaN Bad Power Check hardware etc.
10 163.7200 1893.940 90.0300 1244.000 753.0000 NaN uptilt antenna bad coverage NaN NaN
11 129.6400 1163.140 154.3200 663.000 798.0000 NaN NaN NaN Bad Power Check hardware etc.
12 129.6400 1163.140 154.3200 663.000 798.0000 NaN uptilt antenna bad coverage NaN NaN
这是示例输出:
MAC RLC RLC 2 PDCCH Down PDCCH Uplink Unnamed: 34 Recommended Solutuin State
0 122.9822 7119.503 125.7017 1186.507 784.9464 NaN Downtitlt antenna serving cell is overshooting
1 4.1000 7119.503 24.0000 11.000 51.0000 NaN Downtitlt antenna serving cell is overshooting
2 121.8900 2127.740 101.3300 1621.000 822.0000 NaN uptilt antenna bad coverage
3 86.5800 2085.250 94.6400 1650.000 880.0000 NaN uptilt antenna bad coverage
4 64.7500 1873.540 63.8600 1259.000 841.0000 NaN uptilt antenna bad coverage
5 84.8700 1735.070 60.3800 1423.000 474.0000 NaN uptilt antenna bad coverage
6 49.3400 1276.190 59.9600 1372.000 450.0000 NaN uptilt antenna bad coverage
7 135.0200 2359.840 164.1300 1224.000 704.0000 NaN Check hardware etc. Bad Power
8 163.7200 1893.940 90.0300 1244.000 753.0000 NaN Check hardware etc. Bad Power
9 129.6400 1163.140 154.3200 663.000 798.0000 NaN Check hardware etc. Bad Power
您可以使用 groupby 按列组合行:
df = pd.DataFrame(data)
new_df = df.groupby(['MAC', 'RLC1', 'RLC2', 'POCCH', 'POCCH Up']).sum()
new_df.reset_index()
您可以执行以下操作:
fill_cols = ['Power State', 'Recommended Solution 2']
dup_cols = ['MAC_UL','RLC_Through_1','RLC_Through_2','PDCCH Down', 'PDCCH Up']
m = df.duplicated(subset=dup_cols, keep=False)
df_fill = df.loc[m,fill_cols]
df_fill[df_fill['Power State']==''] = np.NaN
df_fill[df_fill['Recommended Solution 2']==''] = np.NaN
df.loc[m,fill_cols]=df_fill.ffill()
duplicated
获取重复的行ffill
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.