[英]pandas merge rows into previous row
I want to merge all blank rows such as 4,5,6 into 3 and 8,9,10 into 7 and so on...我想将所有空白行(例如 4、5、6 合并到 3 中,将 8、9、10 合并到 7 中,依此类推...
I tried this but getting wrong answer.我试过了但是得到了错误的答案。
dataframe=dataframe.replace(r'^\s*$', np.nan, regex=True)
dataframe=dataframe[:-1].fillna(method='ffill')
final_data = dataframe[dataframe['Value Date'].notna()]
final_data = final_data.fillna(0)
x = final_data.groupby(['Booking','Debit','Credit', 'Balance'])['Text'].apply(','.join)
x = x.reset_index()
Help me with it.帮我处理一下。
My solution starts as @Manakin, assuming the dict is called data
:我的解决方案以@Manakin 开头,假设 dict 被称为data
:
Timestamp = pd.Timestamp
NaT = pd.NaT
df = pd.DataFrame(data)
and then:接着:
(df.groupby((df['Balance'] != '').cumsum())
.agg(dict(dict.fromkeys(df,'first'), **{'Booking Text':', '.join}))
)
produces产生
Balance Booking Date Txn Date Booking Text Value Date Debit Credit Balance
--------- ------------------- ------------------- ----------------------------------------------------------------------------------------------------------------------------------------------------------- ------------------- ---------- -------------- --------------
1 2018-03-31 00:00:00 2018-03-31 00:00:00 Initial Balance NaT 0.00
2 2018-01-04 00:00:00 2018-01-04 00:00:00 VALUE DATED BALANCE BROUGHT FORWARD 2018-01-04 00:00:00 180,431,640.00 180,431,640.00
3 2018-06-04 00:00:00 2018-06-04 00:00:00 INTEREST-FIXED TERM LOAN, Contract No: 3001-AA18091ZN72C|Interest rate:, 0.810000%|Capital: 1,000,000,000.00|Period: 16.03.2018 -, 06.04.2018|Days: 21/360 2018-06-04 00:00:00 472,500.00 179,959,140.00
4 2018-06-04 00:00:00 2018-06-04 00:00:00 INTEREST-FIXED TERM LOAN, Contract No: 3001-AA1809166QNF|Interest rate: 2018-06-04 00:00:00 315,000.00 179,644,140.00
B anking, 0.810000%|Capital: 1,000,000,000.00|Period: 23.03.2018 -, 06.04.2018|Days: 14/360
5 2018-06-04 00:00:00 2018-06-04 00:00:00 FOREX SPOT, EUR/JPY 130.7271 2018-06-04 00:00:00 472,500.00 180,116,640.00
6 2018-06-04 00:00:00 2018-06-04 00:00:00 FOREX SPOT, EUR/JPY 130.7021 2018-06-04 00:00:00 315,000.00 180,431,640.00
7 2018-09-04 00:00:00 2018-09-04 00:00:00 INTEREST-FIXED TERM LOAN, Contract No: 3001-AA18091NTT8N|Interest rate:, 0.810000%|Capital: 500,000,000.00|Period: 26.03.2018 -, 09.04.2018|Days: 14/360, , 2018-09-04 00:00:00 157,500.00 180,274,140.00
8 2018-09-04 00:00:00 2018-09-04 00:00:00 FOREX SPOT 2018-09-04 00:00:00 157,500.00 180,431,640.00
EUR/JPY 131.1407
9 2018-10-04 00:00:00 2018-10-04 00:00:00 INTEREST-FIXED TERM LOAN, , Contract No: 3001-AA18091NL9XG|Interest rate:, 0.810000%|Capital: 500,000,000.00|Period: 27.03.2018 - 2018-10-04 00:00:00 157,500.00 180,274,140.00
10.04.2018|Days: 14/360
Here we groupby rows that have 'Balance' empty to the row above, and then apply first
aggregation to all columns except to 'Booking Text'
where we concatenate strings together在这里,我们将“余额”为空的行分组到上面的行,然后将first
聚合应用于所有列,但'Booking Text'
除外,我们将字符串连接在一起
From your dict, assuming it's called d
从你的字典,假设它叫做d
we can create a new key
to groupby and join the string columns.我们可以为 groupby 创建一个新key
并加入字符串列。
import pandas as pd
Timestamp = pd.Timestamp, NaT = pd.NaT
df = pd.DataFrame.from_dict(d)
df['key'] = (df['Booking Date'].isnull() & df['Booking Date'].shift(1).isnull()==False).astype(int).cumsum()
s = df.groupby('key')['Booking Text'].agg(' '.join)
s.index = s.index - 1
df1 = df.dropna(subset=['Booking Date']).copy()
df1['Booking Text'] = df1['Booking Text'] + ' ' + df1.pop('key').map(s)
print(df1)
Booking Date Txn Date Booking Text \
1 2018-03-31 2018-03-31 Initial Balance VALUE DATED BALANCE BROUGHT FO...
2 2018-01-04 2018-01-04 VALUE DATED BALANCE BROUGHT FORWARD INTEREST-F...
3 2018-06-04 2018-06-04 INTEREST-FIXED TERM LOAN Contract No: 3001-AA1...
7 2018-06-04 2018-06-04 INTEREST-FIXED TERM LOAN Contract No: 3001-AA1...
11 2018-06-04 2018-06-04 FOREX SPOT EUR/JPY 130.7271
13 2018-06-04 2018-06-04 FOREX SPOT EUR/JPY 130.7021
15 2018-09-04 2018-09-04 INTEREST-FIXED TERM LOAN Contract No: 3001-AA1...
21 2018-09-04 2018-09-04 FOREX SPOT\nEUR/JPY 131.1407 INTEREST-FIXED TE...
22 2018-10-04 2018-10-04 INTEREST-FIXED TERM LOAN Contract No: 3001-AA...
Value Date Debit Credit Balance
1 NaT 0.00
2 2018-01-04 180,431,640.00 180,431,640.00
3 2018-06-04 472,500.00 179,959,140.00
7 2018-06-04 315,000.00 179,644,140.00
11 2018-06-04 472,500.00 180,116,640.00
13 2018-06-04 315,000.00 180,431,640.00
15 2018-09-04 157,500.00 180,274,140.00
21 2018-09-04 157,500.00 180,431,640.00
22 2018-10-04 157,500.00 180,274,140.00
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.