I've got the following dataframes set up:
A
[ClaimantId], [ClaimId], [LenderId], [IsWorked]
1 1 1 1
1 2 4 0
1 3 3 1
2 6 1 1
B
[ClaimantId], [Forename], [Surname]
1 Bruce Wayne
2 Peter Parker
My desired output would be:
[ClaimantId], [Forename], [Surname], [C1], [C2], [C3], [L1], [L2], [L3], [W1], [W2], [W3]
1 Bruce Wayne 1 2 3 1 4 3 1 0 1
2 Peter Parker 6 NaN NaN 1 NaN NaN 1 NaN NaN
I'm not sure what I can apply to this. The number of C/L/W columns has an upper limit of 20, which will never be exceeded.
I'd really appreciate any help.
Thanks,
Use:
d = {'ClaimId':'C', 'LenderId':'L','IsWorked':'W'}
df = (A.rename(columns=d)
       .set_index(['ClaimantId',A.groupby('ClaimantId').cumcount()])
       .unstack())
df.columns = [f'{i}{j+1}' for i, j in df.columns]
print (df)
             C1   C2   C3   L1   L2   L3   W1   W2   W3
ClaimantId
1           1.0  2.0  3.0  1.0  4.0  3.0  1.0  0.0  1.0
2           6.0  NaN  NaN  1.0  NaN  NaN  1.0  NaN  NaN
df1 = B.join(df, on='ClaimantId')
print (df1)
   ClaimantId Forename Surname   C1   C2   C3   L1   L2   L3   W1   W2  \
0           1    Bruce   Wayne  1.0  2.0  3.0  1.0  4.0  3.0  1.0  0.0
1           2    Peter  Parker  6.0  NaN  NaN  1.0  NaN  NaN  1.0  NaN

    W3
0  1.0
1  NaN
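Explanation:
1. rename the columns using the dict d
2. set_index by ClaimantId and a counter Series created by cumcount
3. reshape with unstack
4. flatten the MultiIndex column names with a list comprehension and f-strings
5. join the result to the second DataFrame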
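For anyone who wants to run the steps end to end, here is a minimal self-contained sketch that rebuilds A and B from the question and applies the same chain. The names counter, wide and result are only illustrative and are not part of the original code.

```python
import pandas as pd

# Sample data taken from the question
A = pd.DataFrame({
    "ClaimantId": [1, 1, 1, 2],
    "ClaimId": [1, 2, 3, 6],
    "LenderId": [1, 4, 3, 1],
    "IsWorked": [1, 0, 1, 1],
})
B = pd.DataFrame({
    "ClaimantId": [1, 2],
    "Forename": ["Bruce", "Peter"],
    "Surname": ["Wayne", "Parker"],
})

# Position of each row within its ClaimantId group: 0, 1, 2, 0
counter = A.groupby("ClaimantId").cumcount()

# Rename, index by (ClaimantId, counter), then move the counter into the columns
wide = (
    A.rename(columns={"ClaimId": "C", "LenderId": "L", "IsWorked": "W"})
     .set_index(["ClaimantId", counter])
     .unstack()
)

# Flatten the (name, position) column pairs into C1, C2, ..., W3
wide.columns = [f"{name}{pos + 1}" for name, pos in wide.columns]

# Attach the wide claim columns to the claimant details
result = B.join(wide, on="ClaimantId")
print(result)
```

The values come back as floats because the missing slots for claimant 2 are filled with NaN.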
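EDIT:
If every group of columns needs the same length, use reindex with a new MultiIndex created from range. Note that fill_value only fills the newly added columns; NaN values that already exist are left unchanged: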
d = {'ClaimId':'C', 'LenderId':'L','IsWorked':'W'}
df = (A.rename(columns=d)
       .set_index(['ClaimantId',A.groupby('ClaimantId').cumcount()])
       .unstack())
mux = pd.MultiIndex.from_product([df.columns.get_level_values(0).unique(), range(5)])
df = df.reindex(columns=mux, fill_value=0)
df.columns = [f'{i}{j+1}' for i, j in df.columns]
print (df)
             C1   C2   C3  C4  C5   L1   L2   L3  L4  L5   W1   W2   W3  W4  \
ClaimantId
1           1.0  2.0  3.0   0   0  1.0  4.0  3.0   0   0  1.0  0.0  1.0   0
2           6.0  NaN  NaN   0   0  1.0  NaN  NaN   0   0  1.0  NaN  NaN   0

            W5
ClaimantId
1            0
2            0
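Since the question mentions an upper limit of 20 columns per group, the same reindex idea can be sketched with 20 slots. This variant is an assumption on my part rather than the code above: it omits fill_value so that every empty slot is consistently missing, and it converts to the nullable Int64 dtype so the ids stay integers. MAX_SLOTS and wide are illustrative names.

```python
import pandas as pd

# Sample data taken from the question
A = pd.DataFrame({
    "ClaimantId": [1, 1, 1, 2],
    "ClaimId": [1, 2, 3, 6],
    "LenderId": [1, 4, 3, 1],
    "IsWorked": [1, 0, 1, 1],
})

MAX_SLOTS = 20  # upper limit stated in the question (hypothetical constant name)

wide = (
    A.rename(columns={"ClaimId": "C", "LenderId": "L", "IsWorked": "W"})
     .set_index(["ClaimantId", A.groupby("ClaimantId").cumcount()])
     .unstack()
)

# Every (name, position) pair for positions 0..19, whether present in the data or not
mux = pd.MultiIndex.from_product([["C", "L", "W"], range(MAX_SLOTS)])

# Without fill_value, the added columns are NaN, matching the gaps already present
wide = wide.reindex(columns=mux)
wide.columns = [f"{name}{pos + 1}" for name, pos in wide.columns]

# Nullable integer dtype: whole numbers stay integers, missing slots become <NA>
wide = wide.astype("Int64")
print(wide.iloc[:, :5])
```

If zeros are preferred for the empty slots, calling fillna(0) on the reindexed frame would fill both the old and the new gaps in one go.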