I have a DataFrame and a pivot table, and I need to replace some values in the DataFrame
with values from the pivot table's columns.
Dataframe:
access_code ID cat1 cat2 cat3
g1gw8bzwelo83mhb 0433a3d29339a4b295b486e85874ec66 1 2
g0dgzfg4wpo3jytg 04467d3ae60fed134077a26ae33e0eae 1 2
g1gwui6r2ep471ht 06e3395c0b64a3168fbeab6a50cd8f18 1 2
g05ooypre5l87jkd 089c81ebeff5184e6563c90115186325 1
g0ifck11dix7avgu 0d254a81dca0ff716753b67a50c41fd7 1 2 3
Pivot Table:
type 1 2 \
access_code ID member_id
g1gw8bzwelo83mhb 0433a3d29339a4b295b486e85874ec66 1045794 1023 923 1 122
g05ooypre5l87jkd 089c81ebeff5184e6563c90115186325 768656 203 243 1 169
g1gwui6r2ep471ht 06e3395c0b64a3168fbeab6a50cd8f18 604095 392 919 1 35
g06q0itlmkqmz5cv f4a3b3f2fca77c443cd4286a4c91eedc 1457307 243 1
g074qx58cmuc1a2f 13f2674f6d5abc888d416ea6049b57b9 5637836 1
g0dgzfg4wpo3jytg 04467d3ae60fed134077a26ae33e0eae 5732738 111 2343 1
Desired output:
access_code ID cat1 cat2 cat3
g1gw8bzwelo83mhb 0433a3d29339a4b295b486e85874ec66 1023 923
g0dgzfg4wpo3jytg 04467d3ae60fed134077a26ae33e0eae 111 2343
g1gwui6r2ep471ht 06e3395c0b64a3168fbeab6a50cd8f18 392 919
g05ooypre5l87jkd 089c81ebeff5184e6563c90115186325 1
g0ifck11dix7avgu 0d254a81dca0ff716753b67a50c41fd7 1 2 3
If I use
df.ix[df.cat1 == 1] = pivot_table['1']
it raises an error: ValueError: cannot set using a list-like indexer with a different length than the value
As long as your dataframe is not exceedingly large, you can make it happen in some really ugly ways. I am sure someone else will provide you with a more elegant solution, but in the meantime this duct tape might point you in the right direction.
Keep in mind that in this case I did this with 2 dataframes instead of 1 dataframe and 1 pivot table, as I already had enough trouble formatting the dataframes from the textual data.
As there are empty fields in your data and my dataframes did not like this, first convert the empty fields to zeros.
df = df.replace(r'^\s*$', 0, regex=True)  # anchored, so only entirely blank cells are replaced
Now make sure the category columns actually hold floats, otherwise the comparisons will fail:
df[['cat1', 'cat2', 'cat3']] = df[['cat1', 'cat2', 'cat3']].astype(float)
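The two cleaning steps above can be tried on a small frame first. This is only a sketch: the toy data is made up, and it assumes the blank cells arrive as whitespace-only strings, which may not match how your data was actually loaded.

```python
import pandas as pd

# Toy frame with blank (whitespace-only) cells, standing in for the real data.
df = pd.DataFrame({
    "cat1": ["1", "1", " "],
    "cat2": ["2", " ", "2"],
    "cat3": [" ", " ", "3"],
})

cols = ["cat1", "cat2", "cat3"]
# Replace cells that consist only of whitespace with "0", then convert to float
# so that comparisons such as df.cat1 == 1 behave as expected.
df[cols] = df[cols].replace(r"^\s*$", "0", regex=True).astype(float)

print(df.dtypes)
```

Restricting the replacement to the category columns keeps the access_code and ID columns out of the regex entirely.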
And for the fizzly fireworks:
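# Assign through df.loc[rows, column]; the chained form df.cat1.loc[rows] = ... may write to a copy.
df.loc[df.cat1 == 1, 'cat1'] = piv['1'].loc[df.loc[df.cat1 == 1].index].dropna()
df.cat1 = df.cat1.fillna(1)
df.loc[df.cat2 == 2, 'cat2'] = piv['2'].loc[df.loc[df.cat2 == 2].index].dropna()
df.cat2 = df.cat2.fillna(2)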
df = df.replace(0, ' ')
The fillna is just to recreate your intended output, in which you clearly did not process some lines yet. I guess this column-by-column NaN-filling will not happen in your actual use.
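A somewhat tidier variant of the same idea is sketched below. It is not a tested solution for your data: the frames are hypothetical miniatures of the ones in the question, and it assumes both can be indexed by (access_code, ID), so the member_id level of the real pivot table would have to be dropped first. Rows that have no pivot value simply keep their flag, so no fillna is needed afterwards.

```python
import pandas as pd

keys = ["access_code", "ID"]

# Hypothetical miniature of the question's frame; 0 stands for an empty cell.
df = pd.DataFrame({
    "access_code": ["a", "b", "c", "d"],
    "ID": ["i1", "i2", "i3", "i4"],
    "cat1": [1.0, 1.0, 1.0, 0.0],
    "cat2": [2.0, 2.0, 0.0, 2.0],
}).set_index(keys)

# Hypothetical stand-in for the pivot table, flattened to the same index.
piv = pd.DataFrame({
    "access_code": ["a", "c", "x"],
    "ID": ["i1", "i3", "i9"],
    "1": [1023.0, 392.0, 243.0],
    "2": [923.0, None, 5.0],
}).set_index(keys)

for col, flag in [("cat1", 1), ("cat2", 2)]:
    # Align the pivot column to df's rows; rows missing from piv become NaN.
    new = piv[str(flag)].reindex(df.index)
    # Replace only where the flag is set and a pivot value exists.
    mask = (df[col] == flag) & new.notna()
    df.loc[mask, col] = new[mask]

print(df)
```

Using reindex rather than .loc on the pivot column avoids a KeyError when the DataFrame contains rows that never appear in the pivot table.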