Pandas: replace values in dataframe from pivot_table

I have a dataframe and a pivot table, and I need to replace some values in the dataframe with values from the pivot table's columns.

Dataframe:

       access_code                                ID cat1 cat2 cat3
 g1gw8bzwelo83mhb  0433a3d29339a4b295b486e85874ec66    1    2
 g0dgzfg4wpo3jytg  04467d3ae60fed134077a26ae33e0eae    1    2
 g1gwui6r2ep471ht  06e3395c0b64a3168fbeab6a50cd8f18    1    2
 g05ooypre5l87jkd  089c81ebeff5184e6563c90115186325    1
 g0ifck11dix7avgu  0d254a81dca0ff716753b67a50c41fd7    1    2    3

Pivot Table:

type                                                              1      2                                                                                                               \ 
access_code      ID                               member_id         
g1gw8bzwelo83mhb 0433a3d29339a4b295b486e85874ec66 1045794        1023   923                                     1                 122      
g05ooypre5l87jkd 089c81ebeff5184e6563c90115186325 768656         203    243                              1                 169   
g1gwui6r2ep471ht 06e3395c0b64a3168fbeab6a50cd8f18 604095         392    919                              1                  35   
g06q0itlmkqmz5cv f4a3b3f2fca77c443cd4286a4c91eedc 1457307        243                          1                       
g074qx58cmuc1a2f 13f2674f6d5abc888d416ea6049b57b9 5637836                                       1                       
g0dgzfg4wpo3jytg 04467d3ae60fed134077a26ae33e0eae 5732738        111      2343                               1                      

Desired output:

       access_code                                ID  cat1  cat2 cat3
 g1gw8bzwelo83mhb  0433a3d29339a4b295b486e85874ec66  1023   923
 g0dgzfg4wpo3jytg  04467d3ae60fed134077a26ae33e0eae   111  2343
 g1gwui6r2ep471ht  06e3395c0b64a3168fbeab6a50cd8f18   392   919
 g05ooypre5l87jkd  089c81ebeff5184e6563c90115186325     1
 g0ifck11dix7avgu  0d254a81dca0ff716753b67a50c41fd7     1     2    3

If I use

df.ix[df.cat1 == 1] = pivot_table['1']

I get the error ValueError: cannot set using a list-like indexer with a different length than the value

As long as your dataframe is not exceedingly large, you can make it happen in some really ugly ways. I am sure someone else will provide you with a more elegant solution, but in the meantime this duct tape might point you in the right direction.

Keep in mind that I tested this with two plain dataframes (df and piv below) rather than one dataframe and one pivot table, as I already had enough trouble rebuilding the dataframes from the pasted text.
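If you want to start from a real pivot table instead, one option is to drop the index levels that the dataframe does not share, so that both objects can be aligned on the same key. The following is only a sketch under assumptions: the long-format frame `records` and its `value` column are hypothetical names, and it assumes `access_code` alone identifies a row.

```python
import pandas as pd

# Hypothetical long-format records; "records" and "value" are assumed names.
records = pd.DataFrame({
    "access_code": ["g1gw8bzwelo83mhb", "g1gw8bzwelo83mhb",
                    "g0dgzfg4wpo3jytg", "g0dgzfg4wpo3jytg"],
    "ID": ["0433a3d29339a4b295b486e85874ec66"] * 2
          + ["04467d3ae60fed134077a26ae33e0eae"] * 2,
    "member_id": [1045794, 1045794, 5732738, 5732738],
    "type": ["1", "2", "1", "2"],
    "value": [1023, 923, 111, 2343],
})

# Pivot table with a three-level row index, similar in shape to the question's.
pivot_table = records.pivot_table(
    index=["access_code", "ID", "member_id"], columns="type", values="value"
)

# Keep only access_code as the index so piv can be aligned with a
# dataframe that is also indexed by access_code.
piv = pivot_table.droplevel(["ID", "member_id"])
```

After this, `piv['1']` and `piv['2']` are ordinary Series keyed by `access_code`.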

Your data contains empty fields, which my dataframes did not handle well, so first convert the empty fields to zeros. The pattern below matches only cells that are empty or consist entirely of whitespace:

df = df.replace(r'^\s*$', 0, regex=True)

Next, make sure the category columns actually hold floats; otherwise the comparisons will fail:

df[['cat1', 'cat2', 'cat3']] = df[['cat1', 'cat2', 'cat3']].astype(float)
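The two cleaning steps above can also be done in one pass with `pd.to_numeric`, which is a different technique from the `replace` call but should give the same result. This is a minimal sketch on a made-up two-row sample (the sample values are assumptions chosen to mirror the question's columns), where blank cells are empty or whitespace-only strings.

```python
import pandas as pd

# Hypothetical sample mirroring the question's columns; blanks are strings.
df = pd.DataFrame({
    "access_code": ["g1gw8bzwelo83mhb", "g05ooypre5l87jkd"],
    "cat1": ["1", "1"],
    "cat2": ["2", "  "],
    "cat3": ["", " "],
})

cat_cols = ["cat1", "cat2", "cat3"]
for col in cat_cols:
    # Strip whitespace, turn anything non-numeric (including blanks) into NaN,
    # then fill those with 0 and force a float dtype.
    cleaned = pd.to_numeric(df[col].astype(str).str.strip(), errors="coerce")
    df[col] = cleaned.fillna(0).astype(float)
```

With float columns, a test such as `df.cat1 == 1` compares numbers rather than strings.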

And for the fizzly fireworks:

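# Assign through a single .loc call; chained df.cat1.loc[...] = ... may write to a copy.
# The right-hand Series is aligned on the index, so rows dropped by dropna() become NaN.
df.loc[df.cat1 == 1, 'cat1'] = piv['1'].loc[df.loc[df.cat1 == 1].index].dropna()
df['cat1'] = df['cat1'].fillna(1)

df.loc[df.cat2 == 2, 'cat2'] = piv['2'].loc[df.loc[df.cat2 == 2].index].dropna()
df['cat2'] = df['cat2'].fillna(2)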

df = df.replace(0, ' ')

The fillna calls are only there to recreate your intended output, in which some rows have apparently not been processed yet. I expect you will not need this column-by-column NaN filling in your actual use.
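Putting the replacement step together, here is a self-contained sketch that relies on index alignment. It assumes both frames are indexed by `access_code` (the answer does not say how its frames were indexed), uses values copied from the question, and swaps the direct `.loc` lookup for `reindex`, which tolerates access codes that are missing from the pivot.

```python
import pandas as pd

# Assumption: both frames use access_code as their index.
df = pd.DataFrame(
    {"cat1": [1.0, 1.0, 1.0, 1.0, 1.0], "cat2": [2.0, 2.0, 2.0, 0.0, 2.0]},
    index=pd.Index(
        ["g1gw8bzwelo83mhb", "g0dgzfg4wpo3jytg", "g1gwui6r2ep471ht",
         "g05ooypre5l87jkd", "g0ifck11dix7avgu"],
        name="access_code",
    ),
)
nan = float("nan")
piv = pd.DataFrame(
    {"1": [1023.0, 203.0, 392.0, 243.0, nan, 111.0],
     "2": [923.0, 243.0, 919.0, nan, nan, 2343.0]},
    index=pd.Index(
        ["g1gw8bzwelo83mhb", "g05ooypre5l87jkd", "g1gwui6r2ep471ht",
         "g06q0itlmkqmz5cv", "g074qx58cmuc1a2f", "g0dgzfg4wpo3jytg"],
        name="access_code",
    ),
)

for col, placeholder in [("cat1", 1), ("cat2", 2)]:
    mask = df[col] == placeholder
    # reindex yields NaN for access codes that are absent from piv.
    looked_up = piv[str(placeholder)].reindex(df.index)
    # Keep the placeholder wherever the pivot has no value to offer.
    df.loc[mask, col] = looked_up[mask].fillna(placeholder)
```

Note that this version also replaces cat1 for g05ooypre5l87jkd (with 203), because the pivot has a value for it; the desired output in the question leaves that row at 1.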
