I want to be able to update the values in a multi-indexed dataframe, using the output of a separate function that performs calculations on another existing dataframe.
Let's say for example, I have the following:
import numpy as np, pandas as pd
names = ['Johnson','Jackson','Smith']
attributes = ['x1','x2','x3','x4','x5']
categories = ['y1','y2','y3','y4','y5','y6']
index = pd.MultiIndex.from_product([names, attributes])
placeholders = np.zeros((len(names)*len(attributes), len(categories)), dtype=int)
df = pd.DataFrame(placeholders, index=index, columns=categories)
Which generates the corresponding dataframe:
            y1  y2  y3  y4  y5  y6
Johnson x1   0   0   0   0   0   0
        x2   0   0   0   0   0   0
        x3   0   0   0   0   0   0
        x4   0   0   0   0   0   0
        x5   0   0   0   0   0   0
Jackson x1   0   0   0   0   0   0
        x2   0   0   0   0   0   0
        x3   0   0   0   0   0   0
        x4   0   0   0   0   0   0
        x5   0   0   0   0   0   0
Smith   x1   0   0   0   0   0   0
        x2   0   0   0   0   0   0
        x3   0   0   0   0   0   0
        x4   0   0   0   0   0   0
        x5   0   0   0   0   0   0
Now, I have another function that generates a series of values that I want to then use to update this dataframe. For example:
x1 = pd.Series([2274, 556, 1718, 1171, 183, 194], index=categories)
x2 = pd.Series([627, 154, 473, 215, 68, 77], index=categories)
How would I go about updating the series values for ('Johnson','x1')?
The vectors x1 and x2 are generated by calling the function inside two nested for loops. I can't figure out how to update the dataframe; the values all remain zero:
for i in names:
    for j in attributes:
        x1 = generate_data_list('x1')
        df.loc[i,j].update(x1)
Appreciate any help!
Just assign x1 to df.loc[i, j]:
df.loc['Johnson', 'x1'] = x1
Or:
df.loc[('Johnson', 'x1')] = x1
df
#               y1   y2    y3    y4   y5   y6
# Johnson x1  2274  556  1718  1171  183  194
#         x2     0    0     0     0    0    0
#         x3     0    0     0     0    0    0
#         x4     0    0     0     0    0    0
#         x5     0    0     0     0    0    0
# Jackson x1     0    0     0     0    0    0
#         x2     0    0     0     0    0    0
#         x3     0    0     0     0    0    0
#         x4     0    0     0     0    0    0
#         x5     0    0     0     0    0    0
# Smith   x1     0    0     0     0    0    0
#         x2     0    0     0     0    0    0
#         x3     0    0     0     0    0    0
#         x4     0    0     0     0    0    0
#         x5     0    0     0     0    0    0
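To tie this back to the nested loops in the question: df.loc[i,j].update(x1) most likely modifies a temporary Series rather than df itself, which would explain why the zeros never change. Below is a minimal sketch of the loop using direct assignment instead. The body of generate_data_list is a made-up stand-in (the real function is not shown in the question), and its two-argument signature is an assumption.

```python
import numpy as np
import pandas as pd

names = ['Johnson', 'Jackson', 'Smith']
attributes = ['x1', 'x2', 'x3', 'x4', 'x5']
categories = ['y1', 'y2', 'y3', 'y4', 'y5', 'y6']

index = pd.MultiIndex.from_product([names, attributes])
df = pd.DataFrame(0, index=index, columns=categories)

def generate_data_list(name, attribute):
    # Hypothetical placeholder for the real calculation: it only needs to
    # return a Series indexed by the same labels as df's columns.
    values = np.arange(len(categories)) + len(name) + int(attribute[1:])
    return pd.Series(values, index=categories)

for i in names:
    for j in attributes:
        # Assign to the row selected by the full (name, attribute) key;
        # the Series is aligned to the columns by label.
        df.loc[(i, j), :] = generate_data_list(i, j)
```

Writing the key as (i, j) with an explicit : for the columns avoids any ambiguity between a second index level and a column label.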
Alternatively, you can build the new values as a DataFrame with a matching MultiIndex and then call update:
x1 = pd.DataFrame(data=[[2274, 556, 1718, 1171, 183, 194]], index=pd.MultiIndex.from_arrays([['Johnson'],['x1']]),columns=categories)
x1
              y1   y2    y3    y4   y5   y6
Johnson x1  2274  556  1718  1171  183  194
df.update(x1)
df
                y1     y2      y3      y4     y5     y6
Johnson x1  2274.0  556.0  1718.0  1171.0  183.0  194.0
        x2     0.0    0.0     0.0     0.0    0.0    0.0
        x3     0.0    0.0     0.0     0.0    0.0    0.0
        x4     0.0    0.0     0.0     0.0    0.0    0.0
        x5     0.0    0.0     0.0     0.0    0.0    0.0
Jackson x1     0.0    0.0     0.0     0.0    0.0    0.0
        x2     0.0    0.0     0.0     0.0    0.0    0.0
        x3     0.0    0.0     0.0     0.0    0.0    0.0
        x4     0.0    0.0     0.0     0.0    0.0    0.0
        x5     0.0    0.0     0.0     0.0    0.0    0.0
Smith   x1     0.0    0.0     0.0     0.0    0.0    0.0
        x2     0.0    0.0     0.0     0.0    0.0    0.0
        x3     0.0    0.0     0.0     0.0    0.0    0.0
        x4     0.0    0.0     0.0     0.0    0.0    0.0
        x5     0.0    0.0     0.0     0.0    0.0    0.0
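The same idea extends to several rows at once. Here is a rough sketch that collects both x1 and x2 from the question into one frame before calling update. The name new_rows is only illustrative, and the final astype(int) is an optional step, added on the assumption that integer columns are wanted, since update can leave floats behind as in the output above.

```python
import pandas as pd

names = ['Johnson', 'Jackson', 'Smith']
attributes = ['x1', 'x2', 'x3', 'x4', 'x5']
categories = ['y1', 'y2', 'y3', 'y4', 'y5', 'y6']

index = pd.MultiIndex.from_product([names, attributes])
df = pd.DataFrame(0, index=index, columns=categories)

x1 = pd.Series([2274, 556, 1718, 1171, 183, 194], index=categories)
x2 = pd.Series([627, 154, 473, 215, 68, 77], index=categories)

# One row per (name, attribute) key, with the same columns as df.
new_rows = pd.DataFrame(
    [x1.to_numpy(), x2.to_numpy()],
    index=pd.MultiIndex.from_tuples([('Johnson', 'x1'), ('Johnson', 'x2')]),
    columns=categories,
)

# update works in place and only touches the labels present in new_rows.
df.update(new_rows)

# Depending on the pandas version, the columns may now be float;
# cast back if integers are wanted.
df = df.astype(int)
```

Rows that do not appear in new_rows keep their existing values.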