I have some data indexed by period (period 0, 1, 2, ...) and I would like to create intermediate values within each period: take the difference between two consecutive values and divide it by the number of sub-periods, which I want to be able to set.
For instance:
import pandas as pd
data = [{'metric': '3f00d0b5', 'time':52.66, 'time_order': 0, 'variable': 'var1', 'value': 0.035},
{'metric': '3f00d0b5', 'time':422.4, 'time_order': 1, 'variable': 'var1', 'value': 0.512},
{'metric': '3f00d0b5', 'time':620.1, 'time_order': 2, 'variable': 'var1', 'value': 0.0},
{'metric': '3f00d0b5', 'time':52.66, 'time_order': 0, 'variable': 'var2', 'value': 0.007},
{'metric': '3f00d0b5', 'time':422.4, 'time_order': 1, 'variable': 'var2', 'value': 0.012},
{'metric': '3f00d0b5', 'time':620.1, 'time_order': 2, 'variable': 'var2', 'value': 0.214},
{'metric': '83e7fdd1', 'time':25.42, 'time_order': 0, 'variable': 'var1', 'value': 0.0},
{'metric': '83e7fdd1', 'time':322.45, 'time_order': 1, 'variable': 'var1', 'value': 0.241},
{'metric': '83e7fdd1', 'time':678.12, 'time_order': 2, 'variable': 'var1', 'value': 0.005},
{'metric': '83e7fdd1', 'time':25.42, 'time_order': 0, 'variable': 'var2', 'value': 0.02},
{'metric': '83e7fdd1', 'time':322.45, 'time_order': 1, 'variable': 'var2', 'value': 0.007},
{'metric': '83e7fdd1', 'time':678.12, 'time_order': 2, 'variable': 'var2', 'value': 0.0}
]
df = pd.DataFrame.from_dict(data)
Based on the data above the final result I'm looking for is:
{'metric': '3f00d0b5', 'time':52.66, 'time_order': 0, 'variable': 'var1', 'value': 0.035},
{'metric': '3f00d0b5', 'time':52.66, 'time_order': 0.1, 'variable': 'var1', 'value': 0.083},
...
{'metric': '3f00d0b5', 'time':52.66, 'time_order': 0.9, 'variable': 'var1', 'value': 0.4643},
{'metric': '3f00d0b5', 'time':422.4, 'time_order': 1, 'variable': 'var1', 'value': 0.512},
Is there a straightforward, Pythonic way to implement this?
Thank you in advance, Leonardo
You can use groupby and a custom function to augment the data:
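import numpy as np

def data_augment(df):
    # Build a 0.1-step grid over the group's time_order range, then interpolate linearly
    new_index = np.arange(df['time_order'].min(), df['time_order'].max() + 0.1, 0.1)
    return (df.set_index('time_order')['value']
              .reindex(new_index).interpolate())

# 'time' is not carried through the reshape, so select the remaining columns explicitly
out = (df.groupby(['metric', 'variable']).apply(data_augment)
         .stack().rename('value').reset_index()
         [['metric', 'time_order', 'variable', 'value']])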
Output:
>>> out
      metric  time_order variable   value
0   3f00d0b5         0.0     var1  0.0350
1   3f00d0b5         0.1     var1  0.0827
2   3f00d0b5         0.2     var1  0.1304
3   3f00d0b5         0.3     var1  0.1781
4   3f00d0b5         0.4     var1  0.2258
..       ...         ...      ...     ...
79  83e7fdd1         1.6     var2  0.0028
80  83e7fdd1         1.7     var2  0.0021
81  83e7fdd1         1.8     var2  0.0014
82  83e7fdd1         1.9     var2  0.0007
83  83e7fdd1         2.0     var2  0.0000

[84 rows x 4 columns]
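If you also need the time column in the result and want a configurable number of sub-periods, one possible variant is sketched below. The names interpolate_group, augment and steps are hypothetical and only for illustration. It assumes that time should be interpolated linearly like value (the expected output in the question repeats the period's starting time instead, which would call for a forward fill), and it builds the grid from integers so that time_order does not pick up floating-point drift. The small sample frame reuses the first two rows from the question.

```python
import numpy as np
import pandas as pd

def interpolate_group(group, steps=10):
    """Linearly interpolate 'time' and 'value' on a grid of 1/steps per period."""
    g = group.sort_values('time_order')
    lo, hi = int(g['time_order'].min()), int(g['time_order'].max())
    # Integer range divided by steps: avoids accumulating errors from adding 0.1
    grid = np.arange(lo * steps, hi * steps + 1) / steps
    out = pd.DataFrame({'time_order': grid})
    x = g['time_order'].to_numpy(dtype=float)
    for col in ('time', 'value'):
        # Assumption: 'time' is interpolated the same way as 'value'
        out[col] = np.interp(grid, x, g[col].to_numpy(dtype=float))
    return out

def augment(df, steps=10):
    """Apply interpolate_group to every (metric, variable) pair and stack the results."""
    pieces = []
    for (metric, variable), group in df.groupby(['metric', 'variable']):
        piece = interpolate_group(group, steps)
        piece['metric'] = metric
        piece['variable'] = variable
        pieces.append(piece)
    cols = ['metric', 'time', 'time_order', 'variable', 'value']
    return pd.concat(pieces, ignore_index=True)[cols]

# Two rows taken from the question's data
sample = pd.DataFrame([
    {'metric': '3f00d0b5', 'time': 52.66, 'time_order': 0, 'variable': 'var1', 'value': 0.035},
    {'metric': '3f00d0b5', 'time': 422.4, 'time_order': 1, 'variable': 'var1', 'value': 0.512},
])
result = augment(sample, steps=10)
```

Because each group is processed on its own, this version should also cope with groups that cover different period ranges, at the cost of an explicit Python loop over the groups.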