How to apply function to dataframe based on row value
Consider the table below:
Value1 Value2
1 1
1 2
1 3
2 7
2 8
2 9
... ...
100 1
100 2
100 3
The column Value1 contains sets of triplicates for the numbers 1 to 100. The column Value2 contains random numbers. I am wondering how to apply a function to each set of triplicates and append the results together.
For example, consider a function called "Interpolate", which linearly interpolates each group of triplicates into five rows, affecting only the column Value2. The result would be:
Value1 Value2
1 1
1 1.5
1 2
1 2.5
1 3
2 7
2 7.5
2 8
2 8.5
2 9
... ...
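To make the intended arithmetic concrete, here is a minimal plain-Python sketch of what "Interpolate" would do to the Value2 entries of a single triplicate: it keeps the original values and inserts the midpoint between each pair of neighbours. The function name `interpolate_midpoints` is made up for illustration and assumes a non-empty list.

```python
def interpolate_midpoints(values):
    """Return values with the midpoint inserted between each pair of neighbours.

    Hypothetical helper for illustration; assumes `values` is non-empty.
    """
    result = []
    for left, right in zip(values, values[1:]):
        result.append(left)
        result.append((left + right) / 2)  # linear midpoint of the two neighbours
    result.append(values[-1])  # the last original value has no right neighbour
    return result


print(interpolate_midpoints([1, 2, 3]))  # [1, 1.5, 2, 2.5, 3]
print(interpolate_midpoints([7, 8, 9]))  # [7, 7.5, 8, 8.5, 9]
```

Three input values always give five output values, which matches the five rows per group in the table above.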
You can groupby "Value1" and apply a function to each group:
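def interpolate(x):
    # change index to even numbers (0, 2, 4, ...)
    x.index = range(0, 2*len(x), 2)
    # add NaN rows for odd indices
    x = x.reindex(range(2*len(x)-1))
    # linearly interpolate to fill the NaN rows
    x = x.interpolate()
    return x

# df is the dataframe from the question; droplevel(0) drops the group key from the index
out = df.groupby('Value1').apply(interpolate).droplevel(0)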
The above code as a one-liner:
out = df.groupby('Value1').apply(lambda x: x.set_axis(range(0, 2*len(x), 2)).reindex(range(2*len(x)-1))).interpolate().droplevel(0)
Output:
Value1 Value2
0 1.0 1.0
1 1.0 1.5
2 1.0 2.0
3 1.0 2.5
4 1.0 3.0
0 2.0 7.0
1 2.0 7.5
2 2.0 8.0
3 2.0 8.5
4 2.0 9.0
0 100.0 1.0
1 100.0 1.5
2 100.0 2.0
3 100.0 2.5
4 100.0 3.0
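For anyone who wants to run this end to end, below is a self-contained sketch. The sample frame is an assumption: it contains only the groups 1, 2 and 100 with the Value2 numbers shown in the output above. It also swaps `groupby(...).apply` for an explicit loop over the groups followed by `pd.concat`, so the result does not depend on how a given pandas version treats the grouping column inside `apply`. The helper name `interpolate_group` is made up for illustration.

```python
import pandas as pd

# Assumed sample data: three of the groups from the question.
df = pd.DataFrame({
    "Value1": [1, 1, 1, 2, 2, 2, 100, 100, 100],
    "Value2": [1, 2, 3, 7, 8, 9, 1, 2, 3],
})


def interpolate_group(group):
    """Hypothetical helper: stretch one group from n rows to 2n - 1 rows."""
    n = len(group)
    # Move the existing rows to the even index labels 0, 2, 4, ...
    spaced = group.set_axis(range(0, 2 * n, 2))
    # Reindexing adds NaN rows at the odd labels; interpolate fills them linearly.
    return spaced.reindex(range(2 * n - 1)).interpolate()


# Iterating over a groupby yields (key, sub-frame) pairs, sorted by key.
pieces = [interpolate_group(group) for _, group in df.groupby("Value1")]
out = pd.concat(pieces, ignore_index=True)

print(out)
```

With `ignore_index=True` the result gets a fresh 0..n-1 index instead of the repeated 0 to 4 labels shown above; drop that argument if the per-group labels are wanted.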