How to dynamically generate Pipes in Python Pandas?
I'm building a data augmentation pipeline using dataframes. I've created a function, h3_int, that takes an int input and appends a column of hex values to the dataframe. Here's the implementation of h3_int:
from h3.unstable import vect

def h3_int(df, level):
    # Append a column of H3 indexes at the given resolution, computed from lat/lng.
    df['h3_' + str(level)] = vect.geo_to_h3(df.lat.values, df.lng.values, level).tolist()
    return df
df consists of a lat and a lng column:
lat lng
0 43.64617 -79.42451
1 43.64105 -79.37628
2 43.66724 -79.41598
3 43.69602 -79.45468
4 43.66890 -79.32592
... ... ...
9515 36.10644 -115.16711
9516 36.00814 -115.17496
9517 36.10711 -115.16607
9518 36.03119 -115.05352
9519 36.13554 -115.11541
Here's the simple usage of h3_int:
df.pipe(h3_int, 8)
Since the input is dynamic, I'd like to dynamically generate the pipes as well, but I've been having difficulty implementing this. The code:
(df.pipe(h3_int, i) for i in range(8, 10))
returns:
<generator object <genexpr> at 0x7fd4858557b0>
while:
(df.pipe((h3_int, i) for i in range(8, 10)))
returns:
TypeError: 'generator' object is not callable
What's the correct method for implementing dynamic pipes in pandas? Unfortunately I've found the documentation and SO lacking in answers.
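A comprehension written inside parentheses is a generator expression: it is evaluated lazily and is not indexable, which is why your first attempt only shows a generator object. In your second attempt the generator itself is passed to df.pipe, which expects a callable, hence the TypeError. Instead, you can use square brackets to create a list, which is evaluated immediately and is indexable: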
>>> [df.pipe(h3_int, i) for i in range(8, 9)][0]
lat lng h3_8
0 43.64617 -79.42451 613256717813153791
1 43.64105 -79.37628 613256717559398399
2 43.66724 -79.41598 613256718316470271
3 43.69602 -79.45468 613256716607291391
4 43.66890 -79.32592 613256718037549055
5 36.10644 -115.16711 613220086766895103
6 36.00814 -115.17496 613220073288499199
7 36.10711 -115.16607 613220086766895103
8 36.03119 -115.05352 613220075656183807
9 36.13554 -115.11541 613220087052107775
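If the goal is a single dataframe that carries one column per level, rather than a list of results, the pipes can be chained in a loop or with functools.reduce. The following is only a minimal sketch: add_level is a hypothetical stand-in with the same shape as h3_int (it writes a placeholder string instead of calling h3), and the two-row frame is made-up sample data.

```python
from functools import reduce

import pandas as pd


def add_level(df, level):
    # Hypothetical stand-in for h3_int: same signature, but it writes a
    # placeholder string so the sketch runs without the h3 package.
    df["h3_" + str(level)] = [
        f"{lat:.2f},{lng:.2f}@{level}" for lat, lng in zip(df.lat, df.lng)
    ]
    return df


# Made-up sample data with the same column names as in the question.
df = pd.DataFrame({"lat": [43.64617, 36.10644], "lng": [-79.42451, -115.16711]})

levels = range(8, 10)

# Feed the result of each pipe into the next one, starting from df.
result = reduce(lambda frame, level: frame.pipe(add_level, level), levels, df)

print(list(result.columns))  # ['lat', 'lng', 'h3_8', 'h3_9']
```

A plain for loop that reassigns df = df.pipe(add_level, level) on each iteration does the same thing and may read more clearly.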
Note that df was modified in place because your function h3_int does not copy it before modifying it. That's not bad, it's just something to keep in mind.
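To make the in-place behaviour concrete, here is a small sketch comparing a function that mutates its argument with one that works on a copy. Both tag_in_place and tag_copy are hypothetical stand-ins for h3_int (they store the level itself instead of an H3 index), so treat this as an illustration of the pattern rather than a drop-in replacement.

```python
import pandas as pd


def tag_in_place(df, level):
    # Hypothetical stand-in shaped like h3_int: mutates the frame it receives.
    df["h3_" + str(level)] = level
    return df


def tag_copy(df, level):
    # Same idea, but works on a copy so the caller's frame stays untouched.
    out = df.copy()
    out["h3_" + str(level)] = level
    return out


original = pd.DataFrame({"lat": [43.64617], "lng": [-79.42451]})
mutated = original.pipe(tag_in_place, 8)
print(mutated is original)     # True
print(list(original.columns))  # ['lat', 'lng', 'h3_8']

fresh = pd.DataFrame({"lat": [43.64617], "lng": [-79.42451]})
copied = fresh.pipe(tag_copy, 8)
print(list(fresh.columns))     # ['lat', 'lng']
print(list(copied.columns))    # ['lat', 'lng', 'h3_8']
```

Copying costs extra memory on large frames, so whether it is worth it depends on whether you still need the original dataframe afterwards.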