How to put latitude and longitude in one column quickly?
I need to calculate distances between two data points, (lat1, lon1) and (lat2, lon2).
I found a way to do it here:
import geopy.distance
coords_1 = (52.2296756, 21.0122287)
coords_2 = (52.406374, 16.9251681)
# vincenty() was removed in geopy 2.0; geodesic() is its replacement
print(geopy.distance.geodesic(coords_1, coords_2).km)
As a result, I need to combine latitude and longitude into one column. I found a way here; however, it takes too much time.
from shapely.geometry import Point
df["point1"] = df[["lon1", "lat1"]].apply(Point, axis=1)
df["point2"] = df[["lon2", "lat2"]].apply(Point, axis=1)
Is there a faster solution?
Try using geopandas.points_from_xy():
import geopandas
df['points1'] = geopandas.points_from_xy(df.lon1, df.lat1)
df['points2'] = geopandas.points_from_xy(df.lon2, df.lat2)
If it is still too slow, install pygeos, which will vectorize points_from_xy() and speed it up further.
If you want tuples of the form (x, y), you can do this.
Imagine your dataframe looks like this:
import pandas as pd
df = pd.read_csv(r"C:\users\k_sego\LatLong.csv", sep=";")
print(df)
          Lat        Lon
0   59.214735  18.062262
1   59.214735  18.062262
2   59.214735  18.062262
3   59.213542  18.063627
4   59.212553  18.064678
..        ...        ...
70  59.199559  18.046147
71  59.199559  18.046147
72  59.199559  18.046147
73  59.198898  18.051291
74  59.199044  18.055571
Then
df['new_col'] = list(zip(df.Lat, df.Lon))
produces this:
          Lat        Lon                 new_col
0   59.214735  18.062262  (59.214735, 18.062262)
1   59.214735  18.062262  (59.214735, 18.062262)
2   59.214735  18.062262  (59.214735, 18.062262)
3   59.213542  18.063627  (59.213542, 18.063627)
4   59.212553  18.064678  (59.212553, 18.064678)
..        ...        ...                     ...
70  59.199559  18.046147  (59.199559, 18.046147)
71  59.199559  18.046147  (59.199559, 18.046147)
72  59.199559  18.046147  (59.199559, 18.046147)
73  59.198898  18.051291  (59.198898, 18.051291)
74  59.199044  18.055571  (59.199044, 18.055571)
If you want 'point' as a tuple:
df['point1'] = list(zip(df['lat1'].values, df['lon1'].values))
If you want 'point' as a list:
df['point1'] = list(map(list,zip(df['lat1'].values, df['lon1'].values)))
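Once both point columns exist as (lat, lon) tuples, the distance per row can be computed by zipping them together. The following is a minimal sketch, not the method from the question: the sample data is made up, and haversine_km is a hypothetical pure-Python helper (spherical great-circle distance) used here only so the example has no dependency beyond pandas. geopy.distance.geodesic(p1, p2).km could be substituted for it and will give slightly different values, since it uses an ellipsoidal model.

```python
import math

import pandas as pd


def haversine_km(p1, p2):
    """Great-circle distance in km between two (lat, lon) tuples in degrees.

    Hypothetical helper for illustration; assumes a spherical Earth.
    """
    lat1, lon1 = map(math.radians, p1)
    lat2, lon2 = map(math.radians, p2)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0088 * math.asin(math.sqrt(a))


# Made-up sample data using the column names from the question.
df = pd.DataFrame({
    "lat1": [52.2296756, 59.214735],
    "lon1": [21.0122287, 18.062262],
    "lat2": [52.406374, 59.214735],
    "lon2": [16.9251681, 18.062262],
})

# Build both point columns as (lat, lon) tuples with zip.
df["point1"] = list(zip(df["lat1"], df["lon1"]))
df["point2"] = list(zip(df["lat2"], df["lon2"]))

# One distance per row, without DataFrame.apply.
df["dist_km"] = [haversine_km(p1, p2)
                 for p1, p2 in zip(df["point1"], df["point2"])]
print(df[["point1", "point2", "dist_km"]])
```

Note the (lat, lon) order: geopy expects latitude first, whereas shapely's Point and geopandas.points_from_xy() take x (longitude) first.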
Performance comparison:
%timeit geopandas.points_from_xy(df.D, df.B)
108 µs ± 2.55 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit list(map(list,zip(df['D'].values, df['B'].values)))
4.82 µs ± 12.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
As you can see, zip/list/map is a lot faster: about 4.8 µs versus 108 µs per loop in this benchmark.
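Timings in the microsecond range suggest the benchmark was run on a small frame, so it may be worth repeating the comparison on data of realistic size. The sketch below is one way to do that with the standard timeit module. The synthetic data, the row count, and the function names with_apply and with_zip are all assumptions for illustration, and it compares row-wise apply against zip rather than against geopandas.

```python
import timeit

import pandas as pd

# Synthetic data; the row count is an arbitrary assumption.
n = 20_000
df = pd.DataFrame({
    "lat1": [59.2 + i * 1e-6 for i in range(n)],
    "lon1": [18.0 + i * 1e-6 for i in range(n)],
})


def with_apply(frame):
    # Row-wise apply: one Python-level call per row.
    return frame[["lat1", "lon1"]].apply(tuple, axis=1).tolist()


def with_zip(frame):
    # zip over the underlying arrays: no per-row Series construction.
    return list(zip(frame["lat1"].values, frame["lon1"].values))


# Sanity check: both approaches should build the same tuples.
assert with_apply(df.head(100)) == with_zip(df.head(100))

# Average wall-clock time over a few runs for each approach.
for fn in (with_apply, with_zip):
    seconds = timeit.timeit(lambda: fn(df), number=3) / 3
    print(f"{fn.__name__}: {seconds:.4f} s per run")
```

The absolute numbers depend on the machine and the pandas version, so only the ratio between the two approaches is meaningful.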