Python, pandas data frame, conditional formatting for coordinates

I have a data frame which holds the coordinates of a recorded route. The data frame structure is something like this (it has more columns):

No  Latitude   Longitude  Altitude  Speed  Course  Date        Time      etc.
0   59.303758  18.078915  NaN       0.0    114.9   2017/04/01  13:21:48
1   59.303758  18.078915  -8.5      0.0    114.9   2017/04/01  13:21:49
2   59.303758  18.078915  -8.5      0.0    114.9   2017/04/01  13:21:50
...

and the list goes on.

I'm trying to filter unwanted points out of the data frame. See the picture for an example: the red line represents the coordinate points from the data frame, and I want to keep only the points on the greenish fields.

[Image: Route]

Example code:

#north
y_1n=59.33551 #point 1 latitude
x_1n=18.02649 #point 1 longitude
y_2n=59.33327 #point 2 latitude
x_2n=18.04500 #point 2 longitude
#south
y_1s=59.33478 #point 3 latitude
x_1s=18.02645 #point 3 longitude
y_2s=59.33246 #point 4 latitude
x_2s=18.04422 #point 4 longitude
#
test = df1[(df1['Latitude'] <= y_1n) & (df1['Latitude'] >= y_2n) &
            (df1['Latitude'] <= y_1s) & (df1['Latitude'] >= y_2s) &
            (df1['Longitude'] >= x_1n) & (df1['Longitude'] <= x_2n) &
            (df1['Longitude'] >= x_1s) & (df1['Longitude'] <= x_2s)
          ]

So the idea is that only the data inside these four predefined coordinate points (two north, two south) is included in the new data frame.

With that code I managed to filter the data, but the result was far away from the north and south points (only half of the street was included). So it either filtered out too much or something odd happened.

Is there a better or more efficient way to do this?

The rectangle isn't aligned with the longitude and latitude axes, so you can't use your simple long/lat check. A simple way to do this is to take a line starting at the given longitude/latitude and extend it several miles (some amount much larger than the rectangle) in an arbitrary direction (probably a cardinal direction, for ease).

Then, write an intersect function intersect(Point1, Point2, Point3, Point4) that returns true if Line(P1, P2) intersects Line(P3, P4). Then, with your extended line, check how many edges of your bounding box it intersects. If the answer is one, you're good: the point is inside the box. (Zero or two crossings means the point is outside.)
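A minimal sketch of that idea in plain Python, under a few assumptions: points are (longitude, latitude) tuples, the area is small enough to treat the coordinates as flat, and the corners are listed in order around the rectangle. The names segments_intersect and inside_box and the reach parameter are made up for this illustration, and the corner values are the ones from the question.

```python
def segments_intersect(p1, p2, p3, p4):
    """Return True if segment p1-p2 strictly crosses segment p3-p4."""
    def cross(o, a, b):
        # z-component of (a - o) x (b - o); its sign tells which side b is on
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    d1 = cross(p3, p4, p1)
    d2 = cross(p3, p4, p2)
    d3 = cross(p1, p2, p3)
    d4 = cross(p1, p2, p4)
    # Each segment's endpoints must lie on opposite sides of the other segment
    return d1 * d2 < 0 and d3 * d4 < 0


def inside_box(point, corners, reach=1.0):
    """Cast a line due east from point and count the edges it crosses."""
    far = (point[0] + reach, point[1])  # reach is in degrees, far beyond the box
    crossings = 0
    for i in range(len(corners)):
        edge_start = corners[i]
        edge_end = corners[(i + 1) % len(corners)]
        if segments_intersect(point, far, edge_start, edge_end):
            crossings += 1
    return crossings % 2 == 1  # an odd count means the point is inside


# Corners in order around the outline: north-west, north-east, south-east, south-west
corners = [(18.02649, 59.33551), (18.04500, 59.33327),
           (18.04422, 59.33246), (18.02645, 59.33478)]

print(inside_box((18.03554, 59.334005), corners))  # centre of the rectangle
print(inside_box((18.04000, 59.335000), corners))  # within the lat/long ranges, but off the street
print(inside_box((18.078915, 59.303758), corners))  # first row of the sample data
```

The second test point falls inside the plain latitude/longitude ranges of the four corners but outside the tilted rectangle, which is exactly the case a simple range check gets wrong. A line that passes exactly through a corner is not handled specially here; with real GPS data that is unlikely, but a more careful version would need to cover it.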

I solved this in the following way.

First I created a GeoPandas GeoDataFrame and used Shapely to create a polygon. Then I added the polygon to the GeoDataFrame, along with a location name that corresponds to the polygon.

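import geopandas as gpd
from shapely.geometry import Point, Polygon

# Corners as (longitude, latitude), listed in order around the outline
# (north-west, north-east, south-east, south-west) so the edges do not cross.
coord = [(18.02649, 59.33551), (18.04500, 59.33327),
         (18.04422, 59.33246), (18.02645, 59.33478)]

# One-row GeoDataFrame holding the polygon and its location name
polygon = gpd.GeoDataFrame({'Location': ['Fleminggatan']},
                           geometry=[Polygon(coord)])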
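One thing worth checking at this step is the order of the corners. Shapely joins the coordinates in the order they are given, so they have to go around the outline. Listing both north points and then both south points, each from west to east, would make two edges cross and give a "bow-tie" shape instead of a rectangle. The snippet below is only a sanity-check sketch; the names n1, n2, s1, s2, bowtie and ring are made up for illustration.

```python
from shapely.geometry import Point, Polygon

# Corner points from the question as (longitude, latitude)
n1, n2 = (18.02649, 59.33551), (18.04500, 59.33327)  # north-west, north-east
s1, s2 = (18.02645, 59.33478), (18.04422, 59.33246)  # south-west, south-east

bowtie = Polygon([n1, n2, s1, s2])  # edges n2-s1 and s2-n1 cross each other
ring = Polygon([n1, n2, s2, s1])    # corners listed around the outline

print(bowtie.is_valid)  # self-crossing outline, so not a valid polygon
print(ring.is_valid)

# With the ring order, a point in the middle of the street is inside,
# while the first row of the sample data is not.
print(ring.contains(Point(18.03554, 59.334005)))
print(ring.contains(Point(18.078915, 59.303758)))
```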

Then I copied the pandas DataFrame into a GeoPandas GeoDataFrame.

df2 = gpd.GeoDataFrame(df1)

After that I added a new series to the DataFrame, which combines the Latitude and Longitude series into points.

df2['geometry'] = [Point(xy) for xy in zip(df2.Longitude, df2.Latitude)]

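Then I used a GeoPandas spatial join. The spatial predicate (the op argument in older GeoPandas releases, predicate in newer ones) doesn't really matter here because I'm joining points to a polygon. If these were lines, it would make a difference.

# predicate= is the newer spelling; older GeoPandas releases use op='intersects'
df3 = gpd.sjoin(df2, polygon, how='inner', predicate='intersects')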

After this I was left with a DataFrame containing only the data from the wanted location.
