
Plotting a scatter plot in python3 where x axis is latitude/longitude in km and y axis is depth

I am trying to find the best way to plot some data. I have a data file with the columns Latitude, Longitude, Depth_m, Sample_ID, and Group_ID. I would like to generate a 2-D scatter plot where y is depth and x is distance in km from north to south (or transect distance calculated relative to the first station sampled in the indicated orientation), similar to an ODV-style map like the one below:

[image]

UPDATED

I wanted to add a little more information to my initial question. After some more searching and testing, I found a possible solution in R: the distGeo function from the geosphere package converts my coordinates to distances in km, which can then be plotted. (https://www.rdocumentation.org/packages/geosphere/versions/1.5-10/topics/distGeo)

If anyone knows a python way to do this though that'd be great!

UPDATED

ODV doesn't allow me to do the customization I need, though. I would like to generate a plot like this where I can specify a metadata variable to color the dots. More specifically, I want to color by the Group_ID column in my data file, an example of which is shown below.

Latitude  Longitude  Depth_m  Sample_ID  Group_ID
49.7225   -42.4467   10       S1         1
49.7225   -42.4467   50       S2         1
49.7225   -42.4467   75       S3         1
49.7225   -42.4467   101      S4         1
49.7225   -42.4467   152      S5         1
49.7225   -42.4467   199      S6         1
46.312    -39.658    10       S7         2
46.312    -39.658    49       S8         2
46.312    -39.658    73       S9         2
46.312    -39.658    100      S10        2
46.312    -39.658    153      S11        2
46.312    -39.658    198      S12        2

It's been giving me a lot of trouble, though. I have calculated the distances between coordinates using the haversine formula, but I am not sure how to use those distances in a scatter plot. This is what I have so far:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
#import haversine as hs
from math import radians
from sklearn.neighbors import DistanceMetric
df=pd.read_csv("locations.csv",sep="\t",index_col="Sample_ID")
#plt.scatter(df['Latitude'], df['Depth_m'])
#plt.show()
df['Latitude'] = np.radians(df['Latitude'])
df['Longitude'] = np.radians(df['Longitude'])
dist = DistanceMetric.get_metric('haversine')
x = dist.pairwise(np.unique(df[['Latitude','Longitude']].to_numpy(),axis=0))*6373
print(x)

This code gives me a distance matrix for my coordinates, but I honestly can't figure out how to pull that into a scatter plot whose x-axis runs from north to south, especially since there are multiple depths at the same coordinate that have to be accounted for. Any help with the plotting is much appreciated!

For the distance calculation you can use the geopy package, specifically geopy.distance.geodesic(), to calculate the distance along an arc by assuming a particular ellipsoid (e.g. WGS84).

To generate a plot similar to what you've described you can use the matplotlib library's scatterplot functionality, specifically matplotlib.pyplot.scatter() .

The code example below steps through the distance calculation (the distance from a reference lat/long to another lat/long; this isn't necessarily the north-south component, but that is easy enough to calculate) and then shows two ways to generate the scatter plot with the points coloured by your Group_ID field.

import matplotlib.pyplot as plt
import geopy.distance  # import the submodule explicitly so geopy.distance.geodesic resolves
import pandas as pd

# Load your sample data to a Pandas DataFrame where each column corresponds to
# 'Latitude', 'Longitude', 'Depth_m', 'Sample_ID', 'Group_ID'
datafile = r'<path to a file containing your data>'
df = pd.read_csv(datafile, sep='\t')  # the sample file in the question is tab-separated

# Defining one end of our arc to calculate distance along (arbitrarily taking 
# the first point in the example data as the reference point).
ref_point = (df['Latitude'].iloc[0], df['Longitude'].iloc[0])

# Loop over each sample location, calculating the distance along the arc using
# the geopy.distance.geodesic function.
dist = []
for i in range(len(df)):
    cur_point = (df['Latitude'].iloc[i], df['Longitude'].iloc[i])
    cur_geodesic = geopy.distance.geodesic(ref_point, cur_point)
    cur_dist = cur_geodesic.km
    dist.append(cur_dist)

# Add computed distances to the df DataFrame as column 'Distance_km'
df['Distance_km'] = dist

# Create a matplotlib figure and add two axes for plotting
fig = plt.figure()
ax1 = fig.add_subplot(211)
ax2 = fig.add_subplot(212)

# Example 1: creating a scatter plot using the calculated distance field and
# colouring the points using a numeric field (i.e. Group_ID in this case is numeric)
pts = ax1.scatter(df['Distance_km'], df['Depth_m'], s=30, c=df['Group_ID'], cmap=plt.cm.jet)
plt.colorbar(pts, ax=ax1)

ax1.set_xlabel('Arc Distance from Reference Point (km)')
ax1.set_ylabel('Depth (m)')
ax1.set_title('Colouring Points by a Numeric Field')
ax1.invert_yaxis()
ax1.grid(True)

# Example 2: creating basically the same scatter plot as above but handling the
# case of non-numeric values in the field to be used for colour (e.g. imagine
# wanting to use the Sample_ID field instead)
groups = sorted(df['Group_ID'].unique())  # unique Group_ID values in a stable order
for gid in groups:
    df_tmp = df[df['Group_ID'] == gid]
    ax2.scatter(df_tmp['Distance_km'], df_tmp['Depth_m'], s=30, label=gid)
    
ax2.legend(loc='upper center', title='Legend')
ax2.set_xlabel('Arc Distance from Reference Point (km)')
ax2.set_ylabel('Depth (m)')
ax2.set_title('Colouring Points Using Categorical Values')
ax2.invert_yaxis()
ax2.grid(True)

fig.tight_layout()
plt.show()

And the output figure: [image]
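The answer above notes that the geodesic distance is not necessarily the north-south component. As a rough sketch of that component, assuming a spherical Earth with a mean radius of 6371.0088 km (an approximation, not the WGS84 ellipsoid that geopy uses), the distance along a meridian depends only on the latitude difference. The names `north_south_km` and `EARTH_RADIUS_KM` are made up for this illustration and are not part of geopy.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius (spherical approximation)

def north_south_km(ref_lat_deg, lat_deg):
    """Distance along a meridian from ref_lat_deg to lat_deg, positive southward."""
    return EARTH_RADIUS_KM * math.radians(ref_lat_deg - lat_deg)

# Station latitudes taken from the sample data in the question
stations = [49.7225, 46.312]
ref = max(stations)  # use the northernmost station as the origin
offsets = [north_south_km(ref, lat) for lat in stations]
print([round(d, 1) for d in offsets])
```

With a DataFrame, the same idea could be applied column-wise, for example by passing `df['Latitude'].max()` as the reference and each row's latitude as the second argument.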

I am not sure what you are trying to do with distance, but conceptually you need to get your x output into your dataframe as a new column, as I have done. To give each group a different color, I would use seaborn, since its plotting functions have a hue parameter. Please see below the output of your first scatterplot and an attempt at what you are trying to do with your second scatterplot:

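import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
# Note: newer scikit-learn releases expose DistanceMetric from sklearn.metrics instead
from sklearn.neighbors import DistanceMetric
import seaborn as sns

# Load the tab-separated file from the question
df = pd.read_csv("locations.csv", sep="\t", index_col="Sample_ID")

fig, ax = plt.subplots(nrows=2)
# First plot: depth against latitude (in degrees), coloured by group
sns.scatterplot(data=df, x='Latitude', y='Depth_m', hue='Group_ID', ax=ax[0])

# The haversine metric expects [latitude, longitude] in radians
df['Latitude'] = np.radians(df['Latitude'])
df['Longitude'] = np.radians(df['Longitude'])
dist = DistanceMetric.get_metric('haversine')
# Row 0 of the pairwise matrix is the distance (km) from the first sample to every sample
df['Distance'] = (dist.pairwise(df[['Latitude','Longitude']].to_numpy())*6373)[0]
# Second plot: depth against distance, coloured by group
sns.scatterplot(data=df, x='Distance' , y='Depth_m', hue='Group_ID', ax=ax[1])
plt.show()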

[image]
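The `[0]` index in the code above keeps only the first row of the pairwise matrix, so every x value is the straight-line distance from the first sample. For a transect with more than two stations, one possible alternative is a cumulative along-track distance over the stations ordered from north to south. The sketch below uses only the standard library and assumes a spherical Earth; `haversine_km` and `transect_km` are hypothetical helper names, and ordering purely by latitude is an assumption that only suits a roughly north-south transect.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius (spherical approximation)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def transect_km(samples):
    """Return one x value per (lat, lon) sample: the cumulative distance
    along the unique stations, ordered from north to south."""
    stations = sorted(set(samples), key=lambda p: -p[0])
    along = {stations[0]: 0.0}
    for prev, cur in zip(stations, stations[1:]):
        along[cur] = along[prev] + haversine_km(*prev, *cur)
    # Samples that share a station share the same x value
    return [along[p] for p in samples]

# Coordinates from the sample data in the question: six depths per station
samples = [(49.7225, -42.4467)] * 6 + [(46.312, -39.658)] * 6
x_km = transect_km(samples)
print(sorted(set(round(x) for x in x_km)))
```

Because samples at the same coordinates map to the same x value, the multiple depths at each station stack vertically in the scatter plot. With pandas, the result could be stored as a column, for example `df['Distance_km'] = transect_km(list(zip(df['Latitude'], df['Longitude'])))`, provided the latitude and longitude columns are still in degrees.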
