
Generating a scatter plot in Matplotlib with negative and positive axes

I am working on a project that plots clinical values using Matplotlib and want to display a y-axis with both negative and positive values going from -3 to 3. I'm getting the data from a DataFrame.

An example of the data I'm trying to plot:

analyte_name = ['Uric Acid - Basic', 'Urea', 'Triglycerides - Basic', 'Sodium', 'Potassium - Basic', 'Glucose - Basic', 'Gamma Glutamytranferase - Basic', 'Creatinine - Basic', 'Cholesterol Total - Basic', 'Cholesterol LDL - Basic', 'Cholesterol HDL - Basic', 'Chloride - Basic']
z_scores = ['-0.10', '-0.60', '-0.01', '-0.77', '-12.95', '-0.55', '-0.58', '-0.37', '-0.07', '0.19', '0.88', '0.69']

This is what I could come up with:

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import numpy as np

df = pd.DataFrame({'x_': analyte_name, 'y_': z_scores})
fig = plt.figure()
ax = fig.add_subplot(111)

ax.set_xlabel('analyte name')
ax.set_ylabel('z-score')

# plt.axhline(0, color='black')
plt.ylim(-3, 3)
plt.xticks(rotation=90)
plt.scatter('x_', 'y_' ,data=df, marker='o')
# plt.style.use('seaborn-dark')
plt.show()

But this gives me a plot that looks like this:

y-axis plotted in sequence from z_scores[0] onwards but not displaying all z_scores


Commenting out the plt.ylim(-3, 3) line gives me an image like this:

y-axis plotted in sequence from z_scores[0] onwards and displaying all z_score but in sequence


The code I'm using is modified from one I tried using before which was:

df = pd.DataFrame({'x_':['A','B','C','D','E'], 
'y_':np.random.uniform(-3,3,5)})

fig = plt.figure()
ax = fig.add_subplot(111)

# ax.spines['top'].set_visible(False)
# ax.spines['right'].set_visible(False)

ax.set_xlabel('sample')
ax.set_ylabel('z-score')

plt.axhline(0, color='black')
plt.ylim(-3, 3)
plt.scatter('x_', 'y_' ,data=df, marker='o')
# plt.style.use('seaborn-dark')
plt.show()

That code generated what I want my final output to look like before some slight styling:

y-axis with negative and positive values


I've tried different ways of passing the data to the x- and y-axes, such as passing it as a dictionary, but the results have been the same.

I'm still learning how to plot data and hope to get some help.

Thanks.

The problem is that your z-scores are stored as strings. Matplotlib does not interpret strings as numbers; it treats them as categorical values and places them along the y-axis in order of appearance, so plotting two categorical variables against each other gives a straight line. To fix the issue, convert your z-scores to floats:

import matplotlib.pyplot as plt
import numpy as np

# convert to numpy arrays
analyte_name = np.array(['Uric Acid - Basic', 'Urea', 'Triglycerides - Basic', 'Sodium', 'Potassium - Basic', 'Glucose - Basic', 'Gamma Glutamytranferase - Basic', 'Creatinine - Basic', 'Cholesterol Total - Basic', 'Cholesterol LDL - Basic', 'Cholesterol HDL - Basic', 'Chloride - Basic'])
z_scores = np.array(['-0.10', '-0.60', '-0.01', '-0.77', '-12.95', '-0.55', '-0.58', '-0.37', '-0.07', '0.19', '0.88', '0.69'])

# plot, converting your z-scores from strings to floats
plt.plot(analyte_name, z_scores.astype(float))
plt.xticks(rotation=90)  # keep the long analyte names readable
plt.show()

This will fix your problem!

Without converting them to floats I got this image:

zscores_as_strings

When converted you can see things are being plotted correctly:

z_scores_as_float
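
Since the question builds a DataFrame and wants a scatter plot with fixed limits, the same fix can be sketched on the DataFrame itself. This is only a minimal sketch, assuming the column names 'x_' and 'y_' from the question; pd.to_numeric is used here in place of astype(float), and either should work for this data.

```python
import matplotlib.pyplot as plt
import pandas as pd

analyte_name = ['Uric Acid - Basic', 'Urea', 'Triglycerides - Basic', 'Sodium',
                'Potassium - Basic', 'Glucose - Basic',
                'Gamma Glutamytranferase - Basic', 'Creatinine - Basic',
                'Cholesterol Total - Basic', 'Cholesterol LDL - Basic',
                'Cholesterol HDL - Basic', 'Chloride - Basic']
z_scores = ['-0.10', '-0.60', '-0.01', '-0.77', '-12.95', '-0.55',
            '-0.58', '-0.37', '-0.07', '0.19', '0.88', '0.69']

df = pd.DataFrame({'x_': analyte_name, 'y_': z_scores})

# Convert the string column to a numeric dtype before plotting
df['y_'] = pd.to_numeric(df['y_'])

fig, ax = plt.subplots()
ax.scatter('x_', 'y_', data=df, marker='o')
ax.axhline(0, color='black')       # reference line at z = 0
ax.set_ylim(-3, 3)                 # now a real numeric range
ax.set_xlabel('analyte name')
ax.set_ylabel('z-score')
ax.tick_params(axis='x', labelrotation=90)
fig.tight_layout()
# plt.show()  # uncomment when running interactively
```

Note that with the limits fixed at -3 to 3, the 'Potassium - Basic' value of -12.95 falls outside the visible range and will not be drawn; widen the limits if that point needs to appear.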

Edit:

This also explains why only 4 data points appear when you call plt.ylim(-3, 3). With string values the y-axis has no numeric scale, so Matplotlib places each distinct string at an integer position (0, 1, 2, ...) in order of appearance. The limits -3 to 3 then cover only positions 0 through 3, i.e. the 0th, 1st, 2nd and 3rd data points.
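
The position mapping can be checked directly. The sketch below is illustrative only: it plots the string z-scores against a plain integer x, then asks the y-axis to convert the strings with Axis.convert_units, which returns the positions Matplotlib assigned to them. The Agg backend is an assumption made so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt

# z-scores still stored as strings, as in the question
z_scores = ['-0.10', '-0.60', '-0.01', '-0.77', '-12.95', '-0.55',
            '-0.58', '-0.37', '-0.07', '0.19', '0.88', '0.69']

fig, ax = plt.subplots()
ax.scatter(range(len(z_scores)), z_scores)

# Ask the categorical y-axis which position each string was given
positions = ax.yaxis.convert_units(z_scores)

# Keep only the strings whose position lies inside ylim(-3, 3)
visible = [label for label, pos in zip(z_scores, positions) if -3 <= pos <= 3]

print(list(positions))
print(visible)
```

On this data the positions should simply count up from 0 in order of appearance, so only the first four strings end up inside the -3 to 3 window, regardless of the numbers they spell out.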
