I have a CSV file with about 28 columns and 4000 rows. From two of these columns I want to plot about 50 specific rows. I used pandas to select this part of the file, but I cannot figure out how to make it read the numbers in scientific notation correctly.
My code:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("20180416309.csv", sep=";")
x = df.loc[df[u'run#'] == 3, [u' Diameter']].values
y = df.loc[df[u'run#'] == 3, [u' dN/dlnD']].values
plt.plot(x, y)
plt.show()
So, I am trying to plot the columns u' Diameter' and u' dN/dlnD' for the rows where the column u'run#' contains the number 3. When I type "x" or "y" in the IPython console, the right numbers are shown.
Unfortunately, the plot looks like this:
As you can see, the exponent of the scientific notation is ignored for the numbers on the y-axis. How can I fix this? This is my first attempt at using matplotlib and pandas, so please excuse the beginner question.
Edit:
The file's data looks like this:
run#; Diameter; dN/dlnD;
12; +3,58151E+01; +1,17336E+03;
13; +3,26913E+01; +6,06044E+03;
13; +2,98524E+01; +1,76516E+04;
13; +2,72704E+01; +4,88716E+04;
13; +2,49202E+01; +1,00035E+05;
When I print my "x" or "y" data in the IPython console, the output looks like this:
[' +1,94251E+02'],
[' +5,23981E+02'],
[' +0,00000E+00'],
[' +1,10525E+02'],
[' +0,00000E+00'],
[' +4,76363E+01'],
[' +1,61714E+01'],
[' +1,65482E+02'],
[' +0,00000E+00'],
[' +4,75312E+02'],
[' +4,20174E+01']], dtype=object)
SOLUTION:
As you pointed out, the comma was the problem. I simply added the decimal setting in the code:
df = pd.read_csv("test.csv", sep=";", decimal=",")
Now the graph looks the way it is supposed to.
Thank you!
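For anyone who wants to reproduce the effect without the original file, here is a minimal sketch. It assumes the sample rows from the edit above are representative of the whole file and feeds them to pandas through io.StringIO; the variable names raw, as_text and as_numbers are made up for illustration.

```python
import io
import pandas as pd

# Sample rows copied from the question. Each line ends with ';',
# so pandas also creates an empty fourth column ("Unnamed: 3").
raw = """run#; Diameter; dN/dlnD;
12; +3,58151E+01; +1,17336E+03;
13; +3,26913E+01; +6,06044E+03;
13; +2,98524E+01; +1,76516E+04;
13; +2,72704E+01; +4,88716E+04;
13; +2,49202E+01; +1,00035E+05;
"""

# Default decimal separator ('.'): the comma values stay as text.
as_text = pd.read_csv(io.StringIO(raw), sep=";")

# decimal="," tells the parser that ',' is the decimal mark,
# so the same values are converted to floats.
as_numbers = pd.read_csv(io.StringIO(raw), sep=";", decimal=",")

print(as_text.dtypes)
print(as_numbers.dtypes)
```

Comparing the two dtypes listings is a quick check to run before plotting: if a column that should be numeric is not reported as a float or integer type, matplotlib will treat its values as labels rather than numbers.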
It's clear that the CSV data wasn't read the way you expected. Based on your examples, every value in those columns, including the numbers, was read as a string. The reason is that your file uses a comma as the decimal separator, while pandas.read_csv expects a period unless you tell it otherwise through its decimal parameter. I modified the small snippet of data you provided so that the period, not the comma, represents the decimal point, which is customary in my locale. As you can see, the data is then read properly into the dataframe.
df = pd.read_csv("d:\\users\\floyd\\documents\\sample.csv", sep=';'); df
Out[72]:
run# Diameter dN/dlnD
0 12 35.8151 1173.36
1 13 32.6913 6060.44
2 13 29.8524 17651.60
3 13 27.2704 48871.60
4 13 24.9202 100035.00
I also removed the annoying leading spaces in the column names with this.
df.columns = [col.strip() for col in df.columns]; df.columns
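The column clean-up can also be sketched in a self-contained way. The snippet below is only an illustration built on two sample rows from the question; it shows the stripping step with the vectorized df.columns.str.strip() and, as an alternative, the skipinitialspace option of read_csv, which skips the blanks that follow each delimiter. The names raw, df and df2 are placeholders.

```python
import io
import pandas as pd

# Two sample rows from the question; the trailing ';' adds an
# empty fourth column, which is ignored here.
raw = """run#; Diameter; dN/dlnD;
12; +3,58151E+01; +1,17336E+03;
13; +3,26913E+01; +6,06044E+03;
"""

# Option 1: strip the column names after loading.
df = pd.read_csv(io.StringIO(raw), sep=";", decimal=",")
df.columns = df.columns.str.strip()

# Option 2: have the parser skip spaces after each ';' while reading.
df2 = pd.read_csv(io.StringIO(raw), sep=";", decimal=",",
                  skipinitialspace=True)

print(list(df.columns))
print(list(df2.columns))
```

Either way the columns can then be addressed as df['Diameter'] and df['dN/dlnD'] without the leading space.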
Now it plots properly.
plt.plot(df['Diameter'], df['dN/dlnD'])
Out[75]: [<matplotlib.lines.Line2D at 0x25ef97bd0b8>]
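If the data has already been loaded as strings and re-reading the file is not convenient, the conversion can also be done afterwards. This is a rough sketch, not something from the original post: it takes a few of the string values shown in the question, swaps the comma for a period, and converts the result with pd.to_numeric. The names strings, cleaned and numbers are hypothetical.

```python
import pandas as pd

# A few of the string values shown in the question's console output.
strings = pd.Series([" +1,94251E+02", " +5,23981E+02", " +0,00000E+00"])

# Remove surrounding blanks and replace the decimal comma with a period.
cleaned = strings.str.strip().str.replace(",", ".", regex=False)

# Convert the cleaned text to floating-point numbers.
numbers = pd.to_numeric(cleaned)

print(numbers)
```

Passing decimal="," to read_csv is still the simpler route when the file can be read again, since it converts every affected column in one step.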