
Plot data from a row using column name as x axis in bokeh

I'm starting on a project where I want to create an interactive plot from this dataset:

[link to the dataset]

For now I'm just trying to plot the first row across the 2000 to 2012 columns. For that I use this:

import pandas as pd
from bokeh.io import output_file
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
from bokeh.plotting import show

output_file('test.html')

df = pd.read_csv('Swedish_Population_Statistics.csv', encoding="ISO-8859-1")
df.dropna(inplace=True)  # Drop rows with missing attributes
df.drop_duplicates(inplace=True)  # Remove duplicates

# Drop all the columns I don't use for now
df.drop(['region', 'marital_status', 'sex'], inplace=True, axis=1)

x = df.loc[[0]]

print(x)

Which gives me this dataframe:

    2000   2001   2002   2003   2004   2005   2006   2007   2008   2009   2010   2011   2012
0  10406  10362  10322  10288  10336  10336  10429  10585  10608  10718  10860  11121  11288

Now I want to take the column names as x-axis and the row values as y-axis.

This is where I'm stuck.

I figure the code would look like this, but I can't work out what to put in x and y:

x = df.columns.tolist()  # Take the column names as a list
y = df.loc[[0]].values.tolist()  # Take the first row
source = ColumnDataSource(x, y)

p = figure(title="Test")
p.line(x='x', y='y', source=source, line_color="blue", line_width=2)

I get this error:

BokehUserWarning: ColumnDataSource's columns must be of the same length. Current lengths: ('x', 13), ('y', 1)

I don't understand why the lengths are not the same, since I used tolist() on both.

Any help would be greatly appreciated. I've been trying to find a solution for the past 3 hours with no success.

Okay, so I found my problem: y was a 2-dimensional list, but I needed a 1-dimensional list. That leads me to this working code:

output_file('test.html')

df = pd.read_csv('Swedish_Population_Statistics.csv', encoding="ISO-8859-1")
df.dropna(inplace=True)  # Drop rows with missing attributes
df.drop_duplicates(inplace=True)  # Remove duplicates

# Drop all the columns I don't use for now
df.drop(['region', 'marital_status', 'sex'], inplace=True, axis=1)

x = df.columns.tolist()
y = df.loc[[0]]
temp = []
temp2 = []

# Append each value of the dataframe row to a 1-dimensional list, one by one

for i in range(13):
    temp.append(y[str(2000+i)].tolist())
    temp2.append(temp[i][0])

p = figure(title="Test", sizing_mode="scale_both")
p.line(x, temp2, line_color="blue", line_width=2)
p.circle(x, temp2, fill_color="white", size=8)

show(p)

With this result:

[Image: the resulting plot]

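There is no need for a loop. You were on the right track, but you should not use double brackets: df.loc[[0]] returns a one-row DataFrame (2-D), while df.loc[0] returns the row as a Series (1-D).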

>>> df.loc[0].values.tolist()
[111, 222, 333]

Then the dimensions of x and y are the same.
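To make the difference concrete, here is a minimal sketch. It assumes a small stand-in DataFrame with only three of the year columns (not the real CSV), and `make_plot` is a hypothetical helper name. The categorical `x_range` is one possible way to handle the string column names; converting them to integers would work too.

```python
import pandas as pd

# Stand-in for the first row of the question's data (three of the 13 year columns).
df = pd.DataFrame({"2000": [10406], "2001": [10362], "2002": [10322]})

x = df.columns.tolist()             # column names as a flat list of strings
y_2d = df.loc[[0]].values.tolist()  # list selector -> DataFrame -> nested list with one inner row
y = df.loc[0].values.tolist()       # scalar label -> Series -> flat list of values

print(len(x), len(y_2d), len(y))    # y_2d has length 1, x and y have matching lengths


def make_plot(x, y):
    """Hypothetical helper: build the line plot from two equal-length lists."""
    # Imported here so the shape check above runs even without bokeh installed.
    from bokeh.models import ColumnDataSource
    from bokeh.plotting import figure

    source = ColumnDataSource(data=dict(x=x, y=y))
    # The year labels are strings, so a categorical x-range is assumed here.
    p = figure(title="Test", x_range=x)
    p.line(x="x", y="y", source=source, line_color="blue", line_width=2)
    return p
```

Calling `show(make_plot(x, y))` after `from bokeh.plotting import show` should then render the line without the length warning, since both columns of the source have the same length.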
