
Python data visualization with python bokeh

I recently started working on data visualization with the bokeh library. My task is to take CSV data and turn it into a graph with Python. I'm facing some issues. My environment and the problem are described below.

Environment

  • python = 2.7.14
  • bokeh = 0.12.13

Problem description

I need to read data from a CSV file named "data.csv". The file has the columns Id, upbyte, downbyte, time ("timestamp"). I need help drawing the data with figure.multi_line. I made an attempt, but the data still does not come out the way I want.

My code:

import pandas
from bokeh.plotting import Figure, output_file, show

def run_graph():
    df = pandas.read_csv("/Users/path/fetch_data.csv", parse_dates=["StatTime"])
    p = Figure(width=500, height=250, x_axis_type="datetime", responsive=True,
        tools="pan, box_zoom, wheel_zoom, save, reset", logo=None,
        title="Graph:", x_axis_label="Time Frame", y_axis_label="Traffic")

    timeFrame = df["Time"]
    upbyte = df["up"]
    downbyte = df["Down"]
    protocolname = df["Name"]

    # These are the calls in question: multi_line fails with the TypeError reported under "Error"
    p.multi_line(x = [timeFrame, upbyte], y = [timeFrame, downbyte], color=['Red', 'green'], line_width=1)
    p.circle(x = [timeFrame, upbyte], y = [timeFrame, downbyte], fill_color='orange', size=6)

    output_file("/Users/path/graph.html", title="Reports")

    show(p)


run_graph()

Error

The script fails with: TypeError: multiline() takes exactly 3 arguments (1 given)

I hope my question is clear. If not, please let me know and I will provide more details. Thank you in advance, gents.

I think you want to plot both the upbytes and the downbytes with the timestamp on the x-axis. I see that your data has multiple records for each timestamp. I added a few more rows to make the graph a bit easier to understand:

[image]

To get the graph right, use this code:

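from bokeh.plotting import figure, show

# timeFrame, upbyte and downbyte are the DataFrame columns read in the question's run_graph()
p = figure(width=500, height=250, x_axis_type="datetime",
    tools="pan, box_zoom, wheel_zoom, save, reset", logo=None,
    title="OTT Traffic Utilization Graph:", x_axis_label="Time Frame", y_axis_label="Traffic Utilization")
# One list entry per line: both lines share the same x values (the timestamps)
p.multi_line(xs=[timeFrame, timeFrame], ys=[upbyte, downbyte], color=['red', 'green'], line_width=1)
# One circle() call per series to mark the individual points
p.circle(x=timeFrame, y=upbyte, fill_color='red', size=6)
p.circle(x=timeFrame, y=downbyte, fill_color='green', size=6)
show(p)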

[image]

multi_line requires the x values of all series and the y values of all series, each as a list of lists (xs and ys). Both of your series share the same x values, so your xs are just the timestamps repeated once per series.
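As a rough sketch of that list-of-lists shape, the helper below repeats one shared x sequence once per series. The names multi_line_args and plot_lines are made up for this illustration and are not part of bokeh; the length check is my own addition. Bokeh is imported inside plot_lines only so that the helper can be tried without it.

```python
def multi_line_args(x, *series):
    """Return (xs, ys) as lists of lists, repeating x once per series.

    Hypothetical helper for illustration; not a bokeh function.
    """
    xs = [list(x) for _ in series]
    ys = [list(values) for values in series]
    for one_x, one_y in zip(xs, ys):
        if len(one_x) != len(one_y):
            raise ValueError("each series needs exactly one value per x")
    return xs, ys


def plot_lines(x, up, down):
    """Draw both series against the same x values with a single multi_line call."""
    # Imported here so multi_line_args can be used without bokeh installed.
    from bokeh.plotting import figure, show

    xs, ys = multi_line_args(x, up, down)
    p = figure(x_axis_type="datetime")
    p.multi_line(xs=xs, ys=ys, color=["red", "green"], line_width=1)
    show(p)
```

With the question's variables this would be called as plot_lines(timeFrame, upbyte, downbyte).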

You also want to highlight the points with circles. For that you need to call the circle method twice, once per series, because there is no multi_circle counterpart to multi_line.
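If there are more than two series, the repeated circle calls can be written as a loop. This is only a sketch: add_markers is a name invented here, and it assumes the plot object has a circle method that accepts x, y, fill_color and size, as the figure in the bokeh version from the question does.

```python
def add_markers(plot, x, series, size=6):
    """Call plot.circle once per (values, color) pair.

    Hypothetical helper: `plot` is assumed to be a bokeh figure,
    `series` an iterable of (y_values, color) tuples.
    """
    for values, color in series:
        plot.circle(x=x, y=values, fill_color=color, size=size)
```

For the plot in this answer the call would be add_markers(p, timeFrame, [(upbyte, "red"), (downbyte, "green")]).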

Now, I guess you want to first summarize your data at the timestamp level and then plot it. If you plot the summarized data, it will look like this: [image]
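The answer does not say how the data was summarized. One possible reading, sketched below, is to sum the byte counters of all rows that share a timestamp with a pandas groupby. The function name summarize_by_time, the choice of sum as the aggregation, and the sample rows are all assumptions made for this illustration; the column names "Time", "up" and "Down" are taken from the question's code.

```python
import pandas as pd


def summarize_by_time(df, time_col="Time", value_cols=("up", "Down")):
    """Sum the value columns over all rows that share a timestamp.

    Assumption: summing is the wanted aggregation; use .mean() or .max()
    instead if that fits the data better.
    """
    return (
        df.groupby(time_col, as_index=False)[list(value_cols)]
        .sum()
        .sort_values(time_col)
    )


# Made-up sample rows: two records at 10:00 and one at 10:05.
sample = pd.DataFrame({
    "Time": pd.to_datetime(
        ["2018-01-01 10:00", "2018-01-01 10:00", "2018-01-01 10:05"]
    ),
    "up": [10, 20, 5],
    "Down": [1, 2, 3],
})
summary = summarize_by_time(sample)  # one row per distinct timestamp
```

The columns of the summary can then be passed to multi_line and circle in place of the raw columns.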


 