I'm getting an error trying to use ColumnDataSource in Bokeh

I'm getting this error:

TypeError: Object of type Interval is not JSON serializable

Here is my code.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import math

from bokeh.io import output_file, show
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource
from bokeh.models import NumeralTickFormatter


def construct_labels(start, end):
    labels = []
    for index, x in enumerate(start):
        y = end[index]
        labels.append('({}, {}]'.format(x, y))
    return labels


values = {'Length': np.random.uniform(0, 4, 10)}

df = pd.DataFrame(values, columns=['Length'])

bin_step_size = 0.5

# List of bin points.
p_bins = np.arange(0, (df['Length'].max() + bin_step_size), bin_step_size)

# Reduce the tail to create the left side bounds.
p_left_limits = p_bins[:-1].copy()

# Cut the head to create the right side bounds.
p_right_limits = np.delete(p_bins, 0)

# Create the bins.
p_range_bins = pd.IntervalIndex.from_arrays(p_left_limits, p_right_limits)

# Create labels.
p_range_labels = construct_labels(p_left_limits, p_right_limits)

p_ranges_binned = pd.cut(
    df['Length'],
    p_range_bins,
    labels=p_range_labels,
    precision=0,
    include_lowest=True)

out = p_ranges_binned
counts = out.value_counts(sort=False)

total_element_count = len(df.index)
foo = pd.DataFrame({'bins': counts.index, 'counts': counts})
foo.reset_index(drop=True, inplace=True)
foo['percent'] = foo['counts'].apply(lambda x: x / total_element_count)
foo['percent_full'] = foo['counts'].apply(lambda x: x / total_element_count * 100)

bin_labels = p_range_labels

# Data Container
source = ColumnDataSource(dict(
    bins=foo['bins'],
    percent=foo['percent'],
    count=foo['counts'],
    labels=pd.DataFrame({'labels': bin_labels})
))

p = figure(x_range=bin_labels, plot_height=600, plot_width=1200, title="Range Counts",
           toolbar_location=None, tools="")

p.vbar(x='labels', top='percent', width=0.9, source=source)

p.yaxis[0].formatter = NumeralTickFormatter(format="0.0%")
p.xaxis.major_label_orientation = math.pi / 2
p.xgrid.grid_line_color = None
p.y_range.start = 0

output_file("bars.html")
show(p)

The error comes from here:

source = ColumnDataSource(dict(
    bins=foo['bins'],
    percent=foo['percent'],
    count=foo['counts'],
    labels=pd.DataFrame({'labels': bin_labels})
))

The bins column you pass in holds pandas Interval objects, a type that cannot be JSON serialized.
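
To see why this happens, here is a minimal sketch using Python's standard json module. Bokeh has its own serializer, so this only approximates what it does, and the helper name is_json_serializable is made up for illustration. It shows that pandas Interval objects are rejected while their string forms are accepted:

```python
import json
import pandas as pd

# Two example bins, built the same way as p_range_bins in the question.
intervals = pd.IntervalIndex.from_arrays([0.0, 0.5], [0.5, 1.0])


def is_json_serializable(values):
    """Hypothetical helper: True if the standard json module can encode values."""
    try:
        json.dumps(list(values))
        return True
    except TypeError:
        return False


# Interval objects as they come out of the IntervalIndex.
raw_ok = is_json_serializable(intervals)
# The same bins converted to plain strings.
str_ok = is_json_serializable([str(iv) for iv in intervals])
print(raw_ok, str_ok)
```

If you do want to keep a bins column in the source, converting it to strings first along these lines should avoid the error.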

Looking at your code, the bins column is never used in your plot, so you can change the source to:

source = ColumnDataSource(dict(
    percent=foo['percent'],
    count=foo['counts'],
    labels=bin_labels
))

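Note that I also changed your labels to bin_labels, which is a plain list; ColumnDataSource accepts lists as input. You may want to format these labels further, though, because right now they look like this: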

['(0.0, 0.5]',
 '(0.5, 1.0]',
 '(1.0, 1.5]',
 '(1.5, 2.0]',
 '(2.0, 2.5]',
 '(2.5, 3.0]',
 '(3.0, 3.5]',
 '(3.5, 4.0]']

You will probably want to format them into something more readable.
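
One possible way to do that is sketched below. The function name pretty_labels and the "lo to hi" format are only assumptions for illustration; any format works, as long as the same list is used for both x_range and the labels column.

```python
def pretty_labels(left_edges, right_edges):
    """Hypothetical formatter: build one 'lo to hi' label per bin.

    The 'g' format drops trailing zeros, so 1.0 becomes '1'.
    """
    return ['{:g} to {:g}'.format(lo, hi)
            for lo, hi in zip(left_edges, right_edges)]


# Example edges matching the first three bins shown above.
labels = pretty_labels([0.0, 0.5, 1.0], [0.5, 1.0, 1.5])
print(labels)  # ['0 to 0.5', '0.5 to 1', '1 to 1.5']
```

In the question's code this would replace the call to construct_labels, for example bin_labels = pretty_labels(p_left_limits, p_right_limits).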

After this small change, you should be able to see your bar chart.

