Using Python to turn a list of tuples into a histogram/barchart

I'm working with a dataset that is encoded like so:

[
    [
        (u'90000', 100318), 
        (u'21000', 58094), 
        (u'50000', 14695), 
        (u'250000', 8190), 
        (u'100000', 5718), 
        (u'40000', 4276)
    ]
]

I'd like to turn it into a histogram/barchart.

I've been looking at XXX; so far I've tried this:

fig, ax = plt.subplots()
ax.yaxis.set_major_formatter(formatter)
plt.bar(x, counts)
plt.xticks(counts[0], counts[1])

plt.xticks(rotation=70)
plt.show()

However, it generated the error:

NameError: name 'formatter' is not defined
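
The NameError arises because `formatter` is never assigned before it is passed to `set_major_formatter`; `x` is not defined in the snippet either. Below is a minimal matplotlib sketch, assuming `counts` has the nested shape shown at the top of this post. The helper names `unpack` and `plot_counts` are hypothetical, and the thousands-separator formatter is only a guess at what the missing `formatter` was meant to do:

```python
def unpack(counts):
    """Split the nested [[(label, count), ...]] structure into two lists."""
    labels, values = zip(*counts[0])
    return list(labels), list(values)


def plot_counts(counts):
    """Draw a bar chart of the counts (requires matplotlib)."""
    # Imported here so that unpack() stays usable without matplotlib installed.
    import matplotlib.pyplot as plt
    from matplotlib.ticker import FuncFormatter

    labels, values = unpack(counts)
    # Bars are placed at integer positions; the string labels go on the ticks.
    positions = list(range(len(labels)))

    fig, ax = plt.subplots()
    # Assumption: the missing `formatter` was meant to format the y tick labels.
    formatter = FuncFormatter(lambda value, pos: '{:,.0f}'.format(value))
    ax.yaxis.set_major_formatter(formatter)
    ax.bar(positions, values)
    ax.set_xticks(positions)
    ax.set_xticklabels(labels, rotation=70)
    fig.tight_layout()
    return fig


counts = [[(u'90000', 100318), (u'21000', 58094), (u'50000', 14695),
           (u'250000', 8190), (u'100000', 5718), (u'40000', 4276)]]
labels, values = unpack(counts)
```

Calling `plot_counts(counts)` and then `matplotlib.pyplot.show()` should display the figure. Plotting against integer positions and labelling the ticks separately keeps the bars in the original order rather than spacing them by numeric value.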

The code used to generate that data structure looks like this:

import collections
import itertools
import json

# Read one JSON object per line and map its 'first' value to its 'second' value
with open('toy_two.json', 'rb') as inpt:
    dict_hash_gas = list()
    for line in inpt:
        resource = json.loads(line)
        dict_hash_gas.append({resource['first']:resource['second']})

# Count up the values
counts = collections.Counter(v for d in dict_hash_gas for v in d.values())

# Convert to a list of (value, count) pairs, most frequent first
counts = counts.most_common()

# Apply a threshold
threshold = 4275
counts = [list(group) for val, group in itertools.groupby(counts, lambda x: x[1] > threshold) if val]

print(counts)
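
For reference, the `groupby` step above wraps the surviving pairs in an outer list, which is why the data at the top of the post is nested one level deep. A sketch of the same threshold filter with a plain list comprehension, assuming a flat list of pairs is acceptable; the sample values, the threshold of 1, and the name `frequent` are made up for illustration:

```python
import collections

# Hypothetical sample values standing in for the 'second' fields of the file.
values = ["0"] * 5 + ["3"] * 4 + ["1"] * 2 + ["4"] * 2 + ["2"]

# (value, count) pairs, most frequent first
counts = collections.Counter(values).most_common()

# Keep only the pairs whose count exceeds the threshold, as a flat list.
threshold = 1
frequent = [(value, n) for value, n in counts if n > threshold]
print(frequent)
```

With a flat list, `zip(*frequent)` can be used directly, without indexing into an outer list first.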

And the data looks like this:

{"first":"A","second":"1","third":"2"} 
{"first":"B","second":"1","third":"2"} 
{"first":"C","second":"2","third":"2"} 
{"first":"D","second":"3","third":"2"} 
{"first":"E","second":"3","third":"2"} 
{"first":"F","second":"3","third":"2"} 
{"first":"G","second":"3","third":"2"} 
{"first":"H","second":"4","third":"2"} 
{"first":"I","second":"4","third":"2"} 
{"first":"J","second":"0","third":"2"} 
{"first":"K","second":"0","third":"2"} 
{"first":"L","second":"0","third":"2"} 
{"first":"M","second":"0","third":"2"} 
{"first":"N","second":"0","third":"2"} 

The question:

To be clear, the question is how to render the data at the beginning of this post, i.e.

[
    [
        (u'90000', 100318), 
        (u'21000', 58094), 
        (u'50000', 14695), 
        (u'250000', 8190), 
        (u'100000', 5718), 
        (u'40000', 4276)
    ]
]

as a histogram?

The x-axis would be u'90000', u'21000', ..., u'40000'.

The y-axis would be 100318, 58094, ..., 4276.

data = [
    [
        (u'90000', 100318), 
        (u'21000', 58094), 
        (u'50000', 14695), 
        (u'250000', 8190), 
        (u'100000', 5718), 
        (u'40000', 4276)
    ]
]

Transpose the data to get the x and y values:

# data[0] is the inner list of (label, count) tuples;
# zip(*...) transposes it into a tuple of labels and a tuple of counts
x, y = zip(*data[0])
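
As a quick illustration of what the transposition does, here is a small sketch on the first three pairs only (the name `pairs` is just for this example):

```python
pairs = [(u'90000', 100318), (u'21000', 58094), (u'50000', 14695)]

# Unpacking with * passes each tuple as a separate argument to zip,
# which then groups the first elements together and the second elements together.
x, y = zip(*pairs)

print(x)  # the labels
print(y)  # the counts
```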

Compress the y values so they will fit on the screen:

import math
# A logarithm with base 1.5 shrinks the counts to bar lengths between 20 and 28
y = [int(math.log(n, 1.5)) for n in y]

Iterate over the data and create the histogram:

for label, value in zip(x, y):
    print('{:>10}: {}'.format(label, 'x'*value))


>>>
     90000: xxxxxxxxxxxxxxxxxxxxxxxxxxxx
     21000: xxxxxxxxxxxxxxxxxxxxxxxxxxx
     50000: xxxxxxxxxxxxxxxxxxxxxxx
    250000: xxxxxxxxxxxxxxxxxxxxxx
    100000: xxxxxxxxxxxxxxxxxxxxx
     40000: xxxxxxxxxxxxxxxxxxxx
>>> 
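
Note that the logarithmic compression makes the bars fit but hides how much larger the first counts are than the rest. If proportional bars matter, a linear scaling is one alternative. This is only a sketch; `scale_linear` and its `width` parameter are hypothetical names, and 50 columns is an arbitrary choice:

```python
def scale_linear(values, width=50):
    """Scale values so that the largest one maps to `width` characters."""
    largest = max(values)
    # max(1, ...) keeps very small values visible as a single character.
    return [max(1, round(v * width / largest)) for v in values]


x = (u'90000', u'21000', u'50000', u'250000', u'100000', u'40000')
y = (100318, 58094, 14695, 8190, 5718, 4276)

bars = scale_linear(y)
for label, length in zip(x, bars):
    print('{:>10}: {}'.format(label, 'x' * length))
```

The trade-off is that the smallest counts shrink to only a few characters, so the choice between the two depends on whether readability of the small bars or true proportions is more important.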
