
black noise in python 3.7 using matplotlib

I was running this code in Python 3.7:

import matplotlib.pylab as plt

LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'


def frequency_analysis(plain_text):

    #the text we analyse
    plain_text = plain_text.upper()

    #we use a dictionary to store the letter-frequency pair
    letter_frequency = {}

    #initialize the dictionary (of course with 0 frequencies)
    for letter in LETTERS:
        letter_frequency[letter] = 0

    #let's consider the text we want to analyse 
    for letter in plain_text:
        #we keep incrementing the occurrence of the given letter
        if letter in LETTERS:
            letter_frequency[letter] += 1

    return letter_frequency

def plot_distribution(letter_frequency):
    centers = range(len(LETTERS))
    plt.xlabel("Letters")
    plt.ylabel("Numbers")
    plt.bar(centers, letter_frequency.values(), align='center', tick_label=letter_frequency.keys())
    plt.xlim([0,len(LETTERS)-1])
    plt.show()

if __name__ == "__main__":

    plain_text = "Shannon defined the quantity of information produced by a source for example, the quantity in a message by a formula similar to the equation that defines thermodynamic entropy in physics. In its most basic terms, Shannon's informational entropy is the number of binary digits required to encode a message. Today that sounds like a simple, even obvious way to define how much information is in a message. In 1948, at the very dawn of the information age, this digitizing of information of any sort was a revolutionary step. His paper may have been the first to use the word bit, short for binary digit. As well as defining information, Shannon analyzed the ability to send information through a communications channel. He found that a channel had a certain maximum transmission rate that could not be exceeded. Today we call that the bandwidth of the channel. Shannon demonstrated mathematically that even in a noisy channel with a low bandwidth, essentially perfect, error-free communication could be achieved by keeping the transmission rate within the channel's bandwidth and by using error-correcting schemes: the transmission of additional bits that would enable the data to be extracted from the noise-ridden signal. Today everything from modems to music CDs rely on error-correction to function. A major accomplishment of quantum-information scientists has been the development of techniques to correct errors introduced in quantum information and to determine just how much can be done with a noisy quantum communications channel or with entangled quantum bits (qubits) whose entanglement has been partially degraded by noise."
    frequencies = frequency_analysis(plain_text)
    plot_distribution(frequencies)

I am getting this output: it has black noise along the x-axis.

[image: Python 3.7 output, with black noise along the x-axis]

This is the output of the same code when I run it on Python 2.7:

[image: Python 2.7 output, tick labels rendered correctly]

The black noise does not appear in Python 2.7.

Is there any way to remove the black noise in Python 3.7?

The problem is in the tick_label argument. In Python 3, letter_frequency.keys() is a dict_keys view rather than a list, so the labels are drawn overlapping one another in a weird way. Just convert it to a list to resolve the problem.

If you print letter_frequency.keys() in Python 3, you get:

dict_keys(['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']) 

If you do the same in Python 2.x, you will get:

['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']

Hence, if you are using Python 3, convert letter_frequency.keys() to a list. This post discusses the Python version issue comprehensively.
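The difference is easy to verify directly; here is a minimal sketch, using the same LETTERS constant as in the question:

```python
LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

# initialize the same letter-frequency dictionary as in the question
letter_frequency = {letter: 0 for letter in LETTERS}

# In Python 3, .keys() returns a dict_keys view object, not a list
print(type(letter_frequency.keys()).__name__)  # dict_keys

# list() converts the view into the plain sequence matplotlib expects
print(list(letter_frequency.keys())[:3])       # ['A', 'B', 'C']
```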



Code

def plot_distribution(letter_frequency):
    centers = range(len(LETTERS))
    plt.xlabel("Letters")
    plt.ylabel("Numbers")
    plt.bar(centers, letter_frequency.values(), align='center', 
            tick_label=list(letter_frequency.keys())) # <--- list conversion
    plt.xlim([0,len(LETTERS)-1])
    plt.show()

[image: corrected output after the list conversion]

It's always a bit dangerous to rely on the order of a dictionary. May I hence suggest the following solution, which is much shorter and does not require a sorted dictionary. It works with Python 2.7 or 3.5 and higher, but requires matplotlib >= 2.2.

from collections import Counter
import matplotlib.pylab as plt

LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

def frequency_analysis(plain_text):
    return Counter(plain_text.replace(" ", "").upper())

def plot_distribution(letter_frequency):
    plt.xlabel("Letters")
    plt.ylabel("Numbers")
    plt.bar(list(LETTERS), [letter_frequency[c] for c in LETTERS], align='center')
    plt.show()

if __name__ == "__main__":
    plain_text = "Shannon defined the quantity of information produced by a source for example, the quantity in a message by a formula similar to the equation that defines thermodynamic entropy in physics. In its most basic terms, Shannon's informational entropy is the number of binary digits required to encode a message. Today that sounds like a simple, even obvious way to define how much information is in a message. In 1948, at the very dawn of the information age, this digitizing of information of any sort was a revolutionary step. His paper may have been the first to use the word bit, short for binary digit. As well as defining information, Shannon analyzed the ability to send information through a communications channel. He found that a channel had a certain maximum transmission rate that could not be exceeded. Today we call that the bandwidth of the channel. Shannon demonstrated mathematically that even in a noisy channel with a low bandwidth, essentially perfect, error-free communication could be achieved by keeping the transmission rate within the channel's bandwidth and by using error-correcting schemes: the transmission of additional bits that would enable the data to be extracted from the noise-ridden signal. Today everything from modems to music CDs rely on error-correction to function. A major accomplishment of quantum-information scientists has been the development of techniques to correct errors introduced in quantum information and to determine just how much can be done with a noisy quantum communications channel or with entangled quantum bits (qubits) whose entanglement has been partially degraded by noise."
    frequencies = frequency_analysis(plain_text)
    plot_distribution(frequencies)
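As a side note on why the Counter version needs no explicit zero-initialization: a Counter returns 0 for keys it has never seen instead of raising KeyError, so indexing it with every letter in LETTERS is safe. A minimal sketch:

```python
from collections import Counter

# Counter tallies characters; missing keys default to 0 instead of raising KeyError
counts = Counter("HELLO WORLD".replace(" ", "").upper())
print(counts['L'])  # 3
print(counts['Z'])  # 0
```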

[image: output of the Counter-based version]
