Can I create a histogram/bar graph from a list in Python? TypeError: expected str, found list

Question

I am using matplotlib, pandas and gensim. I am trying to create a histogram of the most frequent words in text extracted directly from a website. I get a TypeError on these lines:

text = ','.join(map(str, description_list))
word_frequency = Counter(" ".join(description_list[0]).split()).most_common(10)

They come from this part of my code:

#start of problems
data = {
    "description": [text_corpus]
    }

df = pd.DataFrame(data)
description_list = df['description'].values.tolist()

text = ','.join(map(str, description_list))
word_frequency = Counter(" ".join(description_list[0]).split()).most_common(10)

# `most_common` returns a list of (word, count) tuples
words = [word for word, _ in word_frequency]
counts = [counts for _, counts in word_frequency]

plt.bar(words, counts)
plt.title("10 most frequent tokens in description")
plt.ylabel("Frequency")
plt.xlabel("Words")
plt.show()

print(text)

Here is the initial part of my code, which successfully extracts the text from a website:

from urllib.request import urlopen
from bs4 import BeautifulSoup
import pprint
from collections import Counter
import matplotlib.pyplot as plt
import pandas as pd

url = "https://www.bbc.com/news/world-us-canada-61294585"
html = urlopen(url).read()
soup = BeautifulSoup(html, features="html.parser")

# kill all script and style elements
for script in soup(["script", "style"]):
    script.extract()   

# get text
text = soup.get_text()

document = text 

text_corpus = [text]

# Create a set of frequent words
stoplist = set('for a of the and to in'.split(' '))
# Lowercase each document, split it by white space and filter out stopwords
texts = [[word for word in document.lower().split() if word not in stoplist]
         for document in text_corpus]

# Count word frequencies
from collections import defaultdict
frequency = defaultdict(int)
for text in texts:
    for token in text:
        frequency[token] += 1

# Only keep words that appear more than once
text_corpus = [[token for token in text if frequency[token] > 1] for text in texts]
pprint.pprint(text_corpus)

I am new to Python, so any advice would help. Please let me know if something is fundamentally wrong with my code and whether I have to start over.

If not, I would appreciate being pointed in the right direction for creating graphs from frequent words, or being shown how to convert this particular list into a string.

Additional question: would it be better to search for specific words on a website instead of extracting all of the text?

Thank you very much.

Answer 1

Once you have text_corpus, you can proceed as follows:

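#url = "https://stackoverflow.com/questions/72091588/can-i-create-a-histogram-bar-graph-from-a-list-in-python-typeerror-expected-st"

# text_corpus[0] is the filtered token list of the single document
counter = Counter(text_corpus[0]).most_common(10)
# split the (word, count) pairs into one tuple of words and one of counts
words, counts = list(zip(*counter))
plt.bar(words, counts)
plt.show()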
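
In case the zip(*counter) step is unclear, here is a minimal sketch of what it does. The token list is made up for illustration and stands in for text_corpus[0]; the plotting call is left out so the snippet runs with the standard library alone.

```python
from collections import Counter

# Hypothetical token list standing in for text_corpus[0].
tokens = ["court", "ruling", "court", "draft", "ruling", "court", "leak"]

# most_common(n) returns (word, count) pairs, highest count first.
top = Counter(tokens).most_common(2)

# zip(*pairs) transposes the pairs into one tuple of words and one of counts,
# which is the shape that plt.bar(words, counts) takes.
words, counts = zip(*top)

print(words)
print(counts)
```

Note that if the token list is empty, most_common returns an empty list and the unpacking raises a ValueError, so it may be worth checking for that before plotting.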

Answer 2

You missed a [0]; it should be:

word_frequency = Counter(" ".join(description_list[0][0]).split()).most_common(10)

Output: (bar chart of the 10 most frequent tokens; image not preserved)
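
To see why the extra [0] is needed, here is a small sketch of the nesting. It assumes that df['description'].values.tolist() hands back the single cell wrapped in one more list, which description_list below imitates with made-up tokens; pandas itself is left out so the snippet is self-contained.

```python
from collections import Counter

# Hypothetical stand-in for the filtered token lists built in the question:
# one document, already split into tokens.
text_corpus = [["court", "ruling", "court", "draft", "ruling", "court"]]

# Assumption: this mirrors the result of df['description'].values.tolist()
# when the only cell in the column holds text_corpus itself.
description_list = [text_corpus]

# description_list[0] is still a list of lists, so join() rejects it.
try:
    " ".join(description_list[0])
except TypeError as exc:
    error_message = str(exc)

# One more [0] reaches the flat list of strings that join() expects.
joined = " ".join(description_list[0][0])
word_frequency = Counter(joined.split()).most_common(2)

print(error_message)
print(word_frequency)
```

The join and split round trip is only there to match the original line. Since description_list[0][0] is already a list of tokens, passing it to Counter directly should give the same counts.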
