Can I create a histogram/bar graph from a list in Python? TypeError: expected str, found list

I am using matplotlib, pandas and gensim. I am trying to create a histogram of frequent words from text extracted directly from a website. I get a TypeError at these lines:

text = ','.join(map(str, description_list))
word_frequency = Counter(" ".join(description_list[0]).split()).most_common(10)

from this part of my code:

#start of problems
data = {
    "description": [text_corpus]
    }

df = pd.DataFrame(data)
description_list = df['description'].values.tolist()

text = ','.join(map(str, description_list))
word_frequency = Counter(" ".join(description_list[0]).split()).most_common(10)

# `most_common` returns a list of (word, count) tuples
words = [word for word, _ in word_frequency]
counts = [counts for _, counts in word_frequency]

plt.bar(words, counts)
plt.title("10 most frequent tokens in description")
plt.ylabel("Frequency")
plt.xlabel("Words")
plt.show()

print(text)

Here is the initial part of my code, which successfully extracts the text from a website:

from urllib.request import urlopen
from bs4 import BeautifulSoup
import pprint
from re import X
import string
from tokenize import Token
from collections import Counter
import matplotlib.pyplot as plt
import pandas as pd

url = "https://www.bbc.com/news/world-us-canada-61294585"
html = urlopen(url).read()
soup = BeautifulSoup(html, features="html.parser")

# kill all script and style elements
for script in soup(["script", "style"]):
    script.extract()   

# get text
text = soup.get_text()

document = text 

text_corpus = [text]

# Create a set of frequent words
stoplist = set('for a of the and to in'.split(' '))
# Lowercase each document, split it by white space and filter out stopwords
texts = [[word for word in document.lower().split() if word not in stoplist]
         for document in text_corpus]

# Count word frequencies
from collections import defaultdict
frequency = defaultdict(int)
for text in texts:
    for token in text:
        frequency[token] += 1

# Only keep words that appear more than once
text_corpus = [[token for token in text if frequency[token] > 1] for text in texts]
pprint.pprint(text_corpus)

I am new to Python, so any advice will help. Please let me know if there is something fundamentally wrong with my code and whether I have to start over.

If not, I would much appreciate being pointed in the right direction for creating graphs from frequent words, or being shown how to convert this particular list into a string.

Additional question: would it be better to search for specific words from a website instead of extracting all the text?

Thank you very much.

Once you have text_corpus, you can proceed as follows:

#url = "https://stackoverflow.com/questions/72091588/can-i-create-a-histogram-bar-graph-from-a-list-in-python-typeerror-expected-st"

counter = Counter(text_corpus[0]).most_common(10)
words, counts = list(zip(*counter))
plt.bar(words, counts)

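The snippet above relies on Counter.most_common returning a list of (word, count) tuples, which zip(*...) then splits into two parallel sequences for plt.bar. A minimal standalone sketch of that step follows. The sample tokens and the top_words helper name are assumptions made for illustration; they are not part of the original answer.

```python
from collections import Counter


def top_words(tokens, n=10):
    """Return (words, counts) for the n most frequent tokens."""
    pairs = Counter(tokens).most_common(n)
    if not pairs:
        # zip(*[]) would yield nothing to unpack, so handle empty input here
        return (), ()
    words, counts = zip(*pairs)
    return words, counts


# Hypothetical tokens standing in for text_corpus[0]
sample_tokens = ["court", "ruling", "court", "state", "court", "state"]
words, counts = top_words(sample_tokens, n=2)
print(words, counts)

# With matplotlib installed, the result can be passed straight to a bar chart:
# import matplotlib.pyplot as plt
# plt.bar(words, counts)
# plt.show()
```

Because text_corpus[0] is already a list of tokens, it can be handed to Counter directly without joining it into a string first.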

You missed a [0]; it should be:

word_frequency = Counter(" ".join(description_list[0][0]).split()).most_common(10)

Output: [image: bar chart]
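To see why the extra [0] matters: str.join accepts only strings, and description_list[0] still contains a list rather than strings. The sketch below reproduces the nesting with plain lists instead of a DataFrame. The token values are made up, and the two-level stand-in for description_list is an assumption about the shape produced by the question's code.

```python
from collections import Counter

# Hypothetical stand-in for description_list: a single DataFrame cell
# that holds text_corpus, which is itself a list of token lists.
tokens = ["court", "ruling", "court", "state"]
description_list = [[tokens]]

error_message = None
try:
    # The items being joined are lists, not strings, so this raises.
    " ".join(description_list[0])
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# One level deeper the items are strings, so join works.
joined = " ".join(description_list[0][0])
word_frequency = Counter(joined.split()).most_common(10)

# Joining and re-splitting is redundant when the tokens are already a list.
direct = Counter(description_list[0][0]).most_common(10)
print(word_frequency == direct)
```

If the nesting depth is unclear, printing type(description_list[0]) and type(description_list[0][0]) is a quick way to check which level holds the strings.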


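Note: technical posts on this site are licensed under CC BY-SA 4.0. If you repost, please cite this site's URL or the original address. For any questions, contact yoyou2525@163.com.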
