Why can data read from a .CSV file with Pandas not be plotted using matplotlib after turning it into integers?
My Goal
Display a bar chart showing the names and durations of the first 30 Netflix shows from a .CSV file.
Relevant Code after Trial & Error
names = pd.read_csv("netflix_titles.csv", nrows=31, usecols=[2])
durations = pd.read_csv("netflix_titles.csv", nrows=31, usecols=[9])
durations[['duration']] = durations[['duration']].astype(int)
Then I plot it.
plt.bar(names,durations)
plt.title("Show Durations")
plt.xlabel("Name of Shows")
plt.ylabel("Durations (In Minutes)")
plt.show()
31 rows are read because the first row is the header. durations is converted to integers because the numbers in that column seem to be read as strings or some other type that wouldn't work with matplotlib.
Error Message
TypeError: unhashable type: 'numpy.ndarray'
I don't think NumPy applies to what I'm trying to do, so I'm at a dead end here.
This version was able to draw a bar chart for the first 31 values:
dataset = pd.read_csv("netflix_titles.csv")
names = dataset['title'].head(31)
durations = dataset['duration'].head(31)
plt.bar(names,durations)
plt.title("Show Durations")
plt.xlabel("Name of Shows")
plt.ylabel("Durations (In Minutes)")
plt.show()
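A likely reason this version works while the earlier one fails: pd.read_csv with usecols returns a two-dimensional DataFrame even for a single column, whereas dataset['title'] is a one-dimensional Series, which is what plt.bar expects. The sketch below only illustrates that difference in shape; the inline CSV text is a made-up stand-in for netflix_titles.csv, not the real data.

```python
import io
import pandas as pd

# Hypothetical stand-in for netflix_titles.csv; the real file is not shown here.
csv_text = "title,duration\nShow A,90\nShow B,120\nShow C,45\n"

# read_csv with usecols returns a DataFrame, even for a single column.
frame = pd.read_csv(io.StringIO(csv_text), usecols=[1])

# Selecting one column by name returns a Series instead.
series = pd.read_csv(io.StringIO(csv_text))["duration"]

print(type(frame).__name__, frame.shape)    # DataFrame (3, 1)
print(type(series).__name__, series.shape)  # Series (3,)

# squeeze("columns") collapses a one-column DataFrame into a Series.
flattened = frame.squeeze("columns")
print(type(flattened).__name__, flattened.shape)  # Series (3,)
```

Checking type() and .shape on both arguments before calling plt.bar is a quick way to spot this kind of mismatch.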
The problem is that you are making two different DataFrames from the CSV file and trying to plot them against each other. While this is possible, a much simpler approach is to create a single DataFrame from the selected columns and rows of the CSV file and then plot it, as demonstrated below:
import pandas as pd
from matplotlib import pyplot as plt
df = pd.read_csv("netflix_titles.csv", nrows=31, usecols=[2,9])
df.columns = ['name', 'duration']
df['duration'] = df['duration'].astype(int)
df.set_index('name', inplace=True)
df.plot(kind = 'bar')
plt.title("Show Durations")
plt.xlabel("Name of Shows")
plt.ylabel("Durations (In Minutes)")
plt.show()
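If you would rather keep the explicit plt.bar style from the question, one possible variant is to read the file once and pass the two columns to the plotting call as Series. This is only a sketch under a few assumptions: the column names title and duration are taken from the question's second snippet, the inline CSV text is invented sample data, and the duration column is assumed to contain plain whole numbers. Note that nrows counts data rows only, so the header does not need an extra row.

```python
import io
import pandas as pd
from matplotlib import pyplot as plt

# Hypothetical sample standing in for netflix_titles.csv.
csv_text = "title,duration\nShow A,90\nShow B,120\nShow C,45\nShow D,60\n"

# nrows counts data rows only; the header line is not included in the count.
df = pd.read_csv(io.StringIO(csv_text), usecols=["title", "duration"], nrows=3)
df["duration"] = df["duration"].astype(int)

fig, ax = plt.subplots()
# df["title"] and df["duration"] are 1-D Series, which bar() accepts directly.
bars = ax.bar(df["title"], df["duration"])
ax.set_title("Show Durations")
ax.set_xlabel("Name of Shows")
ax.set_ylabel("Durations (In Minutes)")
ax.tick_params(axis="x", labelrotation=90)  # keep long titles readable
fig.tight_layout()
# Call plt.show() here to display the chart when running interactively.
```

With the real file, replacing io.StringIO(csv_text) with the file name and nrows=3 with nrows=30 should give the first 30 shows.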