Python won't take a value from a .csv file
I have this assignment question and I wrote the following code for it. But Python keeps telling me that "Reservoir" is not in the dataframe even though it is. How do I fix this?
Here is a link to the .CSV file if needed:
https://drive.google.com/file/d/1SZ639cUA3DdrlI_lG2Hq0vs6HiT8OAU3/view?usp=sharing
My code is below:
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('CF Around Lubbock Production Table.CSV')
By_County = df.groupby(['County/Parish']).sum().Reservoir
x = By_County.index
y = By_County.values
plt.figure(figsize=(10, 8))
plt.bar(x,y)
for i, j in zip(x,y):
    plt.text(i, j+10, int(j), ha = 'center')
plt.xlabel('County', fontsize = 20)
plt.ylabel('Total Clearfork Wells', fontsize = 20)
plt.xticks(fontsize = 12)
plt.yticks(fontsize = 15)
plt.show()
Column Reservoir appears to be of type object (the values are strings in your case). pandas won't sum columns with string values when you aggregate over the whole dataframe, so the column is left out of the result.
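One way to confirm this before aggregating is to inspect the column's dtype. The snippet below is only a sketch on a small made-up frame: the column names 'County/Parish' and 'Reservoir' come from the question, while the rows and the 'Depth' column are assumptions for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the production table; only the column names
# 'County/Parish' and 'Reservoir' are taken from the question.
df = pd.DataFrame({
    "County/Parish": ["CROSBY (TX)", "CROSBY (TX)", "HALE (TX)"],
    "Reservoir": ["CLEAR FORK", "CLEARFORK", "CLEARFORK"],
    "Depth": [5000, 5100, 6200],  # made-up numeric column
})

# String columns are usually reported as 'object'; some newer pandas
# versions may report a dedicated string dtype instead.
reservoir_dtype = df["Reservoir"].dtype

# A version-independent check: is the column numeric at all?
is_numeric = pd.api.types.is_numeric_dtype(df["Reservoir"])

print(reservoir_dtype, is_numeric)
```

If `is_numeric` is False, summing the column is unlikely to give a meaningful number.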
What you can try:
By_County = df.groupby(['County/Parish'])['Reservoir'].sum()
This works because the sum is applied to a single Series. But do you really want concatenated strings?
County/Parish
CROSBY (TX) CLEAR FORKCLEAR FORKCLEAR FORKCLEAR FORKCLEAR ...
GARZA (TX) CLEARFORKCLEARFORKCLEARFORKCLEARFORKCLEARFORKC...
HALE (TX) CLEARFORKCLEARFORKCLEARFORKCLEARFORKCLEARFORKC...
HOCKLEY (TX) CLEARFORKCLEAR FORKCLEARFORKCLEAR FORKCLEAR FO...
LAMB (TX) CLEARFORKCLEARFORKCLEARFORKCLEARFORKCLEARFORKC...
Name: Reservoir, dtype: object
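If the goal is the number of wells per county (as the y-axis label in the question suggests), counting rows may be closer to what is wanted than summing strings. This is a minimal sketch that assumes each row of the table is one well; the sample rows are made up.

```python
import pandas as pd

# Made-up sample rows; the real table is assumed to have one row per well.
df = pd.DataFrame({
    "County/Parish": ["CROSBY (TX)", "CROSBY (TX)", "HALE (TX)"],
    "Reservoir": ["CLEAR FORK", "CLEARFORK", "CLEARFORK"],
})

# count() returns the number of non-null Reservoir entries in each county.
wells_per_county = df.groupby("County/Parish")["Reservoir"].count()

# These can replace x and y in the plotting code from the question.
x = wells_per_county.index
y = wells_per_county.values

print(wells_per_county)
```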
Are you looking for something like this?
df_grouped=data.groupby(['County/Parish','Reservoir'])['Reservoir'].count()
Output:
County/Parish  Reservoir
CROSBY (TX)    CLEAR FORK         1837
               CLEARFORK             2
GARZA (TX)     CLEAR FORK           22
               CLEARFORK            32
HALE (TX)      CLEAR FORK            2
               CLEARFORK           441
HOCKLEY (TX)   CLEAR FORK          485
               CLEARFORK           218
               CLEARFORK, LO         1
               L. CLEARFORK          1
               LOWER CLEARFORK      26
               UPPER CLEARFORK      13
LAMB (TX)      CLEAR FORK            3
               CLEARFORK           108
               L. CLEARFORK          1
               LOWER CLEARFORK      12
LUBBOCK (TX)   CLEAR FORK          726
               CLEARFORK           300
               CLEARFORK, LO        60
               CLEARFORK, LO.        4
               L. CLEARFORK          2
               LOWER CLEARFORK       1
               UPPER CLEARFORK       9
LYNN (TX)      CLEARFORK             1
TERRY (TX)     CLEAR FORK            3
               CLEARFORK             1
               CLEARFORK, LO         2
               CLEARFORK, LO.        2
               LOWER CLEARFORK       1
Name: Reservoir, dtype: int64
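The two-level result can also be pivoted into a county-by-reservoir table, which may be easier to read or plot. The sketch below uses made-up rows and swaps in `size()` (rows per group) for the `count()` call used in the answer; treat it as one possible layout rather than the required one.

```python
import pandas as pd

# Made-up sample rows using the column names from the question.
data = pd.DataFrame({
    "County/Parish": ["CROSBY (TX)", "CROSBY (TX)", "CROSBY (TX)", "HALE (TX)"],
    "Reservoir": ["CLEAR FORK", "CLEAR FORK", "CLEARFORK", "CLEARFORK"],
})

# Rows per (county, reservoir) pair, as a Series with a two-level index.
counts = data.groupby(["County/Parish", "Reservoir"]).size()

# Move the reservoir level into columns; missing pairs become 0.
table = counts.unstack(fill_value=0)

print(table)
```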
The code below gets the count for one specific group:
df_grouped=data.groupby(['County/Parish','Reservoir'])
CROSBY_TX_CLEAR_FORK_count= df_grouped.get_group(('CROSBY (TX)', 'CLEAR FORK'))['Reservoir'].count()
CROSBY_TX_CLEAR_FORK_count
You can change the arguments passed to get_group to get the count for whichever group you want.
The following plots a bar graph of the 'CLEAR FORK' reservoir count for every County/Parish value.
CLEAR_FORK_Count = {}
for cat in data['County/Parish'].unique():
    try:
        count = df_grouped.get_group((cat, 'CLEAR FORK'))['Reservoir'].count()
    except KeyError:
        # get_group raises KeyError when the county has no 'CLEAR FORK' rows
        count = 0
    CLEAR_FORK_Count[cat] = count
plt.bar(list(CLEAR_FORK_Count.keys()), list(CLEAR_FORK_Count.values()))
plt.xticks(rotation=30)
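The same per-county counts can be built without a loop or exception handling by filtering first and then reindexing over all counties. This is a sketch on made-up rows, not the answerer's method; `clear_fork_count` is a name introduced here for illustration.

```python
import pandas as pd

# Made-up sample rows using the column names from the question.
data = pd.DataFrame({
    "County/Parish": ["CROSBY (TX)", "CROSBY (TX)", "HALE (TX)", "LYNN (TX)"],
    "Reservoir": ["CLEAR FORK", "CLEAR FORK", "CLEAR FORK", "CLEARFORK"],
})

# Every county that appears in the table, in order of first appearance.
counties = data["County/Parish"].unique()

# Keep only the rows whose reservoir label is exactly 'CLEAR FORK'.
clear_fork = data[data["Reservoir"] == "CLEAR FORK"]

# Count rows per county, then add back counties with no match as 0.
clear_fork_count = (
    clear_fork["County/Parish"].value_counts().reindex(counties, fill_value=0)
)

print(clear_fork_count)
```

The resulting Series can be passed to a bar plot in the same way as the dictionary above, using its index for the bar labels and its values for the heights.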
Solution:
def getUniqueReservoirs(x):
    # number of distinct reservoir labels in the group
    return x.nunique()

rs = data.groupby(['County/Parish', 'Reservoir']).agg({
    'Entity ID': 'count',
    'Reservoir': getUniqueReservoirs,
})
rs
Plotting the graph:
import matplotlib.pyplot as plt
rs.plot()
plt.xticks(rotation=90)
plt.show()