Python won't take a value from a .csv file

I have this assignment question and wrote the following code for it, but Python keeps telling me that "Reservoir" is not in the dataframe even though it is. How do I fix this? Here is a link to the .CSV file if needed: https://drive.google.com/file/d/1SZ639cUA3DdrlI_lG2Hq0vs6HiT8OAU3/view?usp=sharing

  1. Create and show a bar chart showing the number of wells by county:
  • Category: County
  • Y Axis: Total Clearfork Wells (named "Reservoir" in the file)

My code is below:

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('CF Around Lubbock Production Table.CSV')

By_County = df.groupby(['County/Parish']).sum().Reservoir

x = By_County.index
y = By_County.values

plt.figure(figsize=(10, 8))

plt.bar(x,y)


for i, j in zip(x,y):
    plt.text(i, j+10, int(j), ha = 'center')


plt.xlabel('County', fontsize = 20)
plt.ylabel('Total Clearfork Wells', fontsize = 20)

plt.xticks(fontsize = 12)
plt.yticks(fontsize = 15)

plt.show()

Column Reservoir appears to be of type object (its values are strings in your case). pandas won't sum a column of strings when you aggregate the whole dataframe, so the column is left out of the result.
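One way to check this diagnosis is to look at which columns pandas treats as numeric before aggregating. The sketch below uses a tiny hand-made frame in place of the real CSV: the rows are invented, and the `Value` column is a hypothetical numeric column added only for contrast. Only the names `County/Parish` and `Reservoir` come from the question.

```python
import pandas as pd

# Toy stand-in for the CSV. The rows and the "Value" column are made up.
df = pd.DataFrame({
    "County/Parish": ["CROSBY (TX)", "CROSBY (TX)", "GARZA (TX)"],
    "Reservoir": ["CLEAR FORK", "CLEARFORK", "CLEAR FORK"],
    "Value": [1.0, 2.0, 3.0],
})

# True only for integer/float/boolean-like columns; a text column gives False.
reservoir_is_numeric = pd.api.types.is_numeric_dtype(df["Reservoir"])

# List the columns that a numeric aggregation can meaningfully handle.
numeric_cols = df.select_dtypes(include="number").columns.tolist()

print(df.dtypes)
print(reservoir_is_numeric, numeric_cols)
```

If `Reservoir` does not show up among the numeric columns, summing it is not what the assignment needs, and counting rows is the more likely intent.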

What you can try:

By_County = df.groupby(['County/Parish'])['Reservoir'].sum()

It works on a Series. But do you really want concatenated strings?

County/Parish
CROSBY (TX)     CLEAR FORKCLEAR FORKCLEAR FORKCLEAR FORKCLEAR ...
GARZA (TX)      CLEARFORKCLEARFORKCLEARFORKCLEARFORKCLEARFORKC...
HALE (TX)       CLEARFORKCLEARFORKCLEARFORKCLEARFORKCLEARFORKC...
HOCKLEY (TX)    CLEARFORKCLEAR FORKCLEARFORKCLEAR FORKCLEAR FO...
LAMB (TX)       CLEARFORKCLEARFORKCLEARFORKCLEARFORKCLEARFORKC...
Name: Reservoir, dtype: object

Are you looking for something like this?

# `data` is the DataFrame read from the CSV (named `df` in the question)
df_grouped = data.groupby(['County/Parish', 'Reservoir'])['Reservoir'].count()

Output:

County/Parish  Reservoir      
CROSBY (TX)    CLEAR FORK         1837
               CLEARFORK             2
GARZA (TX)     CLEAR FORK           22
               CLEARFORK            32
HALE (TX)      CLEAR FORK            2
               CLEARFORK           441
HOCKLEY (TX)   CLEAR FORK          485
               CLEARFORK           218
               CLEARFORK, LO         1
               L. CLEARFORK          1
               LOWER CLEARFORK      26
               UPPER CLEARFORK      13
LAMB (TX)      CLEAR FORK            3
               CLEARFORK           108
               L. CLEARFORK          1
               LOWER CLEARFORK      12
LUBBOCK (TX)   CLEAR FORK          726
               CLEARFORK           300
               CLEARFORK, LO        60
               CLEARFORK, LO.        4
               L. CLEARFORK          2
               LOWER CLEARFORK       1
               UPPER CLEARFORK       9
LYNN (TX)      CLEARFORK             1
TERRY (TX)     CLEAR FORK            3
               CLEARFORK             1
               CLEARFORK, LO         2
               CLEARFORK, LO.        2
               LOWER CLEARFORK       1
Name: Reservoir, dtype: int64
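Since the assignment asks for total wells per county, the per-reservoir counts above can be collapsed into one number per county. The following is a minimal sketch on made-up rows (the real CSV is not loaded here). It assumes every row in the file is one Clearfork well, whatever spelling of the reservoir name it carries.

```python
import pandas as pd

# Toy stand-in for the CSV; the rows are invented for illustration.
data = pd.DataFrame({
    "County/Parish": ["HOCKLEY (TX)", "HOCKLEY (TX)", "HOCKLEY (TX)",
                      "LAMB (TX)", "LAMB (TX)"],
    "Reservoir": ["CLEAR FORK", "CLEARFORK", "LOWER CLEARFORK",
                  "CLEARFORK", "CLEARFORK"],
})

# Counts per (county, reservoir spelling), as in the answer above.
per_reservoir = data.groupby(["County/Parish", "Reservoir"])["Reservoir"].count()

# Sum over the Reservoir level to get a single total per county.
per_county = per_reservoir.groupby(level="County/Parish").sum()

# Shortcut with the same result: count the rows in each county directly.
direct = data.groupby("County/Parish").size()

print(per_county)
```

Either Series can then be handed to the question's plotting code, for example `plt.bar(per_county.index, per_county.values)`.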

The code below lets you get the count for a specific group:

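# Group by county and reservoir name (no aggregation yet)
df_grouped = data.groupby(['County/Parish', 'Reservoir'])
# Pick a single (county, reservoir) group and count its rows
CROSBY_TX_CLEAR_FORK_count = df_grouped.get_group(('CROSBY (TX)', 'CLEAR FORK'))['Reservoir'].count()

CROSBY_TX_CLEAR_FORK_count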

You can change the arguments passed to get_group to get the count for whichever group you want.
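As an alternative sketch, the sizes of all groups can be computed once and then looked up by key, which avoids an error when a combination does not exist. The data below is invented for illustration, and `lookup_count` is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

# Toy stand-in for the CSV; the rows are invented for illustration.
data = pd.DataFrame({
    "County/Parish": ["CROSBY (TX)", "CROSBY (TX)", "GARZA (TX)"],
    "Reservoir": ["CLEAR FORK", "CLEAR FORK", "CLEARFORK"],
})

# One row count per (county, reservoir) pair, indexed by a MultiIndex.
group_sizes = data.groupby(["County/Parish", "Reservoir"]).size()


def lookup_count(county, reservoir):
    """Return the group's row count, or 0 if the pair never occurs."""
    return int(group_sizes.get((county, reservoir), 0))


crosby_clear_fork = lookup_count("CROSBY (TX)", "CLEAR FORK")
garza_clear_fork = lookup_count("GARZA (TX)", "CLEAR FORK")
print(crosby_clear_fork, garza_clear_fork)
```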

This plots a bar graph of the 'CLEAR FORK' reservoir count for every County/Parish.

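# Uses df_grouped, the groupby object on ['County/Parish', 'Reservoir'] built above
CLEAR_FORK_Count = {}

for cat in data['County/Parish'].unique():
    try:
        count = df_grouped.get_group((cat, 'CLEAR FORK'))['Reservoir'].count()
    except KeyError:
        # This county has no rows labelled 'CLEAR FORK'
        count = 0

    CLEAR_FORK_Count[cat] = count

plt.bar(list(CLEAR_FORK_Count.keys()), list(CLEAR_FORK_Count.values()))
plt.xticks(rotation=30)
plt.show()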
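The same per-county counts can also be obtained without a loop. This is a sketch of one possible vectorized variant on made-up rows, not the answerer's method: filter to the 'CLEAR FORK' rows, count them per county, and fill in zero for counties that have none.

```python
import pandas as pd

# Toy stand-in for the CSV; the rows are invented for illustration.
data = pd.DataFrame({
    "County/Parish": ["CROSBY (TX)", "CROSBY (TX)", "GARZA (TX)", "LYNN (TX)"],
    "Reservoir": ["CLEAR FORK", "CLEAR FORK", "CLEAR FORK", "CLEARFORK"],
})

# Every county in the file, in order of first appearance.
counties = data["County/Parish"].unique()

# Keep only 'CLEAR FORK' rows, count them per county, and add
# counties with no such rows back in with a count of 0.
clear_fork_count = (
    data.loc[data["Reservoir"] == "CLEAR FORK", "County/Parish"]
    .value_counts()
    .reindex(counties, fill_value=0)
)

print(clear_fork_count)
```

The result is a Series indexed by county, so it can be plotted with `plt.bar(clear_fork_count.index, clear_fork_count.values)`.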

Solution:

def getUniqueReservoirs(x):
    return x.nunique()

rs=data.groupby(['County/Parish','Reservoir']).agg({'Entity ID':'count',
                                                    'Reservoir':getUniqueReservoirs
                                     })
rs

Plotting the graph:

import matplotlib.pyplot as plt

rs.plot()
plt.xticks(rotation=90)
plt.show()
