I have a file of data and want to select a specific state. From there I need to return the counts as a list, but some years in the range have no data for that state, so I need to fill in the missing values.
I am having an issue with my code; something is likely slightly off in my for loop:
def stateCountAsList(filepath,state):
    import pandas as pd
    pd.set_option('display.width',200)
    import numpy as np
    dataFrame = pd.read_csv(filepath,header=0,sep='\t')
    df = dataFrame.iloc[0:638,:]
    dfState = df[df['State'] == state]
    yearList = range(1999,2012)
    countsList = []
    for dfState['Year'] in yearList:
        countsList = dfState['Count']
    else:
        countsList.append(np.nan)
    return countsList
    print countsList.tolist()
stateCountAsList(filepath, state)
state = 'California'
Traceback:
C:\Users\Michael\workspace\UCIIntrotoPythonDA\src\Michael_Madani_week3.py:59: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
for dfState['Year'] in yearList:
Traceback (most recent call last):
File "C:\Users\Michael\workspace\UCIIntrotoPythonDA\src\Michael_Madani_week3.py", line 67, in <module>
stateCountAsList(filepath, state)
File "C:\Users\Michael\workspace\UCIIntrotoPythonDA\src\Michael_Madani_week3.py", line 62, in stateCountAsList
countsList.append(np.nan)
File "C:\Users\Michael\Anaconda\lib\site-packages\pandas\core\series.py", line 1466, in append
verify_integrity=verify_integrity)
File "C:\Users\Michael\Anaconda\lib\site-packages\pandas\tools\merge.py", line 754, in concat
copy=copy)
File "C:\Users\Michael\Anaconda\lib\site-packages\pandas\tools\merge.py", line 805, in __init__
raise TypeError("cannot concatenate a non-NDFrame object")
TypeError: cannot concatenate a non-NDFrame object
You have at least two different issues in your code:
The warning
A value is trying to be set on a copy of a slice from a DataFrame.
is triggered by for dfState['Year'] in yearList (line 59 in your code). You intend to loop over a range of years (range(1999,2012), i.e. 1999 through 2011), but a for statement assigns each value to its loop target, so this line implicitly assigns every year to the dfState['Year'] column. pandas warns because dfState was derived from another DataFrame (df = dataFrame.iloc[0:638,:] followed by a boolean filter), so it cannot tell whether the assignment is meant to reach the original data; see "returning a view versus a copy" ( http://pandas.pydata.org/pandas-docs/stable/indexing.html#returning-a-view-versus-a-copy ).
But as mentioned earlier, you don't want to assign a value to the DataFrame here, only loop over years. So the for-loop should look like:
for year in range(1999,2012):
    ...
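To see why the original loop header assigns values instead of just iterating, here is a minimal sketch using a plain dict in place of the DataFrame (the names record and seen are made up for illustration). Any assignable expression can serve as a for-loop target, and Python assigns to it on every pass:

```python
# A dict entry stands in for dfState['Year']; names here are illustrative only.
record = {"Year": None}
seen = []

# The loop target is record["Year"], so each value from the range is
# assigned to that entry, just as dfState['Year'] was overwritten.
for record["Year"] in range(1999, 2002):
    seen.append(record["Year"])

print(seen)            # [1999, 2000, 2001]
print(record["Year"])  # 2001, the last value the loop assigned
```

Using a fresh name such as year as the loop target avoids touching the data at all.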
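The second issue is in line 62. Here, you try to append np.nan to your "list" countsList, but countsList is no longer a list: it is a pandas Series. Two lines earlier you assign a pd.Series to it ( countsList = dfState['Count'] ), which changes its type. Note also that the else branch belongs to the for loop, so it runs once after the loop finishes rather than once per missing year. Series.append expects another Series rather than a scalar such as np.nan, which gives you the TypeError: cannot concatenate a non-NDFrame object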
With this information you should be able to correct your loop.
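For reference, one possible shape of the corrected loop is sketched below. It is not the only fix: it assumes the columns are named State, Year and Count as in the question, that there is at most one row per state and year, and it takes an already loaded DataFrame instead of a file path so it can run on made-up sample data. The function name state_counts_with_loop is hypothetical.

```python
import numpy as np
import pandas as pd


def state_counts_with_loop(df, state, years=range(1999, 2012)):
    """Return one count per year for `state`, with NaN where a year is absent."""
    df_state = df[df["State"] == state]
    counts = []  # stays a plain Python list for the whole loop
    for year in years:  # `year` is an ordinary loop variable, nothing is assigned to df_state
        matches = df_state.loc[df_state["Year"] == year, "Count"]
        if matches.empty:
            counts.append(np.nan)
        else:
            # Assumption: at most one row per state and year, so take the first match.
            counts.append(matches.iloc[0])
    return counts


# Made-up sample data, for illustration only.
sample = pd.DataFrame({
    "State": ["California", "California", "Texas"],
    "Year": [1999, 2001, 1999],
    "Count": [10, 30, 7],
})
result = state_counts_with_loop(sample, "California", years=range(1999, 2003))
print(result)
```

With the sample above, the years 2000 and 2002 have no California row, so those positions hold NaN.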
As an alternative, you can select the rows with the pandas query method ( http://pandas.pydata.org/pandas-docs/stable/indexing.html#the-query-method-experimental ):
def stateCountAsList(filepath,state):
    import pandas as pd
    dataFrame = pd.read_csv(filepath,header=0,sep='\t')
    df = dataFrame.iloc[0:638,:]
    # @state refers to the local Python variable; adjust the year bounds as needed
    stateList = df.query("(State == @state) & (Year > 1999) & (Year < 2005)")['Count'].tolist()
    return stateList
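Note that a query only returns the years that are actually present in the data, so missing years are skipped rather than filled. If you want NaN placeholders without writing a loop, one option is to index the counts by year and call reindex. The sketch below assumes each year appears at most once per state (reindex raises an error on duplicate index labels); the function name and sample data are made up for illustration.

```python
import pandas as pd


def state_counts_reindexed(df, state, years=range(1999, 2012)):
    """Return one count per year for `state`; years absent from the data become NaN."""
    df_state = df[df["State"] == state]
    # Index the counts by year; assumes no duplicate years for this state.
    by_year = df_state.set_index("Year")["Count"]
    # reindex inserts NaN for every requested year that has no row.
    return by_year.reindex(list(years)).tolist()


# Made-up sample data, for illustration only.
sample = pd.DataFrame({
    "State": ["California", "California", "Texas"],
    "Year": [1999, 2001, 1999],
    "Count": [10, 30, 7],
})
filled = state_counts_reindexed(sample, "California", years=range(1999, 2003))
print(filled)
```

Because NaN is a float, integer counts come back as floats whenever at least one year is missing.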