I have the following function:
    def convert_housing_data_to_quarters():
        import pandas as pd
        housing = pd.read_csv('City_Zhvi_AllHomes.csv')
        housing = housing.drop(housing.columns[6:51],axis=1)
        times = housing[housing.columns[6:len(housing.columns)]]
        def quarters(col):
            if col.endswith(("01","02","03")):
                s = col[:4] + "q1"
            elif col.endswith(("04", "05", "06")):
                s = col[:4] + "q2"
            elif col.endswith(("07", "08", "09")):
                s = col[:4] + "q3"
            else:
                s = col[:4] + "q4"
            return s
        times = times.groupby(quarters,axis=1).mean()
        df = pd.concat([times, housing[['State','RegionName']]], axis=1)
        arrays = [housing['State'],housing['RegionName']]
        index = pd.MultiIndex.from_arrays(arrays)
        index = index.sortlevel(level=0)
        df = df.reindex(index,level=0)
        return df
    convert_housing_data_to_quarters()
However, I keep getting the error message:
'Index' object has no attribute 'levels'
I am trying to create a hierarchical multi-index with 'State' at the top of the index (level=0) followed by 'RegionName' (level=1).
Would anybody be able to give me a helping hand as to where I am going wrong?
You cannot use reindex with level=0 here, because df does not have a MultiIndex at that point; its index is still a plain Index. Instead, call DataFrame.set_index with both columns before aggregating, which also simplifies the solution:
    def convert_housing_data_to_quarters():
        import pandas as pd
        housing = pd.read_csv('City_Zhvi_AllHomes.csv')
        housing = housing.drop(housing.columns[6:51], axis=1)
        # Monthly columns start at position 6; State and RegionName sit before them
        month_cols = housing.columns[6:]
        # Build the MultiIndex first, then keep only the monthly columns
        times = housing.set_index(['State', 'RegionName'])[month_cols].sort_index()

        def quarters(col):
            # Map a 'YYYY-MM' column label to 'YYYYqN'
            if col.endswith(("01", "02", "03")):
                s = col[:4] + "q1"
            elif col.endswith(("04", "05", "06")):
                s = col[:4] + "q2"
            elif col.endswith(("07", "08", "09")):
                s = col[:4] + "q3"
            else:
                s = col[:4] + "q4"
            return s

        return times.groupby(quarters, axis=1).mean()

    convert_housing_data_to_quarters()
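To see that set_index is the step that actually creates the MultiIndex, here is a minimal sketch on made-up data. The toy frame, its values, and the helper name to_quarter are assumptions for illustration, not the Zillow file. It groups the transposed frame instead of passing axis=1, since recent pandas releases deprecate groupby(..., axis=1).

```python
import pandas as pd

# Toy stand-in for the housing data; all values are invented for illustration.
toy = pd.DataFrame({
    "RegionName": ["Austin", "Albany", "Dallas"],
    "State": ["TX", "NY", "TX"],
    "2000-01": [100.0, 50.0, 80.0],
    "2000-02": [110.0, 52.0, 82.0],
    "2000-03": [120.0, 54.0, 84.0],
    "2000-04": [130.0, 56.0, 86.0],
})


def to_quarter(col):
    """Map a 'YYYY-MM' column label to a 'YYYYqN' label (hypothetical helper)."""
    month = int(col[-2:])
    return f"{col[:4]}q{(month - 1) // 3 + 1}"


# set_index builds the (State, RegionName) MultiIndex; sort_index orders it.
indexed = toy.set_index(["State", "RegionName"]).sort_index()

# Group the month columns by quarter. Transposing turns columns into rows,
# so an ordinary row-wise groupby works without the axis=1 argument.
quarterly = indexed.T.groupby(to_quarter).mean().T

print(quarterly)
```

The result has 'State' as level 0 and 'RegionName' as level 1 of the row index, with one column per quarter.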
It is also possible to convert the column names to datetimes and then to quarterly periods, instead of using your custom function:
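    import pandas as pd

    def convert_housing_data_to_quarters():
        housing = pd.read_csv('City_Zhvi_AllHomes.csv')
        housing = housing.drop(housing.columns[6:51], axis=1)
        # Monthly columns start at position 6; State and RegionName sit before them
        month_cols = housing.columns[6:]
        times = housing.set_index(['State', 'RegionName'])[month_cols].sort_index()
        # Parse 'YYYY-MM' labels and collapse them to calendar quarters.
        # The resulting labels are Period objects such as 2000Q1, not strings like '2000q1'.
        quarters = pd.to_datetime(times.columns, format='%Y-%m').to_period('Q')
        return times.groupby(quarters, axis=1).mean()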
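As a quick check of what the period conversion produces, this sketch applies the same two calls to a few hand-written 'YYYY-MM' labels. The labels and variable names are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical month labels in the same 'YYYY-MM' style as the CSV columns.
labels = pd.Index(["2000-01", "2000-03", "2000-04", "2000-12"])

# Parse the labels as dates, then collapse each date to its calendar quarter.
quarters = pd.to_datetime(labels, format="%Y-%m").to_period("Q")

# String form of each Period, with an uppercase Q.
quarter_names = [str(q) for q in quarters]
print(quarter_names)

# If lowercase labels in the 'YYYYqN' style are wanted, build them from the periods.
lower_names = [f"{q.year}q{q.quarter}" for q in quarters]
print(lower_names)
```

The second list matches the naming used by the custom quarters function, so either form can serve as the grouping key.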