
Python stats and visualization

I am new to Python and am currently working on a set of real estate data from Redfin.

Currently my data looks like this: [screenshot of the dataset]

There are many different neighborhoods in the dataset. I would like to:

  1. get the average homes_sold per month (the date field was cut out of the screenshot) per neighborhood
  2. graph the above using only the neighborhoods I wish to use (about 4).

Any help is greatly appreciated.

As I understand it, you have several homes-sold-per-month values for each neighborhood and you want their average. If so, try this code (substitute your own data):

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
# Jupyter/IPython only: remove the next line when running as a plain script
%matplotlib inline

# Sample data; replace with your own DataFrame
data = pd.DataFrame({'neighborhood':['n1','n1','n2','n3','n3','n4','n5'],'homes_sold per month':[5,7,2,6,4,1,5],'something_else':[5,3,3,5,5,5,5]})
neighborhoods_to_plot = ['n1','n2','n4','n5'] # list the neighborhoods you want to plot
# Build a small frame holding the mean for each selected neighborhood
plot = pd.DataFrame()
for n in neighborhoods_to_plot:
    plot.at[n,'homes_sold per month'] = data.loc[data['neighborhood']==n]['homes_sold per month'].mean()
plot.index.name = 'neighborhood'
# Draw one bar per neighborhood and save the figure to disk
plt.figure(figsize=(4,3),dpi=300,tight_layout=True)
sns.barplot(x=plot.index,y=plot['homes_sold per month'],data=plot)
plt.savefig('graph.png', bbox_inches='tight')

[Image: the resulting bar plot]
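
For comparison, the loop above could probably be replaced by a filter plus `groupby`. This is only a minimal sketch on the same toy data, not a tested drop-in for the real Redfin export; the variable names `selected` and `means` are made up for illustration.

```python
import pandas as pd

# Toy data copied from the answer above; replace with your own DataFrame.
data = pd.DataFrame({
    'neighborhood': ['n1', 'n1', 'n2', 'n3', 'n3', 'n4', 'n5'],
    'homes_sold per month': [5, 7, 2, 6, 4, 1, 5],
})
neighborhoods_to_plot = ['n1', 'n2', 'n4', 'n5']

# Keep only the rows for the wanted neighborhoods.
selected = data[data['neighborhood'].isin(neighborhoods_to_plot)]

# Average the column within each neighborhood; the result is a Series
# indexed by neighborhood name.
means = selected.groupby('neighborhood')['homes_sold per month'].mean()
print(means)
```

The resulting Series can be handed to the same plotting call, for example `sns.barplot(x=means.index, y=means.values)`.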

Okay, so I am going to assume that you are using Pandas and Matplotlib to handle this data. To get the average number of homes sold per month for each neighborhood, you just need to do:

import pandas as pd
# groupby is a method, so it is called with parentheses, not square brackets
mean_number_of_homes_sold = data[['neighborhood','homes_sold']].groupby('neighborhood').agg('mean')
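
The one-liner above averages over every row of a neighborhood. If the data has more than one row per month, a separate average for each month may be wanted instead. Here is a rough sketch of that, assuming the date field that was cut out of the screenshot is a column called `date` (a hypothetical name, as are the sample values):

```python
import pandas as pd

# Hypothetical sample rows; 'date' is an assumed column name.
data = pd.DataFrame({
    'neighborhood': ['Albany Park', 'Albany Park', 'Albany Park',
                     'Tinley Park', 'Tinley Park'],
    'date': ['2020-01-05', '2020-01-20', '2020-02-10',
             '2020-01-15', '2020-02-01'],
    'homes_sold': [10, 14, 8, 3, 5],
})

# Parse the dates and reduce each one to its calendar month.
data['date'] = pd.to_datetime(data['date'])
data['month'] = data['date'].dt.to_period('M')

# Average homes_sold within each (neighborhood, month) pair.
monthly_mean = (
    data.groupby(['neighborhood', 'month'])['homes_sold']
        .mean()
        .reset_index()
)
print(monthly_mean)
```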

To plot only the neighborhoods you want, you will need something like this:

import pandas as pd
import matplotlib.pyplot as plt
# fill this list with the names of the neighborhoods you want plotted
neighborhoods_to_plot = ['Albany Park', 'Tinley Park']
# keep only the rows that belong to those neighborhoods
data_to_graph = data[data.neighborhood.isin(neighborhoods_to_plot)]
fig, ax = plt.subplots()
# pass ax=ax so the scatter is drawn on the figure that gets titled and saved
data_to_graph.plot(kind='scatter', x='avg_sale_to_list', y='inventory_mom', ax=ax)
ax.set(title='Relationship between time to sale from listing and inventory momentum for selected neighborhoods')
fig.savefig('neighborhood.png', transparent=False, dpi=300, bbox_inches="tight")

You can obviously change which data is graphed or the type of graph, but this should give you a decent starting point.
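
As one example of changing the graph type, a line chart of average homes_sold per month, with one line per selected neighborhood, might look roughly like this. It is a sketch under assumptions: the `date` column name, the sample values, and the output file name are all hypothetical.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so this also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample rows; 'date' is an assumed column name.
data = pd.DataFrame({
    'neighborhood': ['Albany Park', 'Albany Park', 'Tinley Park',
                     'Tinley Park', 'Other', 'Other'],
    'date': pd.to_datetime(['2020-01-31', '2020-02-29', '2020-01-31',
                            '2020-02-29', '2020-01-31', '2020-02-29']),
    'homes_sold': [12, 8, 3, 5, 20, 22],
})
neighborhoods_to_plot = ['Albany Park', 'Tinley Park']

# Keep only the wanted neighborhoods, then reshape so that each
# neighborhood becomes a column and each date becomes a row.
data_to_graph = data[data['neighborhood'].isin(neighborhoods_to_plot)]
monthly = data_to_graph.pivot_table(index='date', columns='neighborhood',
                                    values='homes_sold', aggfunc='mean')

# One line per neighborhood on a shared axis.
fig, ax = plt.subplots()
monthly.plot(ax=ax, marker='o')
ax.set(title='Average homes sold per month',
       xlabel='Month', ylabel='Average homes sold')
fig.savefig('homes_sold_by_month.png', dpi=300, bbox_inches='tight')
```

Reshaping with `pivot_table` first means pandas draws and labels one line per column, so no explicit loop over neighborhoods is needed.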

