Creating a calculator using features in a pandas data frame

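I want to create a calculator that computes the average price of an Airbnb room when the neighbourhood and the number of beds, bathrooms, and bedrooms are given as input. Neighbourhood, beds, bedrooms, bathrooms, and price are all features already present in the dataset. Please help.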

It would help if you provided more details and asked specific questions.

Calculating the average price in pandas can be approached in the following way:

import pandas as pd

df = pd.read_csv('path_to_file.csv')  # assuming the file has all the relevant fields

def calculate_price(row):
    return row['price_per_room'] * row['number_of_rooms'] * row['number_of_nights']

# axis=1 passes each row (rather than each column) to calculate_price
df['price'] = df.apply(calculate_price, axis=1)

average_price = df['price'].mean()

print(f"The average price is {average_price}")

# use groupby to aggregate across categories
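The aggregation hinted at in the comment above might look like the following minimal sketch. The sample rows are made up for illustration, and the column names `price_per_room`, `number_of_rooms`, and `number_of_nights` are carried over from the snippet above on the assumption that the real file has them. The price is computed here with plain column arithmetic, which should give the same values as the row-wise `apply`.

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV file
df = pd.DataFrame({
    'neighbourhood': ['nice', 'nice', 'awesome'],
    'price_per_room': [50.0, 70.0, 120.0],
    'number_of_rooms': [2, 1, 1],
    'number_of_nights': [3, 2, 2],
})

# Column-wise equivalent of df.apply(calculate_price, axis=1)
df['price'] = df['price_per_room'] * df['number_of_rooms'] * df['number_of_nights']

# Average price within each neighbourhood
avg_by_neighbourhood = df.groupby('neighbourhood')['price'].mean()
print(avg_by_neighbourhood)
```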

Hope this helps!

I'm not sure this is exactly what you need (you should specify your question more clearly and add sample data, the preferred output, and your code), but groupby could be useful. Something like this:

import pandas as pd

df = pd.DataFrame({
    'neighbourhood' : ['nice', 'not so nice', 'nice', 'awesome', 'not so nice'],
    'room_type' : ['a', 'a', 'b', 'b', 'a'],
    'beds': [7,2,1,6,6],
    'bedrooms': [3,1,1,3,2],
    'bathrooms': [2,1,1,1,1],
    'price': [220,100,125,320,125]
})

print('Mean of all prices:\n', df['price'].mean())
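# select the price column before calling mean() so the non-numeric room_type column is not averaged
print('\nMean grouped by neighbourhood:\n', df.groupby('neighbourhood')['price'].mean())
print('\nMean grouped by more cols:\n', df.groupby(['neighbourhood', 'beds', 'bedrooms'])['price'].mean())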

Output:

Mean of all prices:
 178.0

Mean grouped by neighbourhood:
 neighbourhood
awesome        320.0
nice           172.5
not so nice    112.5

Mean grouped by more cols:
 neighbourhood  beds  bedrooms
awesome         6     3           320.0
nice            1     1           125.0
                7     3           220.0
not so nice     2     1           100.0
                6     2           125.0
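Several of the groups above contain a single listing, so their "average" is just that one price. A small sketch, using the same sample data, of how the number of listings behind each mean could be shown alongside it:

```python
import pandas as pd

df = pd.DataFrame({
    'neighbourhood': ['nice', 'not so nice', 'nice', 'awesome', 'not so nice'],
    'room_type': ['a', 'a', 'b', 'b', 'a'],
    'beds': [7, 2, 1, 6, 6],
    'bedrooms': [3, 1, 1, 3, 2],
    'bathrooms': [2, 1, 1, 1, 1],
    'price': [220, 100, 125, 320, 125],
})

# One row per neighbourhood: the mean price and how many listings it is based on
summary = df.groupby('neighbourhood')['price'].agg(['mean', 'count'])
print(summary)
```

A low count is a hint that the mean for that group should not be trusted too much.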

You can also filter the DataFrame before you apply groupby, for example like this:

# select the requested data with loc[...] and then apply groupby
df_filtered = df.loc[(df['neighbourhood']=='nice') & (df['beds']==1)]
df_filtered.groupby('neighbourhood')['price'].mean()
# or the same on one line:
df.loc[(df['neighbourhood']=='nice') & (df['beds']==1)].groupby('neighbourhood')['price'].mean()
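Once the filter already pins down a single neighbourhood, the groupby step is optional. As a sketch on the same sample data, taking the mean of the filtered `price` column directly returns a plain number instead of a one-row Series:

```python
import pandas as pd

df = pd.DataFrame({
    'neighbourhood': ['nice', 'not so nice', 'nice', 'awesome', 'not so nice'],
    'room_type': ['a', 'a', 'b', 'b', 'a'],
    'beds': [7, 2, 1, 6, 6],
    'bedrooms': [3, 1, 1, 3, 2],
    'bathrooms': [2, 1, 1, 1, 1],
    'price': [220, 100, 125, 320, 125],
})

# Boolean mask: True for rows that satisfy both conditions
mask = (df['neighbourhood'] == 'nice') & (df['beds'] == 1)

# Mean of the price column over the matching rows only
mean_price = df.loc[mask, 'price'].mean()
print(mean_price)  # 125.0
```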

And your function (from your last comment) may look like this:

def calculate_price(air_df):
    a = str(input("Enter the Neighbourhood : "))
    b = str(input("Enter the Room Type : "))
    c = float(input("Enter number of Beds : "))
    d = float(input("Enter number of Bedrooms : "))
    e = float(input("Enter number of Bathrooms : "))
    return air_df.loc[
        (air_df['neighbourhood']==a) & 
        (air_df['room_type']==b) &
        (air_df['beds']==c) &
        (air_df['bedrooms']==d) &
        (air_df['bathrooms']==e)
    ].groupby('neighbourhood')['price'].mean()
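If it is useful to keep the `input()` prompts separate from the calculation, one possible variant takes the criteria as keyword arguments. This is only a sketch: the name `average_price` and the choice to return `None` when nothing matches are assumptions for illustration, not part of the original function.

```python
import pandas as pd

def average_price(air_df, **criteria):
    """Mean price of the rows matching every column=value pair, or None if no row matches."""
    mask = pd.Series(True, index=air_df.index)
    for column, value in criteria.items():
        mask = mask & (air_df[column] == value)
    matches = air_df.loc[mask, 'price']
    return None if matches.empty else float(matches.mean())

# Same sample data as in the groupby example above
air_df = pd.DataFrame({
    'neighbourhood': ['nice', 'not so nice', 'nice', 'awesome', 'not so nice'],
    'room_type': ['a', 'a', 'b', 'b', 'a'],
    'beds': [7, 2, 1, 6, 6],
    'bedrooms': [3, 1, 1, 3, 2],
    'bathrooms': [2, 1, 1, 1, 1],
    'price': [220, 100, 125, 320, 125],
})

print(average_price(air_df, neighbourhood='not so nice', room_type='a'))  # 112.5
print(average_price(air_df, neighbourhood='nice', beds=10))               # None
```

Because any subset of columns can be passed, the same function covers the case where the user only knows, say, the neighbourhood and the number of beds.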
