Python Pandas multiple conditions

Question

Apologies, I have just started learning Python and am trying to get something working.

OK, the dataset is

Buy, typeid, volume, issued, duration, Volume Entered,Minimum Volume, range, price, locationid, locationname

SELL    20  2076541 2015-09-12T06:31:13 90  2076541 1   region  331.21  60008494    Amarr

SELL    20  194642  2015-09-07T19:36:49 90  194642  1   region  300 60008494    Amarr

SELL    20  2320    2015-09-13T07:48:54 3   2320    1   region  211 60008491    Irnin

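I want to filter for a specific location, by either name or ID (either is fine with me), and then select the lowest price for that location. Preferably the location would be hard-coded, since I am only interested in a few locations, e.g. locationid = 60008494.

I have seen that you can apply two conditions on one line, but I can't see how to apply that here, so I am trying to nest it. It doesn't have to be Pandas; that just seemed to be the first thing I found that does part of what I need.

The code I have so far only does the smallest part of what I want to achieve.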

import pandas as pd

data = pd.read_csv('orders.csv')
rows = []  # cheapest order(s) found for each typeid
for type_id in data['typeid'].unique():
    name_filter = data[data['typeid'] == type_id]
    # keep the row(s) whose price equals the minimum for this typeid
    price_min_filter = name_filter[name_filter['price'] == name_filter['price'].min()]
    rows.append(price_min_filter)
res = pd.concat(rows, ignore_index=True)
res.to_csv('format.csv')  # writes output to csv
print("Complete")

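Update. OK, the code below seems to be the direction I should be going in. If I could have s = typeid, locationid and price, that would be perfect. So I have written out what I want to do; how do I get the correct syntax in Python? Sorry, I am used to Excel and SQL.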

import pandas as pd

df = pd.read_csv('orders.csv')
df[df['locationid'] == 60008494]
s = df.groupby(['typeid'])['price'].min()
s.to_csv('format.csv')

Answer 1

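If what you really want is:

I want to filter for a specific location, by either name or ID (either is fine with me), and then select the lowest price for that location. Preferably the location would be hard-coded, since I am only interested in a few locations, e.g. locationid = 60008494.

You can simply filter df on locationid first and then use ['price'].min(). Example: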

In [1]: import pandas as pd

In [2]: s = """Buy,typeid,volume,issued,duration,Volume Entered,Minimum Volume,range,price,locationid,locationname
   ...: SELL,20,2076541,2015-09-12T06:31:13,90,2076541,1,region,331.21,60008494,Amarr
   ...: SELL,20,194642,2015-09-07T19:36:49,90,194642,1,region,300,60008494,Amarr
   ...: SELL,20,2320,2015-09-13T07:48:54,3,2320,1,region,211,60008491,Irnin"""

In [3]: import io

In [4]: df = pd.read_csv(io.StringIO(s))

In [5]: df
Out[5]:
    Buy  typeid   volume               issued  duration  Volume Entered  \
0  SELL      20  2076541  2015-09-12T06:31:13        90         2076541
1  SELL      20   194642  2015-09-07T19:36:49        90          194642
2  SELL      20     2320  2015-09-13T07:48:54         3            2320

   Minimum Volume   range   price  locationid locationname
0               1  region  331.21    60008494        Amarr
1               1  region  300.00    60008494        Amarr
2               1  region  211.00    60008491        Irnin

In [8]: df[df['locationid']==60008494]['price'].min()
Out[8]: 300.0
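
Since the question title asks about multiple conditions, here is a minimal sketch of combining two filters in one expression. It reuses the three sample rows from the question; the variable names (mask, wanted and so on) and the choice to also filter on typeid 20 are assumptions made for illustration, not part of the original answer.

```python
import io

import pandas as pd

# Sample rows copied from the question.
csv_text = """Buy,typeid,volume,issued,duration,Volume Entered,Minimum Volume,range,price,locationid,locationname
SELL,20,2076541,2015-09-12T06:31:13,90,2076541,1,region,331.21,60008494,Amarr
SELL,20,194642,2015-09-07T19:36:49,90,194642,1,region,300,60008494,Amarr
SELL,20,2320,2015-09-13T07:48:54,3,2320,1,region,211,60008491,Irnin"""
df = pd.read_csv(io.StringIO(csv_text))

# Two conditions on one line: wrap each comparison in parentheses
# and join them with & (and) or | (or).
mask = (df['locationid'] == 60008494) & (df['typeid'] == 20)
lowest_at_location = df.loc[mask, 'price'].min()

# A few hard-coded locations can be matched at once with isin().
wanted = [60008494, 60008491]  # illustrative list of location IDs
lowest_overall = df.loc[df['locationid'].isin(wanted), 'price'].min()

print(lowest_at_location)
print(lowest_overall)
```

The parentheses matter because & binds more tightly than == in Python, so leaving them out usually raises an error or gives an unintended result.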

If you want to do this for all locationids, then, as mentioned in the other answer, you can use DataFrame.groupby, select the ['price'] column from the resulting groups, and call .min() on it. Example:

data = pd.read_csv('orders.csv')
data.groupby(['locationid'])['price'].min()

Demo:

In [9]: df.groupby(['locationid'])['price'].min()
Out[9]:
locationid
60008491    211
60008494    300
Name: price, dtype: float64

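To get the complete rows that hold the minimum value in their respective groups, you can use idxmin() to get the index of each minimum and then pass that to df.loc to retrieve those rows. Example: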

In [9]: df.loc[df.groupby(['locationid'])['price'].idxmin()]
Out[9]:
    Buy  typeid  volume               issued  duration  Volume Entered  \
2  SELL      20    2320  2015-09-13T07:48:54         3            2320
1  SELL      20  194642  2015-09-07T19:36:49        90          194642

   Minimum Volume   range  price  locationid locationname
2               1  region    211    60008491        Irnin
1               1  region    300    60008494        Amarr
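
For the update in the question (typeid, locationid and price together for one location), one possible sketch is to filter to the hard-coded location first, apply the same idxmin() idea per typeid, and keep only the three columns. The sample rows come from the question; the intermediate names at_location and cheapest_idx are assumptions for illustration.

```python
import io

import pandas as pd

# Sample rows copied from the question.
csv_text = """Buy,typeid,volume,issued,duration,Volume Entered,Minimum Volume,range,price,locationid,locationname
SELL,20,2076541,2015-09-12T06:31:13,90,2076541,1,region,331.21,60008494,Amarr
SELL,20,194642,2015-09-07T19:36:49,90,194642,1,region,300,60008494,Amarr
SELL,20,2320,2015-09-13T07:48:54,3,2320,1,region,211,60008491,Irnin"""
df = pd.read_csv(io.StringIO(csv_text))

# Keep only the location of interest (hard-coded, as in the question).
at_location = df[df['locationid'] == 60008494]

# Index label of the cheapest order for each typeid at that location.
cheapest_idx = at_location.groupby('typeid')['price'].idxmin()

# Pull those rows and keep just the three requested columns.
s = at_location.loc[cheapest_idx, ['typeid', 'locationid', 'price']]
print(s)
```

Calling s.to_csv('format.csv', index=False) afterwards would write these three columns to a file, as the question intended.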

Answer 2

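If I understand your question correctly, you don't really need to do much more than a DataFrame.groupby(). For example, you can group the dataframe by locationname, select the price column from the resulting groupby object, and then use the min() method to output the minimum value for each group: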

data.groupby('locationname')['price'].min()

This will give you the lowest price for each group, so the result will look something like:

locationname
Amarr    300
Irnin    211
Name: price, dtype: float64
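
As a small follow-up sketch, the grouped result is a Series indexed by locationname, so a single hard-coded location can be looked up by label, and reset_index() turns the result back into an ordinary two-column table. The sample rows come from the question; the names lowest and table are assumptions for illustration.

```python
import io

import pandas as pd

# Sample rows copied from the question.
csv_text = """Buy,typeid,volume,issued,duration,Volume Entered,Minimum Volume,range,price,locationid,locationname
SELL,20,2076541,2015-09-12T06:31:13,90,2076541,1,region,331.21,60008494,Amarr
SELL,20,194642,2015-09-07T19:36:49,90,194642,1,region,300,60008494,Amarr
SELL,20,2320,2015-09-13T07:48:54,3,2320,1,region,211,60008491,Irnin"""
data = pd.read_csv(io.StringIO(csv_text))

# Lowest price per location name; the group labels become the index.
lowest = data.groupby('locationname')['price'].min()

# Look up one hard-coded location by its label.
amarr_price = lowest['Amarr']

# Turn the index back into a regular column, e.g. before writing a CSV.
table = lowest.reset_index()
print(table)
```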


Answer 1
1 2015-09-15 01:54:11

Answer 2
0 2015-09-15 01:48:01
