Pandas - Filtering text and data

Question

We have just learned how to do some filtering with pandas in Python, so I thought I would try it on a public dataset (http://data.wa.aemo.com.au/#stem-bids-and-offers).

I am using the August data.

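The challenge I set myself was to keep only rows where $/MWh > 0 and the row is a Bid. We have learned how to filter using np.logical_and, but the problem I found is that I can filter on either a numeric or a logical condition, not both.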

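I have a working approach that gets me the data and the visualisation I want, but I am sure there is a more efficient way to filter by both text and numeric fields. The problem with my approach is that it only works when the text values have different lengths. If the column held Bid and some other value of the same length, I would pick up both. I only want to pick up the Bids. Can anyone point me in the right direction?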

Here is my code:

#Task: I want to filter out ONLY positive $/MWh bids
#This requires 2 filters - 1 to filter out the $MWh > 0 and 1 to filter by Bids

# Try converting this to a numpy array and using the filtering mechanisms there
import numpy as np
import pandas as pd
import seaborn as sns
df = pd.read_csv('stem-bids-and-offers-2017-08.csv')
df.head(5)
#I don't know how to filter by 'text' just yet, so I will use another way: the len function
#This reduces the bid/offer field to its character count ('Bid' has 3 characters)

df['boLength'] = df['Bid or Offer'].apply(len)
df.head(5)
filtByPriceBid = np.logical_and(df['Price ($/MWh)'] > 0, df['boLength'] == 3)
filtByPriceBid.head(5)

df2 = df[filtByPriceBid]
df2.head(10)

sns.kdeplot(df2['Price ($/MWh)'], shade=True)

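PS: I have attached the resulting KDE plot. If anyone would also like to offer an interpretation of it, please feel free! I was expecting a normal distribution, but unfortunately that is not the case.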

Answer 1

I hope this is what you are looking for.

You can combine multiple filters with &:

sns.kdeplot(df[(df['Price ($/MWh)'] > 0) & (df['Bid or Offer']=='Bid')]['Price ($/MWh)'], shade=True)
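
To make the one-liner above easier to follow, here is a minimal sketch of the same idea split into steps. It uses a small made-up DataFrame as a stand-in for the AEMO CSV (only the two column names come from the question; the values are invented), and the variable names are just illustrative. It also shows that np.logical_and accepts a text comparison, since comparing a column to a string yields a boolean Series just like a numeric comparison does.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the AEMO CSV; the values are made up for illustration.
df = pd.DataFrame({
    "Bid or Offer": ["Bid", "Offer", "Bid", "Offer", "Bid"],
    "Price ($/MWh)": [35.0, 50.0, -10.0, 0.0, 12.5],
})

# Each comparison produces a boolean Series with one entry per row.
is_positive = df["Price ($/MWh)"] > 0
is_bid = df["Bid or Offer"] == "Bid"

# & combines the two masks element-wise; only rows where both are True remain.
positive_bids = df[is_positive & is_bid]

# np.logical_and takes the same two masks, so it works with the text column too.
same_rows = df[np.logical_and(is_positive, is_bid)]

print(positive_bids)
```

When the comparisons are written inline rather than stored in variables, each one needs its own parentheses, as in the answer's one-liner, because & binds more tightly than > and ==.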

Answer 1: 0, accepted, 2017-08-18 12:23:18
