Pandas dataframe.query method syntax

Question:

I would like to gain a better understanding of the Pandas DataFrame.query method and what the following expression represents:

match = dfDays.query('index > @x.name & price >= @x.target')

What does @x.name represent?

I understand what the resulting output is for this code (a new column with pandas.tslib.Timestamp data) but don't have a clear understanding of the expression used to get this end result.

Data:

From here:

Vectorised way to query date and price data

import numpy as np
import pandas as pd

np.random.seed(seed=1)
rng = pd.date_range('1/1/2000', '2000-07-31', freq='D')
weeks = np.random.uniform(low=1.03, high=3, size=(len(rng),))
ts2 = pd.Series(weeks, index=rng)
dfDays = pd.DataFrame({'price': ts2})
dfWeeks = dfDays.resample('1W-Mon').first()
dfWeeks['target'] = (dfWeeks['price'] + .5).round(2)

def find_match(x):
    match = dfDays.query('index > @x.name & price >= @x.target')
    if not match.empty:
        return match.index[0]

dfWeeks.assign(target_hit=dfWeeks.apply(find_match, 1))

@x.name - the @ prefix tells .query() that x is an external object: a variable from the surrounding Python scope, not a column of the DataFrame on which query() was called. In the question, x is a single row of dfWeeks, which apply passes in as a pd.Series, so x.name is that row's index label. The external object could be a DataFrame or a scalar value as well.

I hope this small demonstration will help you to understand it:

In [79]: d1
Out[79]:
   a  b  c
0  1  2  3
1  4  5  6
2  7  8  9

In [80]: d2
Out[80]:
   a   x
0  1  10
1  7  11

In [81]: d1.query("a in @d2.a")
Out[81]:
   a  b  c
0  1  2  3
2  7  8  9

In [82]: d1.query("c < @d2.a")
Out[82]:
   a  b  c
1  4  5  6

Scalar x:

In [83]: x = 9

In [84]: d1.query("c == @x")
Out[84]:
   a  b  c
2  7  8  9
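
The session above does not show how d1 and d2 were built. As a rough sketch, the frames below are reconstructed from the printed output (the constructor calls are my assumption, not part of the original answer). The sketch also shows what happens when the @ prefix is left out: query() then looks for a column or index level named x and, in the pandas versions I am aware of, raises a NameError subclass because d1 has none.

```python
import pandas as pd

# Frames rebuilt from the printed output above (assumed construction)
d1 = pd.DataFrame({"a": [1, 4, 7], "b": [2, 5, 8], "c": [3, 6, 9]})
d2 = pd.DataFrame({"a": [1, 7], "x": [10, 11]})
x = 9

# @d2.a reads column a of the external DataFrame d2
in_d2 = d1.query("a in @d2.a")

# @x reads the Python variable x from the calling scope
equals_x = d1.query("c == @x")

# Without @, query() treats x as a column/index name of d1, which does not exist
try:
    d1.query("c == x")
    missing_prefix_error = None
except NameError as err:  # pandas raises a NameError subclass here
    missing_prefix_error = err

print(in_d2)
print(equals_x)
print(repr(missing_prefix_error))
```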

Everything @MaxU said is perfect!

I wanted to add some context to the specific problem that this was applied to.

find_match

This is a helper function that is passed to dfWeeks.apply. Two things to note:

  1. find_match takes a single argument x. This will be a single row of dfWeeks.
    • Each row is a pd.Series object and each row will be passed through this function. This is the nature of using apply along axis 1.
    • When apply passes this row to the helper function, the row has a name attribute that is equal to the index value for that row in the dataframe. In this case, I know that the index value is a pd.Timestamp and I'll use it to do the comparing I need to do.
  2. find_match references dfDays, which is defined outside the scope of find_match itself.
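
To make the first point concrete, here is a minimal sketch with a tiny made-up frame (the data and variable names are hypothetical, chosen only for illustration). It checks that each row handed to the function by apply(..., axis=1) is a Series whose name is that row's index label.

```python
import pandas as pd

# Hypothetical two-row frame with a DatetimeIndex, standing in for dfWeeks
df = pd.DataFrame(
    {"price": [1.5, 2.0]},
    index=pd.to_datetime(["2000-01-03", "2000-01-10"]),
)

# With axis=1, apply calls the function once per row
names = df.apply(lambda row: row.name, axis=1)            # index label of each row
kinds = df.apply(lambda row: type(row).__name__, axis=1)  # type of each row object

print(names)
print(kinds)
```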

I didn't have to use query ... I like using query. It is my opinion that it makes some code prettier. The following function, as provided by the OP, could've been written differently:

def find_match(x):
    """Original"""
    match = dfDays.query('index > @x.name & price >= @x.target')
    if not match.empty:
        return match.index[0]

dfWeeks.assign(target_hit=dfWeeks.apply(find_match, 1))

find_match_alt

Or we could've done this, which may help to explain what the query string is doing above:

def find_match_alt(x):
    """Alternative to OP's"""
    date_is_afterwards = dfDays.index > x.name
    price_target_is_met = dfDays.price >= x.target
    both_are_true = price_target_is_met & date_is_afterwards
    if (both_are_true).any():
        return dfDays[both_are_true].index[0]

dfWeeks.assign(target_hit=dfWeeks.apply(find_match_alt, 1))

Comparing these two functions should give good perspective.
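
One way to do that comparison is to run both functions on the same data and check that they agree. The sketch below repeats the setup from the question in self-contained form; the uppercase 'W-MON' alias and the variable names hit_query, hit_alt and same are my own choices, not part of the original posts.

```python
import numpy as np
import pandas as pd

# Same data as in the question
np.random.seed(seed=1)
rng = pd.date_range("2000-01-01", "2000-07-31", freq="D")
prices = np.random.uniform(low=1.03, high=3, size=len(rng))
dfDays = pd.DataFrame({"price": prices}, index=rng)
dfWeeks = dfDays.resample("W-MON").first()
dfWeeks["target"] = (dfWeeks["price"] + 0.5).round(2)

def find_match(x):
    """query version: x is a row of dfWeeks, read via the @ prefix."""
    match = dfDays.query("index > @x.name & price >= @x.target")
    if not match.empty:
        return match.index[0]

def find_match_alt(x):
    """Boolean-mask version of the same two conditions."""
    both_are_true = (dfDays.index > x.name) & (dfDays.price >= x.target)
    if both_are_true.any():
        return dfDays[both_are_true].index[0]

hit_query = dfWeeks.apply(find_match, axis=1)
hit_alt = dfWeeks.apply(find_match_alt, axis=1)

# True when both approaches pick the same first matching day for every week
same = hit_query.equals(hit_alt)
print(same)
```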

