
How to search for a specific date within a concatenated time-series DataFrame when the same date repeats several times

I downloaded historical price data for the ^GSPC stock market index (S&P 500) and several other global indices. Date is set as the index.

Selecting rows by date with .loc works as expected when the date is set as the index.

# S&P500 DataFrame = spx_df
spx_df.loc['2010-01-04']

Open            1.116560e+03
High            1.133870e+03
Low             1.116560e+03
Close           1.132990e+03
Volume          3.991400e+09
Dividends       0.000000e+00
Stock Splits    0.000000e+00
Name: 2010-01-04 00:00:00-05:00, dtype: float64

I then concatenated several global stock market indices into a single DataFrame for further use. In effect, when historical data for five stock indices are stacked into one time series, every date in the range appears five times.

markets = pd.concat(ticker_list, axis = 0)
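
Because every date appears once per index after concatenation, it can help to record which market each row came from. A minimal sketch, assuming ticker_list holds one DataFrame per index; the frames dictionary, the toy prices, the OTHER_INDEX label, and the Ticker column are hypothetical and used only for illustration:

```python
import pandas as pd

# Hypothetical stand-ins for the downloaded histories (toy values, not real prices).
dates = pd.date_range("2010-01-04", periods=3, freq="D", name="Date")
frames = {
    "^GSPC": pd.DataFrame({"Close": [1.0, 2.0, 3.0]}, index=dates),
    "OTHER_INDEX": pd.DataFrame({"Close": [10.0, 20.0, 30.0]}, index=dates),
}

# Tag each frame with its ticker before stacking the rows.
ticker_list = [df.assign(Ticker=name) for name, df in frames.items()]
markets = pd.concat(ticker_list, axis=0)

# Each date now occurs once per ticker in the index.
print(markets.index.value_counts())
```

With the Ticker column in place, any selection by date still shows which index each matching row belongs to.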

I want to reference a single date in the concatenated df and set it as a variable. I would prefer that the variable not be a datetime object, because I would like to use it with .loc inside a function (def). How does concatenation affect accessing rows by date through the index when the same date repeats several times in the combined time series?

This is what I have attempted so far:

# markets = concatenated DataFrame 
Reference_date = markets.loc['2010-01-04'] 
# KeyError: '2010-01-04'

Reference_date = markets.loc[markets.Date == '2010-01-04']
# This doesn't work because Date is not an attribute of the DataFrame
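
One possible cause of the KeyError, offered as an assumption since the question does not show markets.index: the spx_df output above carries a -05:00 offset, and if the other indices use different time zones, the concatenated index may no longer be a single DatetimeIndex, in which case date-string lookups stop working. A sketch that inspects the index and drops the time zone from each frame before concatenating (the toy frames, the fixed offsets, and the variable names are hypothetical):

```python
from datetime import timedelta, timezone

import pandas as pd

# Hypothetical frames whose indexes carry different UTC offsets (toy values).
tz_a = timezone(timedelta(hours=-5))
tz_b = timezone(timedelta(hours=9))
spx = pd.DataFrame(
    {"Open": [1.0, 2.0]},
    index=pd.date_range("2010-01-04", periods=2, freq="D", tz=tz_a),
)
other = pd.DataFrame(
    {"Open": [10.0, 20.0]},
    index=pd.date_range("2010-01-04", periods=2, freq="D", tz=tz_b),
)

# Inspect what concat produced; with mixed offsets this may not be datetime64.
mixed = pd.concat([spx, other], axis=0)
print(type(mixed.index), mixed.index.dtype)

# Drop the time zone (keeping each market's local date) before concatenating.
naive = []
for df in (spx, other):
    df = df.copy()
    df.index = df.index.tz_localize(None)
    naive.append(df)
markets = pd.concat(naive, axis=0)

# The index is a plain DatetimeIndex again, so a date string works as a label.
reference_date = markets.loc["2010-01-04"]
print(reference_date)
```

Dropping the time zone keeps each market's local calendar date, whereas converting everything to UTC would shift some sessions onto the previous or next day.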

Since you have set the date as the index, you should be able to do: Reference_date = markets.loc[markets.index == '2010-01-04']

To access a specific date in the concatenated DataFrame, you can use boolean indexing instead of a .loc label lookup. This returns a DataFrame containing every row whose date equals the reference date:

reference_date = markets[markets.index == '2010-01-04']

You may also want to use the query() method to search for specific data:

reference_date = markets.query('index == "2010-01-04"')

Keep in mind that the resulting variable reference_date is still a DataFrame and contains all rows matching the reference date across all the concatenated DataFrames. If you want to extract only a specific column, select it by name like this:

reference_date_Open = markets.query('index == "2010-01-04"')["Open"]
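
Since the question mentions using the lookup inside a function, the selections above could be wrapped roughly as follows. This is a sketch assuming markets has a time-zone-naive DatetimeIndex with repeated dates; rows_on_date and the toy data are hypothetical:

```python
import pandas as pd


def rows_on_date(markets, date_str, column=None):
    """Return all rows (or one column of them) whose index equals date_str."""
    # Comparing the index to a date string yields a boolean mask, one entry per row.
    matches = markets.loc[markets.index == date_str]
    return matches if column is None else matches[column]


# Toy frame in which each date appears twice, as it would after concatenation.
dates = pd.to_datetime(["2010-01-04", "2010-01-05"] * 2)
markets = pd.DataFrame({"Open": [1.0, 2.0, 10.0, 20.0]}, index=dates)

reference_date = rows_on_date(markets, "2010-01-04")
reference_date_open = rows_on_date(markets, "2010-01-04", column="Open")
print(reference_date_open.tolist())
```

The date is passed as a plain string, so the caller never has to build a datetime object.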

