
How to make a dataframe from a list of tuples, where tuples are the values?

I have a time series dataset on which I am running the auto ARIMA model. The dataset has multiple columns that are independent of each other, so it is effectively several separate auto ARIMA analyses.

The code I currently have loops through all the columns in the dataframe and stores the p, d, q order values for each column in a list. What I want to achieve is to store the p, d, q values for each column in a dataframe, one row per column.

Time Series Dataframe

date                   Col1      Col2       Col3      Col4       Col5       Col6        Col7      Col8      Col9
2022-01-02 10:30:00     24         24        24.8      24.8       25         25         25.5      26.3      26.9   
2022-01-02 10:45:00     59         58         60       60.3       59.3       59.2       58.4      56.9      58.0   
2022-01-02 11:00:00     43.7       43.9       48        48        48.1       48.9       49        49.5      49.5 

Code

## Auto ARIMA
import pmdarima as pm

autoarima_results = []  # one (p, d, q) tuple per column, in column order
for col in df.columns:
    print(f"Auto Arima for: {col}")
    ARIMA_model = pm.auto_arima(
        df[col],
        start_p=1,
        start_q=1,
        test="adf",
        max_p=5,
        max_q=5,
        d=None,
        trace=True,
        error_action="ignore",
        suppress_warnings=True,
        stepwise=True,
    )
    ARIMA_model.summary()
    autoarima_results.append(ARIMA_model.order)

This produces a list that looks like: [(1,1,0), (2,1,1), (1,1,1)]

For example, the p, d, q orders suggested by auto ARIMA are Col1: 1,1,0, Col2: 2,1,1, Col3: 1,1,1, and so on.

The final output should be a dataframe like the one below, where every row represents one column and its p, d, q values:

Results         pdq_values
Col1            (1,1,0)      
Col2            (2,1,1)    
Col3            (1,1,1)

Let's say that:

  • df has three columns: "Col1", "Col2", "Col3"
  • and autoarima_results == [(1, 1, 0), (2, 1, 1), (1, 1, 1)]

Then, here is one way to do it:

import pandas as pd

new_df = (
    # Label each row with the column it was fitted on
    pd.DataFrame(autoarima_results, index=df.columns)
    .pipe(lambda df_: df_.assign(pdq_values=df_.apply(lambda x: tuple(x), axis=1)))[
        "pdq_values"
    ]
    .to_frame("pdq_values")
)

new_df.index.name = "Results"
print(new_df)
# Output
        pdq_values
Results
Col1     (1, 1, 0)
Col2     (2, 1, 1)
Col3     (1, 1, 1)
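
The same result can probably be reached more directly by building a Series of tuples and converting it to a one-column frame. This is a minimal sketch, not part of the answer above; `cols` is a hypothetical stand-in for `df.columns`, and the values are the example orders from the question.

```python
import pandas as pd

# Assumed inputs: example orders from the question and
# a hypothetical list standing in for df.columns
autoarima_results = [(1, 1, 0), (2, 1, 1), (1, 1, 1)]
cols = ["Col1", "Col2", "Col3"]

new_df = (
    pd.Series(autoarima_results, index=cols, name="pdq_values")  # one tuple per label
    .rename_axis("Results")  # name the index
    .to_frame()              # single-column dataframe
)
print(new_df)
```

A Series keeps each tuple as a single object, so there is no need to expand the tuples into columns and then pack them back together.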

Starting with the output of your ARIMA model, here is a simple way to do this without using lambda functions:

import pandas as pd

#Your ARIMA output
l = [(1,1,0), (2,1,1), (1,1,1)]

#Convert the list of tuples into entries from a single column
df = pd.DataFrame({'pdq_values':l})    #<----
#df = pd.DataFrame([l]).T              #Another way!

#Change the index by adding 1, setting column name and prepending 'Col'
df.index = 'Col' + (df.index.set_names('Results')+1).astype(str)

#Reset index to get the index as a column in the df
df = df.reset_index()

df
  Results pdq_values
0    Col1  (1, 1, 0)
1    Col2  (2, 1, 1)
2    Col3  (1, 1, 1)
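
Rebuilding the labels from the positional index only matches the original data when the columns really are named Col1, Col2, and so on. If the real column names are available, a sketch along these lines pairs them with the tuples directly; `col_names` is a hypothetical stand-in for `list(df.columns)` taken from the time series dataframe before `df` is reassigned.

```python
import pandas as pd

# Assumed inputs: example orders and a hypothetical list of the real column names
l = [(1, 1, 0), (2, 1, 1), (1, 1, 1)]
col_names = ["Col1", "Col2", "Col3"]

# Both lists must have the same length; each tuple stays intact in its cell
result = pd.DataFrame({"Results": col_names, "pdq_values": l})
print(result)
```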

Additional notes:

  1. The trick is to pass the list as a dictionary (key-value pair), where the key is the column name: {'pdq_values':l}. Pandas reads this as a single column in which each tuple is the entry for one row. Another way to force this behavior is to wrap the list of tuples in an outer list, [l]. This creates a dataframe with one row and n columns, which you then need to transpose.
  2. 'Col' + (df.index.set_names('Results')+1).astype(str) does 4 things at once: it changes the name of the index to Results, adds 1 to it, converts it to a string, and prepends Col. This turns 0, 1, 2, ... into Col1, Col2, Col3, ...
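
To make the two notes concrete, the sketch below compares the shapes produced by each way of passing the list and then walks through the index relabelling one step at a time. The variable names (`as_rows`, `as_column`, `as_wide`, `as_tall`, `idx`, `labels`) are illustrative only.

```python
import pandas as pd

l = [(1, 1, 0), (2, 1, 1), (1, 1, 1)]

as_rows = pd.DataFrame(l)                    # tuples unpacked: 3 rows x 3 columns
as_column = pd.DataFrame({"pdq_values": l})  # tuples kept whole: 3 rows x 1 column
as_wide = pd.DataFrame([l])                  # outer list: 1 row x 3 columns of tuples
as_tall = as_wide.T                          # transposed: 3 rows x 1 column of tuples

print(as_rows.shape, as_column.shape, as_wide.shape, as_tall.shape)

# Note 2, one step at a time
idx = as_column.index.set_names("Results")  # name the index
idx = idx + 1                               # 0, 1, 2 becomes 1, 2, 3
labels = "Col" + idx.astype(str)            # prepend the prefix to each label
print(list(labels))
```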
