
Cross-validation for time-series data: Convert user-defined list of tuples with inner lists of lists to list of tuples for applying in GridSearchCV

I have time series data and want to do walk-forward cross-validation for my ML model in Python. To create the splits I have done the following:

cv_split = [(list_of_lists[:i], list_of_lists[i:i+1]) for i in range(1, len(list_of_lists))] 

(where list_of_lists is, e.g., [[0,1,2],[3,4],[5,6,7,8], ...],
and each inner list holds the observations of a particular year.)

The result for cv_split is a list of tuples whose elements are lists of lists; each tuple looks like ([[0,1,2],[3,4]], [[5,6,7,8]]),
and this is the problem, because GridSearchCV does not accept this.

I know that the following form for my cv_split will work:
([0,1,2,3,4], [5,6,7,8]) (a list of tuples of flat lists).
However, I am struggling to get from ([[0,1,2],[3,4]], [[5,6,7,8]]) to ([0,1,2,3,4], [5,6,7,8]).

Here it is in more detail:

Now I have:

[([[0,1,2],[3,4]], [[5,6,7,8]]),
 ([[0,1,2],[3,4],[5,6,7,8]], [[9,10]]),
 ([[0,1,2],[3,4],[5,6,7,8],[9,10]], [[11,12,13]]),
 ([[0,1,2],[3,4],[5,6,7,8],[9,10],[11,12,13]], [[14,15,16]])]

And I need the following form:

[([0,1,2,3,4], [5,6,7,8]),
 ([0,1,2,3,4,5,6,7,8], [9,10]),
 ([0,1,2,3,4,5,6,7,8,9,10], [11,12,13]),
 ([0,1,2,3,4,5,6,7,8,9,10,11,12,13], [14,15,16])]

I am new to Python and would appreciate any help, ideally with some explanation.

Here is how you can do it with a nested list comprehension:

lst = ([[0,1,2],[3,4]], [[5,6,7,8]])

t = tuple([[a for b in l for a in b] for l in lst])

print(t)

Output:

([0, 1, 2, 3, 4], [5, 6, 7, 8])
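
Since the question asks for some explanation: the nested comprehension above reads left to right like nested for loops. As a rough sketch, the same logic unrolled into explicit loops looks like this (the names flattened and flat are only illustrative and are not part of the original answer):

```python
lst = ([[0, 1, 2], [3, 4]], [[5, 6, 7, 8]])

flattened = []
for l in lst:        # l is one side of the split: a list of per-year lists
    flat = []
    for b in l:      # b is one year's list, e.g. [0, 1, 2]
        for a in b:  # a is a single observation index
            flat.append(a)
    flattened.append(flat)

t = tuple(flattened)
print(t)  # ([0, 1, 2, 3, 4], [5, 6, 7, 8])
```

The outer loop corresponds to "for l in lst", and the two inner loops correspond to "for b in l for a in b" inside the comprehension.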

UPDATE (for the full list of tuples):

lst = [([[0,1,2],[3,4]], [[5,6,7,8]]),
       ([[0,1,2],[3,4],   [5,6,7,8]],[[9,10]]),
       ([[0,1,2],[3,4],   [5,6,7,8],  [9,10]],[[11,12,13]]),
       ([[0,1,2],[3,4],   [5,6,7,8],  [9,10],  [11,12,13]],[[14,15,16]])]

ls = [tuple([[a for b in l for a in b] for l in tt]) for tt in lst]
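
It may also be possible to skip the nested intermediate form and flatten while the splits are built. The following is a minimal sketch, assuming list_of_lists holds the per-year index lists from the question; the helper flatten is a hypothetical name introduced here, and itertools.chain.from_iterable is used as an alternative to the nested comprehension:

```python
from itertools import chain

# Per-year index lists, taken from the example in the question.
list_of_lists = [[0, 1, 2], [3, 4], [5, 6, 7, 8],
                 [9, 10], [11, 12, 13], [14, 15, 16]]


def flatten(groups):
    """Concatenate a list of lists into one flat list (hypothetical helper)."""
    return list(chain.from_iterable(groups))


# Walk-forward splits: train on every year before i, test on year i.
# range(1, ...) is kept from the question's code, so the first split
# trains on a single year.
cv_split = [(flatten(list_of_lists[:i]), list(list_of_lists[i]))
            for i in range(1, len(list_of_lists))]

for train_idx, test_idx in cv_split:
    print(train_idx, test_idx)
```

With range(1, ...) the first split trains on the first year only, so cv_split[1:] corresponds to the four tuples listed in the question. A list of (train, test) index pairs in this shape is what the cv parameter of GridSearchCV accepts as a custom split.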
