
Timeseries input to an LSTM

I have a dataset containing water samples collected from different locations. For example, ABC1 is a water sample taken from a river in Arizona and ABC2 is a water sample taken from a river in Boston. Both are rivers and share the same feature columns (pH, temp, etc.), but because they are in different locations, the changes in their features are specific to each one. My goal is to create a single river model, because I do not have enough data to create individual models. There are 11 columns in total whose next-month values I want to predict. My dataset looks like this:

Date         Sample_Name        pH    temp    etc...

2009-01-01    ABC1              7.2    12
2009-01-02    ABC2              5.5    11
.
.
2009-01-02    ABC1              7.2    10
2009-01-02    ABC2              7.3    10
.
.
2013-06-02    ABC2              6.5    22
2013-06-04    ABC1              6.5    22
.
2015-01-05    ABC1              8.9    13
2015-01-05    ABC4              8.8    13

I want to feed every sample and its sequence to an LSTM model. For example, every measurement (row) of ABC1 must be given to the model as a sequence, or a batch. Is it possible to do this kind of data preparation using TimeseriesGenerator? How can I prepare my data so that I can feed it to the model as described? Also, does it help to sort the dataset by date and sample name (alphabetically)? I am trying to achieve something like this.

I want to generate data using:

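from keras.preprocessing.sequence import TimeseriesGenerator

n_timesteps = 2
n_features = 10
batch_size = 5
# `length` was undefined in the original snippet; n_timesteps is assumed to be the intended window length.
generator = TimeseriesGenerator(df, df, length=n_timesteps, sampling_rate=10, stride=1, batch_size=batch_size)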

The simple LSTM model that I want to feed my data into:

from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras.utils import Sequence

model = Sequential()
model.add(LSTM(n_features, activation='relu', input_shape=(n_timesteps, n_features)))
model.add(Dense(10))
model.compile(optimizer='adam', loss='mse', metrics = ['accuracy'])

Looking at the docs, tf.keras.preprocessing.sequence.TimeseriesGenerator cannot take a dictionary as the first argument. The 'slice' error is just a manifestation of that fact: the function tries to take slices of the first argument (a dict) and fails. Again from the docs:

Arguments: data: Indexable generator (such as list or Numpy array) containing consecutive data points (timesteps).
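The difference between an indexable sequence and a dict can be reproduced without Keras. The sketch below is only an illustration: `first_window` is a hypothetical helper that slices its argument the way a windowing routine would, and the array values are made up. The exact exception type depends on the Python version, so both are caught.

```python
import numpy as np

def first_window(data, length):
    """Take the first `length` timesteps by positional slicing."""
    return data[0:length]

# Hypothetical stand-in for one sample's measurements (columns: pH, temp).
abc1 = np.array([[7.2, 12.0], [7.2, 10.0], [6.5, 22.0], [8.9, 13.0]])
input_dict = {"ABC1": abc1}

# A NumPy array is indexed by position, so slicing works.
window = first_window(input_dict["ABC1"], length=2)

# A dict is indexed by key, so the same slice fails.
# Older Python versions raise TypeError (slice is unhashable);
# Python 3.12+ raises KeyError because slice objects became hashable.
try:
    first_window(input_dict, length=2)
    dict_slice_failed = False
    error_name = None
except (TypeError, KeyError) as err:
    dict_slice_failed = True
    error_name = type(err).__name__

print(window.shape, dict_slice_failed, error_name)
```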

So perhaps you want to pass input_dict['ABC1'] or possibly input_dict['ABC1'].values.
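To extend this to every sample while still training one shared model, one option is to build the windows separately for each sample and then concatenate them, so that no window mixes rows from two rivers. The following is a minimal NumPy sketch under stated assumptions, not the generator's own implementation: `make_windows` is a hypothetical helper, the values in `input_dict` are made up, and each array is assumed to be already sorted by date. Each window of `n_timesteps` consecutive rows is paired with the row that follows it, which is intended to mirror what TimeseriesGenerator produces with sampling_rate=1 and stride=1.

```python
import numpy as np

def make_windows(values, n_timesteps):
    """Return (X, y): X[k] is n_timesteps consecutive rows, y[k] is the next row."""
    values = np.asarray(values, dtype=float)
    n_features = values.shape[1]
    n_windows = len(values) - n_timesteps
    if n_windows <= 0:
        # Too few rows in this sample to form even one window.
        return np.empty((0, n_timesteps, n_features)), np.empty((0, n_features))
    X = np.stack([values[k:k + n_timesteps] for k in range(n_windows)])
    y = values[n_timesteps:]
    return X, y

# Hypothetical per-sample arrays, each sorted by date (columns: pH, temp).
input_dict = {
    "ABC1": np.array([[7.2, 12.0], [7.2, 10.0], [6.5, 22.0], [8.9, 13.0]]),
    "ABC2": np.array([[5.5, 11.0], [7.3, 10.0], [6.5, 22.0]]),
}

n_timesteps = 2
# Window each sample on its own, then stack all windows into one training set.
parts = [make_windows(arr, n_timesteps) for arr in input_dict.values()]
X = np.concatenate([p[0] for p in parts])
y = np.concatenate([p[1] for p in parts])

print(X.shape, y.shape)  # (samples, n_timesteps, n_features), (samples, n_features)
```

The resulting `X` has the (samples, timesteps, features) layout that an LSTM layer with input_shape=(n_timesteps, n_features) expects. Sorting by date within each sample matters here; sorting by sample name does not, because the samples are windowed independently.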

