Create federated learning data

I am trying to create a federated learning dataset that I want to use later to train an ensemble of models (not for Fed-avg). I am trying the following (this code can be found in the official TFF tutorials):

emnist_train, emnist_test = tff.simulation.datasets.emnist.load_data()

Then I define some helpers for pre-processing:


def preprocess(dataset):

  def batch_format_fn(element):
    """Flatten a batch `pixels` and return the features as an `OrderedDict`."""
    return collections.OrderedDict(
        x=tf.reshape(element['pixels'], [-1, 784]),
        y=tf.reshape(element['label'], [-1, 1]))

  return dataset.repeat(NUM_EPOCHS).shuffle(SHUFFLE_BUFFER, seed=1).batch(
      BATCH_SIZE).map(batch_format_fn).prefetch(PREFETCH_BUFFER)


def make_federated_data(client_data, client_ids):
  return [
      preprocess(client_data.create_tf_dataset_for_client(x))
      for x in client_ids
  ]

The next step is creating the federated data:

sample_clients = emnist_train.client_ids[0:NUM_CLIENTS]

federated_train_data = make_federated_data(emnist_train, sample_clients)

federated_train_data is a list of items, each of which is a collection of OrderedDict objects. Each OrderedDict holds x (pixels) and y (label). I need to extract X and Y and feed them to a Keras model like the one below:
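For reference, the nesting described above can be mimicked with synthetic data. This is only a sketch: in the real code each item of federated_train_data is a tf.data.Dataset rather than a Python list, and BATCH_SIZE, NUM_CLIENTS and the helper fake_client_batches are assumed, hypothetical names and values that do not come from the question.

```python
import collections

import numpy as np

BATCH_SIZE = 20   # assumed value; the question does not state it
NUM_CLIENTS = 3   # assumed value; the question does not state it


def fake_client_batches(num_examples, rng):
    """Hypothetical stand-in for one client's batches (random data, not EMNIST)."""
    pixels = rng.random((num_examples, 784), dtype=np.float32)
    labels = rng.integers(0, 10, size=(num_examples, 1)).astype(np.int32)
    # One OrderedDict per batch, mirroring what batch_format_fn returns.
    return [
        collections.OrderedDict(x=pixels[i:i + BATCH_SIZE],
                                y=labels[i:i + BATCH_SIZE])
        for i in range(0, num_examples, BATCH_SIZE)
    ]


rng = np.random.default_rng(0)
federated_like = [fake_client_batches(50, rng) for _ in range(NUM_CLIENTS)]

first_batch = federated_like[0][0]
last_batch = federated_like[0][-1]
print(first_batch["x"].shape, first_batch["y"].shape)  # (20, 784) (20, 1)
print(last_batch["x"].shape)                           # (10, 784)
```

Note that the last batch of a client can be smaller than the others, which matters when the batches are later stacked into one array.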

one_client_data = tfds.as_numpy(federated_train_data[0])
pd = pd.DataFrame(one_client_data)
X = pd['x']
Y = pd['y']

def create_keras_model():
  return tf.keras.models.Sequential([
      tf.keras.layers.InputLayer(input_shape=(784,)),
      tf.keras.layers.Dense(10, kernel_initializer='zeros'),
      tf.keras.layers.Softmax(),
  ])

model = create_keras_model()
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

# Fit data to model
history = model.fit(X, Y,
            batch_size=32,
            epochs=5,
            verbose=1)

But I am getting an error:

ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).

Any ideas?

First of all, there is a small problem: you are overwriting your pandas import with a DataFrame:

pd = pd.DataFrame(one_client_data)

So, let's just change that to df:

df = pd.DataFrame(one_client_data)
X = df['x']
Y = df['y']

Secondly, this gives you X and Y as pd.Series objects, not NumPy arrays. Each entry of the Series is itself a NumPy array (one batch), so the Series has object dtype, and that is why the conversion to a Tensor fails. To get these into NumPy arrays, do the following. That will clear up your ValueError. After that, you may have an issue with the shape of your data not matching the model's input shape, but that's a separate issue.

X = np.array(X.tolist())
Y = np.array(Y.tolist())
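On the shape issue: since every row of the DataFrame holds a whole batch, np.array(X.tolist()) produces an array of shape (num_batches, batch_size, 784) when all batches have the same size, and it cannot form a regular array when the last batch is smaller. A minimal sketch of one way around this, using synthetic zero-filled batches in place of the real tfds.as_numpy output (the batch sizes 20, 20, 10 are assumptions), is to concatenate the batches along the first axis:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the batches of one client; sizes are assumed.
batches = [
    {"x": np.zeros((n, 784), dtype=np.float32),
     "y": np.zeros((n, 1), dtype=np.int32)}
    for n in (20, 20, 10)
]

df = pd.DataFrame(batches)
print(df["x"].dtype)  # object: each cell is an ndarray holding one batch

# Join the batches into flat per-example arrays.
X = np.concatenate(df["x"].tolist(), axis=0)
Y = np.concatenate(df["y"].tolist(), axis=0)
print(X.shape, Y.shape)  # (50, 784) (50, 1)
```

With arrays of this shape, X has one row per example and matches the (784,) input of the model in the question.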
