
Create federated learning data

I am trying to create a federated learning dataset that I want to use later to train an ensemble of models (not for Fed-avg). I am trying the following (this code can be found in the official TFF tutorials):

emnist_train, emnist_test = tff.simulation.datasets.emnist.load_data()

then defining some helpers for pre-processing:


def preprocess(dataset):

  def batch_format_fn(element):
    """Flatten a batch `pixels` and return the features as an `OrderedDict`."""
    return collections.OrderedDict(
        x=tf.reshape(element['pixels'], [-1, 784]),
        y=tf.reshape(element['label'], [-1, 1]))

  return dataset.repeat(NUM_EPOCHS).shuffle(SHUFFLE_BUFFER, seed=1).batch(
      BATCH_SIZE).map(batch_format_fn).prefetch(PREFETCH_BUFFER)


def make_federated_data(client_data, client_ids):
  return [
      preprocess(client_data.create_tf_dataset_for_client(x))
      for x in client_ids
  ]

The next step is to create the federated data:

sample_clients = emnist_train.client_ids[0:NUM_CLIENTS]

federated_train_data = make_federated_data(emnist_train, sample_clients)

federated_train_data is a list with one dataset per client. Each dataset yields batches in the form of an OrderedDict with the keys x (pixels) and y (label). I need to extract X and Y and feed them to a Keras model, as below:

one_client_data = tfds.as_numpy(federated_train_data[0])
pd = pd.DataFrame(one_client_data)
X = pd['x']
Y = pd['y']
def create_keras_model():
  return tf.keras.models.Sequential([
      tf.keras.layers.InputLayer(input_shape=(784,)),
      tf.keras.layers.Dense(10, kernel_initializer='zeros'),
      tf.keras.layers.Softmax(),
  ])

model = create_keras_model()
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

# Fit data to model
history = model.fit(X, Y,
            batch_size=32,
            epochs=5,
            verbose=1)

But I am getting the following error:

ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).

Any ideas?

First of all, there is a naming problem: this line rebinds the name pd from the pandas module to a DataFrame, so pandas is no longer reachable through pd afterwards:

pd = pd.DataFrame(one_client_data)

So, let's just change that to df:

df = pd.DataFrame(one_client_data)
X = df['x']
Y = df['y']

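Secondly, this gives you X and Y as pd.Series objects, not NumPy arrays. Each entry of those Series is itself a NumPy array (one batch), so the Series has object dtype, which is what TensorFlow fails to convert. To get these into plain NumPy arrays, do the following. That will clear up your ValueError. After that, you may still have a mismatch between the shape of your data and the input shape the model expects, but that's a separate issue.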

X = np.array(X.tolist())
Y = np.array(Y.tolist())
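
As a possible follow-up on the shape issue: np.array(X.tolist()) stacks the batches, which assumes every batch has the same size and keeps the batch dimension in the result. A minimal sketch of an alternative is to concatenate the batches along the first axis instead, which gives one row per example. The helper name batches_to_arrays and the synthetic fake_batches below are made up for illustration and are not part of TFF; with the real data you would presumably pass tfds.as_numpy(federated_train_data[0]) instead.

```python
import collections

import numpy as np


def batches_to_arrays(batches):
    """Concatenate an iterable of {'x': ..., 'y': ...} batches into two arrays.

    Hypothetical helper, not part of TFF or TFDS.
    """
    xs, ys = [], []
    for batch in batches:
        xs.append(np.asarray(batch['x']))
        ys.append(np.asarray(batch['y']))
    # Joining along axis 0 merges the batch dimension, so a smaller final
    # batch does not produce a ragged (object-dtype) array.
    return np.concatenate(xs, axis=0), np.concatenate(ys, axis=0)


# Synthetic stand-in for one client's batches (made-up data, not EMNIST):
# two full batches of 20 examples and one partial batch of 7.
rng = np.random.default_rng(0)
fake_batches = [
    collections.OrderedDict(
        x=rng.random((n, 784), dtype=np.float32),
        y=rng.integers(0, 10, size=(n, 1), dtype=np.int32))
    for n in (20, 20, 7)
]

X, Y = batches_to_arrays(fake_batches)
print(X.shape, X.dtype)  # (47, 784) float32
print(Y.shape, Y.dtype)  # (47, 1) int32
```

Arrays of this form (one 784-long row per example, one integer label per row) line up with the InputLayer(input_shape=(784,)) and sparse_categorical_crossentropy setup in the question. Note that preprocess() already repeats the data NUM_EPOCHS times, so the concatenated arrays contain each example more than once.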

