Create federated learning data

Question

I am trying to create a federated learning dataset that I want to use later to train an ensemble of models (not for FedAvg). I am trying the following (this code can be found in the official TFF tutorials):

emnist_train, emnist_test = tff.simulation.datasets.emnist.load_data()

Then I define some helpers for preprocessing:


def preprocess(dataset):

  def batch_format_fn(element):
    """Flatten a batch `pixels` and return the features as an `OrderedDict`."""
    return collections.OrderedDict(
        x=tf.reshape(element['pixels'], [-1, 784]),
        y=tf.reshape(element['label'], [-1, 1]))

  return dataset.repeat(NUM_EPOCHS).shuffle(SHUFFLE_BUFFER, seed=1).batch(
      BATCH_SIZE).map(batch_format_fn).prefetch(PREFETCH_BUFFER)


def make_federated_data(client_data, client_ids):
  return [
      preprocess(client_data.create_tf_dataset_for_client(x))
      for x in client_ids
  ]

The next step is creating the federated data:

sample_clients = emnist_train.client_ids[0:NUM_CLIENTS]

federated_train_data = make_federated_data(emnist_train, sample_clients)

federated_train_data is a list with one item per client. Each item is a dataset of batches, and each batch is an OrderedDict holding x (pixels) and y (labels). I need to extract x and y and feed them to a Keras model, like below:

one_client_data = tfds.as_numpy(federated_train_data[0])
pd = pd.DataFrame(one_client_data)
X = pd['x']
Y = pd['y']
def create_keras_model():
  return tf.keras.models.Sequential([
      tf.keras.layers.InputLayer(input_shape=(784,)),
      tf.keras.layers.Dense(10, kernel_initializer='zeros'),
      tf.keras.layers.Softmax(),
  ])

model = create_keras_model()
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

# Fit data to model
history = model.fit(X, Y,
            batch_size=32,
            epochs=5,
            verbose=1)

However, I am getting this error:

ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).

Any ideas?

Answer 1

First of all, there is a problem here: you are overwriting your pandas import (pd) with a DataFrame:

pd = pd.DataFrame(one_client_data)

So, let's just change that to df:

df = pd.DataFrame(one_client_data)
X = df['x']
Y = df['y']

Secondly, this gives you X and Y as pd.Series objects, not NumPy arrays. To convert them into NumPy arrays, do the following; that will clear up your ValueError. After that, you may run into issues with the shape of your data not matching the model's input shape, but that's a separate issue.

X = np.array(X.tolist())
Y = np.array(Y.tolist())
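
On the shape issue mentioned above: because preprocess batches the dataset, each row of the DataFrame presumably holds a whole batch, so the conversion above would give X a leading "number of batches" dimension (or fail to form a regular array if the last batch is shorter). One possible way around this is to concatenate the batches along the first axis. The sketch below is only an illustration: stack_batches is a hypothetical helper name, and the synthetic fake_batches list stands in for what iterating over tfds.as_numpy(federated_train_data[0]) is assumed to yield.

```python
import numpy as np


def stack_batches(batches):
    """Concatenate a sequence of {'x': ..., 'y': ...} batches into two arrays.

    Hypothetical helper: `batches` is assumed to be an iterable of dicts
    whose 'x' values have shape (batch, 784) and 'y' values (batch, 1).
    """
    xs, ys = [], []
    for batch in batches:
        xs.append(np.asarray(batch['x']))
        ys.append(np.asarray(batch['y']))
    # Concatenating along axis 0 merges the batch dimension, so a shorter
    # final batch does not produce a ragged object array.
    return np.concatenate(xs, axis=0), np.concatenate(ys, axis=0)


# Synthetic stand-in for one client's data: two batches of 20 and one of 7.
rng = np.random.default_rng(0)
fake_batches = [
    {'x': rng.random((n, 784), dtype=np.float32),
     'y': rng.integers(0, 10, size=(n, 1), dtype=np.int32)}
    for n in (20, 20, 7)
]

X, Y = stack_batches(fake_batches)
print(X.shape, X.dtype)  # (47, 784) float32
print(Y.shape, Y.dtype)  # (47, 1) int32
```

With arrays of this shape, X matches the (784,) input of the model in the question, and Y holds one integer label per example, which is what sparse_categorical_crossentropy expects.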
