403 permission error when a TPU VM writes to a Cloud Storage bucket

When I run the train.py script from https://github.com/tensorflow/models/tree/master/official/nlp , I get a 403 permission error.

python3 official/nlp/train.py --tpu=con-bert1 --experiment=bert/pretraining --mode=train --model_dir=gs://con_bioberturk/general/ --config_file=gs://con_bioberturk/bert_base.yaml --config_file=gs://con_bioberturk/pretrain.yaml  --params_override="task.init_checkpoint=gs://con_bioberturk/bert-base-turkish-cased-tf/model.ckpt"

and my output is below:

I1115 07:49:02.847452 139877506112576 train_utils.py:368] Saving experiment configuration to gs://con_bioberturk/general/params.yaml
Traceback (most recent call last):
  File "/usr/share/tpu/models/official/modeling/hyperparams/params_dict.py", line 349, in save_params_dict_to_yaml
    yaml.dump(params.as_dict(), f, default_flow_style=False)
  File "/usr/local/lib/python3.8/dist-packages/yaml/__init__.py", line 290, in dump
    return dump_all([data], stream, Dumper=Dumper, **kwds)
  File "/usr/local/lib/python3.8/dist-packages/yaml/__init__.py", line 278, in dump_all
    dumper.represent(data)
  File "/usr/local/lib/python3.8/dist-packages/yaml/representer.py", line 28, in represent
    self.serialize(node)
  File "/usr/local/lib/python3.8/dist-packages/yaml/serializer.py", line 55, in serialize
    self.emit(DocumentEndEvent(explicit=self.use_explicit_end))
  File "/usr/local/lib/python3.8/dist-packages/yaml/emitter.py", line 115, in emit
    self.state()
  File "/usr/local/lib/python3.8/dist-packages/yaml/emitter.py", line 220, in expect_document_end
    self.flush_stream()
  File "/usr/local/lib/python3.8/dist-packages/yaml/emitter.py", line 790, in flush_stream
    self.stream.flush()
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/lib/io/file_io.py", line 219, in flush
    self._writable_file.flush()
tensorflow.python.framework.errors_impl.PermissionDeniedError: Error executing an HTTP request: HTTP response code 403 with body '{
  "error": {
    "code": 403,
    "message": "Access denied.",
    "errors": [
      {
        "message": "Access denied.",
        "domain": "global",
        "reason": "forbidden"
      }
    ]
  }
}

    when initiating an upload to gs://con_bioberturk/general/params.yaml

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "official/nlp/train.py", line 82, in <module>
    app.run(main)
  File "/usr/local/lib/python3.8/dist-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.8/dist-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "official/nlp/train.py", line 47, in main
    train_utils.serialize_config(params, model_dir)
  File "/usr/share/tpu/models/official/core/train_utils.py", line 370, in serialize_config
    hyperparams.save_params_dict_to_yaml(params, params_save_path)
  File "/usr/share/tpu/models/official/modeling/hyperparams/params_dict.py", line 349, in save_params_dict_to_yaml
    yaml.dump(params.as_dict(), f, default_flow_style=False)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/lib/io/file_io.py", line 197, in __exit__
    self.close()
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/lib/io/file_io.py", line 239, in close
    self._writable_file.close()
tensorflow.python.framework.errors_impl.PermissionDeniedError: Error executing an HTTP request: HTTP response code 403 with body '{
  "error": {
    "code": 403,
    "message": "Access denied.",
    "errors": [
      {
        "message": "Access denied.",
        "domain": "global",
        "reason": "forbidden"
      }
    ]
  }
}
'

Here are my settings:

  • TPU VM name: con-bert1
  • TPU software version: tpu-vm-tf-2.10.0-pod
  • The Cloud Storage bucket (con_bioberturk) and the TPU VM are in the same location

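It looks like the service account that is currently active on your TPU VM does not have write access to the bucket. You need to add that service account to the bucket's IAM permissions in Cloud Storage (GCS). Instructions here - https://github.com/google-research/text-to-text-transfer-transformer/issues/1003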
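As a rough sketch of that step (not the exact procedure from the linked issue), the helper below looks up which service account the VM is running as and builds a gcloud command that grants it a role on the bucket. The function names are made up for illustration, and the choice of roles/storage.objectAdmin is an assumption; pick whichever role matches what your training job needs to do.

```python
import urllib.request

# Metadata-server endpoint that reports the VM's default service account.
# It only resolves from inside a Google Cloud VM (such as a TPU VM).
METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/email"
)


def active_service_account(timeout=5.0):
    """Return the email of the service account attached to this VM."""
    req = urllib.request.Request(
        METADATA_URL, headers={"Metadata-Flavor": "Google"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8").strip()


def grant_command(bucket, service_account, role="roles/storage.objectAdmin"):
    """Build (but do not run) the gcloud command that grants `role` on `bucket`.

    The default role is an assumption for illustration, not taken from the post.
    """
    return [
        "gcloud", "storage", "buckets", "add-iam-policy-binding",
        f"gs://{bucket}",
        f"--member=serviceAccount:{service_account}",
        f"--role={role}",
    ]
```

On the TPU VM, print(active_service_account()) shows the account to authorize. The list returned by grant_command("con_bioberturk", that_email) can then be passed to subprocess.run(..., check=True) from a machine whose user is allowed to change the bucket's IAM policy, which the TPU VM's own service account usually is not.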

If that fails, try running gcloud auth login --update-adc on your TPU VM so that your own user credentials are used as the application default credentials.
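After changing IAM or credentials, a small write probe can confirm the fix without launching the whole training job. The sketch below is an assumption about how one might check it: it writes a tiny file under the model directory with tf.io.gfile.GFile, the same file layer that appears in the traceback. The function name probe_write, the open_fn parameter, and the probe filename are hypothetical.

```python
def probe_write(model_dir, open_fn=None, filename="write_probe.txt"):
    """Try to write a small file under model_dir.

    Returns (ok, error_message). `open_fn` defaults to tf.io.gfile.GFile,
    which understands gs:// paths; it is injectable so the helper can also
    be tried against a local directory with the built-in open().
    """
    if open_fn is None:
        # Imported lazily so the helper can be loaded without TensorFlow.
        import tensorflow as tf
        open_fn = tf.io.gfile.GFile

    path = model_dir.rstrip("/") + "/" + filename
    try:
        # For GCS the upload happens on flush/close, so the 403 surfaces
        # when the `with` block exits, as in the traceback above.
        with open_fn(path, "w") as f:
            f.write("ok\n")
    except Exception as exc:  # e.g. tf.errors.PermissionDeniedError on a 403
        return False, f"{type(exc).__name__}: {exc}"
    return True, ""
```

Calling probe_write("gs://con_bioberturk/general/") on the TPU VM should return (True, "") once access is in place; before that, the second element should carry the PermissionDeniedError text. The probe file is left in the bucket, so delete it afterwards if you do not want it there.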

Hope this resolves your issue.
