I want to transfer data from GCS to BigQuery by embulk and digdag.
But error occurs.
com.google.api.client.googleapis.json.GoogleJsonResponseException: 401 Unauthorized
.......
Error: org.embulk.config.ConfigException: com.google.cloud.storage.StorageException: Anonymous caller does not have storage.objects.list access to the Google Cloud Storage bucket.
command:
in:
type: gcs
bucket: <bucket name>
path_prefix: <file path>
auth_method: compute_engine
parser:
type: poi_excel
sheets: <sheet name>
skip_header_lines: 4
columns:
- {name: 'name', type: string}
.
.
.
out:
type: bigquery
mode: replace
project: <project name>
dataset: <dataset name>
table: <table name>
auth_method: compute_engine
schema_file: <file name of json type>
gcs_bucket: <gcs tmp bucket name>
XXXX.yaml:
$ embulk run target_item_bottoms_config.yaml
2020-07-22 14:27:36.559 +0900: Embulk v0.9.23
2020-07-22 14:27:37.609 +0900 [WARN] (main): DEPRECATION: JRuby org.jruby.embed.ScriptingContainer is directly injected.
2020-07-22 14:27:40.577 +0900 [INFO] (main): Gem's home and path are set by default: "/Users/oniki/.embulk/lib/gems"
2020-07-22 14:27:41.662 +0900 [INFO] (main): Started Embulk v0.9.23
2020-07-22 14:27:41.853 +0900 [INFO] (0001:transaction): Loaded plugin embulk-input-gcs (0.3.2)
2020-07-22 14:27:46.263 +0900 [INFO] (0001:transaction): Loaded plugin embulk-output-bigquery (0.6.4)
2020-07-22 14:27:46.369 +0900 [INFO] (0001:transaction): Loaded plugin embulk-parser-poi_excel (0.1.7)
org.embulk.exec.PartialExecutionException: org.embulk.config.ConfigException: com.google.cloud.storage.StorageException: Anonymous caller does not have storage.objects.list access to the Google Cloud Storage bucket.
at org.embulk.exec.BulkLoader$LoaderState.buildPartialExecuteException(BulkLoader.java:340)
at org.embulk.exec.BulkLoader.doRun(BulkLoader.java:566)
at org.embulk.exec.BulkLoader.access$000(BulkLoader.java:35)
at org.embulk.exec.BulkLoader$1.run(BulkLoader.java:353)
at org.embulk.exec.BulkLoader$1.run(BulkLoader.java:350)
at org.embulk.spi.Exec.doWith(Exec.java:22)
at org.embulk.exec.BulkLoader.run(BulkLoader.java:350)
at org.embulk.EmbulkEmbed.run(EmbulkEmbed.java:242)
at org.embulk.EmbulkRunner.runInternal(EmbulkRunner.java:291)
at org.embulk.EmbulkRunner.run(EmbulkRunner.java:155)
at org.embulk.cli.EmbulkRun.runSubcommand(EmbulkRun.java:431)
at org.embulk.cli.EmbulkRun.run(EmbulkRun.java:90)
at org.embulk.cli.Main.main(Main.java:64)
Suppressed: java.lang.NullPointerException
at org.embulk.exec.BulkLoader.doCleanup(BulkLoader.java:463)
at org.embulk.exec.BulkLoader$3.run(BulkLoader.java:397)
at org.embulk.exec.BulkLoader$3.run(BulkLoader.java:394)
at org.embulk.spi.Exec.doWith(Exec.java:22)
at org.embulk.exec.BulkLoader.cleanup(BulkLoader.java:394)
at org.embulk.EmbulkEmbed.run(EmbulkEmbed.java:245)
... 5 more
Caused by: org.embulk.config.ConfigException: com.google.cloud.storage.StorageException: Anonymous caller does not have storage.objects.list access to the Google Cloud Storage bucket.
at org.embulk.input.gcs.AuthUtils.newClient(AuthUtils.java:81)
at org.embulk.input.gcs.GcsFileInput.listFiles(GcsFileInput.java:49)
at org.embulk.input.gcs.GcsFileInputPlugin.transaction(GcsFileInputPlugin.java:59)
at org.embulk.spi.FileInputRunner.transaction(FileInputRunner.java:62)
at org.embulk.exec.BulkLoader.doRun(BulkLoader.java:507)
... 11 more
Caused by: com.google.cloud.storage.StorageException: Anonymous caller does not have storage.objects.list access to the Google Cloud Storage bucket.
at com.google.cloud.storage.spi.v1.HttpStorageRpc.translate(HttpStorageRpc.java:226)
at com.google.cloud.storage.spi.v1.HttpStorageRpc.list(HttpStorageRpc.java:366)
at com.google.cloud.storage.StorageImpl$8.call(StorageImpl.java:338)
at com.google.cloud.storage.StorageImpl$8.call(StorageImpl.java:335)
at com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:105)
at com.google.cloud.RetryHelper.run(RetryHelper.java:76)
at com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:50)
at com.google.cloud.storage.StorageImpl.listBlobs(StorageImpl.java:334)
at com.google.cloud.storage.StorageImpl.list(StorageImpl.java:290)
at org.embulk.input.gcs.AuthUtils.newClient(AuthUtils.java:77)
... 15 more
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 401 Unauthorized
{
"code" : 401,
"errors" : [ {
"domain" : "global",
"location" : "Authorization",
"locationType" : "header",
"message" : "Anonymous caller does not have storage.objects.list access to the Google Cloud Storage bucket.",
"reason" : "required"
} ],
"message" : "Anonymous caller does not have storage.objects.list access to the Google Cloud Storage bucket."
}
at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:150)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:40)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest$1.interceptResponse(AbstractGoogleClientRequest.java:401)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1097)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:499)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:432)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:549)
at com.google.cloud.storage.spi.v1.HttpStorageRpc.list(HttpStorageRpc.java:356)
... 23 more
Error: org.embulk.config.ConfigException: com.google.cloud.storage.StorageException: Anonymous caller does not have storage.objects.list access to the Google Cloud Storage bucket.
output:
$ gcloud config list
[compute]
region = us-east1
zone = us-east1-c
[core]
account = myname@xxx.com
disable_usage_reporting = False
project = <project ID>
Your active configuration is: [default]
$ gcloud auth list
Credentialed Accounts
ACTIVE ACCOUNT
* myname@xxxx.com
To set the active account, run:
$ gcloud config set account `ACCOUNT`
$ gsutil ls
gs://<bucket name>
my environment:
$ gcloud config list [compute] region = us-east1 zone = us-east1-c [core] account = myname@xxx.com disable_usage_reporting = False project = <project ID> Your active configuration is: [default] $ gcloud auth list Credentialed Accounts ACTIVE ACCOUNT * myname@xxxx.com To set the active account, run: $ gcloud config set account `ACCOUNT` $ gsutil ls gs://<bucket name>
my gcp IAM role:
owner
I understand that the solution to this error is authorization. But my preferences seem to be fine.
what's wrong?
As the documentation [1], if we have 401- Unauthorized error then there could be many reasons, please have a related list of reasons listed below [followed the link 1], which could be helpful for troubleshooting:
Reason:AuthenticationRequiredRequesterPays
Access to a Requester Pays bucket requires authentication.
Reason: authError
This error indicates a problem with the authorization provided in the request to Cloud Storage. The following are some situations where that will occur: The OAuth access token has expired and needs to be refreshed. This can be avoided by refreshing the access token early, but code can also catch this error, refresh the token and retry automatically. Multiple non-matching authorizations were provided; choose one mode only. The OAuth access token's bound project does not match the project associated with the provided developer key. The Authorization header was of an unrecognized format or uses an unsupported credential type.
reason:lockedDomainExpired
When downloading content from a cookie-authenticated site, eg, using the Storage Browser, the response will redirect to a temporary domain. This error will occur if access to said domain occurs after the domain expires. Issue the original request again, and receive a new redirect.
Reason: push.webhookUrlUnauthorized
Requests to storage.objects.watchAll will fail unless you verify you own the domain.
Reason: required
Access to a non-public method that requires authorization was made, but none was provided in the Authorization header or through other means.
[1] https://cloud.google.com/storage/docs/json_api/v1/status-codes#401_Unauthorized
I try locally, and create Service Account Key and save at local.
◾️XXXX.yaml
before
auth_method: compute_engine
after
auth_method: json_key
json_keyfile: /path/to/json_keyfile.json
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.