
Creating a BigQuery dataset from a log sink in GCP

When running

gcloud logging sinks list

it seems I have several sinks for my project

▶ gcloud logging sinks list
NAME                    DESTINATION                                                                                    FILTER
myapp1                  bigquery.googleapis.com/projects/myproject/datasets/myapp1                      resource.type="k8s_container" resource.labels.cluster_name="mygkecluster" resource.labels.container_name="myapp1"
myapp2                  bigquery.googleapis.com/projects/myproject/datasets/myapp2                      resource.type="k8s_container" resource.labels.cluster_name="mygkecluster" resource.labels.container_name="myapp2"
myapp3                  bigquery.googleapis.com/projects/myproject/datasets/myapp3                      resource.type="k8s_container" resource.labels.cluster_name="mygkecluster" resource.labels.container_name="myapp3"

However, when I navigate in my BigQuery console, I don't see the corresponding datasets.

Is there a way to import these sinks as datasets so that I can run queries against them?

This guide on creating BigQuery datasets does not list how to do so from a log sink (unless I am missing something)

Also any idea why the above datasets are not displayed when using the bq ls command?

First, make sure you are in the correct project. If you are not, you can bring in a dataset from another project by clicking the PIN button (you need sufficient permissions for this).

Second, a Cloud Logging sink to BigQuery doesn't create the dataset, only the tables. So if you created the sinks without the datasets, your sinks aren't running (or are failing with errors). The documentation gives more details:

BigQuery: Select or create the particular dataset to receive the exported logs. You also have the option to use partitioned tables.
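A quick way to confirm whether the destination datasets are really missing is to ask bq about each one. This is a minimal sketch: the helper name dataset_status is made up for illustration, and it assumes that bq show exits with a non-zero status when the dataset cannot be found or accessed.

```shell
# Hypothetical helper: prints "exists" or "missing" for one dataset.
# Assumes `bq show` returns a non-zero exit status when the dataset
# is not found (or is not accessible to the current account).
# Usage: dataset_status PROJECT DATASET
dataset_status() {
  if bq show "$1:$2" >/dev/null 2>&1; then
    echo "exists"
  else
    echo "missing"
  fi
}

# Example, using the names from the question:
#   for ds in myapp1 myapp2 myapp3; do
#     echo "$ds: $(dataset_status myproject "$ds")"
#   done
```

If every sink destination reports "missing", that would match the explanation above: the sinks exist, but there is nowhere for them to write.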

In general, your expectation of this feature is right: the point of using BigQuery as a log sink is to let you query the logs with BigQuery. I believe the problem you're facing comes from using the web console versus gcloud.

When using BigQuery as log sink, there are 2 ways to specify a dataset:

  1. point to an existing dataset
  2. create a new dataset

When creating a new sink via the web console, there's an option to have Cloud Logging create a new dataset for you as well. However, gcloud logging sinks create does not automatically create a dataset for you; it only creates the log sink. It also does not seem to validate whether the specified dataset exists.

To resolve this, either use the web console for the task or create the datasets yourself. There's nothing special about a BigQuery dataset used as a log sink destination compared to one created for any other purpose. Create the dataset, then create a log sink that points to it, and you're good to go.
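The "dataset first, then sink" order can be sketched from the command line as follows. The function name create_sink_with_dataset is hypothetical, and the sketch reuses the dataset name as the sink name only because the question does the same. For sinks that already exist, as in the question, presumably only the bq mk step is needed.

```shell
# Hypothetical helper: create the BigQuery dataset, then a sink pointing at it.
# The sink is only created if the dataset creation succeeded.
# Usage: create_sink_with_dataset PROJECT DATASET FILTER
create_sink_with_dataset() {
  bq mk --dataset "$1:$2" &&
    gcloud logging sinks create "$2" \
      "bigquery.googleapis.com/projects/$1/datasets/$2" \
      --project="$1" \
      --log-filter="$3"
}

# Example, using the names from the question:
#   create_sink_with_dataset myproject myapp1 \
#     'resource.type="k8s_container" resource.labels.container_name="myapp1"'
```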

Conceptually, different GCP products (BigQuery, Cloud Logging) run independently. A log sink in Cloud Logging is simply an object that pairs a filter with a destination; it does not own or manage the destination resource (e.g. the BigQuery dataset). The web console just provides some extra integration to make things easier.
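One consequence of the sink not managing its destination is that the sink's writer identity (a service account) also has to be allowed to write to the dataset. The sketch below reads that identity from the sink and grants it the BigQuery Data Editor role. The function name grant_sink_writer is made up, and granting the role at project level is a simplification chosen here; a dataset-level grant would be narrower.

```shell
# Hypothetical helper: grant a sink's writer identity write access to BigQuery.
# Assumption: a project-level roles/bigquery.dataEditor binding is acceptable.
# Usage: grant_sink_writer PROJECT SINK_NAME
grant_sink_writer() {
  # writerIdentity already carries the "serviceAccount:" prefix.
  writer="$(gcloud logging sinks describe "$2" \
    --project="$1" --format='value(writerIdentity)')" || return 1
  [ -n "$writer" ] || return 1

  gcloud projects add-iam-policy-binding "$1" \
    --member="$writer" \
    --role='roles/bigquery.dataEditor'
}

# Example: grant_sink_writer myproject myapp1
```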
