Creating BigQueryML Model From Jupyter Notebook

I can create BigQuery ML models from the BigQuery web UI, but I'm trying to keep all of my code in Python notebooks. Is there any way to create the models from the notebook without having to jump out to the web UI? I am already able to use the predict function to generate model results from the Jupyter Notebook.

Thanks.

You don't need to do anything special; just run the CREATE MODEL statement as a standalone query.

Create your dataset

Enter the following code to import the BigQuery Python client library and initialize a client. The BigQuery client is used to send messages to and receive messages from the BigQuery API.

from google.cloud import bigquery

client = bigquery.Client(location="US")

Next, create a BigQuery dataset to store your ML model. Run the following to create your dataset:

dataset = client.create_dataset("bqml_tutorial")
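Re-running that cell once the dataset exists raises a conflict error. A minimal sketch of an idempotent variant follows, assuming the `exists_ok` parameter of `create_dataset`; the helper name `ensure_dataset` is hypothetical, and `client` is assumed to be the `bigquery.Client` created above.

```python
def ensure_dataset(client, dataset_id="bqml_tutorial"):
    """Create the dataset if it is missing; otherwise return the existing one.

    `client` is expected to be a google.cloud.bigquery.Client (assumption).
    """
    # exists_ok=True suppresses the conflict error raised when the
    # dataset has already been created by an earlier run of the notebook.
    return client.create_dataset(dataset_id, exists_ok=True)
```

In the notebook this would be called as `dataset = ensure_dataset(client)`.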

Create your model

Next, create a logistic regression model using the Google Analytics sample dataset for BigQuery. The model is used to predict whether a website visitor will make a transaction. The standard SQL query uses a CREATE MODEL statement to create and train the model. Standard SQL is the default query syntax for the BigQuery Python client library.

The BigQuery Python client library provides a cell magic, %%bigquery, which runs a SQL query and returns the results as a pandas DataFrame. If the magic is not already available in your notebook, load it first with %load_ext google.cloud.bigquery.

Run the following CREATE MODEL query to create and train your model:

%%bigquery
CREATE OR REPLACE MODEL `bqml_tutorial.sample_model`
OPTIONS(model_type='logistic_reg') AS
SELECT
  IF(totals.transactions IS NULL, 0, 1) AS label,
  IFNULL(device.operatingSystem, "") AS os,
  device.isMobile AS is_mobile,
  IFNULL(geoNetwork.country, "") AS country,
  IFNULL(totals.pageviews, 0) AS pageviews
FROM
  `bigquery-public-data.google_analytics_sample.ga_sessions_*`
WHERE
  _TABLE_SUFFIX BETWEEN '20160801' AND '20170630'

The query takes several minutes to complete. After the first iteration is complete, your model (sample_model) appears in the navigation panel of the BigQuery web UI. Because the query uses a CREATE MODEL statement to create a model rather than return rows, you do not see query results. The output is an empty DataFrame.
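If you would rather not depend on the cell magic, the same statement can be submitted as an ordinary query job from Python. The following is a sketch under assumptions, not the only way to do it: `train_model` is a hypothetical helper name, and `client` is assumed to be a `bigquery.Client` like the one created earlier.

```python
# The same CREATE MODEL statement as above, held in a Python string.
CREATE_MODEL_SQL = """
CREATE OR REPLACE MODEL `bqml_tutorial.sample_model`
OPTIONS(model_type='logistic_reg') AS
SELECT
  IF(totals.transactions IS NULL, 0, 1) AS label,
  IFNULL(device.operatingSystem, "") AS os,
  device.isMobile AS is_mobile,
  IFNULL(geoNetwork.country, "") AS country,
  IFNULL(totals.pageviews, 0) AS pageviews
FROM
  `bigquery-public-data.google_analytics_sample.ga_sessions_*`
WHERE
  _TABLE_SUFFIX BETWEEN '20160801' AND '20170630'
"""


def train_model(client, sql=CREATE_MODEL_SQL):
    """Submit a CREATE MODEL statement and block until the job finishes.

    `client` is expected to be a google.cloud.bigquery.Client (assumption).
    """
    job = client.query(sql)  # starts the query job
    job.result()             # waits for completion; raises if the job failed
    return job
```

Calling `train_model(client)` in a notebook cell would train the model without leaving Python. Afterwards, `client.get_model("bqml_tutorial.sample_model")` should be able to confirm that the model exists.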

