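Import data from Google Sheets to a MySQL table
I am using Google Apps Script to import data from a Google Sheet into a MySQL table. I have a very large data set to import from the sheet into the table. However, I am running into an "exceeded maximum execution time" exception. Are there any other options to speed up the execution?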
var address = 'database_IP_address';
var rootPwd = 'root_password';
var user = 'user_name';
var userPwd = 'user_password';
var db = 'database_name';
var root = 'root';
var instanceUrl = 'jdbc:mysql://' + address;
var dbUrl = instanceUrl + '/' + db;
function googleSheetsToMySQL() {
var RecId;
var Code;
var ProductDescription;
var Price;
var dbconnection = Jdbc.getConnection(dbUrl, root, rootPwd);
var statement = dbconnection.createStatement();
var googlesheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('product');
var data = googlesheet.getDataRange().getValues();
for (var i = 1; i < data.length; i++) {
RecId = data[i][0];
Code = data[i][1];
ProductDescription = data[i][2];
Price = data[i][3];
var sql = "{call [dbo].[sp_googlesheetstotable](?,?,?,?)}";
statement = dbconnection.prepareCall(sql);
statement.setString(1, RecId);
statement.setString(2, Code);
statement.setString(3, ProductDescription);
statement.setString(4, Price);
statement.executeUpdate();
}
statement.close();
dbconnection.close();
}
Use batching:
dbconnection.setAutoCommit(false);
// Prepare the call once, outside the loop, so every row reuses the same statement.
var sql = "{call [dbo].[sp_googlesheetstotable](?,?,?,?)}";
statement = dbconnection.prepareCall(sql);
for (var i = 1; i < data.length; i++) {
  RecId = data[i][0];
  Code = data[i][1];
  ProductDescription = data[i][2];
  Price = data[i][3];
  statement.setString(1, RecId);
  statement.setString(2, Code);
  statement.setString(3, ProductDescription);
  statement.setString(4, Price);
  // Queue this row instead of sending it to the database immediately.
  statement.addBatch();
}
// Send all queued rows together, then commit them.
statement.executeBatch();
dbconnection.commit();
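I suspect you may have already found a solution to your problem, but for anyone else who stumbles across this question as I did, there is an easy way to speed up these requests. The OP was nearly there...
Using the provided code: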
function googleSheetsToMySQL() {
var sheetName = 'name_of_google_sheet';
var dbAddress = 'database_ip_address';
var dbUser = 'database_user_name';
var dbPassword = 'database_user_password';
var dbName = 'database_name';
var dbTableName = 'database_table_name';
var dbURL = 'jdbc:mysql://' + dbAddress + '/' + dbName;
// Regarding the statement used by the OP, you might find something like....
//
// "INSERT INTO " + dbTableName + " (recid, code, product_description, price) VALUES (?, ?, ?, ?);";
//
// to be more practical if you're trying to implement the OP's code,
// as you are unlikely to have a stored procedure named 'sp_googlesheetstotable', or may be more
// familiar with basic queries like INSERT, UPDATE, or SELECT
var sql = "{call [dbo].[sp_googlesheetstotable](?,?,?,?)}";
// The more records/requests you load into the statement object, the longer it will take to process,
// which may mean you exceed the execution time before you can do any post processing.
//
// For example, you may want to record the last row you exported in the event the export must be halted
// prematurely. You could create a series of Triggers to re-initiate the export, picking up right where
// you left off.
//
// The other consideration is that you want your GAS memory utilization to remain as low as possible to
// keep things running smoothly and quickly, so try to strike a balance that fits the data you're
// working with.
var maxRecordsPerBatch = 1000;
var spreadsheet = SpreadsheetApp.getActiveSpreadsheet();
var sheet = spreadsheet.getSheetByName(sheetName);
var sheetData = sheet.getDataRange().getValues();
var dbConnection = Jdbc.getConnection(dbURL, dbUser, dbPassword);
// The following only needs to be set when you are changing the statement that needs to be prepared
// or when you need to reset the variable.
//
// For example, if you were to switch to a different sheet which may have different values, columns,
// structure, and/or target database table.
var dbStatement = dbConnection.prepareCall(sql);
var RecId;
var Code;
var ProductDescription;
var Price;
var recordCounter = 0;
var lastRow;
dbConnection.setAutoCommit(false);
for (var i = 1; i < sheetData.length; i++) {
lastRow = (i + 1 == sheetData.length ? true : false);
RecId = sheetData[i][0];
Code = sheetData[i][1];
ProductDescription = sheetData[i][2];
Price = sheetData[i][3];
dbStatement.setString(1, RecId);
dbStatement.setString(2, Code);
dbStatement.setString(3, ProductDescription);
dbStatement.setString(4, Price);
// This command takes what has been set above and adds the request to the array that will be sent
// to the database for processing.
dbStatement.addBatch();
recordCounter += 1;
if (recordCounter == maxRecordsPerBatch || lastRow)
{
try {
dbStatement.executeBatch();
}
catch(e)
{
console.log('Attempted to update TABLE `' + dbTableName + '` in DB `' + dbName + '`, but the following error was returned: ' + e);
}
if (!lastRow)
{ // Reset vars
dbStatement = dbConnection.prepareCall( sql ); // Better to reset this variable to avoid any potential "No operations allowed after statement closed" errors
recordCounter = 0;
}
}
}
dbConnection.commit();
dbConnection.close();
}
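The OP may still run into the execution time limit (I did, with fewer than 10k records), but you should avoid batching individual requests unless you are having trouble locating a problem row.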
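The idea mentioned in the code comments above, recording the last exported row so a later run can pick up where the previous one stopped, could be sketched roughly as follows. The function name `exportWithCheckpoint` and its parameters are hypothetical and are not part of the original answer. `sendBatch` stands in for whatever sends one batch to the database, and `now` is passed in so the sketch does not depend on any Apps Script service.

```javascript
// Hypothetical helper: sends rows in batches until the time budget is used up,
// then returns the index of the next unsent row so a later run can resume there.
// `sendBatch` and `now` are injected, so nothing here depends on Jdbc itself.
function exportWithCheckpoint(rows, startIndex, batchSize, budgetMs, sendBatch, now) {
  var startedAt = now();
  var next = startIndex;
  while (next < rows.length) {
    if (now() - startedAt >= budgetMs) {
      break; // Out of time: stop before starting another batch.
    }
    var batch = rows.slice(next, next + batchSize);
    sendBatch(batch);
    next += batch.length;
  }
  return next; // Equals rows.length once every row has been sent.
}
```

In Apps Script, one option would be to save the returned index with `PropertiesService.getScriptProperties().setProperty(...)` and let a time-driven trigger call the export again, starting from the stored index. The budget should be set comfortably below the execution limit, because a batch that has already started is not interrupted.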
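It is important to remember that each update added to a Statement or PreparedStatement is executed separately by the database. That means some of them may succeed before one of them fails. The statements that succeeded are then applied to the database, but the remaining updates may not be. This can leave inconsistent data in the database.
To avoid this, you can execute the batch update inside a JDBC transaction. When executed inside a transaction, you can make sure that either all of the updates are executed or none are. Any successful updates can be rolled back if one of the updates fails.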
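A minimal sketch of that all-or-nothing pattern is shown below. The function name `insertAllOrNothing` is hypothetical. `conn` is assumed to be a connection like the one returned by `Jdbc.getConnection`, and `rows` is assumed to be an array of arrays whose values line up with the `?` placeholders in `sql`.

```javascript
// Sketch: run one batch inside a transaction and roll back if anything fails.
// `conn` is assumed to behave like a JdbcConnection; `rows` is an array of
// arrays whose values match the placeholders in `sql` (both are assumptions).
function insertAllOrNothing(conn, sql, rows) {
  conn.setAutoCommit(false);
  var stmt = conn.prepareStatement(sql);
  try {
    for (var i = 0; i < rows.length; i++) {
      for (var j = 0; j < rows[i].length; j++) {
        stmt.setString(j + 1, String(rows[i][j])); // JDBC parameters are 1-based.
      }
      stmt.addBatch();
    }
    stmt.executeBatch();
    conn.commit();
    return true;
  } catch (e) {
    conn.rollback(); // Undo any updates that succeeded before the failure.
    return false;
  } finally {
    stmt.close();
  }
}
```

Note that a rollback only helps if the target table uses a transactional storage engine such as InnoDB. On a non-transactional engine such as MyISAM, rows that were already written stay in place.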
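If the time limit is a huge bother, you might try accessing the data in the Sheet externally. I have copied the basic instructions for posterity's sake, but please visit the link if it is still working.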
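- Update composer.json to require "google/apiclient": "^2.0" and run composer update
- Create a project at https://console.developers.google.com/apis/dashboard.
- Click Enable APIs and enable the Google Sheets API
- Go to Credentials, then click Create credentials, and select Service account key
- Choose New service account in the drop-down. Give the account a name; anything is fine.
- For Role I selected Project -> Service Account Actor
- For Key type, choose JSON (the default) and download the file. This file contains a private key, so be very careful with it; it is your credentials, after all
- Finally, edit the sharing permissions for the spreadsheet you want to access and share either View (if you only want to read the file) or Edit (if you need read/write) access with the client_email address you can find in the JSON file.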
<?php
require __DIR__ . '/vendor/autoload.php';
/*
* We need to get a Google_Client object first to handle auth and api calls, etc.
*/
$client = new \Google_Client();
$client->setApplicationName('My PHP App');
$client->setScopes([\Google_Service_Sheets::SPREADSHEETS]);
$client->setAccessType('offline');
/*
* The JSON auth file can be provided to the Google Client in two ways, one is as a string which is assumed to be the
* path to the json file. This is a nice way to keep the creds out of the environment.
*
* The second option is as an array. For this example I'll pull the JSON from an environment variable, decode it, and
* pass along.
*/
$jsonAuth = getenv('JSON_AUTH');
$client->setAuthConfig(json_decode($jsonAuth, true));
/*
* With the Google_Client we can get a Google_Service_Sheets service object to interact with sheets
*/
$sheets = new \Google_Service_Sheets($client);
/*
* To read data from a sheet we need the spreadsheet ID and the range of data we want to retrieve.
* Range is defined using A1 notation, see https://developers.google.com/sheets/api/guides/concepts#a1_notation
*/
$data = [];
// The first row contains the column titles, so let's start pulling data from row 2
$currentRow = 2;
// The range of A2:H will get columns A through H and all rows starting from row 2
$spreadsheetId = getenv('SPREADSHEET_ID');
$range = 'A2:H';
$rows = $sheets->spreadsheets_values->get($spreadsheetId, $range, ['majorDimension' => 'ROWS']);
if (isset($rows['values'])) {
foreach ($rows['values'] as $row) {
/*
* If the first column is empty, treat it as the end of the data and stop reading (this is just for example)
*/
if (empty($row[0])) {
break;
}
$data[] = [
'col-a' => $row[0],
'col-b' => $row[1],
'col-c' => $row[2],
'col-d' => $row[3],
'col-e' => $row[4],
'col-f' => $row[5],
'col-g' => $row[6],
'col-h' => $row[7],
];
/*
* Now for each row we've seen, let's update the I column with the current date
*/
$updateRange = 'I'.$currentRow;
$updateBody = new \Google_Service_Sheets_ValueRange([
'range' => $updateRange,
'majorDimension' => 'ROWS',
'values' => [[date('c')]], // one row containing one cell
]);
$sheets->spreadsheets_values->update(
$spreadsheetId,
$updateRange,
$updateBody,
['valueInputOption' => 'USER_ENTERED']
);
$currentRow++;
}
}
print_r($data);
/* Output:
Array
(
[0] => Array
(
[col-a] => 123
[col-b] => test
[col-c] => user
[col-d] => test user
[col-e] => usertest
[col-f] => email@domain.com
[col-g] => yes
[col-h] => no
)
[1] => Array
(
[col-a] => 1234
[col-b] => another
[col-c] => user
[col-d] =>
[col-e] => another
[col-f] => another@eom.com
[col-g] => no
[col-h] => yes
)
)
*/