Import data from google sheets to a MySQL table

I am using Google Apps Script to import data from a Google Sheet into a MySQL table. The dataset is very large, and I am running into an "Exceeded maximum execution time" exception. Are there any other options to speed up the execution?

var address = 'database_IP_address';
var rootPwd = 'root_password';
var user = 'user_name';
var userPwd = 'user_password';
var db = 'database_name';

var root = 'root';
var instanceUrl = 'jdbc:mysql://' + address;
var dbUrl = instanceUrl + '/' + db;

function googleSheetsToMySQL() {   

  var RecId;
  var Code;
  var ProductDescription;
  var Price;

  var dbconnection = Jdbc.getConnection(dbUrl, root, rootPwd);
  var statement = dbconnection.createStatement();
  var googlesheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('product'); 
  var data = googlesheet.getDataRange().getValues();  

  for (var i = 1; i < data.length; i++) {
    RecId = data[i][0];
    Code = data[i][1];
    ProductDescription = data[i][2];
    Price = data[i][3];

    var sql = "{call [dbo].[sp_googlesheetstotable](?,?,?,?)}";
    statement = dbconnection.prepareCall(sql);
    statement.setString(1, RecId);
    statement.setString(2, Code);
    statement.setString(3, ProductDescription);
    statement.setString(4, Price);
    statement.executeUpdate();  // one database round trip per row -- this is what makes the import slow
  }

  statement.close();
  dbconnection.close();
}

Use batch processing

dbconnection.setAutoCommit(false);

var sql = "{call [dbo].[sp_googlesheetstotable](?,?,?,?)}";
statement = dbconnection.prepareCall(sql);

for (var i = 1; i < data.length; i++) {
  RecId = data[i][0];
  Code = data[i][1];
  ProductDescription = data[i][2];
  Price = data[i][3];

  statement.setString(1, RecId);
  statement.setString(2, Code);
  statement.setString(3, ProductDescription);
  statement.setString(4, Price);
  statement.addBatch();      // queue the row instead of executing it immediately
}

statement.executeBatch();    // one round trip for the whole batch
dbconnection.commit();

I suspect you may have already found a solution to your problem, but for anyone else who stumbles onto this the way I did, there is a simple way to speed these requests up. The OP was almost there...

Using the code provided:

function googleSheetsToMySQL() {

  var sheetName = 'name_of_google_sheet';

  var dbAddress = 'database_ip_address';
  var dbUser = 'database_user_name';
  var dbPassword = 'database_user_password';
  var dbName = 'database_name';
  var dbTableName = 'database_table_name';

  var dbURL = 'jdbc:mysql://' + dbAddress + '/' + dbName;

  // Regarding the statement used by the OP, you might find something like....
  //
  // "INSERT INTO " + dbTableName + " (recid, code, product_description, price) VALUES (?, ?, ?, ?);";
  //
  // to be more practical if you're trying to implement the OP's code, 
  // as you are unlikely to have a stored procedure named 'sp_googlesheetstotable', or may be more 
  // familiar with basic queries like INSERT, UPDATE, or SELECT

  var sql = "{call [dbo].[sp_googlesheetstotable](?,?,?,?)}";

  // The more records/requests you load into the statement object, the longer it will take to process,
  // which may mean you exceed the execution time before you can do any post processing.
  //
  // For example, you may want to record the last row you exported in the event the export must be halted
  // prematurely. You could create a series of Triggers to re-initiate the export, picking up right where
  // you left off.
  //
  // The other consideration is that you want your GAS memory utilization to remain as low as possible to
  // keep things running smoothly and quickly, so try to strike a balance that fits the data you're
  // working with.

  var maxRecordsPerBatch = 1000;

  var spreadsheet = SpreadsheetApp.getActiveSpreadsheet();
  var sheet = spreadsheet.getSheetByName(sheetName);

  var sheetData = sheet.getDataRange().getValues();

  var dbConnection = Jdbc.getConnection(dbURL, dbUser, dbPassword);

  // The following only needs to be set when you are changing the statement that needs to be prepared
  // or when you need to reset the variable.
  //
  // For example, if you were to switch to a different sheet which may have different values, columns,
  // structure, and/or target database table.

  var dbStatement = dbConnection.prepareCall(sql);

  var RecId;
  var Code;
  var ProductDescription;
  var Price;

  var recordCounter = 0;
  var lastRow;

  dbConnection.setAutoCommit(false);

  for (var i = 1; i < sheetData.length; i++) {

    lastRow = (i + 1 == sheetData.length ? true : false);

    RecId = sheetData[i][0];
    Code = sheetData[i][1];
    ProductDescription = sheetData[i][2];
    Price = sheetData[i][3];

    dbStatement.setString(1, RecId);
    dbStatement.setString(2, Code);
    dbStatement.setString(3, ProductDescription);
    dbStatement.setString(4, Price);

    // This command takes what has been set above and adds the request to the array that will be sent 
    // to the database for processing.

    dbStatement.addBatch();

    recordCounter += 1;

    if (recordCounter == maxRecordsPerBatch || lastRow)
    {
      try {
        dbStatement.executeBatch();
      }
      catch(e)
      {
        console.log('Attempted to update TABLE `' + dbTableName + '` in DB `' + dbName + '`, but the following error was returned: ' + e);
      }

      if (!lastRow)
      { // Reset vars
        dbStatement = dbConnection.prepareCall( sql ); // Better to reset this variable to avoid any potential "No operations allowed after statement closed" errors
        recordCounter = 0;
      }
    }
  }

  dbConnection.commit();
  dbConnection.close();
}
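
The comments in the function above mention recording the last exported row so that a halted export can pick up where it left off. Below is a minimal sketch of that idea, not part of the original answer: it reuses the connection variables from the first snippet (dbUrl, user, userPwd) and the 'product' sheet name, assumes a hypothetical product table with a plain INSERT in place of the stored procedure, and stores its progress in Script Properties.

// Hedged sketch: resume an interrupted export by remembering the last committed row.
// The table name, column names, and batch size are assumptions for illustration.
function resumableGoogleSheetsToMySQL() {
  var props = PropertiesService.getScriptProperties();
  var startRow = Number(props.getProperty('lastExportedRow')) || 1; // row 0 is the header

  var sheetData = SpreadsheetApp.getActiveSpreadsheet()
      .getSheetByName('product')
      .getDataRange()
      .getValues();

  var dbConnection = Jdbc.getConnection(dbUrl, user, userPwd);
  dbConnection.setAutoCommit(false);

  // A plain INSERT, as the comments above suggest, instead of the stored procedure.
  var insertSql = 'INSERT INTO product (recid, code, product_description, price) VALUES (?, ?, ?, ?)';
  var stmt = dbConnection.prepareStatement(insertSql);
  var batchSize = 1000;

  for (var i = startRow; i < sheetData.length; i++) {
    stmt.setString(1, sheetData[i][0]);
    stmt.setString(2, sheetData[i][1]);
    stmt.setString(3, sheetData[i][2]);
    stmt.setString(4, sheetData[i][3]);
    stmt.addBatch();

    var lastRow = (i + 1 === sheetData.length);
    if ((i - startRow + 1) % batchSize === 0 || lastRow) {
      stmt.executeBatch();
      dbConnection.commit();
      props.setProperty('lastExportedRow', String(i + 1)); // resume point for the next run
      if (!lastRow) {
        stmt = dbConnection.prepareStatement(insertSql); // fresh statement, as in the answer above
      }
    }
  }

  stmt.close();
  dbConnection.close();
}

If the script is stopped by the quota, the next run starts from lastExportedRow instead of row 1; once the export finishes, delete the property so a later full re-export starts from the top again.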

The OP may still run into the execution time limit (I did at fewer than 10k records), but you avoid sending individual requests one at a time, unless you run into trouble pinpointing a problem row.

From this link:

It is important to keep in mind that each update added to a Statement or PreparedStatement is executed separately by the database. That means some of them may succeed before one of them fails. All the statements that succeeded are then applied to the database, but the rest of the updates may not be. This can result in inconsistent data in the database.

To avoid this, you can execute the batch update inside a JDBC transaction. When executed inside a transaction, you can make sure that either all updates are executed or none are. Any successful updates can be rolled back if one of the updates fails.
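
With Apps Script's JDBC service, that safeguard looks roughly like the snippet below. It reuses the dbConnection and dbStatement variables from the function above; the rollback handling is an illustrative sketch, not part of the original answer.

// Hedged sketch: run the batch inside an explicit transaction so that a failed
// batch leaves the table unchanged. Variable names follow the function above.
dbConnection.setAutoCommit(false);
try {
  dbStatement.executeBatch();   // send all queued rows in one round trip
  dbConnection.commit();        // keep them only if every row succeeded
} catch (e) {
  dbConnection.rollback();      // otherwise undo whatever part of this batch was applied
  console.log('Batch failed and was rolled back: ' + e);
}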

An alternative

If the time limit is a huge hassle, you can try accessing the sheet's data externally. I have copied the basic instructions below for posterity, but follow the link if it is still valid.

Link to source

  1. Update composer.json to require "google/apiclient": "^2.0" and run composer update
  2. Create a project at https://console.developers.google.com/apis/dashboard.
  3. Click Enable APIs and enable the Google Sheets API
  4. Go to Credentials, then click Create credentials, and select Service account key
  5. Choose New service account in the dropdown. Give the account a name; anything will do.
  6. For Role, I selected Project -> Service Account Actor
  7. For Key type, select JSON (the default) and download the file. This file contains a private key, so be very careful with it; it is your credentials, after all.
  8. Finally, edit the sharing permissions of the spreadsheet you want to access and grant either View (if you only want to read the file) or Edit (if you need read/write) access to the client_email address found in the JSON file.
<?php
require __DIR__ . '/vendor/autoload.php';


/*
 * We need to get a Google_Client object first to handle auth and api calls, etc.
 */
$client = new \Google_Client();
$client->setApplicationName('My PHP App');
$client->setScopes([\Google_Service_Sheets::SPREADSHEETS]);
$client->setAccessType('offline');

/*
 * The JSON auth file can be provided to the Google Client in two ways, one is as a string which is assumed to be the
 * path to the json file. This is a nice way to keep the creds out of the environment.
 *
 * The second option is as an array. For this example I'll pull the JSON from an environment variable, decode it, and
 * pass along.
 */
$jsonAuth = getenv('JSON_AUTH');
$client->setAuthConfig(json_decode($jsonAuth, true));

/*
 * With the Google_Client we can get a Google_Service_Sheets service object to interact with sheets
 */
$sheets = new \Google_Service_Sheets($client);

/*
 * To read data from a sheet we need the spreadsheet ID and the range of data we want to retrieve.
 * Range is defined using A1 notation, see https://developers.google.com/sheets/api/guides/concepts#a1_notation
 */
$data = [];

// The first row contains the column titles, so lets start pulling data from row 2
$currentRow = 2;

// The range of A2:H will get columns A through H and all rows starting from row 2
$spreadsheetId = getenv('SPREADSHEET_ID');
$range = 'A2:H';
$rows = $sheets->spreadsheets_values->get($spreadsheetId, $range, ['majorDimension' => 'ROWS']);
if (isset($rows['values'])) {
    foreach ($rows['values'] as $row) {
        /*
         * If first column is empty, consider it an empty row and skip (this is just for example)
         */
        if (empty($row[0])) {
            break;
        }

        $data[] = [
            'col-a' => $row[0],
            'col-b' => $row[1],
            'col-c' => $row[2],
            'col-d' => $row[3],
            'col-e' => $row[4],
            'col-f' => $row[5],
            'col-g' => $row[6],
            'col-h' => $row[7],
        ];

        /*
         * Now for each row we've seen, lets update the I column with the current date
         */
        $updateRange = 'I'.$currentRow;
        $updateBody = new \Google_Service_Sheets_ValueRange([
            'range' => $updateRange,
            'majorDimension' => 'ROWS',
            'values' => [[date('c')]], // the Sheets API expects a 2D array: rows, each an array of cell values
        ]);
        $sheets->spreadsheets_values->update(
            $spreadsheetId,
            $updateRange,
            $updateBody,
            ['valueInputOption' => 'USER_ENTERED']
        );

        $currentRow++;
    }
}

print_r($data);
/* Output:
Array
(
    [0] => Array
        (
            [col-a] => 123
            [col-b] => test
            [col-c] => user
            [col-d] => test user
            [col-e] => usertest
            [col-f] => email@domain.com
            [col-g] => yes
            [col-h] => no
        )

    [1] => Array
        (
            [col-a] => 1234
            [col-b] => another
            [col-c] => user
            [col-d] =>
            [col-e] => another
            [col-f] => another@eom.com
            [col-g] => no
            [col-h] => yes
        )

)
 */

Try checking this related SO question for information on how to import data from a Google Spreadsheet into MySQL using Apps Script code.

Now, regarding your "Exceeded maximum execution time" error, keep in mind that the Apps Script quotas allow a maximum execution time of only 6 minutes per execution for a single script, which means you have exceeded that limit.

Try checking this page for techniques on how to prevent Google Scripts from exceeding the maximum execution time limit.
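
One such technique, sketched below under the assumption that the export saves its progress somewhere (for example in Script Properties, as in the resumable sketch earlier), is to stop well before the 6-minute ceiling and schedule a one-shot time-based trigger that calls the export function again. The googleSheetsToMySQL name comes from the code above; the helper names and the 5-minute budget are made up for the example.

// Hedged sketch: stop before the 6-minute quota and let a one-shot time-based
// trigger continue the export. The time budget is an assumed safety margin.
var EXPORT_START_MS = new Date().getTime();
var TIME_BUDGET_MS = 5 * 60 * 1000;

function timeIsRunningOut() {
  return new Date().getTime() - EXPORT_START_MS > TIME_BUDGET_MS;
}

function scheduleContinuation() {
  ScriptApp.newTrigger('googleSheetsToMySQL')  // the export function above
      .timeBased()
      .after(60 * 1000)                        // fire again roughly one minute from now
      .create();
}

// Inside the export loop, after each committed batch:
//   if (timeIsRunningOut()) { scheduleContinuation(); return; }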

For more information, check this link:
