
Import data from Google Sheets to a MySQL table

I am importing data from Google Sheets into a MySQL table using Google Apps Script. The dataset is very large, and the import fails with an "Exceeded maximum execution time" exception. Are there other options to speed up execution?

var address = 'database_IP_address';
var rootPwd = 'root_password';
var user = 'user_name';
var userPwd = 'user_password';
var db = 'database_name';

var root = 'root';
var instanceUrl = 'jdbc:mysql://' + address;
var dbUrl = instanceUrl + '/' + db;

function googleSheetsToMySQL() {   

  var RecId;
  var Code;
  var ProductDescription;
  var Price;

  var dbconnection = Jdbc.getConnection(dbUrl, root, rootPwd);
  var statement = dbconnection.createStatement();
  var googlesheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('product'); 
  var data = googlesheet.getDataRange().getValues();  

  for (var i = 1; i < data.length; i++) {
    RecId = data[i][0];
    Code = data[i][1];
    ProductDescription = data[i][2];
    Price = data[i][3];

    var sql = "{call [dbo].[sp_googlesheetstotable](?,?,?,?)}";
    statement = dbconnection.prepareCall(sql);
    statement.setString(1, RecId);
    statement.setString(2, Code);
    statement.setString(3, ProductDescription);
    statement.setString(4, Price);
    statement.executeUpdate();
  }

  statement.close();
  dbconnection.close();
}

Using batch execution

dbconnection.setAutoCommit(false);

for (var i = 1; i < data.length; i++) {
  RecId = data[i][0];
  Code = data[i][1];
  ProductDescription = data[i][2];
  Price = data[i][3];

  var sql = "{call [dbo].[sp_googlesheetstotable](?,?,?,?)}";
  statement = dbconnection.prepareCall(sql);
  statement.setString(1, RecId);
  statement.setString(2, Code);
  statement.setString(3, ProductDescription);
  statement.setString(4, Price);
  statement.addBatch();
  statement.executeBatch();
}

dbconnection.commit();

I suspect that you may have figured out the solution to your problem, but for all those who might stumble across this like I did, there is an easy way to speed up these requests. The OP was nearly there...

Using the provided code:

function googleSheetsToMySQL() {

  var sheetName = 'name_of_google_sheet';

  var dbAddress = 'database_ip_address';
  var dbUser = 'database_user_name';
  var dbPassword = 'database_user_password';
  var dbName = 'database_name';
  var dbTableName = 'database_table_name';

  var dbURL = 'jdbc:mysql://' + dbAddress + '/' + dbName;

  // Regarding the statement used by the OP, you might find something like....
  //
  // "INSERT INTO " + dbTableName + " (recid, code, product_description, price) VALUES (?, ?, ?, ?);";
  //
  // to be more practical if you're trying to implement the OP's code, 
  // as you are unlikely to have a stored procedure named 'sp_googlesheetstotable', or may be more 
  // familiar with basic queries like INSERT, UPDATE, or SELECT

  var sql = "{call [dbo].[sp_googlesheetstotable](?,?,?,?)}";

  // The more records/requests you load into the statement object, the longer it will take to process,
  // which may mean you exceed the execution time before you can do any post processing.
  //
  // For example, you may want to record the last row you exported in the event the export must be halted
  // prematurely. You could create a series of Triggers to re-initiate the export, picking up right where
  // you left off.
  //
  // The other consideration is that you want your GAS memory utilization to remain as low as possible to
  // keep things running smoothly and quickly, so try to strike a balance that fits the data you're
  // working with.

  var maxRecordsPerBatch = 1000;

  var spreadsheet = SpreadsheetApp.getActiveSpreadsheet();
  var sheet = spreadsheet.getSheetByName(sheetName);

  var sheetData = sheet.getDataRange().getValues();

  var dbConnection = Jdbc.getConnection(dbURL, dbUser, dbPassword);

  // The following only needs to be set when you are changing the statement that needs to be prepared
  // or when you need to reset the variable.
  //
  // For example, if you were to switch to a different sheet which may have different values, columns,
  // structure, and/or target database table.

  var dbStatement = dbConnection.prepareCall(sql);

  var RecId;
  var Code;
  var ProductDescription;
  var Price;

  var recordCounter = 0;
  var lastRow;

  dbConnection.setAutoCommit(false);

  for (var i = 1; i < sheetData.length; i++) {

    lastRow = (i + 1 == sheetData.length);

    RecId = sheetData[i][0];
    Code = sheetData[i][1];
    ProductDescription = sheetData[i][2];
    Price = sheetData[i][3];

    dbStatement.setString(1, RecId);
    dbStatement.setString(2, Code);
    dbStatement.setString(3, ProductDescription);
    dbStatement.setString(4, Price);

    // This command takes what has been set above and adds the request to the array that will be sent 
    // to the database for processing.

    dbStatement.addBatch();

    recordCounter += 1;

    if (recordCounter == maxRecordsPerBatch || lastRow)
    {
      try {
        dbStatement.executeBatch();
      }
      catch(e)
      {
        console.log('Attempted to update TABLE `' + dbTableName + '` in DB `' + dbName + '`, but the following error was returned: ' + e);
      }

      if (!lastRow)
      { // Reset vars
        dbStatement = dbConnection.prepareCall( sql ); // Better to reset this variable to avoid any potential "No operations allowed after statement closed" errors
        recordCounter = 0;
      }
    }
  }

  dbConnection.commit();
  dbConnection.close();
}

The OP may still have run up against the execution time limit (I did at fewer than 10k records), but you should avoid sending requests one at a time unless you're having trouble locating a problem row.
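To sketch the checkpoint idea mentioned in the comments of the code above (the property key and helper names below are illustrative, not part of the OP's project), the last exported row can be persisted with PropertiesService so a later run can resume from it:

function saveCheckpoint(rowIndex) {
  // Illustrative key name; stores the index of the last successfully exported row.
  PropertiesService.getScriptProperties().setProperty('lastExportedRow', String(rowIndex));
}

function loadCheckpoint() {
  // Returns 1 (the first data row) when no checkpoint has been stored yet.
  var saved = PropertiesService.getScriptProperties().getProperty('lastExportedRow');
  return saved ? Number(saved) : 1;
}

// In the export loop, start i at loadCheckpoint() instead of 1 and call
// saveCheckpoint(i) after each successful executeBatch().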

From this link

It is important to keep in mind that each update added to a Statement or PreparedStatement is executed separately by the database. That means that some of them may succeed before one of them fails. All the statements that have succeeded are now applied to the database, but the rest of the updates may not be. This can result in inconsistent data in the database.

To avoid this, you can execute the batch update inside a JDBC transaction. When executed inside a transaction you can make sure that either all updates are executed, or none are. Any successful updates can be rolled back, in case one of the updates fails.
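To illustrate the quoted advice in Apps Script terms, here is a minimal sketch (the table and column names are placeholders for illustration, not taken from the question) that wraps the whole batch in a transaction and rolls everything back if any statement fails:

function batchInsertWithTransaction(dbConnection, rows) {
  // Assumes dbConnection was obtained via Jdbc.getConnection(); table and
  // column names below are placeholders.
  var stmt = dbConnection.prepareStatement(
      'INSERT INTO product (recid, code, product_description, price) VALUES (?, ?, ?, ?)');

  dbConnection.setAutoCommit(false); // start a transaction
  try {
    for (var i = 0; i < rows.length; i++) {
      stmt.setString(1, rows[i][0]);
      stmt.setString(2, rows[i][1]);
      stmt.setString(3, rows[i][2]);
      stmt.setString(4, rows[i][3]);
      stmt.addBatch();
    }
    stmt.executeBatch();
    dbConnection.commit();   // all updates applied together
  } catch (e) {
    dbConnection.rollback(); // none of the updates are kept if one fails
    throw e;
  } finally {
    stmt.close();
  }
}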

Alternative Solution

If the time limit is a huge bother, you might try externally accessing the data within your Sheets. I've copied the basic instructions for posterity's sake, but please visit the link if it still works.

Link to source

  1. Update composer.json to require "google/apiclient": "^2.0" and run composer update
  2. Create project on https://console.developers.google.com/apis/dashboard .
  3. Click Enable APIs and enable the Google Sheets API
  4. Go to Credentials, then click Create credentials, and select Service account key
  5. Choose New service account in the drop down. Give the account a name, anything is fine.
  6. For Role I selected Project -> Service Account Actor
  7. For Key type, choose JSON (the default) and download the file. This file contains a private key so be very careful with it, it is your credentials after all
  8. Finally, edit the sharing permissions for the spreadsheet you want to access and share either View (if you only want to read the file) or Edit (if you need read/write) access to the client_email address you can find in the JSON file.
<?php
require __DIR__ . '/vendor/autoload.php';


/*
 * We need to get a Google_Client object first to handle auth and api calls, etc.
 */
$client = new \Google_Client();
$client->setApplicationName('My PHP App');
$client->setScopes([\Google_Service_Sheets::SPREADSHEETS]);
$client->setAccessType('offline');

/*
 * The JSON auth file can be provided to the Google Client in two ways, one is as a string which is assumed to be the
 * path to the json file. This is a nice way to keep the creds out of the environment.
 *
 * The second option is as an array. For this example I'll pull the JSON from an environment variable, decode it, and
 * pass along.
 */
$jsonAuth = getenv('JSON_AUTH');
$client->setAuthConfig(json_decode($jsonAuth, true));

/*
 * With the Google_Client we can get a Google_Service_Sheets service object to interact with sheets
 */
$sheets = new \Google_Service_Sheets($client);

/*
 * To read data from a sheet we need the spreadsheet ID and the range of data we want to retrieve.
 * Range is defined using A1 notation, see https://developers.google.com/sheets/api/guides/concepts#a1_notation
 */
$data = [];

// The first row contains the column titles, so let's start pulling data from row 2
$currentRow = 2;

// The range of A2:H will get columns A through H and all rows starting from row 2
$spreadsheetId = getenv('SPREADSHEET_ID');
$range = 'A2:H';
$rows = $sheets->spreadsheets_values->get($spreadsheetId, $range, ['majorDimension' => 'ROWS']);
if (isset($rows['values'])) {
    foreach ($rows['values'] as $row) {
        /*
         * If first column is empty, consider it an empty row and skip (this is just for example)
         */
        if (empty($row[0])) {
            break;
        }

        $data[] = [
            'col-a' => $row[0],
            'col-b' => $row[1],
            'col-c' => $row[2],
            'col-d' => $row[3],
            'col-e' => $row[4],
            'col-f' => $row[5],
            'col-g' => $row[6],
            'col-h' => $row[7],
        ];

        /*
         * Now for each row we've seen, let's update the I column with the current date
         */
        $updateRange = 'I'.$currentRow;
        $updateBody = new \Google_Service_Sheets_ValueRange([
            'range' => $updateRange,
            'majorDimension' => 'ROWS',
            'values' => [[date('c')]], // 'values' must be a two-dimensional array (rows of cells)
        ]);
        $sheets->spreadsheets_values->update(
            $spreadsheetId,
            $updateRange,
            $updateBody,
            ['valueInputOption' => 'USER_ENTERED']
        );

        $currentRow++;
    }
}

print_r($data);
/* Output:
Array
(
    [0] => Array
        (
            [col-a] => 123
            [col-b] => test
            [col-c] => user
            [col-d] => test user
            [col-e] => usertest
            [col-f] => email@domain.com
            [col-g] => yes
            [col-h] => no
        )

    [1] => Array
        (
            [col-a] => 1234
            [col-b] => another
            [col-c] => user
            [col-d] =>
            [col-e] => another
            [col-f] => another@eom.com
            [col-g] => no
            [col-h] => yes
        )

)
 */

Check this related SO question for information on how to import data from Google Spreadsheets into MySQL using Apps Script code.

As for your "Exceeded maximum execution time" error, remember that the Apps Script quotas allow a maximum execution time of only 6 minutes per script execution, which means your script exceeded this limit.

Check this page for a technique to prevent Google Scripts from exceeding the maximum execution time limit.
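The usual technique amounts to watching the elapsed time and scheduling a time-driven trigger to continue the work in a fresh execution. A rough sketch (the five-minute budget and the helper functions are assumptions for illustration):

var MAX_RUNTIME_MS = 5 * 60 * 1000; // stop safely before the 6-minute quota

function importWithTimeGuard() {
  var startTime = new Date().getTime();

  while (moreRowsToProcess()) {   // hypothetical helper: checks the saved checkpoint
    processNextBatch();           // hypothetical helper: exports one batch of rows
    if (new Date().getTime() - startTime > MAX_RUNTIME_MS) {
      // Re-run this function in one minute to pick up where it left off.
      ScriptApp.newTrigger('importWithTimeGuard')
          .timeBased()
          .after(60 * 1000)
          .create();
      return;
    }
  }
}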

