Google Apps Script - XML Parser - Regex

I am using a Google Apps Script that pulls the content of a feed into a sheet.

This is the code that I'm using:

function processXML(FeedURL,sheetsFileDestinationURL,rawPasteSheetName,OPT_childNamesArray,OPT_Namespace){

   var OPT_childNamesArray = ["link"]; // get only item url from the feed

  var GoogleSheetsFile = SpreadsheetApp.openByUrl(sheetsFileDestinationURL);
  var GoogleSheetsPastePage = GoogleSheetsFile.getSheetByName(rawPasteSheetName);
  if (OPT_childNamesArray){
    GoogleSheetsPastePage.getDataRange().offset(1,0).clearContent(); // get all filled cells, omitting the header row, and clear content
  }
  else {
    GoogleSheetsPastePage.getDataRange().offset(0,0).clearContent(); // get all filled cells, INCLUDING the header row, and clear content
  }

  // Generate 2d/md array / rows export based on requested columns and feed
  var exportRows = []; // hold all the rows that are generated to be pasted into the sheet
  var XMLFeedURL = FeedURL;
  var feedContent = UrlFetchApp.fetch(XMLFeedURL).getContentText(); // get the full feed content
  var feedItems = XmlService.parse(feedContent).getRootElement().getChild('channel').getChildren('item'); // get all items in the feed
  for (var x=0; x<feedItems.length; x++){
    // Iterate through items in the XML/RSS feed
    var currentFeedItem = feedItems[x];
    var singleItemArray = []; // use to hold all the values for this single item/row

    // Parse for specific children (requires names and namespace)
    if (OPT_childNamesArray){
      for (var y=0; y<OPT_childNamesArray.length; y++){
        // Iterate through requested children by name and fill rows
        var currentChildName = OPT_childNamesArray[y];
        if (OPT_Namespace){

          if (currentFeedItem.getChild(OPT_childNamesArray[y],OPT_Namespace)){
            singleItemArray.push(currentFeedItem.getChildText(OPT_childNamesArray[y],OPT_Namespace));
          }
          else {
            singleItemArray.push("null");
          }
        }
        else {
          if (currentFeedItem.getChild(OPT_childNamesArray[y])){
            singleItemArray.push(currentFeedItem.getChildText(OPT_childNamesArray[y]));
          }
          else {
            singleItemArray.push("null");
          }
        }
      }
      exportRows.push(singleItemArray);
    }

    // Parse for ALL children, does not require knowing names or namespace
    else if (!OPT_childNamesArray){
      var allChildren = currentFeedItem.getChildren();

      if (x == 0){
        // if looking at first item, create a header row first with column headings
        var headerRow = [];
        for (var h=0; h<allChildren.length; h++){
          headerRow.push(allChildren[h].getName());
        }
        exportRows.push(headerRow);
      }

      for (var c=0; c<allChildren.length; c++){
        singleItemArray.push(allChildren[c].getText());
      }

      exportRows.push(singleItemArray);
    }
  }

  // Paste the generated md array export into the spreadsheet
  if (OPT_childNamesArray){
    GoogleSheetsPastePage.getRange(2,1,exportRows.length,exportRows[1].length).setValues(exportRows);
  }
  else if (!OPT_childNamesArray){
    var maxRangeLength = 0;
    var currentRowIndex = 1;
    for (var x = 0; x<exportRows.length; x++){
      if (exportRows[x].length > maxRangeLength){
        maxRangeLength = exportRows[x].length;
      }
      GoogleSheetsPastePage.getRange(currentRowIndex,1,1,exportRows[x].length).setValues([exportRows[x]]);
      currentRowIndex++;
    }
  }
}

My problem is this:

When I run this code I get:

https://url/115-396/

https://url/115-396/

https://url/115-396/

I need to remove "115-396/".

So I tried adding this code, but it didn't work:

...
  // Paste the generated md array export into the spreadsheet
  if (OPT_childNamesArray){

    for (var k = 0; k < exportRows.length; k++) {
      var re = '115-396/'
      var replacingItem = '';
      var URL = exportRows[0].toString().replace(re, replacingItem);
    }

    GoogleSheetsPastePage.getRange(2,1,exportRows.length,exportRows[1].length).setValue(URL);

  }
  else if (!OPT_childNamesArray){

...

Edit after @Yuri's reply:

  // Paste the generated md array export into the spreadsheet
  if (OPT_childNamesArray){

 for ( k=0; k < exportRows[0].length; k++) {
    var re = '115-396/'
    var replacingItem = '';
    exportRows[0][k] = exportRows[0][k].toString().replace(re, replacingItem); 
  }

      GoogleSheetsPastePage.getRange(2,1,exportRows.length,exportRows[1].length).setValues(exportRows);

  }

result:

https://url/

https://url/115-396/

https://url/115-396/

Basically, the replacement is applied only to the first URL.

How can I make it apply to all the URLs?


Any help would be appreciated. Thanks.

You are using a for loop to iterate through the exportRows array, but you never use the loop counter k inside the loop body.

As a result, you are not walking through the exportRows array at all; you only ever read its first position:

      var URL = exportRows[0].toString().replace(re, replacingItem);

Shouldn't it be this?

      var URL = exportRows[k].toString().replace(re, replacingItem);

Even then it won't work, because URL is a single variable, not an array: every pass through the loop overwrites it, so only the last assignment survives. I believe you are trying to do the following:

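  var re = '115-396/';
  var replacingItem = '';
  for (var k = 0; k < exportRows.length; k++) {
    for (var j = 0; j < exportRows[k].length; j++) {
      // Update each cell in place so every row stays an array (setValues() needs a 2D array)
      exportRows[k][j] = exportRows[k][j].toString().replace(re, replacingItem);
    }
  }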

After the loop, exportRows holds the desired URLs without the 115-396/ suffix.
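The same idea can be sketched as a standalone helper, which may be easier to test outside the sheet. This is only a minimal sketch: the function name stripSuffixFromRows and the sample data are hypothetical (they are not part of the original script), and it assumes exportRows is a 2D array with one inner array per feed item, as processXML builds it.

```javascript
// Hypothetical helper: removes `pattern` from every cell of a 2D array.
// Returns a new 2D array, so the result can be passed straight to setValues().
function stripSuffixFromRows(rows, pattern) {
  return rows.map(function (row) {
    return row.map(function (cell) {
      // With a string pattern, replace() removes only the first match in the cell
      return String(cell).replace(pattern, '');
    });
  });
}

// Sample data shaped like exportRows when only "link" is requested
var sampleRows = [
  ['https://url/115-396/'],
  ['https://url/115-396/'],
  ['https://url/115-396/']
];
var cleanedRows = stripSuffixFromRows(sampleRows, '115-396/');
```

In the script, the result could then be written with GoogleSheetsPastePage.getRange(2, 1, cleanedRows.length, cleanedRows[0].length).setValues(cleanedRows). Because replace() accepts either a string or a RegExp, passing a pattern such as /\d+-\d+\/$/ instead of the literal string would strip any trailing "digits-digits/" segment; that pattern is an assumption about how the feed URLs are shaped.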

Now you can write this to the spreadsheet. You were using setValue(), but setValue() is meant for a single value (a string, a number, etc.), not for arrays. For arrays, use setValues():

GoogleSheetsPastePage.getRange(2,1,exportRows.length,exportRows[1].length).setValues(exportRows);

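However, the dimensions of exportRows must match the dimensions of the range you select with getRange(), and I'm not sure that is the case here.

To clarify: exportRows.length is the number of rows in the array, and exportRows[1].length is the number of columns in the second row, provided that row is still an array. If a row has been replaced by a string, exportRows[1].length is the number of characters in that string instead, and the range will not match. Using exportRows[0].length is also safer when the feed may return only one item.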
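One way to catch a mismatch before calling setValues() is to validate the array first. The helper below is a hedged sketch, not part of the original script: the name getRectangularSize is hypothetical, and it simply assumes every row should be an array with the same number of cells as the first row.

```javascript
// Hypothetical helper: returns the dimensions of a 2D array,
// or throws if any row is not an array of the same width as the first row.
function getRectangularSize(rows) {
  if (!Array.isArray(rows) || rows.length === 0) {
    throw new Error('rows must be a non-empty 2D array');
  }
  var numColumns = rows[0].length;
  for (var i = 0; i < rows.length; i++) {
    if (!Array.isArray(rows[i]) || rows[i].length !== numColumns) {
      throw new Error('Row ' + i + ' is not an array of ' + numColumns + ' cells');
    }
  }
  return { numRows: rows.length, numColumns: numColumns };
}

// Sample data shaped like exportRows with a single "link" column
var size = getRectangularSize([['https://url/'], ['https://url/']]);
```

The returned numRows and numColumns can be passed as the third and fourth arguments of getRange(). An array of plain strings, such as ['https://url/', 'https://url/'], makes the helper throw instead of silently producing a range as wide as the string is long.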

Hope this helps. Neither the question nor its intent is entirely clear, so please provide more information if it still doesn't work.


How to know the size of the range you're getting?

   var myrange = GoogleSheetsPastePage.getRange(2,1,exportRows.length,exportRows[1].length)
   Logger.log(myrange.getNumRows());
   Logger.log(myrange.getNumColumns());

This tells you the size of the range returned by getRange(), so you can make it match the size of exportRows.

Make sure to check the attached documentation, and if you have further questions, please open a new question about them.
