
Google Apps Script: TextFinder regex not finding trailing spaces

I have a column in Sheets with company names from which I want to remove text like inc., llc, etc. The form varies considerably: some entries have inc with or without a period at the end, sometimes it is set off with a comma, and it can fall at the end of the string as well as in the middle. So I created nested loops rather than searching for each variation explicitly.

The if statement exists for text that has a trailing space because, in those cases, I want to replace the match with a space.

The following code works, except when there is a trailing space, for example: " inc " or ", inc ". If I put something explicit like searchTerm = range.createTextFinder(" inc "); there is no issue.

The problem seems to be in the concatenated string. I have tried replacing the " " in the punctuation array with "\\s" but that still does not work.

Can someone help me see what I am missing here?

function dataClean(range) {

  var endings = ["inc", "ltd", "llc", "llp", "lp", "lcc"];
  var punctuationStart = [", ", " "];
  var punctuationEnd = [".", " ", "$"];
  var searchTerm, regex;

  /*Loops through endings with the pattern: ', inc.' ', inc ' ', inc$' ' inc.' ' inc ' ' inc$'*/
  for (var i=0; i<endings.length; i++){

    for (var j=0; j<punctuationStart.length; j++){
    
      for (var k=0; k<punctuationEnd.length; k++){

        regex = punctuationStart[j] + endings[i] + punctuationEnd[k];

        /*k==1 is used for trailing spaces so they are replaced with a space*/
        if (k==1){

          searchTerm = range.createTextFinder(regex);
          searchTerm.useRegularExpression(true);
          searchTerm.matchCase(false);
          searchTerm.replaceAllWith(" ");

        }
        else{
          
          searchTerm = range.createTextFinder(regex);
          searchTerm.useRegularExpression(true);
          searchTerm.matchCase(false);
          searchTerm.replaceAllWith("");

        }   

      }

    }

  }

}

I'm not an expert on regex, but as long as the suffix is at the end of the company name this will work. If it's in the middle, you'll have to break the string apart, extract the suffix and put it back together.

function testRegex() {
  let test = "My Company, llc.";
  let match = test.match(/(\s|,\s)(inc|ltd|llc|lp|lcc)/);
  console.log(match);
  console.log(test.substring(0,match.index));
}

11:27:53 AM Notice  Execution started
11:27:53 AM Info    [ ', llc',
  ', ',
  'llc',
  index: 10,
  input: 'My Company, llc.',
  groups: undefined ]
11:27:53 AM Info    My Company
11:27:53 AM Notice  Execution completed
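
As for why the loop in the question misses trailing spaces, one likely cause is the "." entry in punctuationEnd. With useRegularExpression(true), an unescaped period is normally a wildcard that matches any character, including a space, so the ", inc." pattern would already consume ", inc " before the k==1 branch runs. The sketch below shows the effect in plain JavaScript rather than TextFinder. The sample string and variable names are made up for illustration, and it assumes TextFinder treats "." the same way.

```javascript
// Patterns built the same way as in the question's loop, for the ending "inc".
const unescaped = new RegExp(", inc" + ".", "i");  // "." is a wildcard here
const escaped = new RegExp(", inc" + "\\.", "i");  // "\\." is a literal period

// Made-up sample with the suffix in the middle, followed by a space.
const sample = "Acme, inc holdings";

// The wildcard pattern swallows the trailing space, so the words run together.
const afterUnescaped = sample.replace(unescaped, "");

// The escaped pattern does not match, leaving the text for the " " pattern.
const afterEscaped = sample.replace(escaped, "");

console.log(afterUnescaped); // Acmeholdings
console.log(afterEscaped);   // Acme, inc holdings
```

If that is what is happening, changing "." to "\\." in punctuationEnd may be enough for the existing loop to behave as intended.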

I've tested the regex for the following scenarios at RegExr

[Image: enter image description here]
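
For the middle-of-string case, a single replace may be simpler than splitting the string apart. This is only a sketch in plain JavaScript: stripCompanySuffix is a hypothetical helper, not something from the code above, and it is meant for strings you have already read out of the sheet (for example with getValues). It relies on a lookahead, which TextFinder's regex engine may not support.

```javascript
// Hypothetical helper: removes a company suffix wherever it appears.
function stripCompanySuffix(name) {
  // Optional comma, whitespace, one of the suffixes, optional literal period,
  // then a lookahead requiring whitespace or end of string, so "inc" inside
  // a longer word such as "incubator" is left alone.
  const suffix = /,?\s+(?:inc|ltd|llc|llp|lp|lcc)\.?(?=\s|$)/gi;
  // Remove the suffix, then tidy up any doubled or trailing spaces.
  return name.replace(suffix, "").replace(/\s+/g, " ").trim();
}

console.log(stripCompanySuffix("My Company, llc."));    // My Company
console.log(stripCompanySuffix("Acme, Inc. Holdings")); // Acme Holdings
console.log(stripCompanySuffix("Acme inc holdings"));   // Acme holdings
console.log(stripCompanySuffix("Big incubator"));       // Big incubator
```

Because the lookahead does not consume the space after the suffix, there is no need for a separate branch that replaces the match with a space.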
