Regex match $ but not \$ or $$ in Google Apps Script's body.replaceText()

In Google Apps Script's body.replaceText() , I want to match

$Not\$yes $not $$
^         ^

I can match the dollar without a backslash, with

[^\\]\$

but I'm not sure how to also exclude a $ that has a second $ directly before or after it, or how to match a $ at the start of the document.

It's worth noting that in body.replaceText() , "A subset of the JavaScript regular expression features are not fully supported, such as capture groups and mode modifiers." Stack overflow reference

With Google Apps Script's body.replaceText():

The body.replaceText() method does not support some parts of JavaScript RegExp. It is known that capture groups are not supported, nor are mode modifiers (flags).

You want to replace the second character in a three-character sequence in which the first and third characters are each defined only by what they must not be. This is effectively impossible with a single regular expression without look-ahead or look-behind assertions and/or capture groups. It would be possible to match the two-character sequence [^\\$]\$ and use the negative look-ahead (?!\$). However, because body.replaceText() does not support capture groups, it is not possible to replace just the second character, the $.
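To illustrate the problem, here is a small sketch in plain JavaScript rather than Apps Script. String.prototype.replace() stands in for body.replaceText(), and the sample string is made up for this illustration. Replacing the whole two-character match also removes the character in front of the $, and a $ at the very start of the text is not matched at all:

```javascript
// Plain JavaScript stand-in: String.replace() is used here only to imitate
// a replaceText() call that cannot use capture groups.
var sample = '$a b$c'; // hypothetical sample text
var twoCharPattern = /[^\\$]\$(?!\$)/g;

// The only match is 'b$', so the 'b' is lost along with the '$'.
// The leading '$' has no preceding character and is skipped entirely.
var damaged = sample.replace(twoCharPattern, '_MATCH_');
console.log(damaged);
```

The output keeps the first $ untouched and loses the b, which is why a different approach is needed.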

Thus, we need to use another approach which involves making multiple substitutions to get the text into a form where we can match only what we want. We also have to perform additional substitutions to return the text to its original form leaving only the operation(s) we desired.

We have a general thing to match (in this case the single character $), but we do not want to match some specific cases that look very similar to it ( \$ and $$ ) and that any regular expression for the desired string would also match. What we need to do is change those cases to something that will not match, perform the desired operation, and then change them back to what they were.

To do this, we need to perform multiple replace operations. The code below shows how this could be done.

Note 1: You have not clearly specified whether $$ is to be ignored anywhere within the text or only at the beginning or end of the document. What about $$$ ? Currently, the code below will "ignore" every sequence that contains more than one $ in a row. If you are more specific about your requirements, it can be changed here, or you can just change it in your own code.

Note 2: For the code below, I'm assuming that you want to perform a replaceText() on just the $ characters that you want to match. You have not actually stated what you want to do, but the code can be adapted to whatever action you desire.

    //Fake an Object called 'body' so that we can use it as if it was defined
    //  in Google Apps Script by using the statement:
    //  var body = DocumentApp.getActiveDocument().getBody();
    var body = new SomeItem('$Not\\$yes $not $$');

    //The following lines were tested as-is in Google Apps Script:
    //Open up character sequences which we can be sure are not used in the text.
    //  These sequences can be temporarily used to represent the strings we do not want
    //  to match. In this case we make sure that no 'Q' exists which is not followed
    //  by a `z`. This lets us use any Q[^z] sequence for anything we desire.
    body.replaceText('Q','Qz'); //Make sure no 'Q' exists that is not 'Qz'
    //These look a bit strange because we have to use '\\' to get a single '\' within
    //  a string literal.
    body.replaceText('\\\\\\$','Qa'); //Use `Qa` to represent `\$`
    //Use `QbQb` instead of `Qb` because of an issue with restoring and `$` in replacement
    //  string at the end of the string. Also allows handling an odd number of $ in a row.
    body.replaceText('\\$\\$','QbQb'); //Use `QbQb` to represent `$$`
    body.replaceText('Qb\\$','QbQb'); //Handle an odd number of $ in a row. Not specified
                                      //  in the question, but probably desired.

    //Here we perform whatever operation was desired on all remaining `$`.
    //  In this example we will replace them with '_MATCH_'.
    //  However, for the generalized case, we first have to perform the
    //  same preliminary substitution which we did to open up character sequences.
    var newText = '_MATCH_';
    body.replaceText('\\$',newText.replace(/Q/mg,'Qz')); // Change remaining '$'.
    //If the replace() is not performed in the above line then any `Qa`, `Qb`, or `Qz`
    //  in the new text will end up being replaced with '\$', '$' and 'Q' respectively.

    //Restore the temporary changes
    body.replaceText('Qa','\\$'); //Restore `Qa` to `\$`
    body.replaceText('Qb','$'); //Restore `Qb` to `$`
    body.replaceText('Qz','Q'); //Restore all `Qz` to `Q`
    //End of lines to be used in Google Apps Script

    console.log(body.text);

    <head>
    <script>
    //A function which will perform similar to Google Apps Script so that the
    //  code is closer to what is available there. No provision is made to eliminate
    //  the capture group feature of RegExp.
    function SomeItem(_text){
        this.text = _text;
    }
    SomeItem.prototype.replaceText=function (regExString,replaceText){
        var theRegExp = new RegExp(regExString,"gm");
        this.text = this.text.replace(theRegExp,replaceText);
        //console.log(regExString,replaceText,this.text);
    }
    </script>
    </head>
    <body></body>
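The same steps can be wrapped in a reusable function. The following is only a sketch in plain JavaScript: FakeBody and replaceLoneDollars are hypothetical names introduced here, and FakeBody merely imitates replaceText() by inserting the replacement literally. How the real body.replaceText() treats a $ in the replacement string is not checked by this sketch.

```javascript
// Hypothetical stand-in for a Google Docs Body; only replaceText() is imitated.
function FakeBody(text) {
  this.text = text;
}
FakeBody.prototype.replaceText = function (pattern, replacement) {
  // Assumption: a replacer function is used so the replacement is inserted literally.
  this.text = this.text.replace(new RegExp(pattern, 'g'), function () {
    return replacement;
  });
  return this;
};

// Hypothetical helper wrapping the hide, replace, restore sequence.
function replaceLoneDollars(body, newText) {
  body.replaceText('Q', 'Qz');        // free up every Q[^z] sequence
  body.replaceText('\\\\\\$', 'Qa');  // hide each \$
  body.replaceText('\\$\\$', 'QbQb'); // hide each $$
  body.replaceText('Qb\\$', 'QbQb');  // hide the last $ of an odd-length run
  body.replaceText('\\$', newText.replace(/Q/g, 'Qz')); // replace the lone $
  body.replaceText('Qa', '\\$');      // restore \$
  body.replaceText('Qb', '$');        // restore $
  body.replaceText('Qz', 'Q');        // restore Q
  return body;
}

// Made-up input that already contains 'Q', a run of three '$', and '\$'.
var demo = replaceLoneDollars(new FakeBody('Q$ $$$ \\$ $x'), 'Qa');
console.log(demo.text);
```

The demo input is chosen to exercise the awkward cases: a Q already present in the text, a replacement string that looks like one of the placeholders, and a run of three $.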

In JavaScript:

[This portion of the answer was written before the question was changed to ask about body.replaceText(), which does not support the full capabilities of JavaScript RegExp.]

Your RegExp does not actually match just the $. It matches two characters whenever the $ is not at the beginning of the string. When this answer was written, JavaScript had no look-behind assertion, only look-ahead ( (?=y) requires y to follow the match and (?!y) requires that y must not follow the match); look-behind was added later, in ECMAScript 2018. Without a look-behind assertion, it is not possible to match just the second character of a two-character sequence. On the other hand, with the look-ahead assertion we can prevent a match based on the following character without actually matching a three-character sequence.
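A short sketch of the difference between consuming the following character and only looking ahead at it. The patterns and test strings below are made up for this illustration and are not the final RegExp:

```javascript
// Consuming the next character: the match is two characters long and
// cannot succeed when '$' is the last character of the string.
var consuming = /\$[^$]/;
// Negative look-ahead: only the '$' itself is part of the match.
var lookAhead = /\$(?!\$)/;

var consumedMatch = 'x$y'.match(consuming)[0];
var lookAheadMatch = 'x$y'.match(lookAhead)[0];
var consumingAtEnd = consuming.test('x$');
var lookAheadAtEnd = lookAhead.test('x$');
console.log(consumedMatch, lookAheadMatch, consumingAtEnd, lookAheadAtEnd);
```

The look-ahead version matches a single character and still works when the $ is the last character, because the assertion does not need a character to consume.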

To match what you want, we have to exclude both a $ that is preceded by a $ and a $ that is followed by a $. This can be done by adding $ to the character set that must not precede the match (in addition to \, changing it to [^\\$]), and by adding a look-ahead exclusion of $ with (?!\$).

This results in the RegExp of:

/(^|[^\\$])\$(?!\$)/g

Note that this also changes your non capturing group to a capturing group so that it can be used to recover the extra character prior to the $ when the RegExp is used in a replace() .

Here is an example of using the above RegExp to match the test string you provided:

    var testMatch = '$Not\\$yes $not $$'; //Need double \\ because it is in a string literal.
    //Note the use of a capture group and $1 in the replace string to retain the character
    //  prior to the matching '$'.
    var result = testMatch.replace(/(^|[^\\$])\$(?!\$)/g,'$1_MATCH_');
    console.log(result);
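On JavaScript engines that implement ECMAScript 2018 or later, a negative look-behind can replace the capture group, so that only the $ itself is matched. This is a sketch for plain JavaScript only; it assumes look-behind support in the engine and does not apply to body.replaceText(). The variable names are made up for the illustration.

```javascript
// Requires an engine with ES2018 look-behind support.
// (?<![\\$]) : the '$' must not be preceded by '\' or '$'
// (?!\$)     : the '$' must not be followed by '$'
var loneDollar = /(?<![\\$])\$(?!\$)/g;
var input = '$Not\\$yes $not $$';

var lookBehindResult = input.replace(loneDollar, '_MATCH_');
// The capture-group version from above, for comparison.
var captureResult = input.replace(/(^|[^\\$])\$(?!\$)/g, '$1_MATCH_');
console.log(lookBehindResult, lookBehindResult === captureResult);
```

Because nothing other than the $ is consumed, no $1 is needed in the replacement string.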
