
Replace quotes (but do not touch nested quotes)

I have the following problem: I need to replace straight quotes with angle quotes, but if the quoted sentence itself contains quotes, those inner quotes should not be replaced.

To get the opening quote I use the following:

const regexStartQuote = /"(?=\S)/gm;
const replaceStartQuote = '«'

To replace a quote with the closing one I use:

// const regexEndQuote = /(?<=\S)"/gm; // not supported in Mozilla
const regexEndQuote = /"(?=\s)/gm;
const replaceEndQuote = '»'

And this works. I mean: "Some text" -> «Some text»
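To see both replacements together, here is a minimal sketch that applies the two patterns from the question in sequence. The helper name `replaceQuotes` is made up for illustration. The second call shows why nested quotes are a problem for this approach: the inner pair is converted as well.

```javascript
// Patterns copied from the question.
const regexStartQuote = /"(?=\S)/gm; // a quote directly followed by a non-space character
const regexEndQuote = /"(?=\s)/gm;   // a quote directly followed by whitespace

// Hypothetical helper: apply the opening replacement first, then the closing one.
function replaceQuotes(text) {
  return text
    .replace(regexStartQuote, '«')
    .replace(regexEndQuote, '»');
}

console.log(replaceQuotes('say "Some text" now'));
// say «Some text» now

// Note the trailing space: the closing pattern needs whitespace after the quote.
console.log(replaceQuotes('"Some text "Text in quotes" more" '));
// «Some text «Text in quotes» more» 
```

A closing quote at the very end of the input is not followed by whitespace, so it stays a straight quote until the next character is typed.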

By the way, I work with Draft.js, and these changes are applied on the fly.

Now I need to extend the existing regexes so that a sentence with nested quotes ends up like this:

«Some text "Text in quotes" something more»

And, of course, possible variants like:

«Some text "Text in quotes", something more»

«Some text: "Text in quotes", something more»

«Some text: "Text in quotes",- something more»

UPDATE

The flow of the program is as follows: each symbol that is typed is appended to the string. That is, at first, when e.g. the text block is empty,

the string is just `` (empty),

then the user types 'w' -> the string becomes w,

then 'o' -> the string is wo,

then 'w' -> the string is wow,

then ' ' (space) -> the string is wow (with a trailing space),

then " -> the string is wow «

and so on.

As I understand it, the rule should be something like:

`If the user typed " and there is no » before it but there is a «, we shouldn't change the ".`

Try this solution:

const startRegex = /^"/gm;
const endRegex = /"$/gm;

str.replace(startRegex, "<<")

str.replace(endRegex, ">>")

const startRegex = /^"/gm;
const endRegex = /"$/gm;
const str = `"Some text "Text in quotes" something more"`;
let result = str.replace(startRegex, "<<");
result = result.replace(endRegex, ">>");
console.log(result);
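A quick check of where the anchored patterns apply, as a sketch only. The helper name `wrapLineQuotes` is hypothetical, and « » are used here in place of the << >> shown above. Only a quote that sits at the very start or very end of a line is converted, so a quoted phrase in the middle of a line is left alone.

```javascript
// Hypothetical helper: convert only a quote at the start or at the end of a line.
function wrapLineQuotes(text) {
  return text.replace(/^"/gm, '«').replace(/"$/gm, '»');
}

console.log(wrapLineQuotes('"Some text "Text in quotes" something more"'));
// «Some text "Text in quotes" something more»

console.log(wrapLineQuotes('He said "Some text" and left'));
// He said "Some text" and left   (unchanged: no quote touches a line boundary)
```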

This handles the nesting of quoted strings that are delimited by line boundaries, that is, at most one outer quoted string per line (the quoted string itself does not have to begin and end at the start and end of the line). This is somewhat artificial, but if you want to allow multiple internal quoted strings within the outer quoted string, it almost becomes a necessity. Here is the problem. Consider the following string:

var s = '"This is an "internal quote" within a sentence." A short sentence.\n' +
        '"Another quoted sentence."\n' +
        '"Yet another quoted sentence."' +
        'etc.';

What prevents " A short sentence.\n" and "\n", for example, from being recognized as internal quoted strings? In other words, it becomes impossible to tell whether a quote signifies the end of the outer quoted string or the start of a new internal quoted string (at least until you get to the end of the entire input).

The regex: ^([^"\n]*)"((?:[^"\n]*"[^"\n]*")*[^"\n]*)"([^"\n]*)$

  1. ^ Matches the start of the line.
  2. ([^"\n]*) Capture group 1: 0 or more characters that are neither " nor newline. This is everything on the line that might precede the opening quote.
  3. " Matches the opening quote. From here on we look for optional quoted strings within the outer quotes.
  4. (?:[^"\n]*"[^"\n]*") A non-capturing group that looks for 0 or more non-quote/non-newline characters followed by a quote, followed by 0 or more non-quote/non-newline characters followed by a quote. This would be an internal quoted string.
  5. (?:[^"\n]*"[^"\n]*")* The above pattern can be repeated 0 or more times.
  6. [^"\n]*" Matches 0 or more non-quote/non-newline characters followed by a quote. This takes care of matching the rest of the outer quoted string. Together with step 5, the text before this closing quote forms capture group 2.
  7. ([^"\n]*) Capture group 3: matches the rest of the line (0 or more characters), which should not include a quote.
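The steps above can be sketched by assembling the pattern from named pieces. All variable names here are illustrative, and group 3 is written to exclude quotes, as the description in step 7 states.

```javascript
// Building blocks, named after the steps above (names are illustrative).
const nonQuote = '[^"\\n]*';                        // 0+ chars that are neither " nor newline
const innerPairs = `(?:${nonQuote}"${nonQuote}")*`; // 0+ internal quoted strings
const pattern =
  `^(${nonQuote})` +               // group 1: text before the opening quote
  `"(${innerPairs}${nonQuote})"` + // group 2: everything between the outer quotes
  `(${nonQuote})$`;                // group 3: quote-free remainder of the line
const balancedQuotes = new RegExp(pattern, 'gm');

const line = '"This has "quotes within quotes" and some without."';
console.log(line.replace(balancedQuotes, '$1«$2»$3'));
// «This has "quotes within quotes" and some without.»
```

Only the outermost pair is rewritten because groups 1 and 3 cannot contain a quote, so the first and last quote on the line are forced to be the outer ones.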


The above regex is fairly complicated because it checks for balanced quotes. If you do not need such rigid checking, a simpler regex that only looks for the first and last quotes on a line would be (the rest of the code stays the same):

/^([^"\n]*)"([^\n]*)"([^"\n]*)$/gm;

var s = 'A plain line.\n' +
        'This is "Some text in quotes" and some without.\n' +
        '"This has "quotes within quotes" and some without."\n' +
        '"This has "many" "quoted" "strings" within quotes."';
var regex = /^([^"\n]*)"((?:[^"\n]*"[^"\n]*")*[^"\n]*)"([^"\n]*)$/gm;
console.log(s.replace(regex, "$1«$2»$3"));
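The difference between the balanced check and the simpler first-and-last pattern only shows up on unbalanced input. The following sketch compares the two; the names `strict` and `loose` and the sample strings are made up for illustration, and both patterns exclude quotes from the last group.

```javascript
// strict: requires an even number of quotes between the outer pair.
const strict = /^([^"\n]*)"((?:[^"\n]*"[^"\n]*")*[^"\n]*)"([^"\n]*)$/gm;
// loose: takes the first and last quote on the line, whatever lies between.
const loose = /^([^"\n]*)"([^\n]*)"([^"\n]*)$/gm;

const balanced = 'say "one "two" three" done';
const unbalanced = 'say "one "two three" done'; // three quotes

console.log(balanced.replace(strict, '$1«$2»$3'));   // say «one "two" three» done
console.log(balanced.replace(loose, '$1«$2»$3'));    // say «one "two" three» done
console.log(unbalanced.replace(strict, '$1«$2»$3')); // say "one "two three" done  (no match, unchanged)
console.log(unbalanced.replace(loose, '$1«$2»$3'));  // say «one "two three» done
```

Which behavior is preferable depends on whether an unmatched quote should block the replacement or be tolerated.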

Update

To modify the input, s, as it is entered, you need to test it against several regular expressions:

  1. If the input matches /^[^"\n]*$/ (no quote on the line), then no replacement is necessary.
  2. If the input matches /^[^«\n]*«([^»\n]*»)?[^"\n]*$/ (an opening « is present and no straight quote follows the last angle quote), then no replacement is necessary.
  3. If the input matches /^([^"«\n]*)"$/ (first quote seen), then s = s.replace('"', '«');
  4. If the input matches /^([^"«\n]*)«([^\n]*)"$/ (a quote other than the first one seen), then s = s.replace('»', '"'); s = s.replace(/"$/, '»');

Code snippets don't seem to allow true one-character-at-a-time input, but this one simulates what it would look like:

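function test(str) {
    let s = '';
    for (let i = 0; i < str.length; i++) {
        const key = str.charAt(i);
        s += key;
        if (/^[^"\n]*$/.test(s) || /^[^«\n]*«([^»\n]*»)?[^"\n]*$/.test(s)) {
            // Rules 1 and 2: nothing to replace.
        } else if (/^([^"«\n]*)"$/.test(s)) {
            // Rule 3: the first quote becomes the opening angle quote.
            s = s.replace('"', '«');
        } else if (/^([^"«\n]*)«([^\n]*)"$/.test(s)) {
            // Rule 4: turn the previous closing quote back into ",
            // then make the quote just typed the new closing quote.
            s = s.replace('»', '"');
            s = s.replace(/"$/, '»');
        }
        console.log("\n" + s);
    }
}

test('a"bc"de"fg"h"ij"');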
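For use in an editor's change handler, the same four rules could be wrapped in a pure function that takes the text so far and returns the corrected text. This is only a sketch: the name `applyQuoteRules` is hypothetical, and like the rules above it assumes a single line of input.

```javascript
// Hypothetical helper: s is the text so far, ending with the character just typed.
function applyQuoteRules(s) {
  // Rules 1 and 2: no straight quote that needs attention, leave the text as is.
  if (/^[^"\n]*$/.test(s) || /^[^«\n]*«([^»\n]*»)?[^"\n]*$/.test(s)) {
    return s;
  }
  // Rule 3: the first quote on the line becomes the opening angle quote.
  if (/^([^"«\n]*)"$/.test(s)) {
    return s.replace('"', '«');
  }
  // Rule 4: a later quote becomes the closing angle quote; the previous
  // closing quote (if any) is turned back into a straight quote.
  if (/^([^"«\n]*)«([^\n]*)"$/.test(s)) {
    return s.replace('»', '"').replace(/"$/, '»');
  }
  return s;
}

// Simulate typing one character at a time.
let text = '';
for (const key of 'a"bc"de"fg"h"ij"') {
  text = applyQuoteRules(text + key);
}
console.log(text); // a«bc"de"fg"h"ij»
```

Because the function never mutates its argument, it can be called on every keystroke with the current block text and the result written back only when it differs.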
