Regex: get every non-special word except inside parentheses

I have the following strings:

[
    '全新Precision 5530二合一移动工作站',
    '15" (5530)',
    '新14"灵越燃7000三边微边框',
    '灵越新13"(7380)轻薄本 热卖',
    'XPS新15"(9570)热卖',
    '新15"灵越5000(Intel)',
    '12” 二合一 (5290)'
]

I need to eliminate every non-Chinese token (such as product line names and model numbers), including model numbers inside parentheses. However, I must not replace (Intel); the same goes for any other word inside parentheses, which the regex should not match.

For now, I have the following: pattern = /(\w+\s+\d+|\(?\d{4}\)?|[a-z]+)/gi

Applied to the array above, this returns:

[
     ["Precision 5530"],
     ["(5530)"],
     ["7000"],
     ["(7380)"],
     ["XPS", "(9570)"],
     ["5000", "Intel"],
     ["(5290)"]
]

which is almost perfect, except that "Intel" shouldn't be there. I can't seem to find a regex that excludes Intel (or any other run of plain letters inside parentheses).
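For reference, the result above can be reproduced with a short sketch like the one below. It assumes the arrays were obtained with String.prototype.match and the g flag, which the question does not state explicitly; the variable names are illustrative only.

```javascript
// The question's pattern, with its escaping restored.
const pattern = /(\w+\s+\d+|\(?\d{4}\)?|[a-z]+)/gi;

const strs = [
  '全新Precision 5530二合一移动工作站',
  '15" (5530)',
  '新14"灵越燃7000三边微边框',
  '灵越新13"(7380)轻薄本 热卖',
  'XPS新15"(9570)热卖',
  '新15"灵越5000(Intel)',
  '12” 二合一 (5290)'
];

// Assumption: with the g flag, match() returns every full match,
// or null when nothing matches (hence the fallback to []).
const results = strs.map((s) => s.match(pattern) || []);
console.log(results);

// The sixth entry is ["5000", "Intel"]: the last alternative, [a-z]+,
// matches letters wherever they occur, including inside parentheses.
```

Because the alternation has no notion of "inside parentheses", the fix has to come from consuming "(Intel)" as a unit before [a-z]+ gets a chance to match its letters.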

On regex101: https://regex101.com/r/vqO0BO/2

Can anyone help?


Solution: With the regex provided in the answers (which also matches the parenthesized words) and a bit of JS, I managed to get the newText I wanted from text:

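newText = text.replace(pattern, function (a, b) {
    // a is the whole match; b is Group 1, which is undefined when the
    // match came from the non-capturing "(letters)" alternative.
    if (a === b) {
        // Group 1 matched: drop the token and leave a space in its place.
        return " ";
    }
    // Otherwise the match is a parenthesized word such as "(Intel)"; keep it.
    // (a is always a string here, so no further undefined checks are needed.)
    return a;
}).trim();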

I suggest matching what is inside parentheses, and matching and capturing the rest. When the capturing group matches some text, the match can be replaced with a space; if Group 1 did not match, replace with the whole match, which leaves the text unchanged.

var strs = [
    '全新Precision 5530二合一移动工作站',
    '15" (5530)',
    '新14"灵越燃7000三边微边框',
    '灵越新13"(7380)轻薄本 热卖',
    'XPS新15"(9570)热卖',
    '新15"灵越5000(Intel)',
    '12” 二合一 (5290)'
];
var pattern = /\([a-z]+\)|(\w+\s+\d+|\(?\d{4}\)?|[a-z]+)/gi;
for (var s of strs) {
    console.log(
        s.replace(pattern, function (a, b) {
            return b ? " " : a;
        }).trim()
    );
}

Regex details

  • \( - a (
  • [a-z]+ - 1+ letters
  • \) - a )
  • | - or
  • (\w+\s+\d+|\(?\d{4}\)?|[a-z]+) - Group 1: 1+ word chars, 1+ whitespaces and 1+ digits, or an optional (, 4 digits and an optional ), or 1 or more ASCII letters.
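The same match-and-skip idea can also be used to collect the tokens rather than replace them, which is closer to the array shown in the question. The sketch below is only an illustration: extractTokens is a hypothetical helper name, and it assumes an environment that supports String.prototype.matchAll.

```javascript
// First alternative: consumes "(letters)" without capturing anything.
// Group 1: captures the tokens that should actually be reported.
const pattern = /\([a-z]+\)|(\w+\s+\d+|\(?\d{4}\)?|[a-z]+)/gi;

// Hypothetical helper (not part of the original answer): returns only
// the Group 1 captures found in str.
function extractTokens(str) {
  return Array.from(str.matchAll(pattern))
    .map((m) => m[1])                 // undefined for "(Intel)"-style matches
    .filter((t) => t !== undefined);  // drop the skipped matches
}

const strs = [
  '全新Precision 5530二合一移动工作站',
  '15" (5530)',
  '新14"灵越燃7000三边微边框',
  '灵越新13"(7380)轻薄本 热卖',
  'XPS新15"(9570)热卖',
  '新15"灵越5000(Intel)',
  '12” 二合一 (5290)'
];

const tokens = strs.map(extractTokens);
console.log(tokens);
// The sixth entry is now ["5000"]: "(Intel)" is consumed by the first
// alternative, so [a-z]+ never sees the letters inside the parentheses.
```

The order of the alternatives matters here: \([a-z]+\) has to come first so that it wins at the opening parenthesis, before Group 1 is tried at that position.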
