
How would I use regex to store all occurrences of certain words in a document?

I'm a JavaScript developer who gets weak in the knees when he sees regex.

But right now, I'm working on a side project that seems to require it.

I want to create an array of 'important words' (there are around 250 of them), and then scan through a giant document looking for and storing each occurrence of an 'important word' for analysis and further manipulation.

I have no idea where to start (or what to Google) when it comes to the regex part of this, nor do I know the expertise required for what I'm trying to do.

If I can get the 'important words' individually into an array, I know what to do. It's the steps leading up to that that I'm confused about.

Any basic advice or direction would be much appreciated.

Thanks!

What about doing something like this?

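// Count whole-word, case-insensitive occurrences of each word in the list.
var list = ['test', 'west', 'pest', 'nest'], results = {},
    string = 'pesty test for the pest from the west test';
for (var i = 0, l = list.length; i < l; i++) {
    // \b anchors the match to word boundaries, so 'pesty' does not count as 'pest'.
    var match = string.match(new RegExp('\\b' + list[i] + '\\b', 'gi'));
    // match() returns null when nothing is found.
    results[list[i]] = (match !== null) ? match.length : 0;
}
// results = {test: 2, west: 1, pest: 1, nest: 0}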
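
The loop above only counts matches, and it builds one regex per word. Since the question asks about storing each occurrence, here is a rough sketch of one way to do that with a single combined pattern. The helper names `escapeRegExp` and `findOccurrences` and the `{ word, index }` record shape are my own assumptions for illustration, not anything from the question.

```javascript
// Hypothetical helper: escape characters that have special meaning in a regex,
// so a word like 'a.b' is matched literally.
function escapeRegExp(word) {
    return word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: return every whole-word occurrence of any listed word,
// in document order, with the position where it was found.
function findOccurrences(words, text) {
    // An empty list would produce a pattern that matches the empty string
    // and never advances, so bail out early.
    if (words.length === 0) {
        return [];
    }
    // One alternation such as \b(?:test|west|pest|nest)\b, built once.
    var pattern = new RegExp('\\b(?:' + words.map(escapeRegExp).join('|') + ')\\b', 'gi');
    var occurrences = [], m;
    // With the g flag, each exec() call resumes from pattern.lastIndex.
    while ((m = pattern.exec(text)) !== null) {
        occurrences.push({ word: m[0].toLowerCase(), index: m.index });
    }
    return occurrences;
}

var found = findOccurrences(['test', 'west', 'pest', 'nest'],
    'pesty test for the pest from the west test');
// found holds four records: test at 6, pest at 19, west at 33, test at 38
```

This makes one pass over the document instead of 250, which may matter for a giant document. One caveat: `\b` only behaves as a word boundary when each word starts and ends with a letter, digit, or underscore, so entries such as 'c++' would need a different boundary check.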
