JavaScript regex: get the text at a specific line and character

Question

Given a chunk of text (imagine a page from a book), how can I get the word at a particular line and character number?

Find and return the word at Ln # 3, Ch # 7: "just".

var text = "Lorem ispum dolar\n" +
    "Si emit I dont know latin\n" +
    "Really just making this up as I go\n" +
    "Ok this should be enough for us to work on.\n";

JSFiddle to try code on: http://jsfiddle.net/xa9xS/709/

Answer 1

You can use something like (?:.*\n){2}.{6}\s+(\w+), which skips 2 lines and then 6 characters. This gets the word on line 2+1, starting at character 6+1.
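The hard-coded 2 and 6 can be derived from a line and character number. A minimal sketch, assuming 1-based numbers as in the question; the helper name `wordAfter` and the leading `^` anchor are additions for illustration and not part of the answer:

```javascript
var text = "Lorem ispum dolar\n" +
    "Si emit I dont know latin\n" +
    "Really just making this up as I go\n" +
    "Ok this should be enough for us to work on.\n";

// Hypothetical helper: skip (line - 1) lines and (ch - 1) characters,
// then require whitespace and capture the next run of word characters.
function wordAfter(text, line, ch) {
    var re = new RegExp(
        "^(?:.*\\n){" + (line - 1) + "}.{" + (ch - 1) + "}\\s+(\\w+)"
    );
    var m = re.exec(text);
    return m ? m[1] : null; // null when nothing matches
}

console.log(wordAfter(text, 3, 7)); // "just"
```

The `^` keeps the engine from retrying at later start positions, which would otherwise let a failed match slide onto a different line.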

Edit: Figured I'd robustify it a bit. The above fails to match anything if you provide a character index in the middle of a word. The following will skip ahead until the start of a word before it starts capturing: (?:.*\n){2}.{6}.*?\b(\w+)\b
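To see the difference, here is a small comparison with an offset of 8, which lands inside "just". It is only a sketch: the `^` anchors and the variable names are assumptions added here so that a failed match cannot restart further into the text.

```javascript
var text = "Lorem ispum dolar\n" +
    "Si emit I dont know latin\n" +
    "Really just making this up as I go\n" +
    "Ok this should be enough for us to work on.\n";

// First pattern: needs whitespace right after the skipped characters.
var strict = /^(?:.*\n){2}.{8}\s+(\w+)/;
// Second pattern: lazily skips ahead to the next word boundary.
var skipping = /^(?:.*\n){2}.{8}.*?\b(\w+)\b/;

var strictMatch = strict.exec(text);
var skippingMatch = skipping.exec(text);

console.log(strictMatch);       // null
console.log(skippingMatch[1]);  // "making"
```

Note that the second pattern returns the next full word ("making"), not the word that contains the index.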

PS: JavaScript regex did not support lookbehind when this was written (it was added in ES2018), so skipping back to the start of the word is quite a bit trickier.
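One way to reach the start of the word without lookbehind is to step outside the single-regex approach: take the line, then match word characters backwards from the index and forwards from it. A sketch, assuming a 0-based index into a single line; `wordAround` is a hypothetical name:

```javascript
// Hypothetical helper: return the word that contains (or starts at) index.
function wordAround(lineText, index) {
    // Word characters immediately before the index (may be empty).
    var left = lineText.slice(0, index).match(/\w*$/)[0];
    // Word characters starting at the index (may be empty).
    var right = lineText.slice(index).match(/^\w*/)[0];
    return left + right;
}

var line = "Really just making this up as I go";
console.log(wordAround(line, 9)); // "just" (index 9 is the "s" in "just")
console.log(wordAround(line, 7)); // "just" (index 7 is the "j")
```

Both patterns can match the empty string, so `match` never returns null here.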

Edit2: Making string.replace work requires us to capture the other parts of the string as well. This seems to do the trick: text.replace(/((?:.*\n){2}(?:.{6}.*?))\b(\w+)\b((?:.*\n?)*)/g, "$1[the-replacement]$3") but it does complicate things. It might be better to use the more direct approach in this case. Simplicity is king!
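A more direct replacement could split the text into lines, find the word in the target line, and join the lines back together. This is a sketch under assumptions: `replaceWordAt` is a hypothetical helper, the line number is 1-based, and the character offset counts skipped characters the way `.{6}` does.

```javascript
var text = "Lorem ispum dolar\n" +
    "Si emit I dont know latin\n" +
    "Really just making this up as I go\n" +
    "Ok this should be enough for us to work on.\n";

function replaceWordAt(text, lineNumber, charIndex, replacement) {
    var lines = text.split("\n");
    var line = lines[lineNumber - 1];
    if (line === undefined) return text;   // no such line

    // With the g flag, exec starts searching at lastIndex,
    // and \b still sees the characters before that position.
    var wordRe = /\b\w+\b/g;
    wordRe.lastIndex = charIndex;
    var m = wordRe.exec(line);
    if (m === null) return text;           // no word at or after charIndex

    lines[lineNumber - 1] =
        line.slice(0, m.index) + replacement + line.slice(m.index + m[0].length);
    return lines.join("\n");
}

var replaced = replaceWordAt(text, 3, 6, "[the-replacement]");
console.log(replaced.split("\n")[2]);
// "Really [the-replacement] making this up as I go"
```

Building the result by concatenation also avoids having `$` sequences in the replacement interpreted as group references.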

Answer 2

var example_text = "Lorem ispum dolar\n\
Si emit I dont know latin\n\
Really just making this up as I go\n\
Ok this should be enough for us to work on.\n";

var lineNumber = 3;
var charNumber = 7;

// Take the line (1-based), drop the first charNumber characters, keep the text up to the next whitespace.
var match = (example_text.split("\n")[lineNumber - 1]).slice(charNumber).split(/\s/)[0];
console.log(match);

http://jsfiddle.net/2DFhM/1/
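Two edge cases are worth guarding in the split approach: a line number past the end of the text, and a character offset that lands on whitespace (where `split(/\s/)[0]` yields an empty string). A hedged variant follows; the function name `wordFrom` and the `\r?\n` split are assumptions, not part of the answer:

```javascript
var example_text = "Lorem ispum dolar\n" +
    "Si emit I dont know latin\n" +
    "Really just making this up as I go\n" +
    "Ok this should be enough for us to work on.\n";

// Hypothetical helper: 1-based line number, 0-based character offset.
function wordFrom(text, lineNumber, charNumber) {
    var line = text.split(/\r?\n/)[lineNumber - 1];
    if (line === undefined) return "";      // line does not exist
    // First run of non-whitespace at or after the offset.
    var m = line.slice(charNumber).match(/\S+/);
    return m ? m[0] : "";
}

console.log(wordFrom(example_text, 3, 7)); // "just"
console.log(wordFrom(example_text, 3, 6)); // "just" (offset 6 is the space before it)
console.log(wordFrom(example_text, 9, 0)); // ""
```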

Answer 3

Use this regex:

^(?:.*(?:\r?\n)*){2}.{6}\W+(\w+)

Explanation

The ^ anchor asserts that we are at the beginning of the string
To get to line 3, we need to skip two lines
Our line skipper is (?:.*(?:\r?\n)*){2}, matching any chars that are not line breaks, then line breaks
.{6} eats up the first six chars
There is no word starting at character 7, so we are going to match the next word:
\W+ matches any non-word chars
(\w+) captures word chars to Group 1
We retrieve the match from Group 1

In JS:

var myregex = /^(?:.*[\r\n]*){2}.{6}\W+(\w+)/;
var matchArray = myregex.exec(yourString); // yourString: the text to search
var thematch;
if (matchArray != null) {
    thematch = matchArray[1];
} else {
    thematch = "";
}
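Because the line skipper accepts any run of line breaks, it handles Windows line endings but also treats consecutive breaks as one, so blank lines are not counted. The sketch below compares it with a stricter variant that consumes exactly one break per line; the `strict` pattern and the sample strings are assumptions for illustration.

```javascript
var loose = /^(?:.*(?:\r?\n)*){2}.{6}\W+(\w+)/;  // pattern from the answer
var strict = /^(?:.*\r?\n){2}.{6}\W+(\w+)/;       // assumed variant: one break per line

// Windows line endings: both patterns find the word on line 3.
var crlf = "Lorem ispum dolar\r\nSi emit I dont know latin\r\n" +
    "Really just making this up as I go\r\n";
var looseCrlf = loose.exec(crlf);
var strictCrlf = strict.exec(crlf);
console.log(looseCrlf[1], strictCrlf[1]); // "just" "just"

// With a blank line 2, line 3 is "b" and "Really just" is line 4.
var withBlank = "a\n\nb\nReally just";
var looseBlank = loose.exec(withBlank);
var strictBlank = strict.exec(withBlank);
console.log(looseBlank[1]); // "just" (the blank line was not counted)
console.log(strictBlank);   // null (line 3 has no seventh character)
```

Which behaviour is wanted depends on whether blank lines should count toward the line number.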

Answer 4

Probably too late now lol, lots of good answers, but here goes for the sake of completeness:

Made this regexp here: http://regex101.com/r/nF2vX8/1

(?:.*\n.*){2}^(?:.{7})(\w*\W)
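In JavaScript this pattern appears to rely on the multiline flag, since `^` sits after two line breaks and can only match there when it is allowed to match at line starts. A small sketch showing that, assuming the sample text from the question:

```javascript
var text = "Lorem ispum dolar\n" +
    "Si emit I dont know latin\n" +
    "Really just making this up as I go\n" +
    "Ok this should be enough for us to work on.\n";

// With the m flag, ^ matches at the start of line 3.
var re = /(?:.*\n.*){2}^(?:.{7})(\w*\W)/m;
var m = re.exec(text);
console.log(JSON.stringify(m[1])); // "just " (the capture includes the trailing non-word character)
console.log(m[1].trim());          // "just"

// Without the flag, ^ only matches at index 0, so nothing is found.
var noFlag = /(?:.*\n.*){2}^(?:.{7})(\w*\W)/.exec(text);
console.log(noFlag);               // null
```

Note that this pattern skips 7 characters rather than 6, so it treats the character number as a 0-based offset.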

and here's a solution in JavaScript:

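// Assumes line_number, char_number and example_text are already defined (as in the fiddle).
var index_left = 0, index_right = 0, stringy = "";
// Walk forward one line at a time; afterwards index_left..index_right spans the target line.
for (; line_number-- > 0;){
    index_left = index_right;
    index_right = example_text.indexOf("\n", index_right) + 1;
}

// Drop the trailing "\n" from the target line.
stringy = example_text.substring(index_left, index_right-1);

// Cut at the first space after the requested character...
index_left = stringy.indexOf(" ", char_number+1);
if (index_left === -1) index_left = stringy.length; // last word on the line
stringy = stringy.substring(0, index_left);
// ...then keep everything after the last space before it.
index_left = stringy.lastIndexOf(" ", index_left);
stringy = stringy.substring(index_left+1);

console.log(stringy);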

and the fiddle for the JS: http://jsfiddle.net/xa9xS/714/

It mangles line_number, but that's easy to fix by copying the value, and I'm too bored to do it now :P
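One way to avoid mangling line_number is to wrap the same indexOf idea in a function, so the caller's values are passed as local copies. This is a sketch only: `wordByIndexOf` is a hypothetical name, and the handling of missing lines and of the last word on a line is an addition.

```javascript
var example_text = "Lorem ispum dolar\n" +
    "Si emit I dont know latin\n" +
    "Really just making this up as I go\n" +
    "Ok this should be enough for us to work on.\n";

function wordByIndexOf(text, lineNumber, charNumber) {
    var start = 0;
    // Advance start past (lineNumber - 1) line breaks.
    for (var i = 1; i < lineNumber; i++) {
        var nl = text.indexOf("\n", start);
        if (nl === -1) return "";           // fewer lines than requested
        start = nl + 1;
    }
    var end = text.indexOf("\n", start);
    if (end === -1) end = text.length;      // last line without a trailing break
    var line = text.substring(start, end);

    // First space after the requested character, or the end of the line.
    var right = line.indexOf(" ", charNumber + 1);
    if (right === -1) right = line.length;
    // Last space before that point (-1 if the word starts the line).
    var left = line.lastIndexOf(" ", right - 1);
    return line.substring(left + 1, right);
}

var line_number = 3, char_number = 7;
console.log(wordByIndexOf(example_text, line_number, char_number)); // "just"
console.log(line_number); // 3, unchanged
```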

4 answers

Answer 1: 4 (accepted), 2014-07-12 00:08:13

Answer 2: 3, 2014-07-12 00:13:44

Answer 3: 2, 2014-07-12 00:07:49

Answer 4: 0, 2014-07-12 01:24:26
