Regex: is there a one-liner for this?
I want to search inside multiple big text files (200MB each) as fast as possible. I am using the command line tool ripgrep and I want to call it only once.
In the following string:
***foo***bar***baz***foo***bar***baz
(*** stands for a different type and number of characters.)
I want to match baz, but only if it follows the first occurrence of foo***bar***
So in ***foo***bar***baz***foo***bar***baz it matches the first baz, and in ***foo***bar***qux***foo***bar***baz it shall match nothing.
I tried several solutions, but none of them worked. Can this be done with a single regular expression?
I'm pretty sure that a regex is overkill in this case. A simple series of find calls can do the job:
    fn find_baz(input: &str) -> Option<usize> {
        const FOO: &str = "foo";
        const BAR: &str = "bar";

        // 1: we find the first occurrences of "foo", "bar" and "baz", in that order:
        let foo = input.find(FOO)?;
        let bar = input[foo..].find(BAR).map(|i| i + foo)?;
        let baz = input[bar..].find("baz").map(|i| i + bar)?;

        // 2: we verify that there is no other "foo" followed by "bar" in between:
        input[bar..baz]
            .find(FOO)
            .map(|i| i + bar)
            .and_then(|foo| input[foo..baz].find(BAR))
            .xor(Some(baz))
    }

    #[test]
    fn found_it() {
        assert_eq!(Some(15), find_baz("***foo***bar***baz***foo***bar***baz"));
    }

    #[test]
    fn found_it_2() {
        assert_eq!(Some(27), find_baz("***foo***bar***qux***foo***baz"));
    }

    #[test]
    fn not_found() {
        assert_eq!(None, find_baz("***foo***bar***qux***foo***bar***baz"));
    }

    #[test]
    fn not_found_2() {
        assert_eq!(None, find_baz("***foo***bar***qux***foo***"));
    }
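To tie this back to searching files, here is a rough sketch of how the function might be applied line by line. It assumes each candidate string sits on its own line, which the question does not actually state. search_lines is a hypothetical helper name, and find_baz is repeated (with the constants inlined) only so that the snippet compiles on its own. The comment on the last step spells out why xor gives the right answer.

```rust
use std::io::BufRead;

/// Same logic as `find_baz` above, repeated so this sketch is self-contained.
fn find_baz(input: &str) -> Option<usize> {
    let foo = input.find("foo")?;
    let bar = input[foo..].find("bar").map(|i| i + foo)?;
    let baz = input[bar..].find("baz").map(|i| i + bar)?;
    // `Option::xor` yields `Some` only when exactly one side is `Some`.
    // If a second "foo...bar" pair sits before `baz`, the left side is `Some`
    // and the result is `None`; otherwise the left side is `None` and
    // `Some(baz)` comes through unchanged.
    input[bar..baz]
        .find("foo")
        .map(|i| i + bar)
        .and_then(|foo2| input[foo2..baz].find("bar"))
        .xor(Some(baz))
}

/// Hypothetical helper: collects (1-based line number, byte offset of "baz")
/// for every line of `reader` on which `find_baz` succeeds.
fn search_lines<R: BufRead>(reader: R) -> std::io::Result<Vec<(usize, usize)>> {
    let mut hits = Vec::new();
    for (idx, line) in reader.lines().enumerate() {
        if let Some(offset) = find_baz(&line?) {
            hits.push((idx + 1, offset));
        }
    }
    Ok(hits)
}

fn main() -> std::io::Result<()> {
    // In-memory stand-in for a file; a real run could pass
    // `BufReader::new(File::open(path)?)` instead.
    let data = "***foo***bar***baz***foo***bar***baz\n\
                ***foo***bar***qux***foo***bar***baz\n";
    for (line_no, offset) in search_lines(data.as_bytes())? {
        println!("line {line_no}: baz at byte {offset}");
    }
    Ok(())
}
```

With the two sample lines from the question, only the first line is reported (baz at byte 15). Note that lines() allocates a new String per line; for 200MB files, reusing one buffer with read_line would probably be cheaper, but that is an optimization I have not measured.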