Regex: is there a one-liner for this?

I want to search inside multiple big text files (200MB each) as fast as possible. I am using the command line tool ripgrep and I want to call it only once.

In the following string:

***foo***bar***baz***foo***bar***baz

(*** stands for an arbitrary run of characters, which may differ in kind and length each time.)

I want to match baz, but only if it follows the first occurrence of foo***bar***.

So in ***foo***bar***baz***foo***bar***baz it should match the first baz, and in ***foo***bar***qux***foo***bar***baz it should match nothing.

I tried several solutions, but none of them worked. Can this be done with a single regular expression?

I'm pretty sure that a regex is overkill in this case. A simple series of find calls can do the job:

fn find_baz(input: &str) -> Option<usize> {
    const FOO: &str = "foo";
    const BAR: &str = "bar";

    // 1: we find the occurrences of "foo", "bar" and "baz":
    let foo = input.find(FOO)?;
    let bar = input[foo..].find(BAR).map(|i| i + foo)?;
    let baz = input[bar..].find("baz").map(|i| i + bar)?;

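    // 2: we verify that no second "foo" followed by "bar" sits between the
    // first "bar" and "baz". If one does, the chain yields `Some(_)` and the
    // `xor` with `Some(baz)` gives `None`; otherwise it gives `Some(baz)`.
    input[bar..baz]
        .find(FOO)
        .map(|i| i + bar)
        .and_then(|foo| input[foo..baz].find(BAR))
        .xor(Some(baz))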
}

#[test]
fn found_it() {
    assert_eq!(Some(15), find_baz("***foo***bar***baz***foo***bar***baz"));
}

#[test]
fn found_it_2() {
    assert_eq!(Some(27), find_baz("***foo***bar***qux***foo***baz"));
}

#[test]
fn not_found() {
    assert_eq!(None, find_baz("***foo***bar***qux***foo***bar***baz"));
}

#[test]
fn not_found_2() {
    assert_eq!(None, find_baz("***foo***bar***qux***foo***"));
}
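
Since the question is about scanning large files, here is a minimal sketch of how find_baz might be applied line by line. It assumes that matching one line at a time is what is wanted (ripgrep does this by default). The helper name search_lines is hypothetical and not part of the answer above, and find_baz is repeated only so that the snippet compiles on its own.

```rust
use std::io::{self, BufRead};

/// Same logic as the answer's `find_baz`, repeated so this sketch is self-contained.
fn find_baz(input: &str) -> Option<usize> {
    let foo = input.find("foo")?;
    let bar = input[foo..].find("bar").map(|i| i + foo)?;
    let baz = input[bar..].find("baz").map(|i| i + bar)?;

    // `xor` yields `Some(baz)` only when no second "foo ... bar" precedes "baz".
    input[bar..baz]
        .find("foo")
        .map(|i| i + bar)
        .and_then(|foo| input[foo..baz].find("bar"))
        .xor(Some(baz))
}

/// Hypothetical helper: collects (1-based line number, byte offset of "baz")
/// for every line of `reader` on which `find_baz` succeeds.
fn search_lines<R: BufRead>(reader: R) -> io::Result<Vec<(usize, usize)>> {
    let mut hits = Vec::new();
    for (idx, line) in reader.lines().enumerate() {
        let line = line?; // propagate I/O and UTF-8 errors to the caller
        if let Some(offset) = find_baz(&line) {
            hits.push((idx + 1, offset));
        }
    }
    Ok(hits)
}
```

To scan a real file, a std::io::BufReader wrapped around a std::fs::File can be passed as the reader. The offsets returned are byte offsets within each line, as with str::find.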
