
How to use Rust regex on bytes (Vec<u8> or &[u8])?

I have a &[u8] and I need to verify that it conforms to some pattern. There are examples of using regexes on &[u8] in the Regex documentation and in the module documentation. I took the code from the examples section, put it inside a main(), and added a few declarations:

extern crate regex;
use regex::Regex;

fn main() {
    let re = Regex::new(r"'([^']+)'\s+\((\d{4})\)").unwrap();
    let text = b"Not my favorite movie: 'Citizen Kane' (1941).";
    let caps = re.captures(text).unwrap();
    assert_eq!(&caps[1], &b"Citizen Kane"[..]);
    assert_eq!(&caps[2], &b"1941"[..]);
    assert_eq!(&caps[0], &b"'Citizen Kane' (1941)"[..]);
    // You can also access the groups by index using the Index notation.
    // Note that this will panic on an invalid index.
    assert_eq!(&caps[1], b"Citizen Kane");
    assert_eq!(&caps[2], b"1941");
    assert_eq!(&caps[0], b"'Citizen Kane' (1941)");
}

I don't understand how this example code differs from regular string matching, and indeed the compiler complains that it expected a &str. In general, the code does not hint at how it differs from the usual string matching, with which I have no problems.

I presume I did something basic wrong, like a missing or more precise import. I am in a guessing game here, as the docs fail to provide working examples (as they regularly do), and this time the compiler also fails to nudge me in the right direction.

Here are the compiler messages:

error[E0308]: mismatched types
 --> src/main.rs:7:28
  |
7 |     let caps = re.captures(text).unwrap();
  |                            ^^^^ expected str, found array of 45 elements
  |
  = note: expected type `&str`
             found type `&[u8; 45]`

error[E0277]: the trait bound `str: std::cmp::PartialEq<[u8]>` is not satisfied
 --> src/main.rs:8:5
  |
8 |     assert_eq!(&caps[1], &b"Citizen Kane"[..]);
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ can't compare `str` with `[u8]`
  |
  = help: the trait `std::cmp::PartialEq<[u8]>` is not implemented for `str`
  = note: required because of the requirements on the impl of `std::cmp::PartialEq<&[u8]>` for `&str`
  = note: this error originates in a macro outside of the current crate

error[E0277]: the trait bound `str: std::cmp::PartialEq<[u8]>` is not satisfied
 --> src/main.rs:9:5
  |
9 |     assert_eq!(&caps[2], &b"1941"[..]);
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ can't compare `str` with `[u8]`
  |
  = help: the trait `std::cmp::PartialEq<[u8]>` is not implemented for `str`
  = note: required because of the requirements on the impl of `std::cmp::PartialEq<&[u8]>` for `&str`
  = note: this error originates in a macro outside of the current crate

error[E0277]: the trait bound `str: std::cmp::PartialEq<[u8]>` is not satisfied
  --> src/main.rs:10:5
   |
10 |     assert_eq!(&caps[0], &b"'Citizen Kane' (1941)"[..]);
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ can't compare `str` with `[u8]`
   |
   = help: the trait `std::cmp::PartialEq<[u8]>` is not implemented for `str`
   = note: required because of the requirements on the impl of `std::cmp::PartialEq<&[u8]>` for `&str`
   = note: this error originates in a macro outside of the current crate

error[E0277]: the trait bound `str: std::cmp::PartialEq<[u8; 12]>` is not satisfied
  --> src/main.rs:13:5
   |
13 |     assert_eq!(&caps[1], b"Citizen Kane");
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ can't compare `str` with `[u8; 12]`
   |
   = help: the trait `std::cmp::PartialEq<[u8; 12]>` is not implemented for `str`
   = note: required because of the requirements on the impl of `std::cmp::PartialEq<&[u8; 12]>` for `&str`
   = note: this error originates in a macro outside of the current crate

error[E0277]: the trait bound `str: std::cmp::PartialEq<[u8; 4]>` is not satisfied
  --> src/main.rs:14:5
   |
14 |     assert_eq!(&caps[2], b"1941");
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ can't compare `str` with `[u8; 4]`
   |
   = help: the trait `std::cmp::PartialEq<[u8; 4]>` is not implemented for `str`
   = note: required because of the requirements on the impl of `std::cmp::PartialEq<&[u8; 4]>` for `&str`
   = note: this error originates in a macro outside of the current crate

error[E0277]: the trait bound `str: std::cmp::PartialEq<[u8; 21]>` is not satisfied
  --> src/main.rs:15:5
   |
15 |     assert_eq!(&caps[0], b"'Citizen Kane' (1941)");
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ can't compare `str` with `[u8; 21]`
   |
   = help: the trait `std::cmp::PartialEq<[u8; 21]>` is not implemented for `str`
   = note: required because of the requirements on the impl of `std::cmp::PartialEq<&[u8; 21]>` for `&str`
   = note: this error originates in a macro outside of the current crate

"and added a few declarations"

Unfortunately, you added the wrong ones. Note that the documentation you've linked to is for the struct regex::bytes::Regex, not regex::Regex. They are two different types!

extern crate regex;
use regex::bytes::Regex;
//         ^^^^^

fn main() {
    let re = Regex::new(r"'([^']+)'\s+\((\d{4})\)").unwrap();
    let text = b"Not my favorite movie: 'Citizen Kane' (1941).";
    let caps = re.captures(text).unwrap();

    assert_eq!(&caps[1], &b"Citizen Kane"[..]);
    assert_eq!(&caps[2], &b"1941"[..]);
    assert_eq!(&caps[0], &b"'Citizen Kane' (1941)"[..]);

    assert_eq!(&caps[1], b"Citizen Kane");
    assert_eq!(&caps[2], b"1941");
    assert_eq!(&caps[0], b"'Citizen Kane' (1941)");
}

"as the docs fail to provide working examples (as they regularly do)"

Note that code blocks in documentation are compiled and executed by default, so in my experience it's pretty rare for the examples not to work.


 