
Capturing a particular character from a string in Perl

I have a file with contents like this:

HFH_F_OPL_J0                                       ;comment1
HIJ_I_AAA_V2_DSD                                   ;comment2
ALE_H_FB_V1                                        ;comment3
ZXZPOIF_P                                              ;comment4
RST0DREK_S                                              ;comment5

I need to match the single character that always follows the first underscore. It is always one of H, I, F, P, L, or S.

Which regex should I use for this?

/(\w{3,})_([S|I|P|F|L|H]{1})(.*)\;/ 

does not give the right results.

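Anchor the pattern to the start of the line and change the first \w to [A-Z0-9], because \w also matches _ and the greedy \w{3,} can therefore run past the first underscore. (Your sample data contains a digit in RST0DREK_S, so a plain [A-Z] would not match that line.) Also drop the | characters from the character class, since inside brackets they are literal. The character you want is then in capture group 1.

/^[A-Z0-9]{3,}_([SIPFLH]).*;/ 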

or

/^[^_]{3,}_\K[SIPFLH](?=.*;)/ 
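To make the difference concrete, here is a minimal sketch that runs the question's pattern and the two patterns above over the sample lines. The sub names (flag_original, flag_capture, flag_keep) are made up for illustration, and the [A-Z0-9] prefix class is an assumption chosen so that RST0DREK_S also matches.

```perl
use strict;
use warnings;

# Sample lines from the question.
my @lines = (
    'HFH_F_OPL_J0          ;comment1',
    'HIJ_I_AAA_V2_DSD      ;comment2',
    'ALE_H_FB_V1           ;comment3',
    'ZXZPOIF_P             ;comment4',
    'RST0DREK_S            ;comment5',
);

# The question's pattern: \w also matches "_", so the greedy \w{3,}
# can swallow the first underscore and settle on a later one.
sub flag_original {
    my ($line) = @_;
    return $line =~ /(\w{3,})_([S|I|P|F|L|H]{1})(.*)\;/ ? $2 : undef;
}

# Anchored version with a capture group; the prefix may not contain "_".
sub flag_capture {
    my ($line) = @_;
    return $line =~ /^[A-Z0-9]{3,}_([SIPFLH]).*;/ ? $1 : undef;
}

# \K version: everything before \K is dropped from the match,
# so the whole match ($&) is just the wanted character.
sub flag_keep {
    my ($line) = @_;
    return $line =~ /^[^_]{3,}_\K[SIPFLH](?=.*;)/ ? $& : undef;
}

for my $line (@lines) {
    my ($name) = split ' ', $line;
    printf "%-18s original=%s capture=%s keep=%s\n",
        $name,
        flag_original($line) // '-',
        flag_capture($line)  // '-',
        flag_keep($line)     // '-';
}
```

For ALE_H_FB_V1 the question's pattern captures F rather than H: \w{3,} first tries ALE_H_FB (followed by _V, which is not in the class), then backtracks to ALE_H, which is followed by _F.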


If you trust your data, then there's no reason to check the value of the character right after the first underscore -- you can just grab it and use it.

This short Perl program demonstrates:

use strict;
use warnings 'all';
use feature 'say';

while ( <DATA> ) {
    say $1 if /_(.)/;
}

__DATA__
HFH_F_OPL_J0                                       ;comment1
HIJ_I_AAA_V2_DSD                                   ;comment2
ALE_H_FB_V1                                        ;comment3
ZXZPOIF_P                                              ;comment4
RST0DREK_S

output

F
I
H
P
S

If you want to be slightly stricter, you can use a character class instead of a dot, which changes that line of my code to

say $1 if /_([HIFPLS])/;

The output is identical to that of the original code.
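One caveat with the unanchored character-class version: if the character after the first underscore is not in the class, the regex engine simply moves on and may match a later underscore instead. The following is a minimal sketch, assuming you would rather have such a line rejected. The sub names and the input ABC_X_H_1 are hypothetical and not part of the original data.

```perl
use strict;
use warnings;

# Unanchored: matches the first underscore that happens to be
# followed by an allowed character, not necessarily the first one.
sub loose_flag {
    my ($line) = @_;
    return $line =~ /_([HIFPLS])/ ? $1 : undef;
}

# Anchored: ^[^_]*_ can only end at the first underscore in the line,
# so a disallowed character there makes the whole match fail.
sub first_flag {
    my ($line) = @_;
    return $line =~ /^[^_]*_([HIFPLS])/ ? $1 : undef;
}

for my $line ('ALE_H_FB_V1   ;comment3', 'ABC_X_H_1') {
    printf "%-26s loose=%s strict=%s\n",
        $line,
        loose_flag($line) // 'no match',
        first_flag($line) // 'no match';
}
```

On the sample data from the question both subs agree. They only differ on lines where the first underscore is followed by something outside [HIFPLS].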
