Perl regex that captures substring between tick marks

Question

I am trying to find a solution in Perl that captures the filename in the following string, i.e. the text between the tick marks.

my $str = "Saving to: ‘wapenc?T=mavodi-7-13b-2b-3-96-1e3431a’";

(my $results) = $str =~ /‘(.*?[^\\])‘/;
print $results if $results;

I need to end up with wapenc?T=mavodi-7-13b-2b-3-96-1e3431a

Answer 1

The final tick in your regex is not the same character as the final tick in the input string. The string ends with ’ (RIGHT SINGLE QUOTATION MARK, U+2019, char 8217), but the regex ends with ‘ (LEFT SINGLE QUOTATION MARK, U+2018, char 8216), so the closing delimiter never matches. Also, when using Unicode characters in the source, be sure to include

use utf8;

and save the file UTF-8 encoded.

After fixing these two issues, the code worked for me:

#! /usr/bin/perl
use warnings;
use strict;
use utf8;

my $str = "Saving to: ‘wapenc?T=mavodi-7-13b-2b-3-96-1e3431a’";

(my $results) = $str =~ /‘(.*?[^\\])’/;
print $results if $results;
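
One way to spot this kind of mismatch is to print the code points of the non-ASCII characters in both the string and the pattern. This is only a minimal sketch: it assumes the script is saved UTF-8 encoded, the helper name non_ascii_codepoints is made up for illustration, and $pattern_text holds just the delimiters from the question's regex as a plain string.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;   # the source contains literal Unicode characters

# Hypothetical helper: return the code points (as U+XXXX strings)
# of every non-ASCII character in the given text.
sub non_ascii_codepoints {
    my ($text) = @_;
    return map  { sprintf('U+%04X', ord($_)) }
           grep { ord($_) > 127 }
           split //, $text;
}

my $str          = "Saving to: ‘wapenc?T=mavodi-7-13b-2b-3-96-1e3431a’";
my $pattern_text = '‘(.*?)‘';   # delimiters as typed in the question's regex

print "string:  ", join(' ', non_ascii_codepoints($str)),          "\n";
print "pattern: ", join(' ', non_ascii_codepoints($pattern_text)), "\n";
```

The string line shows U+2018 followed by U+2019, while the pattern line shows U+2018 twice, which is the difference described above.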

Answer 2

Your tick characters aren't in the 7-bit ASCII character set, so there is a whole character-encoding rabbit hole to go down here. But the quick and dirty solution is to capture everything in between the non-ASCII characters.

my ($result) = $str =~ /[^\0-\x7f]+(.*?)[^\0-\x7f]/;

[^\0-\x7f] matches any character whose value is not between 0 and 127, i.e. anything outside 7-bit ASCII (the ASCII range itself includes newlines, tabs, and other control characters). This regular expression will work whether your input is still UTF-8 encoded or has already been decoded, and may work for other character encodings, too. The + after the first class is what makes this hold: an undecoded ‘ is three bytes (E2 80 98), all above 0x7f, and the + consumes them together.
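
To illustrate the claim about encoded versus decoded input, here is a small sketch that runs the same pattern on both forms of the string. It assumes the script is saved UTF-8 encoded; the helper name between_non_ascii is hypothetical, and Encode's encode function is used only to produce the raw UTF-8 bytes.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                 # literal ‘ and ’ below become decoded characters
use Encode qw(encode);

# Hypothetical helper: capture the text between two runs of non-ASCII
# characters (or bytes), using the pattern from this answer.
sub between_non_ascii {
    my ($input) = @_;
    my ($result) = $input =~ /[^\0-\x7f]+(.*?)[^\0-\x7f]/;
    return $result;
}

my $decoded = "Saving to: ‘wapenc?T=mavodi-7-13b-2b-3-96-1e3431a’";
my $bytes   = encode('UTF-8', $decoded);   # same text as raw UTF-8 bytes

for my $input ($decoded, $bytes) {
    my $result = between_non_ascii($input);
    print "$result\n" if defined $result;
}
```

Both iterations print the same ASCII filename, because every byte of an encoded quotation mark also falls outside the 0-127 range.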


2 answers

Answer 1: score 3, posted 2018-12-22 00:55:14
Answer 2: score 1, accepted, posted 2018-12-22 00:57:52
