Perl: How can we check which of two strings is a substring of the other?

I have two strings, say A and B. Either one may be a substring of the other. For example:

 Case:1-- A = "abcdef" and B = "abc" //String B is Substring of A.

or

Case:2-- A = "xyz" and B = "wxyza"  // String A is Substring of B.

I know about the index function:

 index($str, $substr)

but I don't know in advance which string is the substring of the other, so I can't tell in which order to pass the arguments.

I have done it with an OR that checks both cases by swapping the variables.

I only need to know whether one of the strings is a substring of the other. Is there a better technique? Thanks in advance!

@zdim one statement

Knock yourself out,

print length($s1) > length($s2) 
  ? "s2 is @{[ index($s1,$s2) <0 && 'NOT ']}substr of s1"
  : "s1 is @{[ index($s2,$s1) <0 && 'NOT ']}substr of s2";

Here is a simple way to submit the pair and get back the string that encloses the other, or undef.

($enclosing) = grep { /$s2/ && /$s1/ } ($s1, $s2);
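The regex-based snippets in this answer interpolate the strings directly into a pattern, so they assume the strings contain no regex metacharacters. A minimal sketch of the same grep with the strings quoted via \Q...\E follows; the sub name enclosing is hypothetical and only wraps the one-liner for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): returns the string
# that contains the other one, or undef if neither contains the other.
sub enclosing {
    my ($s1, $s2) = @_;
    # \Q...\E quotes regex metacharacters so both strings match literally.
    my ($enclosing) = grep { /\Q$s2\E/ && /\Q$s1\E/ } ($s1, $s2);
    return $enclosing;
}

print enclosing("a.c", "xa.cx") // 'undef', "\n";   # prints xa.cx
print enclosing("a.c", "abc")   // 'undef', "\n";   # prints undef
```

Without the quoting, the second call would report "abc" as enclosing "a.c", because the unquoted dot matches any character.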

Here is a way to get the strings ordered as ($inside, $enclosing), or get an empty list

@sorted = sort { "$b$a" =~ /^$b.*$b/ <=> "$a$b" =~ /^$a.*$a/ }
          grep { $s1 =~ /$s2/ || $s2 =~ /$s1/ } ($s1, $s2);

This first filters out the no-match case by a two-way regex, passing through an empty list.

In both cases equal words aren't in any way marked as such, and I don't see how they could be.
However, they do contain each other and can probably be further processed the same way.
The only solution in this answer that delivers that is the @mask below, set to (1,1) in that case.

All code here runs under use warnings which is omitted for brevity.


Posted initially. Returns a copy of the word that is inside the other, or undef.

($in) =  map { /^($s1).*$s1|(^$s2).*$s2/ ? $1 // $2 : () } ("$s1$s2", "$s2$s1");

Comments clarified that returning a code classifying which string is inside the other may be useful.

($sc) =  map { /^($s1).*$s1|(^$s2).*$s2/ ? ($1 && 1) || ($2 && 2) : () } 
              ("$s1$s2", "$s2$s1");

The $sc is 1 when $s1 is contained in $s2, or 2 when $s2 is in $s1, or undef otherwise.


Depending on how this is meant to be used, the origin of the above may be useful

@mask = ( ("$s1$s2" =~ /^$s1.*$s1/ ? 1 : 0), ("$s2$s1" =~ /^$s2.*$s2/ ? 1 : 0) );

The @mask has (bool, bool) for whether words ($s1, $s2) are inside the other.

It is: (1,1) (equal), or (1,0) or (0,1) (for $s1 or $s2 inside the other) or (0,0) (distinct).
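As a cross-check on the mask semantics described above, here is a minimal sketch that builds the same pair of flags with index instead of a regex, so metacharacters in the strings do not matter. The sub name containment_mask is hypothetical, not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: returns two 0/1 flags,
# (is $s1 inside $s2, is $s2 inside $s1).
sub containment_mask {
    my ($s1, $s2) = @_;
    return (
        index($s2, $s1) >= 0 ? 1 : 0,   # $s1 found somewhere in $s2?
        index($s1, $s2) >= 0 ? 1 : 0,   # $s2 found somewhere in $s1?
    );
}

my @mask = containment_mask("xyz", "wxyza");
print "@mask\n";   # prints 1 0
```

Equal strings give (1,1) and unrelated strings give (0,0), matching the four cases listed above.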

If you're sure that one of the strings is a substring of the other, you could use something like:

my $c = ($sa =~ /$sb/) || (-1) *($sb =~ /$sa/);

Where 1 means $sb is a substr of $sa and -1 means $sa is a substr of $sb.
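For comparison, the same 1 / -1 convention can be sketched with index, which treats both strings literally and also gives a well-defined 0 when neither contains the other. The sub name which_contains is an assumption made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if $sb is inside $sa, -1 if $sa is inside $sb,
# 0 if neither string contains the other. Equal strings yield 1.
sub which_contains {
    my ($sa, $sb) = @_;
    return  1 if index($sa, $sb) >= 0;   # index(STR, SUBSTR) is -1 when not found
    return -1 if index($sb, $sa) >= 0;
    return  0;
}

print which_contains("abcdef", "abc"), "\n";   # prints 1
print which_contains("xyz", "wxyza"),  "\n";   # prints -1
print which_contains("abc", "xyz"),    "\n";   # prints 0
```

Checking the longer string first is not required here, since index simply fails when the substring is longer than the string being searched.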
