PHP - quick regular expression question

Question

I am trying to match a word in a wall of text and return a few words before and after the match. Everything works, but I would like to know whether it can be modified to also match similar words. Here is an example:

preg_match_all('/(?:\b(\w+\s+)\{1,5})?.*(pripravená)(?:(\s+){1,2}\b.{1,10})?/u', $item, $res[$file]);

This code returns a match, but I would like to modify it so that

preg_match_all('/(?:\b(\w+\s+)\{1,5})?.*(pripravena)(?:(\s+){1,2}\b.{1,10})?/u', $item, $res[$file]);

would also return a match. The text is in Slovak. I tried a range of Unicode characters and also \p{Sk} (and a few others), but to no avail. Maybe I just put it in the wrong place, I don't know...

Is something like this possible?

Any help is appreciated

Answer 1

I don't know if there is an "ignore accents" switch, but you could rewrite your search query with something like:

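$query = 'pripravená';
// Replace each vowel (accented or not) with a class of all its variants.
// The u modifier is required so the UTF-8 letters are treated as characters, not bytes.
$query = preg_replace(
  array('=[áàâa]=iu','=[óòôo]=iu','=[úùûu]=iu'),
  array( '[áàâa]'   , '[óòôo]'   , '[úùûu]'   ),
  $query
);
preg_match_all('/(?:\b(\w+\s+)\{1,5})?.*('.$query.')(?:(\s+){1,2}\b.{1,10})?/u', $item, $res[$file]);

That converts your 'pripravená' query into 'pripr[áàâa]ven[áàâa]' (every "a" in the word is expanded, not only the last one), which matches both spellings.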
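
The same idea can be generalized with a lookup table, so that every letter of the query expands to a class of its accented variants. This is only a minimal sketch, assuming UTF-8 source and input; the function name accentInsensitivePattern and the letter table (Slovak diacritics) are illustrative and not part of the answer above:

```php
<?php
// Hypothetical helper: expand each letter of $word into a character class
// listing its Slovak accented variants. Assumes UTF-8 everywhere.
function accentInsensitivePattern(string $word): string
{
    // Illustrative table of Slovak diacritics; extend it for other languages.
    $variants = array(
        'a' => 'aáä', 'c' => 'cč', 'd' => 'dď', 'e' => 'eé', 'i' => 'ií',
        'l' => 'lĺľ', 'n' => 'nň', 'o' => 'oóô', 'r' => 'rŕ', 's' => 'sš',
        't' => 'tť', 'u' => 'uú', 'y' => 'yý', 'z' => 'zž',
    );

    // Map every variant (including the base letter itself) back to its base.
    $toBase = array();
    foreach ($variants as $base => $chars) {
        foreach (preg_split('//u', $chars, -1, PREG_SPLIT_NO_EMPTY) as $ch) {
            $toBase[$ch] = $base;
        }
    }

    // Build the pattern one UTF-8 character at a time.
    $pattern = '';
    foreach (preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY) as $ch) {
        $base = isset($toBase[$ch]) ? $toBase[$ch] : $ch;
        $pattern .= isset($variants[$base])
            ? '[' . $variants[$base] . ']'
            : preg_quote($ch, '/');
    }
    return $pattern;
}

$pattern = '/' . accentInsensitivePattern('pripravená') . '/iu';
var_dump(preg_match($pattern, 'Večera je pripravena.')); // int(1)
var_dump(preg_match($pattern, 'Večera je pripravená.')); // int(1)
```

The resulting fragment can be dropped into the larger context-capturing regex in place of the literal word, as long as that regex keeps the u modifier.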

Answer 2

(pripraven[áa]) or (pripravena\p{M}*) or, more likely, some combination of these approaches.

I don't know of any other, more concise, way of specifying "all Latin-1 vowels that are similar to 'a' in my current locale".
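
The two variants cover different encodings of the same letter: \p{M}* only helps when the accent is stored as a separate combining mark (decomposed form), while the character class covers the precomposed letter. A small sketch of the difference, assuming PHP 7+ for the "\u{...}" escape; the sample strings are made up for illustration, and the patterns are anchored so that a partial match does not hide the effect:

```php
<?php
// Three spellings of the same word (sample data, not from the question).
$plain       = 'pripravena';          // no accent
$precomposed = "pripraven\u{00E1}";   // 'á' as a single code point
$decomposed  = "pripravena\u{0301}";  // 'a' followed by a combining acute accent

// \p{M}* allows zero or more combining marks after the base letter.
$marks = '/^pripravena\p{M}*$/u';
// A character class covers the precomposed letter instead.
$class = '/^pripraven[áa]$/u';
// Combining both handles all three spellings.
$both  = '/^pripraven[áa]\p{M}*$/u';

foreach (array($plain, $precomposed, $decomposed) as $text) {
    printf("%d %d %d\n",
        preg_match($marks, $text),
        preg_match($class, $text),
        preg_match($both, $text));
}
// Prints:
// 1 1 1
// 0 1 1
// 1 0 1
```

This also suggests why \p{Sk} did not help: that property covers standalone modifier symbols such as ´ or ^, not accented letters. If the intl extension is available, Normalizer::normalize() can bring the text into one form before matching.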

Answer 3

You could use strtr() to strip out the accents. See the PHP manual page for a good example: http://php.net/manual/en/function.strtr.php

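// Array form: the three-argument form maps single bytes and would corrupt UTF-8 text.
$addr = strtr($addr, array('ä' => 'a', 'å' => 'a', 'ö' => 'o'));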

You'd still need to specify all the relevant characters, but it would be easier than using a regex to do it.
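
Applied to the original problem, this means stripping the accents from both the text and the query and then matching on the plain versions. A minimal sketch under those assumptions; the helper name stripAccents, the Slovak letter map, the sample sentence, and the simplified context regex (up to two words on each side) are all illustrative:

```php
<?php
// Hypothetical helper: replace Slovak accented letters with their base letters.
// The array form of strtr() replaces whole substrings, so it is safe for UTF-8.
function stripAccents(string $text): string
{
    static $map = array(
        'á' => 'a', 'ä' => 'a', 'č' => 'c', 'ď' => 'd', 'é' => 'e', 'í' => 'i',
        'ĺ' => 'l', 'ľ' => 'l', 'ň' => 'n', 'ó' => 'o', 'ô' => 'o', 'ŕ' => 'r',
        'š' => 's', 'ť' => 't', 'ú' => 'u', 'ý' => 'y', 'ž' => 'z',
        'Á' => 'A', 'Ä' => 'A', 'Č' => 'C', 'Ď' => 'D', 'É' => 'E', 'Í' => 'I',
        'Ĺ' => 'L', 'Ľ' => 'L', 'Ň' => 'N', 'Ó' => 'O', 'Ô' => 'O', 'Ŕ' => 'R',
        'Š' => 'S', 'Ť' => 'T', 'Ú' => 'U', 'Ý' => 'Y', 'Ž' => 'Z',
    );
    return strtr($text, $map);
}

$item  = 'Večera je už pripravená na stole.'; // sample text
$query = 'pripravena';                         // typed without the accent

// Match on accent-free copies of both the text and the query.
$plainItem  = stripAccents($item);
$plainQuery = preg_quote(stripAccents($query), '/');

$m = array();
if (preg_match('/(?:\w+\s+){0,2}' . $plainQuery . '(?:\s+\w+){0,2}/', $plainItem, $m)) {
    echo $m[0], "\n"; // je uz pripravena na stole
}
```

The trade-off is that the returned context comes from the stripped copy, so it no longer carries the original accents.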

3 answers

solution1
1 2010-10-26 13:23:59

solution2
0 2010-10-26 13:15:02

solution3
0 ACCEPTED 2010-10-26 13:29:29
