
RegEx UTF-8 ignore and bold special characters

I have this code for the search box on my website:

<? echo preg_replace("/({$term})/i", "<b>$0</b>", NoticiaInfo($news_results, 'subtitulo')); ?>

I'd like to know if there is any way to make a plain letter in the search term, for example "c", also match its accented form "ç" using regex.

So, if I search for "ca", the letters "çã" in "Função" would be bolded.

Is there any way to do this with regex?

You would need to use preg_replace with an array. Try:

<?php
    $replacements = array(
        '/a/' => '<b>ã</b>',
        '/c/' => '<b>ç</b>'
    );
    echo preg_replace(array_keys($replacements), array_values($replacements),  NoticiaInfo($news_results, 'subtitulo')); 
?>

Then fill out the $replacements array with the other characters you'd like to replace.

@Ranty makes a good point, so you could use str_replace instead, in which case your code becomes:

<?php
    $replacements = array(
        'a' => '<b>ã</b>',
        'c' => '<b>ç</b>'
    );
    echo str_replace(array_keys($replacements), array_values($replacements),  NoticiaInfo($news_results, 'subtitulo')); 
?>

There is no pretty way to do this while preserving the accent marks. You first have to assemble a list of all possible variants of the search term with the accented characters substituted in.

<?php
$termList = array($term);

// You'll need to build this list programmatically.
// This is just a sample, assuming that $term == 'Funcao'.
$termList[] = 'Funcão';
$termList[] = 'Funçao';
$termList[] = 'Função';

$bodyText = NoticiaInfo($news_results, 'subtitulo');

foreach ($termList as $searchTerm) {
    // Escape regex metacharacters; the "u" flag makes the match UTF-8 aware.
    $pattern = '/' . preg_quote($searchTerm, '/') . '/iu';
    $bodyText = preg_replace($pattern, '<b>$0</b>', $bodyText);
}

echo $bodyText;

?>
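
The list of variants could be built programmatically along these lines. This is only a minimal sketch: expandTerm and the $variants map are hypothetical names introduced for illustration, the map covers just a few Portuguese letters, and the input is assumed to be valid UTF-8.

```php
<?php
// Hypothetical helper: expand a plain term into every accented variant.
// $variants maps a lowercase ASCII letter to the accented forms it may stand for.
function expandTerm(string $term, array $variants): array
{
    $results = [''];
    // Split on UTF-8 character boundaries rather than bytes.
    foreach (preg_split('//u', $term, -1, PREG_SPLIT_NO_EMPTY) as $char) {
        // The character itself is always an option, followed by its accented forms.
        $options = array_merge([$char], $variants[strtolower($char)] ?? []);
        $next = [];
        foreach ($results as $prefix) {
            foreach ($options as $option) {
                $next[] = $prefix . $option;
            }
        }
        $results = $next;
    }
    return $results;
}

// Illustrative sample map, not a complete table.
$variants = ['a' => ['ã', 'á'], 'c' => ['ç']];
$termList = expandTerm('ca', $variants);
// 2 options for "c" times 3 options for "a" gives 6 variants.
```

The number of variants is the product of the option counts for each letter, so it grows very quickly with the length of the term.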

Building the search-term array programmatically will be a nightmare, but numerous password-cracking tools already do something similar (e.g., they substitute digits for characters and generate every permutation), so the logic exists somewhere. Bear in mind that the number of variants, and with it the overhead, gets out of hand as search strings get longer.
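
One possible way to sidestep enumerating every variant is to let the regex engine do the work by turning each letter into a character class, so "c" becomes "[cç]". This is a different technique from the loop above, shown only as a sketch: accentInsensitivePattern and $accentClasses are hypothetical names, the map is a small sample, and the term is assumed to be valid UTF-8.

```php
<?php
// Hypothetical helper: build a single pattern in which each plain letter
// also matches the accented forms listed for it in $accentClasses.
function accentInsensitivePattern(string $term, array $accentClasses): string
{
    $pattern = '';
    foreach (preg_split('//u', $term, -1, PREG_SPLIT_NO_EMPTY) as $char) {
        $key = strtolower($char);
        if (isset($accentClasses[$key])) {
            // A mapped letter becomes a character class, e.g. "c" => "[cç]".
            $pattern .= '[' . $char . $accentClasses[$key] . ']';
        } else {
            // Everything else is matched literally, with metacharacters escaped.
            $pattern .= preg_quote($char, '/');
        }
    }
    // "i" ignores case, "u" treats pattern and subject as UTF-8.
    return '/' . $pattern . '/iu';
}

// Illustrative sample map, not a complete table.
$accentClasses = ['a' => 'ãáâà', 'c' => 'ç'];
$pattern = accentInsensitivePattern('ca', $accentClasses);
$highlighted = preg_replace($pattern, '<b>$0</b>', 'Função');
```

Because the original text is used as the subject, the accent marks in the highlighted output are preserved, and the pattern length grows only linearly with the term.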

Of course, if you don't care about maintaining the accent marks this becomes much easier.
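
As a rough sketch of that simpler case, both the term and the text can be stripped of accents before matching. foldAccents and highlightFolded are hypothetical helpers, and the replacement map is a small sample rather than a complete table.

```php
<?php
// Hypothetical helper: replace accented letters with their plain equivalents.
function foldAccents(string $text): string
{
    // Illustrative sample map only; extend it for the letters you need.
    return strtr($text, [
        'ã' => 'a', 'á' => 'a', 'â' => 'a', 'ç' => 'c',
        'Ã' => 'A', 'Á' => 'A', 'Â' => 'A', 'Ç' => 'C',
    ]);
}

// Hypothetical helper: highlight the term in the accent-free version of the text.
function highlightFolded(string $term, string $text): string
{
    $pattern = '/' . preg_quote(foldAccents($term), '/') . '/iu';
    return preg_replace($pattern, '<b>$0</b>', foldAccents($text));
}

$highlighted = highlightFolded('ca', 'Função');
// "Função" is folded to "Funcao" before matching, so the output has no accents.
```

The trade-off is the one stated above: the displayed text loses its accent marks.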


 