Hi I have a web site which has all the url are in seo now I'm transfer my site to urdu language but because the url is in urdu its not display the correct url does anyone have a seo function which i can use.
My site url is now like this domain.com/123// it should be like this domain.com/123/ع وأنا لا أعرف من أين أستطيع أن أراك/
this is the code i have at the moment.
function seoUrl($input)
{
/**
* Return URL-Friendly string slug
* @param string $input
* @return string
*/
$input = remove_accent( $input );
$input = str_replace(" ", " ", $input);
$input = str_replace(array("'", "-"), "", $input); //remove single quote and dash
$input = mb_convert_case($input, MB_CASE_LOWER, "UTF-8"); //convert to lowercase
$input = preg_replace("#[^a-zA-Z]+#", "-", $input); //replace everything non an with dashes
$input = preg_replace("#(-){2,}#", "$1", $input); //replace multiple dashes with one
$input = trim($input, "-"); //trim dashes from beginning and end of string if any
return $input;
}
function remove_accent( $str )
{
$a = array('À', 'Á', 'Â', 'Ã', 'Ä', 'Å', 'Æ', 'Ç', 'È', 'É', 'Ê', 'Ë', 'Ì', 'Í', 'Î', 'Ï', 'Ð', 'Ñ', 'Ò', 'Ó', 'Ô', 'Õ', 'Ö',
'Ø', 'Ù', 'Ú', 'Û', 'Ü', 'Ý', 'ß', 'à', 'á', 'â', 'ã', 'ä', 'å', 'æ', 'ç', 'è', 'é', 'ê', 'ë', 'ì', 'í', 'î', 'ï',
'ñ', 'ò', 'ó', 'ô', 'õ', 'ö', 'ø', 'ù', 'ú', 'û', 'ü', 'ý', 'ÿ', 'A', 'a', 'A', 'a', 'A', 'a', 'C', 'c', 'C', 'c',
'C', 'c', 'C', 'c', 'D', 'd', 'Ð', 'd', 'E', 'e', 'E', 'e', 'E', 'e', 'E', 'e', 'E', 'e', 'G', 'g', 'G', 'g', 'G',
'g', 'G', 'g', 'H', 'h', 'H', 'h', 'I', 'i', 'I', 'i', 'I', 'i', 'I', 'i', 'I', 'i', '?', '?', 'J', 'j', 'K', 'k',
'L', 'l', 'L', 'l', 'L', 'l', '?', '?', 'L', 'l', 'N', 'n', 'N', 'n', 'N', 'n', '?', 'O', 'o', 'O', 'o', 'O', 'o',
'Œ', 'œ', 'R', 'r', 'R', 'r', 'R', 'r', 'S', 's', 'S', 's', 'S', 's', 'Š', 'š', 'T', 't', 'T', 't', 'T', 't', 'U',
'u', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'W', 'w', 'Y', 'y', 'Ÿ', 'Z', 'z', 'Z', 'z', 'Ž', 'ž', '?',
'ƒ', 'O', 'o', 'U', 'u', 'A', 'a', 'I', 'i', 'O', 'o', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', '?', '?',
'?', '?', '?', '?');
$b = array('A', 'A', 'A', 'A', 'A', 'A', 'AE', 'C', 'E', 'E', 'E', 'E', 'I', 'I', 'I', 'I', 'D', 'N', 'O', 'O', 'O', 'O', 'O',
'O', 'U', 'U', 'U', 'U', 'Y', 's', 'a', 'a', 'a', 'a', 'a', 'a', 'ae', 'c', 'e', 'e', 'e', 'e', 'i', 'i', 'i', 'i', 'n',
'o', 'o', 'o', 'o', 'o', 'o', 'u', 'u', 'u', 'u', 'y', 'y', 'A', 'a', 'A', 'a', 'A', 'a', 'C', 'c', 'C', 'c', 'C', 'c',
'C', 'c', 'D', 'd', 'D', 'd', 'E', 'e', 'E', 'e', 'E', 'e', 'E', 'e', 'E', 'e', 'G', 'g', 'G', 'g', 'G', 'g', 'G', 'g',
'H', 'h', 'H', 'h', 'I', 'i', 'I', 'i', 'I', 'i', 'I', 'i', 'I', 'i', 'IJ', 'ij', 'J', 'j', 'K', 'k', 'L', 'l', 'L', 'l',
'L', 'l', 'L', 'l', 'l', 'l', 'N', 'n', 'N', 'n', 'N', 'n', 'n', 'O', 'o', 'O', 'o', 'O', 'o', 'OE', 'oe', 'R', 'r', 'R',
'r', 'R', 'r', 'S', 's', 'S', 's', 'S', 's', 'S', 's', 'T', 't', 'T', 't', 'T', 't', 'U', 'u', 'U', 'u', 'U', 'u', 'U',
'u', 'U', 'u', 'U', 'u', 'W', 'w', 'Y', 'y', 'Y', 'Z', 'z', 'Z', 'z', 'Z', 'z', 's', 'f', 'O', 'o', 'U', 'u', 'A', 'a',
'I', 'i', 'O', 'o', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'A', 'a', 'AE', 'ae', 'O', 'o');
return str_replace($a, $b, $str);
}
The problem is what @deceze pointed out. Urls can only contain characters within the Latin alphabet (actually, within the English alphabet), so your only way to use Urdu in your url would be to have your best approach with English letters.
For example, I speak Catalan, and, a part from having accents, we have got this letter: ç . It is almost a c , but it sounds like an s , so when slugging a text with ç (for example, Març), I'd go for either Marc (character similarity) or Mars (phonetical similarity). You could follow this pattern. Otherwise, I think there is nothing else you could do.
EDIT: After a fast class in url encoding, you all should read the comments below this answer.
I turned to completely read your functions, and I think I happen to understand what's going on "behind the scenes":
You get your Urdu string, say the one you put before: ع وأنا لا أعرف من أين أستطيع أن أراك
remove_accent()
. It doesn't contain any of the Urdu characters to be replaced for others without accents, so it returns the same string: ع وأنا لا أعرف من أين أستطيع أن أراك
. ع وأنا لا أعرف من أين أستطيع أن أراك
. ع وأنا لا أعرف من أين أستطيع أن أراك
. And here comes the problem ------------------------------------
. -
. (empty)
. So, the main problem you had was the first regex function. I'm not aware on how to fix that. Probably with a trick converting all those characters into ASCII and then creating a regex trying to fix that. However, I'd go with these steps:
_., !'?&
and turn them into -
. utf8_decode()
would probably suffice, but I haven't tried it)
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.