php / regex: "linkify" blog titles

Question

I'm trying to write a simple PHP function that can take a string like

Topic: Some stuff, Maybe some more, it's my stuff?

and return

topic-some-stuff-maybe-some-more-its-my-stuff

As such:

lowercase
remove all non-alphanumeric non-space characters
replace all spaces (or groups of spaces) with hyphens

Can I do this with a single regex?

Answer 1

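function Slug($string)
{
    // Encode accented characters as named entities, e.g. "ñ" becomes "&ntilde;".
    $string = htmlentities($string, ENT_QUOTES, 'UTF-8');
    // Keep only the base letter(s) of each accent entity, e.g. "&ntilde;" becomes "n".
    $string = preg_replace('~&([a-z]{1,2})(?:acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml);~i', '$1', $string);
    // Decode any remaining entities back into plain characters.
    $string = html_entity_decode($string, ENT_QUOTES, 'UTF-8');
    // Collapse every run of non-alphanumeric characters into a single hyphen.
    $string = preg_replace('~[^0-9a-z]+~i', '-', $string);
    // Trim stray hyphens from both ends and lowercase the result.
    return strtolower(trim($string, '-'));
}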

$topic = 'Iñtërnâtiônàlizætiøn';
echo Slug($topic); // internationalizaetion

$topic = 'Topic: Some stuff, Maybe some more, it\'s my stuff?';
echo Slug($topic); // topic-some-stuff-maybe-some-more-it-s-my-stuff

$topic = 'here عربي‎ Arabi';
echo Slug($topic); // here-arabi

$topic = 'here 日本語 Japanese';
echo Slug($topic); // here-japanese
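Note that the second result above contains it-s, where the question asks for its. As a point of comparison, here is a minimal sketch that follows the question's three steps literally. The function name title_to_slug is hypothetical, and the sketch assumes ASCII input, so it does no transliteration of accented characters.

```php
<?php
// Hypothetical helper (not from the original answer); assumes ASCII input.
function title_to_slug($title)
{
    // Step 1: lowercase.
    $slug = strtolower($title);
    // Step 2: remove every character that is neither alphanumeric nor whitespace.
    $slug = preg_replace('/[^a-z0-9\s]+/', '', $slug);
    // Step 3: trim, then replace each run of whitespace with a single hyphen.
    return preg_replace('/\s+/', '-', trim($slug));
}

echo title_to_slug("Topic: Some stuff, Maybe some more, it's my stuff?"), "\n";
// topic-some-stuff-maybe-some-more-its-my-stuff
```

Because the apostrophe is deleted rather than turned into a hyphen, "it's" becomes "its". The trade-off is that accented letters are dropped instead of being mapped to a base letter.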

Answer 2

Why are regular expressions considered the universal panacea to all life's problems (just because a lowly backtrace in a preg_match has discovered the cure for cancer)? Here's a solution without recourse to regexp:

$str = "Topic: Some stuff, Maybe some more, it's my stuff?";
$str = implode('-',str_word_count(strtolower($str),2));
echo $str;

Without going the whole UTF-8 route:

$str = "Topic: Some stuff, Maybe some more, it's my Iñtërnâtiônàlizætiøn stuff?";
$str = implode('-',str_word_count(strtolower(str_replace("'","",$str)),2,'Þßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ'));
echo $str;

gives

topic-some-stuff-maybe-some-more-its-my-iñtërnâtiônàlizætiøn-stuff

Answer 3

You can do it with one preg_replace:

preg_replace(array("/[A-Z]/e", "/\\p{P}/", "/\\s+/"),
    array('strtolower("$0")', '', '-'), $str);

Technically, you could do it with one regex, but this is simpler.

Preemptive response: yes, it unnecessarily uses regular expressions (though very simple ones), makes an unnecessarily large number of calls to strtolower, and doesn't consider non-English characters (the OP doesn't even give an encoding); I'm just satisfying the OP's requirements.
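One caveat for current PHP versions: the /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so the snippet above no longer runs as written. A minimal sketch of an equivalent that still uses a single preg_replace call, lowercasing once up front instead, might look like this. The function name slugify_simple is hypothetical, and ASCII input is assumed, as in the original.

```php
<?php
// Hypothetical helper (not from the original answer); assumes ASCII input.
function slugify_simple($str)
{
    // Lowercase the whole string once instead of per character via /e.
    $str = strtolower($str);
    // One preg_replace call, two patterns applied in order:
    // first delete punctuation, then turn each whitespace run into a hyphen.
    return preg_replace(array('/\p{P}/', '/\s+/'), array('', '-'), $str);
}

echo slugify_simple("Topic: Some stuff, Maybe some more, it's my stuff?"), "\n";
// topic-some-stuff-maybe-some-more-its-my-stuff
```

Like the original, this only strips characters in the punctuation class, so symbols such as $ or + would pass through unchanged.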

Answer 4

Many frameworks provide functions for this:

CodeIgniter: http://bitbucket.org/ellislab/codeigniter/src/c39315f13a76/system/helpers/url_helper.php#cl-472

WordPress (has many more in the code): http://core.trac.wordpress.org/browser/trunk/wp-includes/formatting.php#L814

4 answers

Answer 1: 3, 2010-07-14 09:29:58
Answer 2: 2, 2010-07-14 08:59:44
Answer 3: 2, 2010-07-14 09:05:53
Answer 4: 2, accepted, 2010-07-14 09:29:26
