
How do I make the first letter of every sentence in a paragraph uppercase?

I got this function from php.net for converting text to sentence case: it lowercases everything and then capitalizes the first letter of each sentence.

function sentence_case($string) {
    $sentences = preg_split('/([.?!]+)/', $string, -1, PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE);
    $new_string = '';
    foreach ($sentences as $key => $sentence) {
        $new_string .= ($key & 1) == 0
            ? ucfirst(strtolower(trim($sentence)))
            : $sentence . ' ';
    }
    return trim($new_string);
}

If the text is not wrapped in an HTML paragraph, everything works well. But if it is, the first letter after an opening paragraph ( <p> ) or line break ( <br> ) tag becomes lowercase.

This is the sample:

Before:

<p>Lorem IPSUM is simply dummy text. LOREM ipsum is simply dummy text! wHAt is LOREM IPSUM? Hello lorem ipSUM!</p>

Output:

<p>lorem ipsum is simply dummy text. Lorem ipsum is simply dummy text! What is lorem ipsum? Hello lorem ipsum!</p>

Can someone help me make the first letter in the paragraph a capital letter?

You can do it easily with CSS (note that this only affects the first letter of each paragraph, not of each sentence):

p::first-letter {
    text-transform: uppercase;
}

Your problem is that you're treating the HTML as part of the sentence, so the first "word" of the sentence is <p>lorem , not Lorem . ucfirst() then sees < as the first character and has nothing to capitalize.
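To make the cause concrete, here is a minimal sketch (the variable names are just for illustration) that applies the same ucfirst(strtolower(...)) step from the question to a sentence with and without a leading tag:

```php
<?php
// ucfirst() only looks at the first character of the string.
// If that character is "<", there is no letter to capitalize.
$plain  = ucfirst(strtolower('Lorem IPSUM is simply dummy text'));
$tagged = ucfirst(strtolower('<p>Lorem IPSUM is simply dummy text'));

echo $plain, PHP_EOL;   // Lorem ipsum is simply dummy text
echo $tagged, PHP_EOL;  // <p>lorem ipsum is simply dummy text
```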

You can change the regexp to /([>.?!]+)/ , but then you'll see an extra space before "Lorem", because the function now sees two sentences instead of one.

Also, Hello <em>there</em> will now be split into four pieces instead of being treated as one sentence.
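A quick way to see this is to run the question's preg_split() call with the modified pattern on that string. This is only an illustrative sketch; $flags and $pieces are names I made up:

```php
<?php
// Same flags as in the question's sentence_case() function.
$flags  = PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE;

// With ">" added to the delimiter class, every closing angle bracket
// ends a "sentence", so the inline tag is torn apart.
$pieces = preg_split('/([>.?!]+)/', 'Hello <em>there</em>', -1, $flags);

print_r($pieces);
// $pieces is ['Hello <em', '>', 'there</em', '>']
```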

This looks disturbingly like a case of "How can I use regexp to interpret (X)HTML?"

Try this:

function html_ucfirst($s) {
    // Group 1 captures any leading tags; the last group is the text after them.
    return preg_replace_callback('#^((<(.+?)>)*)(.*?)$#', function ($c) {
        return $c[1] . ucfirst(array_pop($c));
    }, $s);
}

and call the function like this:

$string= "<p>Lorem IPSUM is simply dummy text. LOREM ipsum is simply dummy text! wHAt is LOREM IPSUM? Hello lorem ipSUM!</p>";
echo html_ucfirst($string);
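Note that html_ucfirst() on its own only uppercases the first letter after the leading tags; it does not lowercase anything. One way to get full sentence case, assuming you keep the sentence_case() function from the question, is to chain the two. This is just a sketch of that combination, and it still does not handle text that follows a <br> in the middle of the string:

```php
<?php
// sentence_case() as posted in the question.
function sentence_case($string) {
    $sentences = preg_split('/([.?!]+)/', $string, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
    $new_string = '';
    foreach ($sentences as $key => $sentence) {
        $new_string .= ($key & 1) == 0
            ? ucfirst(strtolower(trim($sentence)))
            : $sentence . ' ';
    }
    return trim($new_string);
}

// html_ucfirst() as posted in this answer.
function html_ucfirst($s) {
    return preg_replace_callback('#^((<(.+?)>)*)(.*?)$#', function ($c) {
        return $c[1] . ucfirst(array_pop($c));
    }, $s);
}

$string = "<p>Lorem IPSUM is simply dummy text. LOREM ipsum is simply dummy text! wHAt is LOREM IPSUM? Hello lorem ipSUM!</p>";

// Sentence-case first, then repair the letter that follows the opening tag.
$result = html_ucfirst(sentence_case($string));
echo $result, PHP_EOL;
// The result starts with: <p>Lorem ipsum is simply dummy text. Lorem ipsum is simply dummy text!
```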

Here is a working demo: https://ideone.com/fNq3Vo

When parsing valid HTML, it is best practice to use a legitimate DOM parser. Regex is not reliable here because it cannot tell the difference between a tag and a substring that merely resembles a tag.

Code:

$html = <<<HTML
<p>Lorem IPSUM is simply dummy text.<br>Here is dummy text. LOREM ipsum is simply dummy text! wHAt is LOREM IPSUM? Hello lorem ipSUM!</p>
HTML;

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);
foreach($xpath->query('//text()') as $textNode) {
    $textNode->nodeValue = preg_replace_callback(
        '/(?:^|[.!?]) *\K[a-z]+/',
        function($m) {
            return ucfirst($m[0]);
        },
        strtolower($textNode->nodeValue)
    );
}
echo $dom->saveHTML();

Output:

<p>Lorem ipsum is simply dummy text.<br>Here is dummy text. Lorem ipsum is simply dummy text! What is lorem ipsum? Hello lorem ipsum!</p>

The above snippet does not:

  1. allow acronyms to remain all-caps (because the OP wants to convert all letters to lowercase before making select letters uppercase)
  2. properly handle multibyte characters (because the OP does not indicate this necessity)
  3. know the difference between a mid-sentence dot and a sentence-ending dot (due to ambiguity in English punctuation)
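If you do need multibyte support, a rough sketch of the text-node callback might look like the following. It assumes the mbstring extension is loaded and that the input is UTF-8; mb_sentence_case() is a hypothetical helper name of mine, not something from the snippet above, and it works on a plain text string rather than on markup:

```php
<?php
// Hypothetical helper: sentence-case a plain text string (no tags)
// in a multibyte-aware way. Assumes UTF-8 input and ext-mbstring.
function mb_sentence_case(string $text): string
{
    return preg_replace_callback(
        // \p{Ll} matches any lowercase letter; the u flag enables UTF-8 mode.
        '/(?:^|[.!?])\s*\K\p{Ll}/u',
        function (array $m): string {
            return mb_strtoupper($m[0], 'UTF-8');
        },
        mb_strtolower($text, 'UTF-8')
    );
}

$sample = mb_sentence_case('ÉCOLE is open. élan IS good!');
echo $sample, PHP_EOL;  // École is open. Élan is good!
```

The first and third limitations still apply to this version.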

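In HTML, with CSS (note that capitalize uppercases the first letter of every word, not only the first word of each sentence):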

 p.case { text-transform: capitalize; } 
 <p class="case">This is some text and usre.</p> 

