Capitalize the first letter of every sentence in a paragraph

Question

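I want to capitalize the first letter of the first word of every sentence in a paragraph (a str). The problem is that all the characters are lowercase.

I tried something like this: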

text = "here a long. paragraph full of sentences. what in this case does not work. i am lost" 
re.sub(r'(\b\. )([a-zA-z])', r'\1' (r'\2').upper(), text)

I expect something like this:

"Here a long. Paragraph full of sentences. What in this case does not work. I am lost."

Answer 1

You can use re.sub with a lambda:

import re
text = "here a long. paragraph full of sentences. what in this case does not work. i am lost"
# Raw string so that \w, \. and \s reach the regex engine unchanged
result = re.sub(r'(?<=^)\w|(?<=\.\s)\w', lambda x: x.group().upper(), text)

Output:

'Here a long. Paragraph full of sentences. What in this case does not work. I am lost'

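Regex explanation:

(?<=^)\w : matches an alphanumeric character that is preceded by the start of the string.

(?<=\.\s)\w : matches an alphanumeric character that is preceded by a period and one whitespace character.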
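One limitation is worth noting. Lookbehind in Python's re module must be fixed-width, so (?<=\.\s) only matches when exactly one whitespace character sits between the period and the next letter. The sketch below illustrates this; the helper name capitalize_sentences is made up for illustration and is not part of the answer.

```python
import re

def capitalize_sentences(text):
    # Hypothetical helper wrapping the pattern from this answer.
    # (?<=^)\w     -> word character at the very start of the string
    # (?<=\.\s)\w  -> word character right after a period plus ONE whitespace
    return re.sub(r'(?<=^)\w|(?<=\.\s)\w', lambda m: m.group().upper(), text)

single = capitalize_sentences("one space. works fine")
double = capitalize_sentences("two spaces.  are skipped")
print(single)
print(double)
```

With two spaces after the period, the letter is no longer directly preceded by a period and one whitespace character, so it stays lowercase.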

Answer 2

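You can use the regex ((?:^|\.\s)\s*)([a-z]), which does not rely on lookarounds. That makes it simpler and more portable, because lookarounds are sometimes unavailable in the regex dialect you are using; for example, lookbehind is supported in JavaScript as of ECMAScript 2018 but is not yet widely supported. Group 1 captures either zero or more whitespace characters at the start of the string, or a literal dot followed by one or more whitespace characters. Group 2 captures a lowercase letter with ([a-z]). A lambda expression then replaces the match with the text captured in group 1 followed by the uppercased letter from group 2. The code below uses the equivalent form (^\s*|\.\s+)([a-z]). Check this Python code: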

import re

arr = ['here a long.   paragraph full of sentences. what in this case does not work. i am lost',
       '   this para contains more than one space after period and also has unneeded space at the start of string.   here a long.   paragraph full of sentences.  what in this case does not work. i am lost']

for s in arr:
    print(re.sub(r'(^\s*|\.\s+)([a-z])', lambda m: m.group(1) + m.group(2).upper(), s))

Output:

Here a long.   Paragraph full of sentences. What in this case does not work. I am lost
   This para contains more than one space after period and also has unneeded space at the start of string.   Here a long.   Paragraph full of sentences.  What in this case does not work. I am lost

If you also want to get rid of the extra whitespace and reduce it to a single space, take \s* out of group 1 and use the regex ((?:^|\.\s))\s*([a-z]) with this updated Python code:

import re

arr = ['here a long.   paragraph full of sentences. what in this case does not work. i am lost',
       '   this para contains more than one space after period and also has unneeded space at the start of string.   here a long.   paragraph full of sentences.  what in this case does not work. i am lost']

for s in arr:
    print(re.sub(r'((?:^|\.\s))\s*([a-z])', lambda m: m.group(1) + m.group(2).upper(), s))

This prints the following, with the extra whitespace reduced to a single space, which is usually what you want:

Here a long. Paragraph full of sentences. What in this case does not work. I am lost
This para contains more than one space after period and also has unneeded space at the start of string. Here a long. Paragraph full of sentences. What in this case does not work. I am lost
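The two variants can be folded into one function as a rough sketch. The names capitalize_sentences and collapse_spaces are assumptions introduced here for illustration; only the two patterns come from the answer.

```python
import re

def capitalize_sentences(text, collapse_spaces=False):
    """Uppercase the first lowercase letter of each sentence.

    The function and parameter names are illustrative only.
    """
    if collapse_spaces:
        # Extra whitespace is matched outside group 1, so it is dropped.
        pattern = r'((?:^|\.\s))\s*([a-z])'
    else:
        # All whitespace is matched inside group 1, so it is preserved.
        pattern = r'(^\s*|\.\s+)([a-z])'
    return re.sub(pattern, lambda m: m.group(1) + m.group(2).upper(), text)

sample = '  here a long.   paragraph.  i am lost'
kept = capitalize_sentences(sample)
collapsed = capitalize_sentences(sample, collapse_spaces=True)
print(repr(kept))
print(repr(collapsed))
```

The only difference between the two branches is whether the optional whitespace sits inside or outside the first capture group, since only captured text is written back by the lambda.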

Also, if you are doing this with a PCRE-based regex engine, you can use \U in the replacement itself instead of a lambda function and simply replace with \1\U\2

Regex demo for the PCRE-based regex
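Python's built-in re module does not interpret \U in replacement templates, so \1\U\2 cannot be used there as-is. A small sketch, assuming Python 3.7 or later, where an unknown letter escape in the template raises re.error:

```python
import re

text = 'here a long. paragraph'
pattern = r'(^\s*|\.\s+)([a-z])'

try:
    # \U is a PCRE-style case-conversion escape; Python's re rejects it.
    re.sub(pattern, r'\1\U\2', text)
    supported = True
except re.error:
    supported = False

# The callable replacement is the Python counterpart of \1\U\2.
result = re.sub(pattern, lambda m: m.group(1) + m.group(2).upper(), text)
print(supported, result)
```

This is why both answers above pass a lambda to re.sub rather than a replacement string.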

Answer 1
6 2019-05-18 17:45:52

Answer 2
0 accepted 2019-05-18 18:23:28
