Capitalize each first word of a sentence in a paragraph
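I want to capitalize the first letter of the first word of every sentence in a whole paragraph (a str). The problem is that all of the characters are lowercase.
I tried something like this: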
text = "here a long. paragraph full of sentences. what in this case does not work. i am lost"
re.sub(r'(\b\. )([a-zA-z])', r'\1' (r'\2').upper(), text)
I expect something like this:
"Here a long. Paragraph full of sentences. What in this case does not work. I am lost."
You can use `re.sub` with a `lambda`:
import re
text = "here a long. paragraph full of sentences. what in this case does not work. i am lost"
result = re.sub(r'(?<=^)\w|(?<=\.\s)\w', lambda x: x.group().upper(), text)
Output:
'Here a long. Paragraph full of sentences. What in this case does not work. I am lost'
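The replacement has to be a callable here, and a small sketch may make the reason clearer. In the attempt from the question, `r'\1' (r'\2').upper()` has no `+` between the two strings, so Python tries to call a string and raises a `TypeError`. Even with the `+` added, `.upper()` runs on the template text before any match exists, so it changes nothing. The variable names below are illustrative only.

```python
import re

text = "here a long. paragraph"

# A string template is fixed before any match happens: upper-casing the
# template only upper-cases its literal characters, and r'\2' has no letters.
template = r'\1' + r'\2'.upper()
unchanged = re.sub(r'(\. )([a-z])', template, text)
print(unchanged)  # here a long. paragraph

# A callable runs once per match, so it receives the matched letter itself.
fixed = re.sub(r'(\. )([a-z])', lambda m: m.group(1) + m.group(2).upper(), text)
print(fixed)  # here a long. Paragraph
```

This simplified pattern ignores the first letter of the string; the `^` alternative in the answer covers that case.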
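Regex explanation:
`(?<=^)\w`: matches a word character (letter, digit, or underscore) at the very start of the string.
`(?<=\.\s)\w`: matches a word character that is preceded by a period and one whitespace character.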
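To see which characters the two alternatives select, one option is to list the matches with `re.finditer`. This is only an illustrative sketch on a shortened sample string; it also checks that `(?<=^)\w` behaves the same as the plainer `^\w` on this input.

```python
import re

text = "here a long. paragraph full of sentences. i am lost"
pattern = re.compile(r'(?<=^)\w|(?<=\.\s)\w')

# Collect (position, character) for every character the pattern selects.
hits = [(m.start(), m.group()) for m in pattern.finditer(text)]
print(hits)  # [(0, 'h'), (13, 'p'), (42, 'i')]

# On this input, a plain ^ anchor gives the same result as the lookbehind.
with_lookbehind = pattern.sub(lambda m: m.group().upper(), text)
with_anchor = re.sub(r'^\w|(?<=\.\s)\w', lambda m: m.group().upper(), text)
print(with_lookbehind == with_anchor)  # True
```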
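You can also use the regex `((?:^|\.\s)\s*)([a-z])`, which does not rely on lookarounds. Lookarounds are not available in every regex dialect, so this form is simpler and more portable; for example, lookbehind was added to JavaScript in ECMAScript 2018 but is not yet widely supported. Group 1 captures either zero or more whitespace characters at the start of the string, or a literal dot `.` followed by one or more whitespace characters. Group 2, `([a-z])`, captures the lowercase letter that follows. A lambda expression then replaces each match with the text captured by group 1 followed by the group 2 letter in uppercase. The code below writes group 1 in the equivalent form `(^\s*|\.\s+)`. Check this Python code: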
import re
arr = ['here a long. paragraph full of sentences. what in this case does not work. i am lost',
' this para contains more than one space after period and also has unneeded space at the start of string. here a long. paragraph full of sentences. what in this case does not work. i am lost']
for s in arr:
print(re.sub(r'(^\s*|\.\s+)([a-z])', lambda m: m.group(1) + m.group(2).upper(), s))
Output:
Here a long. Paragraph full of sentences. What in this case does not work. I am lost
This para contains more than one space after period and also has unneeded space at the start of string. Here a long. Paragraph full of sentences. What in this case does not work. I am lost
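As a small sketch of what each group captures, the snippet below runs the same pattern on a short made-up string with extra spaces. Because group 1 is written back unchanged, all of the original whitespace survives.

```python
import re

s = "  first.   second"
pattern = re.compile(r'(^\s*|\.\s+)([a-z])')

# Group 1 holds the leading whitespace or the ". " separator,
# group 2 holds the letter that will be upper-cased.
groups = [(m.group(1), m.group(2)) for m in pattern.finditer(s)]
print(groups)  # [('  ', 'f'), ('.   ', 's')]

result = pattern.sub(lambda m: m.group(1) + m.group(2).upper(), s)
print(repr(result))  # '  First.   Second'
```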
And if you want to get rid of the extra whitespace and reduce it to a single space, just take `\s*` out of group 1 and use the regex `((?:^|\.\s))\s*([a-z])` with this updated Python code:
import re
arr = ['here a long. paragraph full of sentences. what in this case does not work. i am lost',
' this para contains more than one space after period and also has unneeded space at the start of string. here a long. paragraph full of sentences. what in this case does not work. i am lost']
for s in arr:
print(re.sub(r'((?:^|\.\s))\s*([a-z])', lambda m: m.group(1) + m.group(2).upper(), s))
This gives the following output, where the extra whitespace is reduced to a single space, which is often what you want:
Here a long. Paragraph full of sentences. What in this case does not work. I am lost
This para contains more than one space after period and also has unneeded space at the start of string. Here a long. Paragraph full of sentences. What in this case does not work. I am lost
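The whitespace-trimming variant could be wrapped in a small helper, sketched below. The function name `capitalize_sentences` is hypothetical and not part of the original answer. Only the first whitespace character after a period is kept, so a tab or newline in that position stays as it is rather than becoming a space.

```python
import re

def capitalize_sentences(paragraph: str) -> str:
    """Hypothetical helper: upper-case sentence starts and trim extra whitespace."""
    return re.sub(
        r'((?:^|\.\s))\s*([a-z])',
        # Group 1 is "" at the start of the string, or "." plus one whitespace char.
        lambda m: m.group(1) + m.group(2).upper(),
        paragraph,
    )

cleaned = capitalize_sentences("  first.   second.\tthird")
print(repr(cleaned))  # 'First. Second.\tThird'
```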
Also, if you are doing this with a PCRE-based regex engine, you can use `\U` in the replacement string itself instead of a lambda function and simply replace with `\1\U\2`.
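For comparison, Python's built-in `re` module does not accept `\U` in a replacement template (since Python 3.7, unknown letter escapes in the template raise an error), which is why the answers above use a callable. A minimal sketch of that behaviour:

```python
import re

text = "here a long. paragraph"
pattern = r'(^\s*|\.\s+)([a-z])'

# The \U case-conversion escape is rejected by Python's replacement templates.
try:
    re.sub(pattern, r'\1\U\2', text)
    supported = True
except re.error:
    supported = False
print(supported)  # False

# The callable form is the equivalent in Python.
result = re.sub(pattern, lambda m: m.group(1) + m.group(2).upper(), text)
print(result)  # Here a long. Paragraph
```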