Bash script to find sentences

Question

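I want a script that, given text containing multiple sentences on standard input, writes each sentence to standard output on its own line. That means it should print only those parts that begin with an uppercase letter and end with exactly one of these punctuation marks: period, exclamation mark, or question mark.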

Example:

Standard input:

This is the first sentence. This is the second sentence! Is this the third sentence? this is not a sentence

Standard output:

This is the first sentence.
This is the second sentence!
Is this the third sentence?

while read -r INPUT
do
    if [[ "$SENFLAG" == "1" ]]
    then
        echo "$INPUT" | grep -o '[[:alpha:]][^ ]*[A-Z][^ ]*' 
    fi
done

I tried using grep, but I'm not sure how to take it further.

Answer 1

grep -Eo '[A-Z][^.!?]*[.!?]' input_file
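
Since the question asks for input on standard input, the same pattern can be used without a file name. The following is a minimal sketch, assuming a grep that supports `-o` (GNU and BSD grep both do). The function name `print_sentences` and the `LC_ALL=C` setting are additions made here for illustration and are not part of the original answer.

```shell
#!/usr/bin/env bash
# print_sentences is a hypothetical helper name used only for this sketch.
# It reads text from standard input and prints each match on its own line.
print_sentences() {
    # [A-Z]     an uppercase ASCII letter starts the sentence
    # [^.!?]*   any run of characters that are not sentence-ending punctuation
    # [.!?]     a period, exclamation mark, or question mark ends it
    # LC_ALL=C (an assumption of this sketch) keeps [A-Z] limited to ASCII uppercase.
    LC_ALL=C grep -Eo '[A-Z][^.!?]*[.!?]'
}

printf '%s\n' 'This is the first sentence. This is the second sentence! Is this the third sentence? this is not a sentence' | print_sentences
```

One caveat: the pattern looks for an uppercase letter anywhere in the line, so a fragment such as "this is Not a sentence." would still yield "Not a sentence." as a match.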

Answer 2

Here is one way to do it with sed. It is not a short command, but I think it is easier to understand.

sed -e 's/![[:space:]]/!\n/g' \
-e 's/?[[:space:]]/?\n/g' \
-e 's/\.[[:space:]]/.\n/g' | \
grep -v '^[[:lower:]]'
This is the first sentence.
This is the second sentence!
Is this the third sentence?

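Explanation:

First, the sed command finds each punctuation mark that is followed by whitespace (the [[:space:]] class) and replaces the pair with the same punctuation mark and a newline (\n). Finally, grep looks at all the lines and removes those that begin with a lowercase letter.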
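
The two steps described above can be sketched as a single pipeline that reads standard input. This is a variant under stated assumptions, not the answer's exact command: it uses one sed expression with a bracket expression covering all three punctuation marks, it relies on GNU sed treating `\n` in the replacement as a newline, and it swaps `grep -v` for a positive filter that also requires each line to end in punctuation. The function name `split_sentences` is hypothetical.

```shell
#!/usr/bin/env bash
# split_sentences is a hypothetical helper name used only for this sketch.
split_sentences() {
    # Step 1: after ., ! or ? followed by whitespace, break the line.
    # Assumption: GNU sed, which interprets \n in the replacement as a newline.
    sed -E 's/([.!?])[[:space:]]+/\1\n/g' |
    # Step 2: keep only lines that start with an uppercase letter
    # and end with a period, exclamation mark, or question mark.
    grep -E '^[[:upper:]].*[.!?]$'
}

printf '%s\n' 'This is the first sentence. This is the second sentence! Is this the third sentence? this is not a sentence' | split_sentences
```

Because the split happens at every punctuation mark followed by whitespace, abbreviations such as "Dr. Smith" would also be broken across lines; handling those is outside the scope of this sketch.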

Answer 1
1 accepted 2021-10-11 04:31:13

Answer 2
1 2021-10-11 13:04:01
