Extract sentences with capitalized words other than first word

Say I have a dataframe:

import pandas as pd

df = pd.DataFrame({'col1': ['Hello, world. Good day','My name is Bob. Call Me','good evening','yep. stack Overflow.',"Ain't McDonald Yo"]})

                      col1
0   Hello, world. Good day
1  My name is Bob. Call Me
2             good evening
3     yep. stack Overflow.
4        Ain't McDonald Yo

I'm trying to extract, from each row, the sentences that contain a capitalized word other than their first word. Sentences are separated by a period.

Expected output:

                      col1                     col2
0   Hello, world. Good day                      NaN
1  My name is Bob. Call Me  My name is Bob. Call Me
2             good evening                      NaN
3     yep. stack Overflow.           stack Overflow
4        Ain't McDonald Yo        Ain't McDonald Yo

Try:

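import numpy as np

# Keep each sentence that has a capitalized word after its first word;
# rows with no such sentence become NaN.
df["col2"] = df["col1"].apply(
    lambda x: ".".join(
        [
            sentence
            for sentence in x.split(".")  # sentences are period-separated
            if any(word[0].isupper() for word in sentence.split()[1:])
        ]
    )
    or np.nan  # an empty join result is falsy, so fall back to NaN
)
print(df)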

Prints:

                      col1                     col2
0   Hello, world. Good day                      NaN
1  My name is Bob. Call Me  My name is Bob. Call Me
2             good evening                      NaN
3     yep. stack Overflow.           stack Overflow
4        Ain't McDonald Yo        Ain't McDonald Yo
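
If the one-liner is hard to read, the same check can be written as a named function. This is only a sketch of the logic above, not a different method: the name `capitalized_sentences` is made up for illustration. Unlike the lambda, it strips the whitespace around each kept sentence and rejoins with ". ", so a row like 'yep. stack Overflow.' gives 'stack Overflow' with no leading space.

```python
import math


def capitalized_sentences(text):
    """Return the sentences of `text` that contain a capitalized word
    after their first word, or NaN when no sentence qualifies.

    Hypothetical helper name; sentences are assumed to be separated by "."
    """
    kept = []
    for sentence in text.split("."):
        words = sentence.split()
        # Ignore the first word; look for an uppercase initial in the rest.
        if any(word[0].isupper() for word in words[1:]):
            kept.append(sentence.strip())
    return ". ".join(kept) if kept else math.nan


rows = [
    "Hello, world. Good day",
    "My name is Bob. Call Me",
    "good evening",
    "yep. stack Overflow.",
    "Ain't McDonald Yo",
]
for row in rows:
    print(repr(row), "->", repr(capitalized_sentences(row)))
```

With the dataframe from the question it could be applied as df["col2"] = df["col1"].apply(capitalized_sentences).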
