Extract sentences with capitalized words other than first word
Say I have a dataframe:
import pandas as pd
df = pd.DataFrame({'col1': ['Hello, world. Good day','My name is Bob. Call Me','good evening','yep. stack Overflow.',"Ain't McDonald Yo"]})
                      col1
0   Hello, world. Good day
1  My name is Bob. Call Me
2             good evening
3     yep. stack Overflow.
4        Ain't McDonald Yo
I'm trying to extract, from each row, the sentences that contain a capitalized word other than the first word. Sentences are separated by a period.
Desired output:
                      col1                     col2
0   Hello, world. Good day                      NaN
1  My name is Bob. Call Me  My name is Bob. Call Me
2             good evening                      NaN
3     yep. stack Overflow.           stack Overflow
4        Ain't McDonald Yo        Ain't McDonald Yo
Try:
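import numpy as np

df["col2"] = df["col1"].apply(
    lambda x: ".".join(
        [
            sentence
            for sentence in x.split(".")
            # keep a sentence if any word after its first starts with an uppercase letter
            if any(word[0].isupper() for word in sentence.split()[1:])
        ]
    )
    or np.nan  # an empty string (no matching sentence) becomes NaN
)
print(df)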
Prints:
                      col1                     col2
0   Hello, world. Good day                      NaN
1  My name is Bob. Call Me  My name is Bob. Call Me
2             good evening                      NaN
3     yep. stack Overflow.           stack Overflow
4        Ain't McDonald Yo        Ain't McDonald Yo
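If the one-liner is hard to read, the same logic can be pulled into a named function. This is only a sketch: the helper name sentences_with_inner_capital is made up for illustration. Unlike the lambda above, it strips the whitespace around each kept sentence and rejoins with ". ", so a value such as " stack Overflow" comes back without the leading space.

```python
import numpy as np
import pandas as pd


def sentences_with_inner_capital(text):
    """Hypothetical helper: keep sentences that have a capitalized word after the first word."""
    kept = []
    for sentence in text.split("."):
        words = sentence.split()
        # Skip the first word; keep the sentence if any later word starts with an uppercase letter.
        if any(word[0].isupper() for word in words[1:]):
            kept.append(sentence.strip())
    # No matching sentence: return NaN, as the lambda version does.
    return ". ".join(kept) if kept else np.nan


df = pd.DataFrame({
    "col1": [
        "Hello, world. Good day",
        "My name is Bob. Call Me",
        "good evening",
        "yep. stack Overflow.",
        "Ain't McDonald Yo",
    ]
})
df["col2"] = df["col1"].apply(sentences_with_inner_capital)
print(df)
```

Note that both versions split on every period, so abbreviations such as "e.g." would be treated as sentence boundaries.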