Combine strings in list to form words starting with capital letter
NLP newbie here. I have a list of strings, and I would like to combine them so that each resulting string starts with a capital letter. What is the most efficient way to do so?
Here is the list:
[' Ye', 'oks', 'am', '-', 'd', 'ong', ' Gang', 'nam', '-', 'gu Seoul', ' Korea']
Desired output:
['Yeoksam-dong', 'Gangnam-gu Seoul', 'Korea']
['Yeoksam-dong', 'Gangnam-gu', 'Seoul', 'Korea'] is also fine.
This is the solution I'm working to improve:
places = [' Ye', 'oks', 'am', '-', 'd', 'ong', ' Gang', 'nam', '-', 'gu Seoul', ' Korea']
places_words = []  # completed words
temp = []          # fragments of the word currently being built
for fragment in places:
    loc = " ".join(fragment.split())  # normalize whitespace
    # a capitalized fragment starts a new word, so flush the previous one
    if loc[0].isupper() and temp:
        places_words.append("".join(temp))
        temp = []
    temp.append(loc)
# flush the last word
if temp:
    places_words.append("".join(temp))
print(places_words)
One approach:
from itertools import tee

def pairwise(iterable):
    # pairwise('ABCDEFG') --> AB BC CD DE EF FG
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

data = [' Ye', 'oks', 'am', '-', 'd', 'ong', ' Gang', 'nam', '-', 'gu Seoul', ' Korea']

# find the indices of the fragments that are capitalized
indices = [i for i, e in enumerate(data) if e.strip()[0].isupper()] + [len(data)]

# iterate pairwise and join the strings
res = ["".join(data[start:end]).strip() for start, end in pairwise(indices)]
print(res)
Output
['Yeoksam-dong', 'Gangnam-gu Seoul', 'Korea']
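Since the indices are already in a list, the same pairing can probably be done without the helper by zipping the list with a copy of itself shifted by one. The following is only a sketch of that variation: the function name join_capitalized is made up for illustration, and the [:1] slice (which skips empty fragments instead of raising an IndexError) is an addition that is not in the code above.

```python
def join_capitalized(fragments):
    """Join fragments into words, starting a new word at each capitalized fragment.

    Illustrative helper (hypothetical name). Fragments that come before the
    first capitalized one are dropped, as in the approach above.
    """
    # positions of fragments whose first non-space character is uppercase;
    # [:1] yields '' for empty fragments, and ''.isupper() is False
    starts = [i for i, e in enumerate(fragments) if e.strip()[:1].isupper()]
    # the end of the list acts as the final boundary
    bounds = starts + [len(fragments)]
    # zip(bounds, bounds[1:]) pairs each start with the next start
    return ["".join(fragments[s:e]).strip() for s, e in zip(bounds, bounds[1:])]


data = [' Ye', 'oks', 'am', '-', 'd', 'ong', ' Gang', 'nam', '-', 'gu Seoul', ' Korea']
print(join_capitalized(data))
# ['Yeoksam-dong', 'Gangnam-gu Seoul', 'Korea']
```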
An alternative using more_itertools:
from more_itertools import split_before
data = [' Ye', 'oks', 'am', '-', 'd', 'ong', ' Gang', 'nam', '-', 'gu Seoul', ' Korea']
chunks = split_before(map(str.strip, data), lambda e: e[0].isupper())
res = ["".join(chunk).strip() for chunk in chunks]
print(res)
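If the second output from the question (with 'Seoul' as its own item) is acceptable, there may be an even simpler route that ignores capitalization altogether. The sketch below assumes that a fragment carries a leading space exactly when it begins a new word, which holds for this list but is not guaranteed for other tokenizer output.

```python
data = [' Ye', 'oks', 'am', '-', 'd', 'ong', ' Gang', 'nam', '-', 'gu Seoul', ' Korea']

# Concatenate all fragments, then split on runs of whitespace.
# Assumption: whitespace inside the fragments marks the word boundaries.
words = "".join(data).split()
print(words)
# ['Yeoksam-dong', 'Gangnam-gu', 'Seoul', 'Korea']
```

This relies only on str.join and str.split, so it needs no imports, but it would split on every space, including ones inside a multi-word name.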