Merge an element with the previous one if it ends with '\n'

Question

I have very messy data, and I noticed a pattern: whenever an element ends with '\n', it needs to be merged with the single element before it.

Sample list:

ls = ['hello','world \n','my name','is john \n','How are you?','I am \n doing well']
ls

What I tried:

print([s for s in ls if "\n" in s[-1]])
>>>    ['world \n', 'is john \n'] # gave elements that ends with \n

How do I take each element that ends with '\n' and merge it with the one element before it? I am looking for output like this:

['hello world \n', 'my name is john \n', 'How are you?','I am \n doing well']

Answer 1

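I wrote this out so that it is easy to understand, rather than trying to make it more complex with something like a list comprehension.

This works for any number of words up to each '\n' character, and it also picks up whatever input is left over at the end.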

ls_out = []            # your outgoing ls
out = ''               # keeps your words to use
for i in range(0, len(ls)):
    if ls[i].endswith('\n'):   # check for the ending word, if so, add it to output and reset
        out += ls[i]
        ls_out.append(out)
        out = ''
    else:                # otherwise add to your current word list
        out += ls[i]
if out:   # check for remaining words in out if total ls doesn't end with \n
    ls_out.append(out)

You may need to add a space when concatenating the strings, but I am guessing that is just your example. If you do, make the following edit:

        out += (' ' if out else '') + ls[i]   # space only between words, not at the start
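One way to stop worrying about where the spaces go is to collect the pieces in a list and join them once per group. This is only a minimal sketch, assuming the same "merge everything up to each trailing '\n'" behaviour as the loop above; the names merge_until_newline and parts are made up for illustration.

```python
def merge_until_newline(items):
    """Join consecutive strings into one group, closing a group at each
    string that ends with a newline (hypothetical helper)."""
    merged = []
    parts = []                      # pieces of the group being built
    for s in items:
        parts.append(s)
        if s.endswith('\n'):        # this string closes the current group
            merged.append(' '.join(parts))
            parts = []
    if parts:                       # leftover pieces with no closing '\n'
        merged.append(' '.join(parts))
    return merged

ls = ['hello', 'world \n', 'my name', 'is john \n', 'How are you?', 'I am \n doing well']
print(merge_until_newline(ls))
# ['hello world \n', 'my name is john \n', 'How are you? I am \n doing well']
```

Because nothing after 'is john \n' ends with a newline, the last two strings land in one group, which differs from the output asked for in the question.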

Edit:
If you only want to grab one preceding element instead of several, you can do this:

ls_out = []
for i in range(len(ls)):
    if ls[i].endswith('\n'):                          # check ending only
        if i > 0 and not ls[i-1].endswith('\n'):      # previous string is still unmerged
            out = ls[i-1] + ' ' + ls[i]               # concatenate together
        else:
            out = ls[i]                               # first element, or previous one also ends with \n
    elif i + 1 < len(ls) and ls[i+1].endswith('\n'):  # next one will grab this so skip
        continue
    else:
        out = ls[i]                                   # next one won't so add this one in
    ls_out.append(out)

Answer 2

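If you want to reduce a list, perhaps one readable approach is to use the reduce function.

functools.reduce(func, iter, [initial_value]) performs a cumulative operation over all elements of the iterable, so it cannot be applied to an infinite iterable.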
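As a small illustration of how reduce threads an accumulator through an iterable, here is a sketch with a hypothetical add function; it is not part of the answer's solution, just the general mechanism.

```python
from functools import reduce

def add(acc, x):
    # acc is the running result so far, x is the next element
    return acc + x

print(reduce(add, [1, 2, 3, 4], 0))      # 10, computed as ((((0 + 1) + 2) + 3) + 4)
print(reduce(add, [], 0))                # 0, an empty input returns the initial value
print(reduce(add, ['a', 'b', 'c'], ''))  # abc, the same idea works for strings
```

The initial value also decides the type of the accumulator, which is why the answer can pass a tuple there.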

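First, you need some way to accumulate the result. I use a tuple with two elements: a buffer holding the text that has not yet been added to the result, kept until a '\n' is found, and the result list. See the initial structure (1).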

from functools import reduce

ls = ['hello','world \n','my name','is john \n','How are you?','I am \n doing well']

def combine(x,y):
    buf, out = x
    if y.endswith('\n'):
        return ( "", out+[buf+" "+y if buf else y] )  #<-- buffer and y to list
    else:
        return ( y, out+[buf] if buf else out )       #<-- flush old buffer, y on buffer

t=reduce( combine, ls, ("",[]) ) #<-- see initial struct (1)
print(t[1]+[t[0]] if t[0] else t[1]) #<-- add buffer if not empty

Result:

['hello world \n', 'my name is john \n', 'How are you?', 'I am \n doing well']

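(1) The initial structure explained: you use a tuple to store the buffer string that is held back until a '\n' arrives, together with the list of finished strings: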

("",[])

This means:

("__ buffer string not yet added to list __", [ __result list ___ ] )
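To see how the (buffer, result list) tuple evolves, the intermediate states can be printed with itertools.accumulate, which takes the same kind of two-argument function as reduce (the initial keyword needs Python 3.8 or later). This is a sketch under an assumption: the step function below is a hypothetical variant that holds back at most one string, so it may differ in detail from the combine function in the answer.

```python
from itertools import accumulate

def step(state, s):
    buf, done = state                      # buffer string, finished strings
    if s.endswith('\n'):
        # merge the held-back string (if any) with s and move it to the list
        return ('', done + [buf + ' ' + s if buf else s])
    # s does not end a group: flush the old buffer and hold s back instead
    return (s, done + [buf] if buf else done)

ls = ['hello', 'world \n', 'my name', 'is john \n', 'How are you?', 'I am \n doing well']
states = list(accumulate(ls, step, initial=('', [])))
for state in states:
    print(state)
```

The first printed state is the initial structure ('', []), and the last one is what reduce would return for the same function and input.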

Answer 3

You can solve it with regular expressions, using the "re" module.

import re
ls = ['hello','world \n','my name','is john \n','How are you?','I am \n doing well']
new_ls = []
ends_nl = re.compile(r"\n\Z")              # \n as the very last character of the word
for i in range(len(ls)):
    word = str(ls[i])
    if ends_nl.search(word):               # matching the \n at the end of the word
        if i > 0 and not ends_nl.search(str(ls[i-1])):
            new_ls[-1] = new_ls[-1] + ' ' + word   # appending to the previous word
        else:
            new_ls.append(word)            # first word, or the previous word already ended with \n
    else:
        new_ls.append(word)                # no trailing \n (a \n elsewhere in the word does not count)
print(new_ls)

This returns the output:

['hello world \n', 'my name is john \n', 'How are you?', 'I am \n doing well']
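One detail worth knowing when matching "at the end" with the re module: `$` also matches just before a trailing newline, while `\Z` matches only at the very end of the string. The short sketch below shows the difference on a made-up string, plus the equivalent check without a regex.

```python
import re

s = 'doing well\n'
print(bool(re.search(r'well$', s)))    # True: $ also matches just before a trailing newline
print(bool(re.search(r'well\Z', s)))   # False: \Z matches only at the very end
print(bool(re.search(r'\n\Z', s)))     # True: the string really ends with \n
print(s.endswith('\n'))                # True: same check with a plain string method
```

For a fixed suffix such as '\n', str.endswith is usually the simpler choice; the regex becomes useful when the ending is a pattern rather than a literal.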

Answer 4

Assuming the first element does not end with \n and all words are longer than 2 characters:

res = []
for el in ls:
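  if el.endswith("\n"):            # a 2-character slice can never equal the 1-character "\n"
    res[-1] = res[-1] + " " + el   # merge into the previous entry, separated by a space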
  else:
    res.append(el)
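If those assumptions cannot be guaranteed, the same idea can be wrapped in a function that tracks whether the last entry is still available for merging. This is a minimal sketch, not the answer's own code; the names merge_with_previous and prev_open are hypothetical.

```python
def merge_with_previous(items):
    """Merge each string ending in a newline into the string before it
    (hypothetical helper; at most one predecessor is merged)."""
    res = []
    prev_open = False   # True while res[-1] is a plain string that may still be merged into
    for el in items:
        if el.endswith('\n'):
            if prev_open:
                res[-1] = res[-1] + ' ' + el   # merge into the previous entry
            else:
                res.append(el)                 # nothing suitable to merge with
            prev_open = False
        else:
            res.append(el)
            prev_open = True
    return res

print(merge_with_previous(['hello', 'world \n', 'my name', 'is john \n', 'How are you?', 'I am \n doing well']))
# ['hello world \n', 'my name is john \n', 'How are you?', 'I am \n doing well']
print(merge_with_previous(['first \n', 'a', 'b \n', 'c \n']))
# ['first \n', 'a b \n', 'c \n']
print(merge_with_previous([]))
# []
```

The flag keeps a leading 'first \n' and a second consecutive newline-terminated string from being glued onto entries that were already completed.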

Answer 5

Try this:

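lst=[]
for i in range(len(ls)):
    # merge only when there is a previous entry that has not been closed by \n
    if ls[i].endswith("\n") and lst and not lst[-1].endswith("\n"):
        lst[-1] = lst[-1] + ' ' + ls[i]   # replace the previous entry in place
    else:
        lst.append(ls[i])
lst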

Result:

['hello world \n', 'my name is john \n', 'How are you?', 'I am \n doing well']

5 answers

Answer 1
2 2019-05-08 20:08:54

Answer 2
2 Accepted 2019-05-08 20:48:11

Answer 3
1 2019-05-08 21:02:00

Answer 4
0 2019-05-08 20:09:29

Answer 5
0 2019-05-08 20:19:59
