How do I remove strings of a certain length from a nested list?

Question

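I have a nested list of strings: a corpus made up of lists of different lengths. I want to keep only the strings whose length is greater than 2.

This is similar to the question "How to remove elements from a nested list?". I tried every answer there that would let me specify the condition length > 2.

Code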

corpus = list(r_corpus('teeny.txt'))
print('initial corpus here ',corpus)

#Current attempt
[[ subelt for subelt in elt if len(subelt) >2 ] for elt in corpus] 

#previous attempt 1
##for thing in corpus:
##    [y for y in thing if len(y)>2]

#previous attempt 2
##for sentence in corpus:
##    sentence = [x for x in sentence if len(x) > 2 ]

print('\n\n corpus here without any string of length 2 or smaller',corpus)

This is the output of the current attempt, which is identical to the output of the two previous attempts.

initial corpus here

[['extracting', 'opinions'],
['soo', 'min', 'kim', 'and'],
['abstract'],
['this', 'paper', 'presents', 'method', 'for', 'identifying', 'an'], 
['this', 'section', 'reviews', 'previous', 'works', 'in'], 
['subjectivity', 'detection', 'is'], 
['work', 'is', 'similar', 'to', 'ours', 'but', 'different']]

corpus here without any string of length 2 or smaller

[['extracting', 'opinions'],
['soo', 'min', 'kim', 'and'], 
['abstract'], 
['this', 'paper', 'presents', 'method', 'for', 'identifying', 'an'], 
['this', 'section', 'reviews', 'previous', 'works', 'in'], 
['subjectivity', 'detection', 'is'], 
['work', 'is', 'similar', 'to', 'ours', 'but', 'different']]

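What I need

The fastest way to get a second version of the corpus without any string of length 2 or smaller:

corpus without any string of length 2 or smaller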

[['extracting', 'opinions'], 
['soo', 'min', 'kim', 'and'], 
['abstract'], 
['this', 'paper', 'presents', 'method', 'for', 'identifying'], 
['this', 'section', 'reviews', 'previous', 'works'],
['subjectivity', 'detection'],
['work','similar','ours', 'but', 'different']]

Thanks.

Answer 1

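@Vera, you can try the code below. It uses lambda functions, map(), and filter().

List comprehensions, lambda functions, map(), filter(), reduce(), and similar tools are a Pythonic way to solve problems like this more easily, efficiently, and concisely.

You can look up list comprehensions and map(), filter(), reduce(), and lambda functions to find worked examples and explanations of these concepts.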

import json

corpus = [['extracting', 'opinions'],
['soo', 'min', 'kim', 'and'],
['abstract'],
['this', 'paper', 'presents', 'method', 'for', 'identifying', 'an'], 
['this', 'section', 'reviews', 'previous', 'works', 'in'], 
['subjectivity', 'detection', 'is'], 
['work', 'is', 'similar', 'to', 'ours', 'but', 'different']]

new_corpus = list( map(lambda words: list(filter(lambda word: len(word)> 2, words)), corpus))

# Pretty printing list of lists of words of length > 2
print(json.dumps(new_corpus, indent=2))

"""
[
  [
    "extracting",
    "opinions"
  ],
  [
    "soo",
    "min",
    "kim",
    "and"
  ],
  [
    "abstract"
  ],
  [
    "this",
    "paper",
    "presents",
    "method",
    "for",
    "identifying"
  ],
  [
    "this",
    "section",
    "reviews",
    "previous",
    "works"
  ],
  [
    "subjectivity",
    "detection"
  ],
  [
    "work",
    "similar",
    "ours",
    "but",
    "different"
  ]
]
"""
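
The nested list comprehension from the question gives the same result. The attempts there most likely print the original corpus because the filtered list is built and then thrown away: nothing is ever assigned back to corpus. Here is a minimal sketch, assuming a shortened copy of the sample data; the name `filtered` is mine and does not appear in the original post.

```python
# Sample data copied from the question (shortened to three sentences).
corpus = [
    ['soo', 'min', 'kim', 'and'],
    ['subjectivity', 'detection', 'is'],
    ['work', 'is', 'similar', 'to', 'ours', 'but', 'different'],
]

# The comprehension builds a NEW nested list and leaves corpus untouched,
# so the result has to be bound to a name (here the assumed name `filtered`).
filtered = [[word for word in sentence if len(word) > 2] for sentence in corpus]
print(filtered)

# To change the existing inner lists in place instead, assign to a full slice.
# A plain `sentence = [...]` inside the loop would only rebind the loop
# variable and leave corpus unchanged.
for sentence in corpus:
    sentence[:] = [word for word in sentence if len(word) > 2]
print(corpus)
```

Use the first form if you want to keep the original corpus around, and the slice assignment if other code holds references to the same inner lists and should see the change.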

Answer 1
0 2018-05-26 20:41:11