
How to split a string of multiple words into a list with strings of a certain number of words?

I have a string containing characters from multiple languages:

'죄송합니다 how are you doing? My name is Yudhiesh and I am 아니 doing good 저기요'

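I am trying to chunk this single string into a list of strings based on the number of words in it. With a chunk size of 7, the result should look like this, with at most 7 words in each string: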

['죄송합니다 how are you doing? My name', 'is Yudhiesh and I am 아니 doing', 'good 저기요']

My current attempt, based on "How do you split a list into chunks", does not work:

>>> s = '죄송합니다 how are you doing? My name is Yudhiesh and I am 아니 doing good 저기요'
>>> parts = [str(s[i:i+7]) for i in range(0, len(s), 7)]
>>> parts
['죄송합니다 h', 'ow are ', 'you doi', 'ng? My ', 'name is', ' Yudhie', 'sh and ', 'I am 아니', ' doing ', 'good 저기', '요']

First, create a list of words, then build the chunks and join each one back into a string.

Here is what you need, wrapped in a function:

def split_max_num(string, max_words):
    """
    >>> split_max_num('죄송합니다 how are you doing? My name is Yudhiesh and I am 아니 doing good 저기요', 7)
    ['죄송합니다 how are you doing? My name', 'is Yudhiesh and I am 아니 doing', 'good 저기요']
    """
    words = string.split()
    len_words = len(words)

    res = list()
    for index in range(0, len_words, max_words):
        res.append(' '.join(words[index:index+max_words]))
    return res
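If the words come from a larger source and you would rather not build every chunk up front, the same idea can be written as a generator. This is only a sketch, not part of the answer above: the name iter_word_chunks is hypothetical, and the explicit check on max_words is an added assumption (the range-based version raises ValueError on a step of 0 and silently returns an empty list for a negative step).

```python
from itertools import islice
from typing import Iterable, Iterator


def iter_word_chunks(words: Iterable[str], max_words: int) -> Iterator[str]:
    """Yield strings of at most max_words words each (hypothetical helper)."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    it = iter(words)
    while True:
        # Take the next max_words items; an empty list means we are done.
        chunk = list(islice(it, max_words))
        if not chunk:
            return
        yield ' '.join(chunk)


text = '죄송합니다 how are you doing? My name is Yudhiesh and I am 아니 doing good 저기요'
chunks = list(iter_word_chunks(text.split(), 7))
print(chunks)
```

For a string that already fits in memory, the list-based function above is simpler and just as good.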

How about the following?

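def split_max(words, n):
    # Split on whitespace, group the words n at a time, then re-join each group.
    words = words.split()
    words = [words[i:i + n] for i in range(0, len(words), n)]
    return [' '.join(l) for l in words]


data = '죄송합니다 how are you doing? My name is Yudhiesh and I am 아니 doing good 저기요'
split_max(data, 7)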

You are converting the list to the string representation of the list.

I think you meant to re-join the words:

s = '죄송합니다 how are you doing? My name is Yudhiesh and I am 아니 doing good 저기요'
s = s.split()
print([" ".join(s[i:i+7]) for i in range(0, len(s), 7)]) 
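To make the difference concrete: slicing a str counts characters, while slicing the list returned by split() counts words. A minimal sketch with a shorter string (the variable names are only for illustration):

```python
s = 'how are you doing today'

# Slicing the string itself takes 7 characters.
char_slice = s[0:7]

# Slicing the list of words takes up to 7 words (only 5 exist here).
word_slice = s.split()[0:7]

print(repr(char_slice))  # 'how are'
print(word_slice)        # ['how', 'are', 'you', 'doing', 'today']
```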

Use .split() on the string and chunk the result:

from typing import List

def chunk_list(lst: List[str], chunk_size: int) -> List[List[str]]: 
    return [
        lst[i:i + chunk_size] 
        for i in range(0, len(lst), chunk_size)
    ]
     

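def chunk_string(string: str, chunk_size: int) -> List[str]:
    # Join each chunk of words back into one string so the result is a List[str].
    return [' '.join(chunk) for chunk in chunk_list(string.split(), chunk_size)]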

Are you sure you didn't set s=['죄송합니다 how are you doing? My name', 'is 도와 주세요 and I am doing', 'good 저기요'] before producing parts in this example?

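You probably want to split your original string with "somestring".split(" "), i.e. split on spaces to get a list of all the words. Then you can slice that list by index, the way you were trying to do.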
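One caveat with split(" "): it splits on every single space, so repeated spaces produce empty strings and other whitespace such as newlines stays attached to the words. Calling split() with no argument splits on runs of any whitespace instead, which is usually what you want when counting words. A small sketch, with a made-up input string:

```python
s = 'how  are you\n'  # note the double space and the trailing newline

by_space = s.split(' ')   # split on each single space
by_default = s.split()    # split on runs of any whitespace

print(by_space)    # ['how', '', 'are', 'you\n']
print(by_default)  # ['how', 'are', 'you']
```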

