How to split a string into a fixed number of parts in Python?
I know the method textwrap.wrap, but it splits a string into parts of a fixed length. I'm looking for a function in Python that splits a string into a fixed number of parts.
For example:
string = "Hello, my name is foo"
and foo(string, 7) returns
['Hel', 'lo,', ' my', ' na', 'me ', 'is ', 'foo']
Algorithmically, I know how to implement this method, but I want to know if there is a module that provides it, or a "magic function" in the regex module that solves this problem...
One approach is to use re.
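import re
string = "Hello, my name is foo"

def foo(string, parts):
    # length of each part (integer division, so the regex quantifier is an int)
    x = len(string) // parts
    # match x characters at a time; ".+?$" picks up a shorter tail, if any
    print(re.findall(r".{" + str(x) + r"}|.+?$", string))

foo(string, 7)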
Output:
['Hel', 'lo,', ' my', ' na', 'me ', 'is ', 'foo']
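As a variation on the regex idea, here is a minimal sketch that builds a single `.{1,size}` pattern with an f-string. The name `regex_chunks` is made up for illustration, and the use of `re.DOTALL` (so that `.` also matches newlines) and the `max(1, ...)` guard against a zero chunk length are my own assumptions, not part of the answer above.

```python
import re

def regex_chunks(text, parts):
    # Chunk length, rounded down; at least 1 so the quantifier is never {1,0}.
    size = max(1, len(text) // parts)
    # ".{1,size}" greedily takes up to `size` characters, so a shorter tail is kept.
    # re.DOTALL lets "." match newline characters as well.
    return re.findall(rf".{{1,{size}}}", text, flags=re.DOTALL)

print(regex_chunks("Hello, my name is foo", 7))
```

Note that when the length is not a multiple of `parts`, this returns more than `parts` chunks, because the leftover characters form an extra, shorter chunk.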
I don't know if any module does this... but I feel compelled to say that the problem here is basically "What is the most "pythonic" way to iterate over a list in chunks?", except you have strings instead of lists. But the most pythonic way there should also be the most pythonic here, I suppose, and it's a good thing if you can avoid re. So here is the solution (not sure what you want if the length of the string is not evenly divisible by the number of parts; assuming you simply discard the "remainder"):
# python 3 version
def foo(string, n):
    part_len = -(-len(string) // n)  # same as math.ceil(len(string) / n)
    # zip over part_len references to one iterator, so each tuple holds consecutive characters
    return [''.join(x) for x in zip(*[iter(string)] * part_len)]
Thus:
>>> s = "Hello, my name is foo"
>>> foo(s, 7)
['Hel', 'lo,', ' my', ' na', 'me ', 'is ', 'foo']
>>> foo(s, 6)
['Hell', 'o, m', 'y na', 'me i', 's fo']
Now, admittedly, having foo(s, 6) return a list of length 5 is somewhat surprising. Maybe you want to raise an exception instead. If you want to keep the remainder, then use zip_longest:
from itertools import zip_longest

def foo2(string, n, pad=''):
    part_len = -(-len(string) // n)
    return [''.join(x) for x in zip_longest(*[iter(string)] * part_len, fillvalue=pad)]
>>> foo2(s, 6)
['Hell', 'o, m', 'y na', 'me i', 's fo', 'o']
>>> foo2(s, 6, pad='?')
['Hell', 'o, m', 'y na', 'me i', 's fo', 'o???']
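If you need exactly n parts even when the length is not a multiple of n, one option (not taken from any answer here, just a sketch) is to spread the remainder over the first few parts with divmod. The name `split_exact` is hypothetical, and giving the extra characters to the leading parts is an arbitrary choice.

```python
def split_exact(text, n):
    # q = base part length, r = number of parts that get one extra character.
    q, r = divmod(len(text), n)
    parts = []
    start = 0
    for i in range(n):
        # The first r parts are one character longer than the rest.
        end = start + q + (1 if i < r else 0)
        parts.append(text[start:end])
        start = end
    return parts

s = "Hello, my name is foo"
print(split_exact(s, 6))
```

With this approach the list always has length n and joining the parts gives back the original string; if the string is shorter than n, some trailing parts are empty strings.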
I don't think there is a builtin, but I think you could do it with regex: https://stackoverflow.com/a/9477447/1342445
In that case your function generates the regex from len(input) / int(parts), and raises an error if the length of the string is not divisible by the number of parts. It would be much simpler with undefined remainder behavior :)
I think it would look something like:
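import re

def split_into(string: str, parts: int):
    if (len(string) % parts) != 0:
        raise NotImplementedError('string is not divisible by # parts')
    # integer division: '.' * chunk_size needs an int, not a float
    chunk_size = len(string) // parts
    # one '.' per character in a chunk, e.g. '...' for chunk_size == 3
    regex = '.' * chunk_size
    return re.findall(regex, string)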
Yet another solution to this problem...
# split text to parts
def split_to_parts(txt, parts):
    # return array
    ret = []
    # calculate part length (rounded down, so leftover characters are dropped)
    part_len = len(txt) // parts
    # iterate and fill the return array
    for i in range(parts):
        # divide the text
        piece = txt[part_len * i:part_len * (i + 1)]
        # add it to the return array
        ret.append(piece)
    # return the array
    return ret

txt = "Hello, my name is foo"
parts = 7
print(split_to_parts(txt, parts))
# output:
# ['Hel', 'lo,', ' my', ' na', 'me ', 'is ', 'foo']
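For comparison, the same slicing idea can be written as a list comprehension that steps through the string and keeps a shorter final piece instead of dropping it. This is only a sketch: `split_keep_tail` is a made-up name, and rounding the part length up with math.ceil is an assumption about the desired behavior.

```python
import math

def split_keep_tail(text, parts):
    # Part length rounded up; at least 1 so range() gets a valid step.
    size = max(1, math.ceil(len(text) / parts))
    # Step through the string in strides of `size`; the last slice may be shorter.
    return [text[i:i + size] for i in range(0, len(text), size)]

print(split_keep_tail("Hello, my name is foo", 6))
```

Because the part length is rounded up, the result can contain fewer than `parts` pieces for some inputs, but no characters are lost.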