
Number of spaces between each word

How can I quickly count the number of spaces between each pair of words in a text?

Each run of spaces represents a value.

Example: one space is the letter 'a', two spaces is the letter 'b', and so on.

An example with some text:

text:

hello all  the   world 

one space between hello and all --> 'a', two spaces between all and the --> 'b', ...

word --> 'abc'

import re
import string

''.join(map(lambda x: string.ascii_lowercase[len(x) - 1], re.findall(r'\s+', 'hello all  the   world')))
# 'abc'
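The one-liner above counts any whitespace run, including leading or trailing ones, and would fail with an IndexError on a run of more than 26 characters. A minimal sketch of the same idea as a function, assuming only literal spaces between words should count; the name spaces_to_letters is hypothetical, not from the original answer:

```python
import re
import string


def spaces_to_letters(text):
    """Map each run of spaces between words to a letter (1 -> 'a', 2 -> 'b', ...)."""
    letters = []
    # strip() drops leading/trailing whitespace so only gaps between words count
    for run in re.findall(r' +', text.strip()):
        if len(run) > 26:
            # assumption: runs longer than the alphabet are treated as an error
            raise ValueError('run of %d spaces has no letter' % len(run))
        letters.append(string.ascii_lowercase[len(run) - 1])
    return ''.join(letters)


print(spaces_to_letters('hello all  the   world '))  # abc
```

string.ascii_lowercase is used here because it exists in both Python 2 and Python 3.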

For entertainment value -- and because I don't like regular expressions but do like the itertools module -- another way to do this is to use itertools.groupby, which collects consecutive items of like kind:

>>> from string import ascii_lowercase
>>> from itertools import groupby
>>> 
>>> s = 'hello all  the   world'
>>> counts = [len(list(cpart)) for c, cpart in groupby(s) if c == ' ']
>>> counts
[1, 2, 3]
>>> values = [ascii_lowercase[count-1] for count in counts]
>>> values
['a', 'b', 'c']
>>> vs = ''.join(values)
>>> vs
'abc'

itertools.groupby is often very useful.
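To see what groupby is doing in the session above, it may help to look at every run it produces, not only the runs of spaces. A small sketch; the variable names runs and space_runs are illustrative only:

```python
from itertools import groupby

s = 'hello all  the   world'

# Each item is (character, length of the consecutive run of that character)
runs = [(char, len(list(group))) for char, group in groupby(s)]

# Keep only the lengths of the runs made of spaces
space_runs = [n for char, n in runs if char == ' ']

print(runs[:4])     # [('h', 1), ('e', 1), ('l', 2), ('o', 1)]
print(space_runs)   # [1, 2, 3]
```

Note that groupby only groups adjacent equal items, which is exactly what is needed here: the two l's in "hello" form one run, while the later l's in "all" and "world" form separate ones.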

Assuming I got you right:

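from string import ascii_lowercase

# text is the input string; this takes the first N letters,
# where N is the total number of spaces in text
word = ascii_lowercase[:text.count(' ')]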

If you specify the output format you want, I can make this more specific, but this should put you well on your way to a complete solution.

import re

# text is the input string
word_re = re.compile(r'(\W*)(\w+)')

for match in word_re.finditer(text):
    # spaces holds the non-word characters that precede each word
    spaces, word = match.groups()
    print(len(spaces), word)

Note: \w stands for "word characters" and \W is the opposite. Depending on your exact problem you may want to make these more specific.
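As one illustration of making the pattern more specific: \W also matches punctuation, so a comma before a gap inflates the count. A sketch comparing the two patterns, assuming only literal spaces should be counted; the sample text and the names loose and strict are made up for this example:

```python
import re

text = 'hello,  all   the world'

# \W* also swallows the comma, so the first gap is reported as 3 characters
loose = [len(m.group(1)) for m in re.finditer(r'(\W*)(\w+)', text)]

# A narrower pattern: count only the literal spaces directly before each word
strict = [len(m.group(1)) for m in re.finditer(r'( *)(\w+)', text)]

print(loose)   # [0, 3, 3, 1]
print(strict)  # [0, 2, 3, 1]
```

The leading 0 in both lists is the empty gap before the first word.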

Reference: http://docs.python.org/library/re.html#regular-expression-syntax

