
Create a list (of tuples?) from two lists of different sizes

I am stuck trying to perform this task and while trying I can't help thinking there will be a nicer way to code it than the way I have been trying.

I have a line of text and a keyword. I want to make a new list by walking through both strings and pairing up their characters. The keyword just repeats itself until the end of the text. If the text contains a non-alpha character, no keyword letter is used for it.

For example:

Keyword="lemon"
Text="hi there!"

would result in

('lh', 'ei', ' ', 'mt', 'oh', 'ne', 'lr', 'ee', '!')

Is there a way of telling Python to keep repeating over a string in a loop, i.e. keep cycling over the letters of lemon?

I am new to coding so sorry if this isn't explained well or seems strange!

You've got two questions mashed into one. The first is: how do you remove unwanted characters from a string? You can do it a few ways, but regular expression substitution is a nice way. The function below strips whitespace only; punctuation such as '!' is left in place.

import re

def removeWhitespace( s ):
    # raw string so that \s reaches the regex engine unchanged
    return re.sub( r'\s', '', s )
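
If the goal is to drop everything that is not a letter (punctuation and digits as well as whitespace), a negated character class is one option. This is only a minimal sketch that assumes ASCII letters; `remove_non_letters` is a hypothetical name and not part of the answer above.

```python
import re

def remove_non_letters(s):
    # [^A-Za-z] matches any single character that is not an ASCII letter
    return re.sub(r'[^A-Za-z]', '', s)

print(remove_non_letters('hi there!'))  # hithere
```

Note that stripping characters this way loses their positions, so it does not by itself reproduce the output the question asks for.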

The second part of the question is about how to keep looping through the keyword, until the text line is consumed. You can write this as:

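def characterZip( keyword, textline ):
    res = []
    textline = removeWhitespace(textline)
    textlen = len(textline)
    # range is the Python 3 spelling; on Python 2 use xrange
    for i in range(textlen):
        # i % len(keyword) wraps the index so the keyword repeats
        res.append( '%s%s' % (keyword[i%len(keyword)], textline[i]) )
    return res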

Most pythonistas will look at this and see an opportunity for refactoring. The pattern this code is trying to achieve is known in functional programming as a zip. The quirk is that in this case you're doing something slightly non-standard by repeating the characters of the keyword. This too has an equivalent: the cycle function in the itertools module.

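# Python 3: the built-in zip is lazy, so itertools.izip (Python 2) is not needed
from itertools import cycle, islice

def characterZip( keyword, textline ):
    textline = removeWhitespace(textline)
    textlen = len(textline)
    # cycle repeats the keyword forever; islice caps the result at textlen pairs
    it = islice( zip(cycle(keyword), textline), textlen )
    return [ '%s%s' % val for val in it ]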
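
Because `zip` stops as soon as its shortest input is exhausted, pairing an endless `cycle` with a finite string already ends at the right place, so the `islice` call is arguably optional. Here is a short sketch of that simplification in Python 3. The name `character_zip_short` is hypothetical, and the input is assumed to be already stripped of unwanted characters.

```python
from itertools import cycle

def character_zip_short(keyword, textline):
    # zip ends when textline runs out, even though cycle(keyword) never does
    return [k + t for k, t in zip(cycle(keyword), textline)]

print(character_zip_short('lemon', 'hithere'))
# ['lh', 'ei', 'mt', 'oh', 'ne', 'lr', 'ee']
```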

Here's a solution:

import itertools

def task(kw,text):
    i = itertools.cycle(kw)
    return tuple(next(i)+t if t.isalpha() else t for t in text)

print(task('lemon','hi there!'))

Output

('lh', 'ei', ' ', 'mt', 'oh', 'ne', 'lr', 'ee', '!')

itertools.cycle iterates over a sequence repeatedly (a string is a sequence of characters). next gets the next character from the repeating sequence. The generator expression yields the next keyword letter joined to the text character if that character is alphabetic; otherwise it yields the non-alphabetic character alone, and no keyword letter is consumed.
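
To see the two pieces in isolation, here is a small sketch of how `cycle` and `next` behave on the keyword. The variable names are just for illustration.

```python
import itertools

letters = itertools.cycle('lemon')
# Pull seven letters: the sequence wraps around to 'l' after 'n'
first_seven = [next(letters) for _ in range(7)]
print(first_seven)  # ['l', 'e', 'm', 'o', 'n', 'l', 'e']
```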

I think you could use enumerate in that situation:

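# remove unwanted stuff (Keyword and Text are the strings from the question)
l = [ c for c in Text if c.isalpha() ]

for n,k in enumerate(l):
    # k is the current letter; n % len(Keyword) wraps around the keyword
    print(n, (Keyword[n % len(Keyword)], k))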

That gives you:

0 ('l', 'h')
1 ('e', 'i')
2 ('m', 't')
3 ('o', 'h')
4 ('n', 'e')
5 ('l', 'r')
6 ('e', 'e')

You could use that as the basis for your manipulation.
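
One possible way to build on the same modulo indexing while keeping the non-alphabetic characters in place, as the question asks, is to advance a separate counter only when a letter is consumed. This is a sketch under that assumption; `pair_with_keyword` is a hypothetical helper.

```python
def pair_with_keyword(keyword, text):
    result = []
    used = 0  # number of keyword letters consumed so far
    for ch in text:
        if ch.isalpha():
            # wrap around the keyword with the modulo of the counter
            result.append(keyword[used % len(keyword)] + ch)
            used += 1
        else:
            # non-letters pass through without using a keyword letter
            result.append(ch)
    return tuple(result)

print(pair_with_keyword('lemon', 'hi there!'))
# ('lh', 'ei', ' ', 'mt', 'oh', 'ne', 'lr', 'ee', '!')
```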
