
Manipulating items in a list, from a string then turning it back to a string

I applied for a data engineer job not long ago and got a Python question. My solution didn't pass all the edge cases, and it has been haunting me ever since. I used .endswith() at the time, and I suspect that's what failed in my code.

I have been trying to recode it, and here is what I have so far:

x = 'cars that ran up and opened a tattooaged car dealership educated'
# create a program to remove 'ed' from 
# any word that ends with ed but not 
# the word 'opened'
# also, every word must be less than 
# 8 letters long

suffix= 'ed'

def check_ed_lt8(x):
    x_list=x.split(" ")
    for index,var in enumerate(x_list):
        if suffix in var != 'opened':
            new_word = var[:-len(suffix)].strip('suffix')
            x_list[index] = new_word
        elif len(var) >= 8:
            shorter_word = var[:8]
            x_list[index] = shorter_word
    return(' '.join(x_list))

print(check_ed_lt8(x))

I get the desired output:

cars that ran up and opened a tatooag car dealersh educat

But the technical question was preceded by examples, such as words ending in 'ly', and I started wondering whether I was supposed to loop through a list of suffixes, and whether that was why I didn't pass the edge cases. So I modified my code, but now every time I add to the list, I lose manipulation over one of the last items in the list:

suffixes = ['ed', 'an']
def check_ed_lt8(x):
    x_list=x.split(" ")
    for index,var in enumerate(x_list):
        for suffix in suffixes:
            if suffix in var != 'opened':
                new_word = var[:-len(suffix)].strip('suffix')
                x_list[index] = new_word
            elif len(var) >= 8:
                shorter_word = var[:8]
                x_list[index] = shorter_word
    return(' '.join(x_list))

print(check_ed_lt8(x))

Returns:

cars that r up a opened a tattoag car dealersh educated

In this output, I lost manipulation over the last item, and I didn't mean for "and" to lose "nd". I know it lost it because of a combination of the "d" and "n" from each suffix, but I don't know why.

The more items I place in the suffixes list, the more of the last few items I lose manipulation over. For example, if I add "ars" to the list, the output becomes:

c that r up a opened a tattoag car dealership educated 

What am I doing wrong?

I would suggest using re.sub to remove the ed at the end of each word. Here is a one-liner:

import re
x = 'cars that ran up and opened a tattoo aged car dealership educated'
y = ' '.join([w if w == "opened" else re.sub(r'ed$', '', w)[:8] for w in x.split(' ')])
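
The `$` anchor is what separates this from the test in the question. As a rough sketch (the variable names are only for illustration), the snippet below shows three things that appear to go wrong in the original loop: `suffix in var` is a substring test that matches anywhere in the word, `var[:-len(suffix)]` then cuts the last characters regardless of what they are, and `.strip('suffix')` strips the characters s, u, f, i, x from both ends rather than using the `suffix` variable.

```python
import re

word = 'and'

# Substring test, as in the question: 'an' occurs in 'and', just not at the end.
substring_hit = 'an' in word

# End-anchored tests only match a real suffix.
suffix_hit = word.endswith('an')
regex_result = re.sub(r'an$', '', word)

# Slicing after a substring hit removes the last two letters anyway ("nd").
sliced = word[:-len('an')]

# str.strip treats its argument as a set of characters, not as a substring
# and not as a variable name.
stripped = 'fixes'.strip('suffix')

print(substring_hit, suffix_hit, regex_result, sliced, stripped)
```

Note also that `suffix in var != 'opened'` is a chained comparison, equivalent to `(suffix in var) and (var != 'opened')`, so that part of the condition does behave as intended.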

If you want to remove multiple suffixes, extend your regexp accordingly:

y = ' '.join([w if w == "opened" else re.sub(r'(ed|an)$', '', w)[:8] for w in x.split(' ')])

Of course, you can also build the regexp from a list of suffixes:

suffixes = ['ed', 'an']
# re.escape keeps any regex metacharacters in a suffix literal
pattern = re.compile('(' + '|'.join(map(re.escape, suffixes)) + ')$')
y = ' '.join([w if w == "opened" else pattern.sub('', w)[:8] for w in x.split(' ')])
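
If you would rather keep an explicit loop, the same idea can be sketched without regular expressions. One likely reason the multi-suffix version in the question "loses" words is that its `elif` runs once per suffix against the original `var`: when a later suffix does not match, a word of 8 or more letters is re-truncated from the original and overwrites the earlier removal, which is how `educated` comes back. The helper below, `trim_words`, and its parameters `keep` and `max_len` are hypothetical names chosen for this example. It removes at most one suffix per word and applies the length cap once, after the suffix loop.

```python
def trim_words(text, suffixes=('ed', 'an'), keep=('opened',), max_len=8):
    """Hypothetical helper: drop one trailing suffix per word, then cap the length."""
    result = []
    for word in text.split():
        if word in keep:
            # Protected words are passed through untouched, as in the one-liner.
            result.append(word)
            continue
        for suffix in suffixes:
            # endswith is anchored to the end of the word; skip empty suffixes,
            # because word[:-0] would produce an empty string.
            if suffix and word.endswith(suffix):
                word = word[:-len(suffix)]
                break  # remove at most one suffix per word
        # Cap the length once, outside the suffix loop, so it cannot
        # overwrite the suffix removal.
        result.append(word[:max_len])
    return ' '.join(result)


x = 'cars that ran up and opened a tattoo aged car dealership educated'
print(trim_words(x))
# cars that r up and opened a tattoo ag car dealersh educat
```

With `'an'` in the suffix list, `ran` still becomes `r`, because `an` really is its ending; `and` is left alone.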


 