Returning a list of strings separated by a delimiter

I'm having some problems trying to solve this question. It's from a practice exam and I just can't seem to get it right. I'm supposed to write a Python function that takes in a string and a delimiter and returns a list of the substrings between the delimiters, with the delimiter itself removed. We are not allowed to use the split function or "any such function". The example we received in the question was this:

StringToken("this is so fun! I love it!", "!")

Outputs:

["this is so fun", "I love it"]

This is the code I came up with; it's super simple.

def tokenizer(string, tmp):
    newStr = []
    for i in range(len(string)):
        if string[i] != tmp:
            newStr.append(string[i])
    return newStr

and the output is this:

['T', 'h', 'i', 's', ' ', 'i', 's', ' ', 's', 'o', ' ', 'f', 'u', 'n', ' ', 'I', ' ', 'l', 'o', 'v', 'e', ' ', 'i', 't']

How can I rejoin each word?

If you join all the elements in the list, you will get a single string, which may not be what you are looking for.

Build up a string first, then append it to the list, like this:

>>> def StringToken(string, tmp):
...     newStrlist = []
...     newStr = ''
...     for ch in string:
...         if ch != tmp:
...             newStr += ch
...         elif newStr != '':
...             newStrlist.append(newStr)
...             newStr = ''
...     if newStr != '':   # keep a trailing token when the string does not end with the delimiter
...         newStrlist.append(newStr)
...     return newStrlist
...
>>> StringToken("this is so fun! I love it!", "!")
['this is so fun', ' I love it']

See the comments in the code for a description.

def StringToken(string, tmp):
    newStr = ""   # A string to build upon
    lst = []      # The list to return
    for c in string: # Iterate over the characters
        if tmp == c: # Check for the character to strip
            if newStr != "":   # Prevent empty strings in output
                lst.append(newStr.strip())   # add to the output list
                newStr = ""                  # restart the string
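            continue                         # skip the delimiter itself, even when newStr is empty
        newStr += c  # Build the string
    if newStr != "":  # Keep a trailing token that has no closing delimiter
        lst.append(newStr.strip())
    return lst   # Return the list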

Output:

StringToken("this is so fun! I love it!", "!")
# ['this is so fun', 'I love it']

Instead of looping over all the letters in the string, you can use find to get the index of the next occurrence of the delimiter and then build your list accordingly:

def tokenizer(string, delim):
    new_list = []
    while True:
        index = string.find(delim)  # index of the next occurrence of the delimiter, or -1
        if index > -1:
            new_list.append(string[:index])
            string = string[index + len(delim):]
        else:
            new_list.append(string)
            break              # break because there is no delimiter present anymore

    # strip surrounding whitespace and drop items that end up empty
    return [item.strip() for item in new_list if item.strip()]

Usage:

>>> tokenizer("this is so fun! I love it!", "!")
['this is so fun', 'I love it']
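
Re-slicing the string on every pass copies the remainder each time. As a rough sketch, `str.find` also accepts a start offset, so the search can walk forward through the original string instead. The name `tokenize_with_offset` and the empty-delimiter check are my own additions for illustration, not part of the answer above.

```python
def tokenize_with_offset(text, delim):
    """Split text on delim without str.split, stripping and dropping empty pieces."""
    if not delim:
        # Assumption: reject an empty delimiter, since find('') matches at every offset
        raise ValueError("delimiter must not be empty")
    tokens = []
    start = 0
    while True:
        index = text.find(delim, start)    # search from the current offset
        if index == -1:
            piece = text[start:]           # everything after the last delimiter
        else:
            piece = text[start:index]
        piece = piece.strip()
        if piece:                          # skip empty or whitespace-only pieces
            tokens.append(piece)
        if index == -1:
            break
        start = index + len(delim)         # jump past the delimiter
    return tokens


print(tokenize_with_offset("this is so fun! I love it!", "!"))
# ['this is so fun', 'I love it']
```

Because the offset advances by `len(delim)`, this version should also cope with delimiters longer than one character, such as `"--"`.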

Here's an alternative that's a little shorter than the current answers:

def StringToken(string, tmp):
    newStr = []
    start = 0
    for ind, char in enumerate(string):
        if char == tmp:
            newStr.append(string[start:ind])
            start = ind + 1
    if start < len(string):  # keep any text after the last delimiter
        newStr.append(string[start:])
    return newStr

Output:

>>> StringToken("this is so fun! I love it!", "!")
['this is so fun', ' I love it']

Edit: If you would like to remove leading or trailing spaces, as in your example, that can be done using strip():

def StringToken(string, tmp):
    newStr = []
    start = 0
    for ind, char in enumerate(string):
        if char == tmp:
            newStr.append(string[start:ind].strip())
            start = ind + 1
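    if start < len(string):  # keep any text after the last delimiter
        newStr.append(string[start:].strip())
    return newStr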

Output:

>>> StringToken("this is so fun! I love it!", "!")
['this is so fun', 'I love it']
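
The `enumerate` loop compares one character at a time, so it assumes a single-character delimiter. If longer delimiters matter, one possible variation is to compare a slice of the delimiter's width at each position. This is only a sketch: `split_on` is a hypothetical name, and rejecting an empty delimiter is my assumption rather than something the question specifies.

```python
def split_on(text, delim):
    """Index-tracking split that also accepts delimiters longer than one character."""
    if not delim:
        # Assumption: empty delimiters are rejected
        raise ValueError("delimiter must not be empty")
    tokens = []
    start = 0              # where the current token begins
    i = 0                  # current scan position
    width = len(delim)
    while i < len(text):
        if text[i:i + width] == delim:       # delimiter found at position i
            tokens.append(text[start:i].strip())
            i += width                       # skip over the whole delimiter
            start = i
        else:
            i += 1
    if start < len(text):                    # text after the last delimiter
        tokens.append(text[start:].strip())
    return tokens


print(split_on("this is so fun! I love it!", "!"))
# ['this is so fun', 'I love it']
print(split_on("a<>b<>c", "<>"))
# ['a', 'b', 'c']
```

Like the loop it is based on, this keeps an empty string for each pair of adjacent delimiters; filter the result if those are unwanted.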

Simply use join, which joins the entire list with a given delimiter. Here you can use the empty delimiter ''. Try:

a=['T', 'h', 'i', 's', ' ', 'i', 's', ' ', 's', 'o', ' ', 'f', 'u', 'n', ' ', 'I', ' ', 'l', 'o', 'v', 'e', ' ', 'i', 't']
''.join(a)

The output will be:

'This is so fun I love it'
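
Joining the whole list yields one string rather than a list of tokens. One way to get the tokens, sketched here under my own naming (`tokenize_with_buffer` is hypothetical), is to keep the question's character-by-character loop, collect characters in a temporary list, and call `''.join` each time the delimiter appears. It assumes a single-character delimiter.

```python
def tokenize_with_buffer(text, delim):
    """Collect characters in a buffer and join them whenever the delimiter appears."""
    tokens = []
    buffer = []                      # characters of the token being built
    for ch in text:
        if ch != delim:
            buffer.append(ch)
            continue
        token = ''.join(buffer).strip()
        if token:                    # ignore empty tokens, e.g. from "!!"
            tokens.append(token)
        buffer = []                  # start a fresh token
    token = ''.join(buffer).strip()  # flush whatever follows the last delimiter
    if token:
        tokens.append(token)
    return tokens


print(tokenize_with_buffer("this is so fun! I love it!", "!"))
# ['this is so fun', 'I love it']
```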
