
How to split a string into words when it contains no special characters, uppercase letters, or numbers

I need to split this string into a sentence of separate words using Python. Is there any way to do this?

   strng = 'thisisastring'

Expected output:

this is a string

As Peter and Mark have already pointed out, this is a hard problem with no easy or unique solution. You certainly need a list of possible words to start with. Probably your best bet is then to use backtracking.

Here's a simple function that returns a list of tuples, where each tuple represents one possible sentence.

words = [
  "a", "as", "is", "light", "or", "project", 
  "projector", "string", "the", "this"
]

def findPhrase(text):
    result = []
    for word in words:
        if text == word:
            # if the entire text is the word, there is no need
            # to look at the (now empty) rest.
            result.append((word,))
        elif text.startswith(word):
            # if the text starts with the current word, try to 
            # find all partitions of the remaining text
            rest = findPhrase(text[len(word):])

            # if there are any such partitions, add them all to our
            # list of results, and put the current word in front
            # of each of these solutions
            for solution in rest:
                result.append((word,) + solution)
    return result

Note that I use (word,) in this code to make it a one-element tuple, so that tuples can simply be concatenated, i.e. ("is",) + ("a", "string") -> ("is", "a", "string").
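As a quick illustration of why there may be no unique solution, here is a minimal usage sketch. It restates the function compactly so that the snippet runs on its own. The names find_phrases and sentences, and the choice to pass the word list as a parameter, are my own assumptions for this example and not part of the code above.

```python
words = [
    "a", "as", "is", "light", "or", "project",
    "projector", "string", "the", "this",
]

def find_phrases(text, words):
    # Compact restatement of findPhrase; the word list is passed in
    # explicitly (an assumption made for this sketch).
    result = []
    for word in words:
        if text == word:
            result.append((word,))
        elif text.startswith(word):
            for solution in find_phrases(text[len(word):], words):
                result.append((word,) + solution)
    return result

def sentences(text):
    # Hypothetical helper: join each tuple of words into one sentence.
    return [" ".join(phrase) for phrase in find_phrases(text, words)]

print(sentences("thisisastring"))   # ['this is a string']
print(sentences("projectorlight"))  # ['project or light', 'projector light']
print(sentences("xyz"))             # []
```

The second call returns two sentences because both "project" and "projector" are valid first words, and the third returns an empty list because no word in the list matches.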

The basic idea of the algorithm is to split the string one word at a time. So, a first approximation would be the following, which takes the first word that might fit and then tries to split the rest of the string.

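def my_split(text):
    if text == "":
        return []
    for word in words:
        if text.startswith(word):
            # commit to the first word that fits and never reconsider it
            rest = text[len(word):]
            result = [word] + my_split(rest)
            return result
    # no word fits: the function falls through and returns None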

However, this does not work in general. In your example, once the remaining text is "astring", the algorithm might try "as" as the next possible word; because "tring" is not a word, it simply fails instead of going back and trying "a". That missing step of going back is exactly what backtracking adds.
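To make the difference concrete, here is a minimal sketch of how the first approximation could be repaired so that it backs up after a dead end. It returns only the first complete split it finds, or None if there is none. The name split_first, the words parameter, and the reordered word list (with "as" deliberately placed before "a" to force the dead end) are assumptions for this example.

```python
def split_first(text, words):
    """Return the first complete split as a list of words, or None."""
    if text == "":
        return []
    for word in words:
        if text.startswith(word):
            rest = split_first(text[len(word):], words)
            if rest is not None:
                # the remainder could be split too, so accept this word
                return [word] + rest
            # dead end: fall through and try the next candidate word
    return None

# "as" comes before "a" here, so the search first runs into "tring"
words = ["as", "a", "is", "string", "this"]

print(split_first("thisisastring", words))  # ['this', 'is', 'a', 'string']
print(split_first("thisisxstring", words))  # None
```

The only change from the greedy version is the check on rest: a word is accepted only if the remaining text can also be split completely, otherwise the loop moves on to the next word.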


 