
Finding Words After The Last Vowel

I'm currently trying to generate a list of words that rhyme with an input word according to the CMU Pronouncing Dictionary. I have managed to arrange all the words into a dictionary, with each word as a key and a string of its sounds as the value. However, since rhyming depends on the last vowel, I'm stuck on how to handle words that contain more than one vowel.

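def dotheyrhyme(filename, word):
    rhymes = {}
    with open(filename) as f:
        # Skip the first 56 lines (the header of the dictionary file).
        for line in f.readlines()[56:]:
            # Each entry looks like "WORD  S1 S2 ...", split on two spaces.
            splitline = line.split("  ")
            rhymes[splitline[0]] = "".join(splitline[1:])
    # The with block closes the file, so no explicit f.close() is needed.
    comparer = rhymes[word.upper()].rstrip().split(" ")
    return comparer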

I plan to use the comparer variable as a baseline, and I believe reversing it could be a good way to go about this. But I'm lost, or overthinking, how to check whether the last vowel and the sounds after it are the same, and append to the result accordingly.

Example:

{'SECOND': ['S', 'EH1', 'K', 'AH0', 'N', 'D']}

Would rhyme with

{'AND': ['AH0', 'N', 'D']}

but these two wouldn't rhyme

{'YELLOW': ['Y', 'EH1', 'L', 'OW0']}

And

{'HELLO': ['HH', 'AH0', 'L', 'OW1']}

But I can't think of a method that copes with varying lengths and multiple vowels.

Thanks for your help!

To find the last vowel you need a set of vowel sounds. After that, you only have to iterate over the list backwards and stop at the first match.

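vowels = {...}  # placeholder: replace with your set of vowel sounds
word = ['S', 'EH1', 'K', 'AH0', 'N', 'D']

last_vowel = None  # stays None if the word contains no vowel
for sound in reversed(word):
    if sound in vowels:
        last_vowel = sound
        break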
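
To get from the last vowel to an actual rhyme test, one option is to slice the list from that vowel onward and compare the slices. The following is only a sketch: the function names rhyming_part and do_they_rhyme are made up for illustration, and instead of a hand-written vowel set it assumes that vowel sounds are exactly the ones ending in a stress digit (0, 1 or 2), as in the examples from the question.

```python
def rhyming_part(phonemes):
    """Return the last vowel and every sound after it ([] if there is no vowel)."""
    for index in range(len(phonemes) - 1, -1, -1):
        # Assumption: a sound ending in a digit (its stress marker) is a vowel.
        if phonemes[index][-1:].isdigit():
            return phonemes[index:]
    return []


def do_they_rhyme(phonemes_a, phonemes_b):
    """Two words rhyme here if their rhyming parts are non-empty and identical."""
    part_a = rhyming_part(phonemes_a)
    return bool(part_a) and part_a == rhyming_part(phonemes_b)


print(do_they_rhyme(['S', 'EH1', 'K', 'AH0', 'N', 'D'], ['AH0', 'N', 'D']))  # True
print(do_they_rhyme(['Y', 'EH1', 'L', 'OW0'], ['HH', 'AH0', 'L', 'OW1']))    # False
```

Because the comparison includes the stress digit, 'OW0' and 'OW1' count as different, which matches the YELLOW/HELLO example. Strip the digit first if you want stress to be ignored.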

If you are open to other ideas, you can also look at this library, which finds the rhymes for you: https://pypi.org/project/pronouncing/

You would have to start comparing from the end. There are specialized algorithms and data structures that can help in cases like yours; you can check the Aho-Corasick algorithm.

But in the simple case, you would compare the sounds of the two words in reverse order and count how many match at the end. If that count reaches some threshold, you call the words a rhyme, e.g.:

def if_rhymes(word1, word2, threshold):
    # rhymes is assumed to map each word to its list of sounds.
    r1 = rhymes[word1][::-1]
    r2 = rhymes[word2][::-1]
    the_same = 0
    # Count matching sounds from the end; stop at the first mismatch.
    for sound1, sound2 in zip(r1, r2):
        if sound1 == sound2:
            the_same += 1
        else:
            break

    if the_same < threshold:
        return 'no rhyme'  # or False if you want
    else:
        return 'rhymes'  # or True

What the algorithm does

  1. It takes the lists of sounds for both words from the rhymes dictionary that you populated from the file (for clarity I recommend populating it outside the rhyme-testing function).
  2. Then it reverses the order of the elements in both lists of sounds and pairs them up (as tuples) using zip.
  3. Each tuple (the sounds from the two words, in reverse order) is compared. We count the pairs that are the same and stop at the first pair that differs, counting from the back.
  4. Depending on the threshold (you will need to supply an actual value for it), you consider the given pair of words a rhyme or not.
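
The steps above can be sketched as a self-contained script. This is only an illustration under stated assumptions: the helper names parse_entry, matching_tail_length and is_rhyme are hypothetical, the entries are the four pronunciations quoted in the question, and each entry is assumed to use the two-space separator that the question's code splits on.

```python
def parse_entry(line):
    """Split a line such as 'SECOND  S EH1 K AH0 N D' into (word, list of sounds)."""
    # Assumption: the word and its sounds are separated by two spaces.
    word, _, pronunciation = line.strip().partition("  ")
    return word, pronunciation.split()


def matching_tail_length(sounds1, sounds2):
    """Count how many sounds match, walking both lists from the end."""
    count = 0
    for sound1, sound2 in zip(reversed(sounds1), reversed(sounds2)):
        if sound1 != sound2:
            break  # first mismatch from the back ends the comparison
        count += 1
    return count


def is_rhyme(sounds1, sounds2, threshold):
    """Treat two words as rhyming if at least `threshold` final sounds match."""
    return matching_tail_length(sounds1, sounds2) >= threshold


# Step 1: populate the dictionary once, outside the rhyme-testing function.
rhymes = dict(parse_entry(line) for line in [
    "SECOND  S EH1 K AH0 N D",
    "AND  AH0 N D",
    "YELLOW  Y EH1 L OW0",
    "HELLO  HH AH0 L OW1",
])

print(matching_tail_length(rhymes["SECOND"], rhymes["AND"]))     # 3
print(is_rhyme(rhymes["SECOND"], rhymes["AND"], threshold=3))    # True
print(is_rhyme(rhymes["YELLOW"], rhymes["HELLO"], threshold=1))  # False
```

Note that a fixed threshold is a different rule from "last vowel and everything after it": with threshold=1, any two words ending in the same sound would count as a rhyme, so the value needs to be chosen with care.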

