
Reading random lines from a file in Python that don't repeat until 4 other lines have passed

I am trying to make a program that helps people learn new languages, but I am already stuck at the beginning. One of the requirements is to have Python print the lines in a random order, so I wrote this:

import random

def randomline(file):
    with open(file) as f:
        lines=f.readlines()
        print(random.choice(lines))

But now I have a problem with one of the other requirements: at least 4 other words have to come in between before a word can show up again, and I have no idea how to do that.

I have a very primitive solution for you:

import random

def randomline(file):
    # re-reads the file on every call; the returned line keeps its trailing newline
    with open(file) as f:
        lines = f.readlines()
        return random.choice(lines)

# the most recently printed words, oldest first
LastFourWords = []

file = "text_file.txt"
for i in range(0, 15):
    new_word = randomline(file)
    print(LastFourWords)
    if new_word in LastFourWords:
        # the word was shown too recently, so skip this iteration
        print("I have skipped")
        print(new_word)
        continue
    print(new_word)
    LastFourWords.append(new_word)
    # keep only the last four words
    if len(LastFourWords) > 4:
        LastFourWords.pop(0)

The file looked like this:
[Image: enter image description here]


The output looks like this (showing only a partial result):

[]
New

['New\n']
Example

['New\n', 'Example\n']
After

['New\n', 'Example\n', 'After\n']
Some

['New\n', 'Example\n', 'After\n', 'Some\n']
I have skipped
Example

['New\n', 'Example\n', 'After\n', 'Some\n']
Please

['Example\n', 'After\n', 'Some\n', 'Please\n']
I have skipped
Please

['Example\n', 'After\n', 'Some\n', 'Please\n']
Only

['After\n', 'Some\n', 'Please\n', 'Only\n']
Word
['Some\n', 'Please\n', 'Only\n', 'Word']
New

So every time the chosen word is already present in your list, it is skipped (the skipped attempt still uses up one loop iteration). Once the list holds more than 4 elements, its first (oldest) element is removed.
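A variant of the same idea, as a rough sketch only: read the lines once and retry when a recent word comes up, so a rejected pick does not use up one of the requested outputs. The name `pick_words` and its parameters are made up for this illustration, and the sketch assumes the input has more than `gap` distinct words.

```python
import random

def pick_words(lines, count, gap=4, rng=random):
    """Return `count` random picks from `lines`.

    A word is not picked again until `gap` other picks have been made.
    (Hypothetical helper, not part of the answer above.)
    """
    words = [line.strip() for line in lines if line.strip()]
    if len(set(words)) <= gap:
        raise ValueError("need more than `gap` distinct words")

    recent = []  # the last `gap` picks, oldest first
    picks = []
    while len(picks) < count:
        word = rng.choice(words)
        if word in recent:
            continue  # shown too recently: draw again, record nothing
        picks.append(word)
        recent.append(word)
        if len(recent) > gap:
            recent.pop(0)  # forget the oldest pick
    return picks
```

With the file from the answer, this might be called as `pick_words(open("text_file.txt").readlines(), 15)`.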

You can use a queue:

import random

# start with four empty placeholders; each new choice is checked against this list
queue = 4*['']

def randomline(file):
    with open(file) as f:
        lines=f.readlines()
        choice = random.choice(lines)
        if choice not in queue:
            print(choice)

            # append the current word to the queue
            queue.append(choice)
            # remove the first (oldest) element of the list
            queue.pop(0)

You can use deque from the collections module. It lets you specify a maximum length for your list of seen words: once the deque is full, appending a new item automatically removes the oldest one, which makes it work as a small cache. So create a deque with a maximum length of 4, then choose a word and check whether it is in the deque. If it is, choose another word; if it is not, print the word and add it to the deque. You don't have to manage the items yourself, because the oldest one drops out automatically when you append something new.

from collections import deque
from random import choice

with open('test.dat') as words_file:
    words = words_file.readlines()
    word_cache = deque(maxlen=4)
    for _ in range(30):
        word = choice(words).strip()
        while word in word_cache:
            word = choice(words).strip()
        print(word)
        word_cache.append(word)
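To see the eviction behaviour on its own, here is a small sketch with hard-coded words (the word list is just sample data, not read from a file):

```python
from collections import deque

recent = deque(maxlen=4)
for word in ["New", "Example", "After", "Some", "Please"]:
    recent.append(word)

# "New" was the oldest entry, so it was dropped when "Please" was appended.
print(list(recent))      # ['Example', 'After', 'Some', 'Please']
print("New" in recent)   # False
```

Note that the loop in the answer assumes the file holds more than 4 distinct words; with fewer, the `while` loop would never find a word outside the cache.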

I would use linecache. It's from the standard library and allows you to select a specific line. If you know the number of lines in your file, this could work:

import linecache
import random

def random_lines(filename, repeat_after=4):
    # count the lines once, closing the file afterwards
    with open(filename, "r") as f:
        n_lines = sum(1 for _ in f)
    last_indices = []

    while True:
        # linecache line numbers start at 1
        index = random.randint(1, n_lines)

        if index not in last_indices:
            # remember only the last `repeat_after` indices
            last_indices.append(index)
            last_indices = last_indices[-repeat_after:]

            line = linecache.getline(filename, index)
            yield line

This creates a generator that yields a random line from your file each time you ask for one. Keep in mind that linecache caches the file's contents after the first lookup, so the lines are still held in memory.

As for your requirement of only allowing a repetition after n other lines, the check against last_indices takes care of it. However, if the file has no more than repeat_after lines, every index ends up blocked and the loop never finishes.
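One way to rule out the endless loop, sketched under the assumption that building a candidate list on every pick is affordable: choose only among the indices that are currently allowed, and refuse to start when there are too few lines. The name `random_indices` is hypothetical; the yielded numbers could then be passed to `linecache.getline`.

```python
import random
from collections import deque

def random_indices(n_lines, repeat_after=4, rng=random):
    """Yield 1-based line numbers forever.

    A number is blocked for the next `repeat_after` picks after it is used.
    (Illustrative helper, not part of the answer above.)
    """
    if n_lines <= repeat_after:
        raise ValueError("need more lines than repeat_after")

    recent = deque(maxlen=repeat_after)  # the blocked line numbers
    while True:
        # every pick comes from the allowed set, so no retry loop is needed
        allowed = [i for i in range(1, n_lines + 1) if i not in recent]
        index = rng.choice(allowed)
        recent.append(index)
        yield index
```

The trade-off is that each pick costs time proportional to the number of lines, which may matter for very large files.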

Another approach would be to create a list of all indices (i.e. line numbers), shuffle it, and then loop through it. This has the advantage of not risking an infinite loop, but it also means that you'll go through every other line before you see the same line again, which may not be ideal for you.
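The shuffle approach could be sketched like this; `shuffled_passes` is a made-up name, and reshuffling after each full pass is an assumption about how the looping would continue:

```python
import random

def shuffled_passes(n_lines, rng=random):
    """Yield 1-based line numbers, one freshly shuffled pass after another.

    (Illustrative helper, not part of the answer above.)
    """
    if n_lines < 1:
        raise ValueError("need at least one line")

    indices = list(range(1, n_lines + 1))
    while True:
        rng.shuffle(indices)  # new random order for this pass
        yield from indices    # every line appears exactly once per pass
```

Within one pass no line repeats, but the last line of one pass can come up again early in the next pass, so the 4-line gap is not guaranteed across the boundary between passes.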

