
Read a number of random lines from a file in Python

Could someone tell me how to read a random number of lines from a file in Python?

Your requirement is a bit vague, so here's another slightly different method (for inspiration if nothing else):

from random import random
# Keep each line independently with probability 0.5; the file is closed afterwards.
with open("/some/file") as f:
    lines = [line for line in f if random() >= .5]

Compared with the other solutions, the number of lines returned varies less: it clusters around half the total number of lines, because each line is kept independently with 50% probability. Only one pass through the file is required.
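The same idea can be sketched with an adjustable probability. This is only an illustration: the function name `sample_lines_with_probability` and the parameters `p` and `rng` are made up here, and the in-memory list stands in for an open file object.

```python
import random

def sample_lines_with_probability(lines, p=0.5, rng=random):
    """Keep each line independently with probability p, in a single pass."""
    return [line for line in lines if rng.random() < p]

# An in-memory list stands in for an open file; any iterable of lines works.
demo_lines = ["line %d\n" % i for i in range(1000)]
kept = sample_lines_with_probability(demo_lines, p=0.5, rng=random.Random(0))
print(len(kept))  # somewhere around 500, not exactly predictable
```

The kept lines stay in file order, and the count is only controlled on average, so this does not fit if you need an exact number of lines.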

To get a number of lines at random from your file you could do something like the following:

import random
with open('file.txt') as f:
    lines = random.sample(f.readlines(),5)

The example above returns 5 lines, but you can easily change that to the number you require. You could also replace the fixed 5 with a call to random.randint() to get a random number of lines as well as randomly chosen lines, but you would have to make sure the sample size isn't bigger than the number of lines in the file, because random.sample() raises ValueError in that case. Depending on your input this might be trivial or a little more complex.
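One way the randint() variant might look is sketched below. The helper name `random_number_of_random_lines` and its `rng` parameter are hypothetical, and the choice of 1 as the lower bound is an assumption; use 0 if an empty result is acceptable.

```python
import random

def random_number_of_random_lines(lines, rng=random):
    """Return between 1 and len(lines) lines, chosen at random."""
    lines = list(lines)
    if not lines:
        return []  # nothing to sample from
    # randint() includes both endpoints, so k never exceeds len(lines).
    k = rng.randint(1, len(lines))
    return rng.sample(lines, k)

# Any iterable of lines works, including an open file object.
print(random_number_of_random_lines(["a\n", "b\n", "c\n", "d\n"]))
```

Bounding the upper limit by len(lines) is what keeps random.sample() from being asked for more items than exist.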

Note that the sampled lines may appear in `lines` in a different order from the one in which they appear in the file.
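If the original order matters, one possible approach is to sample line indices rather than the lines themselves and sort the indices before picking. The function name `sample_lines_in_file_order` is invented for this sketch.

```python
import random

def sample_lines_in_file_order(lines, k, rng=random):
    """Pick k distinct lines at random, returned in their original order."""
    lines = list(lines)
    # Sample positions, then sort them so the result follows file order.
    indices = sorted(rng.sample(range(len(lines)), k))
    return [lines[i] for i in indices]

demo = ["line %d\n" % i for i in range(20)]
print(sample_lines_in_file_order(demo, 5))
```

As with random.sample() itself, this raises ValueError when k is larger than the number of lines.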

import linecache
import random
import sys

# Number of lines to get.
NUM_LINES_GET = 5

# Count the lines in the file without keeping them in memory.
with open('file_name') as f:
    number_of_lines = sum(1 for _ in f)

if NUM_LINES_GET > number_of_lines:
    print("are you crazy !!!!")
    sys.exit(1)

# Pick NUM_LINES_GET distinct random line numbers (linecache counts from 1).
for i in random.sample(range(1, number_of_lines + 1), NUM_LINES_GET):
    print(linecache.getline('file_name', i))

linecache.clearcache()

import os
import random

def getrandfromMem(filename):
    # Small file: read every line and pick one by index.
    with open(filename, 'rb') as fd:
        l = fd.readlines()
    pos = random.randrange(len(l))  # randint(0, len(l)) could index past the end
    return (pos, l[pos])

def getrandomline2(filename):
    filesize = os.path.getsize(filename)
    if filesize < 4096:  # Seek may not be very useful
        return getrandfromMem(filename)

    with open(filename, 'rb') as fd:
        for _ in range(10):  # Try 10 times
            pos = random.randrange(filesize)
            fd.seek(pos)
            fd.readline()  # Read and ignore the (probably partial) line
            line = fd.readline()
            if line != b'':  # Empty bytes means we hit the end of the file
                return (pos, line)

    # Every attempt landed in the last line; fall back to reading the whole file.
    return getrandfromMem(filename)

getrandomline2("shaks12.txt")

Assuming the lines you want always start at the beginning of the file, so that only their number is random:

import random
with open('/your/file') as f:
    lines = f.read().splitlines()
n_lines = random.randrange(len(lines))  # 0 .. len(lines) - 1
random_lines = lines[:n_lines]

Note that this will read the entire file into memory.
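If memory is a concern, a two-pass variant might avoid holding the whole file: count the lines first, then read only the prefix you need. This is a sketch under assumptions, not the answer's own method; the name `random_prefix` and its `rng` parameter are hypothetical, and it still keeps the chosen prefix in memory.

```python
import random
from itertools import islice

def random_prefix(path, rng=random):
    """Return the first n lines of the file, with n chosen at random."""
    # First pass: count the lines without storing them.
    with open(path) as f:
        total = sum(1 for _ in f)
    # Same range as randrange(len(lines)) above: 0 .. total - 1.
    n = rng.randrange(total) if total else 0
    # Second pass: read only the first n lines, stripping the newline.
    with open(path) as f:
        return [line.rstrip("\n") for line in islice(f, n)]
```

The trade-off is reading the file twice in exchange for never storing more than the selected prefix.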


 