readline function returning empty string

I am new to Python and have a little programming experience in C++. I saw this question, but it doesn't address my problem.

Python 2.7.9, 64-bit AMD, Windows 7 Ultimate, NTFS, administrator privileges & no "read only" attribute on file to be read.

I want to create a list of strings that fulfil certain criteria. The strings are lines of a file (see notepad.cc/diniko93), so I wrote the following function:

def makeLineList( filePtr, ptr ):
    lines = []
    while True:
        s = filePtr.readline()
        if not s=="":
            s = s[3:]
            s = s.split()
            if s[0].isdigit():
                print("O")
                lines.append(s)
            elif s[0] in {"+", "-"}:
                print("U")
                lines.append(s)
        else:
            print("none")
            break
    filePtr.seek(ptr, 0);    #I did this to restore file pointer, so other functions accessing this file later don't misbehave
    return lines

The two possible main()-like bodies (pardon my ignorance of Python) that I am using are:

with open("./testStage1.txt", 'r') as osrc:
    osrc.seek(291, 0)
    L = makeLineList( osrc, osrc.tell())
    print "".join(L)

and the other one-

osrc = open("./testStage1.txt", 'r')
osrc.seek(291, 0)
L = makeLineList( osrc, osrc.tell())
print "".join(L)
osrc.close()

Both times, the output on the terminal is a disappointing none.

Please note that the code above is the minimum required to reproduce the problem, not the entire code.

EDIT: Based on @avenet's suggestion, I googled and tried using the file as an iterator (obj.next() in Python 2.7, obj.__next__() in Python 3; next(obj) works in both) in my code, but the problem persists: I am unable to read the next line even if I call next(osrc) from inside the function. Check out these 2 snippets:

  • version2: next is used only in the main()-ish part and the transform_line function is not called. Calling next() 3 times produces the desired/expected output, but in
  • version3: I get a list index out of range error, even for lists[0], which definitely has a digit.

EDIT 2: I tried a scope check inside my functions: if not osrc in locals(): followed, on the next line with proper indentation, by print("osrc not reachable"). The output is osrc not reachable. I also tried from tLib import transform_line with a temporary tLib.py, but with identical results. Why is osrc not available in either case?

EDIT 3: The problem appears to be one of scope. So, to avoid passing the file variable around, I made a function whose sole purpose is to read lines. The decision whether to get the next line depends on the value returned by a function like isLineUseful():

def isLineUseful( text, lookFor ):
    if text.find(lookFor)!=-1:
        return 1
    else:
        return 0
def makeList( pos, lookFor ):
    lines = []
    with open("./testStage1.txt", 'r') as src:
        src.seek(pos)
        print(src.read(1))
        while True:
            line = next(src)
            again = isLineUseful(line, lookFor)
            if again==0:
                src.seek(pos)
                break
            else:
                lines.append(line)
    return lines

t = makeList(84, "+")
print "\n".join(t)

I tried it, and it works perfectly on this (notepad.cc/diniko93) sample testStage1.txt.

So my programming issue is solved (thanks to responders :D) and I am marking this as answered, but I am posting a new question about the anomalous behavior of readline() and __next__.

PS: I am still learning the ways of Python, so I would be very happy if you could suggest a more Pythonic and idiomatic version of my code above.

First of all, you are not using Python the way it is meant to be used. One reason to choose a language like Python is that it lets you achieve the same result in fewer lines of code than languages such as C++ or Java.

It's not necessary to pass a file pointer as a function parameter to read the file; you can pass the filename and open the file directly within the function.

Then you can call this function with the file name and store the returned list in a variable for later manipulation. If you are not familiar with exception handling, you could, for example, use a function from the os module to check whether the file exists: os.path.exists(filename).
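
As a minimal sketch of the os.path.exists approach mentioned above (the function name read_lines_if_present is hypothetical and not part of the original answer), the existence check could look like this:

```python
import os


def read_lines_if_present(filename):
    """Return the file's lines, or an empty list if the file does not exist.

    Hypothetical helper for illustration only.
    """
    if not os.path.exists(filename):
        print("file not found: " + filename)
        return []
    with open(filename) as f:
        return f.readlines()
```

Note that the file could still disappear between the check and the open call, so catching the exception is generally the more robust choice once you are comfortable with exception handling.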

If you want to filter on the line you are currently reading, you can simply use an if statement (there are many ways of doing that; this is just an example):

if line not in list_of_strings_you_want_not_to_include: 
    lines.append(line)

If you want to check whether the pattern is at the beginning, you can use the startswith string method on the line:

if not line.startswith("+"):
    lines.append(line)

If you want to skip a certain number of characters, you can use the seek function (as you are already doing). This version uses a few more lines of code, but it's still very simple:

def read_file(filename, _from):
    lines = []
    try:
        with open(filename) as f:
            f.seek(_from)
            for line in f:
                lines.append(line)
    except IOError:  # works on Python 2.7; also catches FileNotFoundError on Python 3
        print('file not found')
    return lines

filename = "file.txt"
lines = read_file(filename, 10)

More simply, you can also do this instead of iterating explicitly through all the lines:

def read_file(filename, _from):
    with open(filename) as f:
        f.seek(_from)
        return list(f)

Or, using your favourite function, readlines:

def read_file(filename, _from):
    with open(filename) as f:
        f.seek(_from)
        return f.readlines()

The advantage of iterating explicitly through the lines is that you can check or process each line (or character) at the moment you read it, so I would definitely go with the first option suggested above.
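
To illustrate checking each line while iterating, here is a small sketch. The function name collect_list_lines and the exact criterion (first non-blank character is a digit, "+" or "-") are assumptions based on the question, not code from the original answer:

```python
def collect_list_lines(lines):
    """Keep lines whose first non-blank character is a digit, '+' or '-'.

    `lines` can be any iterable of strings, including an open file object.
    """
    kept = []
    for line in lines:
        text = line.strip()  # drop surrounding whitespace and the newline
        if text and (text[0].isdigit() or text[0] in "+-"):
            kept.append(text)
    return kept
```

Because the function accepts any iterable, it can be called on an open file after seek, for example collect_list_lines(f) inside a with open(...) as f: block.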

If you want to modify the lines your way:

def transform_line(line):
    if line != "":
        if line[0].isdigit():
            print("O")
        elif line[0] in {"+", "-"}:
            print("U")
    else:
        print("None")
    return line

with open("./testStage1.txt", 'r') as osrc:
    osrc.seek(291)
    lines = [transform_line(line) for line in osrc]
    #Do whatever you need with your line list

If you don't want to transform lines just do this:

with open("./testStage1.txt", 'r') as osrc:
    osrc.seek(291)
    lines = list(osrc)
    #Do whatever you need with your line list

Or just implement a line iterator if you need to stop on a certain condition:

def line_iterator(file):
    for line in file:
        # Stop at the first line that starts with a digit, "+" or "-"
        if not line[0].isdigit() and line[0] not in ["+", "-"]:
            yield line
        else:
            break

with open("./testStage1.txt", 'r') as osrc:
    osrc.seek(291)
    lines = list(line_iterator(osrc))
    # To skip lines from the list containing 'blah'
    lines = [x for x in lines if 'blah' not in x]
    # Do whatever you need with your line list
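
The stop-on-condition idea can also be expressed with itertools.takewhile from the standard library. The sketch below is only an illustration: the names is_list_line and leading_list_lines are hypothetical, and the predicate (keep lines while they look like list items) is an assumption taken from the question's stated goal, so adjust it to whatever condition you actually need:

```python
from itertools import takewhile


def is_list_line(line):
    """True if the first non-blank character is a digit, '+' or '-'."""
    text = line.strip()
    return bool(text) and (text[0].isdigit() or text[0] in "+-")


def leading_list_lines(lines):
    """Collect lines until the first one that is not a list line."""
    return list(takewhile(is_list_line, lines))
```

Unlike a filter, takewhile stops at the first line that fails the test, so later matching lines are not collected.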

You are trying to process this input:

<P> unnecessart line </P>
<P> Following is an example of list </P>
<P> 1. abc </P>
<P>     + cba </P>
<P>     + cba </P>
<P>             + xyz </P>

In your head you see only the important bits, but Python sees everything. For Python (and any other programming language), each line starts with <. That's why the if conditions never match.

If you strip the <P>, be sure to strip the spaces as well, because in

1. abc
    + cba

the second line starts with a space, so s[0] isn't +. To strip leading and trailing whitespace, use s.strip().
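
A minimal sketch of that clean-up, assuming every line is wrapped in a literal <P> ... </P> pair as in the sample input (the helper name strip_paragraph_tags is hypothetical):

```python
def strip_paragraph_tags(line):
    """Remove a surrounding <P> ... </P> pair and any surrounding whitespace."""
    text = line.strip()
    if text.startswith("<P>"):
        text = text[len("<P>"):]
    if text.endswith("</P>"):
        text = text[:-len("</P>")]
    # str.strip() removes the spaces left over inside the tags
    return text.strip()
```

After this step the first character of the result is the one the if checks care about, for example "+" for the indented "+ cba" line.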
