How to get the nth character in an extremely large text file?

Question

I have a very large text file (~40GB) containing unseparated digits. It's been a while since I've dealt with file I/O in python (or python more generally), and I remember some wizardry with generators being used to access such files. Google yielded little specific help; it seems like everyone deals with sensibly-formatted data they can access line-by-line. All I need to do is read the nth character without destroying the kernel by reading too much into RAM. Any ideas?

Answer 1

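You can use f.seek to jump to the nth byte in the file and then f.read(1) to read it. In a single-byte encoding such as ASCII (which covers a file of plain digits), the nth byte is also the nth character:

with open("file.txt", "rb") as f:
    f.seek(N - 1)  # seek() only moves the position; it does not return the data
    char = f.read(1).decode("ascii")  # read one byte and turn it into a str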
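
For reuse, the same seek-then-read idea can be wrapped in a small helper. This is only a sketch: the function name nth_char is made up for illustration, it treats n as 1-based, and it assumes the file holds single-byte ASCII characters, as a file of digits would.

```python
def nth_char(path, n):
    """Return the nth character (1-based) of a file of single-byte characters."""
    if n < 1:
        raise ValueError("n must be >= 1")
    # Binary mode makes seek() offsets plain byte positions.
    with open(path, "rb") as f:
        f.seek(n - 1)      # skip the first n - 1 bytes
        byte = f.read(1)   # read only the byte we want
    if not byte:
        # read() returns an empty bytes object at or past end of file
        raise IndexError("n is past the end of the file")
    return byte.decode("ascii")
```

Calling nth_char("file.txt", 1) would return the first digit of the file as a one-character string. Only one byte is requested from the file regardless of its size.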

Answer 2

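Use seek to move the file's read position to the given offset, then call read.

Additionally, if you don't want any extra data loaded into memory during the read (just the one byte/char), pass buffering=0 when opening the file. Python only allows unbuffered I/O in binary mode, so the file has to be opened with "rb":

with open("largeFile", "rb", buffering=0) as f:
    f.seek(10000)     # 0-based byte offset
    char = f.read(1)  # a bytes object of length 1 (empty at end of file)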
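
A slightly more defensive variant of the unbuffered read is sketched below. The helper name read_byte_unbuffered is hypothetical, and the offset is taken as 0-based to match seek. It checks the file size first so that an out-of-range offset raises an error instead of silently returning an empty result, and it opens the file with "rb" because Python rejects buffering=0 in text mode.

```python
import os


def read_byte_unbuffered(path, offset):
    """Return the single byte at a 0-based offset, without Python-level buffering."""
    size = os.path.getsize(path)
    if not 0 <= offset < size:
        raise IndexError(f"offset {offset} outside file of {size} bytes")
    # buffering=0 is only accepted together with a binary mode such as "rb".
    with open(path, "rb", buffering=0) as f:
        f.seek(offset)
        return f.read(1)
```

The result is a bytes object of length 1; call .decode("ascii") on it if a str is needed.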
