Time-efficient way to skip a number of lines in a very large text file (16 GB) using Python

I have a very large text file of 16 GB. I need to skip a number of lines, and I want to do so in a time-efficient manner. I am using Python. How can I do that?

Just read the number of lines you want to skip and throw them away:

with open(your_file) as f_in:
    for i in range(number_of_lines_to_skip):
        f_in.readline()
    # your file is now at the line you want...  

You can also use enumerate to build a generator that only yields lines once you have skipped the lines you want to skip:

with open(your_file) as f_in:
    for line in (line for i, line in enumerate(f_in) if i >= lines_to_skip):
        # reached only after the first lines_to_skip lines (enumerate starts at 0)
        pass

The second approach is likely faster.
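A further option, not part of the original answers, is the standard library's itertools.islice, which advances the file iterator without an explicit Python-level loop. This is only a sketch; your_file and lines_to_skip are the same placeholder names used above:

import itertools

with open(your_file) as f_in:
    # advance the file iterator past the first lines_to_skip lines;
    # next(..., None) returns None instead of raising if the file is shorter
    next(itertools.islice(f_in, lines_to_skip, lines_to_skip), None)
    for line in f_in:
        pass  # processing starts at the first line after the skipped block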

Beware: calling next on a file object will raise StopIteration if the end of the file is reached.

go_to_line_number = some_line_number

with open(very_large_file) as fp:
    # skip the first go_to_line_number lines
    for _ in range(go_to_line_number):
        next(fp)

    for line in fp:
        # start your work from desired line number
        pass
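
If the file can be shorter than go_to_line_number, one way to avoid the StopIteration warned about above is to give next a default value. This is a sketch building on the snippet above, not part of the original answer:

with open(very_large_file) as fp:
    for _ in range(go_to_line_number):
        if next(fp, None) is None:  # None is returned instead of raising at EOF
            break
    for line in fp:
        pass  # start your work from the desired line number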
