What is the best way to handle large inputs in Python (100k lines)?

I need to handle an input of 100k lines (each line contains one string) and apply a function to each line. The function returns one result per string, and that result should be printed to the console. What is the best way to do this?

My current attempt is:

strings = []
for i in xrange(int(input())):
    strings.append(raw_input())

More background: I want to solve a problem on Hackerrank. An input can look like this (provided by Hackerrank): https://hr-testcases.s3.amazonaws.com/4187/input02.txt?AWSAccessKeyId=AKIAINGOTNJCTGAUP7NA&Expires=1420719780&Signature=iSzA93z7GKVIcn4NvdqAbbCOfMs%3D&response-content-type=text%2Fplain

You don't need to store the entire file in memory, because you compute and print each result as you read the file.

As such, simply read the file line by line, do your calculations, and print the results:

with open('large-file.txt') as the_file:
    for line in the_file:
        result = do_something_with(line)
        print(result)
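One detail worth noting about the loop above: iterating over a file yields each line with its trailing newline still attached, which usually needs to be stripped before the string is processed. The following is a minimal sketch in Python 3 (an assumption; the question's own code is Python 2). The body of `do_something_with` is a hypothetical stand-in, and `io.StringIO` is used only to simulate a file for the demonstration.

```python
import io


def do_something_with(text):
    # Hypothetical stand-in for the real per-line function.
    return len(text)


def process_lines(lines):
    """Yield one result per line without holding the whole input in memory."""
    for line in lines:
        # Each line arrives with its trailing newline; strip it before processing.
        yield do_something_with(line.rstrip("\n"))


if __name__ == "__main__":
    # io.StringIO simulates an open file for this demonstration.
    sample = io.StringIO("abc\nhello\n\n")
    for result in process_lines(sample):
        print(result)
```

Because `process_lines` is a generator, only one line is held at a time, so memory use should stay roughly constant regardless of how many lines the input has.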

Use the stdin stream; stdin behaves like a file stream, so you can iterate over it line by line:

import sys
for line in sys.stdin:
    do_work(line)
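For input shaped like the one in the question, where the first line holds the number of strings that follow, the stdin approach might be sketched as below. This is a Python 3 sketch under assumptions: the body of `do_work` and the helper name `solve` are hypothetical, and the streams are passed in as parameters so the same code can be tried against an in-memory sample.

```python
import io
from itertools import islice


def do_work(text):
    # Hypothetical stand-in for the real per-line function.
    return text.upper()


def solve(stream_in, stream_out):
    """Read a count line, then process that many lines one at a time."""
    first = stream_in.readline()
    if not first.strip():
        return  # empty input: nothing to do
    count = int(first)
    # islice stops after `count` lines without loading them all into a list.
    for line in islice(stream_in, count):
        stream_out.write(str(do_work(line.rstrip("\n"))) + "\n")


if __name__ == "__main__":
    # In-memory sample standing in for stdin/stdout.
    source = io.StringIO("2\nfoo\nbar\n")
    sink = io.StringIO()
    solve(source, sink)
    print(sink.getvalue(), end="")
```

On a judge such as Hackerrank, the call would presumably be `solve(sys.stdin, sys.stdout)` after `import sys`, instead of the in-memory sample.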
