Is it possible to use readlines with boto3?

Question

I'm trying to run a diff on two files that are stored in S3, and would like to avoid downloading the files if possible.

The sample code I am working with is as follows:

import difflib

# Read both local files and close them once the lines are loaded
with open('sample1.csv', 'r') as file1, open('sample2.csv', 'r') as file2:
    diff = difflib.ndiff(file1.readlines(), file2.readlines())

I see that with the boto3 package I can open a file from S3, but how can I pass the equivalent of file1.readlines() and file2.readlines() into the ndiff function?

Answer 1

For future readers, I'll answer the exact question "Is it possible to use readlines with boto3?"

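import io

import boto3

# bucket and key are placeholders for your own bucket name and object key
s3_client = boto3.client('s3')

body = s3_client.get_object(Bucket=bucket, Key=key)['Body']
# _raw_stream is a private attribute of the response body and may change between versions
stream = io.BufferedReader(body._raw_stream)
stream.readlines()  # returns a list of bytes objects, one per line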
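
To tie this back to difflib.ndiff: readlines() on a binary stream returns bytes, while ndiff compares sequences of str, so each line has to be decoded first. The following is only a minimal sketch, assuming the files are UTF-8 text. read_text_lines is a hypothetical helper name, and the io.BytesIO objects stand in for the two S3 streams so that the example runs without AWS access.

```python
import difflib
import io


def read_text_lines(binary_stream, encoding="utf-8"):
    """Return the lines of a binary file-like object as decoded strings."""
    # readlines() keeps the trailing newline on each line, which ndiff expects.
    return [line.decode(encoding) for line in binary_stream.readlines()]


# Stand-ins for the two S3 streams (assumption: small UTF-8 CSV files).
stream1 = io.BytesIO(b"a,1\nb,2\n")
stream2 = io.BytesIO(b"a,1\nb,3\n")

diff = list(difflib.ndiff(read_text_lines(stream1), read_text_lines(stream2)))
print("".join(diff), end="")
```

With real S3 objects, the BufferedReader built in the snippet above would be passed to read_text_lines in place of the BytesIO stand-ins. Both files still end up fully in memory, because ndiff needs complete sequences to compare.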

As the comments on the question point out, readlines() pulls everything into memory, which is why you can pass a hint to it so that "no more lines will be read if the total size (in bytes/characters) of all lines so far exceeds hint" (https://docs.python.org/2/library/io.html#io.IOBase.readlines).
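
The effect of the hint can be seen on any binary stream. The sketch below uses an in-memory io.BytesIO as a stand-in for the S3 stream (an assumption made so that it runs without a bucket); the sample data and the hint value of 8 are arbitrary.

```python
import io

# Four lines of 6 bytes each, standing in for the contents of an S3 object.
stream = io.BytesIO(b"line1\nline2\nline3\nline4\n")

# With hint=8, reading stops once the lines read so far total at least
# 8 bytes: the first line is 6 bytes, the second brings the total to 12.
first_batch = stream.readlines(8)

# A call without a hint reads whatever is left in the stream.
remaining = stream.readlines()

print(first_batch)
print(remaining)
```

Note that the hint only bounds a single call. Processing a large file in batches means calling readlines(hint) repeatedly until it returns an empty list.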
