
Python reading file into 2d list - newline=None vs strip("\r\n")

EDIT: bigbounty asked for sample data. I'm not sure how to keep the newlines when pasting here, so I've provided an image instead. See below.

I'm trying to determine the correct and most Pythonic way to strip newlines when reading data from an external file into a 2d list. I'm having trouble working out what newline does within open() (and yes, I have checked the docs - it still hasn't clicked for me). Is the code below the correct way to read data into a 2d list, avoiding capturing newline characters? Is any part of it redundant (e.g. newline=None)?

EDIT: I'm on Windows, but looking for a cross-platform solution.

with open(file_name, "r", newline=None) as fh:
    list_2d = [[char for char in line.strip("\r\n")] for line in fh]

[image: sample data]

If you want to be compatible with all platforms, you can open the file in universal newlines mode, so that every line ending appears as a '\n' character (and you then only need to handle '\n'). In Python 2 this was the 'rU' mode. In Python 3 the 'U' flag is deprecated (and was removed in Python 3.11); universal newlines are instead enabled by newline=None, which is the default, meaning that the code snippet is cross-platform.

list_2d = []
with open(file_name, newline=None) as fh:
    for line in fh:
        # drop the trailing newline, then split the line into characters
        list_2d.append(list(line.rstrip("\n")))

There is no need to use the 'r' specifier if you only want to read, because it is already the default.
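As a rough check of the universal newlines behaviour, here is a small sketch. The helper name read_grid and the temporary files are made up for illustration. It writes the same two rows with Unix, Windows and old Mac line endings in binary mode, then reads each file back with the defaults.

```python
import os
import tempfile


def read_grid(path):
    """Read a text file into a list of character lists, one per line."""
    with open(path) as fh:  # mode "r" and newline=None are the defaults
        return [list(line.rstrip("\n")) for line in fh]


# Write the same content with three different line-ending styles.
grids = []
for ending in (b"\n", b"\r\n", b"\r"):
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "wb") as out:  # binary mode: bytes are written untouched
        out.write(b"ab" + ending + b"cd" + ending)
    grids.append(read_grid(path))
    os.remove(path)

print(grids[0])  # [['a', 'b'], ['c', 'd']]
print(grids[0] == grids[1] == grids[2])  # True
```

All three files should come back as the same 2d list, because each line ending is translated to '\n' before the line reaches your code.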

If you don't mind reading the whole file into memory in one go (which, it seems, you don't, since you're consuming the entire file and stuffing it into a list), you could use lines = file.read().splitlines() , which gives a list of strings where each string is one line (with no trailing carriage return or newline characters).
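A minimal sketch of that approach, extended to the 2d list the question asks for. Here io.StringIO stands in for an opened text file, which is an assumption made only to keep the example self-contained.

```python
import io

# io.StringIO stands in for a file object returned by open().
file = io.StringIO("ab\ncd\n")

lines = file.read().splitlines()  # ['ab', 'cd'], no trailing newline characters
list_2d = [list(line) for line in lines]

print(list_2d)  # [['a', 'b'], ['c', 'd']]
```

One thing to keep in mind: str.splitlines() also splits on a few less common boundaries (for example form feed and the Unicode line separator), so it may behave differently from line-by-line iteration if the data contains those characters.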

Just don't put the \n character in the list.

with open('a.txt', "r") as fh:
    list_2d = [[char for char in line if char!='\n'] for line in fh]

You don't need to specify:

  1. Read-only mode; that's the default.
  2. newline=None; that's the default.
with open(file_name) as fh:
    list_2d = [[char for char in line if char != "\n"] for line in fh]

The newline argument to open enables universal newlines mode if it is None or '' . The difference between the two is that None also translates the line endings to \n when the file is read (and translates \n back to the platform's line separator when the file is written), whilst '' doesn't perform this translation.

So if you use open with newline=None , you can expect any line ending in the file to be returned to you as \n , whichever platform you are on.
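To make the difference between None and '' concrete, here is a small sketch. The temporary file and the variable names are made up for illustration; the file is written as raw bytes with Windows-style line endings, then read back both ways.

```python
import os
import tempfile

# Create a small file with "\r\n" line endings, written as raw bytes.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as out:
    out.write(b"ab\r\ncd\r\n")

with open(path, newline=None) as fh:  # recognise line endings, translate to "\n"
    translated = fh.readlines()

with open(path, newline="") as fh:  # recognise line endings, keep them as-is
    untranslated = fh.readlines()

os.remove(path)

print(translated)    # ['ab\n', 'cd\n']
print(untranslated)  # ['ab\r\n', 'cd\r\n']
```

With newline='' you would have to strip "\r\n" yourself; with the default, stripping "\n" is enough.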

Since newline=None is the default (similarly, reading in text mode is the default), your example can be written for any platform as:

with open(file_name) as fh:
    list_2d = [[char for char in line.strip("\n")] for line in fh]
