Reading a text file into a 2D array in Python

I am trying to extract data from a text file and place it into a 2D array whose columns are organized by the value they represent, so that I can plot and analyze the data. The data in the text file is formatted like this:

10.24284 49.89447 462.90 312.4 Wed Dec 7, 2016 6:42:10p EST

I want to separate the values in each column into their own lists, or into a full 2D array. I have tried open('filename') as well as readlines, but they just return a jumble of numbers that is not organized in any way. What is the best way to do this?

Using open('filename', 'r') (the 'r' means read), you can loop over all the lines in the file with a simple for loop, something like this:

with open('filename', 'r') as inputfile:
    for line in inputfile:
        # do something with the string, for example print it
        print(line.rstrip())

You said that your data is formatted like this:

10.24284 49.89447 462.90 312.4 Wed Dec 7, 2016 6:42:10p EST

You could take each line and split it on every space like this:

line.split(" ")

You would now have something like:

['10.24284', '49.89447', '462.90', '312.4', 'Wed', 'Dec', '7,', '2016', '6:42:10p', 'EST']
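One detail worth checking: a line read from a file normally still ends with a newline, and split(" ") keeps it on the last field. A small sketch of the difference, using the sample line from the question as a hard-coded string (an assumption, so the snippet runs without the real file):

```python
# The sample record, with the trailing newline a file iterator would leave on it.
line = "10.24284 49.89447 462.90 312.4 Wed Dec 7, 2016 6:42:10p EST\n"

# Splitting on a literal space keeps the newline attached to the last field.
fields_space = line.split(" ")

# Splitting with no argument uses any run of whitespace and drops the newline.
fields_any = line.split()

# Stripping first gives the same clean result while still splitting on spaces.
fields_stripped = line.strip().split(" ")

print(repr(fields_space[-1]))
print(repr(fields_any[-1]))
```

If the file might contain tabs or repeated spaces, split() with no argument is probably the safer choice.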

If you want to keep the date together in the final array, you can limit how many times the line is split, like this:

line.split(" ", 4)

This would give you:

['10.24284', '49.89447', '462.90', '312.4', 'Wed Dec 7, 2016 6:42:10p EST']
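Putting the pieces together, here is a minimal sketch of building the full 2D array with the date kept as one string. The helper name parse_line is hypothetical, io.StringIO stands in for the real file, and the second sample row is made up purely for illustration:

```python
import io

def parse_line(line):
    """Split one record into four floats plus the remaining timestamp text."""
    parts = line.strip().split(" ", 4)        # at most 4 splits -> 5 pieces
    numbers = [float(p) for p in parts[:4]]   # the four numeric columns
    timestamp = parts[4]                      # everything after the fourth space
    return numbers + [timestamp]

# Stand-in for open('filename', 'r'); the second row is invented sample data.
sample = io.StringIO(
    "10.24284 49.89447 462.90 312.4 Wed Dec 7, 2016 6:42:10p EST\n"
    "10.24290 49.89450 463.10 312.5 Wed Dec 7, 2016 6:42:11p EST\n"
)

# Skip blank lines so an empty trailing line does not raise an error.
rows = [parse_line(line) for line in sample if line.strip()]

print(rows[0])
```

This assumes every non-blank line has at least five space-separated fields; a malformed line would raise an IndexError or ValueError.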

The numbers part is trivial:

with open("datafile.txt") as f:
    result = [[float(x) for x in L.split()[:4]] for L in f]

result[43][2] will be the third number on the 44th row (indexing starts from 0 in Python). Each row is built as a real list rather than with a bare map call, because map returns a lazy iterator in Python 3 and cannot be indexed.
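Since the question also asks for each column as its own list, one possible approach is to transpose the rows with zip. This is only a sketch: the result list below is hard-coded with hypothetical values (the second row is made up) in place of the data read from the file:

```python
# Hypothetical rows, shaped like the output of the list comprehension above.
result = [
    [10.24284, 49.89447, 462.90, 312.4],
    [10.24290, 49.89450, 463.10, 312.5],
]

# zip(*result) pairs up the i-th element of every row, i.e. it yields the columns.
columns = [list(col) for col in zip(*result)]

# Each column is now an ordinary list that can be passed to a plotting function.
col0, col1, col2, col3 = columns

print(col2)
```

Here columns[j][i] is the same value as result[i][j], so either layout can be used depending on whether you work row by row or column by column.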


 