
What's the easiest way to load array data from a file in Python?

I know Matlab has some nice syntax where you can put array definitions into a file, like A = [[1,2,3],... , and then import that file so that all those definitions are read automatically. I would like to do something similar in Python. Basically, I'm looking for the easiest way to read tabular data from a file and get the result as numpy array instances. What's the easiest (or the most Pythonic) way to accomplish this?

Say the data in the file is as follows:

Array1
1 0 0 0
2 1 0 0
3 0.3333333333325028 0 0
4 0.6666666666657888 0 0

Array2
1 1 1 1
2 3 1 1
3 2 2 2
4 3 2 2
5 1 1 3
6 1 3 4
7 1 4 2

File test1.py:

#!/usr/bin/python
a=[1,2,3,4,5,6]

File test.py:

#!/usr/bin/python

import test1

print(test1.a)

Now if you run test.py:

$ ./test.py
[1, 2, 3, 4, 5, 6]
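
Since the question asks for numpy arrays, here is a minimal sketch of the same module-import idea with a conversion step added. The module name data_arrays and its contents are hypothetical; the file is written to a temporary directory only so the example is self-contained. Normally you would write the file by hand and use a plain import.

```python
import importlib.util
import pathlib
import tempfile

import numpy as np

# Hypothetical data module; in practice this is a file you write by hand.
module_source = "a = [[1, 2, 3], [4, 5, 6]]\n"

with tempfile.TemporaryDirectory() as tmp:
    path = pathlib.Path(tmp) / "data_arrays.py"
    path.write_text(module_source)

    # Load the file as a module. This is similar in effect to
    # `import data_arrays` when the file sits next to your script.
    spec = importlib.util.spec_from_file_location("data_arrays", path)
    data_arrays = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(data_arrays)

# Nested lists from the module become a 2D numpy array.
A = np.array(data_arrays.a)
print(A.shape)  # (2, 3)
```

Keep in mind that importing a module executes it, so this only makes sense for data files you trust.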

What Jahid suggested (importing the data as a Python module) works well if you want to keep your data in Python modules.

If, on the other hand, you'd rather keep your data in a separate file, e.g. a text file, and read it from a script, you may want to use numpy.loadtxt, which is designed to read matrix-like text files into numpy arrays.

http://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html
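
numpy.loadtxt reads one rectangular table at a time, so a file like the one in the question (a name line followed by rows, with blocks separated by blank lines) needs to be split first. The following is only a sketch under that assumed layout; load_named_arrays is a hypothetical helper, not part of numpy.

```python
import io
import re

import numpy as np


def load_named_arrays(text):
    """Parse blocks made of a name line followed by rows of numbers."""
    arrays = {}
    # Blocks are assumed to be separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        name, rows = lines[0].strip(), lines[1:]
        # loadtxt accepts any file-like object, so wrap the rows in StringIO.
        arrays[name] = np.loadtxt(io.StringIO("\n".join(rows)), ndmin=2)
    return arrays


sample = """Array1
1 0 0 0
2 1 0 0
3 0.3333333333325028 0 0
4 0.6666666666657888 0 0

Array2
1 1 1 1
2 3 1 1
3 2 2 2
4 3 2 2
5 1 1 3
6 1 3 4
7 1 4 2
"""

arrays = load_named_arrays(sample)
print(arrays["Array1"].shape)  # (4, 4)
print(arrays["Array2"].shape)  # (7, 4)
```

To read from disk instead, pass the contents of the file, e.g. load_named_arrays(open("data.txt").read()), where data.txt is a placeholder name. Note that loadtxt returns floating-point arrays by default, so Array2 comes back as floats unless you pass a dtype.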

What you probably want is to put your data in the YAML file format. It is a text data format whose structure is modeled on the data types of higher-level scripting languages like Python. You can put multiple 2D arrays of arbitrary types in it. However, since it is just data, not code, it isn't as dangerous as putting the data directly in a Python script. It can easily express 2D arrays, or more strictly nested lists (see example 2.5 in the YAML specification), as well as the equivalent of ordinary lists, dicts, nested dicts, strings, and any combination thereof. Since you can nest one data type in another, you can have, for example, a dictionary of 2D arrays, which lets you put multiple arrays in a single file.

Here is your example in YAML:

Array1:
- [1, 0, 0, 0]
- [2, 1, 0, 0]
- [3, 0.3333333333325028, 0, 0]
- [4, 0.6666666666657888, 0, 0]

Array2:
- [1, 1, 1, 1]
- [2, 3, 1, 1]
- [3, 2, 2, 2]
- [4, 3, 2, 2]
- [5, 1, 1, 3]
- [6, 1, 3, 4]
- [7, 1, 4, 2]

And here is how to read it into numpy arrays (the file is called "temp.yaml" in my example), using the PyYAML package:

>>> import numpy as np
>>> import yaml
>>>
>>> with open('temp.yaml') as ym:
...     res = yaml.safe_load(ym)  # safe_load only builds plain data types
...
>>> res
{'Array1': [[1, 0, 0, 0],
  [2, 1, 0, 0],
  [3, 0.3333333333325028, 0, 0],
  [4, 0.6666666666657888, 0, 0]],
'Array2': [[1, 1, 1, 1],
  [2, 3, 1, 1],
  [3, 2, 2, 2],
  [4, 3, 2, 2],
  [5, 1, 1, 3],
  [6, 1, 3, 4],
  [7, 1, 4, 2]]}
>>> array1 = np.array(res['Array1'])
>>> array2 = np.array(res['Array2'])
>>> print(array1)
[[ 1.          0.          0.          0.        ]
 [ 2.          1.          0.          0.        ]
 [ 3.          0.33333333  0.          0.        ]
 [ 4.          0.66666667  0.          0.        ]]
>>> print(array2)
[[1 1 1 1]
 [2 3 1 1]
 [3 2 2 2]
 [4 3 2 2]
 [5 1 1 3]
 [6 1 3 4]
 [7 1 4 2]]
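
If the file holds many arrays, converting them one by one gets tedious. As a rough sketch (assuming PyYAML and numpy are installed, and using a shortened inline copy of the data instead of temp.yaml), a dict comprehension converts every entry at once, and tolist() lets you write arrays back out as YAML:

```python
import numpy as np
import yaml

# Shortened inline copy of the example data; a real script would read a file.
yaml_text = """
Array1:
- [1, 0, 0, 0]
- [3, 0.3333333333325028, 0, 0]
Array2:
- [1, 1, 1, 1]
- [2, 3, 1, 1]
"""

data = yaml.safe_load(yaml_text)

# Convert every nested list in the mapping to a numpy array.
arrays = {name: np.array(rows) for name, rows in data.items()}

# Going the other way: numpy arrays are not plain YAML types, so convert
# them to nested lists before dumping.
dumped = yaml.safe_dump({name: arr.tolist() for name, arr in arrays.items()})
roundtrip = yaml.safe_load(dumped)

print(arrays["Array1"].dtype)  # float64, because one entry is a float
```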
