Reading data from a text file using numpy

I have a fairly simple text file that I want to read using numpy. I need to read the numbers from the rows that have more than 2 columns and do not start with a "#".

   12

 C     0.000000     0.000000     0.000000
 C     0.000000     0.000000     1.400000
 C     1.212436     0.000000     2.100000
 C     2.424871     0.000000     1.400000
 C     2.424871     0.000000     0.000000
 C     1.212436     0.000000    -0.700000
 H    -0.943102     0.000000     1.944500
 H     1.212436     0.000000     3.189000
 H     3.367973     0.000000     1.944500
 H     3.367973     0.000000    -0.544500
 H     1.212436     0.000000    -1.789000
 H    -0.943102     0.000000    -0.544500

I have tried the following code:

import numpy as np
class mol:

    def __init__(self):
        self.masses = {'H': 1, 'D': 2, 'C': 12, 'O': 16}

    def read_xyz(self, filename):
        self.filename = filename
        with open(self.filename) as f:
            for line in f:
                if not line.startswith("#") and len(line.split())>3:
                    print np.loadtxt(line)

if __name__ == "__main__":
    test = mol()
    test.read_xyz('benz.xyz')

But my code crashes, and if I print each line I get an empty line between the rows, and I don't know why. Any help would be great!
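Two things go wrong in the loop above. np.loadtxt expects a file name or a file-like object, so passing it a single line of text makes it try to open a file with that name. The blank lines appear because every line read from the file still ends with "\n" and print adds a second newline. A minimal sketch of the same loop that splits each line by hand instead; the function name read_xyz_coords and the sample text are only illustrative, and the filter keeps rows with at least four columns, like the len(line.split())>3 check in the question:

```python
import io
import numpy as np

def read_xyz_coords(lines):
    """Return an (N, 3) float array from an iterable of XYZ-style text lines."""
    rows = []
    for line in lines:
        fields = line.split()
        # Skip comment lines and rows without a symbol plus three coordinates.
        if line.lstrip().startswith("#") or len(fields) < 4:
            continue
        # fields[0] is the element symbol; fields[1:4] are x, y, z.
        rows.append([float(value) for value in fields[1:4]])
    return np.array(rows)

# Illustrative input; an open file object can be passed in the same way.
sample = """   2

# a comment line
 C     0.000000     0.000000     1.400000
 H    -0.943102     0.000000     1.944500
"""
coords = read_xyz_coords(io.StringIO(sample))
print(coords)
```

With a real file this would be called as read_xyz_coords(f) inside the with open(...) as f block.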

I would suggest using a regex instead, something like:

import numpy as np
class mol:

    def __init__(self):
        self.masses = {'H': 1, 'D': 2, 'C': 12, 'O': 16}

    def read_xyz(self, filename):
        self.filename = filename
        regexp = r'\s+\w+' + r'\s+([-.0-9]+)' * 3 + r'\s*\n'
        data = np.fromregex(self.filename, regexp, dtype='f')
        print(data)

if __name__ == "__main__":
    test = mol()
    test.read_xyz('benz.xyz')

In this case, I obtained:

[[ 0.        0.        0.      ]
 [ 0.        0.        1.4     ]
 [ 1.212436  0.        2.1     ]
 [ 2.424871  0.        1.4     ]
 [ 2.424871  0.        0.      ]
 [ 1.212436  0.       -0.7     ]
 [-0.943102  0.        1.9445  ]
 [ 1.212436  0.        3.189   ]
 [ 3.367973  0.        1.9445  ]
 [ 3.367973  0.       -0.5445  ]
 [ 1.212436  0.       -1.789   ]
 [-0.943102  0.       -0.5445  ]]

You will need to modify the regex if you also want to keep the first column, which holds the element symbol.
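One possible way to keep the first column is to turn it into a capturing group and give np.fromregex a structured dtype with one field per group. This is only a sketch: the field names symbol, x, y, z and the 'U2' string width are my own assumptions, and the sample text stands in for the real file. As far as I understand the NumPy documentation, a structured dtype is also the form np.fromregex expects for its dtype argument.

```python
import io
import numpy as np

# Assumed field layout: a short element symbol followed by three floats.
atom_dtype = [("symbol", "U2"), ("x", "f8"), ("y", "f8"), ("z", "f8")]

# Same pattern as before, except that the first column is now captured too.
regexp = r"\s+(\w+)" + r"\s+([-.0-9]+)" * 3 + r"\s*\n"

# Illustrative input; a file name such as 'benz.xyz' can be passed instead.
sample = io.StringIO(
    "   2\n"
    "\n"
    " C     0.000000     0.000000     1.400000\n"
    " H    -0.943102     0.000000     1.944500\n"
)

atoms = np.fromregex(sample, regexp, atom_dtype)
print(atoms["symbol"])

# Stack the coordinate fields back into a plain (N, 3) float array.
coords = np.column_stack([atoms["x"], atoms["y"], atoms["z"]])
print(coords)
```

The symbols can then be looked up in the masses dictionary, for example [masses[s] for s in atoms["symbol"]].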
