
How to read the following table correctly with pd.read_csv in Python?

I have a file, spe.dat, that looks like this:

  1  [ 1s 1/2-1/2]+    0.83   -66.379    -1.0000000
  2  [ 1s 1/2 1/2]+    0.83   -66.379    -1.0000000
  3  [ 1s 1/2-1/2]+    0.82   -61.930     1.0000000
  4  [ 1s 1/2 1/2]+    0.82   -61.930     1.0000000
  5  [ 1p 3/2-1/2]-    0.73   -40.210    -1.0000000
  6  [ 1p 3/2 1/2]-    0.77   -40.210    -1.0000000
  7  [ 1p 3/2-3/2]-    0.76   -40.210    -1.0000000
  8  [ 1p 3/2 3/2]-    0.64   -40.210    -1.0000000

I read it in the following way:

spe=pd.read_csv("spe.dat",delimiter='s\+',skiprows=[0,1])
spe.columns=['index','label','weight','ee','tz']

I got the error message:

ValueError: Length mismatch: Expected axis has 1 elements, new values have 5 elements

I realized that the second column such as '[ 1s 1/2-1/2]+' was read as three columns. Is there any way to read the whole '[ 1s 1/2-1/2]+' as one column? Thanks.

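You are not separating the columns properly when you read the DataFrame. The pattern 's\+' matches a literal "s" followed by a plus sign rather than whitespace, so nothing is split and the frame ends up with a single column, which is what the error reports. The whitespace pattern '\s+' would not help either, because it also splits on the single spaces inside the label. The columns are separated by at least two spaces, while the label only contains single spaces, so splitting on runs of two or more whitespace characters ('\s{2,}') keeps the label in one piece. I recommend reading the Python regex tutorial to understand how to use regular expressions for the separator.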

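import pandas as pd
columns = ['index', 'label', 'weight', 'ee', 'tz']
# A regex separator is handled by the Python parsing engine; the raw string keeps the backslash intact.
pd.read_csv('spe.dat', sep=r'\s{2,}', names=columns, index_col=0, skiprows=[0, 1], engine='python')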

returns

                label  weight      ee   tz
index                                     
1      [ 1s 1/2-1/2]+    0.83 -66.379 -1.0
2      [ 1s 1/2 1/2]+    0.83 -66.379 -1.0
3      [ 1s 1/2-1/2]+    0.82 -61.930  1.0
4      [ 1s 1/2 1/2]+    0.82 -61.930  1.0
5      [ 1p 3/2-1/2]-    0.73 -40.210 -1.0
6      [ 1p 3/2 1/2]-    0.77 -40.210 -1.0
7      [ 1p 3/2-3/2]-    0.76 -40.210 -1.0
8      [ 1p 3/2 3/2]-    0.64 -40.210 -1.0
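
To see why the choice of separator matters, the three patterns can be compared on a single line with Python's re module. This is only an illustrative sketch: the line is the first data row from the question, and the variable names are made up for the example.

```python
import re

# First data row from the question, including its leading and trailing spaces.
line = "  1  [ 1s 1/2-1/2]+    0.83   -66.379    -1.0000000     "
stripped = line.strip()

# 's\+' looks for a literal "s" followed by "+", which never occurs here,
# so the line is returned as a single field.
no_split = re.split(r"s\+", stripped)

# '\s+' splits on every run of whitespace, including the single spaces
# inside the bracketed label, so the label is broken into several fields.
every_space = re.split(r"\s+", stripped)

# '\s{2,}' splits only on runs of two or more whitespace characters,
# so the label stays together.
fields = re.split(r"\s{2,}", stripped)

print(len(no_split))     # 1
print(len(every_space))  # 7
print(fields)            # ['1', '[ 1s 1/2-1/2]+', '0.83', '-66.379', '-1.0000000']
```

Only the last pattern yields the five fields that match the five column names.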
