Python: read mixed float and string csv file

I have a csv file with mixed floats, a string and an integer, the formatted output from a FORTRAN file. A typical line looks like:

 507.930    ,  24.4097    ,   1.0253E-04, O  III   ,    4

I want to read it while keeping the float decimal places unmodified, and check whether the first entry in each line is present in another list.

Using loadtxt and genfromtxt results in the decimal places changing from 3 (or 4) to 12.

How should I tackle this?

If you need to keep the precision exactly, you need to use the decimal module. Otherwise, the limitations of floating point arithmetic might trip you up.
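As a minimal sketch of that point (the value is taken from the sample line in the question; the variable names are only illustrative), a `Decimal` built from the text keeps the digits as written, while a `float` stores the nearest binary value and drops the trailing zero:

```python
from decimal import Decimal

text = "507.930"  # first field of the sample line

as_decimal = Decimal(text)  # keeps the digits exactly as written
as_float = float(text)      # nearest binary double

print(str(as_decimal))                  # 507.930
print(repr(as_float))                   # 507.93
print(Decimal(as_float) == as_decimal)  # False: the double is not exactly 507.93
```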

Chances are, though, that you don't really need that precision - just make sure you never compare floats for exact equality but always allow a small tolerance, and format the output to a limited number of significant digits:

import sys

# instead of if float1 == float2:, use this
# (epsilon is only a sensible tolerance for values of magnitude around 1):
if abs(float1 - float2) <= sys.float_info.epsilon:
    print("equal")
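For values as large as the ones in the question, a relative tolerance may be a better fit than a fixed epsilon. The following is only a sketch, assuming Python 3.9 or later (for `math.nextafter`); the tolerance `1e-9` is an arbitrary illustrative choice, not something from the original answer:

```python
import math
import sys

x = 507.93
y = math.nextafter(x, math.inf)  # the very next representable double above x

# Adjacent doubles near 507.93 are further apart than epsilon,
# so a fixed-epsilon check treats them as different.
fixed_eps_equal = abs(x - y) <= sys.float_info.epsilon

# A relative tolerance scales with the magnitude of the values.
relative_equal = math.isclose(x, y, rel_tol=1e-9)

print(fixed_eps_equal, relative_equal)
```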

loadtxt appears to take a converters argument so something like:

from decimal import Decimal
import numpy

numpy.loadtxt(..., converters={0: Decimal,
                               1: Decimal,
                               2: Decimal})

Should work.

Decimals should work with whatever precision you require, although if you're doing significant number crunching with Decimal it will be considerably slower than working with float. However, I assume you're just looking to transform the data without losing any precision, so this should be fine.

I finished up writing some string processing code. Not elegant but it works:

from numpy import loadtxt

# columns: three 64-bit floats, a 10-character string, a 32-bit integer
stuff = loadtxt(fname1, skiprows=35, dtype="f8,f8,f8,U10,i4", delimiter=',')
stuff2 = loadtxt('keylines.txt')  # a list of the reference values
... # open file for writing etc
for i in range(len(stuff)):
    bb = round(float(stuff[i][0]), 3)  # gets number back to correct decimal format
    cc = round(float(stuff[i][1]), 5)  # ditto
    dd = float(stuff[i][2])
    ee = stuff[i][3].replace(" ", "")  # gets rid of extra FORTRAN spaces
    ff = int(stuff[i][4])
    for item in stuff2:
        if bb == item:
            fn.write(str(bb) + ',' + "%1.5f" % cc + ',' + "%1.4e" % dd + ',' + ee + ',' + str(ff) + '\n')
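A similar filter can also be sketched with only the standard library, keeping each field as the text that was read so that no digits change. This is an illustration under stated assumptions, not the code from the answer above: the in-memory `sample`, its second row, and the `keys` set are hypothetical stand-ins for the FORTRAN output file and `keylines.txt`.

```python
import csv
import io
from decimal import Decimal

# Hypothetical sample standing in for the FORTRAN output file;
# the first row is the line from the question, the second is made up.
sample = io.StringIO(
    " 507.930    ,  24.4097    ,   1.0253E-04, O  III   ,    4\n"
    " 510.120    ,  11.2000    ,   3.1000E-05, Fe II    ,    2\n"
)

# Hypothetical reference values standing in for keylines.txt
keys = {Decimal("507.930")}

matches = []
for row in csv.reader(sample):
    fields = [f.strip() for f in row]              # drop the FORTRAN padding
    if Decimal(fields[0]) in keys:                 # exact decimal comparison
        fields[3] = fields[3].replace(" ", "")     # "O  III" -> "OIII"
        matches.append(",".join(fields))           # digits stay as written

print(matches)
```

Because the numeric fields are never converted to float, no rounding or re-formatting step is needed before writing them back out.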
