I have five .xpt data files which I am trying to read in Python. I want to get the data in those files into an array. This is for a machine learning project. I am trying to do that with the code below, but I am getting an error on the line 'with xport.Reader(fname) as reader:'. I am also not sure whether fixing this error would fix the whole code or not. Is there any other method I can use?
# imports to read the xpt file data
import xport
import numpy as np
import os

# read the xpt file data
FN = ["BMX.XPT", "BMX_B.XPT", "BMX_C.XPT", "BMX_D.XPT", "BMX_E.XPT"]

def get_data(fname):
    Z = {}
    H = None
    with xport.Reader(fname) as reader:
        for row in reader:
            if H is None:
                H = row.keys()
                H.remove("SEQN")
                H.sort()
            Z[row["SEQN"]] = [row[k] for k in H]
    return Z, H

# call get_data method on each file
D, VN = [], []
for fn in FN:
    fn_full = os.path.join("../Data/", fn)
    X, H = get_data(fn_full)
    s = fn.replace(".XPT", "")
    H = [s + ":" + x for x in H]
    D.append(X)
    VN += H

## The sequence numbers that are in all data sets
KY = set(D[0].keys())
for d in D[1:]:
    KY &= set(d.keys())
KY = list(KY)
KY.sort()

def to_float(x):
    try:
        return float(x)
    except ValueError:
        return float("nan")

## Merge the data
Z = []
for ky in KY:
    z = []
    map(z.extend, (d[ky] for d in D))
    ## equivalent to
    ## for d in D:
    ##     z.extend(d[ky])
    z = [to_float(a) for a in z]
    ## equivalent to
    ## map(to_float, z)
    Z.append(z)
Z = np.array(Z)
It seems the input to xport.Reader should be an open file object rather than a string. Open the file yourself and pass that in, instead of passing the file name as a parameter. XPT is a binary format, so the file should be opened in binary mode.
E.g.
with xport.Reader(open(fname, 'rb')) as reader:
....
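On the two open points in the question (whether the rest of the code would work, and whether there is another method), here is a minimal sketch rather than a definitive fix. It assumes Python 3, where dict.keys() returns a view with no remove() or sort(), and where map() is lazy, so map(z.extend, ...) never actually extends z. It also assumes pandas is installed for the alternative reader; pandas.read_sas with format="xport" is a real pandas function, but the helper names read_xpt_rows, index_by_seqn and merge_tables are made up for this illustration.

```python
def read_xpt_rows(fname):
    """Hypothetical helper: load one XPT file as a list of row dicts via pandas."""
    # Imported lazily so the rest of the sketch runs without pandas installed.
    import pandas as pd
    frame = pd.read_sas(fname, format="xport")
    return frame.to_dict("records")


def index_by_seqn(rows, key="SEQN"):
    """Return ({seqn: [values...]}, sorted column names) for one file's rows."""
    table, columns = {}, None
    for row in rows:
        if columns is None:
            # sorted() accepts a keys view; the view itself cannot be
            # mutated with remove()/sort() in Python 3.
            columns = sorted(k for k in row.keys() if k != key)
        table[row[key]] = [row[k] for k in columns]
    return table, (columns or [])


def to_float(x):
    """Convert to float, mapping anything unconvertible (including None) to NaN."""
    try:
        return float(x)
    except (TypeError, ValueError):
        return float("nan")


def merge_tables(tables):
    """Keep only keys present in every table and concatenate their value lists."""
    common = set(tables[0])
    for t in tables[1:]:
        common &= set(t)
    common = sorted(common)
    merged = []
    for ky in common:
        z = []
        for t in tables:  # explicit loop, because map() is lazy in Python 3
            z.extend(t[ky])
        merged.append([to_float(a) for a in z])
    return common, merged
```

With these pieces, each file could be loaded with index_by_seqn(read_xpt_rows(path)), the resulting tables passed to merge_tables, and the merged list handed to np.array as in the original code.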