
How to deal with data from an ARFF file in Python?

I am pretty new to Python. I am currently using it to read an ARFF file:

import arff

for row in arff.load('cpu.arff'):   
    x = row
    print(x)

Part of the sample output looks like this:

<Row(125.0,256.0,6000.0,256.0,16.0,128.0,198.0)>
<Row(29.0,8000.0,32000.0,32.0,8.0,32.0,269.0)>
<Row(29.0,8000.0,32000.0,32.0,8.0,32.0,220.0)>
<Row(29.0,8000.0,32000.0,32.0,8.0,32.0,172.0)>
<Row(29.0,8000.0,16000.0,32.0,8.0,16.0,132.0)>
<Row(26.0,8000.0,32000.0,64.0,8.0,32.0,318.0)>
<Row(23.0,16000.0,32000.0,64.0,16.0,32.0,367.0)>

Only the last column is the label; the remaining columns are the attributes. How can I store them in arrays? I want to assign the last column to y and the first six columns to x, and then run cross-validation on the data from the ARFF file.

Or is there a way to separate the attributes and the label in an ARFF file automatically?

Row objects from the arff module support typical Python sequence slicing, so you can separate the attributes from the labels easily:

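import arff

X = []  # attribute values, one entry per row
y = []  # labels, one entry per row

for row in arff.load('cpu.arff'):
    X.append(row[:-1])  # every column except the last
    y.append(row[-1])   # the last column is the label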
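
Since the stated goal is cross-validation, here is a minimal sketch of the same split followed by a simple k-fold partition. It uses only the standard library, and plain tuples copied from the sample output stand in for the Row objects, so it runs without cpu.arff. The helper names split_features_label and k_fold_indices are made up for this illustration and are not part of the arff module.

```python
# Sample rows copied from the question's output; plain tuples stand in
# for the Row objects (an assumption made so the sketch is self-contained).
rows = [
    (125.0, 256.0, 6000.0, 256.0, 16.0, 128.0, 198.0),
    (29.0, 8000.0, 32000.0, 32.0, 8.0, 32.0, 269.0),
    (29.0, 8000.0, 32000.0, 32.0, 8.0, 32.0, 220.0),
    (29.0, 8000.0, 32000.0, 32.0, 8.0, 32.0, 172.0),
    (29.0, 8000.0, 16000.0, 32.0, 8.0, 16.0, 132.0),
    (26.0, 8000.0, 32000.0, 64.0, 8.0, 32.0, 318.0),
    (23.0, 16000.0, 32000.0, 64.0, 16.0, 32.0, 367.0),
]


def split_features_label(rows):
    """Return (X, y): every column but the last, and the last column."""
    X = [list(row[:-1]) for row in rows]
    y = [row[-1] for row in rows]
    return X, y


def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, test_idx) pairs for contiguous, unshuffled folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
                  for i in range(n_folds)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples)
                     if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size


X, y = split_features_label(rows)
folds = list(k_fold_indices(len(X), 3))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Each pair of index lists can be used to pick the training and test rows out of X and y. The folds here are not shuffled; for real experiments, a library such as scikit-learn (for example its KFold class) is probably a better choice than a hand-rolled splitter.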


 