IsolationForest KeyError: “None of [Index([''], dtype='object')] are in the [columns]”

I am trying to implement my first anomaly detection with IsolationForest, but I cannot get it to work.

I have a .csv file with different network parameters like ip.ttl, frame.len, etc.

import pandas as pd

# Read in the data
quelle = pd.read_csv('./x.csv')
pdf = quelle.to_numpy()
print(quelle.columns)

Index([';ip.proto;ttl;frame.len;ip.src;ip.dst;ip.len;ip.flags;eth.src;eth.dst;eth.type;vlan.id;udp.port'], dtype='object')

print(quelle.shape)

(1658, 1)

But when I try to fit the IsolationForest model on a single column such as ttl or frame.len, I get an error:

from sklearn.ensemble import IsolationForest

model = IsolationForest(n_estimators=50, max_samples='auto', contamination=0.1, max_features=1.0)
model.fit(quelle[['frame.len']])

KeyError: "None of [Index(['frame.len'], dtype='object')] are in the [columns]"

Where is my mistake?

Thanks in advance

The DataFrame has many rows but only a single column:

print(quelle.shape)
(1658, 1)

When you loaded the file, read_csv used its default delimiter, a comma. Your file is semicolon-separated, so every line was read as one field and all columns were packed into a single column whose name is the entire header line. That is why 'frame.len' is not found among the columns.
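The effect can be reproduced on a small sample. This is only a sketch: the data below is invented for illustration and is not taken from the original x.csv, and the name df_default is hypothetical.

```python
import io
import pandas as pd

# Hypothetical semicolon-separated sample; the values are invented.
SAMPLE = "ttl;frame.len;ip.len\n64;60;46\n128;1514;1500\n64;66;52\n"

# No sep is given, so pandas splits on ',' and finds nothing to split.
df_default = pd.read_csv(io.StringIO(SAMPLE))

print(df_default.shape)                   # (3, 1)
print(list(df_default.columns))           # ['ttl;frame.len;ip.len']
print("frame.len" in df_default.columns)  # False, hence the KeyError
```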

To solve this issue, specify the delimiter when reading the file:

quelle = pd.read_csv('./x.csv', sep=';')
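Put together, the fix might look like the sketch below. The sample data is invented and only mimics the header layout from the question, where the leading ';' suggests an unnamed first column. Passing index_col=0 is an assumption that this column is a saved row index; drop that argument if it is not. random_state=0 is added only to make the run repeatable.

```python
import io
import pandas as pd
from sklearn.ensemble import IsolationForest

# Invented sample with the same header shape as in the question
# (leading ';' means the first column has no name).
SAMPLE = (
    ";ttl;frame.len\n"
    "0;64;60\n"
    "1;64;66\n"
    "2;128;62\n"
    "3;64;64\n"
    "4;64;1514\n"
)

# sep=';' splits the columns; index_col=0 (an assumption) uses the
# unnamed first column as the row index instead of as a feature.
df = pd.read_csv(io.StringIO(SAMPLE), sep=";", index_col=0)
print(list(df.columns))  # ['ttl', 'frame.len']

model = IsolationForest(n_estimators=50, contamination=0.1, random_state=0)
model.fit(df[["frame.len"]])

# predict returns 1 for inliers and -1 for outliers.
labels = model.predict(df[["frame.len"]])
print(labels)
```

With the real file, replace io.StringIO(SAMPLE) with './x.csv' and check quelle.columns before fitting to confirm that the column names are what you expect.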
