Error: Input contains NaN, infinity or a value too large for dtype('float32')

I am solving a random forest regression problem. My code is below:

import os

import joblib
import pandas as pd

dataset = pd.read_csv('C:/random forest/data.csv', decimal=',')
xrf1 = dataset.iloc[:, 0:3].values
RESULTS_FOLDER = 'C:/random forest'
model_path = os.path.join(RESULTS_FOLDER, 'modele rf1.pkl')
model = joblib.load(model_path)
predrf1 = model.predict(xrf1)

I am getting this error:

ValueError: Input contains NaN, infinity or a value too large for dtype('float32').

Here is the link to my data:

https://www.dropbox.com/s/nuajvw0xuux7bm3/data.csv?dl=0

Please help me solve this error.

The problem is that your numbers use a comma as the decimal separator, which Python does not understand. You can verify this by typing float('-12,95525169'), which also raises a ValueError.
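To see this in isolation, here is a minimal sketch. The helper name parse_decimal_comma is hypothetical (it is not part of the original code), and it assumes the comma is always a decimal separator and never a thousands separator:

```python
def parse_decimal_comma(text):
    """Convert a string such as '-12,95525169' to a float.

    Assumes the comma is a decimal separator and that the value
    contains no thousands separators.
    """
    return float(text.replace(",", "."))


try:
    float("-12,95525169")
except ValueError as exc:
    # float() only accepts a period as the decimal separator.
    error_message = str(exc)

# Swapping the comma for a period makes the string parseable.
value = parse_decimal_comma("-12,95525169")
```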

However, since you are using pandas, you can maybe solve this pretty easily.

If all your floats have the same comma separator, you can use the following to read your CSV file:

dataset = pd.read_csv("C:/random forest/data.csv", delimiter=";", decimal=",")

With decimal=",", pandas parses these strings and converts them to float properly, while delimiter=";" tells it that the columns are separated by semicolons.
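As a rough check before calling model.predict, you could confirm that the columns really were parsed as floats and contain only finite values. The sketch below uses a small in-memory sample instead of the real data.csv; the column names a, b, c and the values are made up for illustration, and the file is assumed to be semicolon-delimited as suggested above:

```python
import io

import numpy as np
import pandas as pd

# Hypothetical sample standing in for data.csv: semicolon-separated
# columns with a comma as the decimal mark.
sample = io.StringIO(
    "a;b;c\n"
    "-12,95525169;1,5;2,25\n"
    "3,0;4,75;-0,5\n"
)

dataset = pd.read_csv(sample, delimiter=";", decimal=",")
features = dataset.iloc[:, 0:3].values

# Every column should now be float64 rather than object (string).
all_float = all(dtype == np.float64 for dtype in dataset.dtypes)

# scikit-learn rejects NaN and infinite inputs, so check before predict().
all_finite = bool(np.isfinite(features).all())
```

If all_float is False, some column was still read as text; if all_finite is False, the data contains missing or infinite values that need to be handled before prediction.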

Your string has a comma in it. Python's number formatting (and thus its parser) uses the period as the decimal separator and does not use thousands separators.
