I am cleaning a data frame and am running into issues because its columns are a mix of int and str, and I am trying to convert all of them to floats. The data frame is all numbers, apart from some entries holding the string '?' that I want to replace with 0.0. How should I go about doing so?
import pandas as pd

# Load the data from the file
df = pd.read_csv('processed.state.csv')
df.apply(pd.to_numeric)
This yields an error: Unable to parse string "?" at position 165
df = pd.DataFrame([1,23,'1','2', "?"])
df.replace('?', 0).apply(pd.to_numeric)
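If the placeholder is not always exactly '?', a variation on the same idea is to let pd.to_numeric coerce anything it cannot parse to NaN and then fill those with 0. This is only a sketch on a toy frame; the column names and values below are made up for illustration and do not come from the original CSV.

```python
import pandas as pd

# Toy frame standing in for the real CSV (hypothetical columns and values).
df = pd.DataFrame({
    "c1": [1, 23, "1", "2", "?"],
    "c2": [1.5, "4", "?", "7", 0],
})

# errors="coerce" is forwarded by apply to pd.to_numeric, which turns
# unparseable entries into NaN; fillna(0) then replaces those with 0.
cleaned = df.apply(pd.to_numeric, errors="coerce").fillna(0).astype(float)
print(cleaned)
```

Note that this also turns genuinely missing values into 0, which may or may not be what you want.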
A more general solution, which replaces any non-numeric value with 0, is:
def fun(x):
    try:
        return float(x)
    except ValueError:
        return 0
df = pd.DataFrame({'c1': [1,23,'1','2', "?"], 'c2': [1,23,'abc','2', "?"]})
# In pandas 2.1 and later, applymap is deprecated in favor of df.map(fun)
df.applymap(fun)
You can create your own function:
def to_float(item):
    try:
        return float(item)
    except ValueError:
        return 0
And apply that to the DataFrame instead.
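One way to apply such a function to every cell is sketched below. It maps the function over each column with Series.map, which avoids depending on DataFrame.applymap or DataFrame.map. The sample frame is made up, and catching TypeError as well is an addition of mine so that None entries also become 0.

```python
import pandas as pd

def to_float(item):
    """Convert item to float, falling back to 0.0 if that is not possible."""
    try:
        return float(item)
    except (ValueError, TypeError):  # TypeError covers None and similar
        return 0.0

# Hypothetical sample data, including a None to show the TypeError case.
df = pd.DataFrame({
    "c1": [1, 23, "1", "2", "?"],
    "c2": [1, 23, "abc", None, "?"],
})

# Apply the converter element-wise, one column at a time.
result = df.apply(lambda col: col.map(to_float))
print(result)
```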
You can use pandas.DataFrame.replace:
df = pd.read_csv('processed.state.csv', encoding='utf-8')
df = df.replace('?', 0)  # replace returns a new DataFrame, so assign the result
df = df.apply(pd.to_numeric)
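Since the data comes from a CSV anyway, another option is to tell read_csv up front that '?' means "missing", so the columns are parsed as numbers from the start. The sketch below uses an in-memory CSV with made-up contents in place of processed.state.csv, which I cannot see.

```python
import io
import pandas as pd

# Stand-in for the real file; contents are invented for illustration.
csv_text = "a,b\n1,2\n?,3\n4,?\n"

# na_values="?" makes read_csv treat '?' as NaN while parsing,
# so both columns come out numeric instead of object.
df = pd.read_csv(io.StringIO(csv_text), na_values="?")

# Replace the missing markers with 0 and make every column float.
df = df.fillna(0).astype(float)
print(df)
```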
df['col'] = df['col'].map(lambda x: 0.0 if x == '?' else x).astype(np.float64)