
Replace string character in np.array

I have a numpy array of shape (8634, 3) that contains numerical values in mixed English and German number formats, e.g. 34,12 and 34.15:

X = np.array(df[['column_a','column_b','column_c']])
X.shape
(8634,3)
X.dtype
dtype('O')

I want to replace the "," with "." using this call:

int_X = X.replace(',','.')

But I am getting this error:

AttributeError: 'numpy.ndarray' object has no attribute 'replace'

Can someone help me with the correct function that I need to use? Thanks

.replace() is a string method, so it won't work directly on numpy arrays. You can define a function to do it on an input string, and then vectorize that function to directly apply it to all elements of the array.

Take a look at the following example code snippet. I've converted the array to str type, done the required replacements and then converted them back to floats.

import numpy as np

a = np.array([1.2, 3.4, "4,5", "8,3", 6.9])
a = a.astype(str)
replace_func = np.vectorize(lambda x: float(x.replace(',','.')))
a = replace_func(a)
print(a)

# Out: [1.2 3.4 4.5 8.3 6.9]

Alternate method using np.char.replace() :

import numpy as np

a = np.array([1.2, 3.4, "4,5", "8,3", 6.9])
a = a.astype(str)
a = np.char.replace(a, ',', '.')
a = a.astype(float)
print(a)

# Out: [1.2 3.4 4.5 8.3 6.9]
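Since the array in the question is 2-D with dtype object, here is a rough sketch of the same np.char.replace() approach applied to that kind of input. The helper name to_float_array and the sample values are made up for illustration, and it assumes every element is either a number or a string that parses as a number once the comma is swapped:

```python
import numpy as np

def to_float_array(arr):
    """Convert an array mixing floats and comma-decimal strings to float64.

    Hypothetical helper, not part of NumPy.
    """
    # Cast every element to str so np.char.replace can operate on it
    as_text = np.asarray(arr).astype(str)
    # Swap the German decimal comma for a dot, element-wise
    normalized = np.char.replace(as_text, ',', '.')
    # Parse the cleaned strings as floats
    return normalized.astype(float)

# Small stand-in for the (8634, 3) object array from the question
X = np.array([[34.12, "34,15", 1.0],
              ["7,1", 80.16, "2,5"]], dtype=object)
result = to_float_array(X)
print(result.shape, result.dtype)
```

The shape is preserved, so the result can be used as a drop-in replacement for X. Note that a value like "1.234,56" (thousands separator plus decimal comma) would still fail in the final conversion.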

You can try this, converting X to str first and then replacing on the converted array:

int_X = X.astype(str)
int_X = np.char.replace(int_X, ',', '.')

Example

int_X = np.array([34.12, 34.15, 56.15, "7,1", 80.16])
int_X = int_X.astype(str)
int_X = np.char.replace(int_X, ',', '.')
int_X
array(['34.12', '34.15', '56.15', '7.1', '80.16'], dtype='<U5')
int_X = int_X.astype(float)
int_X
array([34.12, 34.15, 56.15,  7.1 , 80.16])
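Because X in the question is built from a DataFrame, another option might be to clean the columns in pandas before converting to numpy. This is only a sketch: the column names come from the question, but the sample data is invented, and it assumes every cell becomes a parseable number after the comma is replaced:

```python
import pandas as pd

# Invented sample data using the column names from the question
df = pd.DataFrame({
    "column_a": [34.12, "34,15"],
    "column_b": ["7,1", 80.16],
    "column_c": [1.0, "2,5"],
})
cols = ["column_a", "column_b", "column_c"]

# Cast to str, replace the decimal comma literally (no regex), parse as float
cleaned = (
    df[cols]
    .astype(str)
    .apply(lambda s: s.str.replace(",", ".", regex=False))
    .astype(float)
)
X = cleaned.to_numpy()
print(X.dtype, X.shape)
```

This yields a float array directly, so the object-dtype array never has to be created.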
