Removing b'' from string column in a pandas dataframe

Question

I have a data frame as taken from SDSS database. Example data is here.

I want to remove the character 'b' from data['class'] . I tried

data['class'] = data['class'].replace("b','')

But I am not getting the result.

Answer 1

You're working with byte strings. You might consider str.decode :

data['class'] = data['class'].str.decode('utf-8')

Answer 2

Further explanation:

df = pd.DataFrame([b'123']) # create dataframe with b'' element

Now we can call

df[0].str.decode('utf-8') # returns a pd.series applying decode on str succesfully
df[0].decode('utf-8') # tries to decode the series and throws an error

Basically what you are doing with .str() is applying it for all elements. It could also be written like this:

df[0].apply(lambda x: x.decode('utf-8'))

Removing b'' from string column in a pandas dataframe

Question

2 answers

solution1
7 2017-10-11 20:17:18

solution2
2 2017-10-11 20:46:08

Removing b'' from string column in a pandas dataframe

Question

2 answers

solution1 7 2017-10-11 20:17:18

solution2 2 2017-10-11 20:46:08

solution1
7 2017-10-11 20:17:18

solution2
2 2017-10-11 20:46:08