Check which value in Pandas Dataframe Column is String

Question

I have a DataFrame with around 200,000 records. When I pass it as input to a model, it throws this error:

Cast string to float is not supported.

Is there any way I can check which particular value in the data frame is causing this error?

I've tried running this command and checking if any value is a string in the column.

False in map((lambda x: type(x) == str), trainDF['Embeddings'])

Output:

True

Answer 1

In pandas, a mixed-type column is usually converted to numeric like this:

df['col'] = pd.to_numeric(df['col'],errors = 'coerce')

This returns NaN for every item that cannot be converted to a float. You can then drop those rows with dropna or fill in a default value with fillna.
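
To see which values are actually causing the problem, rather than just discarding them, you can compare the coerced result against the original column. The following is a minimal sketch; the sample data and the names converted, bad_mask and bad_rows are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the real column.
df = pd.DataFrame({'col': ['100', '23.2', '44a', None, '453.2']})

# Values that cannot be parsed as numbers become NaN.
converted = pd.to_numeric(df['col'], errors='coerce')

# A value that was present originally but is NaN after coercion
# is one that failed to convert. Genuinely missing values are excluded.
bad_mask = converted.isna() & df['col'].notna()
bad_rows = df.loc[bad_mask, 'col']

print(bad_rows)
```

Here bad_rows keeps the original index, so the offending rows can be looked up in the full DataFrame.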

Answer 2

Loop over the rows of trainDF and use try/except to find the rows that fail to convert.

>>> import pandas as pd
>>> trainDF = pd.DataFrame({'Embeddings':['100', '23.2', '44a', '453.2']})
>>> trainDF
  Embeddings
0        100
1       23.2
2        44a
3      453.2
>>> error_indices = []
>>> for idx, row in trainDF.iterrows():
...     try:
...         trainDF.loc[idx, 'Embeddings'] = float(row['Embeddings'])
...     except (ValueError, TypeError):
...         error_indices.append(idx)
...
>>> trainDF
  Embeddings
0      100.0
1       23.2
2        44a
3      453.2
>>> trainDF.loc[error_indices]
  Embeddings
2        44a
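
If you only want to locate the bad rows without modifying the column while looping, the same try/except idea can be moved into a small helper and applied with map. This is a sketch under that assumption; is_float_like, ok_mask and error_rows are hypothetical names, and the sample data repeats the example above.

```python
import pandas as pd

def is_float_like(value):
    """Return True if float(value) succeeds, False otherwise."""
    try:
        float(value)
        return True
    except (ValueError, TypeError):
        # ValueError: unparseable string such as '44a'.
        # TypeError: objects float() cannot handle, such as None or a list.
        return False

trainDF = pd.DataFrame({'Embeddings': ['100', '23.2', '44a', '453.2']})

# Boolean mask of rows that convert cleanly; the column itself is untouched.
ok_mask = trainDF['Embeddings'].map(is_float_like).astype(bool)
error_rows = trainDF.loc[~ok_mask]

print(error_rows)
```

Because the column is left as-is, you can inspect error_rows first and decide afterwards whether to fix, drop, or convert those values.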
