How to drop rows from pandas dataframe based on condition that the data type contained is float?

Question

I am working with a dataframe. I am aware that you could do something like:

dataframe[dataframe["column_name"] == some_value]

But what I would like is something like:

 dataframe[type(dataframe["column_name"]) == float ]

For instance if we had the following dataset:

A    B    C    D
1    2    3    4
5    6         4
7    2    3    2
1    2    3    4

Then I would like to remove the second row, because its value in column C is missing or is not a number (NaN).

But the way I tried it isn't working, and I get the following error. Can someone please help?

Warning (from warnings module):
  File "/Users/oishikachaudhury/Desktop/NYU/Risk Econ/Week 6/Hourly/trial.py", line 1
    import matplotlib.pyplot as plt
DtypeWarning: Columns (9,15,20,27,33,34,35,36,38,39,60) have mixed types.Specify dtype option on import or set low_memory=False.
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2646, in get_loc
    return self._engine.get_loc(key)
  File "pandas/_libs/index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: False

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/oishikachaudhury/Desktop/NYU/Risk Econ/Week 6/Hourly/trial.py", line 8, in <module>
    dewpoint = fileObj[type(fileObj["HourlyDewPointTemperature"]) == float]
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/frame.py", line 2800, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2648, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: False

Answer 1

You would want something like:

import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    "B": [5, 2, 54, 3, 2],
    "C": [20, 16, np.nan, 3, 8],
    "D": [14, 3, 17, 2, 6],
})
# Count the missing values in each row and keep only the rows with none
print(df1.loc[df1.isna().sum(axis=1) == 0])

Output:

   B     C   D
0  5  20.0  14
1  2  16.0   3
3  3   3.0   2
4  2   8.0   6
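
As a hedged aside on why the attempt in the question fails, and on a shorter way to express the same filter: the sketch below rebuilds the question's four-row example as a hypothetical frame (the values for the missing cell and the `raw` series are assumptions made for illustration). It shows that `type()` looks at the Series object rather than at its values, and then uses `DataFrame.dropna` and `pd.to_numeric(..., errors="coerce")`, which are one possible route and not necessarily what the answer's author had in mind.

```python
import numpy as np
import pandas as pd

# Hypothetical frame mirroring the question's example (row 1 has no value in C).
df = pd.DataFrame({
    "A": [1, 5, 7, 1],
    "B": [2, 6, 2, 2],
    "C": [3, np.nan, 3, 3],
    "D": [4, 4, 2, 4],
})

# type() inspects the Series object, not the values it holds, so this is a
# single plain False. df[False] then looks for a column named False, which
# is where "KeyError: False" comes from.
is_float = type(df["C"]) == float

# Drop the rows whose value in column C is missing.
cleaned = df.dropna(subset=["C"])

# For a mixed-type column (as hinted at by the DtypeWarning), unparseable
# entries can be coerced to NaN first and then dropped the same way.
raw = pd.Series(["3", "missing", 4.5])  # assumed sample values
numeric = pd.to_numeric(raw, errors="coerce")

print(is_float)
print(cleaned)
print(numeric)
```

Passing `subset=` restricts the check to the named columns; without it, `dropna()` removes any row that has a missing value in any column.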

Answer 2

Since the OP wants to drop rows that contain float values, rather than columns, here is a solution that does that:

import pandas as pd

df = pd.DataFrame({'A': ['a', 'b', 'c', 'd'], 'B': ['e', 'f', 1.2, 'g'], 'C': ["asdf", 3.2, "s", "d"]})

# Rows to keep
keeprows = []

# Loop through each row in the DataFrame
for idx, row in df.iterrows():
    validcols = 0  # number of values in this row that are not floats
    for val in row:
        if not isinstance(val, float):
            validcols += 1  # count every value that is not a float
    # Keep the row only if none of its values is a float
    if validcols == len(df.columns):
        keeprows.append(row)

# Each kept row is a Series, so concatenating along axis=1 lays the rows out
# as columns; transpose to get them back as rows
filtered = pd.concat(keeprows, axis=1).T
print(filtered)

This gives:

    A   B   C
0   a   e   asdf
3   d   g   d

Compared to the original dataframe:

    A   B   C
0   a   e   asdf
1   b   f   3.2
2   c   1.2 s
3   d   g   d

This is unfortunately verbose and slow (since it loops over every row), and can likely be improved.
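
One possible way to avoid the explicit row loop, sketched here under the assumption that "float" means any value for which `isinstance(value, float)` is true: build a boolean mask of float cells column by column, then keep the rows where the mask is false everywhere. The names `is_float_cell` and `filtered` are chosen for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["a", "b", "c", "d"],
    "B": ["e", "f", 1.2, "g"],
    "C": ["asdf", 3.2, "s", "d"],
})

# True wherever a cell holds a float; Series.map applies the check to each value.
is_float_cell = df.apply(lambda col: col.map(lambda v: isinstance(v, float)))

# Keep only the rows in which no cell is a float.
filtered = df[~is_float_cell.any(axis=1)]
print(filtered)
```

This still calls a Python function once per cell, so it is more compact than the loop rather than fundamentally faster. If the floats are really missing values (NaN), `df.dropna()` is the simpler choice.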

2 answers

solution1
1 ACCEPTED 2020-07-16 00:43:11

solution2
0 2020-07-16 00:50:39