I have CSV data in the following format:
+-------------+-------------+-------+
| Location    | Num of Reps | Sales |
+-------------+-------------+-------+
| 75894       | 3           | 12    |
| Burbank     | 2           | 19    |
| 75286       | 7           | 24    |
| Carson City | 4           | 13    |
| 27659       | 3           | 17    |
+-------------+-------------+-------+
The Location column is of the object dtype. I would like to remove all rows whose Location labels are non-numeric. So my desired output, given the table above, would be:
+----------+-------------+-------+
| Location | Num of Reps | Sales |
+----------+-------------+-------+
| 75894    | 3           | 12    |
| 75286    | 7           | 24    |
| 27659    | 3           | 17    |
+----------+-------------+-------+
Now, I could hard-code a solution in the following manner:
list1 = ['Carson City', 'Burbank']
df = df[~df['Location'].isin(list1)]
Which was inspired by the following post:
How to drop rows from pandas data frame that contains a particular string in a particular column?
However, what I am looking for is a general solution, that will work for any table of the type outlined above.
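For reference, the sample table from the question can be rebuilt as a DataFrame like this (a minimal sketch; the column names are taken from the question, and the Location values are deliberately stored as strings so the column has object dtype):

```python
import pandas as pd

# Reconstruct the sample table; Location is stored as strings (object dtype).
df = pd.DataFrame({
    'Location': ['75894', 'Burbank', '75286', 'Carson City', '27659'],
    'Num of Reps': [3, 2, 7, 4, 3],
    'Sales': [12, 19, 24, 13, 17],
})
print(df['Location'].dtype)  # object
```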
Or you could do
df[df['Location'].str.isnumeric()]
  Location  Num of Reps  Sales
0    75894            3     12
2    75286            7     24
4    27659            3     17
You can use pd.to_numeric to coerce non-numeric values to NaN, and then filter on whether Location is NaN:
df[pd.to_numeric(df.Location, errors='coerce').notnull()]
#  Location  Num of Reps  Sales
#0    75894            3     12
#2    75286            7     24
#4    27659            3     17
In [139]: df[~df.Location.str.contains(r'\D')]
Out[139]:
  Location  Num of Reps  Sales
0    75894            3     12
2    75286            7     24
4    27659            3     17
df[df['Location'].str.isdigit()]

  Location  Num of Reps  Sales
0    75894            3     12
2    75286            7     24
4    27659            3     17
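Note that these approaches differ on edge cases. A small sketch (with hypothetical values not in the question's table) comparing str.isdigit, which only matches unsigned integer strings, against pd.to_numeric, which also accepts decimals and signed numbers:

```python
import pandas as pd

# Hypothetical edge cases: a decimal, a negative, an integer, and a city name.
s = pd.Series(['12.5', '-3', '42', 'Burbank'])

# str.isdigit() keeps only unsigned integer strings.
print(s[s.str.isdigit()].tolist())                               # ['42']

# pd.to_numeric with errors='coerce' also keeps decimals and signed numbers.
print(s[pd.to_numeric(s, errors='coerce').notnull()].tolist())   # ['12.5', '-3', '42']
```

So if "numeric" should include values like '12.5' or '-3', prefer the pd.to_numeric approach; for plain integer labels like ZIP codes, any of the answers above works.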