
Pandas: select rows with column datatype

I have a DataFrame where a timestamp column contains mixed types. Some rows report the time as Unix timestamps (numeric), some as ISO-format strings, and the remaining rows hold pandas Timestamp (datetime) objects.

Is there a way for me to select all of the rows that have a non-datetime object in the timestamp column? I would like to run pd.to_datetime to convert the timestamp column of these rows to datetime objects.

The built-in select_dtypes does not do what I want. This library function selects the columns that (do not) have a certain type, but I want to select the rows where a given column value is (not) a specific type.
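To illustrate the column-versus-row distinction, here is a small sketch. The extra "value" column and the variable names are made up for the illustration. Because the mixed column is stored with the object dtype, select_dtypes either keeps or drops the whole column and never looks at the individual values:

```python
import pandas as pd

# "value" is a hypothetical second column, added only to show column selection.
df = pd.DataFrame({
    "time": [pd.Timestamp("2019-03-31"), "2019-01-31 12:00:00-0700", 1551000000],
    "value": [1.0, 2.0, 3.0],
})

# select_dtypes filters columns by their dtype; every row is kept.
numeric_cols = df.select_dtypes(include="number")

# The mixed "time" column has dtype object, so it is not matched as datetime,
# even though its first value is a Timestamp.
datetime_cols = df.select_dtypes(include="datetime")

print(list(numeric_cols.columns))
print(list(datetime_cols.columns))
```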

Example:

import pandas as pd

df = pd.DataFrame({
    'time': [
        pd.Timestamp('2019-03-31 00:00:00-0400', tz='US/Eastern'),
        '2019-01-31 12:00:00-0700',
        1551000000,
    ]
})

Goal:

def get_not_datetime_rows(df):
    """Output the last two rows."""

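What about something like this (assuming df is your DataFrame and "time" is the column in question)?

from datetime import datetime

# pd.Timestamp is a subclass of datetime, so test with isinstance;
# comparing type(x) to datetime would flag Timestamp values as well.
idx = df["time"].apply(lambda x: not isinstance(x, datetime))

Then use idx as a boolean mask to select those rows: df[idx].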
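
Putting it together, a minimal end-to-end sketch might look like the following. It relies on several assumptions that are not in the question: the column parameter and the to_utc_timestamp helper are hypothetical names, the numeric values are taken to be Unix timestamps in seconds, everything is normalized to UTC, and the first value uses a fixed -04:00 offset instead of the named US/Eastern zone so the snippet does not depend on a time zone database.

```python
from datetime import datetime

import pandas as pd

df = pd.DataFrame({
    "time": [
        pd.Timestamp("2019-03-31 00:00:00-04:00"),  # already a datetime object
        "2019-01-31 12:00:00-0700",                 # ISO-format string
        1551000000,                                 # Unix timestamp (assumed seconds)
    ]
})


def get_not_datetime_rows(df, column="time"):
    """Return the rows whose value in `column` is not a datetime object."""
    # pd.Timestamp subclasses datetime, so isinstance covers both types.
    mask = df[column].apply(lambda x: not isinstance(x, datetime))
    return df[mask]


def to_utc_timestamp(value):
    """Hypothetical helper: convert one mixed-type value to a UTC Timestamp."""
    if isinstance(value, (int, float)):
        # Assumption: numeric values are seconds since the Unix epoch.
        return pd.to_datetime(value, unit="s", utc=True)
    # Strings are parsed; existing Timestamps are converted to UTC.
    return pd.to_datetime(value, utc=True)


not_datetime = get_not_datetime_rows(df)       # the last two rows
converted = df["time"].map(to_utc_timestamp)   # every value as a UTC Timestamp

print(not_datetime)
print(converted)
```

Converting value by value is slower than a single vectorized pd.to_datetime call, but it lets each type get its own treatment (the unit argument only applies to the numeric rows).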


 