
How to add an extra middle step into a list comprehension?

Let's say I have a list[str] object containing timestamps in "HH:mm" format, e.g.

timestamps = ["22:58", "03:11", "12:21"]

I want to convert it to a list[int] object with the "number of minutes since midnight" values for each timestamp:

converted = [22*60+58, 3*60+11, 12*60+21]

... but I want to do it in style and use a single list comprehension to do it. A (syntactically incorrect) implementation that I naively constructed was something like:

def timestamps_to_minutes(timestamps: list[str]) -> list[int]:
    return [int(hh) * 60 + int(mm) for ts in timestamps for hh, mm = ts.split(":")]

... but this doesn't work because for hh, mm = ts.split(":") is not valid syntax.

What would be the valid way of writing the same thing?

To clarify: I can see a formally satisfying solution in the form of:

def timestamps_to_minutes(timestamps: list[str]) -> list[int]:
    return [int(ts.split(":")[0]) * 60 + int(ts.split(":")[1]) for ts in timestamps]

... but this is highly inefficient and I don't want to split the string twice.

You could use an inner generator expression to do the splitting:

[int(hh)*60 + int(mm) for hh, mm in (ts.split(':') for ts in timestamps)]

Although personally, I'd rather use a helper function instead:

def timestamp_to_minutes(timestamp: str) -> int:
    hh, mm = timestamp.split(":")
    return int(hh)*60 + int(mm)

[timestamp_to_minutes(ts) for ts in timestamps]

# Alternative
list(map(timestamp_to_minutes, timestamps))

If you don't want to split the string twice, you can use the := assignment operator:

# Bind the result to a new name so the original timestamps list stays usable below
converted = [int((s := t.split(":"))[0]) * 60 + int(s[1]) for t in timestamps]
print(converted)

Prints:

[1378, 191, 741]

Alternative:

print([int(h) * 60 + int(m) for h, m in (t.split(":") for t in timestamps)])

Prints:

[1378, 191, 741]

Note: := is a feature of Python 3.8+, commonly referred to as the "walrus operator". It was introduced by PEP 572 (Assignment Expressions).

Your initial pseudocode

[int(hh) * 60 + int(mm) for ts in timestamps for hh, mm = ts.split(":")]

is pretty close to what you can do:

[int(hh) * 60 + int(mm) for ts in timestamps for hh, mm in [ts.split(':')]]

In Python 3.9, this idiom was optimized: iterating over a single-element list inside a comprehension (for x in [expr]) just to bind its one element is now as fast as a simple assignment.
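
If you want to check how this behaves on your own interpreter, a rough timeit comparison could look like the sketch below. The function names with_single_list and with_double_split are made up for this illustration, and the timings depend on your machine and Python version, so no numbers are shown here.

```python
import timeit

timestamps = ["22:58", "03:11", "12:21"] * 1000

def with_single_list(items):
    # "for hh, mm in [ts.split(':')]" binds hh and mm once per timestamp
    return [int(hh) * 60 + int(mm) for ts in items for hh, mm in [ts.split(":")]]

def with_double_split(items):
    # Baseline from the question: every string is split twice
    return [int(ts.split(":")[0]) * 60 + int(ts.split(":")[1]) for ts in items]

# Both variants must agree before it makes sense to time them
assert with_single_list(timestamps) == with_double_split(timestamps)

for fn in (with_single_list, with_double_split):
    elapsed = timeit.timeit(lambda: fn(timestamps), number=50)
    print(f"{fn.__name__}: {elapsed:.3f}s")
```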

If you use generator expressions (as opposed to list comprehensions) for the intermediate steps, the whole list is still converted in a single pass:

timestamps = ["22:58", "03:11", "12:21"]

# NOTE: use () for a generator expression, not [].
hh_mms = (timestamp.split(':') for timestamp in timestamps)
converted = [int(hh) * 60 + int(mm) for (hh, mm) in hh_mms]

print(converted)
# [1378, 191, 741]

You can split the comprehension into multiple steps, written on multiple lines, and you don't need to define any function.
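
To see that the generator really is consumed lazily, one item at a time, here is a small sketch that logs each step. The helpers split_logged and to_minutes_logged are hypothetical and exist only to record the order of operations.

```python
timestamps = ["22:58", "03:11", "12:21"]
seen = []  # records the order in which the work happens

def split_logged(ts):
    # Hypothetical helper: same as ts.split(":"), but logs each call
    seen.append(f"split {ts}")
    return ts.split(":")

def to_minutes_logged(hh, mm):
    # Hypothetical helper: converts one pair and logs the call
    seen.append(f"convert {hh}:{mm}")
    return int(hh) * 60 + int(mm)

hh_mms = (split_logged(ts) for ts in timestamps)  # nothing has run yet
assert seen == []

converted = [to_minutes_logged(hh, mm) for hh, mm in hh_mms]
print(seen)
# ['split 22:58', 'convert 22:58', 'split 03:11', 'convert 03:11', 'split 12:21', 'convert 12:21']
```

The split and convert steps interleave, so no intermediate list of split results is ever built.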

Late to the party, but why not use datetime / timedelta to convert your time?

For "hh:mm" this may be overkill, but you can easily adjust it to more complex time strings:

from datetime import datetime as dt
import typing

def timestamps_to_minutes(timestamps: typing.List[str]) -> typing.List[int]:
    """Uses datetime.strptime to parse a datetime string and return
    minutes spent in this day."""
    return [int(((p := dt.strptime(t,"%H:%M")) - dt(p.year,p.month, p.day)
                 ).total_seconds()//60) for t in timestamps]

timestamps = ["22:58", "03:11", "12:21"]

print(timestamps_to_minutes(timestamps))

Outputs:

[1378, 191, 741]
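
As a sketch of how this could be adjusted to other time strings, the variant below takes the format as a parameter. The name to_minutes and the fmt parameter are assumptions made for this example, and it reads the parsed hour and minute attributes directly instead of subtracting midnight as the code above does.

```python
from datetime import datetime

def to_minutes(timestamp: str, fmt: str = "%H:%M") -> int:
    # Parse with the given strptime format, then count whole minutes since midnight
    parsed = datetime.strptime(timestamp, fmt)
    return parsed.hour * 60 + parsed.minute

print([to_minutes(t) for t in ["22:58", "03:11", "12:21"]])
# [1378, 191, 741]

# A format with seconds: the seconds are parsed and then dropped
print(to_minutes("22:58:30", "%H:%M:%S"))
# 1378
```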

Just for fun, we could also use operator.methodcaller:

from operator import methodcaller
# methodcaller("split", ":") builds a callable equivalent to lambda s: s.split(":")
out = [int(h) * 60 + int(m) for h, m in map(methodcaller("split", ":"), timestamps)]
print(out)

Output:

[1378, 191, 741]
