I have a python list ['a1', 'b1', 'a2', 'b2','a3', 'b3']
. Set m=3
and I want get this list using loops, because here m=3
could be a larger number such as m=100
.
Since we can have
m = 3
['a' + str(i) for i in np.arange(1,m+1)]
# ['a1', 'a2', 'a3']
['b' + str(i) for i in np.arange(1,m+1)]
# ['b1', 'b2', 'b3']
then I try to get ['a1', 'b1', 'a2', 'b2','a3', 'b3']
using
[ ['a','b'] + str(i) for i in np.arange(1,m+1)]
and have TypeError: can only concatenate list (not "str") to list
Then I try
[ np.array(['a','b']) + str(i) for i in np.arange(1,m+1)]
and I still get errors as UFuncTypeError: ufunc 'add' did not contain a loop with signature matching types (dtype('<U1'), dtype('<U1')) -> None
.
How can I fix the problem? And even more, how to get something like ['a1', 'b1', 'c1', 'a2', 'b2','c2','a3', 'b3', 'c3']
through similar ways?
You need to iterate on both the range of numbers and the list of strings
In [106]: [s+str(i) for i in range(1,4) for s in ['a','b']]
Out[106]: ['a1', 'b1', 'a2', 'b2', 'a3', 'b3']
A simple combined list
comprehension would work as pointed out in the @j1-lee's answer (and later in other answers).
import string
def letter_number_loop(n, m):
letters = string.ascii_letters[:n]
numbers = range(1, m + 1)
return [f"{letter}{number}" for number in numbers for letter in letters]
Similarly, one could use itertools.product()
, as evidenced in Nick's answer , to obtain substantially the same:
import itertools
def letter_number_it(n, m):
letters = string.ascii_letters[:n]
numbers = range(1, m + 1)
return [
f"{letter}{number}"
for number, letter in itertools.product(numbers, letters)]
However, it is possible to write a NumPy-vectorized approach, making use of the fact that if the dtype
is object
, the operations do follow the Python semantics.
import numpy as np
def letter_number_np(n, m):
letters = np.array(list(string.ascii_letters[:n]), dtype=object)
numbers = np.array([f"{i}" for i in range(1, m + 1)], dtype=object)
return (letters[None, :] + numbers[:, None]).ravel().tolist()
Note that the final numpy.ndarray.tolist()
could be avoided if whatever will consume the output is capable of dealing with the NumPy array itself, thus saving some relatively small but definitely appreciable time.
The following do indicate that the functions are equivalent:
funcs = letter_number_loop, letter_number_it, letter_number_np
n, m = 2, 3
for func in funcs:
print(f"{func.__name__!s:>32} {func(n, m)}")
letter_number_loop ['a1', 'b1', 'a2', 'b2', 'a3', 'b3']
letter_number_it ['a1', 'b1', 'a2', 'b2', 'a3', 'b3']
letter_number_np ['a1', 'b1', 'a2', 'b2', 'a3', 'b3']
For larger inputs, this is substantially faster, as evidenced by these benchmarks:
timings = {}
k = 16
for n in (2, 20):
for k in range(1, 10):
m = 2 ** k
print(f"n = {n}, m = {m}")
timings[n, m] = []
base = funcs[0](n, m)
for func in funcs:
res = func(n, m)
is_good = base == res
timed = %timeit -r 64 -n 64 -q -o func(n, m)
timing = timed.best * 1e6
timings[n, m].append(timing if is_good else None)
print(f"{func.__name__:>24} {is_good} {timing:10.3f} µs")
to be plotted with:
import matplotlib.pyplot as plt
import pandas as pd
n_s = (2, 20)
fig, axs = plt.subplots(1, len(n_s), figsize=(12, 4))
for i, n in enumerate(n_s):
partial_timings = {k[1]: v for k, v in timings.items() if k[0] == n}
df = pd.DataFrame(data=partial_timings, index=[func.__name__ for func in funcs]).transpose()
df.plot(marker='o', xlabel='Input size / #', ylabel='Best timing / µs', ax=axs[i], title=f"n = {n}")
These show that the explicitly looped versions ( letter_number_loop()
and letter_number_it()
) are somewhat comparable, while the NumPy-vectorized ( letter_number_np()
) fares much better relatively quickly for larger inputs, up to ~2x speed-up.
You can have more than one for
in a list comprehension:
prefixes = ['a', 'b', 'c']
m = 3
output = [f"{prefix}{num}" for num in range(1, m+1) for prefix in prefixes]
print(output) # ['a1', 'b1', 'c1', 'a2', 'b2', 'c2', 'a3', 'b3', 'c3']
If you have multiple for
s, those will be nested, as in
for num in range(1, m+1):
for prefix in prefixes:
...
You could use itertools.product
to get all combinations of the the letters and the m
range and then join them in an f-string (rather than using join
as one element is an integer so would require converting to a string):
[f'{x}{y}' for x, y in itertools.product(range(1, m+1), ['a', 'b'])]
Output:
['a1', 'b1', 'a2', 'b2', 'a3', 'b3']
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.