I have a list of month ranges, like the one shown below, where the two months of each range are always
separated by a hyphen.
l = ['201701-201703', '201801-201804']
I want to expand the list to include every month
in each range.
The desired output is:
['201701','201702','201703','201801','201802','201803','201804']
My solution:
l1 = [[str(a) for a in list(map(lambda x:x+int(i[0:6]),list(range(abs(eval(i))+1))))] for i in l]
print(l1)
[['201701', '201702', '201703'], ['201801', '201802', '201803', '201804']]
# Flatten the list
l2 = [item for sublist in l1 for item in sublist]
print(l2)
['201701', '201702', '201703', '201801', '201802', '201803', '201804']
My solution is cumbersome to read and fails when a range spans multiple years. Can someone suggest code that is simpler to read and understand?
Update: If the end result is a list of dates, that's also fine.
Here's how I'd approach this using pandas:
import pandas as pd
df = pd.DataFrame(l, columns=['date'])
# split start and end date into two columns and parse as datetime
df_ = df.date.str.split('-', expand=True).apply(pd.to_datetime, format='%Y%m')
# add MonthEnd -> this ensures that the last month is included in the range
df_[1] = df_[1] + pd.offsets.MonthEnd()
# use pd.date_range to generate the range with a frequency of one month;
# the start is the first day of a month, so no closed/inclusive argument is needed
(df_.apply(lambda x: pd.date_range(*x, freq='1M'), axis=1)
.explode()
.dt.strftime('%Y%m')
.values.tolist())
# ['201701', '201702', '201703', '201801', '201802', '201803', '201804']
Hope it is more readable.
I increased the size of the second expression by 5 characters, but reduced the size of the first one by 26 characters.
Removed map(), list(), abs(), lambda, and eval(). Introduced split() and enumerate().
l = ['201701-201703', '201801-201804']
l1 = [range(*[int(v)+i for i,v in enumerate(e.split('-'))]) for e in l]
print(l1)
l2 = [str(item) for sublist in l1 for item in sublist]
print(l2)
output:
[range(201701, 201704), range(201801, 201805)]
['201701', '201702', '201703', '201801', '201802', '201803', '201804']
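The enumerate() trick above adds 0 to the start and 1 to the end, so that range() includes the last month. As a minimal sketch of the same idea with the two bounds unpacked explicitly (the helper name expand_same_year is made up for illustration, and it assumes each range stays within one calendar year):

```python
def expand_same_year(ranges):
    """Expand 'YYYYMM-YYYYMM' strings; assumes each range stays within one year."""
    months = []
    for item in ranges:
        start, end = (int(part) for part in item.split('-'))
        # range() excludes its stop value, so add 1 to keep the end month
        months.extend(str(m) for m in range(start, end + 1))
    return months


l = ['201701-201703', '201801-201804']
print(expand_same_year(l))
```

This trades the one-liner for named bounds, which may be easier to read when the code is revisited later.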
UPDATE1
This modification handles ranges that cross a year boundary.
l = ['201711-201802', '201901-201904']
l1 = [range(*[int(v)+i for i,v in enumerate(e.split('-'))]) for e in l]
print(l1)
l2 = [str(item) for sublist in l1 for item in sublist if 0 < item % 100 < 13]
print(l2)
output:
[range(201711, 201803), range(201901, 201905)]
['201711', '201712', '201801', '201802', '201901', '201902', '201903', '201904']
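The filter works because an integer range across a year boundary also passes through values that are not months, such as 201713 through 201800; item % 100 isolates the last two digits, and only 01 to 12 are kept. A sketch of the same filter written as a plain loop (expand_with_filter is a hypothetical name, not part of the answer above):

```python
def expand_with_filter(ranges):
    """Expand 'YYYYMM-YYYYMM' strings, skipping integers that are not valid months."""
    months = []
    for item in ranges:
        start, end = (int(part) for part in item.split('-'))
        for value in range(start, end + 1):
            # value % 100 is the two-digit month part; 00 and 13..99 are not months
            if 1 <= value % 100 <= 12:
                months.append(str(value))
    return months


print(expand_with_filter(['201711-201802']))
# ['201711', '201712', '201801', '201802']
```

Note that this still iterates over every integer between the two bounds, so a range spanning many years visits roughly 100 values per year to keep 12 of them.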
You could do something like this:
[ str(q) for p in l for q in range(int(p[:6]), int(p[7:])+1) ]
However, this doesn't work for ranges starting in one year and ending in the next. For that you can do it like this:
from datetime import datetime
from dateutil.relativedelta import relativedelta
l = ['201701-201703', '201801-201804', '201701-201804']
res_list = []
for p in l:
    d1 = datetime.strptime(p[:6], '%Y%m')
    d2 = datetime.strptime(p[7:], '%Y%m')
    while d1 <= d2:
        res_list.append(d1.strftime('%Y%m'))
        d1 += relativedelta(months=1)
res_list
Output:
['201701',
'201702',
'201703',
'201801',
'201802',
'201803',
'201804',
'201701',
'201702',
'201703',
'201704',
'201705',
'201706',
'201707',
'201708',
'201709',
'201710',
'201711',
'201712',
'201801',
'201802',
'201803',
'201804']
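If depending on dateutil is not desirable, the same stepping can be sketched with the standard library only, by counting months as a single integer (year * 12 + month - 1) and converting back with divmod. The helper names month_index and expand_months are assumptions made for this illustration:

```python
def month_index(yyyymm):
    """Convert a 'YYYYMM' string to a count of months since year 0."""
    return int(yyyymm[:4]) * 12 + int(yyyymm[4:6]) - 1


def expand_months(ranges):
    """Expand 'YYYYMM-YYYYMM' strings into every month in each range, inclusive."""
    result = []
    for item in ranges:
        start, end = item.split('-')
        for index in range(month_index(start), month_index(end) + 1):
            year, month0 = divmod(index, 12)  # month0 is zero-based (0 = January)
            result.append(f'{year:04d}{month0 + 1:02d}')
    return result


print(expand_months(['201711-201802']))
# ['201711', '201712', '201801', '201802']
```

Because consecutive months map to consecutive integers, year boundaries need no special handling.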
You can do the same with a list comprehension, but it is no longer elegant. Just for fun, here you go:
[ str(date)
  for d1, d2 in map(lambda p: map(int, p.split("-")), l)
  # count the calendar years spanned; (d2 - d1)//100 would miss e.g. 201711-201802
  for year_offset in range(d2//100 - d1//100 + 1)
  for date in range(max(d1, 100*(d1//100 + year_offset) + 1),
                    min(d2, 100*(d1//100 + year_offset) + 12) + 1) ]