Elegant way of creating list from character expression in Python

I have a list of month ranges, as shown below, where the two months in each range are always separated by a hyphen.

l = ['201701-201703', '201801-201804']

I want to expand the list to include every month within each range.

The desired output is:

['201701','201702','201703','201801','201802','201803','201804']

My solution:

l1 = [[str(a) for a in list(map(lambda x:x+int(i[0:6]),list(range(abs(eval(i))+1))))] for i in l]
print(l1)
    [['201701', '201702', '201703'], ['201801', '201802', '201803', '201804']]

# Flatten the list
l2 = [item for sublist in l1 for item in sublist]
print(l2)
    ['201701', '201702', '201703', '201801', '201802', '201803', '201804']

My solution is cumbersome to read and fails when a range spans more than one year. Can someone suggest code that is simpler to read and understand?

Update: If the end result is a list of dates, that's also fine.

Here's how I'd approach this using pandas:

import pandas as pd

l = ['201701-201703', '201801-201804']
df = pd.DataFrame(l, columns=['date'])

# split start and end date into two columns and parse as datetime
df_ = df.date.str.split('-', expand=True).apply(pd.to_datetime, format='%Y%m')

# use pd.date_range with a month-start frequency; both endpoints are
# first-of-month timestamps, so the last month is included in the range
(df_.apply(lambda x: pd.date_range(*x, freq='MS').strftime('%Y%m').tolist(), axis=1)
    .explode()
    .tolist())

# ['201701', '201702', '201703', '201801', '201802', '201803', '201804']

I hope this is more readable.

I increased the size of the second expression by 5 characters, but reduced the size of the first one by 26 characters.

Removed map(), list(), abs(), lambda, and eval(). Introduced split() and enumerate().

l = ['201701-201703', '201801-201804']
l1 = [range(*[int(v)+i for i,v in enumerate(e.split('-'))]) for e in l]
print(l1)
l2 = [str(item) for sublist in l1 for item in sublist]
print(l2)

Output:

[range(201701, 201704), range(201801, 201805)]
['201701', '201702', '201703', '201801', '201802', '201803', '201804']
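
The enumerate() trick in the comprehension above can be hard to spot, so here is a step-by-step sketch of what happens for a single element. The intermediate names (parts, bounds) are only illustrative and do not appear in the answer.

```python
e = '201701-201703'

# Split the range string into its start and end parts
parts = e.split('-')

# enumerate() yields index 0 for the start and 1 for the end, so adding the
# index leaves the start unchanged and bumps the end by one, which makes the
# half-open range() include the final month
bounds = [int(v) + i for i, v in enumerate(parts)]
print(bounds)

months = list(range(*bounds))
print(months)
```

This only works while both endpoints fall in the same year, since it treats the values as plain integers.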

UPDATE 1

This modification handles ranges that cross a year boundary: the filter 0 < item % 100 < 13 keeps only values whose last two digits form a valid month.

l = ['201711-201802', '201901-201904']
l1 = [range(*[int(v)+i for i,v in enumerate(e.split('-'))]) for e in l]
print(l1)
l2 = [str(item) for sublist in l1 for item in sublist if 0 < item % 100 < 13]
print(l2)

Output:

[range(201711, 201803), range(201901, 201905)]
['201711', '201712', '201801', '201802', '201901', '201902', '201903', '201904']
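
To see what the month filter is doing, it may help to look at the raw integer range for a span that crosses a year boundary. The sketch below uses illustrative names (raw, valid) that are not part of the answer.

```python
# Raw integer range for '201711-201802': every integer from 201711 to 201802
raw = range(201711, 201802 + 1)

# Keep only values whose last two digits are a valid month (01..12);
# this drops 201713 through 201800
valid = [str(n) for n in raw if 0 < n % 100 < 13]

print(len(raw))
print(valid)
```

The filter gives the right answer, but it walks every intermediate integer, so each full year in the span costs 100 checks to keep 12 values.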

You could do something like this:

[ str(q) for p in l for q in range(int(p[:6]), int(p[7:])+1) ]

However, this doesn't work for ranges that start in one year and end in a later one. For those, you can do it like this:

from datetime import datetime
from dateutil.relativedelta import relativedelta

l = ['201701-201703', '201801-201804', '201701-201804']

res_list = []
for p in l:
    d1 = datetime.strptime(p[:6], '%Y%m')
    d2 = datetime.strptime(p[7:], '%Y%m')
    while d1 <= d2:
        res_list.append(d1.strftime('%Y%m'))
        d1 += relativedelta(months=1)

res_list

Output:

['201701',
 '201702',
 '201703',
 '201801',
 '201802',
 '201803',
 '201804',
 '201701',
 '201702',
 '201703',
 '201704',
 '201705',
 '201706',
 '201707',
 '201708',
 '201709',
 '201710',
 '201711',
 '201712',
 '201801',
 '201802',
 '201803',
 '201804']
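
If pulling in dateutil is not an option, a similar result can be sketched with plain integer arithmetic by counting months since year 0. The helper name expand_months is hypothetical and not part of the answer above; this is one possible approach, assuming every input is a well-formed 'YYYYMM-YYYYMM' string.

```python
def expand_months(span):
    """Expand 'YYYYMM-YYYYMM' into a list of 'YYYYMM' strings, inclusive."""
    start, end = span.split('-')
    # Convert each endpoint to a zero-based count of months since year 0
    first = int(start[:4]) * 12 + int(start[4:]) - 1
    last = int(end[:4]) * 12 + int(end[4:]) - 1
    months = []
    for index in range(first, last + 1):
        # divmod turns the month count back into (year, zero-based month)
        year, month0 = divmod(index, 12)
        months.append(f'{year:04d}{month0 + 1:02d}')
    return months


l = ['201701-201703', '201711-201802']
res_list = [m for span in l for m in expand_months(span)]
print(res_list)
```

Because the loop steps through a continuous month count, a year boundary needs no special handling.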

You can do the same with a list comprehension, but it is no longer elegant. Just for fun, here it is:

[ str(date)
  for d1, d2 in map(lambda p: map(int, p.split("-")), l)
  # one pass per calendar year touched by the range
  for year_offset in range(d2//100 - d1//100 + 1)
  for date in range(max(d1, 100*(d1//100 + year_offset) + 1),
                    min(d2, 100*(d1//100 + year_offset) + 12) + 1) ]
