pandas df, creating column of lists with input from other two columns

example df:

id   start       end
a    2018-04-01  2018-04-03
b    2018-04-01  2018-04-03
c    2018-04-02  2018-04-03

ideal output A

id   start       end          lst
a    2018-04-01  2018-04-03   [2018-04-01, 2018-04-02, 2018-04-03]
b    2018-04-01  2018-04-03   [2018-04-01, 2018-04-02, 2018-04-03]
c    2018-04-02  2018-04-03   [2018-04-02, 2018-04-03]

What I have so far (doesn't work)

def gen_day_list(s1, s2):
    for d1 in s1:
        for d2 in s2:
            delta = d2 - d1
            for i in range(delta.days + 1):
                return (d1 + dt.timedelta(i))

df[date_list] = df.apply(gen_day_list(df['date1'], df['date2']))
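The attempt above has three problems: it returns on the first loop iteration, it calls gen_day_list before handing it to apply, and it loops over whole columns instead of working one row at a time. A minimal sketch of a row-wise repair follows. It assumes the columns are named start and end as in the example df (the snippet above uses date1/date2), that they hold datetimes, and that gen_day_list is rewritten to take a single row:

```python
import datetime as dt

import pandas as pd

# Example frame rebuilt from the question; dates parsed to Timestamps (assumption).
df = pd.DataFrame({
    "id": ["a", "b", "c"],
    "start": pd.to_datetime(["2018-04-01", "2018-04-01", "2018-04-02"]),
    "end": pd.to_datetime(["2018-04-03", "2018-04-03", "2018-04-03"]),
})


def gen_day_list(row):
    """Return every day from row['start'] to row['end'] (inclusive) as strings."""
    delta = row["end"] - row["start"]
    return [
        (row["start"] + dt.timedelta(days=i)).strftime("%Y-%m-%d")
        for i in range(delta.days + 1)
    ]


# axis=1 hands each row to the function; pass the function itself, not its result.
df["lst"] = df.apply(gen_day_list, axis=1)
print(df["lst"].tolist())
```

Building the whole list inside the function and returning it once is what produces one list per row.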

Once I get ideal output A, I would then try to run the following code to get to ideal output B

lst1 = ['a','b','c']
lst2 = ['b','c','d']
lst3 = ['c','d','e']

comp_lst = lst1 + lst2 +lst3

from collections import Counter
Counter(comp_lst)

ideal output B

Counter({'a': 1, 'b': 2, 'c': 3, 'd': 2, 'e': 1})
Counter({'2018-04-01': 2, '2018-04-02': 3, '2018-04-03': 3})

Any help would be greatly appreciated!

IIUC (if I understand correctly):

import pandas as pd
from collections import Counter
df['lst'] = [pd.date_range(start=x, end=y, freq='D').date.astype(str).tolist()
             for x, y in zip(df.start, df.end)]
Counter(sum(df['lst'].tolist(), []))
Out[327]: Counter({'2018-04-01': 2, '2018-04-02': 3, '2018-04-03': 3})
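For anyone who wants to run the answer end to end, here is a self-contained sketch. The frame is rebuilt from the example df in the question. As a substitution for the original calls, strftime replaces .date.astype(str) and itertools.chain replaces sum(..., []) to flatten the lists. The names counts and counts_series are only illustrative.

```python
from collections import Counter
from itertools import chain

import pandas as pd

# Example frame rebuilt from the question (dates kept as strings here).
df = pd.DataFrame({
    "id": ["a", "b", "c"],
    "start": ["2018-04-01", "2018-04-01", "2018-04-02"],
    "end": ["2018-04-03", "2018-04-03", "2018-04-03"],
})

# One list of daily date strings per row (ideal output A).
df["lst"] = [
    pd.date_range(start=s, end=e, freq="D").strftime("%Y-%m-%d").tolist()
    for s, e in zip(df["start"], df["end"])
]

# Flatten all per-row lists and count each date (ideal output B).
# chain.from_iterable avoids the repeated list copying of sum(lists, []).
counts = Counter(chain.from_iterable(df["lst"]))
print(counts)

# pandas-only alternative: one row per date, then count occurrences.
counts_series = df["lst"].explode().value_counts().sort_index()
print(counts_series)
```

The explode/value_counts route keeps the result as a Series, which may be more convenient if the counts feed further pandas work. Series.explode requires pandas 0.25 or later.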
