
How to transform weekly data to daily for specific columns using Python

I am new to Python and to programming in general. I hope the following question is clear.

I have a big dataset with 80+ columns, and some of these columns only have data on a weekly basis. I would like to transform these columns to have values on a daily basis by simply dividing the weekly value by 7 and assigning the result to that row and to the 6 other days of that week.

This is what my input dataset looks like:

   date                  col1           col2           col3
02-09-2019               14               NaN            1
09-09-2019               NaN              NaN            2
16-09-2019               NaN              7              3
23-09-2019               NaN              NaN            4
30-09-2019               NaN              NaN            5
07-10-2019               NaN              NaN            6
14-10-2019               NaN              NaN            7
21-10-2019               21               NaN            8
28-10-2019               NaN              NaN            9
04-11-2019               NaN              14             10
11-11-2019               NaN              NaN            11
..

This is what the output should look like:

   date                  col1           col2           col3
02-09-2019                2               NaN            1
09-09-2019                2               NaN            2
16-09-2019                2               1              3
23-09-2019                2               1              4
30-09-2019                2               1              5
07-10-2019                2               1              6
14-10-2019                2               1              7
21-10-2019                3               1              8
28-10-2019                3               1              9
04-11-2019                3               2              10 
11-11-2019                3               2              11
..

I can't come up with a solution, but here is what I thought might work:

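def convert_to_daily(df):
    """Spread each sparse value evenly over its own row and the 6 rows after it."""
    df = df.copy()  # leave the caller's DataFrame untouched
    for column in df.columns:
        # Only columns that contain gaps need converting.
        if df[column].isna().any():
            # Carry each value into at most 6 following empty rows,
            # then divide by 7 so the 7 rows add up to the original value.
            df[column] = df[column].ffill(limit=6) / 7
    return df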

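I believe you need to select the columns that contain at least one missing value, forward-fill each value into the rows that follow it, and divide by 7:

# Boolean mask: True for columns with at least one NaN
m = df.isna().any()
# Carry each value into at most the 6 following rows, then split it into 7 equal parts
df.loc[:, m] = df.loc[:, m].ffill(limit=6).div(7)
print(df)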
          date  col1  col2  col3
0   02-09-2019   2.0   NaN     1
1   09-09-2019   2.0   NaN     2
2   16-09-2019   2.0   1.0     3
3   23-09-2019   2.0   1.0     4
4   30-09-2019   2.0   1.0     5
5   07-10-2019   2.0   1.0     6
6   14-10-2019   2.0   1.0     7
7   21-10-2019   3.0   1.0     8
8   28-10-2019   3.0   1.0     9
9   04-11-2019   3.0   2.0    10
10  11-11-2019   3.0   2.0    11
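
To try this without the full 80-column dataset, here is a minimal self-contained sketch that rebuilds the sample rows from the question and applies the same mask, forward-fill and divide steps. The function name `spread_sparse_columns` and its `periods` parameter are made up for this illustration, and the sketch assumes that any column containing a NaN is one of the sparse columns to convert.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (first 11 rows).
df = pd.DataFrame({
    "date": ["02-09-2019", "09-09-2019", "16-09-2019", "23-09-2019",
             "30-09-2019", "07-10-2019", "14-10-2019", "21-10-2019",
             "28-10-2019", "04-11-2019", "11-11-2019"],
    "col1": [14, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan,
             21, np.nan, np.nan, np.nan],
    "col2": [np.nan, np.nan, 7, np.nan, np.nan, np.nan, np.nan,
             np.nan, np.nan, 14, np.nan],
    "col3": list(range(1, 12)),
})


def spread_sparse_columns(frame, periods=7):
    """Hypothetical helper: spread each value over `periods` consecutive rows.

    Assumption: every column with at least one NaN is a sparse column.
    """
    out = frame.copy()  # do not modify the input
    has_gaps = out.isna().any()          # one boolean per column
    sparse_cols = has_gaps[has_gaps].index
    # Fill at most `periods - 1` rows after each value, then divide evenly.
    out[sparse_cols] = out[sparse_cols].ffill(limit=periods - 1) / periods
    return out


result = spread_sparse_columns(df)
print(result)
```

The `limit` argument caps how many consecutive missing rows each value is carried into, so a value followed by a longer gap leaves the remaining rows as NaN instead of spreading further. Rows before the first value in a column (the first two rows of `col2` here) also stay NaN, which matches the desired output in the question.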

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.

 