
Creating a DataFrame with repeating values

I'm trying to create a DataFrame in pandas with two variables, "date" and "time_of_day". "date" is 120 observations long and covers 30 days, with four observations per day (1,1,1,1; 2,2,2,2; etc.). The second variable, "time_of_day", repeats the values 1,2,3,4 thirty times.

The closest question I found was "How to create a series of numbers using Pandas in Python", which led me to the code below, but I get an error saying the data must be a 1-dimensional array.

df = pd.DataFrame({'date': np.tile([pd.Series(range(1,31))],4), 'time_of_day': pd.Series(np.tile([1, 2, 3, 4],30 ))})

So the final DataFrame would look something like this:

date time_of_day
1 1
1 2
1 3
1 4
2 1
2 2
2 3
2 4

Thanks much!

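You need np.repeat for one column and np.tile for the other. np.repeat repeats each element in place (1,1,1,1,2,2,2,2,...), while np.tile repeats the whole sequence (1,2,3,4,1,2,3,4,...). The error in your code comes from wrapping the Series in a list, which makes np.tile return a 2-dimensional array.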

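import numpy as np
import pandas as pd

# each day appears 4 times in a row; the 1..4 pattern is repeated 30 times
df = pd.DataFrame({'date': np.repeat(range(1,31),4), 
                   'time_of_day': np.tile([1, 2, 3, 4],30)})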
print(df.head(10))
   date  time_of_day
0     1            1
1     1            2
2     1            3
3     1            4
4     2            1
5     2            2
6     2            3
7     2            4
8     3            1
9     3            2
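
To make the difference between the two functions concrete, here is a minimal sketch on a smaller case. The sizes (3 days, 2 time slots) and the variable names are illustrative assumptions, not part of the question. The last lines reproduce the shape problem with a plain array wrapped in a list, which behaves like the wrapped Series in the question.

```python
import numpy as np

days = np.arange(1, 4)   # illustrative: 3 days instead of 30
slots = [1, 2]           # illustrative: 2 time slots instead of 4

# np.repeat repeats each element in place: 1,1,2,2,3,3
repeated = np.repeat(days, len(slots))

# np.tile repeats the whole sequence end to end: 1,2,1,2,1,2
tiled = np.tile(slots, len(days))

print(repeated.tolist())
print(tiled.tolist())

# Wrapping the input in an extra list adds a dimension,
# so the result is 2-D rather than a flat column
wrapped = np.tile([days], 2)
print(wrapped.shape)  # (1, 6)
```

Both flat arrays have length len(days) * len(slots), so they line up row by row when passed to pd.DataFrame.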

Or you could use pd.MultiIndex.from_product, which gives the same result:

df = (
    pd.MultiIndex.from_product([range(1,31), range(1,5)], 
                               names=['date','time_of_day'])
      .to_frame(index=False)
)

Or use product from itertools:

from itertools import product
df = pd.DataFrame(product(range(1,31), range(1,5)), columns=['date','time_of_day'])

Newer pandas versions (1.2 and later) also support a cross merge via how='cross':

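# name the columns up front; two unnamed frames would both have a column 0 and come out as 0_x and 0_y
out = pd.DataFrame({'date': range(1,31)}).merge(pd.DataFrame({'time_of_day': [1, 2, 3, 4]}), how='cross')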
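
As a quick sanity check, the four approaches can be built side by side and compared. This is only a sketch: the names days, slots, cols and the frames a to d are made up for the comparison, the cross merge assumes pandas 1.2 or later, and its columns are named explicitly here so that all four frames share the same labels.

```python
from itertools import product

import numpy as np
import pandas as pd

days, slots = range(1, 31), range(1, 5)
cols = ["date", "time_of_day"]

# 1) np.repeat + np.tile
a = pd.DataFrame({"date": np.repeat(days, len(slots)),
                  "time_of_day": np.tile(slots, len(days))})
# 2) MultiIndex.from_product
b = pd.MultiIndex.from_product([days, slots], names=cols).to_frame(index=False)
# 3) itertools.product
c = pd.DataFrame(product(days, slots), columns=cols)
# 4) cross merge (requires pandas >= 1.2)
d = pd.DataFrame({"date": days}).merge(pd.DataFrame({"time_of_day": slots}),
                                       how="cross")

# Compare column labels and values against the first frame
for name, frame in {"from_product": b, "itertools": c, "cross merge": d}.items():
    same = list(frame.columns) == cols and np.array_equal(frame.to_numpy(), a.to_numpy())
    print(name, same)
```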
