`numpy.tile()` sorts automatically - is there an alternative?
I'd like to initialize a pandas DataFrame so that I can populate it with multiple time series.
import pandas as pd
import numpy as np
from string import ascii_uppercase

# Daily dates from 2012-12-31 through 2014-12-28
dt_rng = pd.date_range(start='2012-12-31', end='2014-12-28', freq='D')
df = pd.DataFrame(index=range(len(dt_rng) * 10),
                  columns=['product', 'dt', 'unit_sales'])
# Bracket assignment, since df.product would clash with the DataFrame.product method
df['product'] = sorted(np.tile(list(ascii_uppercase[:10]), len(dt_rng)))
df['dt'] = np.tile(dt_rng, 10)
# Random integers from 0 to 25 inclusive
df['unit_sales'] = np.random.randint(0, 26, len(dt_rng) * 10)
However, when I check the first few values of df.dt, I see that all values in the field have already been sorted, e.g. df.dt[:10] yields 2012-12-31 ten times. I'd like the output to be 2012-12-31, 2013-01-01, ..., 2013-01-08, 2013-01-09 (the first ten values).
In general, I'm looking for behavior similar to R's "recycling".
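For readers unfamiliar with the term, R's recycling repeats a shorter vector cyclically until it reaches the required length. A minimal sketch of that behavior in NumPy, assuming np.resize is an acceptable stand-in (the question itself does not use it, and the array short is made up for illustration):

```python
import numpy as np

# Hypothetical short vector to be "recycled" to a longer length.
short = np.array([1, 2, 3])

# np.resize fills the new array with repeated copies of the input,
# truncating the last copy when the lengths do not divide evenly.
recycled = np.resize(short, 8)
print(recycled)  # [1 2 3 1 2 3 1 2]
```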
A combination of reduce() and the append() method of a pandas.tseries.index.DatetimeIndex object did the trick.
import pandas as pd
import numpy as np
from functools import reduce  # reduce is not a builtin in Python 3
from string import ascii_uppercase

# Daily dates from 2012-12-31 through 2014-12-28
dt_rng = pd.date_range(start='2012-12-31', end='2014-12-28', freq='D')
df = pd.DataFrame(index=range(len(dt_rng) * 10),
                  columns=['product', 'dt', 'unit_sales'])
# Bracket assignment, since df.product would clash with the DataFrame.product method
df['product'] = sorted(np.tile(list(ascii_uppercase[:10]), len(dt_rng)))
# Append the date range to itself ten times so the dates cycle in order
df['dt'] = reduce(lambda x, y: x.append(y), [dt_rng] * 10)
# Random integers from 0 to 25 inclusive
df['unit_sales'] = np.random.randint(0, 26, len(dt_rng) * 10)
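The reduce()/append() pattern can be checked on a small range before applying it to the full frame. The sketch below assumes Python 3 and a reasonably recent pandas; the three-day range and the names small_rng, cycled and tiled are invented for illustration. It also shows, as a possible alternative rather than the approach used above, tiling the underlying datetime64 array instead of the index object:

```python
from functools import reduce

import numpy as np
import pandas as pd

# Hypothetical three-day range, kept small so the result is easy to inspect.
small_rng = pd.date_range(start='2012-12-31', periods=3, freq='D')

# Append the index to itself so the dates cycle in their original order.
cycled = reduce(lambda x, y: x.append(y), [small_rng] * 2)

# Possible alternative: tile the plain NumPy array behind the index.
tiled = np.tile(small_rng.values, 2)

print(list(cycled.strftime('%Y-%m-%d')))
```

Both results should hold the three dates in order, followed by the same three dates again.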