I'm new to Python and am using DataFrame from the pandas package (Python 3.6).
I set it up with the code below:
df = DataFrame({'list1': list1, 'list2': list2, 'list3': list3, 'list4': list4, 'list5': list5, 'list6': list6})
and it gives this error: ValueError: arrays must all be same length
So I checked the lengths of all the lists, and list1 and list2 each have one more element than the others. If I want to add one element to each of the other four lists (list3, list4, list5, list6) by using pd.resample, how should I write the code?
Also, these lists are time series sampled at 1-minute intervals.
Does anybody have an idea, or can someone help me out here?
Thanks in advance.
EDIT: I changed it as EdChum suggested and added a time list at the front. The data now looks like this:
2017-04-01 0:00 895.87 730 12.8 4 19.1 380
2017-04-01 0:01 894.4 730 12.8 4 19.1 380
2017-04-01 0:02 893.08 730 12.8 4 19.3 380
2017-04-01 0:03 890.41 730 12.8 4 19.7 380
2017-04-01 0:04 889.28 730 12.8 4 19.93 380
and I ran this code:
df.resample('1min', how='mean', fill_method='pad')
And it gives me this error: TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'RangeIndex'
I'd just construct a Series for each list and then concat them all:
In [38]:
import pandas as pd

l1 = list('abc')
l2 = [1,2,3,4]
s1 = pd.Series(l1, name='list1')
s2 = pd.Series(l2, name='list2')
df = pd.concat([s1,s2], axis=1)
df
Out[38]:
  list1  list2
0     a      1
1     b      2
2     c      3
3   NaN      4
Because you can pass a name argument to the Series constructor, each column in the df gets named, and NaN is placed wherever the column lengths don't match.
resample is for when you have a DatetimeIndex that you want to rebase or adjust in length based on some time period, which is not what you want here. What you are describing is reindex, which I think is unnecessary and messy:
In [40]:
l1 = list('abc')
l2 = [1,2,3,4]
s1 = pd.Series(l1)
s2 = pd.Series(l2)
df = pd.DataFrame({'list1':s1.reindex(s2.index), 'list2':s2})
df
Out[40]:
  list1  list2
0     a      1
1     b      2
2     c      3
3   NaN      4
Here you'd need to know the longest length and then reindex every Series using that index, whereas if you just concat, it will automatically adjust the lengths and fill the missing elements with NaN.
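To tie this back to the question, here is a minimal sketch of the same concat approach applied to several lists at once. The list contents are made-up stand-ins (the real data isn't shown in the question), and only three of the six lists are included to keep it short:

```python
import pandas as pd

# Hypothetical stand-ins for the question's lists; list1 and list2
# are one element longer than list3, as described in the question.
lists = {
    "list1": [1, 2, 3, 4],
    "list2": [5, 6, 7, 8],
    "list3": [9, 10, 11],
}

# Build one named Series per list, then let concat align them on the
# default integer index; shorter columns are padded with NaN.
series = [pd.Series(values, name=name) for name, values in lists.items()]
df = pd.concat(series, axis=1)
print(df)
```

The shorter column ends up with NaN in its last row, so there is no need to work out the longest length beforehand.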
According to this documentation, it looks quite difficult to do this with pd.resample(): you would have to work out a frequency that adds only one value to your df, and the function really doesn't seem to be made for this ^^ (it seems intended for easy reshaping, e.g. 1 min to 30 s or 1 h)! You'd better try what EdChum did :P
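For reference, the TypeError in the question's EDIT comes from calling resample on a frame that still has the default RangeIndex. Below is a rough sketch of what a working call could look like once the time column is the index. The column name "time" and the sample values are assumptions modelled on the rows in the EDIT, with one minute deliberately left out. As far as I know, the how= and fill_method= keywords are no longer accepted by newer pandas versions, so the sketch chains .mean() and .ffill() instead:

```python
import pandas as pd

# Hypothetical sample modelled on the question's data; the 00:02 row
# is left out on purpose so there is a gap to fill.
df = pd.DataFrame({
    "time": ["2017-04-01 00:00", "2017-04-01 00:01", "2017-04-01 00:03"],
    "list1": [895.87, 894.40, 890.41],
})

# resample needs a datetime-like index, so parse the strings and
# move them into the index first.
df["time"] = pd.to_datetime(df["time"])
df = df.set_index("time")

# Bin to 1-minute intervals, average within each bin, then
# forward-fill the bins that had no data.
resampled = df.resample("1min").mean().ffill()
print(resampled)
```

Note that this only fills gaps between existing timestamps; it does not make lists of different lengths line up, which is why the concat approach is the better fit for the original problem.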