I am new to Python. I wrote a function that returns a pandas DataFrame. I am trying to apply this function to each element of a list, and I would like to combine all the results into one DataFrame. For example, if my function looks like:
def test(x):
    return pd.DataFrame({'a':[x],'b':['test']})
I want to apply it to the list [1,2,3,4,5] and get the result as a data frame which looks like:
a b
1 test
2 test
3 test
4 test
5 test
If I do [test(x) for x in [1,2,3,4,5]], it returns a weird list. Could anyone help me with this, please? Thanks!
PS: the function I am actually using:
def cumRet(startDate,endDate=None,symbols=None,inDir=None):
    if endDate is None:
        endDate=startDate
    if inDir is None:
        inDir='E:\\python\\data\\mktData\\'
    dates=dateRange(startDate,endDate)
    if symbols is None:
        adjClose=pd.merge(mktData_R(dates.iloc[0].strftime('%Y-%m-%d'),var=['adjClose'])
                          ,mktData_R(dates.iloc[-1].strftime('%Y-%m-%d'),var=['adjClose'])
                          ,on='symbol'
                          ,how='outer')
    else:
        adjClose=pd.merge(mktData_R(dates.iloc[0].strftime('%Y-%m-%d'),symbols=symbols,var=['adjClose'])
                          ,mktData_R(dates.iloc[-1].strftime('%Y-%m-%d'),symbols=symbols,var=['adjClose'])
                          ,on='symbol'
                          ,how='outer')
    adjClose['adjClose_x'][pd.isnull(adjClose['adjClose_x'])]=1
    adjClose['adjClose_y'][pd.isnull(adjClose['adjClose_y'])]=1
    adjClose['cumRet']=adjClose['adjClose_y']/adjClose['adjClose_x']-1
    return adjClose[['symbol','cumRet']]
Your original code produced this:
In [49]:
import pandas as pd
t = [1,2,3,4,5]
def test(x):
    return pd.DataFrame({'a':[x],'b':['test']})
[test(t) for x in [1,2,3,4,5]]
Out[49]:
[ a b
0 [1, 2, 3, 4, 5] test, a b
0 [1, 2, 3, 4, 5] test, a b
0 [1, 2, 3, 4, 5] test, a b
0 [1, 2, 3, 4, 5] test, a b
0 [1, 2, 3, 4, 5] test]
This is not what you intended: the list comprehension loops over each element and produces a list of five DataFrames. In the call above, each of them holds the whole list as a single value in column a, because the full list t is passed to test on every iteration.
You can avoid all this by passing the list directly as an argument to the DataFrame constructor. The values need to be list-like, but since your argument is already a list you don't need to wrap it in another list. Additionally, the list of values for the b column has to be the same length as the a column, so you need to repeat the value len(x) times:
In [4]:
import pandas as pd
t = [1,2,3,4,5]
def test(x):
    # x is already a list, so it is used as column a directly
    return pd.DataFrame({'a':x,'b':['test']*len(x)})
test(t)
Out[4]:
a b
0 1 test
1 2 test
2 3 test
3 4 test
4 5 test
In your approach you are creating five DataFrames, not one.
You can do this without creating a list of 'test' strings the same size as your list (as suggested by @EdChum); a scalar value for b is repeated for every row:
import pandas as pd
l = [1,2,3,4,5]
def test(x):
    # the scalar 'test' is repeated to match the length of x
    return pd.DataFrame({'a':x, 'b':'test'})
test(l)
>>> a b
0 1 test
1 2 test
2 3 test
3 4 test
4 5 test