
How to apply a function that returns a dataframe to each row of another dataframe

I have the following function, which takes a row and returns a dataframe, ld_coefs_df_long. I am not sure whether the syntax is correct:

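import os

import pandas as pd
from scipy.io import wavfile  # assumed import; the question does not show where wavfile comes from

def apply_calc_coef(row):
    # basepath and calc_ld_coefs are defined elsewhere in the asker's code
    device_id = row['deviceId']
    sound_filename = row['filename']
    sound_fullpath = os.path.join(basepath, device_id, sound_filename)
    offset = row['offset']
    telemetry_id = row['telemetry_id']

    # Read the audio and compute the coefficients for this row
    fs, data = wavfile.read(sound_fullpath)
    ld_coefs_df = calc_ld_coefs(data, fs, offset=offset, win_len=2000, ndeg=12)

    # Reshape to long format and tag every record with the source row's id
    ld_coefs_df_long = pd.melt(ld_coefs_df, id_vars=['time'], var_name='series_type')
    ld_coefs_df_long['telemetry_id'] = telemetry_id
    # rename returns a new DataFrame, so the result has to be assigned back
    ld_coefs_df_long = ld_coefs_df_long.rename(columns={"time": "t_seconds"})

    return ld_coefs_df_long  # a DataFrame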

I need to apply this function to each row of a dataframe called ds_telemetry_pd and stack the results (one dataframe per row) together into a new dataframe, using the following syntax:

 ds_telemetry_pd.apply(lambda x:  apply_calc_coef(x))

Obviously I am not on the right track, since my function, the way I apply it, or both are incorrect.
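One detail about the call above: when no `axis` argument is given, `DataFrame.apply` uses `axis=0` and passes each column to the function, so lookups such as `row['deviceId']` would not see a row at all. A minimal sketch with a made-up two-column frame (the names `df`, `per_column` and `per_row` are illustrative only):

```python
import pandas as pd

# Made-up data purely for illustration
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Default axis=0: the function is called once per column and receives that column as a Series
per_column = df.apply(lambda s: s.name)

# axis=1: the function is called once per row; the row Series is indexed by column label
per_row = df.apply(lambda row: row["a"] + row["b"], axis=1)

print(per_column.tolist())
print(per_row.tolist())
```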

Edit: when I use ds_telemetry_pd.apply(apply_calc_coef, axis=1) I get the following error:

 `cannot copy sequence with size 3 to array axis with dimension 14400`

ValueError                                Traceback (most recent call last)
<command-3171623043129929> in <module>()
----> 1 ds_telemetry_pd.apply(apply_calc_coef, axis=1)

/databricks/python/lib/python3.5/site-packages/pandas/core/frame.py in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
   4150                     if reduce is None:
   4151                         reduce = True
-> 4152                     return self._apply_standard(f, axis, reduce=reduce)
   4153             else:
   4154                 return self._apply_broadcast(f, axis)

/databricks/python/lib/python3.5/site-packages/pandas/core/frame.py in _apply_standard(self, func, axis, ignore_failures, reduce)
   4263                 index = None
   4264 
-> 4265             result = self._constructor(data=results, index=index)
   4266             result.columns = res_index
   4267 

/databricks/python/lib/python3.5/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
    264                                  dtype=dtype, copy=copy)
    265         elif isinstance(data, dict):
--> 266             mgr = self._init_dict(data, index, columns, dtype=dtype)
    267         elif isinstance(data, ma.MaskedArray):
    268             import numpy.ma.mrecords as mrecords

/databricks/python/lib/python3.5/site-packages/pandas/core/frame.py in _init_dict(self, data, index, columns, dtype)
    400             arrays = [data[k] for k in keys]
    401 
--> 402         return _arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
    403 
    404     def _init_ndarray(self, values, index, columns, dtype=None, copy=False):

/databricks/python/lib/python3.5/site-packages/pandas/core/frame.py in _arrays_to_mgr(arrays, arr_names, index, columns, dtype)
   5401 
   5402     # don't force copy because getting jammed in an ndarray anyway
-> 5403     arrays = _homogenize(arrays, index, dtype)
   5404 
   5405     # from BlockManager perspective

/databricks/python/lib/python3.5/site-packages/pandas/core/frame.py in _homogenize(data, index, dtype)
   5712                 v = lib.fast_multiget(v, oindex.values, default=NA)
   5713             v = _sanitize_array(v, index, dtype=dtype, copy=False,
-> 5714                                 raise_cast_failure=False)
   5715 
   5716         homogenized.append(v)

/databricks/python/lib/python3.5/site-packages/pandas/core/series.py in _sanitize_array(data, index, dtype, copy, raise_cast_failure)
   2950             raise Exception('Data must be 1-dimensional')
   2951         else:
-> 2952             subarr = _asarray_tuplesafe(data, dtype=dtype)
   2953 
   2954     # This is to prevent mixed-type Series getting all casted to

/databricks/python/lib/python3.5/site-packages/pandas/core/common.py in _asarray_tuplesafe(values, dtype)
    390             except ValueError:
    391                 # we have a list-of-list
--> 392                 result[:] = [tuple(x) for x in values]
    393 
    394     return result

ValueError: cannot copy sequence with size 3 to array axis with dimension 14400

Your result will be a Series of DataFrames. To combine them into a single DataFrame, you can do the following:

pd.concat(ds_telemetry_pd.apply(apply_calc_coef, axis=1).tolist(), ignore_index=True)
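If `apply` itself raises before returning anything, as in the traceback in the question, one possible workaround is to skip `apply` and build the list of per-row DataFrames with `iterrows`, then concatenate once. The sketch below is only an illustration: `toy_calc` is a hypothetical stand-in for `apply_calc_coef`, and the small `ds_telemetry_pd` frame is made-up data, not the asker's.

```python
import pandas as pd

# Made-up stand-in for ds_telemetry_pd; column names mirror the question
ds_telemetry_pd = pd.DataFrame({
    "filename": ["a.wav", "b.wav"],
    "telemetry_id": [101, 102],
    "offset": [0.0, 1.5],
})

def toy_calc(row):
    """Hypothetical stand-in for apply_calc_coef: one small long-format DataFrame per row."""
    return pd.DataFrame({
        "t_seconds": [row["offset"], row["offset"] + 1.0],
        "series_type": ["c0", "c0"],
        "value": [0.1, 0.2],
        "telemetry_id": row["telemetry_id"],  # scalar is broadcast to every record
    })

# iterrows yields (index, row Series) pairs, so the function sees the same kind of
# object it would receive from apply(..., axis=1)
frames = [toy_calc(row) for _, row in ds_telemetry_pd.iterrows()]

# Stack all per-row results into one DataFrame with a fresh 0..n-1 index
stacked = pd.concat(frames, ignore_index=True)
print(stacked)
```

Collecting the frames in a list and calling `pd.concat` once avoids repeatedly copying a growing DataFrame inside the loop.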

