Return a dataframe from a function
I'm having a little issue with the following code:
    def process_row(row):
        liens = row['Liens']
        df_excel = pd.read_excel(liens)
        df_excel['Name'] = row['Name']
        print(df_excel)
        return df_excel

    result_df = new_df.apply(process_row, axis=1)
new_df is a dataframe where 'Liens' is a column containing Windows paths to Excel files, such as C:/Documents/.../test.xlsx. Printing df_excel displays the data from the Excel file correctly, but result_df does not contain it: I get back a Series that is different from df_excel, and I have no idea why. I would like result_df to be a single dataframe containing the data from all of the Excel files combined.
Thanks for your help!!
If your goal is to read a bunch of Excel files based on information in a dataframe and combine the resulting dataframes, something like this should do the trick:
    import pandas as pd


    def main():
        new_df = pd.DataFrame({"col_a": ["a", "b", "c"]})

        def process_row(row) -> pd.DataFrame:
            other_df = pd.DataFrame({"col_b": ["d", "e", "f"]})
            other_df["col_a"] = row.col_a
            return other_df

        # Iterate over all rows, process the row and concat the resulting dataframes
        result_df = pd.concat(
            map(process_row, new_df.itertuples()),
            ignore_index=True
        )

        print(result_df)


    if __name__ == "__main__":
        main()
The output is as follows:
      col_b col_a
    0     d     a
    1     e     a
    2     f     a
    3     d     b
    4     e     b
    5     f     b
    6     d     c
    7     e     c
    8     f     c
You can substitute your own process_row and dataframe here.
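Applied to the question, the same pattern might look roughly like the sketch below. It assumes new_df has the 'Liens' and 'Name' columns described in the question. The names combine_excel_files and reader are made up for this illustration and are not part of pandas; reader defaults to pd.read_excel (which needs an Excel engine such as openpyxl installed for .xlsx files) and is only a parameter so that the file-reading step can be swapped out.

```python
import pandas as pd


def combine_excel_files(new_df: pd.DataFrame, reader=pd.read_excel) -> pd.DataFrame:
    """Read the file named in each row's 'Liens' column and stack the results.

    `combine_excel_files` and `reader` are illustrative names, not pandas APIs.
    """
    frames = []
    # Walk the two columns in parallel, one (path, name) pair per row of new_df
    for path, name in zip(new_df["Liens"], new_df["Name"]):
        df_excel = reader(path)
        df_excel["Name"] = name  # tag every row with the 'Name' it came from
        frames.append(df_excel)
    # Stack the per-file dataframes into one; raises ValueError if new_df is empty
    return pd.concat(frames, ignore_index=True)
```

With that in place, result_df = combine_excel_files(new_df) should give one dataframe holding the rows of every file, plus a Name column. The point of the design is that each row produces a whole dataframe, and pd.concat is what stacks those dataframes together.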