
Is it possible to generate a PDF from datasets and save it to Foundry?

FPDF is a library that can be used to turn a pandas DataFrame into nicely formatted PDF reports. Is there a feature in a Foundry code repository or Code Workbook for writing PDF files into Foundry from a Spark or pandas DataFrame?

I need to create a nicely formatted PDF report from a Foundry dataset that has been filtered down to a few rows.

While I'm not familiar with the FPDF library specifically, Foundry supports generating files from datasets in transforms or Code Workbooks.

To create a single pandas-based PDF from your dataset, convert the dataset to pandas and get an output file handle from Foundry. In a Code Workbook, for example:

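def pdf_dataset(input_df):
    output = Transforms.get_output()
    pandas_df = input_df.toPandas()  # avoid shadowing the usual `pd` alias
    output_fs = output.filesystem()
    output_file_path = "report.pdf"  # placeholder file name; choose your own
    with output_fs.open(output_file_path, "wb") as f:
        # use FPDF as needed to render pandas_df, then write the PDF bytes to f
        pass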
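
To make the "use FPDF as needed" step more concrete, here is a minimal sketch of one way it could look. It assumes the fpdf2 package, where `FPDF.output()` called with no arguments returns the document as a bytearray; older PyFPDF releases behave differently. The names `render_table_pdf` and `write_report` are hypothetical helpers, not part of Foundry or FPDF, and `open_file` stands in for something like `output.filesystem().open`.

```python
def render_table_pdf(columns, rows):
    """Render a header plus rows as a simple table; return the PDF as bytes."""
    # Third-party dependency (assumed: fpdf2). Imported inside the function
    # so that only this renderer needs the package installed.
    from fpdf import FPDF

    pdf = FPDF()
    pdf.add_page()
    pdf.set_font("Helvetica", size=10)
    for record in [columns] + rows:
        for value in record:
            pdf.cell(40, 8, str(value), 1)  # fixed-width cell with a border
        pdf.ln(8)  # line break: start the next table row
    return bytes(pdf.output())  # assumption: fpdf2 returns a bytearray here


def write_report(open_file, path, columns, rows, render=render_table_pdf):
    """Render one PDF and write it via an open(path, mode)-style callable."""
    data = render(columns, rows)
    with open_file(path, "wb") as f:
        f.write(data)
    return len(data)  # number of bytes written


# Hypothetical use inside the Code Workbook function above:
#   write_report(output_fs.open, "report.pdf",
#                list(pandas_df.columns), pandas_df.values.tolist())
```

Keeping the renderer separate from the write step means the layout code can be tried locally against an ordinary file before it is run in Foundry.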

Alternatively, you can create one PDF per row in parallel via Spark. This is easiest if you structure your data so that the parameters needed to generate each PDF are colocated in a single row. From there you can run a Python function on each row to generate the PDF and write it out of Python memory to the destination dataset.

In a Code Workbook this would resemble:

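def pdf_dataset(input_df):
    output = Transforms.get_output()

    def generate_pdf(row):
        output_fs = output.filesystem()
        # Each row needs its own file name; "id" is a hypothetical key column.
        output_file_path = "{}.pdf".format(row["id"])
        with output_fs.open(output_file_path, "wb") as f:
            # use FPDF as needed to render this row, then write the PDF bytes to f
            pass

    input_df.rdd.foreach(generate_pdf)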
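
The per-row pattern can also be sketched without Spark, which makes it easier to test the function before handing it to `rdd.foreach`. This is only an illustration under assumptions: `make_row_writer` is a hypothetical helper, `render` is whatever function turns one row into PDF bytes (for example with FPDF), and `key_column` names a column assumed to be unique per row so that files do not overwrite each other.

```python
def make_row_writer(open_file, render, key_column):
    """Build a per-row function suitable for something like rdd.foreach."""

    def write_row(row):
        # One output file per row, named after the (assumed unique) key column.
        path = "{}.pdf".format(row[key_column])
        # All parameters needed for this PDF are taken from the row itself.
        data = render(row)
        with open_file(path, "wb") as f:
            f.write(data)
        return path

    return write_row


# Hypothetical use in the workbook, obtaining the filesystem inside each
# task as the code above does:
#   writer = make_row_writer(
#       lambda p, m: output.filesystem().open(p, m), render_row, "id")
#   input_df.rdd.foreach(writer)
```

Because `write_row` only relies on `row[key_column]` indexing, it works the same way on plain dictionaries and on Spark `Row` objects, so the rendering logic can be checked on a few sample rows first.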


