
A file format writable by python, readable as a Dataframe in Spark

I have Python scripts (no Spark involved) producing some data files that I want to be easily readable as DataFrames in a Scala/Spark application.

What's the best choice?

If your data doesn't contain newlines, then a simple text-based format such as TSV is probably best.
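As a minimal sketch of the producing side, a TSV file can be written from plain Python with the standard `csv` module (the column names and rows here are made up for illustration):

```python
import csv
import io

# Illustrative records; in practice these come from the Python script's output.
rows = [
    {"id": 1, "name": "alice", "score": 9.5},
    {"id": 2, "name": "bob", "score": 7.25},
]

# Write tab-separated values with a header row. A file handle opened with
# open(path, "w", newline="") would work the same way; StringIO keeps the
# example self-contained.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", "name", "score"], delimiter="\t")
writer.writeheader()
for row in rows:
    writer.writerow(row)

tsv_text = buf.getvalue()
print(tsv_text)
```

On the Spark side such a file can then be loaded with the built-in CSV reader, e.g. `spark.read.option("sep", "\t").option("header", "true").option("inferSchema", "true").csv(path)`.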

If you need to include binary data, then a serialised format like protobuf makes sense - anything for which a Hadoop InputFormat exists should be fine.
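To illustrate why plain TSV breaks down here, the following sketch shows length-prefixed binary records, the delimited-record framing that serialised formats such as protobuf rely on when several messages share one file. This is not protobuf itself, just the framing idea, and the helper names and payloads are made up:

```python
import struct

def write_records(payloads):
    """Pack each payload as a 4-byte big-endian length prefix plus the bytes."""
    out = bytearray()
    for payload in payloads:
        out += struct.pack(">I", len(payload))  # u32 length prefix
        out += payload
    return bytes(out)

def read_records(blob):
    """Inverse of write_records: yield each payload back in order."""
    offset = 0
    while offset < len(blob):
        (length,) = struct.unpack_from(">I", blob, offset)
        offset += 4
        yield blob[offset:offset + length]
        offset += length

# Payloads may contain newlines, tabs, or NUL bytes - none of which a
# text-based format like TSV could carry safely.
data = [b"first record", b"second \x00 binary \n record"]
blob = write_records(data)
assert list(read_records(blob)) == data
```

The length prefix is what lets a reader find record boundaries without relying on any separator character, which is exactly the property binary payloads need.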
