
How to convert a txt file with dictionary format to a dataframe in Python?

I have a file containing data like this:

{"cid": "ABCD", "text": "alphabets", "time": "1 week", "author": "xyz"}
{"cid": "EFGH", "text": "verb", "time": "2 week", "author": "aaa"}
{"cid": "IJKL", "text": "noun", "time": "3 days", "author": "nop"}

I wish to read this file and create a dataframe like this:

cid     text    time    author
ABCD    alpha   1week   xyz
EFGH    verb    2week   aaa
IJKL    noun    3days   nop

You can try reading the file as CSV with a separator that does not occur in the data (here '|'), so that each line lands in a single column. Then grab that first column, apply ast.literal_eval to convert each string to an actual dictionary, and build a dataframe from the resulting list:

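import ast
import pandas as pd
# sep='|' keeps each line in one column, provided '|' never appears in the data
output = pd.DataFrame(pd.read_csv('file.txt',sep='|',header=None).iloc[:,0]
         .apply(ast.literal_eval).tolist())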

print(output)

    cid       text    time author
0  ABCD  alphabets  1 week    xyz
1  EFGH       verb  2 week    aaa
2  IJKL       noun  3 days    nop

Working example:

file = """{"cid": "ABCD", "text": "alphabets", "time": "1 week", "author":"xyz"}
{"cid": "EFGH", "text": "verb", "time": "2 week", "author": "aaa"}
{"cid": "IJKL", "text": "noun", "time": "3 days", "author": "nop"}"""

import io  # only needed for this in-memory example, not when reading a file directly
import ast
import pandas as pd
print(pd.DataFrame(pd.read_csv(io.StringIO(file),sep='|',header=None).iloc[:,0]
             .apply(ast.literal_eval).tolist()))

    cid       text    time author
0  ABCD  alphabets  1 week    xyz
1  EFGH       verb  2 week    aaa
2  IJKL       noun  3 days    nop
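
The approach above depends on '|' never appearing in the data. Since each line in the sample is also valid JSON, a variant that does not depend on a separator is to parse the lines one by one. The following is only a sketch under that assumption; it swaps ast.literal_eval for json.loads, and the names sample, records and df are illustrative, not part of the original answer:

```python
import io
import json

import pandas as pd

# Same sample data as in the question: one JSON object per line.
sample = """{"cid": "ABCD", "text": "alphabets", "time": "1 week", "author": "xyz"}
{"cid": "EFGH", "text": "verb", "time": "2 week", "author": "aaa"}
{"cid": "IJKL", "text": "noun", "time": "3 days", "author": "nop"}"""

# Parse every non-empty line into a dict.
# For a real file, replace io.StringIO(sample) with open('file.txt').
records = [json.loads(line) for line in io.StringIO(sample) if line.strip()]

# A list of dicts maps directly to rows; the keys become the column names.
df = pd.DataFrame(records)
print(df)
```

If the lines are Python dict literals rather than strict JSON (for example, single-quoted strings), ast.literal_eval can be used in place of json.loads in the same loop. pandas also provides pd.read_json(path, lines=True) for line-delimited JSON, which may be worth trying when the file is known to be valid JSON throughout.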
