How do I get mixed JSON and TSV data into a data frame?
If I understand correctly, you want to load a TSV file as a pandas DataFrame, right?
Assuming you have a TSV file:
import pandas as pd
df = pd.read_csv("path to the tsv file", sep="\t")
This will load your TSV file as a DataFrame.
Then you can iterate over the column that contains your JSON:
import json

# Walk the JSON column and parse each string into a Python object
for item in df["columnname"]:
    json_obj = json.loads(item)
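If the goal is to turn the parsed objects into columns of their own, one possible approach is sketched below. The column name `payload` and the sample rows are made up for illustration, and it assumes every row of that column holds a valid JSON object. `pd.json_normalize` flattens a list of dictionaries into a DataFrame.

```python
import csv
import io
import json

import pandas as pd

# Hypothetical TSV content: an id column plus a column of JSON strings.
tsv_text = (
    "id\tpayload\n"
    '1\t{"name": "a", "score": 3}\n'
    '2\t{"name": "b", "score": 5}\n'
)

# QUOTE_NONE tells the parser to treat the double quotes in the JSON as plain text.
df = pd.read_csv(io.StringIO(tsv_text), sep="\t", quoting=csv.QUOTE_NONE)

# Parse each JSON string, then expand the resulting dicts into columns.
parsed = df["payload"].apply(json.loads)
expanded = pd.json_normalize(parsed.tolist())

# Put the expanded columns next to the remaining original columns.
result = pd.concat([df.drop(columns="payload"), expanded], axis=1)
print(result)
```

With a real file, replace `io.StringIO(tsv_text)` with the file path.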
This is verbatim a homework question for the UMich Data Science degree, so I won't answer in detail. That said, the approach that worked for me was to read the file in as a list and then evaluate each item in the list with a for loop. Each item came in as a string, so if it looked like a JSON object, I used json.loads() to convert it to a Python dictionary. If it did not look like a JSON object, I used .split() and built a dictionary from the given keys, taking each element of the split as the corresponding value. Clunky, but it worked.
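A generic sketch of that idea, not the assignment's actual solution, might look like this. The key names in `KEYS`, the "starts with a brace" test for JSON, and the two sample lines are all assumptions made for illustration.

```python
import json

import pandas as pd

# Hypothetical column names for the tab-separated lines.
KEYS = ["id", "name", "score"]


def parse_line(line, keys=KEYS):
    """Turn one line into a dict, whether it is JSON or tab-separated."""
    line = line.strip()
    if line.startswith("{"):
        # Looks like a JSON object: let the json module parse it.
        return json.loads(line)
    # Otherwise split on tabs and pair each piece with its key.
    return dict(zip(keys, line.split("\t")))


# Made-up sample input; with a real file, read the lines from it instead.
lines = [
    '{"id": "1", "name": "a", "score": "3"}',
    "2\tb\t5",
]

records = [parse_line(line) for line in lines if line.strip()]
df = pd.DataFrame(records)
print(df)
```

Values that come from `.split()` are strings, so numeric columns may still need converting afterwards, for example with `pd.to_numeric`.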