Extract specific data from JSON file
I have this JSON file, which is available at https://raw.githubusercontent.com/Cyral/Bakeoof/master/full_format_recipes.json
and I used Pandas to open the recipes JSON file.
import pandas as pd
df = pd.read_json('full_format_recipes.json', lines=True)
print(df)
and this is the output I get:
0 \
0 {'directions': ['1. Place the stock, lentils, ...
1 \
0 {'directions': ['Combine first 9 ingredients i...
2 \
0 {'directions': ['In a large heavy saucepan coo...
3 \
0 {'directions': ['Heat oil in heavy large skill...
4 \
0 {'directions': ['Preheat oven to 350°F. Lightl...
5 \
0 {'directions': ['Mix basil, mayonnaise and but...
6 \
0 {'directions': ['Cook potatoes and carrots in ...
7 \
0 {'directions': ['Stir together sugar and chili...
8 \
0 {'directions': ['Stir together soy sauce, suga...
9 ... \
0 {'directions': ['Chop enough parsley leaves to... ...
20120 \
0 {'directions': ['Bring all ingredients to a si...
20121 \
0 {'directions': ['1. Preheat the oven to 400°F....
20122 \
0 {'directions': ['Mix first 4 ingredients in la...
20123 \
0 {'directions': ['Stir water, sugar and juice i...
20124 \
0 {'directions': ['Wash spareribs. Remove excess...
20125 \
0 {'directions': ['Beat whites in a bowl with an...
20126 \
0 {'directions': ['Bring broth to simmer in sauc...
20127 \
0 {'directions': ['Using a sharp knife, cut a sh...
20128 \
0 {'directions': ['Heat 2 tablespoons oil in hea...
20129
0 {'directions': ['Position rack in bottom third...
[1 rows x 20130 columns]
I want to extract only the directions and the title for each recipe. How can I do that?
Remove lines=True. That option is meant for JSON files in which each line holds a separate object, or for cases where you want to read in each object individually. That is, with lines=True, although each object has the same properties, they are not collated into a single entity, which is why you get one row with 20130 columns.
Making that modification, you'll be able to access the properties you desire:
import pandas as pd
url = 'https://raw.githubusercontent.com/Cyral/Bakeoof/master/full_format_recipes.json'
df = pd.read_json(url)
print(df['directions'])
print(df['title'])
Ref: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_json.html
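To get the title and the directions side by side rather than as two separate printouts, you can select both columns at once. The sketch below is only an illustration: it uses a small made-up two-recipe sample (not the real file) so that it runs without a download, and it assumes every record has both a title and a directions field.

```python
import io

import pandas as pd

# Hypothetical sample standing in for full_format_recipes.json:
# a single JSON array of recipe objects.
sample = """[
  {"title": "Lentil Soup", "directions": ["Simmer lentils."], "rating": 4.0},
  {"title": "Basil Mayo", "directions": ["Mix basil and mayonnaise.", "Chill."], "rating": 3.5}
]"""

# No lines=True: the whole array is parsed into one row per recipe.
df = pd.read_json(io.StringIO(sample))

# Keep only the two columns of interest.
subset = df[["title", "directions"]]

# Each cell of 'directions' is a Python list of steps.
for title, directions in zip(subset["title"], subset["directions"]):
    print(title)
    for step in directions:
        print("  -", step)
```

With the real file, replace io.StringIO(sample) with the URL or the local path. If some records turn out to lack a field, those cells will be missing values and the inner loop would need a guard.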
If you get rid of lines=True, pandas should parse the file so that you can access your JSON as you normally would. You don't need that option for a file with this structure.
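If you only need the two fields and not a DataFrame, a plain-Python alternative is the standard json module. This is a minimal sketch on a made-up sample (the recipe contents are hypothetical), assuming the file is one JSON array of objects; .get() is used so a record without one of the fields does not raise an error.

```python
import json

# Hypothetical sample standing in for the contents of full_format_recipes.json.
sample = """[
  {"title": "Lentil Soup", "directions": ["Simmer lentils."], "rating": 4.0},
  {"title": "Basil Mayo", "directions": ["Mix basil and mayonnaise.", "Chill."], "rating": 3.5}
]"""

recipes = json.loads(sample)  # a list of dicts, one per recipe

# Keep only the title and the directions of each recipe.
extracted = [
    {"title": r.get("title"), "directions": r.get("directions", [])}
    for r in recipes
]

for recipe in extracted:
    print(recipe["title"], recipe["directions"])
```

For a file on disk, json.load on the opened file would take the place of json.loads(sample).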