Convert DataFrame with url in string format to JSON properly
I have a data frame with 2 columns, one of which consists of URLs.
Sample code:
import pandas as pd
# DataFrame.append was removed in pandas 2.0, so the rows are passed to the constructor instead.
df = pd.DataFrame([
    {'name': 'sample_name', 'image': 'https://images.pexels.com/photos/736230/pexels-photo-736230.jpeg?auto=compress&cs=tinysrgb&dpr=1&w=500'},
    {'name': 'sample_name2', 'image': 'https://cdn.theatlantic.com/assets/media/img/mt/2017/10/Pict1_Ursinia_calendulifolia/lead_720_405.jpg?mod=1533691909'},
], columns=['name', 'image'])
I want to convert this dataframe to JSON directly. I've used the to_json() method of DataFrame, but when I do, it mangles the URLs in the data frame.
Conversion to JSON:
json = df.to_json(orient='records')
When I print it, the conversion has inserted a '\' character before every '/' character in my URLs.
print(json)
Result:
[{"name":"sample_name","image":"https:\/\/images.pexels.com\/photos\/736230\/pexels-photo-736230.jpeg?auto=compress&cs=tinysrgb&dpr=1&w=500"},{"name":"sample_name2","image":"https:\/\/cdn.theatlantic.com\/assets\/media\/img\/mt\/2017\/10\/Pict1_Ursinia_calendulifolia\/lead_720_405.jpg?mod=1533691909"}]
I want the JSON to look like this (no extra '\' in the URLs):
[{"name":"sample_name","image":"https://images.pexels.com/photos/736230/pexels-photo-736230.jpeg?auto=compress&cs=tinysrgb&dpr=1&w=500"},{"name":"sample_name2","image":"https://cdn.theatlantic.com/assets/media/img/mt/2017/10/Pict1_Ursinia_calendulifolia/lead_720_405.jpg?mod=1533691909"}]
I checked the documentation of to_json() and other questions as well, but couldn't find a way to deal with it. How can I convert my URL strings to JSON exactly as they are in the data frame?
Pandas uses ujson [PyPI] internally to encode the data to a JSON blob. By default, ujson escapes forward slashes; this behaviour is controlled by its escape_forward_slashes option.
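It may help to know that the escaped output is still valid JSON: the JSON grammar allows "\/" as an escape for "/", so any conforming parser decodes both forms to the same string. A minimal sketch using only the standard library, with a shortened version of the first URL from the question as sample data:

```python
import json

# Escaped form, as produced by to_json() in the question (raw string keeps the backslashes).
escaped = r'{"image": "https:\/\/images.pexels.com\/photos\/736230\/pexels-photo-736230.jpeg"}'
# Unescaped form, as the question would like it to look.
plain = '{"image": "https://images.pexels.com/photos/736230/pexels-photo-736230.jpeg"}'

decoded_escaped = json.loads(escaped)
decoded_plain = json.loads(plain)

print(decoded_escaped["image"])          # the URL without any backslashes
print(decoded_escaped == decoded_plain)  # True
```

So the backslashes only matter if the raw JSON text itself has to look a certain way, for example when it is shown to a person or compared as a string.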
Instead, you can call json.dumps(…) on the result of converting your dataframe to a list of dictionaries with .to_dict('records'):
>>> import json
>>> print(json.dumps(df.to_dict('records')))
[{"name": "sample_name", "image": "https://images.pexels.com/photos/736230/pexels-photo-736230.jpeg?auto=compress&cs=tinysrgb&dpr=1&w=500"}, {"name": "sample_name2", "image": "https://cdn.theatlantic.com/assets/media/img/mt/2017/10/Pict1_Ursinia_calendulifolia/lead_720_405.jpg?mod=1533691909"}]
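A possible variation, offered as a sketch rather than as part of the answer above: let pandas serialise the frame first, then decode and re-encode the result with the standard library. This keeps pandas' own conversion of the values (which may matter if the frame holds dates or missing values) and leaves the slashes unescaped. The frame below is a shortened stand-in for the one in the question.

```python
import json

import pandas as pd

# Shortened stand-in for the question's data frame.
df = pd.DataFrame([
    {"name": "sample_name", "image": "https://images.pexels.com/photos/736230/pexels-photo-736230.jpeg"},
    {"name": "sample_name2", "image": "https://cdn.theatlantic.com/assets/media/img/mt/2017/10/lead_720_405.jpg"},
])

# pandas encodes the frame, json.loads decodes it (undoing any "\/" escapes),
# and json.dumps re-encodes it without escaping forward slashes.
records = json.loads(df.to_json(orient="records"))
result = json.dumps(records)

print(result)
```

The cost is one extra encode/decode pass over the data, which is unlikely to matter for small frames.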