How can I extract the data from these strings?

I am writing a program that scrapes data from a job page, and the scraper returns this data:

{"job":{"ciphertext":"~01142b81f148312a7c","rid":225177647,"uid":"1416152499115024384","type":2,"access":4,"title":"Need app developers to handle our app upgrades","status":1,"category":{"name":"Mobile Development","urlSlug":"mobile-development"
,"contractorTier":2,"description":"We have an app currently built, we are looking for someone to \n\n1) Manage the app for bugs etc \n2) Provide feature upgrades \n3) Overall Management and optimization \n\nPlease get in touch and i will share more details. ","questions":null,"qualifications":{"type":0,"location":null,"minOdeskHours":0,"groupRecno":0,"shouldHavePortfolio":false,"tests":null,"minHoursWeek":40,"group":null,"prefEnglishSkill":0,"minJobSuccessScore":0,"risingTalent":true,"locationCheckRequired":false,"countries":null,"regions":null,"states":null,"timezones":null,"localMarket":false,"onSiteType":null,"locations":null,"localDescription":null,"localFlexibilityDescription":null,"earnings":null,"languages":null
],"clientActivity":{"lastBuyerActivity":null,"totalApplicants":0,"totalHired":0,"totalInvitedToInterview":0,"unansweredInvites":0,"invitationsSent":0
,"buyer":{"isPaymentMethodVerified":false,"location":{"offsetFromUtcMillis":14400000,"countryTimezone":"United Arab Emirates (UTC+04:00)","city":"Dubai","country":"United Arab Emirates"
,"stats":{"totalAssignments":31,"activeAssignmentsCount":3,"feedbackCount":27,"score":4.9258937139,"totalJobsWithHires":30,"hoursCount":7.16666667,"totalCharges":{"currencyCode":"USD","amount":19695.83
,"jobs":{"postedCount":59,"openCount":2
,"avgHourlyJobsRate":{"amount":19.999534874418824

But the only data I need is:
- Title
- Description
- Customer activity (lastBuyerActivity, totalApplicants, totalHired, totalInvitedToInterview, unansweredInvites, invitationsSent)
- Buyer (isPaymentMethodVerified, location (country))
- stats (all items)
- jobs (all items)
- avgHourlyJobsRate

Convert the data to a dictionary using json.loads.

Then you can use the dictionary keys to look up or filter the data you want.
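As a minimal sketch of that idea, the snippet below parses a trimmed stand-in for the scraped string and keeps only a few of the requested keys. The sample string, the variable names, and the assumption that "clientActivity" sits directly under "job" are illustrative and inferred from the excerpt in the question, not confirmed by it.

```python
import json

# Trimmed stand-in for the scraped string. The nesting of "clientActivity"
# under "job" is an assumption based on the excerpt in the question.
raw = '''
{"job": {"title": "Need app developers to handle our app upgrades",
         "status": 1,
         "description": "We have an app currently built",
         "clientActivity": {"lastBuyerActivity": null,
                            "totalApplicants": 0,
                            "totalHired": 0}}}
'''

data = json.loads(raw)  # JSON text -> nested Python dicts
job = data["job"]

# Keep only the keys of interest; .get() yields None for a missing key.
wanted = ("title", "description", "clientActivity")
selected = {key: job.get(key) for key in wanted}

print(selected["title"])
print(selected["clientActivity"]["totalApplicants"])
```

Using .get() instead of square brackets avoids a KeyError if a posting happens to lack one of the keys.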

This appears to be a dictionary, so you can extract a value from it with, for example, dictionary["job"]["uid"]. If it is a JSON file, convert the data to a Python dictionary first.
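Chained lookups like this raise an error as soon as one level is missing or null, which the sample data suggests can happen (for instance "location":null under qualifications). One possible way to guard against that is sketched below. The helper name dig is hypothetical, and placing "buyer" at the top level of the dictionary is an assumption, since the excerpt does not show where it is nested.

```python
def dig(mapping, *keys, default=None):
    """Follow keys through nested dicts; return default if any step is missing."""
    current = mapping
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current

# Small stand-in for the parsed data (structure partly assumed).
dictionary = {
    "job": {"uid": "1416152499115024384",
            "qualifications": {"location": None}},
    # Placement of "buyer" at the top level is an assumption.
    "buyer": {"location": {"country": "United Arab Emirates"}},
}

print(dictionary["job"]["uid"])                         # plain chained lookup
print(dig(dictionary, "buyer", "location", "country"))  # nested lookup via helper
# "location" is None here, so the helper returns the default instead of raising.
print(dig(dictionary, "job", "qualifications", "location", "country"))
```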

This kind of data is JSON. Python represents it with the dictionary data type.
Suppose you have your data stored in a string. You might try di = eval(myData) to convert the string to a dictionary (exec will not work for this, because it always returns None) and then access the structured data with di["job"], which returns the job section of the data. On this particular data, though, eval raises a NameError, because JSON's null, true, and false are not Python names.

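di = eval(myData)   # exec() always returns None, so eval() is what is meant here
print(di["job"])    # not reached on this data: eval() raises NameError on null/true/false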

However, this is just a hack and is not recommended: it is messy, unpythonic, and it executes whatever code the string contains.
The appropriate way is to use the json library to convert the data to a dictionary. The code snippet below shows the idea:

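import json

# Stand-in value; replace it with the JSON string you scraped
myData = '{"job": {"title": "Need app developers to handle our app upgrades"}}'
res = json.loads(myData)  # parse the JSON text into a Python dictionary
print(res["job"])         # the top-level key in the sample data is "job"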
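The data as pasted in the question looks cut off in places, and json.loads raises json.JSONDecodeError on incomplete input. A small sketch of handling that case follows; the helper name parse_job and both sample strings are made up for illustration.

```python
import json

def parse_job(text):
    """Return the parsed dict, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # err.pos is the character offset where parsing failed.
        print(f"Invalid JSON at character {err.pos}: {err.msg}")
        return None

complete = '{"job": {"questions": null, "risingTalent": true}}'
truncated = '{"job": {"title": "Need app developers"'  # closing braces missing

res = parse_job(complete)
print(res["job"])  # JSON null became None, true became True

if parse_job(truncated) is None:
    print("Scraped text was incomplete; fetch it again before extracting fields.")
```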
