Code:
import requests
response = requests.get("https://www.crunchbase.com/search/people/field/organizations/num_employees_enum/anheuser-busch")
response.raise_for_status()  # raises HTTPError on a 4xx/5xx response
# Write the body to disk in 10,000-byte chunks
with open('myFile.txt', 'wb') as webFile:
    for chunk in response.iter_content(10000):
        webFile.write(chunk)
Running it, I got the following error:
requests.exceptions.HTTPError: 416 Client Error: Requested Range Not Satisfiable for url: https://www.crunchbase.com/search/people/field/organizations/num_employees_enum/anheuser-busch
If you remove the line response.raise_for_status(), you will instead receive the following page from Crunchbase:
As you were browsing www.crunchbase.com something about your browser made us think you were a bot. There are a few reasons this might happen:
As far as the site is concerned, you are in fact a bot. Instead of fetching the page with Python requests, you should try using their own API.
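To see which of the two situations you are in without letting raise_for_status() abort the script, you can inspect the status code and body yourself. This is only a minimal sketch: the diagnose helper and the BOT_MARKER string are my own names, and the marker is simply a phrase taken from the page quoted above, so it may not match if Crunchbase changes the wording.

```python
# Hypothetical helper: classify a response from its status code and body text.
# BOT_MARKER is a phrase copied from the block page quoted above (an assumption
# that the wording stays the same).
BOT_MARKER = "made us think you were a bot"


def diagnose(status_code, body):
    """Return a short description of what the server sent back."""
    if BOT_MARKER in body:
        return "blocked: the site served its bot-detection page"
    if status_code >= 400:
        return f"error: HTTP {status_code}"
    return "ok"


# Usage with requests (not executed here, since it needs network access):
#   response = requests.get(url)
#   print(diagnose(response.status_code, response.text))
```

Checking the body first matters here because the block page arrives together with an error status, so looking at the status code alone would hide the real reason.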
EDIT
To use the Crunchbase API, you first need to register at https://about.crunchbase.com/solutions/ for a key. According to the documentation, the free basic access licence should be enough to access organizations.
Once you have registered, you will have a user API key, and you can then make your requests as follows:
https://api.crunchbase.com/v3.1/organizations?user_key=[user_key]
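If you want to see how the key ends up in that URL, here is a small sketch using only the standard library. The build_url function and API_ROOT constant are names I made up for illustration; requests does the same encoding for you when you pass a params dict.

```python
from urllib.parse import urlencode

# Base address taken from the URL above.
API_ROOT = "https://api.crunchbase.com/v3.1"


def build_url(path, user_key):
    """Join an API path and the user key into a full request URL."""
    query = urlencode({"user_key": user_key})  # escapes special characters
    return f"{API_ROOT}/{path.strip('/')}?{query}"


print(build_url("organizations", "your_key"))
# prints https://api.crunchbase.com/v3.1/organizations?user_key=your_key
```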
The equivalent to the query you made using the API would be something like this:
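import requests
url = "https://api.crunchbase.com/v3.1/organizations/anheuser-busch"
params = dict(user_key="your_key")
resp = requests.get(url=url, params=params)
data = resp.json()
# Assumption: the organization's fields sit under data -> properties in the response
properties = data.get("data", {}).get("properties", {})
with open('myFile.txt', 'w') as webFile:
    # write() needs a string, so convert the (possibly missing) number explicitly
    webFile.write(str(properties.get("num_employees_max")))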
Haven't tested it myself but it should get you going.
Here is all the data available for organizations: https://data.crunchbase.com/docs/organization
And here is the reference to get started with the API: https://data.crunchbase.com/docs/using-the-api