
Yelp Dataset - Accessing review.json specifically

I'm doing an online course where we use the Yelp Dataset.

The dataset is available here
https://www.yelp.com/dataset

The download arrives as a yelp.dataset.tar file.

If I extract that file on, say, Windows 7, I get a file named "yelp_dataset" whose type I can't determine because it has no "." extension. The course, which uses Python to get at the review data, goes straight to:

import json

path = 'C:/Users/xyz/Desktop/Python Folder/Data/yelp_dataset/review.json'
f = open(path)
d = json.loads(f.readline())  # parse the first line, which holds one JSON object

However, I obviously don't have review.json or any of the other .json files such as user.json. The dataset documentation says that "Each file is composed of a single object type, one JSON-object per-line.", but I'm not sure how to get at review.json.

Many thanks

Thank you, Furas, you put me on the right track. I should not have been extracting the file using WinZip or similar. The correct approach was to use Python to extract the files:

import tarfile

path = 'aFILEPATH/yelp_dataset/yelp_dataset.tar'

# open the archive
file = tarfile.open(path)

# extract all members into the current working directory
file.extractall()

file.close()
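
Once the archive is unpacked, each file can be read one JSON object per line, as the dataset documentation describes. Below is a minimal sketch of both steps, not the course's own code. The helper names `extract_dataset` and `iter_json_lines` and the `dest_dir` argument are hypothetical, and it assumes the archive contains a member named review.json, as the course path suggests. The member names in your download may differ, so check the list that `extract_dataset` returns.

```python
import json
import tarfile


def extract_dataset(tar_path, dest_dir):
    """Extract every member of the archive into dest_dir and return the member names."""
    # tarfile.open detects compression (none, gzip, bz2, xz) automatically in the default mode
    with tarfile.open(tar_path) as archive:
        names = archive.getnames()
        archive.extractall(dest_dir)
    return names


def iter_json_lines(json_path):
    """Yield one parsed JSON object per non-empty line of the file."""
    # explicit UTF-8 so the result does not depend on the Windows default encoding
    with open(json_path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)
```

For example, `extract_dataset('aFILEPATH/yelp_dataset/yelp_dataset.tar', 'aFILEPATH/yelp_dataset')` followed by `next(iter_json_lines('aFILEPATH/yelp_dataset/review.json'))` should give the first review as a dict, assuming that member exists. Reading lazily, line by line, avoids loading the whole file into memory at once.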

