繁体   English   中英

如何遍历文件并将所有id值放入列表中[Python]

[英]How to iterate over a file and put the all the id values into an list [Python]

{'asin': '0756029929', 'description': 'Spanish Third-year Pin, 1 inch in diameter.  Set of 10.', 'title': 'Spanish Third-year Pin Set of 10', 'price': 11.45, 'imUrl': 'http://ecx.images-amazon.com/images/I/51AqSOl7qLL._SY300_.jpg', 'salesRank': {'Toys & Games': 918374}, 'categories': [['Clothing, Shoes & Jewelry', 'Novelty, Costumes & More', 'Novelty', 'Clothing']]}
{'asin': '0756029104', 'description': 'Viva Espaol pin, 1 x 1 inch. Set of 10.', 'title': 'Viva Espanol Pins Set of 10', 'price': 11.45, 'imUrl': 'http://ecx.images-amazon.com/images/I/51By%2BZpF9DL._SY300_.jpg', 'salesRank': {'Home & Kitchen': 2651070}, 'categories': [['Clothing, Shoes & Jewelry', 'Novelty, Costumes & More', 'Novelty', 'Clothing']]}
{'asin': '0839933363', 'description': 'This necklace from the popular manga and anime series, Death Note. The necklace\'s charm is black and silver with the text, "Death Note" upon it. The approx. length of the necklace is 12"Dimension & Measurement:Length: Approx. 12"', 'title': 'Death Note Anime Manga: Cross Logo necklace', 'imUrl': 'http://ecx.images-amazon.com/images/I/51f0HkHssyL._SY300_.jpg', 'salesRank': {'Toys & Games': 1350779}, 'categories': [['Clothing, Shoes & Jewelry', 'Novelty, Costumes & More', 'Costumes & Accessories']]}
{'asin': '1304567583', 'description': 'pink bikini swimwear glow in the dark fashion', 'title': 'Pink Bikini Swimwear Glow in the Dark Fashion', 'price': 19.99, 'imUrl': 'http://ecx.images-amazon.com/images/I/41pS%2B98jhlL._SY300_.jpg', 'categories': [['Clothing, Shoes & Jewelry', 'Novelty, Costumes & More', 'Costumes & Accessories', 'Costumes']]}
{'asin': '1304567613', 'description': 'Bikini Swimwear glow in the dark fashion blue', 'title': 'Bikini Swimwear Blue Glow in the Dark Fashion', 'price': 29.99, 'imUrl': 'http://ecx.images-amazon.com/images/I/41ZNIUvYkyL._SY300_.jpg', 'categories': [['Clothing, Shoes & Jewelry', 'Novelty, Costumes & More', 'Costumes & Accessories', 'Costumes']]}
{'asin': '1465014578', 'title': '2013 Desk Pad Calendar', 'imUrl': 'http://ecx.images-amazon.com/images/I/51NadiHHHsL._SX342_.jpg', 'related': {'also_bought': ['B009SDBX0Q', 'B009DCUY1G'], 'bought_together': ['B009SDBX0Q', 'B009DCUY1G']}, 'salesRank': {'Clothing': 505645}, 'categories': [['Clothing, Shoes & Jewelry', 'Novelty, Costumes & More', 'Band & Music Fan', 'Accessories']]}
{'asin': '1620574128', 'related': {'also_bought': ['B0015KET88', 'B00004WKPP', 'B000F8T8U0', 'B000F8V736', 'B000F8VAOM', 'B0015KGFQM', 'B003U6P4OS', '1564519392', 'B000F8XF8Q', 'B0042SR3E2', 'B004PBLVDU', 'B000G3LR9Y', 'B0006PKZBI', 'B0007PC9CK', 'B001G98DS0', 'B001UFWJLW', 'B003S8XLWA', '0486214834', '1609964713', 'B000P1PVMQ', '0590308572', 'B000QDZY52', '1564514188', 'B0006PKZ7W', 'B000T2YKIM', 'B000QDTWF0', 'B000FA6DXS', 'B0007P94ZA', 'B000WA3FKU', 'B00004WKPU', 'B000F8XF68', 'B004DJ51JE', '

我有此文件,想将asins的所有值放入自己的列表中,这是到目前为止的内容,但是我无法弄清楚如何执行此操作或执行此操作的最佳方法,因为该文件具有.json的扩展名,但不是有效的json格式,因此为什么我要像普通文本文件一样尝试这样做。

with open('File.json', "r") as f:
    for line in f:
         if 'asin' in line:
        #Code that gets the values of asins
         clothing_ids.append(#then add them values to clothing_ids)

print(clothing_ids)

将列表理解与ast.literal_eval

import ast
data = ast.literal_eval(open('File.json').read())
asins = [i['asin'] for i in data]

如果保证文件没有污染,则可以eval每一行:

with open('File.json', "r") as f:
    asins = [eval(line)['asin'] for line in f]

这是相同的代码,使用@ Ajax1234的ast.literal_eval()来避免受污染的文件出现问题,但是使用我的列表理解功能,该功能可以单独评估每一行,因此不会将整个数据集存储为临时文件。

import ast

with open('File.json', "r") as f:
    asins = [ast.literal_eval(line)['asin'] for line in f]

对于基于注释的奖励问题,请获取所有“ also_bought”项目的列表,包括重复项:

import ast

also_bought = []
asins = []
with open('File.json', "r") as f:
    for line in f:
        item = ast.literal_eval(line)
        asins.append(item['asin'])
        if 'related' in item:
           related = item['related']
           if 'also_bought' in related:
              also_bought.extend(related['also_bought'])

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM