
Iterating through a dictionary: TypeError: list indices must be integers or slices, not str

I'm new to Python and trying to build a web scraper with Scrapy, and I am getting a lot of non-printing characters and blank spaces in the results. I'm attempting to iterate with a for loop through a dictionary whose values are lists, then run the .strip() method to get rid of all the non-printing characters. Only now I get this error instead: "TypeError: list indices must be integers or slices, not str". I know I must be reaching into the object wrong, but after a few days of sifting through docs and similar exceptions I haven't found a way to resolve it yet.

The code I'm using is:

# -*- coding: utf-8 -*-
import scrapy
from ..items import JobcollectorItem
from ..AutoCrawler import searchIndeed


class IndeedSpider(scrapy.Spider):
    name = 'indeed'
    page_number = 2
    start_urls = [searchIndeed.current_page_url]

    def parse(self, response):
        items = JobcollectorItem()

        position = response.css('.jobtitle::text').extract()
        company = response.css('span.company::text').extract()
        location = response.css('.location::text').extract()

        # print(position[0])

        items['position'] = position
        items['company'] = company
        items['location'] = location

        for key in items.keys():
            prestripped = items[key]
            for object in prestripped:
                object = object.strip('\n')
            items[key] = prestripped

        yield items

I'm using Python 3.7.4. Any tips on simplifying the function to get rid of the nested for loops would also be appreciated. The code for the entire project can be found here.

Thanks for the help!

Edit0: The exception is thrown at line 27, reading: "prestripped = items[key][value] TypeError: list indices must be integers or slices, not str"

Edit1: The data structure is items{'key':[list_of_strings]}, where the dictionary name is items, the keys are strings, and each key's value is a list, with each list element being a string.

Edit2: Updated the code to reflect Alex.Kh's answer. Also, here is an approximation of what is currently getting returned: {company: ['\\nCompany Name', '\\n', '\\nCompany Name', '\\n', '\\n', '\\n',], location: ['Some City, US', 'Some City, US'], position: [' ', '\\n', '\\nPosition Name', ' ', ' Position Name']}

In addition to my comment, I think I know how to simplify and fix your code as well.

...
for key in items.keys():
   restripped = items[key]
   #BEWARE: a novice mistake here as object is just a copy
   for object in restripped: #assuming extract() returns a list
         object=object.strip() # will change a temporary copy
   items[key] = restripped
...

I am not sure why exactly you need a value in your loop, so you could also just say for key in items.keys(): . Your main mistake was probably accessing the dictionary incorrectly (items[key][value] -> items[key], since value is the actual value that corresponds to that key, not an index into it).
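To make the failure concrete, here is a minimal sketch that reproduces the message outside Scrapy. It assumes the item behaves like a plain dict of lists and that the second index was a string; the dict contents and the function name index_list_with_str are made up for illustration.

```python
# Hypothetical stand-in for the Scrapy item: a dict whose values are lists of strings.
items = {"company": ["\nCompany Name", "\n"]}


def index_list_with_str(data):
    """Repeat the failing access pattern and return the error message, if any."""
    for key in data.keys():
        try:
            # data[key] is already the list, so indexing it with a str is invalid.
            data[key]["value"]
        except TypeError as exc:
            return str(exc)
    return None


message = index_list_with_str(items)
print(message)
```

Indexing the dictionary once, as in items[key], already returns the whole list, so no second index is needed to reach the strings.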

Edit: I spotted a huge mistake on my part in the for loop. The loop variable is only a name bound to each element in turn, so the statement object=object.strip() rebinds that name and will not affect the actual list. Guess not using Python for a while does make you forget certain features.

I will leave the incorrect solution as a reminder both to me and to others. The correct way to use the strip() method is as follows:
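A small sketch of the difference, using a made-up list named names rather than the real scraped data: the first loop leaves the list untouched, while the second writes each stripped string back through its index.

```python
names = ["\nCompany Name", "\n"]

# Rebinding the loop variable only changes what the name `name` refers to.
for name in names:
    name = name.strip()

unchanged = list(names)  # snapshot: still contains the newlines

# Assigning through the index stores the stripped string in the list itself.
for i, name in enumerate(names):
    names[i] = name.strip()

print(unchanged)
print(names)
```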

...
#correct solution
for key in items.keys():
    restripped = items[key]
    for i, object in enumerate(restripped):
        # alternatively: restripped[i] = restripped[i].strip()
        restripped[i] = object.strip()
    items[key] = restripped
...
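On the question's request to get rid of the nested for loops, one possible simplification is a dict comprehension with a list comprehension inside. This is only a sketch: strip_values and the sample data are hypothetical, it assumes the item can be treated like a dict of lists of strings, and dropping whitespace-only entries is an extra choice that is not part of the answer above.

```python
def strip_values(data):
    """Return a new dict with every string stripped and empty results dropped."""
    return {
        key: [text.strip() for text in values if text.strip()]
        for key, values in data.items()
    }


# Sample data loosely modelled on the output shown in the question.
sample = {
    "company": ["\nCompany Name", "\n", "\nCompany Name"],
    "location": ["Some City, US"],
    "position": [" ", "\n", "\nPosition Name", " Position Name"],
}

cleaned = strip_values(sample)
print(cleaned)
```

Note that filtering out empty strings changes list lengths, so if the position, company and location lists are meant to line up by index, it may be safer to keep only the strip() call and drop the if clause.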
