Python: Adding values to an empty dictionary

Question

I have scraped data from a website and would like to save all of it. However, only the last value is saved. I created an empty dictionary, but I'm struggling to add elements to it.

Here's my code:

from bs4 import BeautifulSoup
import requests
import pandas as pd
import numpy

try:
    source = requests.get('https://www.imdb.com/chart/top/')
    source.raise_for_status()

    soup = BeautifulSoup(source.text,'html.parser')


    movies = soup.find('tbody', class_="lister-list").find_all('tr')    
    
    data = {}

    for movie in movies: 
        
        name = movie.find('td', class_='titleColumn').a.text
        
        rank = movie.find('td', class_="titleColumn").get_text(strip=True).split('.')[0] 

        year = movie.find('td', class_="titleColumn").span.text.strip('()')

        rating = movie.find('td', class_="ratingColumn imdbRating").strong.text
        
except Exception as e:
    print(e)

print(data)

Answer 1

You're close to your goal: build a dict for each movie and append it to a list on each iteration (so data has to be a list, data = [], rather than a dict). You can then create a DataFrame from that list:

for movie in movies:

    data.append({
        'name': movie.find('td', class_='titleColumn').a.text,
        'rank': movie.find('td', class_="titleColumn").get_text(strip=True).split('.')[0],
        'year': movie.find('td', class_="titleColumn").span.text.strip('()'),
        'rating': movie.find('td', class_="ratingColumn imdbRating").strong.text
    })
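
As a rough illustration of why the loop in the question keeps only the last movie: it rebinds name, rank, year and rating on every pass and never writes them into data. The sketch below contrasts that with appending a dict per row. The rows list is a hypothetical stand-in for the scraped table, so nothing here touches the network.

```python
# Hypothetical sample rows standing in for the scraped table rows.
rows = [
    ("Die Verurteilten", "1", "1994", "9.2"),
    ("Der Pate", "2", "1972", "9.2"),
    ("The Dark Knight", "3", "2008", "9.0"),
]

# Pattern from the question: the names are rebound on every pass,
# so after the loop they only hold the values of the final row.
for name, rank, year, rating in rows:
    pass
last_only = {"name": name, "rank": rank, "year": year, "rating": rating}

# Pattern from this answer: build a fresh dict per row and append it.
data = []
for name, rank, year, rating in rows:
    data.append({"name": name, "rank": rank, "year": year, "rating": rating})

print(last_only["name"])  # The Dark Knight
print(len(data))          # 3
```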

Example

from bs4 import BeautifulSoup
import requests
import pandas as pd

source = requests.get('https://www.imdb.com/chart/top/')
source.raise_for_status()

soup = BeautifulSoup(source.text,'html.parser')

movies = soup.find('tbody', class_="lister-list").find_all('tr')
data = []

for movie in movies:

    data.append({
        'name': movie.find('td', class_='titleColumn').a.text,
        'rank': movie.find('td', class_="titleColumn").get_text(strip=True).split('.')[0],
        'year': movie.find('td', class_="titleColumn").span.text.strip('()'),
        'rating': movie.find('td', class_="ratingColumn imdbRating").strong.text
    })

pd.DataFrame(data)

Output

	name	rank	year	rating
0	Die Verurteilten	1	1994	9.2
1	Der Pate	2	1972	9.2
2	The Dark Knight	3	2008	9
3	Der Pate 2	4	1974	9
4	Die zwölf Geschworenen	5	1957	8.9

...
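
Since the question is about saving the data, here is one possible way to write such a list of dicts out as CSV using only the standard library. This is a sketch, not part of the original answer: the helper name rows_to_csv is made up for illustration, the two sample rows are copied from the output above, and an in-memory buffer stands in for a real file.

```python
import csv
import io

# Sample rows copied from the output above (values are strings, as scraped).
data = [
    {"name": "Die Verurteilten", "rank": "1", "year": "1994", "rating": "9.2"},
    {"name": "Der Pate", "rank": "2", "year": "1972", "rating": "9.2"},
]


def rows_to_csv(rows, fieldnames=("name", "rank", "year", "rating")):
    """Hypothetical helper: serialize a list of dicts to CSV text."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=list(fieldnames))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv(data)
print(csv_text)
```

To write to disk instead, the buffer could be swapped for open("movies.csv", "w", newline="", encoding="utf-8"), where the file name is only an example. If the data is already in a DataFrame, DataFrame.to_csv does the same job.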

Answer 2

You can replace your for loop with the one below to build nested dictionaries. That way you can look up a movie by its name and then pick the field you want from it:

for movie in movies:
    
    name = movie.find('td', class_='titleColumn').a.text

    data[name] = {}
    
    rank = movie.find('td', class_="titleColumn").get_text(strip=True).split('.')[0] 

    year = movie.find('td', class_="titleColumn").span.text.strip('()')

    rating = movie.find('td', class_="ratingColumn imdbRating").strong.text

    data[name]["rank"] = rank
    data[name]["year"] = year
    data[name]["rating"] = rating

print(data)
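
A short sketch of how the nested dictionary might be used once it is filled. The sample data is hypothetical but has the same shape the loop above produces. It also shows one caveat of keying by title: if two movies share a name, the later one replaces the earlier one.

```python
# Hypothetical sample in the shape produced by the loop above.
data = {
    "Der Pate": {"rank": "2", "year": "1972", "rating": "9.2"},
    "The Dark Knight": {"rank": "3", "year": "2008", "rating": "9.0"},
}

# Look up one field of one movie by its title.
year = data["Der Pate"]["year"]
print(year)  # 1972

# data["No Such Movie"] would raise KeyError; .get() returns None instead.
missing = data.get("No Such Movie")
print(missing)  # None

# Caveat: assigning to an existing key replaces the earlier entry.
# The values below are made up purely to show the overwrite.
data["Der Pate"] = {"rank": "251", "year": "2030", "rating": "1.0"}
print(len(data))  # 2
```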

Answer 3

I would suggest storing each movie's dict (cur) in data, using the name of the movie as the key:

from bs4 import BeautifulSoup
import requests
import pandas as pd
import numpy

# create the dict before the try block so print(data) works even if the request fails
data = {}

try:
    source = requests.get('https://www.imdb.com/chart/top/')
    source.raise_for_status()

    soup = BeautifulSoup(source.text,'html.parser')

    movies = soup.find('tbody', class_="lister-list").find_all('tr')

    for movie in movies: 
        
        name = movie.find('td', class_='titleColumn').a.text
        
        rank = movie.find('td', class_="titleColumn").get_text(strip=True).split('.')[0] 

        year = movie.find('td', class_="titleColumn").span.text.strip('()')

        rating = movie.find('td', class_="ratingColumn imdbRating").strong.text
        cur = {
            'name': name,
            'rank': rank,
            'year': year,
            'rating': rating
        }
        # storing the cur movie in data but name of the movie as a key 
        data[name] = cur
        
except Exception as e:
    print(e)

print(data)
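
For comparison, the same name-keyed structure can be built with a dict comprehension when the per-movie dicts already exist, and turned back into a list when a table is needed. This is only a sketch: the rows list is hypothetical sample data, and the variable names by_name and back_to_rows are made up for illustration.

```python
# Hypothetical per-movie dicts, shaped like `cur` in the loop above.
rows = [
    {"name": "Die Verurteilten", "rank": "1", "year": "1994", "rating": "9.2"},
    {"name": "Der Pate", "rank": "2", "year": "1972", "rating": "9.2"},
]

# Key each movie dict by its name, as data[name] = cur does in the loop.
by_name = {row["name"]: row for row in rows}
print(by_name["Der Pate"]["rating"])  # 9.2

# Dicts keep insertion order, so the values come back in scraping order.
back_to_rows = list(by_name.values())
print(back_to_rows == rows)  # True
```

A list like back_to_rows is the same shape Answer 1 passes to pd.DataFrame, so the name-keyed dict can still feed a DataFrame when needed.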

3 answers

Answer 1
2 2022-08-15 18:44:13

Answer 2
0 2022-08-15 18:45:41

Answer 3
0 2022-08-15 18:46:37
