I am using Python pandas to extract some data (page titles), but the output order differs from the order of the URLs I put in the code

Question

I wrote the code, ran it, and got the .xlsx file, but the output is not in the same order as the URL list I put in the code.

# Import the libraries
import re
import lxml
import chardet
from os import truncate
import bs4
from bs4 import BeautifulSoup
import multiprocessing
import requests
import pandas as pd
from fake_useragent import UserAgent
import numpy as np

urls = list(('https://isabad.com/advanced-professional-email-templates-opencart-extension' ,
'https://isabad.com/seo-basic-pack-opencart-extension',
'https://isabad.com/x-shipping-pro',
'https://isabad.com/bot-blocker-opencart-extension',
'https://isabad.com/opencart-mobile-application'
))

dit = {}
user_agent = UserAgent()
for url in urls:
    # Fetch the page with a Chrome-like user agent
    data = requests.get(url, headers={"user-agent": user_agent.chrome})
    soup = bs4.BeautifulSoup(data.content, "lxml")
    # Store the page's <title> tags, keyed by URL
    dit[url] = soup.find_all("title")
    ex = pd.DataFrame({"title": dit})
    print(ex)
    ex.to_excel('sasa.xlsx', index=False, engine='xlsxwriter')

How can I fix this problem?

Answer 1

You are using the set data structure to store the URLs, and a set in Python is an unordered data structure, so iterating over it is not guaranteed to follow the order in which you wrote the items. To get the output in the same order, store the URLs in a list, as follows:

urls = [
  'https://www.sample.com/search/category-mobile/' ,
  'https://www.sample.com/search/category-tablet-ebook-reader',
  'https://www.sample.com/search/category-laptop/',
  'https://www.sample.com/search/category-computer-parts/',
  'https://www.sample.com/search/category-office-machines/'
]
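
If the set was there to drop duplicate URLs, the sketch below shows one way to remove duplicates while still keeping the order in which they were written. It relies on dicts preserving insertion order (guaranteed since Python 3.7). The helper name unique_in_order and the sample URLs are assumptions made for illustration only.

```python
# Hypothetical URLs, used only for illustration.
urls = [
    "https://www.sample.com/search/category-mobile/",
    "https://www.sample.com/search/category-laptop/",
    "https://www.sample.com/search/category-mobile/",  # duplicate entry
]


def unique_in_order(items):
    """Drop duplicates while keeping the first-seen order of the items."""
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(items))


ordered = unique_in_order(urls)
for position, url in enumerate(ordered):
    print(position, url)
```

Looping over ordered then visits each URL once, in the order it first appeared in the list.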

Cheers!

Answer 2

Use a list so that the results come out in the same order that you defined.

urls = ['https://www.sample.com/search/category-mobile/' ,
'https://www.sample.com/search/category-tablet-ebook-reader',
'https://www.sample.com/search/category-laptop/',
'https://www.sample.com/search/category-computer-parts/',
'https://www.sample.com/search/category-office-machines/'
]
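
To make the row order explicit, one option is to collect the results as a list with one entry per URL instead of a dict keyed by URL. The following is only a sketch under stated assumptions: it uses the standard library's html.parser in place of BeautifulSoup so that it runs without extra packages, and it takes the page-fetching function as a parameter so it can run offline. The names TitleParser, extract_title, collect_rows and fake_pages are hypothetical.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html):
    """Return the stripped text of the first <title>, or '' if there is none."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip()


def collect_rows(urls, fetch_html):
    """Return one {"url", "title"} dict per URL, in the same order as urls."""
    return [{"url": url, "title": extract_title(fetch_html(url))} for url in urls]


# Stand-in for a real HTTP request so the sketch runs offline (hypothetical pages).
fake_pages = {
    "https://www.sample.com/b": "<html><head><title>Page B</title></head></html>",
    "https://www.sample.com/a": "<html><head><title>Page A</title></head></html>",
}
rows = collect_rows(list(fake_pages), fake_pages.__getitem__)
print(rows)
```

In the real script, fetch_html could be something like lambda url: requests.get(url).text. Passing rows to pd.DataFrame(rows) gives one row per dict, in list order, so the spreadsheet follows the order of the URL list.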


2 solutions

Solution 1
2 2021-01-20 15:10:08

Solution 2
1 2021-01-20 15:17:26
