
Python Selenium: Can't Get HREF Link Off Instagram in <time> tags

PostLinkExtraction = driver.find_element_by_xpath("//article[1]/div[3]/div[1]/div/div[2]/div[1][*[local-name()='a']]").get_attribute('href')
print (PostLinkExtraction)

I'm trying to print the href link from the timestamp under the first post on my Instagram timeline. The code above returns None for some reason. Below is the full code for anyone who wants to run it and see where I may have gone wrong. The overall goal is to extract the href link from the <time> tags. Below is an image of where the <time> tags appear in developer tools:

[Image: the <time> tags in developer tools]

from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from time import sleep
from selenium.webdriver.common.keys import Keys
from selenium import webdriver

user = 'username'
passw = 'password'



driver = webdriver.Chrome(ChromeDriverManager().install())
driver.get('https://www.instagram.com/')
driver.implicitly_wait(10)

driver.find_element_by_name('username').send_keys(user)
driver.find_element_by_name('password').send_keys(passw)
Login = "//button[@type='submit']"
sleep(2)
driver.find_element_by_xpath(Login).submit()
sleep(1)
# Logs into Instagram
print ('Logged In')

#------------------------ATTENTION

NotNow = "//button[contains(text(),'Not Now')]"
driver.find_element_by_xpath(NotNow).click()
# Clicks Pop Up
print ('Close Pop Up')

# It's weird but the pop up opens once, only after this page.
# If ever a problem delete one, or have the first click be
# directed to your Instagram Profiles timeline

NotNow = "//button[contains(text(),'Not Now')]"
driver.find_element_by_xpath(NotNow).click()
#Clicks Pop Up; Comment out the line above if it causes an error
print ('Close Pop Up')

#-----------------------------------



driver.refresh()
print ('refreshing')
driver.implicitly_wait(10)
PostLinkExtraction = driver.find_element_by_xpath("//article[1]/div[3]/div[1]/div/div[2]/div[1][*[local-name()='a']]").get_attribute('href')
print (PostLinkExtraction)

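The issue is your XPath. The predicate in div[1][*[local-name()='a']] selects the div that has an a child, not the a element itself, and a div has no href attribute, so get_attribute('href') returns None. Step into the a element instead and the href of your first post is printed: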

PostLinkExtraction = driver.find_element_by_xpath("//article[1]/div[3]/div[1]/div/div[2]/div[1]/a").get_attribute('href')
print (PostLinkExtraction)

The result:

[Image: screenshot of the result]
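The difference between the two XPaths can be reproduced without a browser. The following is a minimal sketch using Python's standard xml.etree.ElementTree, which supports a small XPath subset, including the `[tag]` child predicate. The markup and the `/p/EXAMPLE/` link are made up for illustration and are not Instagram's real page structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup: a div wrapping a link that wraps a <time> tag.
markup = """
<article>
  <div>
    <a href="/p/EXAMPLE/"><time datetime="2020-01-01T00:00:00">1 day ago</time></a>
  </div>
</article>
"""
root = ET.fromstring(markup)

# A predicate filters the current step: this matches the <div> that HAS an <a> child.
div_match = root.find(".//div[a]")
print(div_match.tag, div_match.get("href"))

# A further path step moves into the child: this matches the <a> itself.
link_match = root.find(".//div/a")
print(link_match.tag, link_match.get("href"))
```

The first lookup lands on the div, which carries no href, while the second lands on the a element and returns its link. The same reasoning applies to the Selenium XPath in the question.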

Short answer: stop relying on XPaths and find the element you're looking for this way:

1 - put all the elements with the same tag in an array

2 - look for two or three attributes that make the element unique

3 - loop over the array, pick out the matching element, and use it

Easy, fast and clean.
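The three steps above could be sketched roughly as follows. The helper name find_timestamp_link is hypothetical, and the two checks (an href containing "/p/" and a nested <time> element) are assumptions about how a post's timestamp link might be recognised, not something confirmed by the question. The function only relies on the get_attribute and find_elements methods of Selenium's WebElement.

```python
def find_timestamp_link(anchors):
    """Return the href of the first anchor that looks like a post timestamp link.

    `anchors` is any iterable of WebElement-like objects (step 1: all elements
    with the same tag, collected by the caller). Returns None if nothing matches.
    """
    for anchor in anchors:
        href = anchor.get_attribute("href")
        # Assumed attribute 1: post permalinks contain "/p/" in their URL.
        if not href or "/p/" not in href:
            continue
        # Assumed attribute 2: the anchor wraps a <time> element.
        # "tag name" is the string value of selenium's By.TAG_NAME.
        if anchor.find_elements("tag name", "time"):
            return href
    return None
```

With the driver from the question, this might be called as print(find_timestamp_link(driver.find_elements_by_tag_name('a'))). On Selenium 4 the list would instead come from driver.find_elements(By.TAG_NAME, 'a').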
