
Can't grab comment text using Selenium, python requests, and beautifulsoup from Gizmodo website

I'm working on a side project trying to web scrape data from this site.

However, when I grab the elements, tags, etc. that contain the text, I keep getting "'NoneType' object has no attribute 'text'" (when trying to print comment.text using BeautifulSoup).

I've tried it on:

<p class="paragraphContainer__ParagraphContainer-sc-77igqf-0 gRtBxu">Well, congrats on it still working, but now you’re walking around with a beatup phone that will always look like a piece of shit. A $10 case would prevent that. An ugly beatup phone will just drive you to get another $1000 pocket super computer sooner. I bet that lens cost you more than $10. </p>

And on:

<div class="reply__content js_reply-content post-content blurable whitelisted-links"><p class="paragraphContainer__ParagraphContainer-sc-77igqf-0 gRtBxu">Well, congrats on it still working, but now you’re walking around with a beatup phone that will always look like a piece of shit. A $10 case would prevent that. An ugly beatup phone will just drive you to get another $1000 pocket super computer sooner. I bet that lens cost you more than $10. </p><p class="paragraphContainer__ParagraphContainer-sc-77igqf-0 gRtBxu">Go buy a case.</p></div>

And on:

//[@id="reply_1829766226"]/div[1]/p[1]/text()

I can do this for another site but not Gizmodo. What am I missing?

Using this command:

text = soup.find("p", attrs="text")

or

text = soup.find_all("p", attrs="text")

With the above elements taking the place of p and the attrs as required, and searching by XPath when required, as in the last element listed above. My goal is to extract the entire text "Well, congrats....."

How about using the API?

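import requests

# Fetch the comments as JSON from the site's AJAX endpoint instead of parsing the page HTML.
url = 'https://gizmodo.com/ajax/comments/views/replies/1829749807?startIndex=0&maxReturned=5&maxChildren=4&approvedOnly=true&cache=true&experimental=true&sorting=top'
data = requests.get(url).json()

for item in data['data']['items']:
    # Replies to each top-level item sit under children -> items.
    for reply in item['children']['items']:
        # body is a list of blocks; a block's 'value' is a list of text fragments.
        for block in reply['body']:
            if 'value' in block:
                for fragment in block['value']:
                    if 'value' in fragment:
                        print(fragment['value'])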

Output:

It’s turning into a longer-lasting problem, there aren’t many good phones available that have a headphone jack at this point. If that’s a dealbreaker, and the notch is a dealbreaker, you’re basically out of options. 
Yeah, I couldn’t care less about the notch really. But connecting peripherals      
 is important and I don’t own/plan on buying bluetooth headphones or carrying around an extra dongle to use wired
 headphones.
I use headphones
 infrequently enough that ensuring I have a battery powered device charged and ready all the time or carrying a dongle in my pocket
isn’t convenient
. But I use it just enough that when I want to connect my phone to the car aux jack (no bluetooth) or use a pair of wired headphones I don’t want to have to go digging for some other component.
USB-C headphones?
“But connecting peripherals is important and I don’t own/plan on buying bluetooth headphones or carrying around an extra dongle to use wired headphones.
”
Why can’t you just leave the dongle attached to the end of the headphones? Or      
 buy a pair of USB-C headphones?
Have you actually handled one in real life, or are you just trolling? I sincerely ask, because I did at my local BB store, and to my amazement the notch was not the first thing I noticed, it was the entirety of the phone that blew me away in it’s beauty. The display actually balances out the notch, that it’s actually not overwhelming or intrusive in any way.
I’m starring your comment
 just because you called their notch a “ballsack”, LOL
. It does dangle quite a lot, I’ll give you that.
I think if I were in the Android ecosystem, I’d buy this phone over anything from Samsung, LG, etc., but looking at the front face design makes me really appreciate the attention to detail
that Apple applies to the curves and proportions of
their phones. The front of my XS Max looks 100x better than this.
Alright guys, you heard the man. Shut it down. Google shouldn’t even try anymore.  
Apparently you can just think of the superior aesthetics of the phone every time you realize how awful the pictures are in comparison.
I mean, my Moto Z (not the “force” version) fell out of my backpack while riding a 
motorcycle the other day with nothing on it but a “TurboPower Pack”, then got run over by a car as I was pulling over to retrieve it.
Still 100% functional, with a new camera lens in the mail.
It’s possible to make a durable phone, just not when the whole damn thing is glass. At that point just use plastic, because it’s going to be covered by a plastic case
 anyway.
Well, congrats on it still working, but now you’re walking around with a beatup phone that will always look like a piece of shit. A $10 case would prevent that. An ugly beatup phone will just drive you to get another $1000 pocket super computer sooner. I bet that lens cost you more than $10.
Go buy a case.
What are you doing with your phone? iPhone 7, no case for over 2
 years. Phone has one little scuff in the top corner. I do have a pop socket, which I find reduced the droppability significantly.
No everyone has butter fingers.
LOL
, get a load of this guy.
P
robably because the specific panel they installed into the Pixel 3XL doesn’t have a foldable element(that the iPhone XS does).
The iPhone XS in truth has the screen fold
under
 the display, that’s how they were able to get the edge-to-edge at the bottom with 
no chin. It’s all an illusion.
Honestly, I personally hate phones with no "chin" whatsoever.  Edge to edge display certainly looks nice but with no buffer between the display and your hand it's too easy to accidentally interact with the edge of the screen.  I still accidentally hit enter a bunch of times with my palm while inputting text on my P2xl.
They did it specifically to troll people like you. Congratulations, you fell for it!
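
The output above prints each text fragment on its own line, so a single comment ends up split across several lines. A minimal sketch of one way to reassemble the fragments into one string per comment is shown below. It assumes the JSON has the same shape the loops above rely on (data -> items -> children -> items -> body, with text under nested 'value' keys). The helper names iter_text and comment_texts, the SAMPLE payload, and its 'type' fields are hypothetical and used only for illustration; they are not part of the site's API. Unlike the loops above, this sketch only reads replies, and it recurses through any depth of nested 'value' lists.

```python
from typing import Any, Iterator, List


def iter_text(node: Any) -> Iterator[str]:
    """Yield every string found under a 'value' key, in document order."""
    if isinstance(node, list):
        for child in node:
            yield from iter_text(child)
    elif isinstance(node, dict):
        value = node.get("value")
        if isinstance(value, str):
            yield value
        elif isinstance(value, list):
            yield from iter_text(value)


def comment_texts(payload: dict) -> List[str]:
    """Return one string per reply, with paragraphs separated by newlines."""
    texts = []
    for item in payload["data"]["items"]:
        for reply in item["children"]["items"]:
            # Join the fragments of each block, then drop blocks with no text.
            paragraphs = ["".join(iter_text(block)) for block in reply["body"]]
            texts.append("\n".join(p for p in paragraphs if p))
    return texts


# Hypothetical payload that mimics the assumed structure of the response.
SAMPLE = {
    "data": {"items": [{"children": {"items": [{"body": [
        {"type": "Paragraph", "value": [
            {"type": "Text", "value": "Well, "},
            {"type": "Text", "value": "congrats."},
        ]},
        {"type": "HorizontalRule"},  # a block without text is skipped
        {"type": "Paragraph", "value": [
            {"type": "Text", "value": "Go buy a case."},
        ]},
    ]}]}}]}
}

if __name__ == "__main__":
    for text in comment_texts(SAMPLE):
        print(text)
```

With a live response, comment_texts(requests.get(url).json()) would be the equivalent call, provided the response still has this shape.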

