How to extract text which lies after <strong> tag in element

Question

I'm trying to extract text from an element which looks like this:

<div><strong>"Beginning_of_text"</strong>"Rest_of_text"</div>

When I try to extract "Rest_of_text" using the Scrapy shell with

response.css("div::text").extract()

it gives me nothing. Do I have to use some special command to get the text that lies after a <strong> tag inside an element?

Answer 1

For just "Rest_of_text", you can use response.xpath('//div/strong/following-sibling::text()').get()

Answer 2

Given the text you provided, the command you mentioned should have returned the following:

['"Rest_of_text"']

The problem may occur if there is whitespace before the strong tag, e.g.:

<div>   <strong>"Beginning_of_text"</strong>"Rest_of_text"</div>

In this case, if you execute the same command, you'll get this:

['   ', '"Rest_of_text"']

But if there's nothing after the strong tag, you'll get this:

['   ']
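One way to cope with the whitespace-only entries is to drop them after extraction. This is only a sketch: non_blank is a hypothetical helper name, and the three lists are the results quoted above rather than live output.

```python
def non_blank(texts):
    """Keep only the text nodes that contain something besides whitespace."""
    return [t for t in texts if t.strip()]

# The three results discussed above.
no_whitespace = non_blank(['"Rest_of_text"'])
leading_whitespace = non_blank(['   ', '"Rest_of_text"'])
nothing_after_strong = non_blank(['   '])

print(no_whitespace)
print(leading_whitespace)
print(nothing_after_strong)
```

Note that this cannot tell text before the strong tag from text after it, so it only helps when the part before the tag is blank.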

The best way I know to handle all these cases is the following:

>>> full_text = ''.join(response.xpath('//div//text()').extract())
>>> # maxsplit=1 keeps the result at two parts even if the strong text occurs again later
>>> before_strong, after_strong = full_text.split(response.css('strong::text').extract_first(), 1)

So for the text you provided, before_strong will be '' and after_strong will be '"Rest_of_text"', which seems to be what you want.
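The two-name unpacking only works when the strong text is actually found in the joined text. A slightly more defensive variant, sketched below with str.partition, always returns exactly two parts. The helper name split_around is hypothetical, and the strings stand in for the joined text nodes and for the result of extract_first().

```python
def split_around(full_text, marker):
    """Return (before, after) around the first occurrence of marker.

    If marker is None, empty, or not found, all text counts as 'before'.
    """
    if not marker:
        return full_text, ""
    before, _found, after = full_text.partition(marker)
    return before, after

marker = '"Beginning_of_text"'

# Text nodes already joined, for the cases discussed above.
plain = split_around('"Beginning_of_text""Rest_of_text"', marker)
padded = split_around('   "Beginning_of_text""Rest_of_text"', marker)
no_tail = split_around('   "Beginning_of_text"', marker)
no_strong = split_around('"Rest_of_text"', None)

print(plain)
print(padded)
print(no_tail)
print(no_strong)
```

In the shell you would pass full_text and response.css('strong::text').extract_first() as the two arguments.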

solution1
2 2018-11-07 12:41:21

solution2
0 ACCEPTED 2018-11-06 12:03:09
