Python's lxml.html, grab all at once

Question

Using lxml.html, I was able to get the data-pid using fromstring(source).xpath('/html/body/article/section/div[1]/div[2]/p[2]')[0].get('data-pid')

However, it only returns one of them (in this case 4559733570). I recall being able to grab all of them at once, but I don't remember how. Can somebody point me in the right direction?

HTML Code looks like this:

http://i.imgur.com/hn0Jqyi.png

Answer 1

xpath, directly returning all the values

Assuming you care about attributes data-pid in all p elements:

>>> fromstring(source).xpath("//p/@data-pid")

should return what you need: a list with the data-pid value of every p element that has one.
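A minimal sketch of the attribute-selecting query, run against made-up markup (the real page is only available as a screenshot, so the `source` string and its pid values here are assumptions for illustration):

```python
from lxml.html import fromstring

# Hypothetical markup; the actual page structure is only shown in the screenshot.
source = """
<html><body>
  <div>
    <p data-pid="111">first</p>
    <p data-pid="222">second</p>
    <p>no data-pid here</p>
  </div>
</body></html>
"""

# "//p/@data-pid" selects the attribute itself, so xpath() returns
# a list of strings rather than a list of elements.
pids = fromstring(source).xpath("//p/@data-pid")
print(pids)
```

Paragraphs without a data-pid attribute are simply skipped, so the list only contains real values.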

Answer 2

From your PNG and XPath query, it seems that all the <p> elements you are interested in are nested in the same <div>. The XPath query /html/body/article/section/div[1]/div[2]/p[2] returns only the second <p> element in the selected div (that is what the trailing [2] means). If you want all the paragraphs in that div, drop the index and use /html/body/article/section/div[1]/div[2]/p:

[ p.get("data-pid") for p in fromstring(source).xpath('/html/body/article/section/div[1]/div[2]/p') ]
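To see the effect of the trailing [2], here is a small sketch comparing both queries. The markup and pid values are invented to mirror the path in the question, not taken from the real page:

```python
from lxml.html import fromstring

# Hypothetical markup shaped like /html/body/article/section/div[1]/div[2]/p
source = (
    "<html><body><article><section>"
    "<div>"
    "<div>intro</div>"
    '<div><p data-pid="111">a</p><p data-pid="222">b</p><p data-pid="333">c</p></div>'
    "</div>"
    "</section></article></body></html>"
)

tree = fromstring(source)
base = "/html/body/article/section/div[1]/div[2]"

# With the index: only the second <p> in the selected div.
second_only = [p.get("data-pid") for p in tree.xpath(base + "/p[2]")]

# Without the index: every <p> that is a direct child of the selected div.
all_pids = [p.get("data-pid") for p in tree.xpath(base + "/p")]

print(second_only)
print(all_pids)
```

Note that p.get("data-pid") returns None for a paragraph that lacks the attribute, so with real data you may want to filter those out.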

2 answers

Answer 1
0 2014-07-09 03:40:19

Answer 2
0 ACCEPTED 2014-07-09 04:11:58
