How to select multiple text nodes as a single string using xpath expression?

Question

I am pretty new to XPath and I am trying to scrape a website using an XPath expression in Scrapy. The structure of the page that I am trying to scrape is:

...
<div class="article-body">
<p class="body">Text1</p>
<p class="body">Text2</p>
<p class="body">Text3</p>
...

The xpath that I am trying is-

//div[@class="article-body"]/p/text()

But all I get is Text1 in my database. Instead of this, I want the output as-

Text1.Text2.Text3

I think I should use concat or string-join or some function like that. But I am unable to work it out. Since I have to pass this xpath expression as an argument in scrapy, I need to have it as a single expression only.

I tried feeding the concat function into my django-scraper as-

concat(//div[@class="article-body"]/p)

But it threw this error at me-

File "C:\Anaconda2\lib\site-packages\scrapy\selector\unified.py", line 100, in xpath raise ValueError(msg if six.PY3 else msg.encode("unicode_escape"))

I got this same error when I tried (there is no other <p> element on the page)-

concat(//p)

or

string-join(//p)

However, when I try string(//p), I get only Text1 in my database.

Answer 1

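Have you tried this?

concat(//div[@class="article-body"]/p)

Note that XPath 1.0 concat() requires at least two arguments, so this single-argument form is rejected as an invalid expression, which is consistent with the ValueError in the question. In Selenium, findElement also needs an XPath that selects an element rather than a string, so the call below uses the plain path and returns the text of the first matching <p> only:

String value = myTestDriver.findElement(By.xpath("//div[@class='article-body']/p")).getText();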

OR

You need to do something like this

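    // Requires: java.util.ArrayList, java.util.List, org.openqa.selenium.By, org.openqa.selenium.WebElement
    List<String> names = new ArrayList<>();  // must be initialised before add() is called
    // Single quotes inside the XPath avoid clashing with the Java string delimiters.
    List<WebElement> options = myTestDriver.findElements(By.xpath("//div[@class='article-body']/p"));
    System.out.println(options.size());
    for (WebElement option : options) {
        String text = option.getText();
        System.out.println(text);
        names.add(text);
    }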

Now you can concatenate the collected strings, for example with String.join and "." as the separator.
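
Since the question is about Scrapy, here is the same collect-then-join idea as a minimal Python sketch. It uses the standard library's xml.etree.ElementTree as a stand-in for a Scrapy selector so that it runs without extra packages. The sample markup and the join_paragraphs helper are hypothetical, written only to mirror the structure shown in the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup mirroring the structure in the question.
SAMPLE = """
<html><body>
<div class="article-body">
<p class="body">Text1</p>
<p class="body">Text2</p>
<p class="body">Text3</p>
</div>
</body></html>
"""


def join_paragraphs(markup, sep="."):
    """Collect the text of every <p> under div.article-body and join it with sep."""
    root = ET.fromstring(markup.strip())
    # ElementTree supports a limited XPath subset, which is enough for this path.
    paragraphs = root.findall(".//div[@class='article-body']/p")
    # itertext() also picks up text inside nested tags such as <b> or <a>.
    texts = ("".join(p.itertext()).strip() for p in paragraphs)
    # Skip paragraphs that contain no text at all.
    return sep.join(t for t in texts if t)


result = join_paragraphs(SAMPLE)
print(result)  # Text1.Text2.Text3
```

In a Scrapy callback, the equivalent would be along the lines of ".".join(response.xpath('//div[@class="article-body"]/p/text()').extract()). This does the joining in Python rather than inside a single XPath expression, so it only helps if the scraper setup allows post-processing of the extracted values.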
