Capture text from a webpage using Selenium

I am trying to capture a piece of text that keeps changing on a website. It looks like this:

Order ID : XXIO-123344-3456

The prefix is constant, but the numbers always change. I want to capture this number and store it. I have tried storeTextPresent with the regex pattern regexp:Email.*@.*com. It does return True, but it does not return the matched value. Of course, storeTextPresent is only supposed to return True or False. So how can I capture the exact value?

Here's a screenshot of the relevant part of the webpage. I can't show the whole page, sorry.

[screenshot]

Any ideas?

After recording, I export these tests to Python Remote Control, so Python-specific code is especially welcome.

The assertText command with the regular expression regexp:^XXIO-.+ would resolve the issue. Try it in conjunction with the ID of the element you need to verify.
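Note that assertText only verifies the text. Once the element's text is available in the exported Python test, the value itself could be pulled out with the standard `re` module. This is a minimal sketch, not part of the original answer: the names `ORDER_ID_RE` and `extract_order_id` are made up for illustration, and the pattern (an uppercase prefix followed by two dash-separated digit groups) is only an assumption based on the sample IDs in this thread.

```python
import re

# Assumed format: uppercase prefix, then two dash-separated digit groups,
# e.g. "XXIO-123344-3456". Adjust the pattern if real IDs differ.
ORDER_ID_RE = re.compile(r"[A-Z]+-\d+-\d+")


def extract_order_id(text):
    """Return the first order-ID-like token in text, or None if there is none."""
    match = ORDER_ID_RE.search(text)
    return match.group(0) if match else None


print(extract_order_id("Order ID : XXIO-123344-3456"))  # -> XXIO-123344-3456
```

Unlike a True/False check, this returns the matched string, so it can be stored in a variable for later steps.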

I just had a look in the manual and found the storeText command under "Store Commands and Selenium Variables". My guess is that you should use storeText instead of storeTextPresent.

Also, instead of trying to find the text with a regex pattern, you could try using an XPath, DOM, or CSS locator.

Thanks for the idea, but I couldn't find a locator for the text. This is the markup I captured using Firebug.

<div class="chkOutBox">
<h2 id="tnq" class="marb10">Order Details</h2>
<div class="ordRevAddressArea">
<div class="ordRevDelSlotArea">
<div class="clear"></div>
<div class="bFont">Order ID:&nbsp; BBO-72262-171012</div>
<div class="scartPgHdr">
<h3 class="catHdr">Fruits &amp; Vegetables</h3>

Here we are capturing the ID number (line 6). Maybe someone can tell me how to work out a possible locator for it from the code above. By the way, I solved my problem by capturing the URL of the page, which contains the order ID, and using a regex to separate out the order ID. It's only a temporary solution.
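One possible locator for that markup is an XPath that matches the div by its class and by the "Order ID" label. The following is a rough sketch under assumptions, not a tested solution: `ORDER_ID_XPATH` and `order_id_via_xpath` are illustrative names, `driver` is assumed to be a Selenium WebDriver instance, and the locator strategy is passed as the plain string "xpath", which is the value of WebDriver's `By.XPATH` constant.

```python
# Matches <div class="bFont">Order ID:&nbsp; BBO-72262-171012</div>
ORDER_ID_XPATH = "//div[@class='bFont' and contains(., 'Order ID')]"


def order_id_via_xpath(driver):
    """Return the order ID found via the XPath locator, or None if absent."""
    # "xpath" is the string value of selenium's By.XPATH constant.
    matches = driver.find_elements("xpath", ORDER_ID_XPATH)
    if not matches:
        return None
    # The text looks like "Order ID:  BBO-72262-171012"; keep what follows the colon.
    _, _, value = matches[0].text.partition(":")
    return value.strip() or None
```

The `@class='bFont'` test requires the class attribute to be exactly `bFont`; if the site ever adds a second class to that div, `contains(@class, 'bFont')` would be the looser alternative.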

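Python code.

from selenium.webdriver.common.by import By

def get_order_id(driver):
    """Get the order ID from an Order Details page, or None if absent."""
    # find_elements (plural) returns a list, empty when nothing matches,
    # so no NoSuchElementException handling is needed.
    for element in driver.find_elements(By.CLASS_NAME, "bFont"):
        if "Order ID" in element.text:
            # Text looks like "Order ID:  BBO-72262-171012"; keep the last token.
            return element.text.split()[-1]
    return None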

This assumes that the class name bFont never changes. If it does, you can rewrite the function to search over the div tags instead. It also assumes that the text "Order ID" appears in one of those elements.
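If the class name does change, the div-based fallback mentioned above might look roughly like this. It is only a sketch under assumptions: `get_order_id_from_divs` and `ORDER_ID_LABEL_RE` are hypothetical names, `driver` is assumed to be a Selenium WebDriver instance, and the strategy string "tag name" is the value of WebDriver's `By.TAG_NAME` constant. Because an outer div's text also includes the text of its children, the sketch uses a regex anchored on the "Order ID" label instead of taking the last word.

```python
import re

# Capture the token that follows the "Order ID:" label. In Python 3, \s also
# matches the non-breaking space that &nbsp; produces.
ORDER_ID_LABEL_RE = re.compile(r"Order ID\s*:\s*(\S+)")


def get_order_id_from_divs(driver):
    """Scan every div for the "Order ID:" label and return the ID, or None."""
    # "tag name" is the string value of selenium's By.TAG_NAME constant.
    for div in driver.find_elements("tag name", "div"):
        match = ORDER_ID_LABEL_RE.search(div.text)
        if match:
            return match.group(1)
    return None
```

Scanning every div is slower than a class lookup on a large page, so this is better kept as a fallback than as the first choice.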
