
Script to scrape web GUI for string: errors on Python 2.7, runs on Python 3

As part of my job I sometimes need to access the web GUI of a Cisco phone (Cisco 8851, 7960, etc.) to determine which MAC address is assigned to the device. The device name is the MAC address prefixed with SEP (e.g. SEPAABBCCDDEEFF). I need it because the syslogs provided by the call processing server include only the IP address, not the MAC address, for certain events. The MAC address is needed to confirm whether a given set of configurations exists for that device on the server. Manually pulling up 50-100 phones over HTTP to check MAC addresses is awful. I tried to automate this and partly succeeded, but the script misses the mark on Python 3 and fails entirely on Python 2, where it won't run past user input collection.

My questions:

1) General question regarding awk and/or similar tools: to check the data returned from the website for the necessary SEP* string, I use awk to print the instances of SEP* that exist on the same line, but it produces two outputs rather than just the first. I've tried using grep -o "SEP*", but this returns only SEP. Any thoughts on how I can have this return only the first instance of SEP* (e.g. SEPAABBCCDDEEFF), instead of an entire line of HTML?

Issue #1 - As you can see, the awk attempt does provide the first instance cleanly, but the second instance comes with a lengthy bit of garbage on both ends. My intention was to print only a single SEP* value per web link it parses.

@ubuntu:~/Scripts/CiscoScripts$ python transientPhones.py 
How many phones?: 1
What is the phone IP address?: <ip-addr>
SEPAABBCCDDEEFF
width=20></TD><TD><B>SEPAABBCCDDEEFF</B></TD></TR><TR><TD><B>
kenneth@ubuntu:~/Scripts/CiscoScripts$ 

2) When run in a Python 2.7 environment, the script fails with SyntaxError: invalid syntax after collecting the user input. I don't understand why (beyond that I'm doing it wrong or in an incompatible way). My home environment is Python 3.x (latest), and I did not take that into consideration when writing scripts for use in a Python 2.7 environment. As I am new to Python and coding, I have really only been learning Python 3 syntax and conventions. Any thoughts here?

Issue #2 -- This one has me confused. I'm sure there's a simple answer/solution here... I'm not experienced enough to see it.

$:python transientPhones.py
How many phones?: 1
What is the phone IP address?: 192.168.1.1
Traceback (most recent call last):
  File "transientPhones.py", line 13, in <module>
    ipAddress.append(input('What is the phone IP address?: '))
  File "<string>", line 1
    192.168.1.1
            ^
SyntaxError: invalid syntax

Code:

#!/usr/bin/python
#import required modules
import subprocess

#Define Variables
x = input('How many phones?: ')
x = int(x)
ipAddress = []

#Loop to grab IP addresses
for i in range(x) : #Loop X amount of times based on input from user
        ipAddress.append(input('What is the phone IP address?: '))

#Grab XML Data and awk it for SEP*.
for n in ipAddress :
        subprocess.call ("curl --max-time 5 -s http://" + n + "/CGI/Java/Serviceability?adapter=device.statistics.device | awk '/SEP*/{for(i=1;i<=NF;++i)if($i~/SEP*/)print $i}'", shell=True)

You get this error because input in Python 2 and input in Python 3 are completely different functions.

In Python 2, input reads the user's input and evaluates it as a Python expression; in Python 3 you can achieve the same thing using eval(input(...)). So your script evaluates 192.168.1.1 as Python code. That is not valid Python, hence the SyntaxError. (The first prompt succeeded only because 1 happens to be a valid expression.)
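The failure can be reproduced on Python 3 by calling eval directly, which is roughly what Python 2's input did with whatever was typed. This is a minimal sketch; the helper name evaluate_like_py2_input is made up for illustration and is not part of the original script.

```python
def evaluate_like_py2_input(typed):
    """Mimic Python 2's input(): evaluate the typed text as a Python expression."""
    try:
        return eval(typed)
    except SyntaxError as exc:
        # An IP address is not a valid expression, so evaluation fails here.
        return "SyntaxError: {}".format(exc.msg)


# "1" is a valid expression, so the phone-count prompt works.
print(evaluate_like_py2_input("1"))
# "192.168.1.1" is not, so the IP-address prompt blows up.
print(evaluate_like_py2_input("192.168.1.1"))
```

This is also a reason to avoid eval on anything a user types: it will run whatever expression it is given.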

In Python 3, input simply returns the user's input as a string; in Python 2, raw_input does the same thing.

This means that in your case you need to use raw_input on Python 2 and input on Python 3. Of course, you could just replace input with raw_input, but then the code would not work on Python 3, where raw_input does not exist.

There are some good solutions to this problem. One is to rebind input to raw_input on Python 2 and leave everything as it is on Python 3.

#!/usr/bin/python
#import required modules
import subprocess

try:
    input = raw_input
except NameError:
    pass

#Define Variables
x = input('How many phones?: ')
x = int(x)
ipAddress = []

#Loop to grab IP addresses
for i in range(x) : #Loop X amount of times based on input from user
        ipAddress.append(input('What is the phone IP address?: '))

#Grab XML Data and awk it for SEP*.
for n in ipAddress:
        subprocess.call("curl --max-time 5 -s http://" + n + "/CGI/Java/Serviceability?adapter=device.statistics.device | awk '/SEP*/{for(i=1;i<=NF;++i)if($i~/SEP*/)print $i}'", shell=True)

See this answer for more information: Use of input/raw_input in python 2 and 3
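On the first question (the awk and grep output): in a regular expression, * is a quantifier rather than a shell-style wildcard, so SEP* means "SE followed by zero or more P". That is why grep -o "SEP*" prints only SEP, and why the awk loop prints any whitespace-delimited field containing a match, HTML tags included. A minimal Python sketch of extracting just the first device name follows. The pattern (SEP plus 12 hex digits) is an assumption based on the SEPAABBCCDDEEFF example, and first_sep_name and the sample string are illustrative only.

```python
import re

# Assumed format: "SEP" followed by the 12 hex digits of the MAC address.
SEP_PATTERN = re.compile(r"SEP[0-9A-Fa-f]{12}")


def first_sep_name(html):
    """Return the first SEP<MAC> device name found in html, or None."""
    match = SEP_PATTERN.search(html)
    return match.group(0) if match else None


# Sample built from the output shown in the question.
sample = (
    "Cisco IP Phone CP-8851 ( SEPAABBCCDDEEFF ) "
    "width=20></TD><TD><B>SEPAABBCCDDEEFF</B></TD></TR><TR><TD><B>"
)
print(first_sep_name(sample))
```

Because search stops at the first match and group(0) returns only the matched text, each page yields a single name with no surrounding markup.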

I want to post this as an answer because it accomplishes what I was trying to do much more elegantly, actually using Python and its modules as opposed to just spawning a subprocess to run regular commands. I'll also link to where I store my script.

https://github.com/Unhall0w3d/mind-enigma/blob/master/transientPhones_v2.py

Caveats: No error handling. If the script times out attempting to reach the HTTP page, you get a huge traceback. HTTP needs to be accessible. The target URL structure has to be the same (it is in the Cisco IP Communicator software, as well as most of their 7XXX, 8XXX and 9XXX series phones).
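The missing error handling could be sketched roughly as below. To stay dependency-free, this sketch swaps in the standard-library urllib rather than the requests module the script uses; with requests, the analogous approach would be to wrap requests.get in a try block and catch requests.exceptions.RequestException. The function name fetch_page and its opener parameter are hypothetical, added only so the behavior can be exercised without a real phone.

```python
import socket
import urllib.error
import urllib.request


def fetch_page(url, timeout=6, opener=urllib.request.urlopen):
    """Return the response body as bytes, or None if the phone cannot be reached."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.read()
    except (urllib.error.URLError, socket.timeout, ConnectionError) as exc:
        # Report the failure briefly and move on instead of printing a traceback.
        print("Skipping {}: {}".format(url, exc))
        return None
```

In the loop over IP addresses, a None result would then mean "skip this phone and continue with the next one".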

Script:

#!/usr/bin/env python3

import re
import requests
from bs4 import BeautifulSoup

#Define how many phones we need to hit
x = input('How many phones?: ')
x = int(x)

#Collect IP addresses
ipAddress = []

#Here we loop to grab the list of IP Addresses to access.
for i in range(x):
        ipAddress.append(input('What is the phone IP address?: '))

#Here we loop to access each IP address provided (equivalent of Network Configuration page) to collect Device Type + MAC + Registered state.
for n in ipAddress:
        URL = 'http://' + n + '/CGI/Java/Serviceability?adapter=device.statistics.configuration' #URL is dynamically created based on IPs collected
        page = requests.get(URL, timeout=6)
        soup = BeautifulSoup(page.content, 'html.parser')
        # Look for an instance of SEP* or CIPC*, such as CIPCKPERRY or SEPAABBCCDDEEFF. Returned as variable 'results'.
        results = soup.find(string=re.compile(r'SEP\w+|CIPC\w+'))
        # Look for instances of "Active" on the page, indicating the device is registered to a given CCM. Returned as variable 'results2'.
        results2 = soup.find_all(string=re.compile('Active'))
        # If "Active" is not found, report only the device model and name. Otherwise also report the server it is registered to (e.g. cucmpub.ipt.local Active).
        # find_all returns a list (empty when nothing matches), never None, so test for emptiness.
        if not results2:
                print(results)
        else:
                print(results, results2)

Expected output:

kenneth@ubuntu:~/Scripts/CiscoScripts$ python transientPhones_v2.py
How many phones?: 2
What is the phone IP address?: 1.1.1.1
What is the phone IP address?: 2.2.2.2
Cisco Unified IP Phone Cisco Communicator ( CIPCKPERRY ) ['SERVER-FQDN   Active']
Cisco IP Phone CP-8851 ( SEPAABBCCDDEEFF ) ['SERVER-FQDN  Active']

Use case: Transient Connection Events on Cisco Unified Communications Manager generate syslogs that report the IP address of a phone/endpoint that attempted to register against the server but failed mid-process. This can be due to rehoming to a higher-priority server, a lack of server-side configuration for the endpoint, or loss of network on the endpoint's side. Looking through the web pages manually takes significantly more time to identify the MAC address associated with each endpoint. As the IP address of an endpoint can vary, configurations are saved against the phone model and MAC address. The script dramatically speeds up the collection of those models and MAC addresses and, if the phone is actively registered against a server (i.e. if Active is found in the HTML response, results2), it also reports the server FQDN/IP and Active (as they are held within the same HTML tag).

Again, I'm a novice, but in about an hour this morning I was able to get this rolling, and it's functional for what it is.
