Python script to collect all hostnames of IP addresses whose bytes are all prime
I have a Python script that collects the hostnames of IP addresses whose bytes are all prime. For example, 211.13.17.2 is a valid IP according to my problem set, because every byte (in its decimal representation) is a prime.
Code:
from itertools import product
import socket

# prime or not
def prime(n):
    if n > 1:
        p = 0
        for i in range(2, n-1):
            if divmod(n, i)[1] == 0:
                p = 1
                break
        if p == 0:
            return True

def get_host_name(b1, b2, b3, b4):
    addr = str(b1) + '.' + str(b2) + '.' + str(b3) + '.' + str(b4)
    try:
        return socket.gethostbyaddr(addr)
    except socket.herror:
        pass

# find host names whose ip addresses are all primes
byte = [b for b in range(0, 256) if prime(b)]
ips = list(product(byte, byte, byte, byte))
print 'Total ips = ', len(ips)
for ip in ips:
    if get_host_name(*ip):
        print get_host_name(*ip)
The problem is that my script is too slow. I need expert help to optimize this code. Please pinpoint all mistakes and ways to make it run faster.
For the prime numbers, you can use something like this:
import numpy as np
isprime = lambda x: np.all(np.mod(x, range(2, 1 + int(np.sqrt(x)))))
primes = np.array([ x for x in range(2, 255) if isprime(x) ])
and you can get a generator of IP addresses with
import itertools
('{}.{}.{}.{}'.format(*x) for x in itertools.product(primes, repeat=4))
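For a range this small, a plain sieve of Eratosthenes works without NumPy. The following is only a minimal sketch, not the code from the answer above; the names `primes_below` and `prime_ips` are made up for illustration. It also makes the size of the search space visible when each byte may be any prime below 256.

```python
from itertools import islice, product


def primes_below(limit):
    """Return all primes p with 2 <= p < limit (sieve of Eratosthenes)."""
    if limit < 3:
        return []
    is_prime = [True] * limit
    is_prime[0] = is_prime[1] = False
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            # Cross out every multiple of n, starting at n * n.
            for multiple in range(n * n, limit, n):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


def prime_ips(primes):
    """Lazily yield dotted-quad strings whose four bytes all come from primes."""
    for octets in product(primes, repeat=4):
        yield '{}.{}.{}.{}'.format(*octets)


if __name__ == '__main__':
    primes = primes_below(256)
    # Number of usable byte values and number of candidate addresses.
    print(len(primes), len(primes) ** 4)
    # Peek at the first few addresses without building the whole list.
    print(list(islice(prime_ips(primes), 3)))
```

There are 54 primes below 256, so the full problem has 54^4 = 8,503,056 candidate addresses. Because `prime_ips` is a generator, none of them has to be stored in a list up front, unlike `list(product(...))` in the question.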
But most likely the code is slow in the socket part, and because of the number of combinations that it needs to check. For that you may try parallelism, using a pool of worker processes; something like this:
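from multiprocessing import Pool
from socket import gethostbyaddr

def gethost(addr):
    # Return the (hostname, aliases, addresses) tuple, or None on failure.
    try:
        return gethostbyaddr(addr)
    except Exception:
        # Failed lookups and malformed addresses (such as the second
        # entry below) simply yield None.
        pass

if __name__ == '__main__':
    p = Pool(3)
    print(p.map(gethost, ['74.125.228.137',
                          '11.222.333.444',
                          '17.149.160.49',
                          '98.139.183.24']))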
Edit: for only the prime numbers less than 50 (50K+ combinations) and 20 worker processes, it takes almost 6 minutes on my machine and finds 16K+ results. So, with this huge number of combinations, parallelism cannot help much.
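Since a reverse lookup mostly waits on the network, threads may be a lighter alternative to worker processes. The following is a sketch under assumptions, not a measured improvement: the names `lookup` and `resolve_all`, the `resolver` parameter (there so the code can be tried without network access), and the values `workers=50` and `batch_size=1000` are all illustrative. It returns only the addresses that resolve and calls the resolver once per address, whereas the loop in the question calls `get_host_name` twice for every hit.

```python
import socket
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def lookup(addr, resolver=socket.gethostbyaddr):
    """Return (addr, hostname), or None when the reverse lookup fails."""
    try:
        return addr, resolver(addr)[0]
    except OSError:
        # On Python 3, socket.herror and socket.gaierror subclass OSError.
        return None


def resolve_all(addrs, workers=50, batch_size=1000,
                resolver=socket.gethostbyaddr):
    """Yield (addr, hostname) for every address that resolves.

    Addresses are consumed in batches, so a generator with millions of
    items is never materialised in full.
    """
    addrs = iter(addrs)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while True:
            batch = list(islice(addrs, batch_size))
            if not batch:
                break
            # pool.map keeps the input order within each batch.
            for result in pool.map(lambda a: lookup(a, resolver), batch):
                if result is not None:
                    yield result
```

It could be fed directly from a generator of addresses, for example `for addr, host in resolve_all(ip_generator): print(addr, host)`, where `ip_generator` is any iterable of dotted-quad strings. The total run time is still bounded by how fast the DNS resolver answers and how long failed lookups take to time out, so the number of combinations remains the main cost.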