
Parsing IP Addresses from a .TXT file in Python

I am trying to extract all the IP addresses from a .txt file and write them to a new text file, one per line, using Python.

The text looks like this:

  1    <1 ms    <1 ms    <1 ms  192.168.0.1  
  2    47 ms    14 ms    27 ms  cpe-67-254-0-1.nycap.res.rr.com [67.254.0.1]  
  3    30 ms    29 ms    21 ms  g3-27.glvlny09-rtr001.albany.rr.com  [24.29.45.249]  
  4    26 ms    12 ms    11 ms  24.58.33.254  
  5    19 ms    19 ms    19 ms  be26.rochnyei01r.northeast.rr.com [24.58.32.52]  
  6    33 ms    35 ms    35 ms  bu-ether45.chcgildt87w-bcr00.tbone.rr.com [107.14.19.106]  
  7    31 ms    30 ms    31 ms  0.ae1.pr1.chi10.tbone.rr.com [107.14.17.194]  
  8    32 ms    30 ms    44 ms  216.1.94.65  
  9    39 ms    38 ms    40 ms  207.88.13.128.ptr.us.xo.net [207.88.13.128]  
 10    38 ms    43 ms    37 ms  207.88.12.167.ptr.us.xo.net [207.88.12.167]  
 11    36 ms    39 ms    37 ms  207.88.14.181.ptr.us.xo.net [207.88.14.181]  
 12   176 ms   153 ms   147 ms  209.48.42.54  
 13    43 ms    43 ms    43 ms  216.239.46.248  
 14    43 ms    44 ms    46 ms  72.14.236.98  
 15    51 ms    48 ms    50 ms  72.14.232.73  
 16    60 ms    60 ms    58 ms  216.239.47.39  
 17    67 ms    75 ms    69 ms  216.239.59.82  
 18   100 ms    97 ms    98 ms  216.239.41.138  
 19    99 ms    98 ms   100 ms  64.233.174.191  
 20   101 ms   100 ms    99 ms  209.85.241.73  
 21   100 ms    98 ms    99 ms  lax17s04-in-f14.1e100.net [216.58.219.46] 

I was told to use slicing, but I've written the code about 4 times and nothing seems to work. There are similar questions on here, but I can't get their answers to work for me.

Any sense of direction would help!

This is what I have so far:

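import re

file = open('ipaddr.txt', 'r')
ips = []
for text in file.readlines():
    text = text.rstrip()
    # collect every dotted-quad pattern found on this line
    found = re.findall(r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})', text)
    if found:  # was "if regex:", but no variable named regex exists
        ips.extend(found)
file.close()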

In your case the IP address always seems to be in the last column, so this should suffice:

with open('file.txt', 'r') as f:
    for line in f:
        cols = line.split() # split line at whitespace
        ip = cols[-1] # get last column
        ip = ip.strip('[]') # remove brackets
        print(ip) # print the IP address
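
The question also asks for the addresses to end up in a new text file, one per line. A minimal sketch of that step, building on the last-column idea above, might look like this. The names `last_column_ip` and `write_ips` are made up for illustration, and the blank-line check is an added assumption about the input:

```python
def last_column_ip(line):
    """Return the last whitespace-separated column without brackets.

    Returns None for a blank line (hypothetical helper, not from the answer).
    """
    cols = line.split()
    if not cols:
        return None
    return cols[-1].strip('[]')


def write_ips(in_path, out_path):
    """Write the last-column value of every non-blank line to out_path."""
    count = 0
    with open(in_path, 'r') as src, open(out_path, 'w') as dst:
        for line in src:
            ip = last_column_ip(line)
            if ip is not None:
                dst.write(ip + '\n')  # one address per line
                count += 1
    return count
```

Calling `write_ips('ipaddr.txt', 'ips_only.txt')` would then produce the new file; both file names here are placeholders.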

If you really need to use slicing, you can use last_part = line[32:] to get the part containing the domain names and IP addresses. Then check whether this part contains a bracket (or a space). If it does not, print last_part; otherwise use .split() and .strip(), or .index() and slicing, to get the part between the brackets and print it.
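
The slicing steps described above can be sketched roughly as follows. The function name `ip_by_slicing` is hypothetical. The sketch assumes every line keeps its original leading whitespace, so that the host column really starts at index 32, as it does in the sample output:

```python
def ip_by_slicing(line, start=32):
    """Extract the address from one traceroute line using slicing and str.index."""
    # Everything from the fixed column onward: a bare address,
    # or a hostname followed by "[address]".
    last_part = line[start:].strip()
    if '[' in last_part:
        # Keep only what sits between the brackets.
        open_pos = last_part.index('[')
        close_pos = last_part.index(']', open_pos)
        return last_part[open_pos + 1:close_pos]
    # No brackets: the remainder is already the address.
    return last_part
```

Note that `str.index` raises `ValueError` if the closing bracket is missing, and the fixed offset breaks as soon as the column widths change, which is why the split-based version is usually the more robust choice.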

Use a regular expression:

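import re

# compile once, outside the loop; the original pattern also required a
# ":port" suffix, which the traceroute lines in the question do not have
pattern = re.compile(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')
ip_list = []
with open('ipaddr.txt', 'r') as f:  # file name taken from the question
    for line in f:
        match = pattern.search(line)
        if match:  # skip lines without an address
            ip_list.append(match[0])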
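
One caveat with a `\d{1,3}` pattern like the one above is that it also accepts impossible values such as 999.1.1.1. A possible refinement, sketched here as an assumption rather than as part of the original answer, is to check each candidate with the standard library's `ipaddress` module. The names `IPV4_CANDIDATE` and `find_ips` are made up for illustration:

```python
import ipaddress
import re

# Loose pattern: four dot-separated groups of 1 to 3 digits.
IPV4_CANDIDATE = re.compile(r'\d{1,3}(?:\.\d{1,3}){3}')


def find_ips(text):
    """Return every valid dotted-quad IPv4 address found in text, in order."""
    valid = []
    for candidate in IPV4_CANDIDATE.findall(text):
        try:
            ipaddress.IPv4Address(candidate)
        except ValueError:
            continue  # rejected by ipaddress, e.g. an octet above 255
        valid.append(candidate)
    return valid
```

Hostnames such as 207.88.13.128.ptr.us.xo.net embed the address too, so a line like that yields the same address twice. Taking the last element of the returned list picks the bracketed one.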
