
(Python) How to convert my code from 2.7 to Python 3

I'm trying to build a basic website crawler in Python. However, the code that I've gathered from this website is for Python 2.7. I'm wondering how I can write it for Python 3 or greater. I've begun trying to convert it, but I keep running into errors.

import re
import urllib

textfile = open('depth_1.txt', 'wt')
print("Enter the URL you wish to crawl..")
print('Usage  - "http://phocks.org/stumble/creepy/" <-- With the double quotes')
myurl = input("@> ")
for i in re.findall('''href=["'](.[^"']+)["']''', urllib.urlopen(myurl).read(), re.I):
    print(i)
    for ee in re.findall('''href=["'](.[^"']+)["']''', urllib.urlopen(i).read(), re.I):
        print(ee)
        textfile.write(ee+'\n')
textfile.close()

Prepare your Python 2 code

Say, 2.py:

import re
import urllib

textfile = open('depth_1.txt', 'wt')
print("Enter the URL you wish to crawl..")
print('Usage  - "http://phocks.org/stumble/creepy/" <-- With the double quotes')
myurl = input("@> ")
for i in re.findall('''href=["'](.[^"']+)["']''', urllib.urlopen(myurl).read(), re.I):
    print(i)
    for ee in re.findall('''href=["'](.[^"']+)["']''', urllib.urlopen(i).read(), re.I):
        print(ee)
        textfile.write(ee+'\n')
textfile.close()

Convert it with 2to3:

2to3 -w 2.py

Now list the directory with dir or ls:

> dir
2016-09-24  01:53               533 2.py
2016-09-24  01:51               475 2.py.bak

2.py.bak is your original code and 2.py is the Python 3 code.

See what changes have been made:

import re
import urllib.request, urllib.parse, urllib.error

textfile = open('depth_1.txt', 'wt')
print("Enter the URL you wish to crawl..")
print('Usage  - "http://phocks.org/stumble/creepy/" <-- With the double quotes')
myurl = eval(input("@> "))
for i in re.findall('''href=["'](.[^"']+)["']''', urllib.request.urlopen(myurl).read(), re.I):
    print(i)
    for ee in re.findall('''href=["'](.[^"']+)["']''', urllib.request.urlopen(i).read(), re.I):
        print(ee)
        textfile.write(ee+'\n')
textfile.close()
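
To see precisely which lines 2to3 touched, the backup can be compared with the rewritten file. This is a minimal sketch using the standard difflib module; the diff_sources helper name is hypothetical, and the default file names are the ones from the directory listing above.

```python
import difflib


def diff_sources(old_text, new_text, old_name="2.py.bak", new_name="2.py"):
    """Return a unified diff between two source strings as a list of lines."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_name,
        tofile=new_name,
        lineterm="",  # lines were split without newlines, so add none back
    ))


# Typical use, assuming both files are in the current directory:
#     with open("2.py.bak") as old, open("2.py") as new:
#         print("\n".join(diff_sources(old.read(), new.read())))
```

Running 2to3 without -w prints a comparable diff instead of modifying the file.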

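2to3 works well when your code uses only built-ins and standard-library modules, as yours does. Still review the output by hand: 2to3 cannot know that urlopen().read() returns bytes in Python 3, and it conservatively turns input() into eval(input()).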
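
As a sketch of how the converted script might be finished by hand, the version below decodes the downloaded bytes before running the regex (a str pattern cannot search bytes) and reads the URL with plain input(), so it is typed without quotes. The helper names extract_links, fetch, and crawl are illustrative and not from the original code, and the UTF-8 fallback is an assumption for pages that do not declare a charset.

```python
import re
import urllib.request

# Same pattern as the original crawler: capture the value of each href attribute.
HREF_RE = re.compile(r'''href=["'](.[^"']+)["']''', re.I)


def extract_links(html):
    """Return every href value found in a str of HTML."""
    return HREF_RE.findall(html)


def fetch(url):
    """Download a page and decode it to str."""
    with urllib.request.urlopen(url) as response:
        # Assumption: fall back to UTF-8 when the server names no charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def crawl(start_url, out_path="depth_1.txt"):
    """Print the links on start_url, then print and save the links one level deeper."""
    with open(out_path, "wt") as textfile:
        for link in extract_links(fetch(start_url)):
            print(link)
            for inner in extract_links(fetch(link)):
                print(inner)
                textfile.write(inner + "\n")


# To run interactively:
#     crawl(input("@> "))
```

Like the original, this follows every href as-is, so relative links or mailto: links will make urlopen raise an error; resolving them with urllib.parse.urljoin would be one way to handle that.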
