
Looking to build a simple find and replace program for file paths using Python

I think I should start this off by saying that I'm very new to Python. I've gone through some lessons and picked up a fairly solid understanding of the fundamentals, but my skill set is still very limited and the results I achieve on my own will probably be very crude. I'm currently trying to put together a find and replace program that can output edited paths. See below:

filepaths = [
"\wwwtest\test\site\services\cogs\index.html",
"\wwwtest\test\site\dba\index.html"
]

find = "\wwwtest\test\site"

replace_one = "http:\\www.test.com"

replace_two = "https:\\www.test.com"

for paths in filepaths:
    print(paths.replace(find, replace_one))
    print(paths.replace(find, replace_two))

So in essence, what I'm trying to do is input file paths and output URLs. Running the above code in a shell yields the following results:

http:\www.test.com\services\cogs\index.html
https:\www.test.com\services\cogs\index.html
http:\www.test.com\dba\index.html
https:\www.test.com\dba\index.html

I think I'm more or less on the right track, but obviously there are some issues with the output, namely the removal of the second \ preceding the www of each URL. I imagine that this is the result of Python interpreting the double \\ as some kind of special character, but I've been unable to determine how best to resolve this issue.

Following the successful replacement of the first block of the path/URL, I'll also need to replace each \ with / (or maybe this should come first?), which is creating its own set of issues. The following code is what I've tried so far:

urls = [
    "http:\www.test.com\services\cogs\index.html",
    "https:\www.test.com\services\cogs\index.html"
    ]

find = '\'

replace = '/'

for paths in urls:
    print(paths.replace(find, replace))

This code yields the following:

SyntaxError: EOL while scanning string literal

I believe I'm missing some information regarding the use of slashes in Python. Any advice on how best to develop such a program, and also how to make the code efficient, would be very much appreciated. I also realize that hard-coding the paths I want to manipulate into the program probably isn't the most effective way of doing this, but I may leave that for a separate question to keep this post from getting longer than it already is (although any advice on how to properly read from and write to text files for this example would also be appreciated).

Thank you, D

In Python, \ is the escape character, so if you want to represent a literal \ you have to double it, i.e. "C:\\Windows\\path", or make the whole thing a raw string literal by putting an r before the opening quote: r"C:\Windows\path"
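As a rough illustration of the difference (the path C:\temp\new is made up for this sketch and is not from the question), the snippet below compares an escaped string, a raw string, and an unescaped one in which \t and \n are silently turned into a tab and a newline:

```python
# Three spellings of what is meant to be the same path (hypothetical example path).
escaped = "C:\\temp\\new"    # each \\ in the source is one backslash in the string
raw = r"C:\temp\new"         # raw string: backslashes are kept literally
unescaped = "C:\temp\new"    # \t becomes a tab and \n a newline, so the path is mangled

print(escaped == raw)        # True
print(len(raw))              # 11
print(len(unescaped))        # 9
print("\t" in unescaped)     # True: the tab character sneaked in
```

This is also why the paths in the question only appear to work: "\test" there contains a tab rather than a backslash followed by a t.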

Here are some good examples to show you how it operates.

The \ is the beginning of an escape sequence.

In the first part, if you wish to output two backslashes (\\), you will need to write it like this:

replace_one = "http:\\\\www.test.com"

replace_two = "https:\\\\www.test.com"

If you wish to find a \, you will need to escape the escape character itself, like this:

find = '\\'
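To spell out why the original attempt raised a SyntaxError: in '\' the backslash escapes the closing quote, so the string never ends. A raw string would not help for this particular case, because a raw string literal cannot end in a single backslash either. A small sketch (the variable names are my own) showing that '\\' really is one character and that it works as a search string:

```python
# "\\" in source code is a string containing exactly one backslash.
backslash = "\\"
print(len(backslash))    # 1

# Raw string so the two backslashes after "http:" are kept as typed.
url = r"http:\\www.test.com\services\cogs\index.html"

# Every single backslash is swapped for a forward slash.
print(url.replace(backslash, "/"))    # http://www.test.com/services/cogs/index.html
```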

Every time you want to use \, use \\ instead. Consequently, if you want \\, use \\\\:

filepaths = [
"\\wwwtest\\test\\site\\services\\cogs\\index.html",
"\\wwwtest\\test\\site\\dba\\index.html"
]

find = "\\wwwtest\\test\\site"

replace_one = "http:\\\\www.test.com"

replace_two = "https:\\\\www.test.com"

for paths in filepaths:
    print(paths.replace(find, replace_one))
    print(paths.replace(find, replace_two))

and

urls = [
    "http:\\\\www.test.com\\services\\cogs\\index.html",
    "https:\\\\www.test.com\\services\\cogs\\index.html"
    ]

find = '\\'

replace = '/'

for paths in urls:
    print(paths.replace(find, replace))

should fix your problem.
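If the end goal is a proper URL, the two steps can also be combined so that the intermediate http:\\ form is never needed. The following is only one possible sketch, not the approach from the answers above: path_to_url is a hypothetical helper name, and it assumes every input path starts with the root being replaced. Raw strings are used for the paths to avoid the escaping problem altogether.

```python
def path_to_url(path, root, base_url):
    """Swap the leading root for base_url and turn backslashes into forward slashes."""
    if not path.startswith(root):
        raise ValueError(f"{path!r} does not start with {root!r}")
    remainder = path[len(root):]            # e.g. \services\cogs\index.html
    return base_url + remainder.replace("\\", "/")


# Raw strings keep every backslash exactly as typed.
filepaths = [
    r"\wwwtest\test\site\services\cogs\index.html",
    r"\wwwtest\test\site\dba\index.html",
]
root = r"\wwwtest\test\site"

for path in filepaths:
    for base_url in ("http://www.test.com", "https://www.test.com"):
        print(path_to_url(path, root, base_url))
```

For the two paths from the question this prints the http and https variants of http://www.test.com/services/cogs/index.html and http://www.test.com/dba/index.html. Slicing off the root instead of calling replace avoids accidentally rewriting the same text if it happens to occur again later in the path.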
