
Python: replacing multiple words in a text file from a dictionary

I am having trouble figuring out where I'm going wrong. I need to randomly replace words and rewrite them to the text file until it no longer makes sense to anyone else. I chose some words just to test it and have written the following code, which is not currently working:

# A program to read a file and replace words until it is no longer understandable

word_replacement = {'Python':'Silly Snake', 'programming':'snake charming', 'system':'table', 'systems':'tables', 'language':'spell', 'languages':'spells', 'code':'snake', 'interpreter':'charmer'}

main = open("INF108.txt", 'r+')

words = main.read().split()

main.close()

for x in word_replacement:    
    for y in words:
        if word_replacement[x][0]==y:
            y==x[1]

text = " ".join(words)

print text

new_main = open("INF108.txt", 'w')
new_main.write(text)
new_main.close()

This is the text in the file:

Python is a widely used general-purpose, high-level programming language. It's design philosophy emphasizes code readability, and its syntax allows programmers to express concepts in fewer lines of code than would be possible in languages such as C++ or Java. The language provides constructs intended to enable clear programs on both a small and large scale.Python supports multiple programming paradigms, including object-oriented, imperative and functional programming or procedural styles. It features a dynamic type system and automatic memory management and has a large and comprehensive standard library.Python interpreters are available for installation on many operating systems, allowing Python code execution on a wide variety of systems. Using third- party tools, such as Py2exe or Pyinstaller, Python code can be packaged into stand-alone executable programs for some of the most popular operating systems, allowing for the distribution of Python-based software for use on those environments without requiring the installation of a Python interpreter.

I've tried a few approaches, but as someone new to Python it has been a matter of guessing. I've spent the last two days researching it online, but most of the answers I've found are either far too complicated for me to understand or are specific to that person's code and don't help me.

OK, let's take this step by step.

main = open("INF108.txt", 'r+')
words = main.read().split()
main.close()

It's better to use the with statement here. Also, r is the default mode. Thus:

with open("INF108.txt") as main:
    words = main.read().split()

Using with makes main.close() get called automatically when the block ends; you should do the same for the file write at the end as well.
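As a minimal sketch of that read pattern and the matching write, assuming Python 3 and a throwaway temporary file standing in for INF108.txt (the path and sample text are illustrative, not from the question):

```python
import os
import tempfile

# Hypothetical stand-in for INF108.txt so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")

with open(path, "w") as f:      # the file is closed when the block ends
    f.write("Python is a programming language")

with open(path) as f:           # "r" is the default mode
    words = f.read().split()

print(words)
print(f.closed)                 # True: no explicit close() was needed
```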


Now for the main bit:

for x in word_replacement:    
    for y in words:
        if word_replacement[x][0]==y:
            y==x[1]

This little section has several misconceptions packed into it:

  1. Iterating over a dictionary (for x in word_replacement) gives you its keys only. Thus, when you want to compare later on, you should compare the key itself, x == y; word_replacement[x] is the replacement value, and doing a [0] on that just gives you the first letter of the replacement.
  2. Iterating over the dictionary defeats the purpose of having a dictionary in the first place. Just loop over the words you want to replace, and check whether each one is in the dictionary using y in word_replacement.
  3. y == x[1] is wrong in two ways. First, you probably meant to assign to y there, not compare (i.e. y = x[1], with a single = sign). Second, assigning to a loop variable doesn't do what you want: y just gets overwritten with a new value the next time around the loop, and the words data does NOT get changed at all.
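A small sketch may make the first and third points concrete. It assumes Python 3 and uses a shortened two-entry dictionary purely for illustration:

```python
words = ["Python", "code"]
word_replacement = {"Python": "Silly Snake", "code": "snake"}

# Iterating over a dict yields its keys, not its values.
keys = [x for x in word_replacement]

# Rebinding the loop variable only changes the name y;
# the list element it came from is left untouched.
for y in words:
    if y in word_replacement:
        y = word_replacement[y]

print(keys)
print(words)   # still the original words
```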

What you want to do is create a new list of possibly-replaced words, like so:

replaced = []
for y in words:
    if y in word_replacement:
        replaced.append(word_replacement[y])
    else:
        replaced.append(y)
text = ' '.join(replaced)

Now let's do some refinement. Dictionaries have a handy get method that returns the value if the key is present, or a default if it's not. If we use the word itself as the default, we get a nifty reduction:

replaced = []
for y in words:
    replacement = word_replacement.get(y, y)
    replaced.append(replacement)
text = ' '.join(replaced)

Which you can turn into a one-liner (strictly a generator expression, the lazy cousin of a list comprehension):

text = ' '.join(word_replacement.get(y, y) for y in words)

And now we're done.
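One caveat worth sketching: split() keeps punctuation attached, so a token such as "language." would not match the key "language". The variant below is not part of the approach above; it is a hedged alternative that swaps in re.sub with a callback, assuming Python 3. The helper name replace_words and the sample sentence are made up for illustration.

```python
import re

word_replacement = {"Python": "Silly Snake", "language": "spell", "code": "snake"}

def replace_words(text, mapping):
    """Replace whole words found in mapping; leave everything else as is."""
    # \w+ matches runs of letters, digits and underscores, so "language."
    # is seen as the word "language" followed by an untouched ".".
    return re.sub(r"\w+", lambda m: mapping.get(m.group(0), m.group(0)), text)

sample = "Python is a language. Python code is fun."
split_version = " ".join(word_replacement.get(w, w) for w in sample.split())
regex_version = replace_words(sample, word_replacement)

print(split_version)   # "language." is left alone
print(regex_version)   # "language" is replaced, the period survives
```

Note that the regex version also preserves the original whitespace and would replace the "Python" in "Python-based", which may or may not be what you want.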

It looks like you want something like this as your if statement in the nested loops:

if x==y:
    y=word_replacement[x]

When you loop over a dictionary, you get its keys, not key-value pairs:

>>> mydict={'Python':'Silly Snake', 'programming':'snake charming', 'system':'table'}
>>> for i in mydict:
...    print i
Python
programming
system

You can then get the value with mydict[i].
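If you do want both the key and the value in one go, the dictionary's items() method provides them as pairs. A minimal sketch, assuming Python 3 (where print is a function):

```python
mydict = {"Python": "Silly Snake", "programming": "snake charming", "system": "table"}

pairs = []
for key, value in mydict.items():   # items() yields (key, value) tuples
    pairs.append((key, value))
    print(key, "->", value)
```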

This doesn't quite work, though, because assigning to y doesn't change that element of words. You can loop over its indices instead of its elements so that you can assign to the current element:

for x in word_replacement:    
    for y in range(len(words)):
        if x==words[y]:
            words[y]=word_replacement[x]

I'm using range() and len() here to get a list of indices of words ([0, 1, 2, ...]).
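A common alternative to range(len(...)) is enumerate(), which hands you the index and the element together. The sketch below assumes Python 3 and a short sample sentence chosen for illustration. It also loops over the words only and looks each one up in the dictionary:

```python
word_replacement = {"Python": "Silly Snake", "code": "snake"}
words = "Python code is Python code".split()

for i, word in enumerate(words):
    if word in word_replacement:
        words[i] = word_replacement[word]   # assign through the index

print(" ".join(words))
```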

Your issue is probably here:

if word_replacement[x][0]==y:

Here's a small example of what is actually happening, which is probably not what you intended:

w = {"Hello": "World", "Python": "Awesome"}
print w["Hello"]
print w["Hello"][0]

Which should result in:

World
W

You should be able to figure out how to correct the code from here.

You used word_replacement (which is a dictionary) in the wrong way. You should change the for loop to something like this:

for y in words:
    if y in word_replacement:
        words[words.index(y)] = word_replacement[y]
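One thing to watch with words.index(y): it always returns the position of the first match, so the loop relies on earlier occurrences having already been replaced. That holds for the dictionary in the question, but it can go wrong if a replacement value is itself a key. The sketch below uses a made-up two-entry mapping to show the effect, assuming Python 3:

```python
mapping = {"a": "b", "b": "c"}   # hypothetical: the value "b" is also a key

words = ["a", "b"]
for y in words:
    if y in mapping:
        # index() finds the FIRST "b", which is the freshly replaced "a"
        words[words.index(y)] = mapping[y]

# Building a new list looks up each original word exactly once.
safe = [mapping.get(w, w) for w in ["a", "b"]]

print(words)   # the first word was replaced twice, the second never
print(safe)
```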

Note: technical posts on this site are published under the CC BY-SA 4.0 license; if you repost, please credit this site's URL or the original address.