
Python generator objects and .join

Just a fundamental question regarding Python and the .join() method:

import difflib

# f1 and f2 hold the paths of the two files to compare
file1 = open(f1, "r")
file2 = open(f2, "r")
file3 = open("results", "w")

diff = difflib.Differ()
result = diff.compare(file1.read(), file2.read())
file3.write("".join(result))

The snippet above produces nice output, stored as a string in a file called "results", showing the differences between the two files line by line. However, I notice that if I just print result without using .join(), the interpreter prints a message that includes a memory address. After trying to write result to the file without using .join(), I was informed by the interpreter that only strings and character buffers may be used in the .join() method, and not generator objects. So, based on all the evidence I have adduced, please correct me if I am wrong:

  1. result = diff.compare(file1.read(),file2.read()) <---- result is a generator object?

  2. result is a list of strings, with result itself being the reference to the first string?

  3. .join() takes a memory address and points to the first string, and then iterates over the rest of the addresses of strings in that structure?

  4. A generator object is an object that returns a pointer?

I apologize if my questions are unclear, but I basically wanted to ask the Python veterans whether my deductions are correct. My question is less about the observable results and more about the inner workings of Python. I appreciate all of your help.

join is a method of strings. It takes any iterable, iterates over it, and joins the contents together. (The contents have to be strings, or it will raise a TypeError.)
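As a small illustration of that point, the sketch below (Python 3; the variable names are made up for this example) passes a list, a tuple, and a generator expression to join, and then shows the exception raised for non-string items.

```python
# str.join accepts any iterable whose items are all strings.
words = ["a", "b", "c"]

from_list = "-".join(words)                       # a list
from_tuple = "-".join(("a", "b", "c"))            # a tuple
from_genexp = "-".join(w.upper() for w in words)  # a generator expression

print(from_list)    # a-b-c
print(from_tuple)   # a-b-c
print(from_genexp)  # A-B-C

# Items that are not strings are rejected with a TypeError.
try:
    "-".join([1, 2, 3])
except TypeError as exc:
    join_error = exc
    print("TypeError:", exc)

# Converting each item to str first avoids the error.
print("-".join(str(n) for n in [1, 2, 3]))  # 1-2-3
```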

If you attempt to write the generator object directly to the file, write receives the generator object itself, not its contents. join "unrolls" the contents of the generator.
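A rough way to see this in Python 3 is sketched below. The two line lists and the in-memory io.StringIO buffer are stand-ins invented for this example (they replace the asker's files and the "results" file); the inputs are lists of lines, which is the form difflib's documentation describes for Differ.compare.

```python
import difflib
import io

# Made-up sample data standing in for the lines of the two files.
old_lines = ["one\n", "two\n", "three\n"]
new_lines = ["one\n", "2\n", "three\n"]

result = difflib.Differ().compare(old_lines, new_lines)
print(type(result).__name__)  # generator

out = io.StringIO()  # in-memory stand-in for the "results" file

# write() wants a string, so handing it the generator fails.
try:
    out.write(result)
except TypeError as exc:
    write_error = exc
    print("write() refused the generator:", exc)

# join iterates over the generator and builds one string to write.
out.write("".join(result))
report = out.getvalue()
print(report)
```

The failed write does not consume anything from the generator, which is why the later join still sees every line of the diff.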

You can see what is going on with a simple, explicit generator:

def gen():
    yield 'A'
    yield 'B'
    yield 'C'

>>> g = gen()
>>> print(g)
<generator object gen at 0x0000000004BB9090>
>>> print(''.join(g))
ABC

The generator doles out its contents one at a time. If you look at the generator itself, it doesn't dole anything out, and you just see it as a "generator object". To get at its contents, you need to iterate over it. You can do this with a for loop, with the next function, or with any of the various other functions and methods that iterate over things (str.join among them).
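The sketch below (Python 3) walks the same kind of three-item generator with next and a for loop, and then shows that nothing is left once it has been consumed. The variable names are only for illustration.

```python
def gen():
    yield "A"
    yield "B"
    yield "C"

g = gen()

# next() pulls out exactly one item.
first = next(g)

# A for loop keeps asking for items until the generator is finished.
rest = []
for item in g:
    rest.append(item)

print(first, rest)  # A ['B', 'C']

# The generator is now exhausted: there is nothing left to hand out.
leftover = next(g, None)   # the second argument is a default for "no more items"
joined_again = "".join(g)  # joining an exhausted generator gives an empty string
print(leftover, repr(joined_again))  # None ''
```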

When you say that result "is a list of strings", you are getting close to the idea. A generator (or, more generally, an iterator) is sort of like a "potential list". Instead of actually being a list of all its contents at once, it lets you peel off the items one at a time.
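One way to make the "potential list" idea concrete is the sketch below (Python 3). The countdown and naturals generators and the produced log are hypothetical helpers written for this example; they show that no item exists until something asks for it, and that list() is what turns the potential list into a real one.

```python
import itertools

produced = []  # records each value at the moment the generator creates it

def countdown(n):
    while n > 0:
        produced.append(n)
        yield n
        n -= 1

lazy = countdown(3)
print(produced)           # [] : creating the generator runs none of its body

print(next(lazy))         # 3
print(produced)           # [3] : only the requested item has been produced

as_list = list(lazy)      # list() pulls out everything that remains
print(as_list, produced)  # [2, 1] [3, 2, 1]

# Because items are made on demand, a generator can even be endless.
def naturals():
    n = 0
    while True:
        yield n
        n += 1

first_five = list(itertools.islice(naturals(), 5))
print(first_five)         # [0, 1, 2, 3, 4]
```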

None of these objects is a "memory address". The string representation of a generator object (like that of many other objects) includes a memory address, so if you print it (as above) or write it to a file, you'll see that address. But that doesn't mean the object "is" that memory address, and the address itself isn't really usable as such. It's just a handy identifying tag, so that if you have multiple objects you can tell them apart.
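A short sketch of the "identifying tag" idea, assuming CPython on Python 3 (the exact text of the representation, and the fact that id() reports an address, are CPython implementation details rather than language guarantees):

```python
def gen():
    yield "A"

g1 = gen()
g2 = gen()

# Two separate generator objects made from the same function.
print(repr(g1))  # something like <generator object gen at 0x...>
print(repr(g2))  # same shape, different number
print(g1 is g2)  # False

# In CPython, id() reports the number that the repr shows in hex.
print(hex(id(g1)))

# The tag says nothing about the contents; iterating is what yields them.
contents = "".join(g1)
print(contents)  # A
```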
