Reading two strings from file

I'm writing a program in Python and I want to compare two strings that are stored in a text file, separated by a newline character. How can I read the file and assign each string to its own variable, i.e. string1 and string2?

Right now I'm using:

file = open("text.txt").read();

but this gives me extra content and not just the strings. I'm not sure what it is returning, but the text file contains only two strings. I also tried other methods such as .read().splitlines(), but that did not yield the result I'm looking for. I'm new to Python, so any help would be appreciated!

This reads only the first two lines, strips the newline character from the end of each, and stores them in two separate variables. It does not read the entire file just to get the first two strings in it.

with open('text.txt') as f:
    word1 = f.readline().strip()
    word2 = f.readline().strip()

print(word1, word2)

# now you can compare word1 and word2 if you like

text.txt:

foo
bar
asdijaiojsd
asdiaooiasd

Output:

foo bar

EDIT: to make it work with any number of newlines or other whitespace:

with open('text.txt') as f:
    # sequence of all words in all lines
    words = (word for line in f for word in line.split())
    # consume the first 2 items from the words sequence
    word1 = next(words)
    word2 = next(words)

I've verified that this works reliably with various "non-clean" contents of text.txt.

Note: I'm using a generator expression, which behaves like a lazy list, to avoid reading more data than needed. Generator expressions are otherwise equivalent to list comprehensions, except that they produce the items of the sequence lazily, i.e. only as many as are asked for.
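As a rough illustration of that laziness, here is a minimal Python 3 sketch. The helper name first_words and the use of io.StringIO as a stand-in for an open text file are assumptions made for the example, not part of the answer above; itertools.islice simply replaces the two explicit next() calls.

```python
import io
from itertools import islice


def first_words(lines, count=2):
    """Return up to `count` whitespace-separated words, pulling lines lazily.

    `first_words` is a hypothetical helper written for this illustration.
    """
    # Same generator expression as in the answer: one word at a time.
    words = (word for line in lines for word in line.split())
    # islice stops after `count` items and does not request any more.
    return list(islice(words, count))


# io.StringIO stands in for an open text file so the sketch runs on its own.
sample = io.StringIO("\n  foo \n\n\tbar\nasdijaiojsd\n")
found = first_words(sample)
remaining = sample.read()  # whatever the generator never asked for

print(found)            # ['foo', 'bar']
print(repr(remaining))  # 'asdijaiojsd\n'
```

Unlike calling next() twice, this version returns a shorter list instead of raising StopIteration when the input holds fewer than two words.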

with open('text.txt') as f:
    lines = [line.strip() for line in f]
    print(lines[0] == lines[1])

I'm not sure what it is returning but this text file just contains two strings.

Your problem is likely related to whitespace characters (the most common being carriage return, linefeed/newline, space and tab). So if you tried to compare your string1 to 'expectedvalue' and it failed, it's likely because of the newline itself.

Try this: print the length of each string, then print each of the actual bytes in each string to see why the comparison fails.

For example:

>>> print(len(string1), len(expected))
4 3
>>> for got_character, expected_character in zip(string1, expected):
...     print('got "{}" ({}), but expected "{}" ({})'.format(got_character, ord(got_character), expected_character, ord(expected_character)))
... 
got " " (32), but expected "f" (102)
got "f" (102), but expected "o" (111)
got "o" (111), but expected "o" (111)

If that's your problem, then you should strip off the leading and trailing whitespace and then execute the comparison:

>>> string1 = string1.strip()
>>> string1 == expected
True
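
Another quick way to spot stray whitespace is repr(), which shows characters such as \n and \r in escaped form. The sketch below is only an illustration in Python 3: the describe helper and the sample value of string1 are made up for the example.

```python
def describe(text):
    """Return the length and the escaped (repr) form of `text`.

    `describe` is a hypothetical helper, not a standard library function.
    """
    return "len={} repr={!r}".format(len(text), text)


string1 = " foo\n"  # assumed value, as it might come back from the file
expected = "foo"

print(describe(string1))            # len=5 repr=' foo\n'
print(describe(expected))           # len=3 repr='foo'
print(string1.strip() == expected)  # True
```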

If you're on a Unix-like system, you'll probably have an xxd or od binary available to dump a more detailed representation of the file. If you're using Windows, you can download one of many "hex editor" programs to do the same.
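
If none of those tools is at hand, a similar dump can be approximated in Python itself by opening the file in binary mode. This is a minimal Python 3 sketch, not a replacement for xxd: the hex_dump function, its output layout, and the sample file contents are all assumptions for the example, and it reads the whole file at once, so it only suits small files.

```python
import os
import tempfile


def hex_dump(path, width=16):
    """Return rows of 'offset  hex bytes  printable text' for a small file.

    `hex_dump` is a hypothetical helper; the layout only loosely mimics xxd.
    """
    with open(path, "rb") as f:  # binary mode: no newline translation
        data = f.read()
    rows = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join("{:02x}".format(b) for b in chunk)
        # Show printable ASCII as-is and everything else as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        rows.append("{:08x}  {}  {}".format(
            offset, hex_part.ljust(width * 3 - 1), text_part))
    return rows


# Write a small sample file with a leading space and a Windows line ending.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b" foo\r\nbar\n")
    sample_path = tmp.name

dump = hex_dump(sample_path)
os.remove(sample_path)

for row in dump:
    print(row)
```

In the printed row, 20 is the leading space, 0d 0a is a carriage return followed by a linefeed, and the final 0a is a bare linefeed.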
