What is the most pythonic way of printing only certain lines of a string?
Suppose I have a string (not a file) that spans many lines:
multiline_string = '''I met a traveller from an antique land
Who said: Two vast and trunkless legs of stone
Stand in the desert... near them, on the sand,
Half sunk, a shattered visage lies, whose frown,
And wrinkled lip, and sneer of cold command,
Tell that its sculptor well those passions read
Which yet survive, stamped on these lifeless things,
The hand that mocked them and the heart that fed;
And on the pedestal these words appear:
'My name is Ozymandias, king of kings;
Look on my works, ye Mighty, and despair!'
Nothing beside remains. Round the decay
Of that colossal wreck, boundless and bare
The lone and level sands stretch far away.'''
I want to get only certain lines of the string, as a single string (not as a list of strings). One way of doing it is this:
pedestal_lines = "\n".join(multiline_string.splitlines()[8:11])
print(pedestal_lines)
Output:
And on the pedestal these words appear:
'My name is Ozymandias, king of kings;
Look on my works, ye Mighty, and despair!'
But that way is not very good: it has to split the string into a list of strings, index this list, then join the list back together with the str.join() method. Not to mention, it's ugly-looking and not very readable. Is there a more elegant/pythonic way of achieving this?
If you don't want to split the string, you can do the following:
You'll forgive any off-by-one errors that I may have made in the code below.
Regex:
import re
# [\s\S]* consumes the rest, including the last line, which has no trailing "\n"
print(re.sub(r"^(.*\n){8}((?:.*\n){2}.*)[\s\S]*", r"\2", multiline_string))
(Create a group of 8 lines, then a group of 3 lines, then the rest; replace everything with the second group.)
Position extract + slicing:
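The same idea can be parameterized. Here is a minimal sketch, assuming a hypothetical helper named extract_lines_regex (the name and signature are illustrative, not part of the original answer). It uses re.match with one capturing group instead of re.sub, so the remainder of the string never has to be matched:

```python
import re

def extract_lines_regex(text, start, count):
    """Return `count` lines of `text` starting at 0-based line `start`.

    Hypothetical helper for illustration; returns None if the text has
    fewer than start + count lines.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    # Skip `start` full lines, then capture `count` lines; the last captured
    # line is matched without its newline so the result has no trailing "\n".
    pattern = r"(?:.*\n){%d}((?:.*\n){%d}.*)" % (start, count - 1)
    match = re.match(pattern, text)
    return match.group(1) if match else None

# Stand-in for the poem: 14 numbered lines.
sample = "\n".join(f"line {i}" for i in range(14))
print(extract_lines_regex(sample, 8, 3))
```

Because "." does not match a newline by default, each ".*\n" consumes exactly one line, which is what makes the line counting reliable.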
linefeed_pos = [i for i,c in enumerate(multiline_string) if c=="\n"]
print(multiline_string[linefeed_pos[7]+1:linefeed_pos[10]])
(Extract the positions of the linefeed characters with a list comprehension on the original string, then slice using those line-indexed positions.) The drawback of this approach is that it computes all the indices, not just those up to the upper line bound.
That can be easily fixed by pulling positions from a generator expression inside a list comprehension, so the scan stops as soon as no more indices are needed:
newline_iter = (i for i,c in enumerate(multiline_string) if c=="\n")
linefeed_pos = [next(newline_iter) for _ in range(12)]
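Another way to stop early is to locate newlines with str.find rather than enumerating every character. This is a rough sketch, assuming a hypothetical helper named slice_lines (name and signature are illustrative, not from the original answer) that takes the same 0-based, end-exclusive bounds as a list slice:

```python
def slice_lines(text, start, stop):
    """Return lines start..stop-1 (0-based) of `text` as one string.

    Hypothetical helper: scans only as far as the end of line stop-1
    and never builds a list of lines.
    """
    if not 0 <= start < stop:
        raise ValueError("need 0 <= start < stop")
    begin = 0
    # Walk past `start` newlines to find where line `start` begins.
    for _ in range(start):
        nl = text.find("\n", begin)
        if nl == -1:
            return ""              # the text has no line `start`
        begin = nl + 1
    end = begin
    # Walk past the newlines inside the wanted block to find its end.
    for _ in range(stop - start):
        nl = text.find("\n", end)
        if nl == -1:
            return text[begin:]    # block runs to the end of the text
        end = nl + 1
    return text[begin:end - 1]     # drop the block's final newline

# Stand-in for the poem: 14 numbered lines.
sample = "\n".join(f"line {i}" for i in range(14))
print(slice_lines(sample, 8, 11))
```

For this sample the result matches "\n".join(sample.splitlines()[8:11]), but only the part of the string up to the end of the requested block is scanned.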
Maybe one slicing/extract is better than splitting and joining for performance (I understand that seeing a big list go to waste just to pick 3 lines is unbearable), but I wouldn't call that pythonic.
If you have a lot of lines and performance or memory matters, both methods explained above should be faster than yours. If not, then stick to your solution.
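Whether the difference matters can be measured rather than guessed. Below is a minimal timing sketch using the standard timeit module. The synthetic input, the function names, and the repetition count are arbitrary assumptions, regex_match is a re.match variant of the regex idea rather than the exact code above, and results will vary by machine and Python version:

```python
import re
import timeit

# Synthetic input: 100,000 short lines (an arbitrary size for illustration).
text = "\n".join(f"line {i}" for i in range(100_000))

def split_join(s):
    # The approach from the question: split everything, slice, re-join.
    return "\n".join(s.splitlines()[8:11])

def regex_match(s):
    # Skip 8 lines, capture the next 3, and ignore the rest of the string.
    return re.match(r"(?:.*\n){8}((?:.*\n){2}.*)", s).group(1)

# Both approaches should agree before timing them.
assert split_join(text) == regex_match(text)

for fn in (split_join, regex_match):
    seconds = timeit.timeit(lambda: fn(text), number=20)
    print(f"{fn.__name__}: {seconds:.4f} s for 20 runs")
```

The gap should depend mostly on how far into the string the wanted lines sit, since the split-and-join version always processes the whole string.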