Get the nth line of a string in Python

How can you get the nth line of a string in Python 3? For example:

getline("line1\nline2\nline3",3)

Is there any way to do this using stdlib/builtin functions? I prefer a solution in Python 3, but Python 2 is also fine.

Try the following:

s = "line1\nline2\nline3"
print(s.splitlines()[2])  # index 2 is the third line
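
The one-liner above raises IndexError when the string has fewer lines than requested. A minimal sketch of a 1-based wrapper follows; the name get_line and the default parameter are illustrative assumptions, not part of the original answer.

```python
def get_line(text, n, default=None):
    """Return the n-th line (1-based) of text, or default if out of range."""
    lines = text.splitlines()  # splits on \n, \r\n and other line boundaries
    if 1 <= n <= len(lines):
        return lines[n - 1]
    return default


print(get_line("line1\nline2\nline3", 3))  # line3
print(get_line("line1\nline2\nline3", 7))  # None
```

Note that splitlines() builds a list of every line, so this suits strings that fit comfortably in memory.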

A functional approach:

>>> import io
>>> from itertools import islice
>>> s = "line1\nline2\nline3"
>>> gen = io.StringIO(s)
>>> print(next(islice(gen, 2, 3)))
line3
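
When the requested line does not exist, next() on the exhausted islice raises StopIteration. One possible way to handle that is sketched below; the helper name nth_line and its default parameter are hypothetical, and it assumes n is at least 1.

```python
import io
from itertools import islice


def nth_line(text, n, default=None):
    """Return line n (1-based) of text without its trailing newline."""
    # islice skips the first n - 1 lines lazily and yields at most one line
    for line in islice(io.StringIO(text), n - 1, n):
        return line.rstrip("\n")
    return default  # the string has fewer than n lines


print(nth_line("line1\nline2\nline3", 2))  # line2
print(nth_line("line1\nline2\nline3", 9))  # None
```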

Use a string buffer:

import io

def getLine(data, line_no):
    buffer = io.StringIO(data)
    for i in range(line_no - 1):
        try:
            next(buffer)
        except StopIteration:
            return '' #Reached EOF

    try:
        return next(buffer)
    except StopIteration:
        return '' #Reached EOF
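
One detail worth knowing about the buffer approach: iterating over an io.StringIO yields each line with its trailing newline still attached, so a caller may want to strip it. A small illustration, using the sample string from the question:

```python
import io

buffer = io.StringIO("line1\nline2\nline3")
lines = list(buffer)          # iterating a StringIO yields one line at a time
print(lines)                  # ['line1\n', 'line2\n', 'line3']
print(lines[1].rstrip("\n"))  # line2
```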

From the comments it seems that this string is very large. If there is too much data to fit comfortably in memory, one approach is to process the data from the file line by line:

N = ...
with open('data.txt') as inf:
    for count, line in enumerate(inf, 1):
        if count == N:  # search for the N'th line
            print(line)
            break  # no need to read the rest of the file

enumerate() gives you both the index and the value of each item you iterate over. You can specify a starting value, so I used 1 instead of the default 0.

The advantage of using with is that it automatically closes the file when you are done or if an exception is raised.
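
The same enumerate() and with pattern can be wrapped in a function that returns the line instead of printing it. This is only a sketch: the function name nth_line_of_file and the temporary data.txt file are assumptions made so that the example runs on its own.

```python
import os
import tempfile


def nth_line_of_file(path, n):
    """Return line n (1-based) of the file at path, or None if it is too short."""
    with open(path) as inf:
        for count, line in enumerate(inf, 1):
            if count == n:
                return line.rstrip("\n")  # returning also closes the file
    return None


# Demo on a throwaway file so the sketch is self-contained
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "data.txt")
    with open(path, "w") as out:
        out.write("line1\nline2\nline3\n")
    found = nth_line_of_file(path, 2)
    missing = nth_line_of_file(path, 10)

print(found)    # line2
print(missing)  # None
```

Only the lines up to the requested one are read, so memory use stays small even for a very large file.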

A more efficient solution than splitting the string is to iterate over its characters and find the positions of the (N - 1)th and Nth occurrences of '\n' (taking into account the edge case at the start of the string). The Nth line is the substring between those positions.

Here's a messy piece of code to demonstrate it (line numbers are 1-indexed):

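def getLine(data, line_no):
    n = 0          # number of newlines seen so far
    lastPos = -1   # index of the most recent newline
    for i in range(len(data)):  # scan every character, including the last
        if data[i] == "\n":
            n = n + 1
            if n == line_no:
                return data[lastPos + 1:i]
            else:
                lastPos = i

    if n == line_no - 1:
        return data[lastPos + 1:]  # last line without a trailing newline
    return ""  # end of string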

This is also more efficient than the solution which builds up the string one character at a time.
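
The efficiency claims in this thread are not backed by measurements, so it may be worth timing the approaches on your own data. The harness below is a rough sketch: by_split and by_scan are hypothetical stand-ins for the splitting and character-scanning approaches, and the generated test string is an assumption. No particular result is implied.

```python
import timeit

# 20,000 generated lines: "line0", "line1", ...
text = "\n".join("line%d" % i for i in range(20000))


def by_split(data, n):
    """Split the whole string, then index (1-based)."""
    return data.splitlines()[n - 1]


def by_scan(data, n):
    """Scan characters and slice between the surrounding newlines."""
    seen, last = 0, -1
    for i, c in enumerate(data):
        if c == "\n":
            seen += 1
            if seen == n:
                return data[last + 1:i]
            last = i
    return data[last + 1:] if seen == n - 1 else ""


# Both approaches must agree before their timings mean anything
assert by_split(text, 10000) == by_scan(text, 10000)

for fn in (by_split, by_scan):
    seconds = timeit.timeit(lambda: fn(text, 10000), number=10)
    print(fn.__name__, seconds)
```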

Since you brought up memory efficiency, is this any better?

s = "line1\nline2\nline3"

# number of the line you want
line_number = 2

i = 0  # number of newlines seen so far
line = ''
for c in s:
    if i >= line_number:
        break  # past the requested line, stop scanning
    if i == line_number - 1 and c != '\n':
        line += c
    elif c == '\n':
        i += 1

Split into two functions for readability:

string = "foo\nbar\nbaz\nfubar\nsnafu\n"

def iterlines(string):
    word = ""
    for letter in string:
        if letter == '\n':
            yield word
            word = ""
            continue
        word += letter
    if word:
        yield word  # last line when the string lacks a trailing newline

def getline(string, line_number):
    for index, word in enumerate(iterlines(string), 1):
        if index == line_number:
            return word

print(getline(string, 4))

If only the last line is needed: `my_string.strip().split("\n")[-1]`

My solution (efficient and compact):

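def getLine(data, line_no):
    # line_no is 1-indexed, matching the example in the question
    index = -1
    for _ in range(line_no - 1):
        index = data.index('\n', index + 1)  # ValueError if there are too few lines
    end = data.find('\n', index + 1)
    # the last line may not end with a newline
    return data[index + 1:] if end == -1 else data[index + 1:end]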
