How to read a csv file in tail -f manner using python?

I want to read a csv file in a manner similar to tail -f, i.e. the way one follows an error log file.

I can perform this operation on a text file with this code:

 while True:
      where = self.file.tell()
      line = self.file.readline()
      if not line:
        # Nothing new yet: wait, rewind to the saved position, and retry.
        print("No line waiting, waiting for one second")
        time.sleep(1)
        self.file.seek(where)
        continue
      # re.search() returns None (never False) when nothing matches.
      if re.search('[a-zA-Z]', line) is None:
        continue
      response = self.naturalLanguageProcessing(line)
      if response is not None:
        response["id"] = self.id
        self.id += 1
        response["tweet"] = line
        self.saveResults(response)

How do I perform the same task for a csv file? I found a link that shows how to get the last 8 rows, but that is not what I need. The csv file is updated while I am reading it, and I need to get the newly appended rows.

Connecting A File Tailer To A csv.reader

In order to plug your code that looks for content newly appended to a file into a csv.reader, you need to put it into the form of an iterator.

I'm not intending to showcase correct code, but specifically to show how to adapt your existing code into this form, without making assertions about its correctness. In particular, the sleep() would be better replaced with a mechanism such as inotify, which lets the operating system actively notify you when the file has changed; and the seek() and tell() would be better replaced with storing partial lines in memory rather than backing up and rereading them from the beginning over and over.

import csv
import time

class FileTailer(object):
    def __init__(self, file, delay=0.1):
        self.file = file
        self.delay = delay
    def __iter__(self):
        while True:
            where = self.file.tell()
            line = self.file.readline()
            if line and line.endswith('\n'): # only emit full lines
                yield line
            else:                            # for a partial line, pause and back up
                time.sleep(self.delay)       # ...not actually a recommended approach.
                self.file.seek(where)

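# newline='' (Python 3) leaves line endings untouched, as the csv module docs recommend.
csv_reader = csv.reader(FileTailer(open('myfile.csv', newline='')))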
for row in csv_reader:
    print("Read row: %r" % (row,))

If you create an empty myfile.csv, start python csvtailer.py, and then run echo "first,line" >>myfile.csv from a different window, you'll see the output Read row: ['first', 'line'] appear immediately.
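
As noted earlier, the seek() and tell() calls could be replaced by keeping partial lines in memory. The following is only a minimal sketch of that idea under stated assumptions, not a vetted tailer: the names tail_lines and follow_csv and the max_idle_polls parameter are hypothetical, introduced here for illustration, and the code still polls with sleep() rather than using inotify.

```python
import csv
import time


def tail_lines(f, delay=0.1, max_idle_polls=None):
    """Yield complete lines from the text file object f as they are appended.

    A partial line is held in memory until its newline arrives, so no
    seek()/tell() is needed. max_idle_polls is a hypothetical knob that
    stops the loop after that many consecutive empty reads; None means
    follow forever.
    """
    pending = ""  # text read so far that does not yet end in a newline
    idle = 0
    while max_idle_polls is None or idle < max_idle_polls:
        chunk = f.readline()
        if not chunk:  # nothing new yet: wait, then poll again
            idle += 1
            time.sleep(delay)
            continue
        idle = 0
        pending += chunk
        if pending.endswith("\n"):  # the line is now complete
            yield pending
            pending = ""
    # An unterminated final line, if any, is dropped when the loop stops.


def follow_csv(path, **tail_options):
    """Yield parsed rows from the CSV file at path as they are appended."""
    with open(path, newline="") as f:
        for row in csv.reader(tail_lines(f, **tail_options)):
            yield row
```

With this sketch, for row in follow_csv('myfile.csv'): print(row) would start at the beginning of the file and then keep waiting for new rows; passing max_idle_polls makes the loop end once the file has been quiet for that many polls.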


Finding A Correct File Tailer In Python

For a correctly-implemented iterator that waits for new lines to be available, consider referring to the existing StackOverflow questions on the topic.
