
Python: writing huge files dynamically while avoiding 100% CPU usage

I am parsing a huge CSV file (approx. 2 GB) with the help of this great stuff. Now I have to generate a separate file for each column, with the column name used as the file name. So I wrote this code to write the files dynamically:

def write_CSV_dynamically(self, header, reader):
  """
  :header - CSVs first row in string format
  :reader - CSVs all other rows in list format  
  """

  try:
    headerlist =header.split(',') #-- string headers 
    zipof = lambda x, y: zip(x.split(','), y.split(','))
    filename = "{}.csv".format(self.dtstamp)
    filename = "{}_"+filename
    filesdct = {filename.format(k.strip()):open(filename.format(k.strip()), 'a')\
    for k in headerlist}
    for row in reader:
      for key, data in zipof(header, row):
        filesdct[filename.format(key.strip())].write( str(data) +"\n" )
    for _, v in filesdct.iteritems():
      v.close()
  except Exception, e:
    print e

Now it's taking around 50 secs to write these huge files, using 100% CPU. As there are other heavy things running on my server, I want to restrict my program to only 10% to 20% of the CPU while it writes these files. It doesn't matter if it takes 10-15 mins. How can I change my code so that it stays within 10-20% CPU usage?

There are a number of ways to achieve this:

  • Nice the process - plain and simple.

  • cpulimit - just pass your script and the CPU usage limit as parameters:

    cpulimit -P /path/to/your/script -l 20

  • Python's resource module, to set limits from within the script. Bear in mind it works with absolute CPU time (a total number of CPU seconds), not with a percentage.
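
As a rough illustration of the first and third options, here is a minimal sketch in Python 3 for a Unix-like system (the question's code is Python 2, but these calls work the same way there). The helper names lower_priority and cap_cpu_seconds are made up for this example. Note that raising the niceness does not hold the process at 10-20%; it only makes it yield to the other heavy jobs when they want the CPU. The RLIMIT_CPU limit is a total budget of CPU seconds, after which the process receives SIGXCPU.

```python
import os
import resource


def lower_priority(increment=19):
    """Add `increment` to this process's niceness and return the new value.

    A higher niceness means the scheduler favours other processes when the
    CPU is contended. It is not a percentage cap.
    """
    return os.nice(increment)


def cap_cpu_seconds(seconds):
    """Set the soft RLIMIT_CPU to `seconds` of total CPU time.

    The hard limit is left unchanged. Once the soft limit is exceeded the
    process is sent SIGXCPU. Returns the (soft, hard) pair now in effect.
    """
    _, hard = resource.getrlimit(resource.RLIMIT_CPU)
    if hard != resource.RLIM_INFINITY:
        # The soft limit may never exceed the hard limit.
        seconds = min(seconds, hard)
    resource.setrlimit(resource.RLIMIT_CPU, (seconds, hard))
    return resource.getrlimit(resource.RLIMIT_CPU)
```

Calling lower_priority() once at the top of the script, before the files are opened, has the same effect as starting the script under nice. cap_cpu_seconds() is more of a safety net than a throttle, since it stops the job instead of slowing it down.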
