How to speed up iterating through an array in Ruby

Question

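I have multiple CSV files containing product names and prices. A given product may or may not appear in both files. For each product, I have to find the highest and lowest price across these files.

I merge the products from both files into one array: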

require 'csv'

list = []
Dir["./*.csv"].each do |file|
  CSV.foreach(file, headers:true) do |row|
    tmpRow = row.to_s.chomp + "," + file #saving name of the input file
    list.push(tmpRow.chomp.split(","))
  end
end

The array list looks like this:

[["5893105","2.38", "weightOrSomethingIrrelevant", "./FIAT_2.csv"]]

This is the main algorithm:

result = []
tmpPrices = []
while list[0] do
  if list[0] != nil
    tmpPart = list[0][0]
    # scans the whole list once for every distinct product
    tmpParts = list.select{ |part, price| part == tmpPart}
    tmpParts.each do |tp|
      tmpPrices.push(tp[1])
    end
    list[0][2].to_f != 0.0 ? tmpWeight = list[0][2].to_s : tmpWeight = "Undefined"
    # prices are still strings here, so max and min compare them as text
    tmpMaxPrice = tmpParts.select{|part, price| part == tmpPart && price == tmpPrices.max}
    tmpMinPrice = tmpParts.select{|part, price| part == tmpPart && price == tmpPrices.min}
    result.push([tmpPart, tmpWeight, tmpPrices.max, tmpMaxPrice[0].last, tmpPrices.min, tmpMinPrice[0].last])
    tmpPart = ""
    list = list - tmpParts
    tmpParts = []
    tmpPrices = []
    tmpMaxPrice = []
    tmpMinPrice = []
    tmpWeight = ""
  end
end

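The input files are large (over 200,000 lines), so my algorithm has an efficiency problem: it takes about half a second to process a single row.

I would like to know whether there is a better way to write this application.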

Answer 1

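I would break this into a few parts:

1) I suggest having a table that represents the files (file name, location, number of rows, etc.) and links to a products table (the row data from each file).
2) A script/function ingests the files and stores the rows as database records.
3) A script/function analyzes the rows and looks up products by name, using the database to extract the price information with min/max.

This can be improved later to handle inconsistently named products, repeated product occurrences, and so on.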
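
If setting up a database is more than you need, the same "group the rows by product, then take min/max" idea can be sketched in memory with a Hash. This is not the database approach described above, only a stand-in for it: `price_extremes` is a hypothetical helper name, and it assumes each row has the same layout as `list` in the question (`[part, price, weight, file]`). It makes one pass over the rows instead of rescanning the array for every product, and it compares prices as floats rather than strings, so the prices in the result are floats too.

```ruby
# Hypothetical helper (not from the original post): a single pass over rows
# shaped like the question's `list`, i.e. [part, price, weight, file].
def price_extremes(rows)
  stats = {} # part => running min/max for that part

  rows.each do |part, price_text, weight, file|
    price = price_text.to_f # compare numerically, not as text

    # First time we see a part: seed min and max with this row.
    # The weight comes from the first occurrence, as in the question's code.
    entry = stats[part] ||= {
      weight: weight.to_f != 0.0 ? weight.to_s : "Undefined",
      max: price, max_file: file,
      min: price, min_file: file
    }

    if price > entry[:max]
      entry[:max] = price
      entry[:max_file] = file
    end
    if price < entry[:min]
      entry[:min] = price
      entry[:min_file] = file
    end
  end

  # Same column order as the question's `result` rows.
  stats.map do |part, e|
    [part, e[:weight], e[:max], e[:max_file], e[:min], e[:min_file]]
  end
end
```

With the `list` built in the question, `result = price_extremes(list)` would give one row per product in order of first appearance. Because a Hash lookup does not depend on how many rows have been seen, the total work grows roughly linearly with the number of rows.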

Solution 1 | 1 | 2018-02-13 01:26:03