How do I write conditional logic using a MatchData object in Ruby?

I've written a simple program that parses my bank's transaction CSV file. My expression pushes results to an array/hash data structure that will be saved to a database.

There are two parts:

  1. A run method that opens a file, reads each line, and pushes it.
  2. A view that pulls out data from the hash.

I've included my main parse method below. It checks each line for a keyword, and if the match fails, it SHOULD push to an unclassified Hash. However, the conditional pushes either ALL or NO transactions, depending on whether I use elsif or else.

MatchData objects return strings by default, so else should work, shouldn't it? Here's the method that builds the data structure. I've commented the portion I'm having trouble with:

def generateHashDataStructure(fileToParse, wordListToCheckAgainst)
  transactionInfo = Hash.new
  transactionInfo[:transactions] = Hash.new
  transactionInfo[:unclassifiedTransaction] = Hash.new
  transaction = transactionInfo[:transactions]
  unclassifiedTransaction = transactionInfo[:unclassifiedTransaction]

  wordListToCheckAgainst.each do |word|
    transaction[word] = Array.new
    unclassifiedTransaction[:unclassifiedTransaction] = Array.new
    File.open(fileToParse).readlines.each do |line|
       if transaction = /(?<transaction>)#{word}/.match(line)   
        date = /(?<Month>\d{1,2})\D(?<Day>\d{2})\D(?<Year>\d{4})/.match(line).to_s
        transaction = /(?<transaction>)#{word}/.match(line).to_s
        amount =/-+(?<dollars>\d+)\.(?<cents>\d+)/.match(line).to_s
        transactions[word].push({:date => date, 
                                :name => transaction, :amount =>    amount.to_f.round(2)})

        # this is problem: else/elsif don't push only if match fails
        else
         date = /(?<Month>\d{1,2})\D(?<Day>\d{2})\D(?<Year>\d{4})/.match(line).to_s
         transaction = /(?<Middle>)".*"/.match(line).to_s
         amount =/-*(?<dollars>\d+)\.(?<cents>\d+)/.match(line).to_s
         unclassifiedTransaction[:unclassifiedTransaction].push({:date => date, 
                                   :name => transaction, :amount => amount.to_f.round(2)})
         next
        end
     end
     return transactionInfo
   end

Any ideas would be great. I've researched this, and reaching out to the community feels like admitting defeat. I realize regex might not be the best approach, so I'm open to all feedback.

I made your code more idiomatic, which helps reveal some very questionable things.

  1. Ruby methods and variables are written in snake_case, not CamelCase. While this seems like a matter of personal opinion, it is also a matter of maintainability and readability. The _ helps our brains visually separate the word segments in a variable name, rather than seeing a run-together string with mixed-case "humps". Try_reading_a_bunch_of_text_that_is_identical exceptForThatAndSeeWhichIsMoreExhausting.
  2. You're assigning to a variable inside a conditional test:

     if transaction = /(?<transaction>)#{word}/.match(line) 

    Don't do that. Even if it's intentional, it opens up potential for maintenance errors when someone else doesn't understand why you'd do something like that. Instead, write it in two steps so it's obvious what was intended:

     transaction = /(?<transaction>)#{word}/.match(line)
     if transaction

    Or, if your "assignment then compare" was really meant as a comparison, it should be written as:

     if transaction == /(?<transaction>)#{word}/.match(line) 

    Or:

     if /(?<transaction>)#{word}/.match(line) 

    Which is even cleaner, safer, and more obvious.

  3. Rather than use Hash.new and Array.new, use the literals {} and [] respectively. They're less noisy and more commonly seen. Also, rather than incrementally define your hash:

     transactionInfo = Hash.new
     transactionInfo[:transactions] = Hash.new
     transactionInfo[:unclassifiedTransaction] = Hash.new

    Use:

     transaction_info = { :transactions => {}, :unclassified_transaction => {} } 

    Instantly your structure is revealed, making the intention a lot clearer.

  4. File.open(fileToParse).readlines.each do |line| is a convoluted way of doing:

     File.foreach(fileToParse) do |line| 

    Only foreach doesn't waste memory by pulling the entire file in all at once. There's no appreciable speed improvement from "slurping" your file, only downsides if the file grows to "ginormous" proportions.

  5. Instead of using:

     transactions[word].push({:date => date, :name => transaction, :amount => amount.to_f.round(2)}) 

    Write your code more simply. push obscures what you're doing, as does the way you formatted your lines:

     transactions[word] << {
       :date   => date,
       :name   => transaction,
       :amount => amount.to_f.round(2)
     }

    Note the alignment into columns. Some people eschew that particular habit, but when you're dealing with a lot of assignments, seeing the variations in each line can make a big difference.
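As a side note on the conditional itself: Regexp#match returns a MatchData object when the pattern matches and nil when it doesn't, not a string, so its result can be tested directly in an if. A minimal sketch of that behaviour follows. The sample line and keywords are made up for illustration and are not taken from the question's CSV file.

```ruby
# Hypothetical sample data, for illustration only.
line = '01/15/2014,"STARBUCKS #123",-4.75'
word = "STARBUCKS"

# Regexp.escape keeps any special characters in the keyword literal.
hit  = /#{Regexp.escape(word)}/.match(line)       # MatchData on success
miss = /#{Regexp.escape("NETFLIX")}/.match(line)  # nil on failure

hit.class   # MatchData
hit.to_s    # the matched text
miss.nil?   # true
miss.to_s   # nil.to_s is an empty string, which hides the failed match

# Named captures are read from the MatchData by name.
date = /(?<month>\d{1,2})\D(?<day>\d{2})\D(?<year>\d{4})/.match(line)
date[:year]
```

Because a failed match is nil, `if /.../.match(line)` takes the else branch only when the keyword is absent. Calling .to_s before the test would turn nil into an empty string, and an empty string is still truthy in Ruby.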

Here's more idiomatic Ruby code:

def generate_hash_data_structure(file_to_parse, word_list_to_check_against)

  transaction_info = {
    :transactions => {},
    :unclassified_transaction => {}
  }

  transaction = transaction_info[:transactions]
  unclassified_transaction = transaction_info[:unclassified_transaction]

  word_list_to_check_against.each do |word|

    transaction[word] = []
    unclassified_transaction[:unclassified_transaction] = []

    File.foreach(file_to_parse) do |line|

      if transaction = /(?<transaction>)#{word}/.match(line)   

        date        = /(?<Month>\d{1,2})\D(?<Day>\d{2})\D(?<Year>\d{4})/.match(line).to_s
        transaction = /(?<transaction>)#{word}/.match(line).to_s
        amount      = /-+(?<dollars>\d+)\.(?<cents>\d+)/.match(line).to_s

        transactions[word] << {
          :date   => date,
          :name   => transaction,
          :amount => amount.to_f.round(2)
        }

        # this is problem: else/elsif don't push only if match fails

      else

        date        = /(?<Month>\d{1,2})\D(?<Day>\d{2})\D(?<Year>\d{4})/.match(line).to_s
        transaction = /(?<Middle>)".*"/.match(line).to_s
        amount      = /-*(?<dollars>\d+)\.(?<cents>\d+)/.match(line).to_s

        unclassified_transaction[:unclassified_transaction] << {
          :date   => date,
          :name   => transaction,
          :amount => amount.to_f.round(2)
        }

        # next
      end

    end

    transaction_info

  end
end
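
One plausible reading of the all-or-nothing symptom is that the keyword loop is outermost. Every line that fails to match the current keyword falls into the else branch once per keyword, and the unclassified array is reset on each pass. The sketch below inverts the loops so that each line is classified exactly once. The method name, the simplified regexes, and the sample lines are assumptions made for illustration, not part of the original code.

```ruby
# Sketch only: classify each CSV line once, against the first keyword it contains.
# classify_lines and the sample data below are hypothetical.
def classify_lines(lines, words)
  date_re   = /\d{1,2}\D\d{2}\D\d{4}/
  amount_re = /-?\d+\.\d+/   # assumes amounts look like -4.75 or 12.00
  name_re   = /".*"/

  info = {
    :transactions              => Hash.new { |hash, key| hash[key] = [] },
    :unclassified_transactions => []
  }

  lines.each do |line|
    entry = {
      :date   => line[date_re].to_s,
      :amount => line[amount_re].to_f.round(2)
    }

    # find returns the first keyword present in the line, or nil when none is.
    word = words.find { |w| line.include?(w) }

    if word
      info[:transactions][word] << entry.merge(:name => word)
    else
      info[:unclassified_transactions] << entry.merge(:name => line[name_re].to_s)
    end
  end

  info
end

sample = [
  '01/15/2014,"STARBUCKS #123",-4.75',
  '01/16/2014,"CORNER DELI",-12.00'
]
result = classify_lines(sample, ["STARBUCKS"])
```

To read from a real file, File.foreach(path) called without a block returns an enumerator, so it can be passed as the lines argument without loading the whole file into memory.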
