Chaining Ruby regexp operators

I'm writing a filter program that reads a CSV file containing address data and excludes rows whose street address is on a crescent (cres), avenue (ave) or place (pl).

Here's some example input:

data = <<CSV
ID,Street address,Town,Valuation date,Value
1,1 Northburn RD,WANAKA,1/1/2015,280000
2,1 Mount Ida PL,WANAKA,1/1/2015,280000
3,1 Mount Linton AVE,WANAKA,1/1/2015,780000
4,1 Centre CRES,WANAKA,1/1/2015,295000
CSV

require 'csv'

elements = []
CSV.parse(data, headers: true, header_converters: :symbol) do |row|
  elements << row.to_h
end
elements
#=> [
#     {:id=>"1", :street_address=>"1 Northburn RD", :town=>"WANAKA", :valuation_date=>"1/1/2015", :value=>"280000"},
#     {:id=>"2", :street_address=>"1 Mount Ida PL", :town=>"WANAKA", :valuation_date=>"1/1/2015", :value=>"280000"},
#     {:id=>"3", :street_address=>"1 Mount Linton AVE", :town=>"WANAKA", :valuation_date=>"1/1/2015", :value=>"780000"},
#     {:id=>"4", :street_address=>"1 Centre CRES", :town=>"WANAKA", :valuation_date=>"1/1/2015", :value=>"295000"}
#   ]

I can use a simple regular expression to filter on any one of the three (/pl/, /cres/ or /ave/), but I can't chain them using &&, and they don't work when I split them into three separate "filters" either:

elements.select { |e| e[:street_address].downcase! !~ /pl/ && e[:street_address].downcase! !~ /cres/ && e[:street_address].downcase! !~ /ave/ }
#=> [
#     {:id=>"1", :street_address=>"1 northburn rd", :town=>"WANAKA", :valuation_date=>"1/1/2015", :value=>"280000"},
#     {:id=>"3", :street_address=>"1 mount linton ave", :town=>"WANAKA", :valuation_date=>"1/1/2015", :value=>"780000"},
#     {:id=>"4", :street_address=>"1 centre cres", :town=>"WANAKA", :valuation_date=>"1/1/2015", :value=>"295000"}
#   ]

This filters out entry #2 as expected, but not #3 and #4.

Any ideas what I'm missing?

It's because of downcase!: it modifies the receiver in place and returns nil if no changes were made.

str = 'FOO'
str.downcase! #=> "foo"
str.downcase! #=> nil

Therefore, once the first downcase! has lowercased the string, your second and third comparisons become nil !~ /cres/ and nil !~ /ave/, which are always true.
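The failure can be reproduced in isolation. This is only a minimal sketch: the variable names are made up for illustration, and the address is one of the sample rows from the question.

```ruby
# Minimal reproduction of the chained downcase! problem.
address = '1 Mount Linton AVE'.dup   # dup in case string literals are frozen

first  = address.downcase!   # mutates address and returns the lowercased string
second = address.downcase!   # nothing left to change, so this returns nil

# nil =~ /ave/ is nil, so nil !~ /ave/ is true: the check passes by accident.
passes_by_accident = (second !~ /ave/)

# The non-mutating downcase always returns a string, so the check works.
correct_check = (address.downcase !~ /ave/)

p first               #=> "1 mount linton ave"
p second              #=> nil
p passes_by_accident  #=> true
p correct_check       #=> false
```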

To fix your code, use downcase (without the !):

e[:street_address].downcase !~ /pl/

or add the i flag to your regular expression to make it case-insensitive:

e[:street_address] !~ /pl/i

Furthermore, you can combine your regular expressions and use reject :

elements.reject { |e| e[:street_address] =~ /pl|cres|ave/i }

To match only strings that end with "pl", "cres" or "ave", use an appropriate anchor, for example /(pl|cres|ave)$/i.
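To see why the anchor matters, here is a small sketch comparing the two patterns. The address "1 Maple RD" is a made-up example that is not in the question's data, and the \b word boundary and \z end-of-string anchor are one possible choice, not the only one.

```ruby
# Sample addresses from the question, plus a hypothetical "1 Maple RD"
# whose street name happens to contain the letters "pl".
addresses = ['1 Northburn RD', '1 Mount Ida PL', '1 Mount Linton AVE',
             '1 Centre CRES', '1 Maple RD']

# Matches "pl", "cres" or "ave" anywhere in the string.
unanchored = /pl|cres|ave/i

# \b requires the suffix to start at a word boundary,
# \z requires it to end at the very end of the string.
anchored = /\b(?:pl|cres|ave)\z/i

kept_unanchored = addresses.reject { |a| a =~ unanchored }
kept_anchored   = addresses.reject { |a| a =~ anchored }

p kept_unanchored  #=> ["1 Northburn RD"]
p kept_anchored    #=> ["1 Northburn RD", "1 Maple RD"]
```

The unanchored pattern drops "1 Maple RD" because "Maple" contains "pl", while the anchored pattern keeps it.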

If you want to remove elements from an array based on a condition, the idiomatic way might be to use Array#delete_if, which modifies the array in place.

IMO, try not to use a regex when you already know exactly which values you are looking for. Regexes are great at pattern matching (checking email validity and such), but there is no need to reach for them beyond that.

Assuming PL, CRES or AVE is always the last word of the address, this works:

x = elements.delete_if do |el|
  ['pl', 'cres', 'ave'].include?(el[:street_address].downcase.split.last)
end
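If the original array should stay untouched, the same last-word check can be combined with reject, which returns a new array. The following is only a sketch: the hashes are trimmed-down copies of the question's rows, and excluded_suffixes is a name chosen here for illustration.

```ruby
# Suffixes to exclude, taken from the question (place, crescent, avenue).
excluded_suffixes = %w[pl cres ave]

# Trimmed-down versions of the rows from the question.
elements = [
  { id: '1', street_address: '1 Northburn RD' },
  { id: '2', street_address: '1 Mount Ida PL' },
  { id: '3', street_address: '1 Mount Linton AVE' },
  { id: '4', street_address: '1 Centre CRES' }
]

# reject builds a new array; elements itself is not modified.
kept = elements.reject do |el|
  excluded_suffixes.include?(el[:street_address].downcase.split.last)
end

p kept.map { |el| el[:id] }  #=> ["1"]
p elements.size              #=> 4
```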
