
More efficient method than string replace with Ruby gsub

I have a third-party JSON feed which is huge, with lots of data. For example:

{
   "data": [{
     "name": "ABC",
     "price": "2.50"
   },
   ...
   ]
}

I am required to strip the quotation marks from the price values, because the consumer of the JSON feed requires them to be unquoted.

To do this I am using a regex to find the prices, then iterating over them and doing a string replace with gsub for each one. This is how I am doing it:

price_strings = json.scan(/(?:"price":")(.*?)(?:")/).uniq
price_strings.each do |price|
  json.gsub!("\"#{price.reduce}\"", price.reduce)
end
json

The main bottleneck appears to be the each block. Is there a better way of doing this?

If this JSON string is going to be parsed into a Hash at some point, either in your application or in another third-party dependency of your code (i.e. consumed by your colleagues or their modules), I suggest negotiating with them to convert the price value from String to Numeric on demand once the JSON is already a Hash. This is more efficient, and it allows them to handle edge cases such as "price": "", for which my code below will not work: it would remove the "" and leave a JSON syntax error.
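
A minimal sketch of the parse-then-convert approach, assuming the feed fits in memory and that the key is always named "price". The helper name numeric_prices is hypothetical and not part of any library. It walks the parsed structure and converts price strings to BigDecimal, leaving empty or non-numeric values untouched:

```ruby
require 'json'
require 'bigdecimal'

# Hypothetical helper: recursively walk parsed JSON and convert the
# string under every "price" key to a BigDecimal. Values that do not
# look like a plain decimal number (for example "") are left as-is.
def numeric_prices(node)
  case node
  when Hash
    node.each_with_object({}) do |(key, value), out|
      out[key] =
        if key == 'price' && value.is_a?(String) && value.match?(/\A-?\d+(\.\d+)?\z/)
          BigDecimal(value)
        else
          numeric_prices(value)
        end
    end
  when Array
    node.map { |item| numeric_prices(item) }
  else
    node
  end
end

feed = JSON.parse('{"data":[{"name":"ABC","price":"2.50"},{"name":"DEF","price":""}]}')
converted = numeric_prices(feed)
puts converted['data'][0]['price'].to_s('F')  # prints 2.5
puts converted['data'][1]['price'].inspect    # prints ""
```

Because the conversion happens on the parsed Hash, an empty price simply stays a string instead of producing invalid JSON.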

However, if you do not have control over this, or you are doing a one-off mutation of the whole JSON data, you can try the following:

json =
<<-eos
{
  "data": [{
    "name": "ABC",
    "price": "2.50",
    "somethingsomething": {
      "data": [{
        "name": "DEF",
        "price": "3.25", "someprop1": "hello",
        "someprop2": "world"
      }]
    },
    "somethinggggg": {
      "price": "123.45" },
    "something2222": {
      "price": 9.876, "heeeello": "world"
    }
  }]
}
eos

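# Strip the quotes around each quoted price value in a single pass.
# Parentheses avoid Ruby's "ambiguous first argument" warning.
new_json = json.gsub(/("price":.*?)"(.*?)"(.*?,|})/, '\1\2\3')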

puts new_json
# =>
# {
#   "data": [{
#     "name": "ABC",
#     "price": 2.50,
#     "somethingsomething": {
#       "data": [{
#         "name": "DEF",
#         "price": 3.25, "someprop1": "hello",
#         "someprop2": "world"
#       }]
#     },
#     "somethinggggg": {
#       "price": 123.45 },
#     "something2222": {
#       "price": 9.876, "heeeello": "world"
#     }
#   }]
# }
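
The pattern above depends on what follows each price on the same line. A stricter variant, sketched here under the assumption that prices are plain decimals such as "2.50", only matches a quoted number directly after the "price" key, so unquoted prices, empty strings and neighbouring keys are left alone. The names PRICE_PATTERN and unquote_prices are hypothetical:

```ruby
# Hypothetical pattern: the "price" key, optional whitespace, then a
# quoted plain decimal number. Only the number is captured without quotes.
PRICE_PATTERN = /("price"\s*:\s*)"(-?\d+(?:\.\d+)?)"/

# Hypothetical helper: remove the quotes around numeric "price" values
# in a raw JSON string using a single gsub pass.
def unquote_prices(json)
  json.gsub(PRICE_PATTERN, '\1\2')
end

sample = '{"a": {"price": "2.50"}, "b": {"price": 9.876, "x": "y"}, "c": {"price": ""}}'
puts unquote_prices(sample)
# prints {"a": {"price": 2.50}, "b": {"price": 9.876, "x": "y"}, "c": {"price": ""}}
```

This still scans the document once, rather than once per unique price as in the question's each loop.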

DISCLAIMER: I am not a Regexp expert.

This is truly a fool's errand.

JSON.parse('{ "price": 2.50 }')
> {price: 2.5}

As you can see from this JavaScript example, the parser on the consuming side decides how to represent the number; here it drops the trailing zero.

Use a string if you want to provide a formatted number or leave formatting up to the client.

In fact, using floats to represent money is widely known to be a really bad idea, since floats and doubles cannot accurately represent the base-10 fractions that we use for money. JSON has only a single number type, which covers both floats and integers.
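
The rounding problem is easy to reproduce in Ruby. This is a small illustration with hypothetical method names, comparing ordinary Float arithmetic with BigDecimal:

```ruby
require 'bigdecimal'

# Binary floats cannot represent most decimal fractions exactly,
# so small errors appear in ordinary arithmetic.
def float_sum_exact?
  0.1 + 0.2 == 0.3
end

# The same sum with BigDecimal is exact.
def decimal_sum_exact?
  BigDecimal('0.1') + BigDecimal('0.2') == BigDecimal('0.3')
end

puts 0.1 + 0.2            # prints 0.30000000000000004
puts float_sum_exact?     # prints false
puts decimal_sum_exact?   # prints true
```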

If the client is going to do any kind of calculation with the value, you should use an integer in the lowest monetary denomination (cents for euros and dollars) or a string that is interpreted as a BigDecimal-equivalent type by the consumer.
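
One way to produce the integer form is sketched below, assuming prices arrive as plain decimal strings with two minor-unit digits. The helper price_to_cents and the price_cents key are hypothetical names used only for illustration:

```ruby
require 'json'

# Hypothetical helper: convert a decimal price string such as "2.50"
# into an integer number of cents without going through a Float.
# Digits beyond the second decimal place are truncated, not rounded.
def price_to_cents(price)
  whole, fraction = price.split('.', 2)
  fraction = (fraction || '').ljust(2, '0')[0, 2]
  sign = whole.start_with?('-') ? -1 : 1
  sign * (whole.to_i.abs * 100 + fraction.to_i)
end

puts price_to_cents('2.50')    # prints 250
puts price_to_cents('123.45')  # prints 12345
puts JSON.generate('name' => 'ABC', 'price_cents' => price_to_cents('2.50'))
# prints {"name":"ABC","price_cents":250}
```

An integer survives any JSON parser unchanged, so the consumer can format it however it likes.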
