
More efficient method than string replace with ruby gsub

I have a huge third-party JSON feed with lots of data. For example:

{
   "data": [{
     "name": "ABC",
     "price": "2.50"
   },
   ...
   ]
}

I am required to strip the quotation marks from the price, because the consumer of the JSON feed requires it in that form.

To do this I run a regex to find the prices, then iterate over the prices and do a string replace using gsub. This is how I am doing it:

price_strings = json.scan(/(?:"price":")(.*?)(?:")/).uniq
price_strings.each do |price|
  json.gsub!("\"#{price.reduce}\"", price.reduce)
end
json

The main bottleneck appears to be the each block. Is there a better way of doing this?

If this JSON string is going to be parsed into a Hash at some point in your application, or in another third-party dependency of your code (i.e. to be consumed by your colleagues or modules), I suggest negotiating with them to convert the price value from String to Numeric on demand once the JSON is already a Hash, as this is more efficient and allows them to...

...handle edge cases such as "price": "", where my code below will not work: it would remove the "" and leave a JSON syntax error.
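As a rough illustration of that suggestion, here is a minimal sketch that converts prices after parsing rather than on the raw string. The helper names `numeric_prices` and `to_number` are hypothetical, and the sketch assumes prices are plain decimals such as "2.50"; anything else, including "", is left as a string so the output stays valid JSON.

```ruby
require "json"

# Hypothetical helper: converts a decimal-looking string to a Float and
# returns any other string (for example "") unchanged.
def to_number(str)
  str.match?(/\A-?\d+(\.\d+)?\z/) ? str.to_f : str
end

# Hypothetical helper: walks a parsed JSON structure and converts every
# String stored under the "price" key.
def numeric_prices(node)
  case node
  when Hash
    node.to_h do |key, value|
      if key == "price" && value.is_a?(String)
        [key, to_number(value)]
      else
        [key, numeric_prices(value)]
      end
    end
  when Array
    node.map { |item| numeric_prices(item) }
  else
    node
  end
end

feed = '{"data":[{"name":"ABC","price":"2.50"},{"name":"DEF","price":""}]}'
converted = numeric_prices(JSON.parse(feed))
puts JSON.generate(converted)
# => {"data":[{"name":"ABC","price":2.5},{"name":"DEF","price":""}]}
```

Note that the converted value is emitted as 2.5 rather than 2.50, because a Float carries no formatting.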

However, if you do not have control over this, or are doing a one-off mutation of the whole JSON data, you can try the following:

json = <<-eos
{
  "data": [{
    "name": "ABC",
    "price": "2.50",
    "somethingsomething": {
      "data": [{
        "name": "DEF",
        "price": "3.25", "someprop1": "hello",
        "someprop2": "world"
      }]
    },
    "somethinggggg": {
      "price": "123.45" },
    "something2222": {
      "price": 9.876, "heeeello": "world"
    }
  }]
}
eos

new_json = json.gsub(/("price":.*?)"(.*?)"(.*?,|})/, '\1\2\3')

puts new_json
# =>
# {
#   "data": [{
#     "name": "ABC",
#     "price": 2.50,
#     "somethingsomething": {
#       "data": [{
#         "name": "DEF",
#         "price": 3.25, "someprop1": "hello",
#         "someprop2": "world"
#       }]
#     },
#     "somethinggggg": {
#       "price": 123.45 },
#     "something2222": {
#       "price": 9.876, "heeeello": "world"
#     }
#   }]
# }

DISCLAIMER: I am not a Regexp expert.
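One caveat with the pattern above: when a price is already numeric and is followed on the same line by a quoted key and a later comma, the lazy groups can end up stripping the quotes from that next key instead. A tighter alternative, sketched here under the assumption that prices are plain decimals like "2.50" or "-3", only matches a quoted number directly after the "price" key. The variable names are illustrative.

```ruby
# Assumption: a price is an optional minus sign, digits, and an optional
# fractional part. Anything else (including "") is left untouched.
price_pattern = /("price"\s*:\s*)"(-?\d+(?:\.\d+)?)"/

sample = <<~EOS
  {
    "a": { "price": "2.50", "note": "x" },
    "b": { "price": 9.876, "label": "keep", "other": "quotes" },
    "c": { "price": "" }
  }
EOS

# A single gsub pass over the whole string; no per-price loop is needed.
puts sample.gsub(price_pattern, '\1\2')
```

Only the price in "a" changes. The already numeric price in "b" and the empty string in "c" come through unmodified, so the result remains valid JSON.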

This is truly a fool's errand.

JSON.parse('{ "price": 2.50 }')
> {price: 2.5}

As you can see from this JavaScript example, the parser on the consuming side will represent the float however it wants, dropping the trailing zero.
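The same thing happens in Ruby. A minimal sketch using the standard json library:

```ruby
require "json"

parsed = JSON.parse('{ "price": 2.50 }')
p parsed["price"]            # a Float; the trailing zero is not preserved
puts JSON.generate(parsed)   # => {"price":2.5}

# Formatting has to be reapplied by whoever displays the value.
puts format("%.2f", parsed["price"])   # => 2.50
```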

Use a string if you want to provide a formatted number, or leave formatting up to the client.

In fact, using floats to represent money is widely known to be a really bad idea, since floats and doubles cannot accurately represent the base-10 fractions that we use for money. JSON only has a single number type that represents both floats and integers.

If the client is going to do any kind of calculation with the value, you should use an integer in the lowest monetary denomination (cents for euros and dollars) or a string that the consumer interprets as a BigDecimal-equivalent type.
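A minimal sketch of both options in Ruby. The key name "price_cents" is hypothetical and only serves to illustrate the integer approach.

```ruby
require "bigdecimal"
require "json"

# Binary floats cannot represent most decimal fractions exactly.
float_sum = 0.1 + 0.2
puts float_sum == 0.3                    # => false

# BigDecimal values built from strings stay exact.
decimal_sum = BigDecimal("0.1") + BigDecimal("0.2")
puts decimal_sum == BigDecimal("0.3")    # => true

# Option 1: send an integer number of cents.
price_cents = (BigDecimal("2.50") * 100).to_i
puts JSON.generate({ "name" => "ABC", "price_cents" => price_cents })
# => {"name":"ABC","price_cents":250}

# Option 2: keep the string in the feed and let the consumer parse it
# as a decimal before doing arithmetic.
total = BigDecimal("2.50") + BigDecimal("0.25")
puts total.to_s("F")                     # => 2.75
```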

