How do I remove a substring from a value in an Elasticsearch document using their devtools?

If each document has a value similar to:

https://test.com/MODIF-RRS/D:/D-KGQLUL34TURWW-MODIF-AGENT04/_work/1179/s/test/code.cs

and I want to remove the D:/D-KGQLUL34TURWW-MODIF-AGENT04/_work/1179/s/ part so that I am left with https://test.com/MODIF-RRS/test/code.cs, how would I do that?

I have a regex that works in an online tester:

(D:/([a-zA-Z0-9_-]+)/_work/([a-zA-Z0-9_-]+)/s/)

but it gave me an error: invalid range: from (95) cannot be > to (93)

I used a char filter with your regex.

POST _analyze
{
  "char_filter": {
    "type":"pattern_replace",
    "pattern":"(D:/([a-zA-Z0-9_-]+)/_work/([a-zA-Z0-9_-]+)/s/)"
  },
  "text": "https://test.com/MODIF-RRS/D:/D-KGQLUL34TURWW-MODIF-AGENT04/_work/1179/s/test/code.cs"
}

Tokens

{
  "tokens": [
    {
      "token": "https://test.com/MODIF-RRS/test/code.cs",
      "start_offset": 0,
      "end_offset": 85,
      "type": "word",
      "position": 0
    }
  ]
}
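
The same substitution can be sanity-checked outside Elasticsearch. The sketch below uses Python's `re` module, which is not the engine Elasticsearch runs, so it only confirms that the pattern removes the intended segment from the sample URL. The helper name `strip_agent_path` is made up for illustration.

```python
import re

# Pattern from the question; in Python's re a trailing "-" inside [...] is a literal hyphen.
AGENT_PATH = re.compile(r"D:/[a-zA-Z0-9_-]+/_work/[a-zA-Z0-9_-]+/s/")


def strip_agent_path(url: str) -> str:
    """Remove the build-agent working-directory segment from a URL (hypothetical helper)."""
    return AGENT_PATH.sub("", url)


url = "https://test.com/MODIF-RRS/D:/D-KGQLUL34TURWW-MODIF-AGENT04/_work/1179/s/test/code.cs"
print(strip_agent_path(url))
```

A URL that does not contain the segment is returned unchanged, which makes the helper safe to run over mixed data.
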
> (D:/([a-zA-Z0-9_-]+)/_work/([a-zA-Z0-9_-]+)/s/)
> invalid range: from (95) cannot be > to (93)

ASCII character 95 is _ and ASCII character 93 is ].
The parser thinks _-] is supposed to be a range of characters (similar to A-Z) and is confused because the ASCII values to the left and right of - are not in ascending order.
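
The numbers in the message are character codes, which is easy to confirm. The Python sketch below prints the two code points and then provokes a comparable error from Python's own regex parser with a deliberately descending range. `[_-A]` is an illustrative pattern, not one from the question, and Python's wording differs from the Elasticsearch message.

```python
import re

# Code points behind the numbers in the error message.
print(ord("_"), ord("]"))  # 95 93


def compile_error(pattern: str):
    """Return the regex compile error message, or None if the pattern is valid."""
    try:
        re.compile(pattern)
        return None
    except re.error as exc:
        return str(exc)


# "_" (95) to "A" (65) is a descending range, so Python rejects it too.
print(compile_error("[_-A]"))
# Python takes a trailing hyphen literally, so this pattern compiles here.
print(compile_error("[a-zA-Z0-9_-]"))
```
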

Since you do not want to specify a range at all, try escaping the - characters with a leading \, so that the parser knows you mean a literal -, not a range of characters:

(D:/([a-zA-Z0-9_\-]+)/_work/([a-zA-Z0-9_\-]+)/s/)

Note: Depending on how you specify your regex (in JSON?), you may have to escape the \ itself as well, so you would write \\- instead of \-.
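
To see the extra escaping concretely, the sketch below serializes the escaped pattern with Python's `json` module. The `body` dictionary simply mirrors the `pattern_replace` settings shown earlier on this page; it is an illustration of JSON escaping, not a tested Elasticsearch request.

```python
import json

# The regex as the engine should see it: one backslash before each hyphen.
pattern = r"(D:/([a-zA-Z0-9_\-]+)/_work/([a-zA-Z0-9_\-]+)/s/)"

body = {"type": "pattern_replace", "pattern": pattern}
encoded = json.dumps(body)

# In the JSON text every backslash is doubled, so \- is written as \\-.
print(encoded)

# Decoding the JSON restores the single backslash.
decoded = json.loads(encoded)["pattern"]
print(decoded == pattern)
```
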

Alternatively, it is usually possible to put - first in the set; the parser then recognizes that it cannot be part of a range.

(D:/([-a-zA-Z0-9_]+)/_work/([-a-zA-Z0-9_]+)/s/)
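
The three spellings of the character class are meant to accept the same characters. The Python sketch below compares them on the sample URL. Python's `re` is only a stand-in here: it accepts all three forms, including the trailing hyphen that produced the error in the question.

```python
import re

url = "https://test.com/MODIF-RRS/D:/D-KGQLUL34TURWW-MODIF-AGENT04/_work/1179/s/test/code.cs"

variants = {
    "trailing hyphen": r"(D:/([a-zA-Z0-9_-]+)/_work/([a-zA-Z0-9_-]+)/s/)",
    "escaped hyphen": r"(D:/([a-zA-Z0-9_\-]+)/_work/([a-zA-Z0-9_\-]+)/s/)",
    "leading hyphen": r"(D:/([-a-zA-Z0-9_]+)/_work/([-a-zA-Z0-9_]+)/s/)",
}

# Apply each variant and collect the cleaned URL it produces.
results = {name: re.sub(p, "", url) for name, p in variants.items()}
for name, cleaned in results.items():
    print(f"{name}: {cleaned}")
```
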
