
Do Watson Conversation intents and entities support regular expressions?

I'm testing the Watson Conversation API with a dialog my company wants to build. We are developing in Brazilian Portuguese. Because Portuguese is a rich language and users sometimes make mistakes, we want to anticipate likely errors, mainly with special characters and accents.

For example, users may write the word produção as produção, producao, produçao, or producão. Is it possible to use a regular expression in intents and entities, something like the picture below? Sometimes another word follows to complete the meaning, as in produção final, produção geral, produção passada, etc.

[image]

Another quick question: is it possible to create intent examples that reference entity values, using something like @(producao) (as in the image)?

Thank you.

You cannot use regular expressions in intents or entities; however, I think you should still be able to cope with variations.

There is currently no built-in handling of typos or accent normalization when matching intents. However, if a sentence has enough features to match on, the occasional typo shouldn't cause problems. For very short examples, there may be some value in adding extra examples that cover common mistakes.

For entities, you can include synonyms, and I have used that to cover common mistakes before.
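If you want to build that synonym list mechanically rather than by hand, here is a minimal Python sketch. It is not part of Watson; `strip_accent` and `accent_variants` are hypothetical helper names, and the sketch only covers dropped accents and cedillas, not other kinds of typos.

```python
import itertools
import unicodedata


def strip_accent(ch: str) -> str:
    """Return ch without combining marks (e.g. 'ç' -> 'c', 'ã' -> 'a')."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def accent_variants(word: str) -> list[str]:
    """List every spelling obtained by keeping or dropping each accent."""
    word = unicodedata.normalize("NFC", word)
    # For each character, the options are the original and its unaccented form.
    options = [sorted({ch, strip_accent(ch)}) for ch in word]
    return sorted("".join(combo) for combo in itertools.product(*options))


variants = accent_variants("produção")
print(variants)
```

The resulting spellings can then be pasted in as synonyms of the entity value.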

You shouldn't try to include a reference to an entity directly in your intents. For example, rather than Qual a @(producao) you should just have Qual a produção, along with other examples of the same intent, perhaps with different entities, or different synonyms for the same entity. For example, I might have the following examples for a #directions intent...

  • How do I get to the hotel by car?
  • Can you give me directions to the hotel by road?
  • Which is the nearest station if I travel by train?
  • Which bus route will get me to the hotel?

Along with values like car, bus, train, bicycle, etc. for a @transport entity. (Sorry I can't give a Brazilian Portuguese example!) There's no need to explicitly name the entity or entities you're expecting to find in an intent.

And finally, you can use regular expressions in conditions on dialog nodes, for example...

input.text.matches( 'produ[cç][aã]o' )
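To check what that pattern accepts before putting it in a dialog node, you can try it offline. The Python sketch below uses the same regular expression; the function names are just for illustration. One caveat I have not verified against the service: if `matches` behaves like Java's `String.matches`, the whole input has to match, so a phrase like produção final would need `.*` around the pattern. The sketch shows both behaviours.

```python
import re

# Same pattern as in the dialog node condition.
PATTERN = re.compile(r"produ[cç][aã]o")


def whole_input_matches(text: str) -> bool:
    """True only if the entire input is one of the accepted spellings."""
    return PATTERN.fullmatch(text) is not None


def input_contains(text: str) -> bool:
    """True if an accepted spelling appears anywhere in the input."""
    return PATTERN.search(text) is not None


for text in ["produção", "producao", "produçao", "producão", "produção final"]:
    print(text, whole_input_matches(text), input_contains(text))
```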

To complement this with some newer information: a few days ago, IBM Watson Conversation released a new Beta feature called Patterns.

With Patterns in @Entities, you can use regular expressions.

The Patterns field lets you define specific patterns for an entity value. A pattern must be entered as a regular expression in the field.

As in this example, for the entity "ContactInfo", the patterns for phone and email values can be defined as follows:

Examples:

  • localPhone : (\d{3})-(\d{4}) , e.g. 426-4968

  • fullUSphone : (\d{3})-(\d{3})-(\d{4}) , e.g. 800-426-4968

  • email : \b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b , e.g. test@gmail.com
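The three patterns above can be tried out locally before you enter them in the Patterns field. This is only a sketch using Python's `re` module, which is not the engine the Conversation service uses, so treat it as a rough sanity check; `PATTERNS` and `matching_values` are names I made up for the example.

```python
import re

# The example patterns, written as Python raw strings (single backslashes).
PATTERNS = {
    "localPhone": r"(\d{3})-(\d{4})",
    "fullUSphone": r"(\d{3})-(\d{3})-(\d{4})",
    "email": r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b",
}


def matching_values(text: str) -> dict:
    """Return {value name: matched literal} for every pattern found in text."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            found[name] = match.group(0)
    return found


print(matching_values("My email is test@gmail.com"))
print(matching_values("Call 800-426-4968"))
```

Note that a full US number also contains a substring that fits localPhone, so both patterns fire on the second input. I don't know how the service resolves that kind of overlap, so it is worth testing in the tool itself.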

Often when using pattern entities, you will need to store the text that matches the pattern in a context variable (or action variable) from within your dialog tree.

Imagine a case where you are asking a user for their email address. The dialog node will have a condition similar to @contactInfo:email. To assign the email the user entered to a context variable, the following syntax can be used in the dialog node's response section to capture the pattern match:

{
    "context" : {
        "email": "@contactInfo.literal"
    }
}

Note: The pattern matching engine employed by the Conversation service has some syntax limitations, which are necessary to avoid the performance problems that can occur with other regular expression engines. Notably, entity patterns may not contain:

  • Positive repetitions (e.g., x*+)
  • Backreferences (e.g., \g1)
  • Conditional branches (e.g., (?(cond)true))

For more, see Defining Entities in Watson Conversation (step 7 in particular).

You don't need to worry about accents, plurals, or misspelled words. Watson, LUIS, API.AI and so on treat these as features and handle them for each word. For example:

Cartão de Crédito > Kartão de Crédito > cartao de crebito

All of these work fine!
