IBM Watson Discovery crawling issue
We want to index our client's website and store all of its data in the IBM Watson Discovery service. We will connect Discovery to Watson Assistant, so that when a user asks a question related to the client's data, the chatbot connects to Discovery and fetches the data it needs to respond.
Problem: The client's website has multiple links, and each link leads to further links. We want to crawl all the data from the website, index it, and store it in the Watson Discovery service. We tried crawling the site, but the Discovery service is taking a very long time, and the crawl had still not completed after one week. Please let us know how we can achieve this in a better and faster way.
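For illustration only: a site in which every page links to further pages is effectively a graph, and crawling it is a breadth-first traversal that needs a visited set and a page limit in order to terminate. The sketch below is not Watson Discovery's crawler and calls no Discovery API. The names `crawl`, `fetch`, `max_pages`, and `SITE` are hypothetical, and the pages are an in-memory stand-in for a real website.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl restricted to the host of start_url.

    fetch is a caller-supplied function (an assumption of this sketch)
    that maps a URL to its HTML text, or None if the page is unavailable.
    Returns a dict of url -> html in the order the pages were visited.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}          # URLs already queued, so none is visited twice
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" suffixes.
            link, _fragment = urldefrag(urljoin(url, href))
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


# Hypothetical three-page site used in place of real HTTP requests.
SITE = {
    "https://example.com/": '<a href="/a">A</a> <a href="/b">B</a>',
    "https://example.com/a": '<a href="/b">B</a> <a href="https://other.example/x">X</a>',
    "https://example.com/b": '<a href="/">home</a> <a href="/a#top">A</a>',
}

pages = crawl("https://example.com/", SITE.get)
print(list(pages))
```

The visited set and the page cap are what keep a traversal like this from running indefinitely on a heavily interlinked site. Off-host links are skipped so the crawl stays within the one website.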
Note that web crawling is currently in beta, and the Watson Discovery documentation for the web crawl states that, depending on the website, it may not ingest all of the data.
I used the web crawl in Discovery in a scenario similar to yours and queried my website through a chat built with Watson Assistant.
What you should do: