
Document search and add engine web application

Original 2016-01-04 18:32:34 9 1 c#/ asp.net

I want to develop an ASP.NET web application that does the following:
a) The user should be able to add content to a document. The content can include text as well as images, screenshots, etc.
b) The user should be able to search by keywords. When searching by keyword, the matching content, along with its images (if any), should be shown to the user.

I am not sure what the proper approach for this is. One way I can think of is to store the text content in an XML file and later search for keywords by going through each node of the XML and displaying the matches. But I am not sure how to attach image content to XML. This method also doesn't seem efficient if the document grows a lot over time.

Could anyone please suggest a proper way to meet the above requirements? Any hint would be appreciated.

1 solution

Split it into two tasks: editing and search.

Full-text search is a solved problem. Simply use Sphinx Search and you are done. Sphinx is simple to use and can do everything you will need. It has a MySQL interface (your app connects to Sphinx the same way it would to a second MySQL database).
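As a rough illustration of that MySQL-style interface, the sketch below runs a keyword query through any DB-API cursor. It assumes a Sphinx index named `docs_index` (a hypothetical name) and a cursor obtained from a MySQL client library such as PyMySQL connected to Sphinx's SphinxQL listener (conventionally port 9306); neither detail comes from the answer itself. Python is used only for brevity; the same query could be sent from C# through a MySQL connector.

```python
from typing import Any, List

# Hypothetical index name; use whatever your sphinx.conf defines.
INDEX_NAME = "docs_index"


def search_documents(cursor: Any, keywords: str, limit: int = 20) -> List[int]:
    """Return the ids of documents matching the keywords.

    `cursor` is assumed to be a DB-API cursor from a MySQL client
    (e.g. PyMySQL) connected to Sphinx's SphinxQL port. Such clients use
    the %s parameter style and substitute values on the client side.
    """
    query = f"SELECT id FROM {INDEX_NAME} WHERE MATCH(%s) LIMIT %s"
    cursor.execute(query, (keywords, int(limit)))
    # Each row is a tuple; the first column is the document id.
    return [row[0] for row in cursor.fetchall()]
```

The returned ids would then be used to load the matching text and its image links from wherever the documents are actually stored.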

Editing is a bit more complicated. If I understand correctly, you want multiple users to edit a single document concurrently.

I recommend using WebSockets to notify other clients about changes in the document. Long polling and Server-Sent Events have ugly side effects, like stopping the browser from making other requests to the server. To implement the client side in JavaScript, I would use React, Angular, or a similar framework to make updates as easy as possible.

The server side requires a modification-friendly representation of the document, so that if one user changes one part and another user changes another part, your app can merge the changes. Merging changes to completely different parts is easy, but it gets tricky when both users change the same paragraph or document node. The exact representation of each change depends on the format of your document.

I do not see much benefit in using XML rather than any other format. It may be practical for document representation, but it will not help with merging colliding modifications. I would start with a plain array of strings, each representing a single paragraph. Extending it to a full XML document is the easy part once two users can edit the same paragraph.
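To make the array-of-paragraphs idea concrete, here is a minimal three-way merge sketch. It assumes both users started from the same base version and only changed paragraphs in place (no insertions or deletions), which is a simplification of mine and not something the answer specifies. The function name `merge_paragraphs` is hypothetical.

```python
from typing import List, Tuple


def merge_paragraphs(
    base: List[str], mine: List[str], theirs: List[str]
) -> Tuple[List[str], List[int]]:
    """Merge two edited copies of a paragraph list against their common base.

    Returns the merged paragraphs and the indexes of paragraphs that both
    users changed differently (conflicts left for the caller to resolve).
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles in-place paragraph edits")

    merged: List[str] = []
    conflicts: List[int] = []
    for index, (b, m, t) in enumerate(zip(base, mine, theirs)):
        if m == t:
            merged.append(m)       # unchanged, or both made the same edit
        elif m == b:
            merged.append(t)       # only the other user changed it
        elif t == b:
            merged.append(m)       # only this user changed it
        else:
            merged.append(b)       # both changed it: keep base, flag it
            conflicts.append(index)
    return merged, conflicts
```

Edits to different paragraphs merge cleanly, while an edit to the same paragraph by both users ends up in the conflict list, which is the tricky case mentioned above.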

To store images in XML, simply store the files using their hash as the file name and then use that name to link the file from the XML. Git does the same thing and it works nicely. You may want to count references to identify unused files.
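A minimal sketch of the hash-as-file-name idea follows. The choice of SHA-256, the `<image src="..."/>` link syntax, and the function names are assumptions for illustration only. Instead of maintaining reference counters, this version rescans the paragraphs for links, which is simpler but slower on large documents.

```python
import hashlib
import re
from pathlib import Path
from typing import Iterable, Set


def store_image(data: bytes, store_dir: Path, extension: str = ".png") -> str:
    """Save image bytes under a name derived from their SHA-256 hash."""
    name = hashlib.sha256(data).hexdigest() + extension
    store_dir.mkdir(parents=True, exist_ok=True)
    path = store_dir / name
    if not path.exists():  # identical content is stored only once
        path.write_bytes(data)
    return name


# Hypothetical link syntax: <image src="<sha256 hex><extension>"/>
IMAGE_LINK = re.compile(r'<image src="([0-9a-f]{64}\.\w+)"\s*/>')


def unused_images(store_dir: Path, paragraphs: Iterable[str]) -> Set[str]:
    """Return stored file names that no paragraph links to."""
    referenced: Set[str] = set()
    for text in paragraphs:
        referenced.update(IMAGE_LINK.findall(text))
    stored = {p.name for p in store_dir.iterdir() if p.is_file()}
    return stored - referenced
```

Because the name depends only on the content, uploading the same screenshot twice yields the same file, and anything returned by `unused_images` can be deleted safely once no document version still links to it.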