
C# Searching PDFs

Original post: 2017-11-18 20:13:44 7 1 c#/ pdf/ search

I'm using iTextSharp to extract the text content of PDFs. I want to let users search those PDFs much as they would with any search engine, with the most relevant results returned first. I have written a library that applies the TF-IDF algorithm to the documents to rank results. While this works, I feel like I may be reinventing the wheel.

The user should be able to search well over 50,000 PDFs, so there are a lot of them. I don't want to store the full content of each PDF in my database, as I feel that would be very expensive. To mitigate this, I've written my library so that it accepts a precomputed term frequency distribution when calculating TF-IDF. This allows me to read each PDF once, when it's added to the system, instead of every time a search is performed.
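For illustration, here is a minimal sketch of what ranking from stored frequency distributions might look like. It is not the library described above: `TfIdfSketch`, `CountTerms`, and `Rank` are hypothetical names, the tokenizer is a naive split on whitespace and punctuation, and the weighting (term count divided by document length, times the natural log of N over document frequency) is just one common TF-IDF variant.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (an assumption, not the asker's library): ranks documents
// by TF-IDF using only the per-document term counts saved at ingestion time.
public static class TfIdfSketch
{
    // Builds the term frequency distribution for one document's extracted text.
    // Keys are compared case-insensitively.
    public static Dictionary<string, int> CountTerms(string text)
    {
        var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        var separators = new[] { ' ', '\t', '\r', '\n', '.', ',', ';', ':', '!', '?' };
        foreach (var token in text.Split(separators, StringSplitOptions.RemoveEmptyEntries))
        {
            counts.TryGetValue(token, out var n); // n is 0 when the token is new
            counts[token] = n + 1;
        }
        return counts;
    }

    // Scores every document against the query and returns them best first.
    // corpus maps a document id (e.g. a file path) to its stored term counts.
    public static List<KeyValuePair<string, double>> Rank(
        IReadOnlyDictionary<string, Dictionary<string, int>> corpus, string query)
    {
        var queryTerms = CountTerms(query).Keys.ToList();
        var scores = new List<KeyValuePair<string, double>>();

        foreach (var doc in corpus)
        {
            double totalTerms = doc.Value.Values.Sum();
            double score = 0;

            foreach (var term in queryTerms)
            {
                if (totalTerms == 0 || !doc.Value.TryGetValue(term, out var count))
                    continue;

                // Document frequency: how many documents contain the term.
                int docsWithTerm = corpus.Values.Count(d => d.ContainsKey(term));
                double idf = Math.Log((double)corpus.Count / docsWithTerm);
                score += (count / totalTerms) * idf;
            }

            scores.Add(new KeyValuePair<string, double>(doc.Key, score));
        }

        return scores.OrderByDescending(s => s.Value).ToList();
    }
}
```

In this sketch, `CountTerms` would run once per PDF on the text iTextSharp extracts, and only the resulting counts would be persisted. Distributions loaded back from a database would need the same case-insensitive comparer for lookups to match. Recomputing document frequencies on every query, as `Rank` does here, is unlikely to scale to 50,000 documents; caching them per term is the obvious next step.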

Do libraries exist that already do this sort of thing?

1 answer

Lucene.NET will do what you need.

There are also commercial options, such as our 'SearchUnit'.

solution1 0 ACCEPTED 2017-11-19 01:28:59
