Extract Text from SWF

We currently use print2flash ( http://print2flash.com ) to convert user-submitted documents (Word documents, RTF, PowerPoint, etc.) into Flash-based documents that can be viewed online (a la docstoc and scribd).

We would like to extract the text inside these files for full-text indexing. Are there any tools or libraries we can use to accomplish this?

We are developing in ASP.NET / C# and have tried third-party tools such as SWFTools ( http://www.swftools.org ), but the results have been inconsistent and subpar.

P.S.: We would like to do the indexing after the original document has been converted to Flash, because that gives us fewer file formats to deal with.

Your best bet is a third-party Flash parsing library. SWF is a very dense format and painful to parse by hand. Having said that, the format is well understood. You can find the official specification here: http://www.adobe.com/devnet/swf/
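To give a feel for what "dense" means, here is a minimal sketch of the first step any such parser has to do: skip the header and walk the tag records. It is written in Python for brevity rather than C#, and it is not taken from any particular library. The function names (`swf_body`, `iter_tags`, `text_tag_payloads`) are made up for illustration; the header layout and tag codes are as I understand them from the SWF specification. It only handles the `FWS` (uncompressed) and `CWS` (zlib) signatures.

```python
import struct
import zlib

# Tag codes that carry text, per the SWF specification.
TEXT_TAGS = {11: "DefineText", 33: "DefineText2", 37: "DefineEditText"}


def swf_body(data: bytes) -> bytes:
    """Return the uncompressed bytes that follow the 8-byte file header."""
    signature = data[:3]
    if signature == b"FWS":          # uncompressed
        return data[8:]
    if signature == b"CWS":          # zlib-compressed after the first 8 bytes
        return zlib.decompress(data[8:])
    raise ValueError(f"unsupported SWF signature: {signature!r}")


def iter_tags(data: bytes):
    """Yield (tag_code, payload) for every tag record in the file."""
    body = swf_body(data)
    # The frame-size RECT is bit-packed: a 5-bit field width, then four
    # values of that width, padded up to a whole number of bytes.
    nbits = body[0] >> 3
    rect_len = (5 + 4 * nbits + 7) // 8
    pos = rect_len + 4               # also skip frame rate and frame count
    while pos + 2 <= len(body):
        (header,) = struct.unpack_from("<H", body, pos)
        pos += 2
        code, length = header >> 6, header & 0x3F
        if length == 0x3F:           # long form: real length is in the next 4 bytes
            (length,) = struct.unpack_from("<I", body, pos)
            pos += 4
        yield code, body[pos:pos + length]
        pos += length
        if code == 0:                # End tag
            break


def text_tag_payloads(data: bytes):
    """Return (tag_name, payload) for the tags that hold text records."""
    return [(TEXT_TAGS[c], p) for c, p in iter_tags(data) if c in TEXT_TAGS]
```

Note that this only locates the text tags. As far as I know, DefineText records store glyph indices rather than characters, so turning a payload into a string also means parsing the matching font tags and their code tables. That second step is where most of the pain is, and it is a plausible reason why output from generic tools can be inconsistent.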
