
Create a help file in txt from an MS Word document

I need to create a txt file from an MS Word document. The txt file will be used as a help document for my user interface, so it needs to be in a special format. Is there any third-party software that can read an MS Word doc and create a text file from it in a given format? Or can I use Perl to read a Word doc in a way that lets me extract the headers, tables and section headings as specified in the doc? While parsing the document, I need a way to tell whether a given line is table content or a section heading. Or is there some other way of doing it?

I have a lot more familiarity with parsing HTML, so I would suggest that you translate your Word docs into HTML first using MSWord::ToHTML or some equivalent module.

Then you can use one of the many HTML parsing modules out there, such as Mojo::DOM, to parse out your data and its styling. There's an 8-minute video on how to use the latter module: Mojocast Episode 5.
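To make the second step concrete, here is a rough sketch of how the converted HTML could be sorted into headings, table rows and plain paragraphs. It is written in Python with only the standard library so it runs as-is; it is not Mojo::DOM, and the same selection could be done there with CSS selectors. The class name `HelpTextExtractor`, the function `to_help_text` and the `== Heading ==` output style are made up for illustration, and the sketch assumes the converter emits ordinary `h1`-`h6`, `p`, `tr`, `td` and `th` tags, which real Word output may not do cleanly.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
CELLS = {"td", "th"}


class HelpTextExtractor(HTMLParser):
    """Hypothetical helper: tags each piece of text as heading, paragraph or table row."""

    def __init__(self):
        super().__init__()
        self.items = []        # list of (kind, value) in document order
        self._capture = None   # tag currently being captured, or "cell"
        self._buf = []         # text fragments of the current capture
        self._row = None       # cells of the table row being read, if any

    def _text(self):
        # Join fragments and collapse runs of whitespace.
        return " ".join("".join(self._buf).split())

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in CELLS:
            if self._row is None:   # tolerate a cell without an enclosing <tr>
                self._row = []
            self._capture, self._buf = "cell", []
        elif (tag in HEADINGS or tag == "p") and self._capture is None:
            # A <p> nested inside a cell is skipped here; its text goes to the cell.
            self._capture, self._buf = tag, []

    def handle_data(self, data):
        if self._capture is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag in CELLS and self._capture == "cell":
            self._row.append(self._text())
            self._capture = None
        elif tag == "tr" and self._row is not None:
            self.items.append(("table_row", self._row))
            self._row = None
        elif tag == self._capture:
            kind = "heading" if tag in HEADINGS else "paragraph"
            self.items.append((kind, self._text()))
            self._capture = None


def to_help_text(html):
    """Render the classified items in an assumed plain-text help format."""
    parser = HelpTextExtractor()
    parser.feed(html)
    parser.close()
    lines = []
    for kind, value in parser.items:
        if kind == "heading":
            lines.append("== " + value + " ==")
        elif kind == "table_row":
            lines.append(" | ".join(value))
        else:
            lines.append(value)
    return "\n".join(lines)
```

Calling `to_help_text(html_string)` on the converter's output gives one line per heading, paragraph or table row; the formatting branch is the part you would adapt to whatever layout your help viewer expects.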


 