
Parsing simple MIME files from C/C++?

I have searched the web for days now but I can't seem to find a good solution to my problem:

For one of my projects I'm looking for a good (lightweight) MIME parser. My customer provides MIME formatted files (linear, no hierarchy) which contain 3-4 "parts". The application must be able to split those parts and process them independently.

Basically those MIME files are like raw E-Mail messages, but without the SMTP-headers. Instead they begin with the MIME-Header "MIME-Version: 1.0" and after that the parts follow.

I am using C++ for the application, so a C++ library is welcome. A standard C library is welcome, too; but it should fit the following criteria:

  • Be open (at least LGPL), not proprietary
  • Compact - I just need the parser, no SMTP/POP3 support
  • Cross-Platform (targeting Windows, Mac OS X and Linux)

After days of searching I found the following libraries, along with reasons not to use them:

  • mimetic (C++) --- Although this library seems complete and is written for C++, it is based on glib, which won't compile properly on Windows.
  • Vmime (C++) --- Seems complete, but there is no official Windows support. Also they provide "dual licensing" ("commercial LGPL" + GPL). Seems to be included with Ubuntu and Debian, but the licensing is confusing.
  • mime++ --- Commercial, no Mac support.
  • Chilkat Software MIME C++ Library --- Commercial and focused on Windows.

I don't really want to write my own MIME parser. MIME is so widespread that there must be some open library to handle this file format in a sane way.

So, do you guys have any ideas, suggestions or links?

Thanks in advance!

GMime is an LGPL mime parser written in C. It does depend on glib, but glib is available on Windows: 32bit and 64bit (and all Unix-based platforms, including Mac OS X). It also builds inside Visual Studio afaict, so I fail to see what the problem is. I know there is at least 1 commercial Windows vendor shipping libgmime.dll and libglib.dll in their product (Kerio Connect, iirc). Nokia even ships it on some of their phones.

There is really no such thing as a "lightweight" mime parser if you actually expect it to do anything more than split headers on ':', do haphazard parsing of the Content-Type header to look for a boundary string, and then go on to handle non-nested multiparts (kinda useless outside of parsing http responses and pre-canned mime messages that you control the composition of).
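To make concrete what that minimal approach looks like, here is a rough C++ sketch of the two steps mentioned above: splitting a header line on the first ':' and searching a Content-Type value for a boundary parameter. The helper names (`split_header`, `find_boundary`) are hypothetical and not taken from any library. It deliberately shows the "haphazard" shortcuts: no header unfolding, no comments or RFC 2231 parameter handling, and a plain substring search that could be fooled by a parameter whose name merely ends in "boundary".

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Lower-case a copy of s (ASCII only) for case-insensitive searching.
static std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Strip leading and trailing spaces, tabs, CR and LF.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    const std::size_t e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Split "Name: value" on the first ':'. Returns false if there is no
// colon or the name is empty. Folded (continued) headers are not handled.
bool split_header(const std::string& line, std::string& name, std::string& value) {
    const std::size_t colon = line.find(':');
    if (colon == std::string::npos || colon == 0) return false;
    name = trim(line.substr(0, colon));
    value = trim(line.substr(colon + 1));
    return true;
}

// Look for boundary=... in a Content-Type value. Accepts both the quoted
// and the unquoted form. Returns false if no boundary parameter is found.
bool find_boundary(const std::string& content_type, std::string& boundary) {
    const std::string key = "boundary=";
    std::size_t pos = to_lower(content_type).find(key);
    if (pos == std::string::npos) return false;
    pos += key.size();
    if (pos < content_type.size() && content_type[pos] == '"') {
        const std::size_t end = content_type.find('"', pos + 1);
        if (end == std::string::npos) return false;  // unterminated quote
        boundary = content_type.substr(pos + 1, end - pos - 1);
        return true;
    }
    const std::size_t end = content_type.find_first_of("; \t\r\n", pos);
    boundary = content_type.substr(
        pos, end == std::string::npos ? std::string::npos : end - pos);
    return true;
}
```

Everything this sketch skips (folding, encoded-words, nested multiparts, malformed input) is exactly the work a full parser takes on.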

The reason that parsers like GMime are so "large", as far as lines of code goes, is because they are meant for developers that actually want correct and robust mime-part and header parsing/decoding. See my rant about decoding rfc2047 encoded-word tokens for an idea about how complex this can get (btw, other than GMime and MimeKit, I have yet to find any open source mime parsers capable of handling all of the edge cases discussed in my rant).

Even with all of this extra robust processing, it's still as fast or faster than most "lightweight" mime parsers are likely to be, especially considering most of them use a readline approach. I've seen "lightweight" mime parsers purport to parse 25MB email files in 2-3 seconds and consider that to be "fast". My unit tests for GMime parse 2 mbox files full of messages larger than 1.2GB (yes, gigabytes) in less time than that.

My point is that "lightweight" is a bullshit criterion used by people who don't know what they are talking about.

How about judging based on something meaningful such as rfc compliance? Or by a combination of rfc compliance and performance? Either way, GMime will come out a winner in any meaningful comparison you make.

It's been a while. So I'll just answer my own question.

After spending some more time on this, I ended up writing my own implementation. MIME is quite simple indeed, and if you read the documentation, you have something up and running in a short time.
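For anyone in the same situation, the core of such a splitter for flat (non-nested) files might look roughly like the sketch below. This is not the implementation mentioned above, just a minimal illustration under stated assumptions: the boundary string is already known, parts are never nested, and the delimiter never appears inside part content. The names `split_parts`, `split_part` and `Part` are made up for the example.

```cpp
#include <string>
#include <vector>

// One MIME part, separated into its raw header block and its content.
struct Part {
    std::string headers;
    std::string content;
};

// Split a flat multipart body into raw parts using "--boundary" delimiter
// lines. Text before the first delimiter (the preamble) is ignored, and
// the closing delimiter "--boundary--" ends the scan.
std::vector<std::string> split_parts(const std::string& body,
                                     const std::string& boundary) {
    std::vector<std::string> parts;
    const std::string delim = "--" + boundary;
    std::size_t pos = body.find(delim);
    while (pos != std::string::npos) {
        const std::size_t after = pos + delim.size();
        if (body.compare(after, 2, "--") == 0) break;  // closing delimiter
        std::size_t start = body.find('\n', after);    // end of delimiter line
        if (start == std::string::npos) break;
        ++start;
        const std::size_t next = body.find(delim, start);
        if (next == std::string::npos) break;          // truncated input
        std::size_t end = next;
        // The line break right before a delimiter belongs to the delimiter.
        if (end > start && body[end - 1] == '\n') --end;
        if (end > start && body[end - 1] == '\r') --end;
        parts.push_back(body.substr(start, end - start));
        pos = next;
    }
    return parts;
}

// Separate a raw part into headers and content at the first blank line.
// Handles both CRLF and bare LF line endings.
Part split_part(const std::string& raw) {
    // A part that starts with a blank line has no headers at all.
    if (raw.compare(0, 2, "\r\n") == 0) return {"", raw.substr(2)};
    if (raw.compare(0, 1, "\n") == 0) return {"", raw.substr(1)};
    const std::size_t crlf = raw.find("\r\n\r\n");
    const std::size_t lf = raw.find("\n\n");
    if (crlf != std::string::npos && (lf == std::string::npos || crlf < lf))
        return {raw.substr(0, crlf), raw.substr(crlf + 4)};
    if (lf != std::string::npos)
        return {raw.substr(0, lf), raw.substr(lf + 2)};
    return {raw, ""};  // no blank line: treat everything as headers
}
```

Calling `split_parts(file_contents, boundary)` and then `split_part` on each element yields the header block and content of every part. Decoding the content (base64, quoted-printable) and checking that a delimiter really starts a line are left out, so this only suits files whose composition you control.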

However, I think there should be something like vMime, but open source. I can't believe that so few people have to deal with MIME structures as it's a real standard.

I've successfully used mimetic with MSVC 2010, so it works on Windows too. It also has an MIT license.

I have also created a MIME library (Windows only) with S/MIME support. If you do not want the S/MIME stuff, you can remove the Windows-specific functions.

http://www.codeproject.com/Articles/1114232/Cplusplus-MIME-A-simple-single-header-parser-an

I would suggest mimecpp, a C++ implementation of MIME.

It is very small, well-encapsulated and easy to use. In fact, the source code only contains 7 files.
