
How to check a text file's original encoding in VC++ or MFC

I use CStdioFile to read a text file into a string, but I want to check the file's original encoding when I choose the file in a dialog. How can I check the original encoding?

// This is my code

if (dlg.DoModal() == IDOK)
{
    path = dlg.GetPathName(); // get file path
    CStdioFile pStdioFile1(path, CFile::modeRead);
    CString Buff; // receives one line of text per ReadString call

    // ReadString returns FALSE once the end of the file is reached,
    // so there is no need to call feof() on the underlying stream.
    while (pStdioFile1.ReadString(Buff))
    {
        msg += Buff;
        msg += "\n"; // ReadString(CString&) drops the line break, so put it back
    }
}

You can't. In some cases the data will contain indications of the encoding used, but you can't really depend on it. Windows does provide IsTextUnicode to give you a guess at whether some text is Unicode (in this case meaning UTF-16) or not, but 1) it's only good for UTF-16, and 2) the result is only a guess anyway.

As an aside, I'm not excited about your code for reading the whole file into a string. Assuming the file is expected to be fairly small, I'd normally use something like:

// requires <fstream> and <sstream>
std::ifstream in(dlg.GetPathName());
std::stringstream buffer;
buffer << in.rdbuf();

// now the content of the file is available as `buffer.str()`.

Check the BOM (Byte Order Mark) of the file (see http://en.wikipedia.org/wiki/Byte_order_mark ).

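If the file does not contain a BOM, a common fallback is to assume it's an 8-bit ANSI file, although that is only a guess: UTF-8 files, for example, are often saved without a BOM.

Otherwise, the BOM indicates the format of the file. Check the link; it contains a nice table of the different BOMs and their meanings.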
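
The BOM check described above could be sketched roughly as follows. This is only an illustration in portable C++, not code from either answer: the names Bom, DetectBom and DetectFileBom are hypothetical, and the path is assumed to be a narrow std::string (in a Unicode MFC build you would convert or pass the CString from dlg.GetPathName() differently). The byte sequences are the standard marks for UTF-8, UTF-16 and UTF-32.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical result type: the encoding implied by a leading BOM, if any.
enum class Bom { None, Utf8, Utf16LE, Utf16BE, Utf32LE, Utf32BE };

// Hypothetical helper: classifies the first n bytes of a buffer.
Bom DetectBom(const unsigned char* p, std::size_t n)
{
    // Test the 4-byte UTF-32 marks first: the UTF-32 LE mark (FF FE 00 00)
    // begins with the same two bytes as the UTF-16 LE mark (FF FE).
    if (n >= 4 && p[0] == 0xFF && p[1] == 0xFE && p[2] == 0x00 && p[3] == 0x00)
        return Bom::Utf32LE;
    if (n >= 4 && p[0] == 0x00 && p[1] == 0x00 && p[2] == 0xFE && p[3] == 0xFF)
        return Bom::Utf32BE;
    if (n >= 3 && p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF)
        return Bom::Utf8;
    if (n >= 2 && p[0] == 0xFF && p[1] == 0xFE)
        return Bom::Utf16LE;
    if (n >= 2 && p[0] == 0xFE && p[1] == 0xFF)
        return Bom::Utf16BE;
    return Bom::None; // no BOM: the encoding has to be guessed some other way
}

// Hypothetical helper: reads up to four bytes from the file and classifies them.
// A file that cannot be opened yields zero bytes and therefore Bom::None.
Bom DetectFileBom(const std::string& path)
{
    std::ifstream in(path, std::ios::binary); // binary mode: no newline translation
    unsigned char head[4] = {0, 0, 0, 0};
    in.read(reinterpret_cast<char*>(head), sizeof head);
    return DetectBom(head, static_cast<std::size_t>(in.gcount()));
}
```

Even with a BOM the result is still a heuristic: a UTF-16 LE file whose first character is U+0000 starts with the same four bytes as the UTF-32 LE mark, so this sketch would report it as UTF-32 LE.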

