Change the encoding of a text file from ANSI to UTF8 in C# without affecting any of its characters

Can anyone help me out? I have tried many different approaches without getting the desired result. I just want to change the encoding of an existing text (.txt) file, which contains characters like ö and ü, from ANSI to UTF8. When I do it manually, by opening the file in an editor and choosing FILE => SAVE AS, the Encoding list shows ANSI. From there I can change the encoding from ANSI to UTF8, and none of the contents/characters change. But when I do it in code, it does not work.

==> First approach, using the following code:

if (!System.IO.Directory.Exists(System.Windows.Forms.Application.StartupPath + "\\Temp"))
{
    System.IO.Directory.CreateDirectory(System.Windows.Forms.Application.StartupPath + "\\Temp");
}
string destPath = System.Windows.Forms.Application.StartupPath + "\\Temp\\temporarytextfile.txt";

File.WriteAllText(destPath, File.ReadAllText(path, Encoding.Default), Encoding.UTF8);

==> Second alternative I tried:

using (Stream fileStream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
    using (Stream destStream = new FileStream(destPath, FileMode.Create, FileAccess.Write, FileShare.ReadWrite))
    {
        using (var reader = new BinaryReader(fileStream, Encoding.Default))
        {
            using (var writer = new BinaryWriter(destStream, Encoding.UTF8))
            {
                var srcBytes = new byte[fileStream.Length];
                reader.Read(srcBytes, 0, srcBytes.Length);
                writer.Write(srcBytes);

            }
        }
    }
}

==> Third alternative I tried:

System.IO.StreamWriter file = new System.IO.StreamWriter(destPath, true, Encoding.Default);
using (StreamReader sr = new StreamReader(path, Encoding.UTF8, true))
{
    String line1;
    while ((line1 = sr.ReadLine()) != null)
    {
        file.WriteLine(line1);
    }
}

file.Close();

But unfortunately, none of the above solutions worked for me.

The problem with ANSI is that it's not a specific encoding; it's just a term for "some 8-bit encoding that is the default for the system where the file was created".
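To make that ambiguity concrete, here is a minimal sketch (not part of the original answer) that decodes the same two bytes under two different "ANSI" code pages. The names AnsiAmbiguityDemo and Decode are made up for illustration. The RegisterProvider call assumes .NET Core / .NET 5+, where legacy code pages are not available by default; on .NET Framework the code pages are built in and that line can likely be dropped.

```csharp
using System;
using System.Text;

static class AnsiAmbiguityDemo
{
    // Decodes raw bytes with the given Windows code page number.
    public static string Decode(byte[] bytes, int codePage)
    {
        return Encoding.GetEncoding(codePage).GetString(bytes);
    }

    static void Main()
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code pages
        // must be registered before Encoding.GetEncoding can find them.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        byte[] bytes = { 0xF6, 0xF5 };

        // 0xF6 is 'ö' in both code pages, but 0xF5 is a different letter in each.
        string western = Decode(bytes, 1252); // "öõ" (second char is U+00F5)
        string central = Decode(bytes, 1250); // "öő" (second char is U+0151)

        Console.WriteLine("1252: U+{0:X4}", (int)western[1]);
        Console.WriteLine("1250: U+{0:X4}", (int)central[1]);
    }
}
```

A file that only contains ö and ü happens to decode the same way under both code pages, which is why a wrong guess can go unnoticed until a less common character shows up.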

If the file was created on the same system, and the default encoding hasn't changed, you can just use Encoding.Default to read it, so your first version would work, and so would the third once its encodings are swapped (as posted, it reads the file as UTF8 and writes it with Encoding.Default). (Your second version just copies the file without any changes.) Otherwise you have to know exactly which encoding was used.

This example uses the windows-1250 code page:

File.ReadAllText(path, Encoding.GetEncoding(1250))

See the documentation for the Encoding class for a list of available encodings.
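Putting the pieces of this answer together, a complete conversion might look like the sketch below. It is only an illustration under stated assumptions: the class name AnsiToUtf8, the method name Convert, and the command-line handling are hypothetical, the code page 1250 is just the one used in the example above, and the RegisterProvider call assumes .NET Core / .NET 5+ (on .NET Framework it should not be needed).

```csharp
using System;
using System.IO;
using System.Text;

static class AnsiToUtf8
{
    // Reads sourcePath as text in the given code page and writes it to
    // destPath as UTF-8. File.WriteAllText with Encoding.UTF8 emits a BOM.
    public static void Convert(string sourcePath, string destPath, int sourceCodePage)
    {
        Encoding source = Encoding.GetEncoding(sourceCodePage);
        string text = File.ReadAllText(sourcePath, source);
        File.WriteAllText(destPath, text, Encoding.UTF8);
    }

    static void Main(string[] args)
    {
        // Assumption: .NET Core / .NET 5+, where legacy code pages must be registered.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        if (args.Length != 2)
        {
            Console.Error.WriteLine("usage: AnsiToUtf8 <source> <destination>");
            return;
        }

        // 1250 is only an example; use the code page the file was really written in.
        Convert(args[0], args[1], 1250);
    }
}
```

Decoding to a string first and then re-encoding is what makes the characters survive: the bytes change, but the text does not.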

I had the same need. Here is how I proceeded:

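    // Status text of the last call (assumed to be a field of the enclosing class).
    string message;

    // Rewrites the file in the given encoding.
    // Returns 2 if the file was rewritten, 1 if it was already in that encoding, 0 on error.
    int Encode(string file, Encoding encode)
    {
        int retour = 0;
        try
        {
            string buffer = null;
            // Files without a BOM are read as ANSI (Encoding.Default on .NET Framework;
            // on .NET Core and later Encoding.Default is UTF-8). A BOM overrides this.
            using (var reader = new StreamReader(file, Encoding.Default, true))
            {
                // CurrentEncoding is only updated after the first read, so force one.
                reader.Peek();
                if (reader.CurrentEncoding.CodePage != encode.CodePage)
                {
                    buffer = reader.ReadToEnd();
                }
            }
            if (buffer != null)
            {
                // The reader is closed by now, so the same file can be overwritten.
                using (var writer = new StreamWriter(file, false, encode))
                {
                    writer.Write(buffer);
                }
                message = string.Format("Encode {0} !", file);
                retour = 2;
            }
            else retour = 1;
        }
        catch (Exception e)
        {
            message = string.Format("{0} ?", e.Message);
        }
        return retour;
    }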

    /// <summary>
    /// Change encoding to UTF8
    /// </summary>
    /// <param name="file"></param>
    /// <returns></returns>
    public int toUTF8(string file)
    {
        return Encode(file, Encoding.UTF8);
    }

    public int toANSI(string file)
    {
        return Encode(file, Encoding.Default);
    }

Have you tried the following?

http://msdn.microsoft.com/en-us/library/system.text.encoding.convert%28v=vs.71%29.aspx

using System;
using System.Text;
namespace ConvertExample
{
   class ConvertExampleClass
   {
      static void Main()
      {
         string unicodeString = "This string contains the unicode character Pi(\u03a0)";

         // Create two different encodings.
         Encoding ascii = Encoding.ASCII;
         Encoding unicode = Encoding.Unicode;

         // Convert the string into a byte[].
         byte[] unicodeBytes = unicode.GetBytes(unicodeString);

         // Perform the conversion from one encoding to the other.
         byte[] asciiBytes = Encoding.Convert(unicode, ascii, unicodeBytes);

         // Convert the new byte[] into a char[] and then into a string.
         // This is a slightly different approach to converting to illustrate
         // the use of GetCharCount/GetChars.
         char[] asciiChars = new char[ascii.GetCharCount(asciiBytes, 0, asciiBytes.Length)];
         ascii.GetChars(asciiBytes, 0, asciiBytes.Length, asciiChars, 0);
         string asciiString = new string(asciiChars);

         // Display the strings created before and after the conversion.
         Console.WriteLine("Original string: {0}", unicodeString);
         Console.WriteLine("Ascii converted string: {0}", asciiString);
      }
   }
}
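The sample above converts from Unicode to ASCII, which is not quite the case in the question. As a hedged sketch of how Encoding.Convert could be applied to ANSI bytes instead, the following re-encodes a byte array from a chosen code page to UTF-8. The names ConvertBytesSketch and ToUtf8 are hypothetical, windows-1252 is only an assumed source code page, and the RegisterProvider call assumes .NET Core / .NET 5+.

```csharp
using System;
using System.Text;

static class ConvertBytesSketch
{
    // Re-encodes bytes from the given source code page to UTF-8.
    // Encoding.Convert does not add a byte order mark.
    public static byte[] ToUtf8(byte[] ansiBytes, int sourceCodePage)
    {
        Encoding source = Encoding.GetEncoding(sourceCodePage);
        return Encoding.Convert(source, Encoding.UTF8, ansiBytes);
    }

    static void Main()
    {
        // Assumption: .NET Core / .NET 5+, where legacy code pages must be registered.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        byte[] ansi = { 0xF6, 0xFC }; // "öü" in windows-1252
        byte[] utf8 = ToUtf8(ansi, 1252);

        Console.WriteLine(BitConverter.ToString(utf8)); // C3-B6-C3-BC
    }
}
```

For a whole file, the same call can be placed between File.ReadAllBytes and File.WriteAllBytes. Since no BOM is written this way, some editors may not report the result as UTF-8 unless the preamble is added separately.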
