How to convert Turkish chars to English chars in a string?

string strTurkish = "ÜST";

How can I change the value of strTurkish to "UST"?

You can use the following method to solve your problem. The other methods do not convert the Turkish lowercase dotless I (ı) correctly.

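// Requires: using System.Globalization; using System.Text;
public static string RemoveDiacritics(string text)
{
    Encoding srcEncoding = Encoding.UTF8;
    // Windows-1252 (Latin alphabet). On .NET Core / .NET 5+, call
    // Encoding.RegisterProvider(CodePagesEncodingProvider.Instance) first.
    Encoding destEncoding = Encoding.GetEncoding(1252);

    // Round-trip through code page 1252 so that letters it lacks (such as ı, ğ, ş)
    // go through the encoder's fallback before normalization.
    text = destEncoding.GetString(Encoding.Convert(srcEncoding, destEncoding, srcEncoding.GetBytes(text)));

    // Decompose accented letters into a base letter plus combining marks.
    string normalizedString = text.Normalize(NormalizationForm.FormD);
    StringBuilder result = new StringBuilder();

    // Keep everything except the combining marks.
    for (int i = 0; i < normalizedString.Length; i++)
    {
        if (CharUnicodeInfo.GetUnicodeCategory(normalizedString[i]) != UnicodeCategory.NonSpacingMark)
        {
            result.Append(normalizedString[i]);
        }
    }

    return result.ToString();
}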
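
// Requires: using System.Globalization; using System.Linq; using System.Text;
var text = "ÜST";
// Decompose, then drop the combining marks. Note that 'ı' (U+0131) has no
// decomposition, so this approach leaves it unchanged.
var unaccentedText = String.Join("", text.Normalize(NormalizationForm.FormD)
        .Where(c => char.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark));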

I'm not an expert on this sort of thing, but I think you can use string.Normalize to do it, by decomposing the value and then effectively removing any non-ASCII characters:

using System;
using System.Linq;
using System.Text;

class Test
{
    static void Main()
    {
        string text = "\u00DCST";
        string normalized = text.Normalize(NormalizationForm.FormD);
        string asciiOnly = new string(normalized.Where(c => c < 128).ToArray());
        Console.WriteLine(asciiOnly);
    }    
}

It's entirely possible that this does horrible things in some cases though.
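
One concrete case where decomposition alone falls short is the dotless ı (U+0131): it has no canonical decomposition, so FormD leaves it as a single non-ASCII character, and a `c < 128` filter then drops it entirely. Below is a minimal sketch of one way around that, assuming it is acceptable to map ı to i by hand before normalizing. The names `TurkishText` and `ToAscii` are made up for illustration and do not come from any of the answers.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text;

static class TurkishText // hypothetical helper name, for illustration only
{
    public static string ToAscii(string text)
    {
        if (string.IsNullOrEmpty(text)) return text;

        // 'ı' (U+0131) has no canonical decomposition, so map it by hand first.
        string decomposed = text.Replace('ı', 'i').Normalize(NormalizationForm.FormD);

        // Drop the combining marks that FormD split off (e.g. U+0308 from 'Ü').
        var kept = decomposed.Where(c =>
            CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark);

        return new string(kept.ToArray());
    }
}

class Demo
{
    static void Main()
    {
        Console.WriteLine(TurkishText.ToAscii("ÜST"));     // UST
        Console.WriteLine(TurkishText.ToAscii("ığüşöçİ")); // igusocI
    }
}
```

Unlike the `c < 128` filter, this keeps any other non-ASCII characters that have no decomposition, which may or may not be what you want.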

This is not a problem that requires a general solution. It is known that there are only 12 special characters in the Turkish alphabet that have to be normalized. Those are ı, İ, ö, Ö, ç, Ç, ü, Ü, ğ, Ğ, ş, Ş. You can write 12 rules to replace them with their English counterparts: i, I, o, O, c, C, u, U, g, G, s, S.

public string TurkishCharacterToEnglish(string text)
{
    char[] turkishChars = {'ı', 'ğ', 'İ', 'Ğ', 'ç', 'Ç', 'ş', 'Ş', 'ö', 'Ö', 'ü', 'Ü'};
    char[] englishChars = {'i', 'g', 'I', 'G', 'c', 'C', 's', 'S', 'o', 'O', 'u', 'U'};
    
    // Match chars
    for (int i = 0; i < turkishChars.Length; i++)
        text = text.Replace(turkishChars[i], englishChars[i]);

    return text;
}

' VB.NET: replace each Turkish character with the English one at the same index.
Public Function Ceng(ByVal _String As String) As String
    Dim Source As String = "ığüşöçĞÜŞİÖÇ"
    Dim Destination As String = "igusocGUSIOC"
    For i As Integer = 0 To Source.Length - 1
        _String = _String.Replace(Source(i), Destination(i))
    Next
    Return _String
End Function
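
// Requires: using System.Collections.Generic; using System.Linq; using System.Text;
// Extension methods must live in a static class (the class name here is illustrative).
public static class StringExtensions
{
    // Built once instead of on every call.
    private static readonly Dictionary<char, char> TurkishChToEnglishChDic = new Dictionary<char, char>
    {
        {'ç','c'},
        {'Ç','C'},
        {'ğ','g'},
        {'Ğ','G'},
        {'ı','i'},
        {'İ','I'},
        {'ş','s'},
        {'Ş','S'},
        {'ö','o'},
        {'Ö','O'},
        {'ü','u'},
        {'Ü','U'}
    };

    public static string TurkishChrToEnglishChr(this string text)
    {
        if (string.IsNullOrEmpty(text)) return text;

        // Append the mapped character when one exists, otherwise the original.
        return text.Aggregate(new StringBuilder(text.Length), (sb, chr) =>
            sb.Append(TurkishChToEnglishChDic.TryGetValue(chr, out char mapped) ? mapped : chr)).ToString();
    }
}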

Have a look at this link; you'll find the code for it there. To be clear, I didn't write it myself.

Turkish chars to English chars
