Replace all non-ASCII characters with codes

Question

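Is it possible to replace all non-ASCII characters in a C# string with codes? I have an application that prints to Zebra label printers using ZPL. It needs every non-ASCII character converted to the hex codes of its UTF-8 bytes, each with a leading underscore. For example, if the user wants to print µ (the micro sign), I have to do this: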

text = text.Replace("µ", "_c2_b5");  // c2 b5 is the UTF-8 encoding of µ

For example, "Helloµ±" should become "Hello_c2_b5_c2_b1".

Answer 1

This should help:

using System;
using System.Text;

var source = "Helloµ±";
var sb = new StringBuilder();
foreach (char c in source)
{
    if (c == '_')
    {
        // special case: replace _ with _5f so it cannot be mistaken for an escape prefix
        sb.Append("_5f");
    }
    else if (c < 32 || c > 127)
    {
        // handle non-ASCII by using the hex representation of its UTF-8 bytes
        // TODO: surrogate pairs are not handled correctly here: each half is encoded
        // on its own and comes out as ef bf bd (U+FFFD) instead of the real 4-byte sequence
        var ba = Encoding.UTF8.GetBytes(new[] { c });
        foreach (byte b in ba)
        {
            sb.AppendFormat("_{0:x2}", b);
        }
    }
    else
    {
        // in the ASCII range from 32 to 127, so just copy
        sb.Append(c);
    }
}

Console.WriteLine(sb.ToString());

This outputs "Hello_c2_b5_c2_b1".

You can wrap this in a convenient method.
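As a rough sketch of such a wrapper, the version below also addresses the surrogate-pair TODO by encoding both halves of a pair together. The names `ZplText` and `EscapeNonAscii` are hypothetical, chosen only for illustration, and the range test is copied from the loop above.

```csharp
using System;
using System.Text;

public static class ZplText
{
    // Hypothetical helper: escapes '_' and every character outside 32..127
    // as the lowercase hex of its UTF-8 bytes, each byte prefixed with '_'.
    public static string EscapeNonAscii(string source)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < source.Length; i++)
        {
            char c = source[i];
            if (c != '_' && c >= 32 && c <= 127)
            {
                // Same pass-through range as the original loop.
                sb.Append(c);
                continue;
            }

            // Take both halves of a valid surrogate pair together so that a
            // character outside the BMP becomes one 4-byte UTF-8 sequence.
            int length = char.IsSurrogatePair(source, i) ? 2 : 1;
            foreach (byte b in Encoding.UTF8.GetBytes(source.Substring(i, length)))
            {
                sb.AppendFormat("_{0:x2}", b);
            }
            i += length - 1; // skip the low surrogate if one was consumed
        }
        return sb.ToString();
    }
}
```

Calling `ZplText.EscapeNonAscii("Helloµ±")` should give the same "Hello_c2_b5_c2_b1" as the loop above. An unpaired surrogate is still encoded as U+FFFD by `Encoding.UTF8`, which seems acceptable for malformed input.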

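Later addition: the first two tests can be combined, because _ only needs to be replaced by its byte representation to avoid ambiguity about what _ means in the result: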

if (c == '_' || c < 32 || c > 127)
{
    var ba = Encoding.UTF8.GetBytes(new[] { c });
    foreach (byte b in ba)
    {
        sb.AppendFormat("_{0:x2}", b);
    }
}
else
{
    sb.Append(c);
}

Answer 2

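You can try this.

// Note: Encoding.ASCII replaces each non-ASCII character with '?',
// so result is "s?me string" rather than an escaped string.
var bytes = System.Text.Encoding.ASCII.GetBytes("søme string");
string result = System.Text.Encoding.UTF8.GetString(bytes);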

Answer 3

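using System.Text.RegularExpressions;

// Note: this removes every run of non-ASCII characters (giving "sme string");
// it does not replace them with codes.
string s = "søme string";
s = Regex.Replace(s, @"[^\u0000-\u007F]+", string.Empty);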
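If a regex-based approach is preferred, one possible adaptation is to pass a match evaluator instead of `string.Empty`, so that each matched run is replaced by the hex of its UTF-8 bytes. This is only a sketch: the `RegexEscape` class and `Escape` method are hypothetical names, and the pattern is an assumption chosen to mirror the 32..127 range test from Answer 1 plus the underscore.

```csharp
using System;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

public static class RegexEscape
{
    // Hypothetical helper: matches runs of characters outside U+0020..U+007F,
    // or a single '_', and rewrites each match as _xx per UTF-8 byte.
    public static string Escape(string s)
    {
        return Regex.Replace(
            s,
            @"[^\u0020-\u007F]+|_",
            m => string.Concat(
                // Encoding the whole run at once keeps surrogate pairs together.
                Encoding.UTF8.GetBytes(m.Value).Select(b => "_" + b.ToString("x2"))));
    }
}
```

With this, `RegexEscape.Escape("søme string")` should return "s_c3_b8me string".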

Answer 1: score 2, accepted, 2021-07-12 09:34:25

Answer 2: score -1, 2021-07-11 11:09:58

Answer 3: score -2, 2021-07-11 05:24:03
