
PHP, convert UTF-8 to ASCII 8-bit

I'm trying to convert a string from UTF-8 to ASCII 8-bit using the iconv function. The string is meant to be imported into accounting software (some basic instructions parsed according to the SIE standard).

What I'm running now:

iconv("UTF-8", "ASCII", $this->_output)

This works for accounting software #1, but software #2 complains about the encoding. The encoding specified by the standard is: IBM PC 8-bit extended ASCII (Codepage 437).

My question is: which version of ASCII is PHP encoding my string into, and, if it is not the specified one, how can I encode the string according to the standard?

Try this for software #2:

iconv("UTF-8", "CP437", $this->_output);

Extended ASCII is not the same as plain ASCII. The first program may accept plain ASCII, but the second requires extended ASCII, Codepage 437.
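To make the difference concrete, here is a minimal sketch that converts the same UTF-8 text to plain ASCII and to CP437 and prints the resulting bytes. The helper name `utf8ToCp437` and the sample text are assumptions made for illustration; they are not part of the original question.

```php
<?php
// Hypothetical helper: convert UTF-8 text to Codepage 437 with iconv.
function utf8ToCp437(string $utf8): string
{
    $converted = iconv("UTF-8", "CP437", $utf8);
    if ($converted === false) {
        // iconv returns false when a character cannot be converted.
        throw new RuntimeException("Text cannot be represented in CP437");
    }
    return $converted;
}

// Sample text "åäö", written with escapes so the source file encoding does not matter.
$sample = "\u{00E5}\u{00E4}\u{00F6}";

// Plain ASCII is 7-bit and has no slot for these characters, so this
// conversion typically fails (the notice is suppressed with @ here).
$ascii = @iconv("UTF-8", "ASCII", $sample);
echo $ascii === false ? "ASCII: conversion failed\n" : "ASCII: " . bin2hex($ascii) . "\n";

// CP437 stores each of these characters in a single byte above 0x7F.
echo "CP437: ", bin2hex(utf8ToCp437($sample)), "\n"; // CP437: 868494
```

The exact behaviour of the ASCII conversion depends on the iconv implementation PHP was built against, so only the CP437 result is shown as expected output.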

See this link.

I'm looking at this question and the answers posted so far, and I am very disappointed with what I find here, as well as with what I have been able to glean from other sources, such as the PHP documentation, in terms of acceptable or better answers.

I have an input string that is a property of an object. The input is UTF-8 from a database, and I am satisfied that it is well-formed and valid; every indication is that this is true. It originally comes from a database where it is prepared and stored by a third party. I would prefer not to have to change the input string before it is processed by this function. After processing, the string is displayed on a web page with a meta charset of UTF-8, which I have already checked.

The input string has HTML entities which I want to preserve, so the first thing is to decode the HTML entities. If someone has a better idea, this operation could be moved from the start to the end of the function. To my way of thinking, it should not matter in which order this takes place.
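As a side note on the decoding step, PHP has two related functions with different coverage: `htmlspecialchars_decode` only handles the few special-character entities, while `html_entity_decode` handles all named entities. The sketch below compares them on a made-up sample string; which of the two fits depends on the entities actually present in the data, which the post does not show.

```php
<?php
// Sample input containing a special-character entity and two named entities.
$input = "&amp; &eacute; &ldquo;x&rdquo;";

// Only entities such as &amp;, &lt;, &gt; and &quot; are decoded.
$special = htmlspecialchars_decode($input);
echo $special, "\n"; // & &eacute; &ldquo;x&rdquo;

// All named entities are decoded into UTF-8 characters.
$all = html_entity_decode($input, ENT_QUOTES | ENT_HTML5, "UTF-8");
echo $all, "\n"; // & é “x”
```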

So this brings me back to the original question: how should one convert from UTF-8 to ASCII 8-bit using PHP? It really doesn't matter what you answer at this point, because I have already started to go my own way, as is evident from the PHP code below.

Essentially, what I have begun to do is decode UTF-8 programmatically as problems arise. One advantage of this is that I can substitute whatever I choose for each problem as it comes up, but I'd really rather rely on the community.

    function decodedText($langObject, $keyString) {
        // Decode HTML special-character entities such as &amp;, &lt;, &gt; and &quot;.
        $decodeText = htmlspecialchars_decode($langObject->$keyString);

        // Earlier attempts, left disabled:
        //$decodeText = iconv("UTF-8", "ISO-8859-1//IGNORE", $decodeText);
        //$decodeText = iconv("UTF-8", "CP437", $decodeText);

        // Replace problem bytes one at a time (octal escapes).
        $decodeText = str_replace("\204", '"', $decodeText); // quote
        $decodeText = str_replace("\223", '"', $decodeText); // quote
        $decodeText = str_replace("\224", '"', $decodeText); // quote
        $decodeText = str_replace("\302", "", $decodeText); // first byte of a 2-byte UTF-8 sequence

        // Debug logging for one specific key: dump the bytes around offset 592.
        if ("p14p1" == $keyString) {
            error_log("BEFORE:");
            error_log($langObject->$keyString);
            error_log(substr($langObject->$keyString, 592, 26));
            //error_log(mb_ord(substr($langObject->$keyString, 36)));
            error_log(ord(substr($langObject->$keyString, 592, 1)));
            error_log(decbin(ord(substr($langObject->$keyString, 592, 1))));
            error_log(decbin(ord(substr($langObject->$keyString, 593, 1))));
            error_log(decbin(ord(substr($langObject->$keyString, 594, 1))));
            error_log(decbin(ord(substr($langObject->$keyString, 595, 1))));
            error_log(decbin(ord(substr($langObject->$keyString, 596, 1))));
            error_log(decbin(ord(substr($langObject->$keyString, 597, 1))));
            error_log("AFTER:");
            error_log($decodeText);
        }
        return $decodeText;
    }
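One risk with replacing single bytes is that a byte like `\302` (0xC2) or `\204` (0x84) also occurs inside other valid UTF-8 characters, so stripping it everywhere can corrupt them; for example, © is encoded as 0xC2 0xA9. A possible alternative, sketched below, is to replace only the complete two-byte sequences. The function name `replaceC1Quotes` is hypothetical, and the sketch assumes the problem characters are U+0084, U+0093 and U+0094, which is what the bytes in the function above suggest but the post does not confirm.

```php
<?php
// Hypothetical helper: replace whole two-byte sequences instead of single bytes.
function replaceC1Quotes(string $utf8): string
{
    return strtr($utf8, [
        "\xC2\x84" => '"', // U+0084 (octal \302\204)
        "\xC2\x93" => '"', // U+0093 (octal \302\223)
        "\xC2\x94" => '"', // U+0094 (octal \302\224)
    ]);
}

// The copyright sign (0xC2 0xA9) shares its first byte with the sequences
// above, but it is left untouched because only full sequences are matched.
$sample = "\xC2\x93quoted\xC2\x94 \u{00A9} 2024";
echo replaceC1Quotes($sample), "\n"; // "quoted" © 2024
```

`strtr` with an array replaces all keys in a single pass, so one replacement can never create input for another.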

Provide a better answer to the original question, or ignore this, as you wish. It is interesting that this question has been viewed 37,000+ times so far, yet essentially no helpful answers have been given, and the existing answers have a total of 13 upvotes. BTW, CP437 did not work for me.
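One possible reason for a CP437 conversion to fail is that the text contains characters with no CP437 equivalent, such as typographic quotes or dashes. This is only an assumption about the data; the post does not say why the conversion failed. Under that assumption, a minimal sketch is to map such characters to ASCII stand-ins first and convert afterwards. The function name `utf8ToCp437WithFallbacks` and the contents of the fallback table are illustrative choices.

```php
<?php
// Hypothetical helper: substitute characters missing from CP437, then convert.
function utf8ToCp437WithFallbacks(string $utf8): string
{
    // ASCII stand-ins for a few characters that CP437 cannot represent.
    $fallbacks = [
        "\u{201C}" => '"', // left double quotation mark
        "\u{201D}" => '"', // right double quotation mark
        "\u{201E}" => '"', // low double quotation mark
        "\u{2018}" => "'", // left single quotation mark
        "\u{2019}" => "'", // right single quotation mark
        "\u{2013}" => "-", // en dash
        "\u{2014}" => "-", // em dash
    ];
    $prepared = strtr($utf8, $fallbacks);

    $converted = iconv("UTF-8", "CP437", $prepared);
    if ($converted === false) {
        // Some other character outside CP437 is still present.
        throw new RuntimeException("Text still contains characters outside CP437");
    }
    return $converted;
}

// Curly quotes become 0x22 and "å" becomes its CP437 byte 0x86.
echo bin2hex(utf8ToCp437WithFallbacks("\u{201C}\u{00E5}\u{201D}")), "\n"; // 228622
```

The table can be extended one entry at a time as new unconvertible characters turn up, which keeps the substitutions explicit.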
