Correcting incorrectly encoded string (ASCII characters back to UTF-8)
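Here is a sample WiFi SSID that I extracted from an Android "wifi config file" (wpa_supplicant.conf).
I am trying to display all the SSIDs in the file. Most of them are fine, because they are ordinary strings wrapped in quotes, e.g.,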
network={
ssid="Linksys"
...
}
However, some entries just want to be different and special, e.g.,
network={
ssid=e299aa20e6b7a1e5ae9ae69c89e98ca2e589a920e299ab
...
}
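Now the question is: how do I convert this back to a readable string (preferably in JS)? I suspect an encoding error (although it displays correctly on the native device).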
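Before decoding anything, it helps to separate the two kinds of entries described above. The following is only a rough sketch: `extractSsids` is a hypothetical helper name, and it assumes the config text contains just the two forms shown in the question (a quoted string, or a bare run of hex digits). Anything else is labelled "unknown" rather than guessed at.

```javascript
// Sketch only: extractSsids is a hypothetical helper, not part of wpa_supplicant.
// Assumes each SSID sits on its own "ssid=..." line, as in the samples above.
function extractSsids(confText) {
  const ssids = [];
  for (const line of confText.split(/\r?\n/)) {
    // Anchored at line start so that "scan_ssid=1" is not picked up.
    const match = /^\s*ssid=(.*)$/.exec(line);
    if (!match) continue;
    const value = match[1].trim();
    if (value.length >= 2 && value.startsWith('"') && value.endsWith('"')) {
      // Ordinary quoted string: strip the surrounding quotes.
      ssids.push({ kind: "quoted", value: value.slice(1, -1) });
    } else if (value.length % 2 === 0 && /^[0-9a-f]+$/i.test(value)) {
      // Bare, even-length run of hex digits: still needs decoding.
      ssids.push({ kind: "hex", value: value });
    } else {
      ssids.push({ kind: "unknown", value: value });
    }
  }
  return ssids;
}
```

Entries of kind "hex" are the ones that still have to be converted into readable text.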
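Apparently, the string is hex encoded: each pair of hex digits is one byte of the UTF-8 text. With some string manipulation, converting it back to binary, I was able to decode it into readable form.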
function HextoUTF8(txt) {
    // Turn every hex digit into 4 bits, then regroup the bits into 8-bit strings.
    function HexStringToBytes(str) {
        if (str.length % 2) throw TypeError("Not a valid length");
        if (/[^0-9a-f]/i.test(str)) throw TypeError("Not a valid hex string");
        return [].map.call(str, function(e) {
            return ("000" + parseInt(e, 16).toString(2)).slice(-4);
        }).join("").match(/.{8}/g) || [];
    }

    function BytesToUTF8(bytes) {
        var inExpectationMode = false,
            itr = new Iterator(bytes),
            byte,
            // slice() offset of the payload bits in a lead byte, keyed by sequence length
            availableBitsTable = {
                "1": -7,
                "2": -5,
                "3": -4,
                "4": -3
            },
            expectingBitsLeft = 0, // despite the name, counts the bytes left in the sequence
            currentCharacter = "",
            result = "";
        while (byte = itr.next(), !byte.ended) {
            byte = byte.value;
            if (inExpectationMode) {
                // Continuation byte (10xxxxxx): keep the low 6 bits
                currentCharacter += byte.slice(-6);
            } else {
                // First in sequence
                expectingBitsLeft = determineSequenceLength(byte);
                currentCharacter += byte.slice(availableBitsTable[expectingBitsLeft]);
            }
            inExpectationMode = true;
            expectingBitsLeft--;
            if (!expectingBitsLeft) {
                inExpectationMode = false;
                // fromCodePoint (unlike fromCharCode) also handles code points above U+FFFF
                result += String.fromCodePoint(parseInt(currentCharacter, 2));
                currentCharacter = "";
            }
        }
        return result;
    }

    // The leading bits of the first byte give the length of the sequence.
    function determineSequenceLength(byte) {
        if (byte[0] === "0") return 1;
        else if (byte.slice(0, 3) === "110") return 2;
        else if (byte.slice(0, 4) === "1110") return 3;
        else if (byte.slice(0, 5) === "11110") return 4;
        throw RangeError("Not a valid UTF-8 lead byte: " + byte);
    }

    function Iterator(array) {
        // Checking instanceof rather than window keeps this working outside browsers
        if (!(this instanceof Iterator)) throw TypeError("This is a class");
        if (!Array.isArray(array)) throw TypeError("An array is required");
        this.i = -1;
        this.ended = !array.length;
        this.array = function() {
            return array;
        };
    }
    Iterator.prototype.next = function() {
        if (this.ended || ++this.i == this.array().length) {
            this.ended = true;
            return {
                ended: true
            };
        } else {
            return {
                ended: this.ended,
                value: this.array()[this.i]
            };
        }
    };
    return BytesToUTF8(HexStringToBytes(txt));
}
Ideally I should be doing bit manipulation, but anyway,
> HextoUTF8("e299aa20e6b7a1e5ae9ae69c89e98ca2e589a920e299ab");
> "♪ 淡定有錢剩 ♫"
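For comparison, here is roughly what the bit-manipulation approach mentioned above could look like. This is a minimal sketch, not the code from the answer: `hexToUtf8Bitwise` is a made-up name, and it only checks lead and continuation byte patterns (it does not reject overlong encodings or surrogate code points).

```javascript
// Sketch only: hexToUtf8Bitwise is a hypothetical name, not from the answer above.
function hexToUtf8Bitwise(hex) {
  if (hex.length % 2 !== 0 || /[^0-9a-f]/i.test(hex)) {
    throw new TypeError("Expected an even-length string of hex digits");
  }
  // Two hex digits make one byte.
  const bytes = [];
  for (let i = 0; i < hex.length; i += 2) {
    bytes.push(parseInt(hex.slice(i, i + 2), 16));
  }

  let result = "";
  let i = 0;
  while (i < bytes.length) {
    const lead = bytes[i++];
    let codePoint;
    let trailing; // number of continuation bytes that must follow
    if ((lead & 0x80) === 0x00) { codePoint = lead; trailing = 0; }               // 0xxxxxxx
    else if ((lead & 0xe0) === 0xc0) { codePoint = lead & 0x1f; trailing = 1; }   // 110xxxxx
    else if ((lead & 0xf0) === 0xe0) { codePoint = lead & 0x0f; trailing = 2; }   // 1110xxxx
    else if ((lead & 0xf8) === 0xf0) { codePoint = lead & 0x07; trailing = 3; }   // 11110xxx
    else throw new RangeError("Invalid UTF-8 lead byte at index " + (i - 1));

    for (; trailing > 0; trailing--) {
      const next = bytes[i++];
      // Continuation bytes look like 10xxxxxx; a missing byte fails this check too.
      if (next === undefined || (next & 0xc0) !== 0x80) {
        throw new RangeError("Invalid or missing UTF-8 continuation byte");
      }
      codePoint = (codePoint << 6) | (next & 0x3f);
    }
    result += String.fromCodePoint(codePoint);
  }
  return result;
}
```

Calling `hexToUtf8Bitwise("e299aa20e6b7a1e5ae9ae69c89e98ca2e589a920e299ab")` should give the same string as the output shown above. In environments that provide the standard `TextDecoder`, decoding the bytes with `new TextDecoder("utf-8")` is probably the simpler option.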