How to properly output a string in a Windows console with Go?
I have an exe written in Go which prints UTF-8 encoded strings containing special characters.
Since that exe is meant to be used from a console window, its output is mangled because Windows uses the ibm850 encoding (aka code page 850).
How would you make sure the Go exe prints correctly encoded strings for a console window, i.e. prints for instance:
éèïöîôùòèìë
instead of (without any translation to the right charset):
├®├¿├»├Â├«├┤├╣├▓├¿├¼├½
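The mangled line is what appears when each UTF-8 byte is shown as a separate code page 850 character. A minimal sketch of that misreading follows. The function name misreadAsCP850 and the lookup table are hypothetical, and the table is deliberately partial: it only covers the bytes that occur in the sample string, not the full code page.

```go
package main

import "fmt"

// cp850Glyph is a partial, hand-written table (an assumption made for this
// sketch): it only lists the code page 850 glyphs for the bytes found in the
// UTF-8 encoding of the sample string.
var cp850Glyph = map[byte]rune{
	0xC3: '├', 0xA9: '®', 0xA8: '¿', 0xAF: '»', 0xB6: 'Â',
	0xAE: '«', 0xB4: '┤', 0xB9: '╣', 0xB2: '▓', 0xAC: '¼', 0xAB: '½',
}

// misreadAsCP850 renders s the way a cp850 console would: one glyph per byte.
func misreadAsCP850(s string) string {
	out := make([]rune, 0, len(s))
	for i := 0; i < len(s); i++ {
		b := s[i]
		if b < 0x80 {
			out = append(out, rune(b)) // ASCII is the same in both encodings
		} else if g, ok := cp850Glyph[b]; ok {
			out = append(out, g)
		} else {
			out = append(out, '?') // byte not covered by the partial table
		}
	}
	return string(out)
}

func main() {
	fmt.Println(misreadAsCP850("éèïöîôùòèìë"))
}
```

Each accented letter takes two bytes in UTF-8, starting with 0xC3, which is why every pair in the garbled output begins with the same box-drawing character.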
// Alert: This is Windows-specific, uses undocumented methods, does not
// handle stdout redirection, does not check for errors, etc.
// Use at your own risk.
// Tested with Go 1.0.2-windows-amd64.
package main

import (
	"syscall"
	"unicode/utf16"
	"unsafe"
)

var modkernel32 = syscall.NewLazyDLL("kernel32.dll")
var procWriteConsoleW = modkernel32.NewProc("WriteConsoleW")

// consolePrintString converts a UTF-8 string to UTF-16 and writes it
// directly to the console with WriteConsoleW.
func consolePrintString(strUtf8 string) {
	var charsWritten uint32 // receives the number of UTF-16 units written
	strUtf16 := utf16.Encode([]rune(strUtf8))
	if len(strUtf16) < 1 {
		return
	}
	syscall.Syscall6(procWriteConsoleW.Addr(), 5,
		uintptr(syscall.Stdout),
		uintptr(unsafe.Pointer(&strUtf16[0])),
		uintptr(len(strUtf16)),
		uintptr(unsafe.Pointer(&charsWritten)),
		uintptr(0),
		0)
}

func main() {
	consolePrintString("Hello ☺\n")
	consolePrintString("éèïöîôùòèìë\n")
}
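WriteConsoleW takes a count of UTF-16 code units, which is presumably why the code above passes len(strUtf16) rather than the byte length of the Go string. A small, portable sketch of the difference (the helper name utf16Units is made up for this illustration):

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// utf16Units returns how many UTF-16 code units s occupies once converted.
func utf16Units(s string) int {
	return len(utf16.Encode([]rune(s)))
}

func main() {
	for _, s := range []string{"Hello", "é", "☺", "😀"} {
		fmt.Printf("%q: %d bytes in UTF-8, %d UTF-16 code units\n",
			s, len(s), utf16Units(s))
	}
}
```

Characters outside the Basic Multilingual Plane, such as the last sample, need a surrogate pair and therefore count as two units.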
The online book "Network programming with Go" (CC BY-NC-SA 3.0) has a chapter on charsets (Managing character sets and encodings), in which Jan Newmarch details the conversion of one charset to another. But it seems cumbersome.
Here is a solution (I might have missed a much simpler one), using the library go-charset (from Roger Peppe).
I translate a utf-8 string to an ibm850-encoded one, allowing me to print in a DOS window:
éèïöîôùòèìë
The translation function is detailed below:
package main

import (
	"bytes"
	"fmt"
	"io"
	"log"
	"strings"

	"code.google.com/p/go-charset/charset"
	_ "code.google.com/p/go-charset/data"
)

// translate runs the string through the given translator and returns the result.
func translate(tr charset.Translator, in string) (string, error) {
	var buf bytes.Buffer
	r := charset.NewTranslatingReader(strings.NewReader(in), tr)
	_, err := io.Copy(&buf, r)
	if err != nil {
		return "", err
	}
	return buf.String(), nil
}

// Utf2dos converts a UTF-8 string to ibm850 (code page 850).
func Utf2dos(in string) string {
	dosCharset := "ibm850"
	cs := charset.Info(dosCharset)
	if cs == nil {
		log.Fatalf("no info found for %q", dosCharset)
	}
	fromtr, err := charset.TranslatorTo(dosCharset)
	if err != nil {
		log.Fatalf("error making translator to %q: %v", dosCharset, err)
	}
	out, err := translate(fromtr, in)
	if err != nil {
		log.Fatalf("error translating to %q: %v", dosCharset, err)
	}
	return out
}

func main() {
	test := "éèïöîôùòèìë"
	fmt.Println("utf-8:\n", test)
	fmt.Println("ibm850:\n", Utf2dos(test))
}
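To make clear what such a translation does at the byte level, here is a standard-library-only sketch. It is not how go-charset is implemented: the names toCP850 and encodeCP850 are hypothetical, and the table is a hand-written partial mapping that only covers the accented letters of the test string.

```go
package main

import "fmt"

// toCP850 is a partial, hand-written table (an assumption made for this
// sketch): code page 850 byte values for the letters in the test string.
var toCP850 = map[rune]byte{
	'é': 0x82, 'è': 0x8A, 'ï': 0x8B, 'ö': 0x94, 'î': 0x8C,
	'ô': 0x93, 'ù': 0x97, 'ò': 0x95, 'ì': 0x8D, 'ë': 0x89,
}

// encodeCP850 maps each rune of a UTF-8 string to a single cp850 byte.
// Runes missing from the partial table are replaced with '?'.
func encodeCP850(s string) []byte {
	out := make([]byte, 0, len(s))
	for _, r := range s {
		if r < 0x80 {
			out = append(out, byte(r)) // ASCII is unchanged
		} else if b, ok := toCP850[r]; ok {
			out = append(out, b)
		} else {
			out = append(out, '?')
		}
	}
	return out
}

func main() {
	// Print the encoded bytes in hex; writing them raw to a cp850 console
	// would display the accented letters.
	fmt.Printf("% x\n", encodeCP850("éèïöîôùòèìë"))
}
```

A real converter needs the complete table for the code page, which is what a charset library provides.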
Since 2016 (and still in 2017), you can consider golang.org/x/text, which comes with an encoding/charmap package covering the ISO-8859 family as well as the Windows 1252 character set.
See "Go Quickly - Converting Character Encodings In Golang":
r := charmap.ISO8859_1.NewDecoder().Reader(f)
io.Copy(out, r)
That is an extract of an example opening an ISO-8859-1 source text (my_isotext.txt), creating a destination file (my_utf.txt), and copying the first to the second.
But to decode from ISO-8859-1 to UTF-8, we wrap the original file reader (f) with a decoder.
I just tested (pseudo-code for illustration):
package main

import (
	"fmt"

	"golang.org/x/text/encoding/charmap"
)

func main() {
	t := "string composed of character in cp 850"
	// Decode the cp850 bytes into a UTF-8 string.
	d := charmap.CodePage850.NewDecoder()
	st, err := d.String(t)
	if err != nil {
		panic(err)
	}
	fmt.Println(st)
}
The result is a string readable in a Windows CMD.
See more in this Nov. 2018 reddit thread.
It is something that Go still can't do out of the box - see http://code.google.com/p/go/issues/detail?id=3376#c6 .
Alex