I am currently trying to read a US-ASCII file in Go, but every time I do, every special character, such as Ä Ö Ü ß, gets replaced with a ?, or in my database with the special sign ?.
Is there anything I could do to prevent it?
Here is how I read my file:
file, err := os.Open(path)
if err != nil {
	return err
}
var lines []string
r := bufio.NewReader(file)
for {
	line, err := r.ReadBytes('\n')
	if err != nil {
		break
	}
	lines = append(lines, string(line))
}
fmt.Println(strings.Join(lines, ""))
index.Content = strings.Join(lines, "")
Since the letters Ä Ö Ü ß don't exist in US-ASCII, I would make an educated guess that you are actually dealing with the Latin-1 (ISO-8859-1) encoding.
Converting from Latin-1 can be done like this:
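// Replaces lines = append(lines, string(line)) in the read loop.
runes := make([]rune, len(line))
for i, b := range line {
	runes[i] = rune(b) // each Latin-1 byte value is also its Unicode code point
}
lines = append(lines, string(runes))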
Edit:
The example is not optimized, but it shows how a Latin-1 byte can be stored in a rune, since Latin-1 values correspond directly to Unicode code points. The actual encoding into UTF-8 is then done when converting []rune to string.