Reading a file with "US-ASCII" encoding in Haskell: hGetContents: invalid argument (invalid byte sequence)

Question

I'm writing a parser in Haskell, but I can't get past this error. Here is my code:

main = do
  arguments    <- getArgs
  let fileName = head arguments
  fileContents <- readFile fileName
  converter    <- open "UTF-8" Nothing
  let titleLength           = length fileName
      titleWithoutExtension = take (titleLength - 4) fileName
      allNonEmptyLines      = unlines $ tail $ filter (/= "") $ lines fileContents

When I try to read a file with "US-ASCII" encoding, I get the well-known error hGetContents: invalid argument (invalid byte sequence). I tried replacing "UTF-8" in my code with "US-ASCII", but the error persists. Is there a way to read these files, or a general way to handle file encoding problems?

Answer 1

You should use hSetEncoding to configure the file handle for a specific text encoding, e.g.:

import System.Environment
import System.IO

main = do
  (path : _) <- getArgs
  h <- openFile path ReadMode
  hSetEncoding h latin1
  contents <- hGetContents h
  -- no need to close h explicitly: hGetContents semi-closes it, and it is
  -- closed once the contents have been read to the end
  print (length contents)

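If your file contains non-ASCII characters and is not UTF-8 encoded, then latin1 is a good bet, although it is not the only possibility. Because latin1 maps every byte to exactly one character, decoding with it cannot fail, though the resulting characters are only correct if the file really is Latin-1.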
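
If latin1 turns out to be the wrong guess, the same approach works with any other encoding. The following is a minimal sketch, not part of the original answer: readFileWith is a hypothetical helper name, and the choice of "UTF-8//IGNORE" (decode as UTF-8 and drop bytes that are not valid) is only an assumption about what might suit your data. It relies on mkTextEncoding from System.IO, which accepts an encoding name with an optional suffix such as //IGNORE or //ROUNDTRIP.

```haskell
import Control.Exception (evaluate)
import System.Environment (getArgs)
import System.IO

-- Hypothetical helper (not from the original answer): read a whole file
-- strictly, decoding it with the given text encoding.
readFileWith :: TextEncoding -> FilePath -> IO String
readFileWith enc path =
  withFile path ReadMode $ \h -> do
    hSetEncoding h enc
    contents <- hGetContents h
    -- hGetContents is lazy, so force the whole string before withFile
    -- closes the handle.
    _ <- evaluate (length contents)
    return contents

main :: IO ()
main = do
  args <- getArgs
  case args of
    (path : _) -> do
      -- Assumption: the file is mostly UTF-8 and stray invalid bytes may
      -- be discarded. Swap in latin1 or another encoding as needed.
      enc <- mkTextEncoding "UTF-8//IGNORE"
      contents <- readFileWith enc path
      print (length contents)
    [] -> hPutStrLn stderr "usage: pass the file to read as the first argument"
```

Calling readFileWith latin1 path instead reproduces the behaviour of the answer's code. Dropping bytes with //IGNORE loses information, so it is only appropriate when the invalid bytes do not matter to the parser.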

Answer 1: 5, accepted, 2015-11-06 21:02:12
