
How to read UTF-16 encoded StdIn in PowerShell

I am trying to pass (a large number of [1]) strings from a native Windows host application (C++/WinAPI) to a PowerShell script, which the host application launches using CreateProcess.

I use an anonymous pipe in STARTUPINFO::hStdInput as the IPC mechanism. The data written to the pipe consists of lines of UTF-16LE strings [2]. What is printed by a naive PowerShell script

foreach ($line in $input) {
    Write-Host $line
}

however, looks as though the data from StdIn is being interpreted in an ANSI code page (each UTF-16 code unit from the input shows up as a pair of letters in the output).
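The symptom can be reproduced without PowerShell. The following is a minimal sketch, not the host application's actual code: `to_utf16le` and `decode_single_byte` are hypothetical helper names, and Latin-1 stands in for "an ANSI code page" as an assumption, because it maps every byte N directly to U+00NN.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Serialize UTF-16 code units as little-endian bytes, the way the host
// would write them to the pipe.
std::vector<std::uint8_t> to_utf16le(const std::u16string& text) {
    std::vector<std::uint8_t> bytes;
    bytes.reserve(text.size() * 2);
    for (char16_t unit : text) {
        bytes.push_back(static_cast<std::uint8_t>(unit & 0xFF));         // low byte first
        bytes.push_back(static_cast<std::uint8_t>((unit >> 8) & 0xFF));  // then high byte
    }
    assert(bytes.size() == text.size() * 2);  // every code unit becomes two bytes
    return bytes;
}

// Decode bytes the way a single-byte code page would: one character per byte.
// Assumption: Latin-1 is used as a stand-in for the real ANSI code page.
std::u16string decode_single_byte(const std::vector<std::uint8_t>& bytes) {
    std::u16string out;
    out.reserve(bytes.size());
    for (std::uint8_t b : bytes) {
        out.push_back(static_cast<char16_t>(b));
    }
    return out;
}
```

For example, U+4E2D is written to the pipe as the bytes 0x2D 0x4E, and a single-byte decoder turns those into the two characters `-N`, which matches the "pair of letters per code unit" output described above.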

How can I make PowerShell recognize the data from StdIn as UTF-16?

I have already tried to

  • prepend a UTF-16 BOM to the data on the pipe
  • play with PowerShell's $InputEncoding, $OutputEncoding and .NET's [Console]::InputEncoding

to no avail. Yes, I could write a large text file first and then read it in PowerShell, but I would rather not do this.

[1] This is why I would like to use a pipe and leverage the stream processing capabilities of PowerShell.
[2] Translating the data to a non-Unicode code page is not an option.

Just to finally clean up this old question: setting up the .NET console input encoding (which is what PowerShell builds upon) correctly is a pretty nontrivial issue. I finally worked around the problem, because I didn't want to burden the PowerShell script developers with the input encoding setup. So I ended up

  • encoding the data as Common Language Infrastructure objects in CLIXML
  • prefixing the stream with a "#< CLIXML\r\n" marker to declare the format to PowerShell
  • and finally (*cringe*) XML-escaping every character in the XML document outside the ASCII range, to completely avoid any input encoding ambiguities

The final point turned out to be necessary, because the CLIXML handling happens only after the text has gone through the fragile console input decoding process.
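The three steps above could look roughly like the following on the host side. This is a sketch under stated assumptions, not the original implementation: the function names are hypothetical, and the `Objs` envelope (version attribute, namespace, one `S` element per string) is assumed to match what PowerShell's own CLIXML serialization emits for plain strings; the answer itself only specifies the marker line and the escaping. Control characters are passed through unchanged here, and lone surrogates are replaced with U+FFFD.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Append cp as a hexadecimal XML character reference, e.g. &#x4E2D;
void append_char_ref(std::string& out, std::uint32_t cp) {
    assert(cp <= 0x10FFFF);  // must be a valid Unicode code point
    char buf[16];
    std::snprintf(buf, sizeof buf, "&#x%X;", static_cast<unsigned>(cp));
    out += buf;
}

// Escape XML metacharacters and turn every non-ASCII character into a
// numeric character reference, so the result is pure 7-bit ASCII.
std::string xml_escape_to_ascii(const std::u16string& text) {
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        std::uint32_t cp = text[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {  // high surrogate
            if (i + 1 < text.size() && text[i + 1] >= 0xDC00 && text[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (text[i + 1] - 0xDC00);
                ++i;  // consume the low surrogate as well
            } else {
                cp = 0xFFFD;  // unpaired high surrogate
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // unpaired low surrogate
        }
        switch (cp) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;"; break;
            case '>': out += "&gt;"; break;
            default:
                if (cp < 0x80) out.push_back(static_cast<char>(cp));
                else append_char_ref(out, cp);
        }
    }
    return out;
}

// Build the byte stream for the pipe: marker line, then the XML document.
// Assumption: the Objs/S envelope below is the CLIXML shape for strings.
std::string build_clixml(const std::vector<std::u16string>& lines) {
    std::string doc = "#< CLIXML\r\n";
    doc += "<Objs Version=\"1.1.0.1\" "
           "xmlns=\"http://schemas.microsoft.com/powershell/2004/04\">";
    for (const std::u16string& line : lines) {
        doc += "<S>";
        doc += xml_escape_to_ascii(line);
        doc += "</S>";
    }
    doc += "</Objs>";
    return doc;
}
```

Because the output contains only ASCII bytes, it decodes identically under any ASCII-compatible console input encoding, which is the point of the last bullet. The host would write the returned string to the pipe as-is.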


