Is there a way to salvage these broken bash escape sequence artifacts from my Node.js spawned shell stdout text?

I wrote a class for spawning and interacting with a shell (within an Electron app) like so:

const { spawn } = require('child_process')
class Shell {
    constructor() {
        this.process = spawn('/bin/bash', []);
        this.process.stdout.on('data', (data) => {
            const out = data.toString()
            console.log('stdout:', out)
            if (this.res) {
                this.res(out)
            }
        });
        // Arrow function so that `this` is the Shell instance, as in the stdout handler.
        this.process.stderr.on('data', (data) => {
            const err = data.toString()
            console.log('stderr:', err)
            if (this.rej) this.rej(err)
        });
    }
    send(command, throwErr = false) {
        return new Promise((resolve, reject) => {
            this.res = resolve
            if (throwErr) this.rej = reject
            else this.rej = resolve
            this.process.stdin.write(command + '\n')
        })
    }
}

And the output I'm getting looks like this:

stdout: ]0;student@linux-opstation-jyrf: ~[01;32mstudent@linux-opstation-jyrf[00m:[01;34m~[00m$

Here's a version where I stringify the output with JSON to see the escape chars: stdout: "\u001b]0;student@linux-opstation-jyrf: ~\u0007\u001b[01;32mstudent@linux-opstation-jyrf\u001b[00m:\u001b[01;34m~\u001b[00m$ ssh -t -t -oStrictHostKeyChecking=no student@10.5\r50.30.231\r\n"

I realize these are artifacts from bash escape sequences used for formatting, and I'm having trouble figuring out how to get rid of them, especially since the escape characters themselves aren't printed.

Edit: So I wrote the raw stdout Buffer (data in the code) to a binary file:

fs.createWriteStream(path, { encoding: 'binary'}).write(data);

and found that no data seems to be lost in the .toString() call (I think?), so I'm left scratching my head about where the rest of the stdout markup is getting truncated.

00000000: 1b5d 303b 7374 7564 656e 7440 6c69 6e75  .]0;student@linu
00000010: 782d 6f70 7374 6174 696f 6e2d 6a79 7266  x-opstation-jyrf
00000020: 3a20 7e07 1b5b 3031 3b33 326d 7374 7564  : ~..[01;32mstud
00000030: 656e 7440 6c69 6e75 782d 6f70 7374 6174  ent@linux-opstat
00000040: 696f 6e2d 6a79 7266 1b5b 3030 6d3a 1b5b  ion-jyrf.[00m:.[
00000050: 3031 3b33 346d 7e1b 5b30 306d 2420 7373  01;34m~.[00m$ ss
00000060: 6820 2d74 202d 7420 2d6f 5374 7269 6374  h -t -t -oStrict
00000070: 486f 7374 4b65 7943 6865 636b 696e 673d  HostKeyChecking=
00000080: 6e6f 2073 7475 6465 6e74 4031 302e 350d  no student@10.5.
00000090: 3530 2e33 302e 3233 310d 0a              50.30.231..

But maybe I'm not getting the encoding right when I save the file, because I think the raw buffer should(?) contain text like \u001b[00m:\u001b[01;34m, and the \uXXXX hex chars aren't there. Edit: Ah, \uXXXX is apparently a Unicode escape. I'm still figuring out how to save that buffer properly as binary; I think the Unicode is being lost with {encoding: 'binary'} set as the option. Or else the hex dump only shows UTF-8, which sounds more likely.
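For reference, the hex dump and the JSON-stringified output are two renderings of the same bytes: byte 0x1b (ESC) is what JSON.stringify prints as \u001b, and byte 0x07 (BEL) is what it prints as \u0007. A minimal sketch, using the first few bytes of the dump above plus the closing ~ and BEL (the variable names are only illustrative):

```javascript
// A few bytes taken from the hex dump above: ESC ] 0 ; ~ BEL
const sample = Buffer.from([0x1b, 0x5d, 0x30, 0x3b, 0x7e, 0x07]);

// Raw bytes rendered as hex, which is what a hex dump shows.
const asHex = sample.toString('hex');

// The same bytes decoded as UTF-8 and then JSON-escaped,
// which is what JSON.stringify(data.toString()) shows.
const asJson = JSON.stringify(sample.toString('utf8'));

console.log(asHex);  // prints 1b5d303b7e07
console.log(asJson); // prints "\u001b]0;~\u0007"
```

So nothing is missing from the binary file; the \uXXXX text only exists in the JSON rendering of the control characters.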

Supposing that these artifacts are indeed coming from a customized prompt string, the easiest thing to do would be to change the prompt string. There are several ways to do that, depending somewhat on where the prompt string is being set.

If you are getting a prompt at all, then bash is running as an interactive shell. With the command you are using to launch it, it will not be running as a login shell. The relevant initialization steps are thus these:

When an interactive shell that is not a login shell is started, bash reads and executes commands from ~/.bashrc, if that file exists. This may be inhibited by using the --norc option. The --rcfile file option will force bash to read and execute commands from file instead of ~/.bashrc.

(Bash manual page)

So, you could pass the --norc option to bash to suppress reading any shell initialization file, thereby getting the default prompt (and also the default everything else). But that environment might be too sparse for your needs, so as an alternative, you could create a purpose-built shell configuration file that sets up the exact configuration you want, and use the --rcfile option to make the bash instance started by your program use it. This is probably your best option, but it might take some work to set up the way you need.
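On the Node.js side, either option is just an extra argument in the array passed to spawn. The following is a minimal sketch, not a drop-in replacement for the class in the question: buildBashArgs and spawnShell are hypothetical helper names, the rcfile path is whatever file you create for the purpose, and both options only matter if bash actually reads an rc file, that is, if it runs interactively.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: build the bash argument list.
// rcfile: path to a purpose-built init file, or null to skip init files entirely.
function buildBashArgs(rcfile = null) {
    return rcfile ? ['--rcfile', rcfile] : ['--norc'];
}

// Hypothetical helper: start bash with the chosen initialization behaviour.
function spawnShell(rcfile = null) {
    return spawn('/bin/bash', buildBashArgs(rcfile));
}
```

In the Shell constructor, this.process = spawnShell() would then request the default prompt, and this.process = spawnShell(pathToYourRcFile) would use the custom configuration.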

As a quicker and dirtier alternative, you could modify the relevant ~/.bashrc to change the prompt strings (remove the definitions of PS1 and PS2, or else redefine them to their defaults: PS1='\s-\v\$ ' PS2='> '). You could also combine the previous two approaches with a custom config file that reads the default one and then overrides the prompt strings:

electron.bashrc

. ~/.bashrc
PS1='\s-\v\$ '
PS2='> '

This is a hacky, ugly, quick-n-dirty solution. I welcome more elegant approaches:

const { spawn } = require('child_process')
class Shell {
    constructor() {
        // I just threw a bunch of output into a regex tester 
        // and wrote `|` joined matches until all markup in my
        // sample input was detected. That's all there is to it.
        // This will almost certainly not work across various
        // machines depending on how the markup is structured.
        const re = /\\u[0-9 a-f]{4}\[\d\d;\d\dm|\\u[0-9 a-f]{4}]0;|\\u[0-9 a-f]{4}\[\d\dm:\\u[0-9 a-f]{4}\[\d\d;\d\dm|\\u[0-9 a-f]{4}\[\d\dm|\\u[0-9 a-f]{4}/g
        this.process = spawn('/bin/bash', []);

        this.process.stdout.on('data', (data) => {
            const out = data.toString()
            const stringified = JSON.stringify(out)
            console.log('stdout:', stringified)
            const trimmed = stringified.replace(re, "")
            .split('"').join('')
            .split('\\r').join('')
            .split('\\n').join('')
            console.log('parsed stdout:', trimmed)
            if (this.res) {
                this.res(out)
            }
        });
        // Arrow function so that `this` is the Shell instance, as in the stdout handler.
        this.process.stderr.on('data', (data) => {
            const err = data.toString()
            console.log('stderr:', err)
            if (this.rej) this.rej(err)
        });
    }
    send(command, throwErr = false) {
        return new Promise((resolve, reject) => {
            this.res = resolve
            if (throwErr) this.rej = reject
            else this.rej = resolve
            this.process.stdin.write(command + '\n')
        })
    }
}
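One possibly more general variant is to match the escape sequences on the raw string rather than on its JSON-stringified form. The sketch below is an assumption-laden alternative, not the approach used above: stripEscapes is a hypothetical helper, and it only covers CSI sequences (ESC [ ... final byte, such as the colour codes) and OSC sequences (ESC ] ... terminated by BEL or ESC \, such as the window title). Other escape types and the stray carriage returns are left untouched.

```javascript
// CSI sequences: ESC [ , parameter bytes (0x30-0x3F), intermediate bytes
// (0x20-0x2F), then one final byte (0x40-0x7E). Example: ESC [ 01;32 m
const CSI = /\x1b\[[0-?]*[ -/]*[@-~]/g;

// OSC sequences: ESC ] , payload, terminated by BEL or by ESC \.
// Example: ESC ] 0;window title BEL
const OSC = /\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)/g;

// Hypothetical helper: remove OSC and CSI sequences from a chunk of output.
function stripEscapes(text) {
    return text.replace(OSC, '').replace(CSI, '');
}

// The prompt from the hex dump in the question.
const sample = '\x1b]0;student@linux-opstation-jyrf: ~\x07' +
    '\x1b[01;32mstudent@linux-opstation-jyrf\x1b[00m:\x1b[01;34m~\x1b[00m$ ';

console.log(JSON.stringify(stripEscapes(sample)));
// prints "student@linux-opstation-jyrf:~$ "
```

Note that a sequence split across two 'data' events would not be matched by this sketch; buffering the output before stripping would be needed for that case.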
