In Linux, how do I strip all the high bits from an ASCII file but keep the rest of the bits? Like ISTRIP, but for files

I have some old WordStar files that I want to be able to read. I don't care about the formatting; I just want the ASCII text. WordStar had the nasty habit of setting the high bit to mean certain things (such as the end of a word) while really using only the bottom 7 bits for character representation.

I know I could write a program to do this, but can't the shell do it with tr or sed or something similar?

There is also an ISTRIP setting on communication devices that does this, but I don't know how to apply it to a file.

I want to read a character, do a bitwise AND of its value with octal 177 (0x7F), and then write the character out.

Yes, tr can do this:

LC_ALL=C tr '\200-\377' '\000-\177'

Here's an example:

$ printf '\xE8\xE5\xEC\xEC\xEF world' | LC_ALL=C tr '\200-\377' '\000-\177'
hello world
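
To run the same mapping over a file instead of a pipe, redirecting the file into tr should be enough. The following is only a sketch: the function name strip_high_bits and the file names letter.ws and letter.txt are made up for illustration and do not come from the question.

```shell
# Hypothetical helper: clears the high bit (bit 7) of every byte on stdin
# and writes the result to stdout.
strip_high_bits() {
    # LC_ALL=C makes tr work on single bytes rather than multibyte characters.
    # Bytes 0200-0377 map one-to-one onto 0000-0177; bytes below 0200 pass through.
    LC_ALL=C tr '\200-\377' '\000-\177'
}

# Example use (file names are assumptions):
#   strip_high_bits < letter.ws > letter.txt
```

Both ranges contain 128 bytes, so byte 0200+n is mapped to byte n, which is the same as ANDing each byte with octal 177. Control characters already present in the file are passed through unchanged.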
