How to use unicode in sed?

I want to convert a txt file to html using sed.

However, to match html syntax, I need to include tags (and thus < and >). When I use these characters in my sed expression, sed thinks I'm specifying the source or target file, even if I escape them with \. I keep getting the message "The system cannot find the file specified".

How can I avoid this? Can I somehow use the unicode number?

Source file: input.txt

Content:

Hello world!

Desired target file: output.htm

Content:

<html><body>Hello world!</body></html>

sed command that doesn't work:

sed -r 's#(.*)#\<html\>\<body\>\1\<\/body\>\<\/html\>#g' <input.txt >output.htm

With the shell's printf command:

printf "<html><body>%s</body></html>\n" "$(< input.txt)" > output.htm

The output.htm contents:

<html><body>Hello world!</body></html>

If you still need a sed approach for some reason:

echo -e "<html><body>\n</body></html>" | sed '1 r input.txt' > output.htm
  • 1 r input.txt - the r command reads input.txt and inserts its contents after the 1st line of the HTML passed in through the pipe (lines are delimited by \n)

The output.htm contents:

<html><body>
Hello world!
</body></html>
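
If the single-line output from the question is what you need, a plain substitution may also do. This is only a minimal sketch, assuming a POSIX-style shell such as bash, where single quotes keep < and > from being read as redirection, so no escaping and no unicode numbers are needed. The first line just recreates the sample input.txt from the question.

```shell
# Recreate the sample input from the question.
printf 'Hello world!\n' > input.txt

# .* matches the whole line; & in the replacement stands for that match.
# Inside single quotes, < and > are literal characters for the shell.
sed 's#.*#<html><body>&</body></html>#' input.txt > output.htm

cat output.htm
```

Note that this wraps every line of the input separately, so it only gives the desired result when input.txt has a single line.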

You could keep it simpler, as follows.

echo "<html><body>" && cat Input_file && echo "</body></html>"

Output will be as follows.

<html><body>
Hello world!
</body></html>
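
The command above prints to the terminal. To write the result to output.htm instead, the three commands can be grouped so that one redirection captures all of them. This is a sketch, assuming a POSIX-style shell and using input.txt from the question in place of Input_file.

```shell
# Recreate the sample input from the question.
printf 'Hello world!\n' > input.txt

# Group the commands in braces so a single redirection
# collects the output of all three into output.htm.
{
  echo "<html><body>"
  cat input.txt
  echo "</body></html>"
} > output.htm

cat output.htm
```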
