Bash shell script: how to replace characters at specific byte offsets

Question

I'm looking to replace characters at specific byte offsets.

Here's what is provided: an input file that is simple ASCII text, and an array within a Bash shell script in which each element is a numerical byte-offset value.

The goal: take the input file and, at each of the byte offsets, replace the character there with an asterisk.

So essentially the idea I have in mind is to go through the file byte by byte and, if the current byte offset matches an element of the offsets array, replace that byte with an asterisk.

This post seems to indicate that the dd command would be a good candidate for this, but I can't work out how to perform the replacement multiple times on the input file.

The input file looks like this:

00000
00000
00000

The array of offsets looks like this:

offsetsArray=("2" "8" "9" "15")

The desired output file looks like this:

0*000
0**00
00*00

Any help you could provide is most appreciated. Thank you!

Answer 1

Please check my comment about the newline offset. Assuming this is correct (note I have changed your offset array), then I think this should work for you:

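#!/bin/bash

# With -d '' the delimiter is NUL, so read consumes stdin up to end of input.
# No variable name is given, so the data lands unmodified in $REPLY.
read -r -d ''
offsetsArray=("2" "8" "9" "15")   # one-based offsets; newlines count too
txt="${REPLY}"
for i in "${offsetsArray[@]}"; do
    # Keep the first i-1 characters, add '*', then resume at zero-based index i.
    txt="${txt:0:$i-1}*${txt:$i}"
done
printf "%s" "$txt"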

Explanation:

read -d '' reads the whole input (the redirected file) in one go into the $REPLY variable. If you have large files, this can run you out of memory.
We then loop through the offsets array, one element at a time. For each offset i we take the first i-1 characters of the string, append a * character, then append the remaining characters starting at index i. This is done with bash parameter expansion. Note that while your offsets are one-based, bash strings use zero-based indexing.

In use:

$ ./replacechars.sh < input.txt
0*000
0**00
00*00
$
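To make the index arithmetic from the explanation easier to check in isolation, here is a minimal sketch that wraps the same parameter expansion in a function. The name mask_at and the range check are illustrative additions, not part of the answer's script; offsets are assumed to be one-based and the text is assumed to be plain ASCII.

```shell
#!/bin/bash

# Hypothetical helper: mask_at STRING OFFSET...
# Replaces the character at each one-based OFFSET with '*'.
mask_at() {
    local txt=$1; shift
    local i
    for i in "$@"; do
        # Ignore offsets that fall outside the string (an added safeguard).
        (( i >= 1 && i <= ${#txt} )) || continue
        # First i-1 characters, then '*', then everything from index i on.
        txt="${txt:0:i-1}*${txt:i}"
    done
    printf '%s' "$txt"
}

mask_at "abcdef" 1 3 6   # prints *b*de*
echo
```

Offset 1 keeps zero leading characters and offset 6 leaves an empty tail, which shows that both ends of the string are handled by the same expression.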

Caveat:

This is not a very efficient solution, as it causes the string containing the whole file to be copied for every offset. If you have large files and/or a large number of offsets, then this will run slowly. If you need something faster, then another language that allows modification of individual characters in a string would be much better.
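If staying in bash is a requirement, one possible way to reduce the repeated copying is to build the output in a single left-to-right pass. The sketch below is untested for performance and rests on assumptions: the function name mask_sorted is hypothetical, and the offsets must be one-based, strictly ascending, and within the length of the string.

```shell
#!/bin/bash

# Hypothetical helper: mask_sorted STRING OFFSET...
# Assumes OFFSETs are one-based, strictly ascending and in range.
mask_sorted() {
    local txt=$1; shift
    local out="" prev=0 i
    for i in "$@"; do
        # Copy the untouched run between the previous offset and this one,
        # then emit '*' in place of the character at offset i.
        out+="${txt:prev:i-1-prev}*"
        prev=$i
    done
    # Append whatever follows the last replaced character.
    printf '%s' "$out${txt:prev}"
}

mask_sorted "abcdef" 1 3 6   # prints *b*de*
echo
```

Each segment of the input is copied once into out, rather than the whole string being rebuilt for every offset. Whether this is noticeably faster in practice depends on how bash handles string appends, so treat it as something to measure rather than a guaranteed improvement.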

Answer 2

The usage of dd can be a bit confusing at times, but it's not that hard:

outfile="test.txt"

# create some test data
echo -n 0123456789abcde > "$outfile"

offsetsArray=("2" "7" "8" "13")
for offset in "${offsetsArray[@]}"; do
    dd bs=1 count=1 seek="$offset" conv=notrunc of="$outfile" <<< '*'
done

cat "$outfile"

The important part of this example is conv=notrunc; without it, dd truncates the file to the length of the blocks it seeks over. bs=1 specifies that you want to work with blocks of size 1, and seek specifies the offset, in blocks counted from zero, at which to start writing count blocks. Because count=1, only the * is written; the newline that the here-string appends is never copied.

The above produces 01*3456**9abc*e
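Since seek counts from zero while the offsets in the question are one-based, applying this approach to the question's data needs an adjustment of one. The following is a minimal sketch under that assumption; the function name star_file_at and the temporary file are illustrative, and the file is modified in place.

```shell
#!/bin/bash

# Hypothetical helper: star_file_at FILE OFFSET...
# Overwrites the byte at each one-based OFFSET of FILE with '*'.
star_file_at() {
    local file=$1; shift
    local offset
    for offset in "$@"; do
        # seek is zero-based, so one-based offset N maps to block N-1.
        # dd's transfer statistics on stderr are discarded.
        printf '*' |
            dd of="$file" bs=1 count=1 seek="$((offset - 1))" conv=notrunc 2>/dev/null
    done
}

# Demonstration on a scratch copy of the question's input.
tmp=$(mktemp)
printf '00000\n00000\n00000\n' > "$tmp"
star_file_at "$tmp" 2 8 9 15
cat "$tmp"
rm -f "$tmp"
```

With the question's offsets this prints the three lines 0*000, 0**00 and 00*00. Each offset starts a separate dd process, so for a very large number of offsets this is likely to be slow.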

Answer 3

With the same offset considerations as @DigitalTrauma's superior solution, here's a GNU awk-based alternative. This assumes your file contains no null bytes:

(IFS=','; awk -F '' -v RS=$'\0' -v OFS=''  -v offsets="${offsetsArray[*]}" \
'BEGIN{split(offsets, o, ",")};{for (k in o)  $o[k]="*"; print}' file)

0*000
0**00
00*00

Answer 1 (4, accepted): 2014-04-19 21:07:26

Answer 2 (3): 2014-04-19 21:16:17

Answer 3 (2): 2014-04-19 21:12:45
