How to replace a space with a backslash and space "\ " so a bash shell script can read the file name from a .tsv file and perform an rsync copy
I have a script which takes source and destination information from a tsv file separated by space. The first column is the source file path, and the second column is the destination. My rsync command reads the source and destination and performs the copy operation.
But the issue is that the source and destination paths both contain a file name with white space (header-background copy.jpg), and as we know, when the bash shell reads a file name with a space, it replaces the space with a backslash followed by a space, "\ ".
/data-prod/bigdata/abc/test/1143-1003-004_1143-1003-905/static/common/images/header-background copy.jpg /mapped-data/data20/data3/header-background copy.jpg
My question is how I can replace the space with "\ " so the shell can read it. I tried the sed command below:
sed -r 's/^\s+//;s/\s+/\\ /g' test2.tsv
But there is a problem: the sed command above also adds a backslash after the source path. As I mentioned, my script takes source and destination information from the .tsv file, so the added backslash is a problem here. Below is the output of the sed command.
/data-prod/bigdata/abc/test/1143-1003-004_1143-1003-905/static/common/images/header-background\ copy.jpg\ /mapped-data/data20/data3/header-background\ copy.jpg
What I want is to convert from
/data-prod/bigdata/abc/test/1143-1003-004_1143-1003-905/static/common/images/header-background copy.jpg /mapped-data/data20/data3/header-background copy.jpg
to
/data-prod/bigdata/abc/test/1143-1003-004_1143-1003-905/static/common/images/header-background\ copy.jpg /mapped-data/data20/data3/header-background\ copy.jpg
"tsv file separated by space"
At least use a tab, because a tab is less likely to appear in a path than a space.
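If the existing file is space-separated and cannot be regenerated, one possible way to convert it is to split each line at the last " /" sequence. This is only a sketch under loud assumptions: both paths are absolute, and the destination path never contains a space followed by a slash. The name space_to_tab is made up for illustration.

```shell
#!/usr/bin/env bash
# space_to_tab: hypothetical helper, not part of the original script.
# Reads "SRC DST" lines on stdin and writes "SRC<TAB>DST" lines on stdout.
# ASSUMPTION: both paths are absolute and DST contains no " /" sequence.
space_to_tab() {
    local line
    while IFS= read -r line; do
        if [[ $line != *" /"* ]]; then
            printf 'cannot split: %s\n' "$line" >&2
            continue
        fi
        # Before the last " /" is SRC; after it is DST minus its leading slash.
        printf '%s\t/%s\n' "${line% /*}" "${line##* /}"
    done
}

# Usage: space_to_tab < test2.tsv > list.tsv
```

Lines that cannot be split this way are reported on stderr and skipped, so it is worth checking the output before feeding it to rsync.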
Remark: Did you know that only / and the NUL character are forbidden in file names on Linux filesystems? That means that everything but the NUL character can appear in a path...
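Because NUL is the one byte that can never occur in a path, a NUL-delimited list is the only format that needs no assumptions about the file names. Here is a minimal sketch of reading such a list in Bash, assuming the records alternate between source and destination. The name read_nul_pairs and the /tmp paths are hypothetical.

```shell
#!/usr/bin/env bash
# read_nul_pairs: hypothetical helper reading NUL-delimited records from stdin.
# ASSUMPTION: records alternate source, destination, each terminated by NUL.
read_nul_pairs() {
    local src dst
    while IFS= read -r -d '' src && IFS= read -r -d '' dst; do
        # Replace this printf with the real copy command, e.g. rsync "$src" "$dst"
        printf '[%s] -> [%s]\n' "$src" "$dst"
    done
}

# Example: two paths containing a space, each followed by a NUL byte.
printf '%s\0' '/tmp/in/header-background copy.jpg' '/tmp/out/header-background copy.jpg' | read_nul_pairs
```

The trade-off is that such a file is awkward to edit by hand, which is why a tab-delimited list is often the practical compromise.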
Let's say that your file is now tab-delimited and your paths contain neither newlines nor tabs. Here's how you can read it in Bash:
# -r keeps any backslashes in the paths literal
while IFS=$'\t' read -r filepath1 filepath2
do
    declare -p filepath1 filepath2
done <<< "/data-prod/bigdata/abc/test/1143-1003-004_1143-1003-905/static/common/images/header-background copy.jpg"$'\t'"/mapped-data/data20/data3/header-background copy.jpg"
Output:
declare -- filepath1="/data-prod/bigdata/abc/test/1143-1003-004_1143-1003-905/static/common/images/header-background copy.jpg"
declare -- filepath2="/mapped-data/data20/data3/header-background copy.jpg"
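Combined with rsync, this could look like the sketch below. The point is that quoted variables reach rsync as single arguments, so no backslashes have to be added to the file. The name copy_from_tsv is made up for this example, and reading on file descriptor 3 is a design choice so that a command which reads standard input (ssh, for instance) cannot swallow the rest of the list.

```shell
#!/usr/bin/env bash
# copy_from_tsv LIST [COMMAND...]: hypothetical helper for illustration.
# Runs COMMAND SRC DST for each tab-delimited row of LIST (default: rsync -av).
# ASSUMPTION: exactly one tab per row, no tabs or newlines inside the paths.
copy_from_tsv() {
    local list=$1
    shift
    local -a cmd=("$@")
    if ((${#cmd[@]} == 0)); then
        cmd=(rsync -av)
    fi
    local src dst
    # Read the list on fd 3 so the command's stdin stays untouched.
    while IFS=$'\t' read -r -u 3 src dst; do
        [[ -n $src && -n $dst ]] || continue   # skip blank or incomplete rows
        "${cmd[@]}" "$src" "$dst"
    done 3< "$list"
}

# Usage: copy_from_tsv list.tsv
# Preview without copying: copy_from_tsv list.tsv echo rsync -av
```

Passing a different command, as in the preview line, makes it easy to check what would be run before anything is copied.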
If you need to explicitly escape a variable, you can use printf '%q':
filepath1="/data-prod/bigdata/abc/test/1143-1003-004_1143-1003-905/static/common/images/header-background copy.jpg"
filepath2="/mapped-data/data20/data3/header-background copy.jpg"
rsync -av "$filepath1" user@server:"$(printf '%q' "$filepath2")"
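To see what printf '%q' produces, and that the result only makes sense when another shell parses it again, here is a small sketch. shell_escape is a hypothetical wrapper, and the eval round trip is there only to demonstrate the re-parsing step.

```shell
#!/usr/bin/env bash
# shell_escape: hypothetical wrapper around Bash's builtin printf '%q'.
shell_escape() {
    printf '%q' "$1"
}

name='header-background copy.jpg'
escaped=$(shell_escape "$name")
printf '%s\n' "$escaped"        # prints: header-background\ copy.jpg

# A shell that re-parses the escaped text gets the original name back.
eval "restored=$escaped"
printf '%s\n' "$restored"       # prints: header-background copy.jpg
```

For a purely local copy nothing re-parses the name, so the plain quoted variable "$filepath1" is enough.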
Using sed, one way would be to group the match and return it with a back reference, appending the backslash:
sed 's/\([A-Za-z0-9\/][^\.]*\) /\1\\ /g' input_file
/data-prod/bigdata/abc/test/1143-1003-004_1143-1003-905/static/common/images/header-background\ copy.jpg /mapped-data/data20/data3/header-background\ copy.jpg
Split the columns with parameter expansion:
while IFS= read -r line; do
    # count raw tabs to validate format
    tabs=${line//[!$'\t']}
    tabs=${#tabs}
    if ((tabs!=1)); then
        echo "${line//$'\t'/TAB}: less or more than one raw tab, skipping…" >&2
    else
        # split by tab
        source=${line%$'\t'*}
        target=${line#*$'\t'}
        rsync "$source" "$target"
    fi
done < list.tsv
The loop checks that each line has exactly one tab, and skips the line if it doesn't. Add in your rsync options etc.
Make sure the list doesn't contain symbolic escape sequences (like \t or \n), or any other escapes or quotes that aren't part of a literal file name. tsv and csv files can use all sorts of non-literal quoting, which needs to be parsed correctly.
Also note this usage of rsync for copying the files named in a list file to a single target: rsync [opts] --files-from="my-source-list" /my/source/ /my/target/ (the paths in the list are read relative to the source directory).
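Here is a sketch of how the source column could be turned into such a list, assuming a tab-delimited file with the source path in the first column. make_source_list and the file names are hypothetical. The rsync line is left commented out because the right source directory and options depend on your setup.

```shell
#!/usr/bin/env bash
# make_source_list TSV OUT: hypothetical helper for illustration.
# Writes the first tab-delimited column of TSV (the source paths) to OUT.
make_source_list() {
    cut -f1 < "$1" > "$2"
}

# Usage (names are placeholders):
#   make_source_list list.tsv sources.txt
#   rsync -av --files-from=sources.txt / /my/target/
```

Note that --files-from implies --relative, so the directory structure of each listed path is recreated under the target. This approach also only fits when every file goes to the same target directory.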