How to replace spaces with a backslash and space (“\ ”) so a bash shell script can read file names from a .tsv file and perform an rsync copy

Question

I have a script that takes source and destination information from a space-separated .tsv file. The first column is the source file path and the second column is the destination. My rsync command reads the source and destination and performs the copy.

The problem is that both the source and destination paths contain a file name with white space (header-background copy.jpg). As far as I know, when the bash shell reads a file name with a space, it replaces the space with a backslash followed by a space (“\ ”).

/data-prod/bigdata/abc/test/1143-1003-004_1143-1003-905/static/common/images/header-background copy.jpg /mapped-data/data20/data3/header-background copy.jpg

My question is how I can replace each space with “\ ” so the shell can read the path. I tried the sed command below:

sed -r 's/^\s+//;s/\s+/\\ /g' test2.tsv

The problem is that this sed command also adds a backslash after the source path, in front of the space that separates the two columns. Since my script takes the source and destination from the .tsv file, that extra backslash breaks it. Below is the output of the sed command.

/data-prod/bigdata/abc/test/1143-1003-004_1143-1003-905/static/common/images/header-background\ copy.jpg\ /mapped-data/data20/data3/header-background\ copy.jpg

What I want is to convert from

/data-prod/bigdata/abc/test/1143-1003-004_1143-1003-905/static/common/images/header-background copy.jpg /mapped-data/data20/data3/header-background copy.jpg

to

/data-prod/bigdata/abc/test/1143-1003-004_1143-1003-905/static/common/images/header-background\ copy.jpg /mapped-data/data20/data3/header-background\ copy.jpg

Answer 1

“tsv file separated by space”

At least use a tab as the separator, because a tab is far less likely to appear in a path than a space.

Remark: Did you know that only / and the NUL character are forbidden in file names on Linux filesystems? That means everything except the NUL character can appear in a path.

Let's say that your file is now tab-delimited and your paths contain neither newlines nor tabs. Here's how you can read it in bash:

# -r stops read from treating backslashes in the paths as escape characters
while IFS=$'\t' read -r filepath1 filepath2
do
    declare -p filepath1 filepath2
done <<< "/data-prod/bigdata/abc/test/1143-1003-004_1143-1003-905/static/common/images/header-background copy.jpg"$'\t'"/mapped-data/data20/data3/header-background copy.jpg"

Output:

declare -- filepath1="/data-prod/bigdata/abc/test/1143-1003-004_1143-1003-905/static/common/images/header-background copy.jpg"
declare -- filepath2="/mapped-data/data20/data3/header-background copy.jpg"
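
The same loop can read from a file and pass the variables straight to the copy command. For local paths, double-quoting the variables is enough and no backslash escaping is needed. The following is a minimal sketch under assumptions that are not in the original: the scratch directory, the file name list.tsv, and the copy_file helper (which falls back to cp when rsync is not installed) are all hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: read tab-separated source/destination pairs and copy each file.
# All paths below are made up for the demonstration.
workdir=$(mktemp -d)
mkdir -p "$workdir/src" "$workdir/dst"
echo "demo" > "$workdir/src/header-background copy.jpg"

# Build a one-line, tab-delimited list (hypothetical file name).
printf '%s\t%s\n' \
    "$workdir/src/header-background copy.jpg" \
    "$workdir/dst/header-background copy.jpg" > "$workdir/list.tsv"

# Hypothetical helper: use rsync when available, otherwise plain cp.
copy_file() {
    if command -v rsync >/dev/null 2>&1; then
        rsync -a -- "$1" "$2"
    else
        cp -- "$1" "$2"
    fi
}

while IFS=$'\t' read -r src dst; do
    # Double quotes keep the embedded space inside a single argument.
    copy_file "$src" "$dst"
done < "$workdir/list.tsv"
```

The paths in list.tsv contain plain spaces; the shell never needs to see a backslash because each quoted variable expands to exactly one argument.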

If you need to explicitly escape a variable, for example because a remote shell will parse the path a second time, you can use printf '%q':

filepath1="/data-prod/bigdata/abc/test/1143-1003-004_1143-1003-905/static/common/images/header-background copy.jpg"
filepath2="/mapped-data/data20/data3/header-background copy.jpg"

rsync -av "$filepath1" user@server:"$(printf '%q' "$filepath2")"
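
To see what printf '%q' actually produces, here is a small sketch that only prints the escaped string and feeds it back through eval. The eval step is my own stand-in for a second round of shell parsing and is not part of the original answer.

```shell
#!/usr/bin/env bash
# Sketch: show how bash's printf '%q' escapes a path containing a space.
filepath2="/mapped-data/data20/data3/header-background copy.jpg"

escaped=$(printf '%q' "$filepath2")
printf '%s\n' "$escaped"
# prints: /mapped-data/data20/data3/header-background\ copy.jpg

# eval re-parses the escaped text as shell input (a stand-in for a second
# shell parsing the string), which recovers the original path.
eval "roundtrip=$escaped"
printf '%s\n' "$roundtrip"
```

The escaped form is exactly the “\ ” notation asked about in the question, produced without sed.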

Answer 2

Using sed, one way is to capture the match in a group and return it through a back-reference with a backslash appended. Note that this pattern relies on each file name containing a dot:

sed 's/\([A-Za-z0-9\/][^\.]*\) /\1\\ /g' input_file
/data-prod/bigdata/abc/test/1143-1003-004_1143-1003-905/static/common/images/header-background\ copy.jpg   /mapped-data/data20/data3/header-background\ copy.jpg
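
If the file is tab-delimited, as Answer 1 recommends (an assumption here, since the question's file uses spaces), the substitution gets simpler: escaping every literal space leaves the tab separator alone. A minimal sketch with shortened, made-up paths:

```shell
#!/usr/bin/env bash
# Sketch: escape spaces in a tab-delimited line; the tab separator is untouched.
# The sample paths are shortened stand-ins, not the ones from the question.
line=$(printf '%s\t%s' "/src/header-background copy.jpg" "/dst/header-background copy.jpg")

# In the replacement, \\ yields one literal backslash, followed by the space.
escaped=$(printf '%s\n' "$line" | sed 's/ /\\ /g')
printf '%s\n' "$escaped"
```

This does not depend on the file names containing a dot, but it does depend on tabs never appearing inside a path.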

Answer 3

Split the columns with parameter expansion:

while IFS= read -r line; do
    # count raw tabs to validate format
    tabs=${line//[!$'\t']}
    tabs=${#tabs}

    if ((tabs!=1)); then
        echo "${line//$'\t'/TAB}: less or more than one raw tab, skipping…" >&2
    else
        # split by tab
        source=${line%$'\t'*}
        target=${line#*$'\t'}

        rsync "$source" "$target"
    fi

done < list.tsv

The loop checks that each line contains exactly one tab and skips the line otherwise. Add your rsync options as needed.
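
The expansions used in the loop can be tried on a single line in isolation. This sketch uses shortened, made-up paths and the hypothetical variable names tab_count, src and dst:

```shell
#!/usr/bin/env bash
# Sketch: split one tab-delimited line with parameter expansion.
line=$'/src/header-background copy.jpg\t/dst/header-background copy.jpg'

# Delete every character that is not a tab, then count what is left.
tabs=${line//[!$'\t']}
tab_count=${#tabs}

src=${line%$'\t'*}   # remove the shortest suffix that starts with a tab
dst=${line#*$'\t'}   # remove the shortest prefix that ends with a tab

declare -p tab_count src dst
```

With exactly one tab in the line, the two expansions yield the first and second column, and the spaces inside each path are preserved as-is.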

Make sure the list doesn't contain symbolic escape sequences (like \t or \n) or any other escapes or quotes that aren't part of a literal file name. TSV and CSV files can use all sorts of non-literal quoting, which needs to be parsed correctly.

Also note this usage of rsync for copying the files named in a list file to a single target: rsync [opts] --files-from="my-source-list" /my/source/ /my/target/ (paths in the list are read relative to the source directory).
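
A minimal sketch of the --files-from usage, assuming a scratch directory with two made-up files; the directory layout and file names are hypothetical, and the copy is skipped when rsync is not installed.

```shell
#!/usr/bin/env bash
# Sketch: copy several files named in a list file into one target directory.
workdir=$(mktemp -d)
mkdir -p "$workdir/src" "$workdir/target"
echo "one" > "$workdir/src/header-background copy.jpg"
echo "two" > "$workdir/src/logo.png"

# One path per line, relative to the source directory; spaces are not escaped.
printf '%s\n' "header-background copy.jpg" "logo.png" > "$workdir/my-source-list"

if command -v rsync >/dev/null 2>&1; then
    rsync -a --files-from="$workdir/my-source-list" "$workdir/src/" "$workdir/target/"
else
    echo "rsync not found, skipping copy" >&2
fi
```

This only fits the case where every file goes to the same target directory; per-file destinations, as in the question, still need the loop.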
