
concatenation of strings in bash results in substitution

I need to read a file into an array and concatenate a string at the end of each line. Here is my bash script:

#!/bin/bash

IFS=$'\n' read -d '' -r -a lines < ./file.list
for i in "${lines[@]}"
do
    tmp="$i"
    tmp="${tmp}stuff"
    echo "$tmp"
done

However, when I do this, the string seems to replace part of the line instead of being appended to it.

For example, file.list contains:

http://www.example1.com
http://www.example2.com

What I need is:

http://www.example1.comstuff
http://www.example2.comstuff

But after executing the script above, I get things as below on the terminal:

stuff//www.example1.com
stuff//www.example2.com

By the way, I'm on macOS.

The problem also occurs when concatenating strings via awk, printf, and echo. For example: echo $tmp"stuff" or echo "${tmp}""stuff"

The file ./file.list was most probably generated on a Windows system or, at least, saved using the Windows convention for end of line.

Windows uses a sequence of two characters to mark the end of each line in a text file: CR (\r) followed by LF (\n). Unix-like systems (Linux, and macOS starting with version 10) use a single LF as the end-of-line character.

The assignment IFS=$'\n' in front of read in your code tells read to split its input at LF characters. read doesn't store the LF characters in the array it produces (lines[]), but each entry of lines[] still ends with a CR character.

The line tmp="${tmp}stuff" does what it is supposed to do, i.e. it appends the word stuff to the content of the variable tmp (a line read from the file).

The first line read from the input file contains the string http://www.example1.com followed by the CR character. After the string stuff is appended, the content of variable tmp is:

http://www.example1.com$'\r'stuff

The CR character is not printable. It has a special interpretation when it is printed on the terminal: it sends the cursor back to the start of the current line (column 1) without moving to a new line.

When echo prints the line above, it prints (starting on a new line) http://www.example1.com, then the CR character, which sends the cursor back to the start of the line, where it prints the string stuff. The stuff fragment overwrites the first 5 characters already printed on that line (http:) and the result, as it is visible on screen, is:

stuff//www.example1.com
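
To make the hidden CR visible, you can dump the bytes of the variable instead of printing it to the terminal. This is only a small sketch: the line is simulated with $'\r' rather than read from your actual file, and the variable names line and dump are made up for the demonstration.

```shell
#!/bin/bash
# Simulate one line read from a CRLF file: the URL followed by a CR.
line=$'http://www.example1.com\r'
tmp="${line}stuff"

# od -c prints every byte; the otherwise invisible CR shows up as \r.
dump=$(printf '%s' "$tmp" | od -c)
echo "$dump"
```

In the dump, the \r sits between the m of .com and the s of stuff, which confirms that the concatenation itself worked.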

The solution is to get rid of the CR characters from the input file. There are several ways to accomplish this goal.

A simple way to remove the CR characters from the input file is to use the command:

sed -i.bak s/$'\r'//g file.list

It removes all the CR characters from file.list, writes the result back to file.list, and keeps the original as file.list.bak (a backup copy in case the output is not what you expect).

Another way to get rid of the CR character is to ask the shell to remove it in the command where stuff is appended:

tmp="${tmp/$'\r'/}stuff"

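When a variable is expanded in a construct like ${tmp/a/b}, the first occurrence of a in $tmp is replaced with b (use ${tmp//a/b} to replace every occurrence). Here the CR (\r) is replaced with nothing; since each line contains a single CR, at its end, replacing the first occurrence is enough.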
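
As a rough sketch of how this could look in a complete loop, the script below strips the CR from each line before appending. It uses the related expansion ${line%$'\r'}, which removes a CR only when it is the last character. The temporary file stands in for file.list, and the names sample and out are assumptions made for the demonstration.

```shell
#!/bin/bash
# Build a small CRLF-terminated sample file (a stand-in for file.list).
sample=$(mktemp)
printf 'http://www.example1.com\r\nhttp://www.example2.com\r\n' > "$sample"

out=()
# The "|| [ -n "$line" ]" part also handles a last line without a newline.
while IFS= read -r line || [ -n "$line" ]; do
    line=${line%$'\r'}        # drop one trailing CR, if present
    out+=("${line}stuff")     # the appended text now lands after the URL
done < "$sample"

printf '%s\n' "${out[@]}"
rm -f "$sample"
```

Lines that have no CR pass through unchanged, so the same loop should be safe for files with Unix line endings.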

I'm guessing it has something to do with the Carriage Return character. Was your file.list created on Windows? If so, try running dos2unix on it before running the script.

Edit

You can check your files using the file command.

Example:

file file.list
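
If the output of file is hard to interpret, counting the lines that contain a CR is another quick check. This is a sketch only: the sample file is created on the spot, and the names sample and crlf_lines are made up for the example.

```shell
#!/bin/bash
# Build a small CRLF-terminated sample file (a stand-in for file.list).
sample=$(mktemp)
printf 'http://www.example1.com\r\nhttp://www.example2.com\r\n' > "$sample"

# Count the lines that contain a CR; a non-zero count suggests CRLF endings.
crlf_lines=$(grep -c $'\r' "$sample")
echo "lines with CR: $crlf_lines"
rm -f "$sample"
```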

If you saved the file in Windows Notepad like this:

[image not available]

Then it will probably come up like this:

  •  file.list: ASCII text, with no line terminators 

You can use built-in tools like iconv to convert the encoding. However, for a simple case like this, you can just use a command that works with either kind of line ending, without any conversion.

You could simply pipe the file through cat, and use a regular expression that matches either:

  • Carriage return followed by line terminator, or
  • Line terminator on its own

Then append the string.

Example:

cat file.list | grep -E -v "^$" | sed -E -e $'s/\r?$/stuff/'

This works with ASCII text, and with ASCII text with no line terminators.

If you need to modify a stream to append a fixed string, you can use sed or awk, for instance:

sed 's/$/stuff/'

to append stuff to the end of each line.
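
For the awk route, a possible sketch is below. It also strips a trailing CR first, so it should work for input with Windows line endings. The input is generated with printf rather than read from file.list, and the variable name result is only for the demonstration.

```shell
#!/bin/bash
# Two CRLF-terminated lines, piped through awk.
result=$(printf 'http://www.example1.com\r\nhttp://www.example2.com\r\n' |
    awk '{ sub(/\r$/, ""); print $0 "stuff" }')   # strip CR, then append
echo "$result"
```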
