
Remove specific words from a text file in bash

I want to remove specific words from a txt file in bash. Here is my current script:

echo "Sequenzia Import Tag Sidecar Processor v0.2"
echo "=============================================================="
rootfol=$(pwd)
echo "Selecting files from current folder........"
images=$(ls *.jpg *.jpeg *.png *.gif)
echo "Converting sidecar files to folders........"
for file in $images
do
    split -l 8 "$file.txt" tags-
    for block in tags-*
    do
        foldername=$(cat "$rootfol/$block" | tr '\r\n' ' ')
        FOO_NO_EXTERNAL_SPACE="$(echo -e "${foldername}" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')"
        mkdir "$FOO_NO_EXTERNAL_SPACE" > /dev/null
        cd "$FOO_NO_EXTERNAL_SPACE"
    done
    mv "$rootfol/$file" "$file"
    cd "$rootfol"
    rm tags-* $file.txt
done
echo "DONE! Move files to import folder"

It reads the txt file that has the same name as an image and creates folders that are interpreted as tags during an import into a Sequenzia image board (based on myimoutobooru) ( https://code.acr.moe/kazari/sequenzia ). I want to remove specific words (actually, they are symbol combinations) from the sidecar file so that they do not cause issues with the import process.

I want to remove combinations like ">_<" and ":o" from the file.

What can I add to my current script to do this with a list of illegal words?

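You can create a file that lists your illegal strings, iterate over the lines of that file, and use a regular expression to remove each one from your input.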
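A minimal sketch of that idea, under some assumptions: the word list lives in a file called illegal.txt (one string per line) and the sidecar file is example.jpg.txt. Both names and the sample contents are made up for illustration. Because strings such as "^_^" contain regex metacharacters, each one is backslash-escaped before it is handed to sed.

```shell
# Sketch: strip each string listed in a word-list file from a sidecar file.
# "illegal.txt" and "example.jpg.txt" are hypothetical names for illustration.
workdir=$(mktemp -d)
printf '%s\n' '>_<' ':o' '^_^' > "$workdir/illegal.txt"
printf '%s\n' 'smile' '>_<' 'open_mouth :o' '^_^ blush' > "$workdir/example.jpg.txt"

while IFS= read -r word; do
    [ -n "$word" ] || continue    # skip blank lines in the list
    # Backslash-escape characters that are special in a basic regex or in s/// itself.
    escaped=$(printf '%s\n' "$word" | sed 's|[][\\/.^$*]|\\&|g')
    # Remove every occurrence of this string, then replace the original file.
    sed "s/${escaped}//g" "$workdir/example.jpg.txt" > "$workdir/cleaned.tmp"
    mv "$workdir/cleaned.tmp" "$workdir/example.jpg.txt"
done < "$workdir/illegal.txt"

cat "$workdir/example.jpg.txt"
```

This rereads the sidecar file once per listed string, which should be fine for small tag files. A removed tag leaves its (now empty) line behind, so the line count seen by split does not change.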

Before the line "split -l 8 "$file.txt" tags-" I suggest you clean up the $file.txt using something like:

sed -f sedscript <"$file.txt" >tempfile

sedscript is a file that you create beforehand containing all your unwanted strings, eg

s/>_<//g
s/:o//g

You'd change your split command to use tempfile.
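Put together, the cleanup and the modified split step might look like the sketch below. The names sedscript and tempfile follow the text above; the image name example.jpg, the sample tags, and the temporary working directory are assumptions made so the snippet can run on its own.

```shell
# Sketch of the cleanup step placed before split.
# "example.jpg" and the sample tags are hypothetical; adjust to your own files.
workdir=$(mktemp -d)
file="example.jpg"

# One substitution command per unwanted string.
cat > "$workdir/sedscript" <<'EOF'
s/>_<//g
s/:o//g
EOF

printf '%s\n' 'smile' '>_<' 'open_mouth :o' 'blush' > "$workdir/$file.txt"

# Clean the sidecar file into tempfile, then split tempfile instead of "$file.txt".
sed -f "$workdir/sedscript" < "$workdir/$file.txt" > "$workdir/tempfile"
split -l 8 "$workdir/tempfile" "$workdir/tags-"

cat "$workdir"/tags-*
```

In the original loop, tempfile would also need to be deleted alongside tags-* and the sidecar file, so that it does not leak into the next iteration.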

Experimenting with stdin/stdout on my PC suggests that multiple matches in a sed script are executed in the same pass over the input file. Therefore, if the file is large, this approach avoids reading the file multiple times.

Another variant of this approach is:

sed -e 's/>_<//g' -e 's/:o//g' <infile >outfile

Repeat the

-e 's/xxx//g'

option as many times as required.
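A quick way to try the repeated -e form is to feed sed a couple of sample lines; the input text here is made up. Each expression is wrapped in single quotes, since an unquoted > or < would be read by the shell as a redirection rather than passed to sed.

```shell
# Sketch: remove two symbol combinations in a single sed invocation.
# The sample input lines are hypothetical.
result=$(printf '%s\n' 'smile >_<' 'open_mouth :o' | sed -e 's/>_<//g' -e 's/:o//g')
printf '%s\n' "$result"
```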
