
Need help to replace string using regex in shell script

I am working on a script that uses a curl command to extract data into a text file (fossa_results.txt). The extracted response looks like this:

 "license_count": 32,
    "dependency_count": 295,
    "todo_count": 9,
    "unresolved_issue_count": 6,
    "unresolved_licensing_issue_count": 2,
    "unresolved_security_issue_count": 4,
    "unresolved_quality_issue_count": 0,

The response above is written to fossa_results.txt. I am trying to run a string replacement on that file with sed and regex patterns, and then write the result back to the same file (fossa_results.txt). The expected outcome is:

License Count= 32
Dependency Count= 295
Todo Count= 9
Unresolved Issue Count= 6
Unresolved Licensing Issue Count= 2
Unresolved Security Issue Count= 4
Unresolved Quality Issue Count= 0

Below is the script I have used for this.

sed -i 's/^[[:space:]]*//' fossa_results.txt   # trying to remove leading spaces
sed -i 's/[[:space:]]*$//' fossa_results.txt   # trying to remove trailing spaces
sed -i 's/"\""/""/g' fossa_results.txt   # trying to replace "
sed -i 's/"\"\\[.*?\\]: "/""/g' fossa_results.txt   # trying to remove any unwanted string that comes within [] like date.
sed -i 's/"\"\\[.*?\\]"/""/g' "fossa_results.txt"
sed -i 's/"\"license_count:"/"License Count="/g' "fossa_results.txt"
sed -i 's/"\"todo_count:"/"Todo Count="/g' "fossa_results.txt"
sed -i 's/"\"  dependency_count:"/"Dependancy Count="/g' "fossa_results.txt"
sed -i 's/"\"  unresolved_issue_count:"/"Unresolved Issue Count="/g' "fossa_results.txt"
sed -i 's/"\"  unresolved_licensing_issue_count:"/"Unresolved Licensing Issue Count="/g' "fossa_results.txt"
sed -i 's/"\"  unresolved_security_issue_count:"/"Unresolved Security Issue Count="/g' "fossa_results.txt"
sed -i 's/"\"  unresolved_quality_issue_count:"/"Unresolved Quality Issue Count="/g' "fossa_results.txt"
fossaresults="$(cat fossa_results.txt)"

But when I print fossa_results.txt with cat, it shows the original data, so the replacements do not seem to be working.

Can someone please help me with this script?
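
One likely cause, offered as a reading of the script rather than a confirmed diagnosis: inside single quotes the shell passes "\"" to sed unchanged, and sed treats \" as a plain ", so the first pattern asks for three double quotes in a row, which never occur in the file. The later patterns carry the same extra quotes and also skip the " that sits between each key and its colon. In addition, .*? is not a lazy match in sed. Below is a minimal sketch of the same step-by-step approach with the quoting fixed. It assumes GNU sed (for -i without a suffix, as in the question), recreates the sample response in a scratch directory, and leaves out the bracketed-date removal because the sample contains no such text.

```shell
#!/bin/sh
# Sketch only: recreate the sample response in a scratch directory.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1
cat > fossa_results.txt <<'EOF'
 "license_count": 32,
    "dependency_count": 295,
    "todo_count": 9,
    "unresolved_issue_count": 6,
    "unresolved_licensing_issue_count": 2,
    "unresolved_security_issue_count": 4,
    "unresolved_quality_issue_count": 0,
EOF

# One sed call, one substitution per -e: trim leading and trailing
# whitespace, drop every double quote, drop the trailing comma, then
# rename each key. The quotes are gone before the key names are matched.
sed -i \
    -e 's/^[[:space:]]*//' \
    -e 's/[[:space:]]*$//' \
    -e 's/"//g' \
    -e 's/,$//' \
    -e 's/^license_count:/License Count=/' \
    -e 's/^dependency_count:/Dependency Count=/' \
    -e 's/^todo_count:/Todo Count=/' \
    -e 's/^unresolved_issue_count:/Unresolved Issue Count=/' \
    -e 's/^unresolved_licensing_issue_count:/Unresolved Licensing Issue Count=/' \
    -e 's/^unresolved_security_issue_count:/Unresolved Security Issue Count=/' \
    -e 's/^unresolved_quality_issue_count:/Unresolved Quality Issue Count=/' \
    fossa_results.txt

fossaresults="$(cat fossa_results.txt)"
printf '%s\n' "$fossaresults"
```

Running the substitutions in a single sed call also avoids rewriting the file a dozen times.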

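Using GNU sed (the \u case conversion is a GNU extension; -i.bak edits the file in place and keeps a backup in input_file.bak):

$ sed -Ei.bak ':a;s/ +?([^:]*)_/\1 /;ta;s/:/=/;s/[",]//g;s/[a-z]+/\u&/g' input_file
$ cat input_file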
License Count= 32
Dependency Count= 295
Todo Count= 9
Unresolved Issue Count= 6
Unresolved Licensing Issue Count= 2
Unresolved Security Issue Count= 4
Unresolved Quality Issue Count= 0
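
For readers who find the one-liner dense, here is the same sed program split into commented steps. This is a sketch that assumes GNU sed; the function name format_counts is made up for illustration, and it filters standard input instead of editing a file in place.

```shell
#!/bin/sh
# Sketch: the GNU sed one-liner, one command per line with comments.
format_counts() {
    sed -E '
        # Loop: strip optional leading spaces and turn the last "_" before
        # the colon into a space; repeat while a substitution succeeded.
        :a
        s/ +?([^:]*)_/\1 /
        ta
        # Turn the first colon into "=".
        s/:/=/
        # Delete every double quote and comma.
        s/[",]//g
        # Upper-case the first letter of each lowercase run (GNU \u).
        s/[a-z]+/\u&/g
    '
}

printf ' "license_count": 32,\n    "unresolved_issue_count": 6,\n' | format_counts
```

Each pass of the loop handles one underscore, working from the last underscore before the colon back to the first.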

This answer is off topic, because it proposes an awk solution instead of sed or bash as tagged (but it can still help).

You can use awk to format the content of the file correctly.

d.txt

 "license_count": 32,
    "dependency_count": 295,
    "todo_count": 9,
    "unresolved_issue_count": 6,
    "unresolved_licensing_issue_count": 2,
    "unresolved_security_issue_count": 4,
    "unresolved_quality_issue_count": 0,

a.awk

BEGIN {
    FS=":"
}
{
    gsub("\"","",$1)
    gsub(" ","",$1)
    gsub(",","",$2)
    print $1"="$2
}

Usage

awk -f a.awk d.txt 

Output (the key names keep their original lowercase, underscore-separated form)

license_count= 32
dependency_count= 295
todo_count= 9
unresolved_issue_count= 6
unresolved_licensing_issue_count= 2
unresolved_security_issue_count= 4
unresolved_quality_issue_count= 0

An awk alternative:

 awk '{ gsub(":","="); gsub(/^ *|\"|,/,""); gsub("_"," "); for (i=1; i<=NF; ++i) { $i=toupper(substr($i,1,1)) tolower(substr($i,2)); }}1' src.dat
License Count= 32
Dependency Count= 295
Todo Count= 9
Unresolved Issue Count= 6
Unresolved Licensing Issue Count= 2
Unresolved Security Issue Count= 4
Unresolved Quality Issue Count= 0

Replace all colons with equal signs: gsub(":","=");

Remove leading spaces, double quotes, and commas: gsub(/^ *|\"|,/,"");

Replace each underscore with a single space: gsub("_"," ");

Capitalize the first letter of each field and lowercase the rest: for (i=1; i<=NF; ++i) { $i=toupper(substr($i,1,1)) tolower(substr($i,2)); }

The trailing 1 is an always-true pattern, so awk prints each modified line.

Input file src.dat contents:

"license_count": 32,
    "dependency_count": 295,
    "todo_count": 9,
    "unresolved_issue_count": 6,
    "unresolved_licensing_issue_count": 2,
    "unresolved_security_issue_count": 4,
    "unresolved_quality_issue_count": 0,
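
The question asks for the result to be written back to the same file, which the awk command alone does not do. Here is a minimal sketch of one way to handle that: send the output to a temporary file and move it over the original. The function name reformat_in_place and the demo file are assumptions made for illustration. Note that mv replaces the original file, so it ends up with the permissions of the temporary file.

```shell
#!/bin/sh
# Sketch: run the awk program and write the result back to the same file.
# Portable awk has no in-place mode, so the output goes to a temp file first.
reformat_in_place() {
    file=$1
    tmp=$(mktemp) || return 1
    if awk '{
        gsub(":", "=")              # colons become equal signs
        gsub(/^ *|"|,/, "")         # drop leading spaces, quotes, commas
        gsub("_", " ")              # underscores become spaces
        for (i = 1; i <= NF; ++i)   # capitalize each field
            $i = toupper(substr($i, 1, 1)) tolower(substr($i, 2))
    } 1' "$file" > "$tmp"
    then
        mv "$tmp" "$file"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Demo on a throwaway copy of two sample lines.
demo=$(mktemp) || exit 1
printf ' "license_count": 32,\n    "dependency_count": 295,\n' > "$demo"
reformat_in_place "$demo"
cat "$demo"
```

In the question's script this would be called as reformat_in_place fossa_results.txt before reading the file into fossaresults.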
