Print a specific field from a specific line of a CSV file on Linux

I am trying to extract a specific field in a specific line of a CSV file.

I can do it by row number, but the row number sometimes changes from file to file, so that approach is not very flexible.

Instead, I would like to select the line by the name that appears before the field I'm interested in.

CSV File

[Header]
File,5
Researcher Name,Joe Black
Experiment,Illumina-Project
Date,05/02/2021
Pipeline,RNA_Pipeline

In this case, I want to extract the researcher name and the experiment name from the CSV file:

Desired Output from CSV File

Joe Black Illumina-Project

The following works but, as I said, it is not very flexible:

awk -F',' 'NR == 3 { print $2 }' test.csv

So, based on what I've found, I tried something like the following, but without success:

 awk -F',' 'Line == "Researcher Name" { print $1 }' test.csv

Whenever your input data contains name-value pairs, it's best to first create an array that holds those mappings (f[] below); you can then print, test, or modify whatever values you like, in whatever order you like, just by indexing the array by name.

Look at how easy it is to do what you want with this approach:

$ awk -F, '{f[$1]=$2} END{print f["Researcher Name"], f["Experiment"]}' file
Joe Black Illumina-Project

but also how easy it is to do whatever else you might need in the future, e.g.:

$ awk -F, '
    { f[$1]=$2 }
    END {
        if ( (f["File"] > 3) && (f["Date"] ~ /2021/) ) {
            print f["Experiment"], f["Pipeline"], f["Researcher Name"]
        }
    }
' file
Illumina-Project RNA_Pipeline Joe Black
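
If the same lookup is needed from a shell script, the array approach above can be wrapped in a small function. This is only a sketch: header_value is a hypothetical helper name, the temporary file just recreates the sample from the question, and it assumes values contain no commas and that each name appears only once.

```shell
#!/bin/sh
# Recreate the sample file from the question so the sketch is self-contained.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
[Header]
File,5
Researcher Name,Joe Black
Experiment,Illumina-Project
Date,05/02/2021
Pipeline,RNA_Pipeline
EOF

# header_value (hypothetical helper): print the value stored under one name.
# Usage: header_value NAME FILE
header_value() {
    awk -F, -v key="$1" '
        { f[$1] = $2 }                      # map each name to its value
        END { if (key in f) print f[key] }  # print nothing if the name is absent
    ' "$2"
}

researcher=$(header_value "Researcher Name" "$tmp")
experiment=$(header_value "Experiment" "$tmp")
echo "$researcher $experiment"
rm -f "$tmp"
```

Passing the name with -v keeps the awk program itself constant, so the same function works for any name in the file.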

You can try this

$ awk -F"," '/Name/{name=$2} /Experiment/{print name, $2}' file
Joe Black Illumina-Project

/Name/{name=$2} - Match any line containing Name (to be more specific, test $1=="Researcher Name" instead) and store the contents of column 2, $2, in a variable called name

/Experiment/{print name, $2} - Match any line containing Experiment, then print the name variable and $2
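
To see why an exact comparison on $1 can be safer than a regex match on the whole line, here is a small sketch. The "Sample Name,S1" line is a hypothetical addition that is not in the original file; it is only there to show a second line that also contains Name.

```shell
#!/bin/sh
tmp=$(mktemp)
# Sample data from the question plus one hypothetical "Sample Name" line.
cat > "$tmp" <<'EOF'
[Header]
File,5
Researcher Name,Joe Black
Sample Name,S1
Experiment,Illumina-Project
Date,05/02/2021
Pipeline,RNA_Pipeline
EOF

# Loose regex match: the later "Sample Name" line overwrites name.
loose=$(awk -F, '/Name/{name=$2} /Experiment/{print name, $2}' "$tmp")

# Exact comparison on the first field: only the intended lines match.
strict=$(awk -F, '$1=="Researcher Name"{name=$2} $1=="Experiment"{print name, $2}' "$tmp")

echo "loose:  $loose"
echo "strict: $strict"
rm -f "$tmp"
```

With the extra line present, the loose version prints S1 Illumina-Project, while the strict version still prints Joe Black Illumina-Project.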

If you want to print the second column of the line that follows Researcher Name, first check whether the first field matches Researcher Name.

If it does, store the second field in a variable (str here) and move on to the next line.

On that next line, if the variable is set (not 0 or an empty string), print it together with the second field of the current line, then reset the variable to an empty string.

awk -F',' '$1 ~/Researcher Name/ {
  str=$2; next
}
{
  if(str){
    print str, $2 
    str=""
  }
}' file

Output

Joe Black Illumina-Project

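With your shown samples, please try the following awk code, written and tested in GNU awk. In short: set RS to \nDate so that everything before the Date line becomes a single record, set the field separator to either Researcher Name, or \nExperiment, and then print the 2nd and 3rd fields, which gives the output the OP asked for. Note that a multi-character RS is not guaranteed by POSIX awk.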

awk -v RS='\nDate' -F'Researcher Name,|\nExperiment,' 'FNR==1{print $2,$3;exit}' Input_file

Use this Perl one-liner:

perl -0777 -lne '( $researcher_name ) = /Researcher Name,([^,\n]*)/; ( $experiment ) = /Experiment,([^,\n]*)/; print "$researcher_name $experiment\n";' test.csv

The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-n : Loop over the input one line at a time, assigning it to $_ by default.
-l : Strip the input line separator ( "\n" on *NIX by default) before executing the code in-line, and append it when printing.
-0777 : Slurp files whole.

([^,\n]*) : Capture and return the part of the match inside the parentheses (...). The pattern is any character except comma and newline, repeated 0 or more times (the * quantifier).

SEE ALSO:
perldoc perlrun : how to execute the Perl interpreter: command line switches
perldoc perlre : Perl regular expressions (regexes)
perldoc perlre : Perl regular expressions (regexes): Quantifiers; Character Classes and other Special Escapes; Assertions; Capture groups
perldoc perlrequick : Perl regular expressions quick start
