How to use map and grep in Perl for the following data

Question

How can I display only the chain identifiers (such as A, C, E, G) that are terminated by a semicolon (;)?

Data

COMPND    MOL_ID: 1;                                                            
COMPND   2 MOLECULE: JACALIN;                                                   
COMPND   3 CHAIN: A, C, E, G;                                                   
COMPND   4 SYNONYM: JACKFRUIT AGGLUTININ;
COMPND   5 MOL_ID: 2;                                                           
COMPND   6 MOLECULE: JACALIN;                                                   
COMPND   7 CHAIN: B, D, F, H;                                                   
COMPND   8 SYNONYM: JACKFRUIT AGGLUTININ

I tried the code below:

#!usr/local/bin/perl

open(FILE, "/home/httpd/cgi-bin/r/1JAC.pdb");

while ( $line = <FILE> ) {

    if ( $line =~ /^COMPND/ ) {

        #$line = substr $line,4,21;

        my $line =~ m(/\$:^\w+\$\;/g);
        print $line;
    }
}

Answer 1

perl -nle'print $1 if /^COMPND\s+\S*\s*CHAIN:(.+);/' /home/httpd/cgi-bin/r/1JAC.pdb

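This is a fairly simple way of "grepping" part of a line to standard output. The parentheses capture everything between CHAIN: and the last semicolon on the line into $1, which is then printed.

-n wraps the code in a while (<>) loop that reads your file line by line
-l strips the newline from each input line and appends one to each print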
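
For readers unfamiliar with the switches, the one-liner corresponds roughly to the script below. This is only a sketch of what -n and -l do, not the exact code perl generates. The sub name print_chains is made up for illustration, and it takes filehandles so that it can be tried on in-memory data instead of the real file.

```perl
use strict;
use warnings;

# Hypothetical helper: roughly what `perl -nle 'print $1 if /.../'` does.
sub print_chains {
    my ($in, $out) = @_;
    local $\ = "\n";              # -l: append a newline to every print
    while (my $line = <$in>) {    # -n: loop over every input line
        chomp $line;              # -l: strip the trailing newline on input
        print {$out} $1 if $line =~ /^COMPND\s+\S*\s*CHAIN:(.+);/;
    }
}

# Usage with an in-memory "file" holding two records from the question
my $data = "COMPND   2 MOLECULE: JACALIN;\nCOMPND   3 CHAIN: A, C, E, G;\n";
open my $in, '<', \$data or die "Cannot open in-memory file: $!";
print_chains($in, \*STDOUT);
```

Note that the capture keeps the space that follows CHAIN:, so the printed line starts with a space. Putting \s* in front of the capture group would drop it.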

Answer 2

You can use a single regular expression like the following:

while (my $line = <FILE>) {
    if ($line =~ /^COMPND.+?CHAIN:\s*(.*?)\s*;\s*$/) {
        my $chain = $1;
        print "$chain\n";
    }
}

This uses a regular expression to match COMPND, CHAIN: and a terminating ;. The \s* at the end of the regular expression matches any trailing whitespace. The text between CHAIN: and ;, excluding leading and trailing spaces, is captured in $1 and assigned to $chain.

More information: perldoc perlre (Perl regular expressions).
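
To make the pieces of that pattern easier to follow, here is the same regular expression spread out with the /x modifier, one commented part per line. It is a sketch for experimenting, not a replacement for the loop above; the helper name extract_chain and the sample lines are assumptions made for the example.

```perl
use strict;
use warnings;

# The pattern from the answer, spread out with /x so each part can be commented.
my $chain_re = qr/
    ^COMPND      # record type at the start of the line
    .+?          # serial number and spacing (non-greedy)
    CHAIN:       # the field we are interested in
    \s*          # skip spaces after the colon
    (.*?)        # capture the chain list, as little as possible
    \s*;         # optional spaces, then the terminating semicolon
    \s*$         # nothing but whitespace until the end of the line
/x;

# Hypothetical helper: returns the chain list, or undef if the line has none.
sub extract_chain {
    my ($line) = @_;
    return $line =~ $chain_re ? $1 : undef;
}

my @samples = (
    "COMPND   3 CHAIN: A, C, E, G;                \n",
    "COMPND   8 SYNONYM: JACKFRUIT AGGLUTININ\n",
);
for my $line (@samples) {
    my $chain = extract_chain($line);
    print defined $chain ? "$chain\n" : "no chain\n";
}
```

The first sample prints the chain list without the padding spaces; the second has no CHAIN: field, so the helper returns undef.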

Answer 3

You may like this one-line solution:

perl -le 'print for map /CHAIN:\s*([^;]+)/, <>' /home/httpd/cgi-bin/r/1JAC.pdb

Output

A, C, E, G
B, D, F, H
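
Since the question asks about map and grep specifically, the same idea can be written out as a short script. This is a sketch under a few assumptions: the sub name chains_from_lines is invented for the example, the lines are passed in as a list rather than read from the file, and, unlike the one-liner, the pattern requires the closing semicolon.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a list of lines, returns one string per CHAIN record.
sub chains_from_lines {
    my @lines = @_;
    # grep keeps the COMPND records that carry a CHAIN field ...
    my @chain_lines = grep { /^COMPND.*CHAIN:/ } @lines;
    # ... and map, with the match in list context, returns the capture
    # for each line that matches (and nothing for lines that do not).
    return map { /CHAIN:\s*([^;]+);/ } @chain_lines;
}

my @lines = (
    "COMPND   2 MOLECULE: JACALIN;\n",
    "COMPND   3 CHAIN: A, C, E, G;\n",
    "COMPND   7 CHAIN: B, D, F, H;\n",
);
print "$_\n" for chains_from_lines(@lines);
```

The grep step is not strictly needed, because map already drops lines that do not match, but it shows where a line filter would go.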

Answer 4

Using GNU grep with Perl-compatible regular expressions (the -P option), find the text between "CHAIN:" and the semicolon; -o prints only the matched part:

$ grep -oP '(?<=CHAIN: ).*?(?=;)' filename
A, C, E, G
B, D, F, H
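
The same lookbehind and lookahead also work inside Perl itself. The sketch below assumes the text is already in a string; the sub name chains_by_lookaround is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every span between "CHAIN: " and the next ";".
sub chains_by_lookaround {
    my ($text) = @_;
    # (?<=CHAIN: ) asserts the text before the match; (?=;) asserts what follows.
    # Neither assertion is part of the returned match, which is what -o prints.
    return $text =~ /(?<=CHAIN: ).*?(?=;)/g;
}

my $text = "COMPND   3 CHAIN: A, C, E, G;\nCOMPND   7 CHAIN: B, D, F, H;\n";
print "$_\n" for chains_by_lookaround($text);
```

Like the grep command, this expects exactly one space after CHAIN:, because the lookbehind is a fixed string.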

Answer 5

Try this

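use warnings;
use strict;

# Three-argument open with an error check
open my $nis, '<', '1jac.pdb' or die "Cannot open 1jac.pdb: $!";

# Keep only the COMPND records and join them into one string
my @ar = grep { /^COMPND/ } <$nis>;
my $s  = join '', @ar;

# Extract each CHAIN record up to its terminating semicolon
my @dav = $s =~ /(COMPND\s+\d+\s+CHAIN:.+?;)/sg;

# Strip the record prefix, semicolons and newlines, then split on commas
s/COMPND\s+\d+\s+(?:CHAIN:\s+)?|[\n;]//g for @dav;
my @mp2 = map { split /,\s*/ } @dav;

print join(", ", @mp2), "\n";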

Output

A, C, E, G, B, D, F, H

5 answers

solution1
2 2015-06-13 11:22:58

solution2
1 2015-06-13 09:51:24

solution3
0 2015-06-13 11:59:26

solution4
0 2015-06-15 12:02:37

solution5
-1 2015-06-13 11:34:30
