
Using awk or sed in Mac OS

I have a file of about 150 lines, where each line is part of a URL. I want to extract 4 different parameters from each line and put them into a file. A line looks something like:

/secure/domain/new.aspx?id=620&utm_source=1034&utm_medium=cpc&utm_term=term1&try=1&v=3&utm_account=account_name&utm_campaign=campaign_name&utm_adgroup=adgroup&keyword=keyword1&pkw=pkw1&idimp=id&premt=premt1&gclid=id

As a trial, I did

awk '/pkw/,/&idimp/' file > output.txt

thinking that this would at least get me the pkw value, but it just returned the input file as is. What am I doing wrong? Also, how can I make it return all four values? I'm looking to get keyword, pkw, idimp and premt.

Edit: The expected output is a file containing the 4 values for each of the 150 lines in the input file. So for the line above:

 keyword1 pkw1 id premt1

Even if I just get the 4 values in 4 different files, it would suffice.

You can use this awk:

awk -F'[=&]' '{print $2, $4, $6, $8}' file
value1 value2 value3 value4

To redirect the output to a file:

awk -F'[=&]' '{print $2, $4, $6, $8}' file > output.txt
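
As for why the trial command in the question printed the file unchanged: an awk range pattern of the form /start/,/end/ selects whole lines, not pieces of a line. Every input line contains both pkw and &idimp, so each line opens and closes its own range and is printed in full. The following is a minimal sketch of that behaviour, assuming a two-line sample file; the file name sample.txt and the sample URLs are made up for illustration.

```shell
# Build a small sample file; sample.txt and its contents are hypothetical.
printf '%s\n' \
  '/a.aspx?id=1&keyword=k1&pkw=p1&idimp=i1&premt=m1&gclid=x' \
  '/b.aspx?id=2&keyword=k2&pkw=p2&idimp=i2&premt=m2&gclid=y' > sample.txt

# The range pattern works on whole lines: each line matches both /pkw/
# and /&idimp/, so every line is printed exactly as it appears.
range_output=$(awk '/pkw/,/&idimp/' sample.txt)

# Splitting on = and & instead turns each line into alternating
# name/value fields, so the value is the field right after the name.
fields_output=$(awk -F'[=&]' '{for (i = 1; i < NF; i++) if ($i == "pkw") print $(i + 1)}' sample.txt)

echo "$range_output"
echo "$fields_output"
```

The first echo reproduces the sample file, while the second prints only the pkw values, one per line.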

EDIT: Based on your edited question you can use:

awk -F'[=&]' '{n=1; for (i=1; i<=NF; i++) {if ($i=="interested") {n=i+3; break}}
      for (i=0; i<8; i+=2) printf "%s ", $(n+i); print ""}' file
value1 value2 value3 value4 

Alternatively, using grep, cut and paste:

s='/helloworld/some/other/standard/URL/mumbo/jumbo/page.aspx?strings&that&I&am&not&interested&in&param1=value1&param2=value2&param3=value3&param4=value4&some&more&uninteresting&strings'
echo "$s" | grep -o 'param[1234]=[^&]*' | cut -d= -f2- | paste -d " " - - - -
value1 value2 value3 value4

Keeping up with the clarifications to the question:

s='/secure/domain/new.aspx?id=620&utm_source=1034&utm_medium=cpc&utm_term=term1&try=1&v=3&utm_account=account_name&utm_campaign=campaign_name&utm_adgroup=adgroup&keyword=keyword&pkw=pkw1&idimp=id&premt=premt1&gclid=id'
echo "$s" |  grep -o '\<\(keyword\|pkw\|idimp\|premt\)=[^&]*' | cut -d= -f2- | paste -d " " - - - -
keyword pkw1 id premt1

The \< is a "start of word" anchor, which avoids matching parameters such as "fookeyword".
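
To see what the anchor changes, here is a small sketch comparing the pattern with and without it. It assumes a grep that supports \< (GNU grep does; other implementations may differ), and the parameter fookeyword is a hypothetical one added only for the comparison.

```shell
# fookeyword is a hypothetical parameter used only to show the difference.
s='/page.aspx?fookeyword=bad&keyword=good&pkw=pkw1'

# Without the anchor, "keyword=" also matches inside "fookeyword=".
unanchored=$(echo "$s" | grep -o 'keyword=[^&]*' | cut -d= -f2-)

# With \<, the match has to begin at the start of a word.
anchored=$(echo "$s" | grep -o '\<keyword=[^&]*' | cut -d= -f2-)

echo "unanchored: $unanchored"
echo "anchored: $anchored"
```

The unanchored pattern picks up both values (bad and good), whereas the anchored one returns only good.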

With awk, I'd write:

awk -F '[?=&]' '
    BEGIN {
        # initialize the parameters you want
        p["keyword"] = p["pkw"] = p["idimp"] = p["premt"] = 1
    } 
    {
        for (i=2; i<NF; i+=2) 
            if ($i in p) 
                printf "%s ", $(i+1)
        print ""
    }
' file
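
To try the awk program above end to end, here is a rough sketch that runs it over a small input file and writes the result to a file, as the question asks. The names urls.txt and output.txt and the second sample line are assumptions for illustration. This variant also joins the values with single spaces so that no trailing space is left on each line, and it assumes every parameter has the form name=value.

```shell
# urls.txt and output.txt are hypothetical file names; the lines are samples.
printf '%s\n' \
  '/secure/domain/new.aspx?id=620&utm_term=term1&keyword=keyword1&pkw=pkw1&idimp=id&premt=premt1&gclid=id' \
  '/secure/domain/new.aspx?id=621&keyword=keyword2&pkw=pkw2&idimp=id2&premt=premt2&gclid=id' > urls.txt

awk -F '[?=&]' '
    BEGIN {
        # Parameters whose values should be kept.
        p["keyword"] = p["pkw"] = p["idimp"] = p["premt"] = 1
    }
    {
        line = ""
        # Field 1 is the path; after it, names and values alternate
        # (this assumes every parameter is written as name=value).
        for (i = 2; i < NF; i += 2)
            if ($i in p)
                line = line (line == "" ? "" : " ") $(i + 1)
        print line
    }
' urls.txt > output.txt

cat output.txt
```

Each line of output.txt then holds the four values for the corresponding input line, in the order they appear in the URL.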

Or just use grep -P, but this may require installing GNU grep.

grep -oP '[?&][^&?=]+=\K[^&?]+'

