
Populate a CSV column according to the value of another column

In a Bash script, I want to populate one currently empty column (column 5) according to the values of another (column 1).

I think awk can do this, but I'm having trouble with the syntax:

awk -F, '
$1~/sld_[a-z]{3}[0-9]{4}_[0-9]{4}_f_[0-9]{3}[a-z]\.tif$/{$5="Text"}
$1~/sld_[a-z]{3}[0-9]{4}_[0-9]{4}_[a-b]_[0-9]{1,3}[a-z]?\.tif$/{$5="Front matter"}
$1~/sld_[a-z]{3}[0-9]{4}_[0-9]{4}_y_[0-9]{1,3}[a-z]?\.tif$/{$5="Back matter"}
$1~/sld_[a-z]{3}[0-9]{4}_[0-9]{4}_z_1[a-z]?\.tif$/{$5="Back matter"}
' file.csv

My input looks like this:

File Name,Item Sequence,Visibility,Title
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0001_a_1.tif,1,discovery,Front Board Outside
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0002_a_1a.tif,2,discovery,Front Board Outside
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0003_b_000.tif,3,discovery,Front Board Inside
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0009_b_003v.tif,9,discovery,Flyleaf 003v
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0010_f_001r.tif,10,discovery,f. 001r
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0060_y_001r.tif,60,discovery,Flyleaf 001r
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0070_y_999.tif,70,discovery,Back Board Inside
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0071_z_1.tif,71,discovery,Back Board Outside
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0072_z_1a.tif,72,discovery,Back Board Outside
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0073_z_2.tif,73,discovery,Spine
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0074_z_3.tif,74,discovery,Fore edge

The desired result should look like the output below, where the fifth column (IIIF Range) has been populated with the values I assigned above (Front matter, Text, Back matter, or blank) based on the values of column 1 (File Name):

File Name,Item Sequence,Visibility,Title,IIIF Range
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0001_a_1.tif,1,discovery,Front Board Outside,Front matter
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0002_a_1a.tif,2,discovery,Front Board Outside,Front matter
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0003_b_000.tif,3,discovery,Front Board Inside,Front matter
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0009_b_003v.tif,9,discovery,Flyleaf 003v,Front matter
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0010_f_001r.tif,10,discovery,f. 001r,Text
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0060_y_001r.tif,60,discovery,Flyleaf 001r,Back matter
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0070_y_999.tif,70,discovery,Back Board Inside,Back matter
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0071_z_1.tif,71,discovery,Back Board Outside,Back matter
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0072_z_1a.tif,72,discovery,Back Board Outside,Back matter
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0073_z_2.tif,73,discovery,Spine,,
Masters/sinaimasters/ara/arabic_0695/sld_arb0695_0074_z_3.tif,74,discovery,Fore edge,,

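Your patterns already match correctly with the ~ operator. What is missing is an output field separator (OFS), so that the record stays comma-separated when awk rebuilds it after $5 is assigned, and a final 1 pattern, which is always true and therefore prints every line: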

awk -F, 'BEGIN{OFS=","}   # keep commas when awk rebuilds a modified record
NR==1{$5="IIIF Range"}    # header row: name the new column
$1~/sld_[a-z]{3}[0-9]{4}_[0-9]{4}_f_[0-9]{3}[a-z]\.tif$/{$5="Text"}
$1~/sld_[a-z]{3}[0-9]{4}_[0-9]{4}_[a-b]_[0-9]{1,3}[a-z]?\.tif$/{$5="Front matter"}
$1~/sld_[a-z]{3}[0-9]{4}_[0-9]{4}_y_[0-9]{1,3}[a-z]?\.tif$/{$5="Back matter"}
$1~/sld_[a-z]{3}[0-9]{4}_[0-9]{4}_z_1[a-z]?\.tif$/{$5="Back matter"}
1' file.csv
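
If your awk does not handle interval expressions like {3} (some older builds do not), here is a rough sketch of an alternative that splits the file name on underscores instead of matching one long regex. The function name add_iiif_range is made up for illustration. The checks are looser than the regexes above: it assumes every file name has the shape sld_<id>_<sequence>_<section>_<folio>.tif and does not validate the digits.

```shell
# Hypothetical helper (name not from the original post): prints the CSV
# given as $1 with a fifth "IIIF Range" column appended.
add_iiif_range() {
    awk -F, '
    BEGIN { OFS = "," }
    NR == 1 { print $0, "IIIF Range"; next }   # header row: label the new column
    {
        range = ""
        # The file name is the last "/"-separated part of column 1.
        n = split($1, path, "/")
        # Assumed shape: sld_<id>_<sequence>_<section>_<folio>.tif (5 parts)
        if (split(path[n], part, "_") == 5) {
            section = part[4]
            if (section == "a" || section == "b")
                range = "Front matter"
            else if (section == "f")
                range = "Text"
            else if (section == "y")
                range = "Back matter"
            else if (section == "z" && part[5] ~ /^1[a-z]?\.tif$/)
                range = "Back matter"    # only z_1 and z_1a count as back matter
        }
        print $0, range                  # unmatched rows get an empty fifth field
    }' "$1"
}
```

You would call it as add_iiif_range file.csv > file_with_ranges.csv. Rows that match none of the sections (the z_2 and z_3 images in the sample) end with a single trailing comma, meaning an empty fifth column.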
