Problems with traversing Array in csv file in bash script

So what I'm trying to do in my code is basically read in a spreadsheet that has this format:

username,   lastname,   firstname,    x1,      x2,       x3,      x4
user1,       dudette,    mary,         7,       2,                 4
user2,       dude,       john,         6,       2,        4,
user3,       dudest,     rad,
user4,       dudaa,      pad,          3,       3,        5,       9

Basically, it has usernames, the names those usernames correspond to, and values for each x. What I want to do is read this in from a csv file, then find all of the blank cells and fill them in with 5s. My approach was to read in the whole array and then substitute all null spaces with 0s. This is the code so far...

#!/bin/bash

while IFS=$'\t' read -r -a myarray
do
echo $myarray
done < something.csv

for e in ${myarray[@]
do
echo 'Can you see me #1?'
if [[-z $e]]
echo 'Can you see me #2?'
sed 's//0'
fi
done

The code isn't really changing my csv file at all. EDITED NOTE: the data is all comma separated.

What I've figured out so far:

Okay, the 'Can you see me' lines and the echo of myarray are test code. I wanted to see whether the whole csv file was being read in, and according to the output of echo $myarray that seems to be the case. It doesn't seem, however, that the code is running through the for loop at all, which I can't understand.

Help is much appreciated! :)

The format of your .csv file is not strictly comma separated: the columns are left-aligned, with a variable number of whitespace characters separating each field. This makes it difficult to be accurate when trying to find and replace empty columns that are followed by non-empty columns.
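One way to cope with the padding is to trim each field after splitting on commas. The following is only a sketch, assuming that commas really do separate the fields (as the question's edited note says) and that the whitespace is just alignment. The helper names `trim` and `normalize_line` are made up for illustration.

```shell
#!/bin/bash

# trim: hypothetical helper that strips leading and trailing whitespace
# from its first argument using only parameter expansion.
trim() {
    local s=$1
    s="${s#"${s%%[![:space:]]*}"}"   # drop leading whitespace
    s="${s%"${s##*[![:space:]]}"}"   # drop trailing whitespace
    printf '%s' "$s"
}

# normalize_line: hypothetical helper that splits one padded line on commas,
# trims every field, and prints the fields re-joined with bare commas.
normalize_line() {
    local -a fields
    local i
    # The appended comma keeps a trailing empty field from being dropped.
    IFS=, read -r -a fields <<< "$1,"
    for i in "${!fields[@]}"; do
        fields[i]=$(trim "${fields[i]}")
    done
    local IFS=,
    printf '%s\n' "${fields[*]}"
}

normalize_line 'user2,       dude,       john,         6,       2,        4,'
```

A whitespace-only cell becomes an empty cell after trimming, so a later test for empty strings can find it.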

Here is a Bash-only solution that would be entirely accurate if the fields were comma separated.

#!/bin/bash

n=5   # value used to fill empty cells
while IFS=, read -r username lastname firstname x1 x2 x3 x4; do
    # Replace each empty value column with $n.
    [[ -z $x1 ]] && x1=$n
    [[ -z $x2 ]] && x2=$n
    [[ -z $x3 ]] && x3=$n
    [[ -z $x4 ]] && x4=$n
    echo "$username,$lastname,$firstname,$x1,$x2,$x3,$x4"
done < something.csv > newfile.csv && mv newfile.csv something.csv

Output:

username,lastname,firstname,x1,x2,x3,x4
user1,dudette,mary,7,2,5,4
user2,dude,john,6,2,4,5
user3,dudest,rad,5,5,5,5
user4,dudaa,pad,3,3,5,9
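The four near-identical tests can also be written as a loop over an array, which is closer to what the question was attempting. This is a minimal sketch, assuming strictly comma-separated input with no padding. The function name `fill_blanks`, its two arguments, and the choice of treating columns 4 onward as values are illustrative assumptions, not part of the answer above.

```shell
#!/bin/bash

# fill_blanks: hypothetical helper. Reads comma-separated lines on stdin and
# writes them to stdout with every empty cell from column 4 onward replaced
# by a default value.
# Usage: fill_blanks DEFAULT NCOLS < in.csv > out.csv
fill_blanks() {
    local default=$1 ncols=$2
    local -a cells
    local i
    local IFS=,   # split on commas in read, join on commas in "${cells[*]}"
    # The || clause still processes a last line that lacks a trailing newline.
    while read -r -a cells || (( ${#cells[@]} )); do
        (( ${#cells[@]} )) || continue   # skip blank lines
        # Indexes 0-2 are username, lastname, firstname; the rest are values.
        for (( i = 3; i < ncols; i++ )); do
            [[ -n ${cells[i]} ]] || cells[i]=$default
        done
        printf '%s\n' "${cells[*]}"
    done
}
```

It could be used the same way as the loop above, for example `fill_blanks 5 7 < something.csv > newfile.csv && mv newfile.csv something.csv`.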

I realize you asked for bash, but if you don't mind using perl instead, it is a great tool for record-oriented files.

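#!/usr/bin/perl
# Fields are assumed to be tab-separated.
open (FILE, '<', 'something.csv') or die "Cannot open something.csv: $!";
open (OUTFILE, '>', 'outdata.txt') or die "Cannot open outdata.txt: $!";
while(<FILE>) {
        chomp;
        ($username,$lastname,$firstname,$x1,$x2,$x3,$x4) = split("\t");
        # Fill each empty (or missing) value column with 5.
        $x1 = 5 if $x1 eq "";
        $x2 = 5 if $x2 eq "";
        $x3 = 5 if $x3 eq "";
        $x4 = 5 if $x4 eq "";
        print OUTFILE "$username\t$lastname\t$firstname\t$x1\t$x2\t$x3\t$x4\n";
}
close (FILE);
close (OUTFILE);
exit;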

This reads your input file, something.csv, which is assumed to have tab-separated fields, and writes a new file, outdata.txt, with the rewritten records.

I'm sure there's a better or more idiomatic solution, but this works:

#!/bin/bash

infile=bashcsv.csv     # Input filename
declare -i i           # Iteration variable
declare -i defval=5    # Default value for missing cells
declare -i n_cells=7   # Total number of cells per line
declare -i i_start=3   # Starting index for numeric cells
declare -a cells       # Array variable for cells

# We'd usually save/restore the old value of IFS, but there's no need here:
IFS=','

# Convenience function to bail/bug out on error:
bail () {
    echo "$@" >&2
    exit 1
}

# Strip whitespace and replace empty cells with `$defval`:
sed 's/[[:space:]]//g' "$infile" | while read -r -a cells; do

    # Skip empty/malformed lines:
    if [ ${#cells[*]} -lt $i_start ]; then
        continue
    fi

    # If there are fewer cells than $n_cells, pad to $n_cells
    # with $defval; if there are more, bail:
    if [ ${#cells[*]} -lt $n_cells ]; then
        for ((i=${#cells[*]}; $i<$n_cells; i++)); do
            cells[$i]=$defval
        done
    elif [ ${#cells[*]} -gt $n_cells ]; then
        bail "Too many cells."
    fi

    # Replace empty cells with default value:
    for ((i=$i_start; $i<$n_cells; i++)); do
        if [ -z "${cells[$i]}" ]; then
            cells[$i]=$defval
        fi
    done

    # Print out whole line, interpolating commas back in:
    echo "${cells[*]}"
done
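The padding step in the script matters because `read -a` does not produce a final empty element when a line ends in a single delimiter. Here is a small sketch of that behavior; `count_cells` is a hypothetical helper used only for illustration.

```shell
#!/bin/bash

# count_cells: hypothetical helper that prints how many array elements
# `read -a` produces when splitting its first argument on commas.
count_cells() {
    local -a cells
    IFS=, read -r -a cells <<< "$1"
    echo "${#cells[@]}"
}

count_cells 'user1,dudette,mary,7,2,,4'   # empty cell in the middle is kept
count_cells 'user3,dudest,rad,'           # trailing comma adds no empty cell
```

The first call reports 7 elements and the second reports 3, which is why lines with trailing blanks have to be padded out to the full cell count before the empty-cell check runs.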

Here's a gratuitous awk one-liner that gets the job done:

awk -F'[[:space:]]*,[[:space:]]*' 'BEGIN{OFS=","} /,/ {NF=7; for(i=4;i<=7;i++) if($i=="") $i=5; print}' infile.csv
