I am quite new to this and would really appreciate some help.
I am trying to write a shell script that extracts data from one or more databases, exports it to CSV, merges that data into one file, and applies formulas to the file, such as SUM or the difference between two numbers. I should be able to update or replace the file, as long as the formulas still get applied to the new file.
What I have so far:
mysql -h host -u user -ppassword -P port -e "query" | tee file1.csv
# I didn't know how to have multiple queries for the same DB
mysql -h host2 -u user2 -ppassword2 -P port -e "query2" | tee file2.csv
sed -i '1i\FILE1' file1.csv # just to add a title
echo '' >> file1.csv # just to add a blank line at the end
sed -i '1i\FILE2' file2.csv
echo '' >> file2.csv
cat file1.csv file2.csv > file.csv
This is an example of what my file.csv looks like; the real file contains more cells of the same kind:
A B C
1 C.Installs
2 date
3 2019-02-01 100
4 2019-02-02 131
5 2019-02-03 222
6 2019-02-04 180
7 2019-02-05 213
8
9 A.Installs
10 Date
11 2019-02-01 23
12 2019-02-02 42
13 2019-02-03 34
14 2019-02-04 35
15 2019-02-05 21
Now, every time I run the shell script, it should update or replace file.csv while keeping or re-adding the formulas for the specific cells. An example of BEFORE and AFTER:
First run of the shell script:
A B C
1 C.Installs
2 date
3 2019-02-01 100
4 2019-02-02 131
5 2019-02-03 222
6 2019-02-04 180
7 2019-02-05 213
8 846 #Formula of SUM for the 5 values
9 A.Installs
10 Date
11 2019-02-01 23
12 2019-02-02 42
13 2019-02-03 34
14 2019-02-04 35
15 2019-02-05 21
16 155 #Formula of SUM for the 5 values
17
18 691 #Formula of the difference between the two totals
Second run of the shell script:
A B C
1 C.Installs
2 date
3 2019-02-02 131
4 2019-02-03 222
5 2019-02-04 180
6 2019-02-05 213
7 2019-02-06 158
8 904 #Formula of SUM for the 5 values
9 A.Installs
10 Date
11 2019-02-02 42
12 2019-02-03 34
13 2019-02-04 35
14 2019-02-05 21
15 2019-02-06 31
16 163 #Formula of SUM for the 5 values
17
18 741 #Formula of the difference between the two totals
So I think the first step is to find a way to apply the formulas to the CSV file.
I need to build on what I have, maybe with awk, but I am not sure how to proceed; I am totally new at this.
Please keep it simple.
Thanks
You could use csvsql from csvkit: https://csvkit.readthedocs.io/en/latest/scripts/csvsql.html
Starting from
$ cat one.csv
2019-02-01,100
2019-02-02,131
2019-02-03,222
2019-02-04,180
2019-02-05,213
$ cat two.csv
2019-02-01,23
2019-02-02,42
2019-02-03,34
2019-02-04,35
2019-02-05,21
you could run
#!/bin/bash
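# add a header line so csvsql can refer to the columns by name (GNU sed)
sed -i '1s/^/date,value\n/' one.csv
sed -i '1s/^/date,value\n/' two.csv
# sum the value column of each file; tail -n +2 drops the header line of csvsql's output
one=$(csvsql --query "select sum(value) as sumOne from one" one.csv | tail -n +2)
two=$(csvsql --query "select sum(value) as sumTwo from two" two.csv | tail -n +2)
# difference between the two totals
echo "$one-$two" | bc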
to get 691.
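If you would rather stay with plain shell tools, as the question suggests with awk, the same totals can be computed without csvkit. What follows is only a minimal sketch under a few assumptions: the input files look like one.csv and two.csv above (no header line, integer values in the second column), the sample data is written to a temporary directory just so the script runs on its own, and the names sum_column and difference as well as the "date,value" header are made up for this illustration. A CSV file stores only values, so instead of keeping formulas the script rebuilds file.csv and recomputes the totals on every run.

```shell
#!/bin/bash
set -euo pipefail

# Sample input with the same values as one.csv and two.csv above (no header line).
# In real use these files would come from the mysql exports instead.
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' 2019-02-01,100 2019-02-02,131 2019-02-03,222 2019-02-04,180 2019-02-05,213 > one.csv
printf '%s\n' 2019-02-01,23 2019-02-02,42 2019-02-03,34 2019-02-04,35 2019-02-05,21 > two.csv

# sum_column FILE: print the sum of the second comma-separated field.
# Lines whose second field is not a plain integer (titles, headers, blank lines) are skipped.
sum_column() {
    awk -F, '$2 ~ /^[0-9]+$/ { total += $2 } END { print total + 0 }' "$1"
}

one=$(sum_column one.csv)
two=$(sum_column two.csv)
difference=$((one - two))

# Rebuild file.csv from scratch on every run, so the totals always match the current data.
{
    echo "C.Installs"
    echo "date,value"
    cat one.csv
    echo ",$one"          # total of the first block, in the second column
    echo "A.Installs"
    echo "date,value"
    cat two.csv
    echo ",$two"          # total of the second block
    echo                  # blank separator line
    echo ",$difference"   # difference between the two totals
} > file.csv

cat file.csv
```

With the sample data this writes the totals 846 and 155 and the difference 691 into file.csv. When the exports change, running the script again regenerates the file with the new totals.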