Using bash, awk or sed to templatize a CSV file into a SQL file
I will have a CSV file (say, ids.csv) that I need to ETL into a SQL script (say, update_products.sql). The CSV will be headerless and will consist of comma-delimited numbers (product IDs in a database), for instance:
29294848,29294849,29294850,29294851,29294853,29294857,29294858,29294860,29294861,29294863,29294887,29294888,
29294889,29294890,29294891,29294892,29294895,29294897,29294898,29294899,29294901,29294903,29294912,29294916
Starting with a SQL "template" file (template.sql) that looks something like this:
UPDATE products SET quantity = 0 WHERE id = %ID%;
I'm looking for a way via bash, awk, sed (or any other type of shell scripting tool) to templatize %ID% with the values in the CSV, hence turning the generated SQL into something like:
UPDATE products SET quantity = 0 WHERE id = 29294848;
UPDATE products SET quantity = 0 WHERE id = 29294849;
UPDATE products SET quantity = 0 WHERE id = 29294850;
... etc, for all the IDs in the CSV...
Super flexible here:
Any tool is fine (awk, sed, bash, whatever... as long as I can run it from the command line).
There doesn't need to be a template file (template.sql) to start with; perhaps the solution can just "inject" this template into the script as an argument.
Ideally it would generate the output file (update_products.sql) for me, but if we're limited to console output that's OK too (just not preferred).
Any ideas how I might be able to accomplish this?
I'd probably start with
$: sed "s/ *= %ID%/ IN ( $(echo $(<ids.csv) ) )/" template.sql > update_products.sql
but if it's a lot of IDs I'm not sure what your limits are, and I honestly don't remember whether that's an ANSI standard structure...
So ...
$: while IFS=, read -r -a ids
> do for id in "${ids[@]}"
> do echo "UPDATE products SET quantity = 0 WHERE id = $id;"
> done
> done < ids.csv > update_products.sql
$: cat update_products.sql
UPDATE products SET quantity = 0 WHERE id = 29294848;
UPDATE products SET quantity = 0 WHERE id = 29294849;
UPDATE products SET quantity = 0 WHERE id = 29294850;
UPDATE products SET quantity = 0 WHERE id = 29294851;
UPDATE products SET quantity = 0 WHERE id = 29294853;
UPDATE products SET quantity = 0 WHERE id = 29294857;
UPDATE products SET quantity = 0 WHERE id = 29294858;
UPDATE products SET quantity = 0 WHERE id = 29294860;
UPDATE products SET quantity = 0 WHERE id = 29294861;
UPDATE products SET quantity = 0 WHERE id = 29294863;
UPDATE products SET quantity = 0 WHERE id = 29294887;
UPDATE products SET quantity = 0 WHERE id = 29294888;
UPDATE products SET quantity = 0 WHERE id = 29294889;
UPDATE products SET quantity = 0 WHERE id = 29294890;
UPDATE products SET quantity = 0 WHERE id = 29294891;
UPDATE products SET quantity = 0 WHERE id = 29294892;
UPDATE products SET quantity = 0 WHERE id = 29294895;
UPDATE products SET quantity = 0 WHERE id = 29294897;
UPDATE products SET quantity = 0 WHERE id = 29294898;
UPDATE products SET quantity = 0 WHERE id = 29294899;
UPDATE products SET quantity = 0 WHERE id = 29294901;
UPDATE products SET quantity = 0 WHERE id = 29294903;
UPDATE products SET quantity = 0 WHERE id = 29294912;
UPDATE products SET quantity = 0 WHERE id = 29294916;
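The loop above hard-codes the statement rather than reading template.sql. If the template should stay in play, a minimal sketch along these lines could expand it once per ID. It assumes the template contains the literal placeholder %ID% and that IDs are plain digits; render_template is a made-up helper name, not something from the answer.

```shell
#!/usr/bin/env bash
# Sketch only: render_template is a hypothetical helper name.
# Usage: render_template ids.csv template.sql > update_products.sql
render_template() {
    local csv=$1 template_file=$2 template id
    local -a ids
    # Read the whole template; $(<file) drops trailing newlines.
    template=$(<"$template_file")
    # "|| ((${#ids[@]}))" keeps a last line that has no trailing newline.
    while IFS=, read -r -a ids || ((${#ids[@]})); do
        for id in "${ids[@]}"; do
            id=${id//[[:space:]]/}              # tolerate stray blanks or CRs
            [[ $id =~ ^[0-9]+$ ]] || continue   # assumption: IDs are digits only
            # Replace every literal %ID% in the template with this ID.
            printf '%s\n' "${template//%ID%/$id}"
        done
    done <"$csv"
}
```

Anything that is not purely numeric is silently skipped here; whether to skip or abort on bad input is a design choice left open.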
No need to use %ID%: ids.txt just needs prefixing with the SQL like so, writing the output to the product_updates.sql output file:
awk '{sub(/,[ \t\r]*$/, ""); printf "UPDATE products SET quantity = 0 WHERE id IN (%s);\n", $0}' ids.txt > product_updates.sql
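If a single IN list could get very long, it may be worth splitting it into batches, since some engines cap the number of items in a list or the length of a statement. A rough sketch, assuming numeric IDs; batch_in_lists and the default batch size of 1000 are illustrative choices, not anything taken from the question:

```shell
#!/usr/bin/env bash
# Sketch only: batch_in_lists and the default size are assumptions.
# Usage: batch_in_lists ids.txt 500 > product_updates.sql
batch_in_lists() {
    local csv=$1 size=${2:-1000}
    awk -F, -v size="$size" '
        # Print the pending batch (if any) as one statement, then reset it.
        function emit() {
            if (n) printf "UPDATE products SET quantity = 0 WHERE id IN (%s);\n", list
            n = 0; list = ""
        }
        {
            for (i = 1; i <= NF; i++) {
                id = $i
                gsub(/[ \t\r]/, "", id)           # drop stray blanks or CRs
                if (id !~ /^[0-9]+$/) continue    # skip empty or non-numeric fields
                list = (n ? list "," : "") id
                if (++n == size) emit()
            }
        }
        END { emit() }
    ' "$csv"
}
```

Batches run across input lines, so it does not matter whether the IDs arrive on one line or many.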
I propose to be safe rather than sorry.
This may be deemed pedantic, but working with a business database is a serious matter.
So here is a version based on @Paul Hodges's answer:
#!/usr/bin/env bash
{
  # Use the prepared statement `zeroproduct`
  # to protect against SQL injections
  printf 'PREPARE zeroproduct FROM '\''%s'\'';\n' \
    'UPDATE products SET quantity = 0 WHERE id = ?'
  # Work inside a transaction, so if something goes wrong,
  # like the SQL file being incomplete, it can be rolled back.
  printf 'START TRANSACTION;\n'
  while IFS=, read -r -a ids; do
    for id in "${ids[@]}"; do
      # Set the value of the @id argument in SQL
      # and execute the SQL statement with the @id argument
      # that will replace the '?'
      printf 'SET @id='\''%d'\''; EXECUTE zeroproduct USING @id;\n' \
        "$((id))" # Ensure id is an integer
    done
  done <ids.csv
  # Now commit all these changes since we are finally here
  printf 'COMMIT;\n'
  # Deallocate the prepared statement once we are done
  printf 'DEALLOCATE PREPARE zeroproduct;\n'
} >update_products.sql
# Good to have if this is transmitted remotely
sha512sum update_products.sql >update_products.sql.sha512sum
# can later check with:
sha512sum -c update_products.sql.sha512sum
From the provided sample CSV, here is the content of update_products.sql:
PREPARE zeroproduct FROM 'UPDATE products SET quantity = 0 WHERE id = ?';
START TRANSACTION;
SET @id='29294848'; EXECUTE zeroproduct USING @id;
SET @id='29294849'; EXECUTE zeroproduct USING @id;
SET @id='29294850'; EXECUTE zeroproduct USING @id;
SET @id='29294851'; EXECUTE zeroproduct USING @id;
SET @id='29294853'; EXECUTE zeroproduct USING @id;
SET @id='29294857'; EXECUTE zeroproduct USING @id;
SET @id='29294858'; EXECUTE zeroproduct USING @id;
SET @id='29294860'; EXECUTE zeroproduct USING @id;
SET @id='29294861'; EXECUTE zeroproduct USING @id;
SET @id='29294863'; EXECUTE zeroproduct USING @id;
SET @id='29294887'; EXECUTE zeroproduct USING @id;
SET @id='29294888'; EXECUTE zeroproduct USING @id;
SET @id='29294889'; EXECUTE zeroproduct USING @id;
SET @id='29294890'; EXECUTE zeroproduct USING @id;
SET @id='29294891'; EXECUTE zeroproduct USING @id;
SET @id='29294892'; EXECUTE zeroproduct USING @id;
SET @id='29294895'; EXECUTE zeroproduct USING @id;
SET @id='29294897'; EXECUTE zeroproduct USING @id;
SET @id='29294898'; EXECUTE zeroproduct USING @id;
SET @id='29294899'; EXECUTE zeroproduct USING @id;
SET @id='29294901'; EXECUTE zeroproduct USING @id;
SET @id='29294903'; EXECUTE zeroproduct USING @id;
SET @id='29294912'; EXECUTE zeroproduct USING @id;
SET @id='29294916'; EXECUTE zeroproduct USING @id;
COMMIT;
DEALLOCATE PREPARE zeroproduct;
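One caveat about "$((id))" in the script: bash evaluates the field as an arithmetic expression, so a leading zero is read as octal and a crafted value containing an array subscript with a command substitution can end up being executed. If the CSV is not fully trusted, checking each field against a digits-only pattern first is a cheap safeguard. A sketch of just the inner loop, under that assumption; emit_execute_lines is a hypothetical name:

```shell
#!/usr/bin/env bash
# Sketch only: emit_execute_lines is a hypothetical helper name.
# It prints the SET/EXECUTE lines and aborts on the first non-numeric field.
emit_execute_lines() {
    local csv=$1 id
    local -a ids
    while IFS=, read -r -a ids || ((${#ids[@]})); do
        for id in "${ids[@]}"; do
            id=${id//[[:space:]]/}        # tolerate stray blanks or CRs
            [[ -n $id ]] || continue      # ignore empty fields
            if [[ ! $id =~ ^[0-9]+$ ]]; then
                printf 'rejected non-numeric id: %q\n' "$id" >&2
                return 1
            fi
            # 10# forces base 10, so a leading zero is not treated as octal.
            printf "SET @id='%d'; EXECUTE zeroproduct USING @id;\n" "$((10#$id))"
        done
    done <"$csv"
}
```

Because the function returns non-zero on bad input, the caller can skip the COMMIT line and discard the partial file instead of shipping it.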
In addition to the answer by @suspectus, which makes nice use of printf to output each line wanted, a slightly more procedural use of awk incorporating a for loop over the fields would be:
awk -F, '{
    for (i=1;i<=NF;i++)
        print "UPDATE products SET quantity = 0 WHERE id = " $i ";"
}' file.csv
Where the single rule simply loops over each of the comma-separated fields, using string concatenation to form the desired output. In detail, the awk command:
awk -F, sets the field separator (FS) equal to a comma to split the input,
for (i=1;i<=NF;i++) simply loops over each field, and
print "UPDATE products SET quantity = 0 WHERE id = " $i ";" outputs the wanted text, incorporating the field using string concatenation.
Example Use/Output
With your data in file.csv (presumed to be a single line, but it really doesn't matter) your output would be:
$ awk -F, '{
> for (i=1;i<=NF;i++)
> print "UPDATE products SET quantity = 0 WHERE id = " $i ";"
> }' file.csv
UPDATE products SET quantity = 0 WHERE id = 29294848;
UPDATE products SET quantity = 0 WHERE id = 29294849;
UPDATE products SET quantity = 0 WHERE id = 29294850;
UPDATE products SET quantity = 0 WHERE id = 29294851;
UPDATE products SET quantity = 0 WHERE id = 29294853;
UPDATE products SET quantity = 0 WHERE id = 29294857;
UPDATE products SET quantity = 0 WHERE id = 29294858;
UPDATE products SET quantity = 0 WHERE id = 29294860;
UPDATE products SET quantity = 0 WHERE id = 29294861;
UPDATE products SET quantity = 0 WHERE id = 29294863;
UPDATE products SET quantity = 0 WHERE id = 29294887;
UPDATE products SET quantity = 0 WHERE id = 29294888;
UPDATE products SET quantity = 0 WHERE id = 29294889;
UPDATE products SET quantity = 0 WHERE id = 29294890;
UPDATE products SET quantity = 0 WHERE id = 29294891;
UPDATE products SET quantity = 0 WHERE id = 29294892;
UPDATE products SET quantity = 0 WHERE id = 29294895;
UPDATE products SET quantity = 0 WHERE id = 29294897;
UPDATE products SET quantity = 0 WHERE id = 29294898;
UPDATE products SET quantity = 0 WHERE id = 29294899;
UPDATE products SET quantity = 0 WHERE id = 29294901;
UPDATE products SET quantity = 0 WHERE id = 29294903;
UPDATE products SET quantity = 0 WHERE id = 29294912;
UPDATE products SET quantity = 0 WHERE id = 29294916;
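The same loop can take its statement from template.sql instead of a hard-coded string. The sketch below assumes the template uses the literal placeholder %ID% and is not empty (the NR == FNR test relies on that); render_with_awk is a hypothetical name. It also skips empty fields, which awk produces when a line ends with a trailing comma.

```shell
#!/usr/bin/env bash
# Sketch only: render_with_awk is a hypothetical helper name.
# Usage: render_with_awk template.sql file.csv > update_products.sql
render_with_awk() {
    local template_file=$1 csv=$2
    awk -F, '
        # First file: collect the template text, line by line.
        NR == FNR { tpl = tpl (NR > 1 ? "\n" : "") $0; next }
        # Second file: expand the template once per numeric field.
        {
            for (i = 1; i <= NF; i++) {
                if ($i !~ /^[0-9]+$/) continue   # skip empty or non-numeric fields
                line = tpl
                gsub(/%ID%/, $i, line)           # fill in every placeholder
                print line
            }
        }
    ' "$template_file" "$csv"
}
```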
Look things over and let me know if you have further questions.