Wrapping numbers into quotes using sed / grep

I'm trying to wrap integers in quotes in a SQL file. The dumped SQL contains an integer column that I would like to convert to a string, because storing zip codes as integers truncates their leading zero.

How do I know the zero has been truncated? All zip codes should be 5 characters long. The ones that have only 4 have had their leading zero truncated. For example:

INSERT INTO cities VALUES(21919,'MD','Maryland','Earleville',39.427105,-75.94031);
INSERT INTO cities VALUES(21921,'MD','Maryland','Elkton',39.626434,-75.84584);
INSERT INTO cities VALUES(1001,'MA','Massachusetts','Agawam',42.070206,-72.622739);
INSERT INTO cities VALUES(1002,'MA','Massachusetts','Cushman',42.377017,-72.51565);

Wanted result:

INSERT INTO cities VALUES('21919','MD','Maryland','Earleville',39.427105,-75.94031);
INSERT INTO cities VALUES('21921','MD','Maryland','Elkton',39.626434,-75.84584);
INSERT INTO cities VALUES('01001','MA','Massachusetts','Agawam',42.070206,-72.622739);
INSERT INTO cities VALUES('01002','MA','Massachusetts','Cushman',42.377017,-72.51565);

The first two should simply be wrapped. The other two should also have the leading zero added.

Two requirements:

  1. Wrap all zip code values in quotes so that they become strings
  2. Add the leading zero to the 4-digit ones

I was able to find all the 4-digit ones using

grep "([[:digit:]]\{4\}," cities.sql

Or the pattern

\([0-9]{4},

but I'm not sure how to wrap the values in quotes or how to add the leading zero using sed.

Does it have to be sed? If you can use awk, you could do:

awk -F'[,(]' '{printf "%s('\''%05d'\'',%s,%s,%s,%s,%s\n", $1, $2, $3, $4, $5, $6, $7 }' cities.sql

Using GNU awk (gawk) it is pretty simple:

# a[1] = text up to and including the first "(", a[2] = zip code, a[3] = rest of the line
awk 'match($0, /^([^(]*\()([0-9]{4,5})(,.*)$/, a) {
       printf "%s\047%05d\047%s\n", a[1], a[2], a[3] }' file
INSERT INTO cities VALUES('21919','MD','Maryland','Earleville',39.427105,-75.94031);
INSERT INTO cities VALUES('21921','MD','Maryland','Elkton',39.626434,-75.84584);
INSERT INTO cities VALUES('01001','MA','Massachusetts','Agawam',42.070206,-72.622739);
INSERT INTO cities VALUES('01002','MA','Massachusetts','Cushman',42.377017,-72.51565);
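
The three-argument match() used above is a gawk extension. Where only a POSIX awk is available, the same idea can be sketched with the RSTART and RLENGTH variables that the two-argument match() sets. This is a minimal sketch, assuming every statement has the form VALUES(<digits>,...; the function name quote_zip is made up for illustration and is not part of any answer here.

```shell
# quote_zip: hypothetical helper; reads SQL on stdin, writes fixed SQL to stdout.
quote_zip() {
    awk '
    # Find "(" followed by one or more digits and a comma, e.g. "(1001,".
    match($0, /\([0-9]+,/) {
        head = substr($0, 1, RSTART)                 # up to and including "("
        zip  = substr($0, RSTART + 1, RLENGTH - 2)   # the digits only
        tail = substr($0, RSTART + RLENGTH - 1)      # from the comma onward
        # \047 is a single quote; %05d pads the zip code with zeros to 5 digits.
        printf "%s\047%05d\047%s\n", head, zip, tail
        next
    }
    { print }   # lines without a leading integer pass through unchanged
    '
}

printf '%s\n' "INSERT INTO cities VALUES(1001,'MA','Massachusetts','Agawam',42.070206,-72.622739);" | quote_zip
```

For the sample line this prints the statement with '01001' as its first value. Because the line is not split on commas, city names that themselves contain a comma are left intact.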

You could do it in two steps: first add the leading zero, then wrap the value in single quotes:

sed -e "s/(\([0-9]\{4\}\),/(0\1,/" \
    -e "s/(\([0-9]\{5\}\),/('\1',/" \
    cities.sql > cities2.sql

As you can see, I used the fact that the zip codes are always preceded by a "(" and followed by a ",", so that the other numbers are not affected and the first four digits of a 5-digit value are not mistaken for a 4-digit zip code. If this is not always the case, you need to adapt the regex accordingly.
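
One way to sanity-check the converted file is to count the statements whose first value is still not a quoted five-digit string. The following is only a sketch, assuming every line is an INSERT of the form shown in the question; count_bad_zips is a hypothetical helper name.

```shell
# count_bad_zips: hypothetical helper; reads SQL on stdin and prints how many
# lines do NOT start their value list with a quoted five-digit string.
count_bad_zips() {
    # -v inverts the match, -c prints the number of selected lines.
    # grep exits non-zero when that number is 0, hence the "|| true".
    grep -c -v "VALUES('[0-9]\{5\}'," || true
}

printf '%s\n' \
    "INSERT INTO cities VALUES('01001','MA','Massachusetts','Agawam',42.070206,-72.622739);" \
    "INSERT INTO cities VALUES(1002,'MA','Massachusetts','Cushman',42.377017,-72.51565);" \
    | count_bad_zips
```

For these two sample lines the count is 1, because the second statement still has an unquoted 4-digit value. On a real file it could be run as count_bad_zips < cities2.sql, where a result of 0 means every zip code was converted.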

To wrap something, you can use grouping: enclose the part you want to capture in \( ... \), then refer to it in the replacement string with \1, \2, etc., numbered in order of appearance.
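
As a small illustration of grouping, independent of the SQL file, the sketch below captures two words and emits them in swapped order. The helper name swap_pair and the sample input are made up for this example.

```shell
# swap_pair: hypothetical helper that swaps two comma-separated lowercase words.
swap_pair() {
    # Each \([a-z]*\) captures a run of lowercase letters; \1 and \2 refer to
    # the first and second captured group, counted from left to right.
    sed 's/\([a-z]*\),\([a-z]*\)/\2,\1/'
}

echo 'foo,bar' | swap_pair
```

This prints bar,foo. The zip code substitutions work the same way, with a single group around the digits.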

Best regards, smuecke
