I have a PostgreSQL dump (created with pg_dump, custom compressed format). I would like to pg_restore it to STDOUT, but replace the tab separators with pipes. I've tried simply piping the output through tr, but a large number of my text fields actually contain tabs, and tr obviously does not respect quotes. There are tens of billions of output rows (the compressed file is > 500 GB), so I need a relatively efficient solution.
If it has to be fast, use C. Save
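#include <stdio.h>

int main(void)
{
    int c, quoted = 0;

    while ((c = getchar()) != EOF)
    {
        if (c == '"') quoted = !quoted;       /* every double quote toggles the state */
        if (c == '\t' && !quoted) c = '|';    /* convert only tabs outside quotes */
        putchar(c);
    }
    return quoted;  /* non-zero exit status if a quote was left open */
}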
e.g. as bartab.c, compile with gcc bartab.c -o bartab, and pipe through the resulting program.
Your best option is to use
COPY tablename TO STDOUT WITH CSV DELIMITER '|';
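Run this inside a database (the live one, or a restored copy) rather than post-processing your dump files. In CSV mode, any field that contains the delimiter is quoted in the output.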
A second option would be to create new dumps using the --inserts switch and then parse the lines starting with INSERT. That, too, would be slow.