I need to import a large CSV file into PostgreSQL. The file uses two delimiters: "," (comma) and "_" (underscore). The Postgres `COPY` command cannot use two delimiter characters, so I process the file in bash before loading it into the database:
```bash
cat large_file.csv \
| sed -e 's/_/,/' \
| psql -d db -c "COPY large_table FROM STDIN DELIMITER ',' CSV header"
```
I'm trying to reproduce this command in Python, and I am having a hard time finding the Python equivalent of `sed`.
Using psycopg I can copy from STDIN in Python:
```python
with unzip('large_zip.zip', 'large_file.csv') as file:
    cr.copy_expert('''
        COPY large_table
        FROM STDIN
        DELIMITER ',' CSV HEADER
    ''', file)
```
The file is very big and is loaded directly from the zip file. I'm trying to avoid saving a local copy.
What is the best way to process the file line by line and create a file-like object that I can send as standard input to another command in Python?
I did this recently and I can tell you there are a few ugly parts but it is definitely possible. I can't paste the code here verbatim because it's company internal.
The basic idea is this:
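1. Start the program which consumes data from stdin by spawning it like this: `command = subprocess.Popen(command_list, stdin=subprocess.PIPE)`.
2. For each pipe (e.g. `command.stdin`) start a `threading.Thread` which writes to or reads from it. If you have multiple pipes you need multiple threads.
3. Wait for the program to exit by calling `command.wait()` in the main thread.
4. Join the threads afterwards. Make sure they terminate on their own by `return`ing from their `target` function.

A simple example:
```python
import io
import shutil
import subprocess
import sys
import threading

# Stand-in for the real input stream; the pipe is binary, so use bytes.
lots_of_data = io.BytesIO()
# "import_to_db" is a placeholder for the program that consumes stdin.
import_to_db = subprocess.Popen(["import_to_db"], stdin=subprocess.PIPE)
# Make sure your input stream is at pos 0
lots_of_data.seek(0)


def feed(source, sink):
    # Copy everything, then close the pipe so the child sees EOF.
    # Without the close, import_to_db.wait() below can block forever.
    with sink:
        shutil.copyfileobj(source, sink)


writer = threading.Thread(target=feed, args=(lots_of_data, import_to_db.stdin))
writer.start()
return_code = import_to_db.wait()
writer.join()
if return_code:
    print("Error")
    sys.exit(1)
```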
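
Applied to the question's case, the same thread-plus-pipe idea could look roughly like the sketch below. It is only an illustration: `replace_first_underscore` and `piped` are hypothetical helper names, the sample rows are made up, and the sketch assumes the input yields text lines. The transformation mirrors `sed 's/_/,/'`, which replaces only the first underscore on each line.

```python
import os
import threading


def replace_first_underscore(lines):
    """Yield each line with its first "_" turned into ",", like sed 's/_/,/'."""
    for line in lines:
        yield line.replace("_", ",", 1)


def piped(lines):
    """Return (readable text file, thread); the thread feeds the lines into an OS pipe."""
    read_fd, write_fd = os.pipe()

    def pump():
        # Leaving the with-block closes the write end, which signals EOF to the reader.
        with os.fdopen(write_fd, "w") as sink:
            for line in lines:
                sink.write(line)

    thread = threading.Thread(target=pump, daemon=True)
    thread.start()
    return os.fdopen(read_fd, "r"), thread


# Made-up sample rows standing in for the lines of large_file.csv.
source = ["id_name,value\n", "1_foo,10\n", "2_bar_baz,20\n"]
reader, thread = piped(replace_first_underscore(source))
with reader:
    result = reader.read()
thread.join()
print(result, end="")
```

Because the pipe has a bounded buffer, only a small part of the data is in memory at any time. The `reader` object is an ordinary file object, so in principle it could be handed to a consumer such as `copy_expert` in place of the raw file, though that combination is not tested here.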