Extract data from PostgreSQL DB without using pg_dump

There is a PostgreSQL database to which I have only limited access (e.g., I can't use pg_dump). I am trying to create a local "mirror" by exporting certain tables from the database. I do not have the permissions needed to just dump a table as SQL from within psql. Right now, I just have a Python script that iterates through my table_names, selects all fields, and exports them as CSV:

import os

for table_name, file_name in zip(table_names, file_names):
    # \copy runs on the client side, so it only needs SELECT permission on the table
    cmd = r"""echo "\\copy (select * from %s) to stdout WITH CSV HEADER" | psql -d remote_db | gzip > ./%s/%s.gz""" % (table_name, dir_name, file_name)
    os.system(cmd)

I would like to avoid CSV if possible, as I lose the field types and the encoding can get messed up. The best option would probably be some way of getting the generating SQL code for the table using \copy. Next best would be XML, ideally with some way of preserving the field types. If that doesn't work, I think the final option might be two queries: one to get the field data types, the other to get the actual data.

Any thoughts or advice would be greatly appreciated - thanks!

The bit about "I do not have the permissions needed to just dump a table as SQL from within psql" puzzles me. pg_dump runs standalone, outside psql (both are clients), and if you have permission to connect to the database and select from a table, I'd guess you'd also be able to dump it using pg_dump -t <table>. Am I missing something?

If you use psycopg2, you can use cursor.description to get the column names, and use the Python type of each fetched value to convert it to a string in an acceptable format.

This code creates INSERT statements that you can use not only with PostgreSQL but also with other databases (in which case you will probably have to change the date format):

import datetime

cursor.execute("SELECT * FROM %s" % table_name)
# cursor.description has one entry per column; item 0 of each entry is the column name
column_names = [c[0] for c in cursor.description]
insert_prefix = 'insert into %s (%s) values' % (table_name, ', '.join(column_names))
for row in cursor.fetchall():
    row_data = []
    for rd in row:
        if rd is None:
            row_data.append('NULL')
        elif isinstance(rd, datetime.datetime):
            row_data.append("'%s'" % rd.strftime('%Y-%m-%d %H:%M:%S'))
        elif isinstance(rd, str):
            # double embedded single quotes so the SQL string literal stays valid
            row_data.append("'%s'" % rd.replace("'", "''"))
        else:
            row_data.append(repr(rd))
    print('%s (%s);' % (insert_prefix, ', '.join(row_data)))

psycopg2 even supports COPY. See the COPY-related methods in its documentation.
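
As a rough sketch of the COPY route: psycopg2 cursors have a copy_expert(sql, file) method that streams the result of a COPY ... TO STDOUT statement into any writable file object. The helper below assumes `cursor` is an open psycopg2 cursor. The names `export_table_csv` and `quote_ident` are made up for illustration, and quoting the identifier makes the table name case-sensitive, which may not suit every schema.

```python
import gzip


def quote_ident(name):
    # Hypothetical helper: double-quote an identifier, doubling embedded quotes.
    return '"' + name.replace('"', '""') + '"'


def export_table_csv(cursor, table_name, path):
    # Assumes `cursor` behaves like a psycopg2 cursor (it must offer copy_expert).
    sql = "COPY (SELECT * FROM %s) TO STDOUT WITH CSV HEADER" % quote_ident(table_name)
    # Binary mode: the bytes are written as they arrive, in the client encoding.
    with gzip.open(path, "wb") as out:
        cursor.copy_expert(sql, out)
    return sql
```

Called once per table, for example `export_table_csv(cursor, "users", "./dump/users.csv.gz")`, this would replace the echo/psql/gzip shell pipeline with a single database connection.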

If you prefer using metadata, you can use my recipe: Dump PostgreSQL db schema to text. It is based on Extracting META information from PostgreSQL by Lorenzo Alberton.

You could use these queries (obtained by running "psql --echo-hidden" and "\d") to get the base metadata:

-- GET OID
SELECT oid FROM pg_class WHERE relname = '<YOUR_TABLE_NAME>';

-- GET METADATA
SELECT a.attname,
  pg_catalog.format_type(a.atttypid, a.atttypmod),
  (SELECT substring(pg_catalog.pg_get_expr(d.adbin, d.adrelid) for 128)
   FROM pg_catalog.pg_attrdef d
   WHERE d.adrelid = a.attrelid AND d.adnum = a.attnum AND a.atthasdef),
   a.attnotnull, a.attnum
FROM pg_catalog.pg_attribute a
WHERE a.attrelid = <YOUR_TABLES_OID_FROM_PG_CLASS> AND a.attnum > 0 AND NOT a.attisdropped
ORDER BY a.attnum;

This gives you the name, data type, default, not-null flag, and field order within the row. To get the actual data, your best bet is still CSV: the built-in COPY table TO STDOUT WITH CSV HEADER is very robust. But if you are worried about encoding, be sure to get the values of server_encoding and client_encoding just before dumping the CSV data. That, combined with the metadata from the above query, should give enough information to properly interpret a CSV dump.
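
One way to keep that information next to each CSV file is a small JSON "sidecar". The sketch below is only an illustration under assumptions: `cursor` is a DB-API cursor using %s placeholders (as psycopg2 does), the function names and the sidecar layout are invented here, and the column query is a trimmed version of the one above (name, type, not-null flag). Looking up the OID by relname alone can match tables in several schemas, so this takes the first hit.

```python
import json

OID_SQL = "SELECT oid FROM pg_class WHERE relname = %s"
COLUMNS_SQL = """
SELECT a.attname,
       pg_catalog.format_type(a.atttypid, a.atttypmod),
       a.attnotnull
FROM pg_catalog.pg_attribute a
WHERE a.attrelid = %s AND a.attnum > 0 AND NOT a.attisdropped
ORDER BY a.attnum
"""


def describe_table(cursor, table_name):
    # Step 1: resolve the table's OID (first match only; schemas are ignored).
    cursor.execute(OID_SQL, (table_name,))
    row = cursor.fetchone()
    if row is None:
        raise LookupError("no such table: %s" % table_name)
    # Step 2: fetch the column name, formatted type and not-null flag.
    cursor.execute(COLUMNS_SQL, (row[0],))
    columns = [
        {"name": name, "type": type_name, "not_null": not_null}
        for name, type_name, not_null in cursor.fetchall()
    ]
    # Step 3: record both encodings so the CSV bytes can be decoded later.
    encodings = {}
    for setting in ("server_encoding", "client_encoding"):
        cursor.execute("SHOW " + setting)
        encodings[setting] = cursor.fetchone()[0]
    return {"table": table_name, "columns": columns, "encodings": encodings}


def write_sidecar(info, path):
    # Hypothetical layout: one JSON file per table, stored beside the CSV dump.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(info, f, indent=2)
```

Running `describe_table` on the same connection, just before the CSV export, keeps the recorded client_encoding in step with the bytes that actually get written.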
