
SQL Server - Export a huge number of records into multiple CSV files

I'm running a query that returns more than 8 million records. I've exported the result into a .csv file, but the file is way too big to be processed (8 GB...).

What I'm trying to do is split the resulting CSV into multiple files, but with one condition:

There is a column named "Locator" which represents an ID. I've managed to group my records so that the result looks like this:

Locator | Name | LastName
___________________________ 
ABCDEFH | Foo  | Oof 
ABCDEFH | Foo2 | Oof2 
ABCDEFH | Foo3 | Oof3 
TUVWXYZ | Mark | Mark
TUVWXYZ | Mark2| Mark2
...     | ...  | ...

So what I want to do is basically split the records without splitting the groupings... Is it possible to do that?

EDIT: Here's the query with NTILE:

with locator as 
(
    select distinct
    pnrlctrnum,
    NTILE(8) OVER(ORDER BY pnrlctrnum) as Tile_Num

from ttddocseg, ttddoc, ttdhdr
where ttddocseg.tdtrxnum = ttddoc.tdtrxnum and ttdhdr.tdtrxnum = ttddoc.tdtrxnum
)
select ttdhdr.pnrlctrnum, ttddoc.*, ttddocseg.*
from ttddocseg, ttddoc, ttdhdr
inner join locator on locator.pnrlctrnum = ttdhdr.pnrlctrnum
where Tile_num = 7 and ttddocseg.tdtrxnum = ttddoc.tdtrxnum and ttdhdr.tdtrxnum = ttddoc.tdtrxnum

You could use the NTILE function to generate groups of groups. This would ensure the groups are kept together, and you can then split your export into however many partitions you like.

Change the NTILE argument to set the number of partitions you would like; you can then integrate this into your export process however you like.

with locator as 
(
    select distinct
    Locator,
    NTILE(4) OVER(ORDER BY Locator) as Tile_Num
    from tbl 
)
select *
from tbl
inner join locator on locator.Locator = tbl.Locator
where Tile_num = 3

There is a mistake in this approach. Reproducing the same example on the sample AdventureWorks DB, using the Person.Address table:

with ps as 
(
SELECT distinct PostalCode, ntile(6) over (order by PostalCode) as part
  FROM [Person].[Address]
)

 select * from( select PostalCode, count(distinct part) as part_count from ps group by PostalCode ) as tmp where part_count>1

The result is:

PostalCode | part_count
_______________________
3977       | 2
78400      | 2
92118      | 2
97301      | 2
GA10       | 2

Some postal codes belong to more than one group. This is because the split is performed first: NTILE is evaluated before DISTINCT is applied. NTILE divides the rows into groups of the same size without taking the values into account, so rows that share a value can fall on both sides of a group boundary. Applying DISTINCT in the select afterwards leaves distinct values within each group, but as we saw, some values are still repeated across groups.
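To make the order of evaluation concrete, here is a minimal sketch in plain Python. It emulates NTILE rather than calling SQL Server; the `ntile` helper and the six sample rows are assumptions made up for illustration.

```python
def ntile(sorted_rows, n):
    """Emulate NTILE(n) OVER (ORDER BY value): return one (value, tile) pair per row.

    As in SQL, when the row count is not divisible by n the first tiles
    each receive one extra row.
    """
    size, extra = divmod(len(sorted_rows), n)
    out = []
    start = 0
    for tile in range(1, n + 1):
        count = size + (1 if tile <= extra else 0)
        out.extend((value, tile) for value in sorted_rows[start:start + count])
        start += count
    return out


rows = sorted("AABBBC")  # six rows, three distinct locators (hypothetical data)

# NTILE first, DISTINCT second: the order used by the query above.
naive = sorted(set(ntile(rows, 2)))

# DISTINCT first, NTILE second: each value gets exactly one tile.
fixed = ntile(sorted(set(rows)), 2)

print(naive)  # [('A', 1), ('B', 1), ('B', 2), ('C', 2)]
print(fixed)  # [('A', 1), ('B', 1), ('C', 2)]
```

In the first result the value `B` appears in both tiles, which is the same symptom as the postal codes with `part_count` of 2.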

To avoid this, first get the distinct postal codes, then split them into groups, and finally join the result with the primary table.

 with ps as
 (
 select PostalCode, ntile(6) over (order by PostalCode) as part   
 from (select distinct PostalCode from Person.Address ) as tmp
 )
 
 select * from Person.Address a inner join ps on a.PostalCode= ps.PostalCode

This way all rows for a given postal code stay in the same partition. As you can see, no partition's maximum value is equal to the minimum value of the next partition.

 select ps.part, min(a.PostalCode)  as minimum, max(a.PostalCode) as maximum from Person.Address a inner join ps 
        on a.PostalCode= ps.PostalCode group by ps.part

part | minimum | maximum
________________________
1    | 01071   | 32960
2    | 33000   | 59100
3    | 59101   | 84070
4    | 84074   | 94066
5    | 94070   | EM15
6    | G1R     | YO15
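For the export step itself, one possible integration is sketched below in Python. It assumes the rows are already available as an in-memory list of tuples (in practice they would come from a database cursor, which is not shown). The function names, the `export_part_N.csv` file naming, and the sample rows are all hypothetical. Tiles are assigned to the distinct locators first, so a locator can never be split across files.

```python
import csv
import os
import tempfile


def assign_tiles(keys, n):
    """Map each distinct key to a tile number 1..n, NTILE-style."""
    distinct = sorted(set(keys))
    size, extra = divmod(len(distinct), n)
    tiles, start = {}, 0
    for tile in range(1, n + 1):
        count = size + (1 if tile <= extra else 0)
        for key in distinct[start:start + count]:
            tiles[key] = tile
        start += count
    return tiles


def export_by_tile(rows, header, key_index, n, out_dir):
    """Write rows into n CSV files; all rows sharing a key go to the same file."""
    tiles = assign_tiles((row[key_index] for row in rows), n)
    paths = [os.path.join(out_dir, f"export_part_{t}.csv") for t in range(1, n + 1)]
    files = [open(p, "w", newline="", encoding="utf-8") for p in paths]
    try:
        writers = [csv.writer(f) for f in files]
        for writer in writers:
            writer.writerow(header)
        for row in rows:
            writers[tiles[row[key_index]] - 1].writerow(row)
    finally:
        for f in files:
            f.close()
    return paths


header = ["Locator", "Name", "LastName"]
rows = [
    ("ABCDEFH", "Foo", "Oof"),
    ("ABCDEFH", "Foo2", "Oof2"),
    ("ABCDEFH", "Foo3", "Oof3"),
    ("TUVWXYZ", "Mark", "Mark"),
    ("TUVWXYZ", "Mark2", "Mark2"),
]
paths = export_by_tile(rows, header, key_index=0, n=2, out_dir=tempfile.mkdtemp())

# Read the files back to see which locators ended up in which file.
keys_per_file = []
for path in paths:
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        keys_per_file.append(sorted({row[0] for row in reader}))
print(keys_per_file)  # [['ABCDEFH'], ['TUVWXYZ']]
```

Note that the tiles are balanced by the number of distinct locators, not by the number of rows, so the files can differ in size when some locators have many more rows than others.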
