SSIS OPENROWSET query flat file
I currently have a variable named InvoiceFileName that is creating .csv files through a foreach loop. A list of .csv files is then output to a folder.
I will then need to query each .csv file to select the header and the first row of data for each .csv. I believe I need to use OPENROWSET to query the .csv files. I have 2 questions:
1. How do I use OPENROWSET without inserting into a table?
2. The simple OPENROWSET below only provides the header of the file.
SELECT TOP 1 *
FROM OPENROWSET(BULK N'\\myservername\f$\reports\Invoices\CokeFiles\54ASBSd.csv', SINGLE_CLOB) AS Report
What kind of privileges do you have on the database? If you have, or can get, slightly elevated privileges, you can use BULK INSERT and xp_cmdShell to accomplish this, but like @scsimon said, you will have to use dynamic SQL. Here's a quick example:
-----------------------------------------------------------------------------------------------------
-- Set up your variables
-----------------------------------------------------------------------------------------------------
DECLARE
@folderPath AS VARCHAR(100) = '\\some\folder\path\here\',
@cmd AS VARCHAR(150), -- Will populate this with a command to get a list of files in a directory
@InvoiceFileName AS VARCHAR(100), -- Will be used in cursor loop
@sql AS VARCHAR(MAX), -- Will hold the dynamically built BULK INSERT statement
@targetTable AS VARCHAR(50) = 'SomeTable',
@fieldTerminator AS CHAR(1) = ',',
@rowTerminator AS CHAR(2) = '\n'
-----------------------------------------------------------------------------------------------------
-- Create a temp table to store the file names
-----------------------------------------------------------------------------------------------------
IF OBJECT_ID('tempdb..#FILE_LIST') IS NOT NULL
DROP TABLE #FILE_LIST
--
CREATE TABLE #FILE_LIST(FILE_NAME VARCHAR(255))
-----------------------------------------------------------------------------------------------------
-- Get a list of the files and store them in the temp table:
-- NOTE: this DOES require elevated permissions
-----------------------------------------------------------------------------------------------------
SET @cmd = 'dir "' + @folderPath + '" /b'
--
INSERT INTO #FILE_LIST(FILE_NAME)
EXEC Master..xp_cmdShell @cmd
--------------------------------------------------------------------------------
-- Here we remove any null values
--------------------------------------------------------------------------------
DELETE #FILE_LIST WHERE FILE_NAME IS NULL
-----------------------------------------------------------------------------------------------------
-- Set up our cursor and loop through the files
-----------------------------------------------------------------------------------------------------
DECLARE c1 CURSOR FOR SELECT FILE_NAME FROM #FILE_LIST
OPEN c1
FETCH NEXT FROM c1 INTO @InvoiceFileName
WHILE @@FETCH_STATUS <> -1
BEGIN -- Begin WHILE loop
BEGIN TRY
-- Bulk insert won't take a variable name, so dynamically generate the
-- SQL statement and execute it instead:
SET @sql = 'BULK INSERT ' + @targetTable + ' FROM ''' + @folderPath + @InvoiceFileName + ''' '
+ ' WITH (
FIELDTERMINATOR = ''' + @fieldTerminator + ''',
ROWTERMINATOR = ''' + @rowTerminator + ''',
FIRSTROW = 1,
LASTROW = 2
) '
EXEC (@sql)
END TRY
BEGIN CATCH
-- Handle errors here
END CATCH
-- Continue your loop
FETCH NEXT FROM c1 INTO @InvoiceFileName
END -- End WHILE loop
CLOSE c1
DEALLOCATE c1
-- Do what you need to do here with the data in your target table
A few disclaimers:
- This solution relies on BULK INSERT and xp_cmdShell.
- Many DBAs lock down xp_cmdShell (and for good reason), but this is a quick and dirty solution making a lot of assumptions about what your environment is like.
For doing this through SSIS, ideally you'd probably need to use a format file for the bulk operation, but you'd have to have consistently formatted files and remove the SINGLE_CLOB option as well. A really hacky and non-ideal way to do this would be to do something like this:
Let's say your file contains this data:
Col1,Col2,Col3,Col4
Here's,The,First,Line
Here's,The,Second,Line
Here's,The,Third,Line
Here's,The,Fourth,Line
Then you could basically just parse the data doing something like this:
SELECT SUBSTRING(OnlyColumn, 0, CHARINDEX(CHAR(10), OnlyColumn, CHARINDEX(CHAR(10), OnlyColumn, 0)+1) )
FROM OPENROWSET(BULK '\\location\of\myFile.csv', SINGLE_CLOB) AS Report (OnlyColumn)
And your result would be this:
Col1,Col2,Col3,Col4 Here's,The,First,Line
This is obviously dependent on your line endings being consistent, but if you want the results in a single column and a single row (as is the behavior of the bulk operation with the SINGLE_CLOB option), that should get you what you need.
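Since SINGLE_CLOB hands you the whole file as one string, the same extraction logic can be sketched outside of T-SQL as well. This Python snippet (with the file contents inlined for illustration) mirrors the nested CHARINDEX calls above: find the second line feed and keep everything before it.

```python
# Treat the whole file as one string, as SINGLE_CLOB does, and keep
# everything up to (but not including) the second line feed: the
# header line plus the first data line.
clob = ("Col1,Col2,Col3,Col4\n"
        "Here's,The,First,Line\n"
        "Here's,The,Second,Line\n")

first_lf = clob.index("\n")                  # end of the header line
second_lf = clob.index("\n", first_lf + 1)   # end of the first data line
header_and_first_row = clob[:second_lf]

print(header_and_first_row)
# prints:
# Col1,Col2,Col3,Col4
# Here's,The,First,Line
```

As with the SQL version, this assumes the line endings are consistently `\n` (use `\r\n` handling if your files are CRLF-terminated).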
You can take a look at the solution on this SO post for info on how to pass the SSIS variable value as a parameter to your query.
Use a Foreach Loop container to query all files in a folder. You can use wildcards for the file name, or use the variables in your DTS to set the properties of the components.
Inside the loop container you place a Data Flow Task with your source file connection, your transformations, and your destination.
You can modify the file names and paths of all these objects by setting their properties to variables in your DTS.
With an Expression Task inside the loop, you can change the path of the CSV file connection.
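As a rough illustration (the variable names here are assumptions, not taken from the question), a property expression on the flat file connection manager's ConnectionString could combine SSIS variables like this:

```
@[User::FolderPath] + @[User::InvoiceFileName]
```

so that each iteration of the Foreach Loop points the connection at the file currently being enumerated.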