
Converting SQL Binary Content to File

I have a SQL Server database that stores the contents of files in a table. Specifically, there are 2 fields:

  • Contents: varbinary(max) field that always starts with '0x1F.....'

  • FileType: varchar(5) field that holds the type of file, such as PDF, docx, etc.

How can I convert the contents back into a file? I am trying to use ColdFusion to convert it, if that is possible. If not, what are the steps to convert the binary into a file?

I tried the following (assuming a docx file type), but it didn't produce a valid Word file:

<cfset DecodedValue = BinaryDecode(contents,"hex")>
<cffile action="WRITE" output="#DecodedValue#" file="C:\decodedfile.docx">

Thanks to user Ageax: the first 4 bytes, 31, -117, 8, 0, show that the content is actually stored in GZIP format.

I first save the content as a gzip file, then extract the original file from it. My code is as follows:

<cfquery name="getfile" datasource="tempdb">
select content from table
</cfquery>

<cfset FileWrite("C:\mygzipfile.gzip", getfile.content)>

To extract the gzip to a file using ColdFusion, I used the solution at: http://coldfusion-tip.blogspot.com/2012/04/unzip-gz-file-in-coldfusion.html
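For readers who cannot reach that link, the extraction step can be sketched roughly as follows. This is a minimal sketch, not the linked post's code: it assumes the `getfile` query above returned one row, that ColdFusion's Java interop is available, and that the output path (a hypothetical example) is writable. It decompresses the column bytes in memory with `java.util.zip.GZIPInputStream`, so the intermediate .gzip file is not strictly needed.

```cfml
<cfscript>
    // Assumption: getfile.content[1] holds the gzip bytes from the query above.
    // The output path is a hypothetical example.
    outputPath = "C:\decodedfile.docx";

    byteIn  = createObject("java", "java.io.ByteArrayInputStream").init(getfile.content[1]);
    gzipIn  = createObject("java", "java.util.zip.GZIPInputStream").init(byteIn);
    fileOut = createObject("java", "java.io.FileOutputStream").init(outputPath);

    try {
        // 4 KB scratch buffer; getBytes() on a string is a common CFML way to get a Java byte[]
        buffer = repeatString(" ", 4096).getBytes();
        bytesRead = gzipIn.read(buffer);
        // read() returns -1 once the compressed stream is exhausted
        while (bytesRead GT 0) {
            fileOut.write(buffer, javaCast("int", 0), javaCast("int", bytesRead));
            bytesRead = gzipIn.read(buffer);
        }
    } finally {
        fileOut.close();
        gzipIn.close();
    }
</cfscript>
```

If the decompressed file still does not open, check its first bytes again: a genuine .docx should begin with 80, 75, 3, 4.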

tl;dr

The data is already binary, so ditch the binaryX() functions and save the content directly to a file. Read the first few bytes of the binary to verify the file type. In this case, it turns out the document was actually stored in GZIP format, not raw DOCX.


Don't be misled by how SSMS chooses to display it. SSMS displays binary in a user-friendly hex format, but it's still stored as binary. Just write the binary directly to the file, without any BinaryX functions.

<cfset FileWrite("C:\decodedfile.docx", contents)>

Also, check your DSN settings and ensure the "BLOB - Enable binary large object retrieval (BLOB)" setting is enabled, so binary values aren't truncated at 64K (the default buffer size).

Update 1:

The FileWrite() code above works correctly IF the "contents" column contains the binary of a valid .docx file. Perhaps the data is being stored differently than we're thinking? Run a query to retrieve the binary of a single document and output the first four bytes. What is the result? Typically, the first four bytes of a .docx file should be 80, 75, 3, 4.

<!--- print size and first 4 bytes --->
<cfoutput>
    size in bytes = #arrayLen(qYourQuery.contents)#<br>
    <cfloop from="1" to="4" index="x">
        byte #x# = #qYourQuery.contents[1][x]#<br>
    </cfloop>
</cfoutput>
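
Building on the byte dump above, the comparison itself could be sketched like this. `guessFormat` is a hypothetical helper name, not a built-in function, and it only knows the two signatures discussed in this thread (ZIP-based Office files and GZIP). Java bytes are signed, so 0x8B shows up as -117; masking with `bitAnd(..., 255)` turns it back into 139.

```cfml
<cfscript>
    // Hypothetical helper: guess the container format from the leading bytes.
    function guessFormat(required any bytes) {
        // ZIP-based formats (.docx, .xlsx, ...) start with 80, 75, 3, 4 ("PK" 03 04)
        if (arrayLen(bytes) GTE 4 AND bytes[1] EQ 80 AND bytes[2] EQ 75
                AND bytes[3] EQ 3 AND bytes[4] EQ 4) {
            return "zip";
        }
        // GZIP starts with 1F 8B; mask to compare as unsigned values
        if (arrayLen(bytes) GTE 2 AND bitAnd(bytes[1], 255) EQ 31
                AND bitAnd(bytes[2], 255) EQ 139) {
            return "gzip";
        }
        return "unknown";
    }

    // Assumption: qYourQuery is the query from the snippet above
    writeOutput(guessFormat(qYourQuery.contents[1]));
</cfscript>
```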

Update 2:

The closest match I could find for 1F 8B 08 is GZIP. Try using probeContentType() on the saved file. What does it report?

<cfscript>
    paths = createObject("java", "java.nio.file.Paths");
    files = createObject("java", "java.nio.file.Files");
    input = paths.get("c:/yourFileName.docx", []);
    writeDump(files.probeContentType(input));
</cfscript>
