Convert Open XML string to byte[]

I am editing a Word document using OpenXML. For various reasons, I convert it all into a string:

// Convert the byte[] into a MemoryStream
using (var file = new MemoryStream(text))
using (var wordDoc = WordprocessingDocument.Open(file, true))
using (var sr = new StreamReader(wordDoc.MainDocumentPart.GetStream()))
{
    // Reads only the main document part, not the whole package
    docText = sr.ReadToEnd();
}

Then I convert it back to a byte array.

But a simple conversion does not work:

byte[] back2Byte = System.Text.Encoding.ASCII.GetBytes(docText);

Because the string is an Open XML string.

I tried this, but I always got a corrupted file when I opened it in Word:

var repo = new System.IO.MemoryStream(System.Text.Encoding.UTF8.GetBytes(docText));

byte[] buffer = new byte[16 * 1024];
MemoryStream ms = new MemoryStream();

int read;
while ((read = repo.Read(buffer, 0, buffer.Length)) > 0)
{
    ms.Write(buffer, 0, read);
}

byte[] back2Byte = ms.ToArray();

This doesn't work either:

byte[] back2Byte = new byte[docText.Length * sizeof(char)];
System.Buffer.BlockCopy(docText.ToCharArray(), 0, back2Byte, 0, back2Byte.Length);

Edit: After some checking, it seems the content is written to the database as Open XML markup, so Word cannot read it. There is no error when I open it with Notepad.

How can I correct this?

So the real issue is: how can I convert an OpenXML string to a byte array that can be opened in Word?

You cannot do this sort of thing. You are getting the bytes for only one part of an OpenXML document. By definition, Office Open XML files such as .docx and .xlsx are multi-part documents. You could theoretically capture the bytes for all the parts using a technique like the one you're currently using, but you would also have to capture all the part/relationship information necessary to reconstruct the multi-part document. You'd be better off just reading all the bytes of the file and storing them as-is:

// to read the file as bytes
var fileName = @"C:\path\to\the\file.xlsx";
var fileBytes = File.ReadAllBytes(fileName);

// to recreate the file from the bytes
File.WriteAllBytes(fileName, fileBytes);

If you need a string form of those bytes, try this:

// to convert bytes to a (non-readable) text form
var fileContent = Convert.ToBase64String(fileBytes);

// to convert base-64 back to bytes
// (named decodedBytes so it does not clash with fileBytes declared earlier)
var decodedBytes = Convert.FromBase64String(fileContent);
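
As a rough sanity check of the Base64 approach, the sketch below compares a Base64 round trip with a round trip through a text encoding. The class name RoundTripDemo and its two methods are hypothetical helpers made up for illustration. It assumes binary input, such as the raw bytes of a .docx file.

```csharp
using System;
using System.Text;

public static class RoundTripDemo
{
    // Lossless: every byte value survives Base64 encoding and decoding.
    public static byte[] ViaBase64(byte[] data)
    {
        string text = Convert.ToBase64String(data);
        return Convert.FromBase64String(text);
    }

    // Lossy for binary data: ASCII decoding replaces any byte above 0x7F
    // with '?', so the original value cannot be recovered.
    public static byte[] ViaAscii(byte[] data)
    {
        string text = Encoding.ASCII.GetString(data);
        return Encoding.ASCII.GetBytes(text);
    }
}
```

Calling RoundTripDemo.ViaBase64 on a buffer that contains bytes such as 0xFF should return an identical buffer, while RoundTripDemo.ViaAscii should not. That is why treating binary file content as ordinary text tends to corrupt it.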

Either way, there is absolutely no need to use the OpenXML SDK for your use case.
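
To illustrate the multi-part point, here is a minimal sketch that uses only System.IO.Compression rather than the OpenXML SDK. It builds a toy three-part ZIP package in memory (a stand-in for a .docx, not a valid Word file), reads one part back as text the way the question does, and checks for the ZIP signature. The class name PackageDemo, its methods, and the placeholder XML are all assumptions made for illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;

public static class PackageDemo
{
    // Builds a tiny multi-part ZIP package in memory.
    // The part names mimic a .docx layout; the contents are placeholders.
    public static byte[] BuildPackage()
    {
        using (var ms = new MemoryStream())
        {
            using (var zip = new ZipArchive(ms, ZipArchiveMode.Create, leaveOpen: true))
            {
                AddPart(zip, "[Content_Types].xml", "<Types/>");
                AddPart(zip, "_rels/.rels", "<Relationships/>");
                AddPart(zip, "word/document.xml", "<w:document/>");
            }
            // The archive is disposed here, so the ZIP directory has been written.
            return ms.ToArray();
        }
    }

    private static void AddPart(ZipArchive zip, string name, string xml)
    {
        var entry = zip.CreateEntry(name);
        using (var writer = new StreamWriter(entry.Open()))
        {
            writer.Write(xml);
        }
    }

    // Lists the names of all parts stored in the package.
    public static string[] ListParts(byte[] package)
    {
        using (var ms = new MemoryStream(package))
        using (var zip = new ZipArchive(ms, ZipArchiveMode.Read))
        {
            return zip.Entries.Select(e => e.FullName).ToArray();
        }
    }

    // Reads a single part as text, analogous to reading
    // MainDocumentPart.GetStream() in the question.
    public static string ReadPart(byte[] package, string name)
    {
        using (var ms = new MemoryStream(package))
        using (var zip = new ZipArchive(ms, ZipArchiveMode.Read))
        using (var reader = new StreamReader(zip.GetEntry(name).Open()))
        {
            return reader.ReadToEnd();
        }
    }

    // A ZIP file begins with the signature bytes "PK" (0x50 0x4B).
    public static bool LooksLikeZip(byte[] data)
    {
        return data.Length >= 2 && data[0] == 0x50 && data[1] == 0x4B;
    }
}
```

Under these assumptions, PackageDemo.LooksLikeZip(PackageDemo.BuildPackage()) is true. The UTF-8 bytes of the text returned by ReadPart for "word/document.xml" do not start with the ZIP signature, which is consistent with Word rejecting a file rebuilt from the main part's text alone.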
