Removing blank lines in the XML document
I need to read the body of a .msg file and convert it into an XML file. I used the code below to do the conversion. I can get the XML file, but the problem is that empty lines appear in the output XML file. I used a regex to remove blank lines from the string, and while debugging I can see that the blank lines are removed from the string. But after loading that string as XML, I still get blank lines in the saved XML file. I have attached an image of a sample XML file.
string[] filePaths = Directory.GetFiles(@"C:\Projects\Userdata\Source Folder\", "*.msg");
for (int i = 0; i < filePaths.Length; ++i)
{
    string path = filePaths[i];
    string fname = System.IO.Path.GetFileName(path);
    _Application outlook = new ApplicationClass();
    MailItem item = (MailItem)outlook.CreateItemFromTemplate(path, Type.Missing);
    string b = item.Body;
    string formatbody = System.Text.RegularExpressions.Regex.Replace(b, @"^\s+$[\r\n]*", "", RegexOptions.Multiline);
    XDocument doc1 = XDocument.Parse(formatbody, LoadOptions.PreserveWhitespace);
    var xs = doc1.Elements();
    string test = string.Empty;
    foreach (var x in xs)
    {
        test += x.ToString();
    }
    XmlDocument doc = new XmlDocument();
    doc.LoadXml(test);
    doc.Save(@"C:\Projects\Destination Folder\" + fname + ".xml");
}
The body of the .msg file looks like this:
<?xml version="1.0" encoding="iso-8859-1"?>
<?xml-stylesheet type="text/xsl" href="gateway_transaction_display.xsl"?>
<File>
<File_Type>AP PAYMENTS</File_Type>
<File_Header_Record>
<File_Format_Version>0002</File_Format_Version>
<Creation_Module>0286-14</Creation_Module>
</File_Header_Record>
<Transaction>
<Transaction_Type>FT_TRANS_IMP</Transaction_Type>
<Transaction_Header>
<Record_Number>1</Record_Number>
<Urgent>Y</Urgent>
</Transaction_Header>
<Model_Info>
<Model_ID><![CDATA[FF DOM INT PAY]]></Model_ID>
</Model_Info>
<Transfer_Info>
<Charges>15</Charges>
</Transfer_Info>
<Amounts>
<Transaction_Amount>
<Amount>4665786.22</Amount>
<Currency>CAD</Currency>
</Transaction_Amount>
</Amounts>
<Dates>
<Trusted_Source>Y</Trusted_Source>
<Value_Date>2014-03-31</Value_Date>
</Dates>
<Bank_Account>
<Bank_Account_Type>DR</Bank_Account_Type>
<Bank>
<Bank_Route_Code>
<Code_Type>Y</Code_Type>
</Bank_Route_Code>
</Bank>
<Account>
<Account_ID>FF01</Account_ID>
</Account>
</Bank_Account>
<Bank_Account>
<Bank_Account_Type>CR</Bank_Account_Type>
<Bank>
<Bank_Route_Code>
<Code_Type>Y</Code_Type>
</Bank_Route_Code>
</Bank>
<Account>
<Account_ID>D039</Account_ID>
</Account>
</Bank_Account>
<Payment_Details_Or_Addenda>
<Details_Text><![CDATA[Unapplied
cash & intercompany settlemet]]></Details_Text>
</Payment_Details_Or_Addenda>
</Transaction>
<File_Trailer_Record>
<File_Name>AP PAYMENTS</File_Name>
</File_Trailer_Record>
</File>
You don't need to use Regex to remove the blank lines. Instead:
1. Trim the message content before parsing it as an XDocument.
string result = item.Body.Trim();
2. Specify LoadOptions.None instead of LoadOptions.PreserveWhitespace.
XDocument.Parse(result, LoadOptions.None);
--SJ
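The two steps in the answer can be put together in a small, self-contained sketch. The class name XmlBodyCleaner and the method name Normalize are hypothetical, introduced only for illustration; the string passed in stands in for item.Body, so no Outlook interop is needed to try it. It assumes the body is well-formed XML once the surrounding whitespace is trimmed.

```csharp
using System;
using System.Xml.Linq;

public static class XmlBodyCleaner
{
    // Hypothetical helper (not part of the original post): parses an XML
    // string and returns it re-indented, without the blank lines that come
    // from whitespace-only text between elements.
    public static string Normalize(string body)
    {
        // Step 1: trim, so that an XML declaration (if present) is the very
        // first thing in the string; leading whitespace before it makes
        // XDocument.Parse throw.
        string trimmed = body.Trim();

        // Step 2: LoadOptions.None does not preserve insignificant
        // whitespace, so whitespace-only nodes between elements are dropped.
        XDocument doc = XDocument.Parse(trimmed, LoadOptions.None);

        // ToString() with default options re-indents the tree. Note that
        // XDocument.ToString() omits the XML declaration.
        return doc.ToString();
    }
}
```

In the loop from the question, this would replace the regex call and the round trip through XmlDocument. If the XML declaration should be kept in the output file, calling doc.Save(path) on the XDocument directly writes it, whereas ToString() does not.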