简体   繁体   中英

XML to CSV using c# without using xslt

I have been searching all over the web, i'm assuming some one has to have needed this before me and done it better, for an xml to csv converter. I have a very standard xml below:

<ArrayOfDealer xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
    <Dealer>
        <Cmf>76066699</Cmf>
        <DealerNumber/>
        <DealershipName>BROWARD MOTORSPORTS - WPB</DealershipName>
    </Dealer>
    <Dealer>
        <Cmf>76071027</Cmf>
        <DealerNumber/>
        <DealershipName>BROWARD MOTORSPORTS OF FT LAUDERDALE LLC</DealershipName>
    </Dealer>
    <Dealer>
        <Cmf>76014750</Cmf>
        <DealerNumber/>
        <DealershipName>Jet Ski of Miami</DealershipName>
    </Dealer>
    <Dealer>
        <Cmf>76066987</Cmf>
        <DealerNumber/>
        <DealershipName>BROWARD MOTORSPORTS - Davie</DealershipName>
    </Dealer>
</ArrayOfDealer>

And I would like to parse this to something like

cmf      dealernumber    dealershipname
76066699                 BROWARD MOTORSPORTS - WPB
76014750                 Jet Ski of Miami
76066987                 BROWARD MOTORSPORTS - Davie

XML is coming in from an api which I have stored as a string.

Thoughts?

Edit: To clarify, I know the structure will be like the above, with no further nesting of nodes, but the actual tag names can very.

Would something like this get you what you need:

Func<string, string> csvFormat =
    t => String.Format("\"{0}\"", t.Replace("\"", "\"\""));

var xml = XDocument.Parse(/* xml text here */);

Func<XDocument, IEnumerable<string>> getFields =
    xd =>
        xd
            .Descendants("Dealer")
            .SelectMany(d => d.Elements())
            .Select(e => e.Name.ToString())
            .Distinct();

var headers =
    String.Join(",",
        getFields(xml)
            .Select(f => csvFormat(f)));

var query =
    from dealer in xml.Descendants("Dealer")
    select string.Join(",",
        getFields(xml)
            .Select(f => dealer.Elements(f).Any()
                ? dealer.Element(f).Value
                : "")
            .Select(x => csvFormat(x)));

var csv =
    String.Join(Environment.NewLine,
        new [] { headers }.Concat(query));

This still assumes that the <ArrayOfDealer><Dealer> structure says the same, but the fields below might change.

This is rough but it will create a CSV file or a tab delimited text file as long as structure stays with

Root -> Main Element (for each row) -> Child Elements (any number)

I included 2 different test sets of XML so you can see the result generates correctly.

Working sample:

namespace XmlToCsv
{
    class Program
    {
        const int TabSpaces = 8;

        static void GenerateCsvFromXml(string xmlString, string resultFileName, bool isTabDelimited)
        {
            XDocument xDoc = XDocument.Parse(xmlString);

            var tabsNeededList = new List<int>(); // only used for TabDelimited file

            string delimiter = isTabDelimited
                ? "\t"
                : ",";

            // Get title row 
            var titlesList = xDoc.Root
                .Elements()
                .First()
                .Elements()
                .Select(s => s.Name.LocalName)
                .ToList();

            // Get the values
            var masterValuesList = xDoc.Root
                .Elements()
                .Select(e => e
                    .Elements()
                    .Select(c => c.Value)
                    .ToList())
                .ToList();

            // Add titles as first row in master values list
            masterValuesList.Insert(0, titlesList);

            // For tab delimited, we need to figure out the number of tabs
            // needed to keep the file uniform, for each column
            if (isTabDelimited)
            {
                for (var i = 0; i < titlesList.Count; i++)
                {
                    int maxLength =
                        masterValuesList
                            .Select(vl => vl[i].Length)
                            .Max();

                    // assume tab is 4 characters
                    int rem;
                    int tabsNeeded = Math.DivRem(maxLength, TabSpaces, out rem);
                    tabsNeededList.Add(tabsNeeded);
                }
            }

            // Write the file
            using (var fs = new FileStream(resultFileName, FileMode.Create))
            using (var sw = new StreamWriter(fs))
            {
                foreach (var values in masterValuesList)
                {
                    string line = string.Empty;

                    foreach (var value in values)
                    {
                        line += value;
                        if (titlesList.IndexOf(value) < titlesList.Count - 1)
                        {
                            if (isTabDelimited)
                            {
                                int rem;
                                int tabsUsed = Math.DivRem(value.Length, TabSpaces, out rem);
                                int tabsLeft = tabsNeededList[values.IndexOf(value)] - tabsUsed + 1; // one tab is always needed!

                                for (var i = 0; i < tabsLeft; i++)
                                {
                                    line += delimiter;
                                }
                            }
                            else // comma delimited
                            {
                                line += delimiter;
                            }
                        }
                    }

                    sw.WriteLine(line);
                }
            }
        }

        static void Main(string[] args)
        {
            String xmlString = @"<ArrayOfDealer xmlns:i=""http://www.w3.org/2001/XMLSchema-instance"">
                <Dealer>
                    <Cmf>76066699</Cmf>
                    <DealerNumber/>
                    <DealershipName>BROWARD MOTORSPORTS - WPB</DealershipName>
                </Dealer>
                <Dealer>
                    <Cmf>76071027</Cmf>
                    <DealerNumber/>
                    <DealershipName>BROWARD MOTORSPORTS OF FT LAUDERDALE LLC</DealershipName>
                </Dealer>
                <Dealer>
                    <Cmf>76014750</Cmf>
                    <DealerNumber/>
                    <DealershipName>Jet Ski of Miami</DealershipName>
                </Dealer>
                <Dealer>
                    <Cmf>76066987</Cmf>
                    <DealerNumber/>
                    <DealershipName>BROWARD MOTORSPORTS - Davie</DealershipName>
                </Dealer>
            </ArrayOfDealer>";

            String xmlString2 = @"<ArrayOfUnicorn xmlns:i=""http://www.w3.org/2001/XMLSchema-instance"">
                <Unicorn>
                    <UnicornColor>Red</UnicornColor>
                    <Cmf>76066699</Cmf>
                    <UnicornNumber/>
                    <UnicornshipName>BROWARD MOTORSPORTS - WPB</UnicornshipName>
                </Unicorn>
                <Unicorn>
                    <UnicornColor>Red</UnicornColor>
                    <Cmf>76071027</Cmf>
                    <UnicornNumber/>
                    <UnicornshipName>BROWARD MOTORSPORTS OF FT LAUDERDALE LLC</UnicornshipName>
                </Unicorn>
                <Unicorn>
                    <UnicornColor>Red</UnicornColor>
                    <Cmf>76014750</Cmf>
                    <UnicornNumber/>
                    <UnicornshipName>Jet Ski of Miami</UnicornshipName>
                </Unicorn>
                <Unicorn>
                    <UnicornColor>Red</UnicornColor>
                    <Cmf>76066987</Cmf>
                    <UnicornNumber/>
                    <UnicornshipName>BROWARD MOTORSPORTS - Davie</UnicornshipName>
                </Unicorn>
            </ArrayOfUnicorn>";

            // Comma delimited
            GenerateCsvFromXml(xmlString, "Dealer.csv", false);
            GenerateCsvFromXml(xmlString2, "Unicorn.csv", false);

            // Tab delimited
            GenerateCsvFromXml(xmlString, "Dealer.txt", true);
            GenerateCsvFromXml(xmlString2, "Unicorn.txt", true);

        }
    }
}

Results:

Dealer.csv:

Cmf,DealerNumber,DealershipName
76066699,,BROWARD MOTORSPORTS - WPB,
76071027,,BROWARD MOTORSPORTS OF FT LAUDERDALE LLC,
76014750,,Jet Ski of Miami,
76066987,,BROWARD MOTORSPORTS - Davie,

Unicorn.csv:

UnicornColor,Cmf,UnicornNumber,UnicornshipName
Red,76066699,,BROWARD MOTORSPORTS - WPB,
Red,76071027,,BROWARD MOTORSPORTS OF FT LAUDERDALE LLC,
Red,76014750,,Jet Ski of Miami,
Red,76066987,,BROWARD MOTORSPORTS - Davie,

Unicorn.txt (tab delimited):

UnicornColor    Cmf         UnicornNumber   UnicornshipName
Red             76066699                    BROWARD MOTORSPORTS - WPB           
Red             76071027                    BROWARD MOTORSPORTS OF FT LAUDERDALE LLC    
Red             76014750                    Jet Ski of Miami                
Red             76066987                    BROWARD MOTORSPORTS - Davie         

Deserialize your xml to an object using XmlSerializer . Then use the approach in this question to serialize your object to a csv file.

The code below transforms a simple xml content (as provided by the question) to a sequence of string (header + rows) in a csv format in a generic way, it is just necessary to implement a more reliable function to escape values (identity transformation in the example below).

string csvSeparator = ",";
Func<string, string> escapeValue = val => val;

string xml = "xml content";
XDocument doc = XDocument.Parse(xml);

var headers = doc.Root
                .Elements()
                .First()
                .Elements()
                .Select(el => el.Name.LocalName);

var headerRow = string.Join(csvSeparator, headers);

var rows = from el in doc.Root.Elements()
            let values = from prop in el.Elements()
                        select escapeValue(prop.Value)
            let row = string.Join(csvSeparator, values)
            select row;


IEnumerable<string> csvLines = new[] { headerRow }.Concat(rows);

If you need the entire content of the csv file, you can do:

string csvContent = string.Join(Environment.NewLine, csvLines);

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM