How to copy a list in a Word table cell to an Excel cell

I have the following test table in Word, with one cell having a multilevel list:

[Image: enter image description here]

Using the code below, I can copy cells from the Word Table to a corresponding cell in an Excel worksheet:

foreach (Microsoft.Office.Interop.Word.Table table in objDoc.Tables)
{
    for (int row = 1; row <= table.Rows.Count; row++)
    {
        for (int col = 1; col <= table.Columns.Count; col++)
        {
            string text = table.Cell(row, col).Range.Text;
            worksheet.Cells[row, col] = text;
        }
    }
}

However, I get the following result where the Word cell containing the list is not copied properly into Excel:

[Image: enter image description here]

I have also tried the following:

worksheet.Cells[row, col] = table.Cell(row, col).Range.FormattedText;

But I get the same results.

I also tried converting the list in the Word file by copying and pasting with Keep Text Only to remove Word's automatic formatting, and manually deleting the tabs. That yielded this result:

[Image: enter image description here]

Although I can get the text with the list numbers, I do not get a carriage return, line break, or line feed to separate the items in the list.

At the very least, I would like to preserve the list numbering and line breaks without having to manually cut/paste with Keep Text Only; and I want to avoid having to parse the text for the list numbers (which could be numbers or letters) and inserting line feeds.

There are multiple problems involved with achieving the stated result:

  1. Excel doesn't use the same character as Word for new lines or new paragraphs. (In this case it must be new paragraphs since the numbering is being generated.) Excel wants ANSI 10; Word is using ANSI 13. So that needs to be converted.

  2. Automatic list numbering is formatting. Passing a string loses formatting; it can only be carried across using Copy. Otherwise, the numbering has to be converted to plain text.

  3. Another issue is the "dot" at the end of the cell content, which is again ANSI 13 in combination with ANSI 7 (end-of-cell marker). This should also be removed.
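The character-level conversions in points 1 and 3 can be sketched without any Office interop, since they are plain string operations. The class and method names below (`WordCellText`, `ToExcelCellText`) are hypothetical and not part of any Office API. The sketch assumes the input is the raw `Range.Text` of a Word table cell:

```csharp
using System;

public static class WordCellText
{
    // Hypothetical helper, not part of the Office interop API.
    // Strips the trailing paragraph mark (char 13, '\r') and end-of-cell
    // marker (char 7, '\a') that Word appends to a table cell's text, then
    // converts the remaining paragraph marks to line feeds (char 10, '\n').
    public static string ToExcelCellText(string wordCellText)
    {
        if (string.IsNullOrEmpty(wordCellText))
            return string.Empty;

        // TrimEnd removes every trailing occurrence of the listed characters.
        string trimmed = wordCellText.TrimEnd('\r', '\a');

        // Strings are immutable, so Replace returns a new string.
        return trimmed.Replace('\r', '\n');
    }
}
```

Calling `WordCellText.ToExcelCellText("First\rSecond\r\a")` would return `"First\nSecond"`. Point 2 (the list numbers) is not covered here, because the numbers are not in the string unless they are first converted to text in Word.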

The following bit of sample code takes care of all three conversions. (Note: this is VBA code that I've converted off the top of my head, so watch out for small syntax "gotchas")

    Word.Range rng = table.Cell(rowCounter, colCounter).Range;
    //convert the numbers to plain text, then undo the conversion
    rng.ListFormat.ConvertNumbersToText();
    string cellContent = rng.Text;
    objDoc.Undo(1);
    //remove end-of-cell characters
    cellContent = TrimCellText2(cellContent);
    //replace remaining paragraph marks with the Excel new line character
    //(strings are immutable, so the result must be assigned back)
    cellContent = cellContent.Replace((char)13, (char)10);
    worksheet.Cells[rowCounter, colCounter].Value = cellContent;

//cut off ANSI 13 + ANSI 7 from the end of the string coming from a 
//Word table cell
private string TrimCellText2(string s)
{
    int len = s.Length;
    //step back over trailing paragraph marks and end-of-cell markers
    while (len > 0 && (s[len - 1] == (char)13 || s[len - 1] == (char)7))
        len--;
    return s.Substring(0, len);
}

With the help of Cindy Meister, combined with the answer from Paul Walls in this other question for replacing characters in a C# string, here is the resulting answer.

foreach (Microsoft.Office.Interop.Word.Table table in objDoc.Tables)
{             
    for (int row = 1; row <= table.Rows.Count; row++)
    {
        for (int col = 1; col <= table.Columns.Count; col++)
        {
            // Convert the formatted list number to plain text, then undo the conversion                   
            table.Cell(row, col).Range.ListFormat.ConvertNumbersToText();
            string cellContent = table.Cell(row, col).Range.Text;
            objDoc.Undo(1);

            // remove end-of-cell characters
            cellContent = trimCellText2(cellContent);

            // Replace remaining paragraph marks with the Excel newline character
            char[] linefeeds = new char[] { '\r', '\n' };
            string[] temp1 = cellContent.Split(linefeeds, StringSplitOptions.RemoveEmptyEntries);
            cellContent = String.Join("\n", temp1);

            // Replace tabs from the list format conversion with spaces (runs of tabs/spaces collapse to one space)
            char[] tabs = new char[] { '\t', ' ' };
            string[] temp2 = cellContent.Split(tabs, StringSplitOptions.RemoveEmptyEntries);
            cellContent = String.Join(" ", temp2);

            worksheet.Cells[row, col] = cellContent;
        }
    }
}

private static string trimCellText2(string myString)
{
    int len = myString.Length;

    // Step back over trailing paragraph marks (char 13) and end-of-cell markers (char 7)
    while (len > 0 && (myString[len - 1] == (char)13 || myString[len - 1] == (char)7))
        len--;
    return myString.Substring(0, len);
}
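The Split/Join step that replaces tabs can be looked at in isolation. The following is a minimal sketch with hypothetical names (`ListTextCleanup`, `CollapseTabsAndSpaces`). It assumes the text already uses `\n` between list items and that the tab after each list number came from the number-to-text conversion:

```csharp
using System;

public static class ListTextCleanup
{
    // Hypothetical helper: splits on tabs and spaces, drops the empty
    // pieces, and rejoins with single spaces. Every run of tabs and/or
    // spaces collapses to one space, and leading or trailing tabs and
    // spaces are dropped. Line feeds are not separators, so they survive.
    public static string CollapseTabsAndSpaces(string text)
    {
        if (text == null)
            return string.Empty;

        string[] parts = text.Split(new[] { '\t', ' ' },
                                    StringSplitOptions.RemoveEmptyEntries);
        return string.Join(" ", parts);
    }
}
```

For example, `ListTextCleanup.CollapseTabsAndSpaces("1.\tFirst  item\n2.\tSecond")` would return `"1. First item\n2. Second"`. Because repeated spaces are also collapsed, this approach is only suitable when the cell text does not rely on deliberate multiple spaces.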

And here is the resulting output in Excel: [Image: Excel Output]
