OpenXML Excel: how to change the value of a cell when the value is in the SharedStringTable

I am looking for a safe and efficient way to update the value of a cell whose text may be stored in the SharedStringTable (this appears to be the case for any spreadsheet created by MS Excel).

As the name implies, the SharedStringTable contains strings that may be used by multiple cells.

So simply finding the item in the string table and updating its value is NOT the way to go, because the item may be in use by other cells as well.

As far as I understand one must do the following:

  1. Check whether the cell is using the string table

  2. If so, check whether the new string is already in the table, in which case just refer to it (remember to remove the item with the old string if no other cell uses it any longer!)

  3. If not, check whether the item with the old string is referred to by any other cell in the spreadsheet

  4. If so, create a new item with the new string and refer to it

  5. If not, just update the existing item with the new string

Is there an easier solution to this using the OpenXML SDK?

Also consider that one may want to update not only one cell but rather set new (different) values for several cells. So we may be calling the update cell method in a loop ...
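For the loop case, one possible way to avoid rescanning every cell on each update is to count the references to each shared string index once, up front. The following is only a sketch under that assumption: `SharedStringUsage` and `CountReferences` are hypothetical names, not part of the OpenXML SDK, and the counts go stale unless the caller adjusts them after each cell update. It also counts references from worksheet cells only.

```csharp
using System.Collections.Generic;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

static class SharedStringUsage
{
    // Hypothetical helper: maps a shared string index to the number of
    // worksheet cells that currently refer to it.
    public static Dictionary<int, int> CountReferences(WorkbookPart workbookPart)
    {
        var counts = new Dictionary<int, int>();
        foreach (WorksheetPart worksheetPart in workbookPart.WorksheetParts)
        {
            foreach (Cell cell in worksheetPart.Worksheet.Descendants<Cell>())
            {
                // Only cells typed as shared strings hold a table index.
                if (cell.DataType == null
                    || cell.DataType.Value != CellValues.SharedString) continue;
                if (cell.CellValue == null
                    || !int.TryParse(cell.CellValue.Text, out int index)) continue;

                // TryGetValue leaves 'current' at 0 for an index not seen yet.
                counts.TryGetValue(index, out int current);
                counts[index] = current + 1;
            }
        }
        return counts;
    }
}
```

With such a map, "is the old item used by any other cell?" becomes a lookup (a count greater than 1) instead of a scan over all sheets.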

Here is my first take on this. It appears to work for my particular case, but it must be possible to improve on it or, even better, to do it in a totally different way:

// Requires: using System.Collections.Generic; using System.Linq;
//           using DocumentFormat.OpenXml.Spreadsheet;
private static void UpdateCell(SharedStringTable sharedStringTable, 
   Dictionary<string, SheetData> sheetDatas, string sheetName, 
   string cellReference, string text)
{
   Cell cell = sheetDatas[sheetName].Descendants<Cell>()
    .FirstOrDefault(c => c.CellReference != null && c.CellReference.Value == cellReference);
   if (cell == null) return;
   if (cell.DataType == null || cell.DataType.Value != CellValues.SharedString
    || cell.CellValue == null)
   {
    // Cell does not use the string table: replace its content with an inline string.
    cell.RemoveAllChildren();
    cell.AppendChild(new InlineString(new Text { Text = text }));
    cell.DataType = CellValues.InlineString;
    return;
   }
   // Cell refers to the string table. Check if the new text is already in the table; if so, use it.
   List<SharedStringItem> sharedStringItems 
    = sharedStringTable.Elements<SharedStringItem>().ToList();
   string oldIndex = cell.CellValue.InnerText;
   for (int i = 0; i < sharedStringItems.Count; i++)
   {
    if (sharedStringItems[i].InnerText == text)
    {
       cell.CellValue = new CellValue(i.ToString());
       // TODO: Should clean up, i.e. remove the item with the old text from the string table if it is no longer in use.
       return;
    }
   }
   // New text is not in the string table. Check if any other cell in the workbook refers to the item with the old text.
   foreach (SheetData sheetData in sheetDatas.Values)
   {
    foreach (Cell otherCell in sheetData.Descendants<Cell>())
    {
       if (ReferenceEquals(otherCell, cell)) continue;
       if (otherCell.DataType != null 
       && otherCell.DataType.Value == CellValues.SharedString 
       && otherCell.CellValue != null
       && otherCell.CellValue.InnerText == oldIndex)
       {
        // Another cell refers to the item with the old text, so we cannot update it. Add a new item;
        // its index equals the number of items that were in the table before the append.
        sharedStringTable.AppendChild(new SharedStringItem(new Text(text)));
        cell.CellValue = new CellValue(sharedStringItems.Count.ToString());
        return;
       }
    }
   }
   // No other cell refers to the old item. Update it in place,
   // dropping any rich text runs so only the new text remains.
   SharedStringItem oldItem = sharedStringItems[int.Parse(oldIndex)];
   oldItem.RemoveAllChildren();
   oldItem.Text = new Text(text);
}

....

private static void DoIt(string filePath)
{
   using (SpreadsheetDocument spreadSheet = SpreadsheetDocument.Open(filePath, true))
   {
    // Note: First() throws if the workbook has no shared string table part.
    SharedStringTable sharedStringTable 
       = spreadSheet.WorkbookPart.GetPartsOfType<SharedStringTablePart>()
        .First().SharedStringTable;
    Dictionary<string, SheetData> sheetDatas = new Dictionary<string, SheetData>();
    foreach (var sheet in spreadSheet.WorkbookPart.Workbook.Descendants<Sheet>())
    {
       // Sheets that are not worksheets (e.g. chart sheets) have no SheetData; skip them.
       var worksheetPart 
        = spreadSheet.WorkbookPart.GetPartById(sheet.Id) as WorksheetPart;
       if (worksheetPart == null) continue;
       SheetData sheetData = worksheetPart.Worksheet.GetFirstChild<SheetData>();
       sheetDatas.Add(sheet.Name, sheetData);
    }
    UpdateCell(sharedStringTable, sheetDatas, "Sheet1", "A2", "Mjau");
   }
}

WARNING: Do NOT use the above as is; it works with one particular spreadsheet, and there are very likely cases it does not handle if you use it in other situations. This is my first attempt at OpenXML for spreadsheets. I ended up following the suggestion made by George Polevoy. It is much easier and appears to have no ill side effects. (That said, there are a million other issues to handle when manipulating spreadsheets that may be edited outside your control...)

As you can see, updating the shared string table really keeps developers busy.

In my experience, the shared string table does not add anything in terms of performance or file size economy. The OpenXml format is compressed inside a packaging container anyway, so even massively duplicated strings have little effect on the file size.

Microsoft Excel writes every string to the shared string table, even when there is no duplication.

I'd recommend simply converting everything to InlineStrings before modifying the document; any further operation then becomes as simple as it gets.

You can write every string as an InlineString, and the result is a functionally equivalent document file.

Microsoft Excel will convert the strings back to the shared string table when the file is edited, but who cares.

I would suggest that the shared string table feature be removed in future versions of the standard, unless it is justified by some sound benchmarks.
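A minimal sketch of that conversion, assuming the OpenXML SDK packaging and spreadsheet classes. `InlineStringConverter` and `ConvertSharedStringsToInline` are hypothetical names chosen for illustration. Note that `InnerText` flattens the whole item, so rich text formatting in a shared string is lost and any phonetic text would be concatenated as well; treat this as a starting point rather than a complete solution.

```csharp
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

static class InlineStringConverter
{
    // Hypothetical helper: rewrites every shared-string cell as an inline string.
    public static void ConvertSharedStringsToInline(string filePath)
    {
        using (SpreadsheetDocument doc = SpreadsheetDocument.Open(filePath, true))
        {
            WorkbookPart workbookPart = doc.WorkbookPart;
            SharedStringTablePart sstPart = workbookPart.SharedStringTablePart;
            if (sstPart == null || sstPart.SharedStringTable == null) return;

            // Snapshot the table so an index lookup is O(1).
            SharedStringItem[] items = sstPart.SharedStringTable
                .Elements<SharedStringItem>().ToArray();

            foreach (WorksheetPart worksheetPart in workbookPart.WorksheetParts)
            {
                foreach (Cell cell in worksheetPart.Worksheet.Descendants<Cell>())
                {
                    if (cell.DataType == null
                        || cell.DataType.Value != CellValues.SharedString) continue;
                    if (cell.CellValue == null
                        || !int.TryParse(cell.CellValue.Text, out int index)) continue;
                    if (index < 0 || index >= items.Length) continue;

                    // Assumption: plain text is enough; formatting runs are dropped.
                    string text = items[index].InnerText;

                    cell.CellValue.Remove();
                    cell.InlineString = new InlineString(
                        new Text(text) { Space = SpaceProcessingModeValues.Preserve });
                    cell.DataType = CellValues.InlineString;
                }
                worksheetPart.Worksheet.Save();
            }
        }
    }
}
```

After this pass, changing a cell's text only touches that cell. The sketch leaves the now unreferenced shared string table part in the package.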
