How do I convert an HTML table into JSON in Logic Apps?

I am building a Logic App to process the Call Before You Dig email replies we receive. The first email is a confirmation that includes a table listing which utility providers have been notified. I would like to add the contents of that table to an Excel spreadsheet and add our own reference number in the process. I found a possible solution in this answer, which was taken from John Dyer's blog.

function tableToJson(table) {
    var data = [];

    // first row needs to be headers
    var headers = [];
    for (var i=0; i<table.rows[0].cells.length; i++) {
        headers[i] = table.rows[0].cells[i].innerHTML.toLowerCase().replace(/ /gi,'');
    }

    // go through cells
    for (var i=1; i<table.rows.length; i++) {

        var tableRow = table.rows[i];
        var rowData = {};

        for (var j=0; j<tableRow.cells.length; j++) {

            rowData[ headers[j] ] = tableRow.cells[j].innerHTML;

        }

        data.push(rowData);
    }       

    return data;
}

I tried to use this code in an Azure Function, but it got too complicated for me when it made me download Visual Studio. I could not find a way to just add the code in the portal. I installed Visual Studio, but it got beyond me very quickly.

I found an online converter and used it to convert the code into C++, hoping I could use it in a .NET function.

#include <stdio.h>
int main()
{
      printf("function tableToJson(table) {
    var data = [];

    // first row needs to be headers
    var headers = [];
    for (var i=0; i<table.rows[0].cells.length; i++) {
        headers[i] = table.rows[0].cells[i].innerHTML.toLowerCase().replace(/ /gi,'');
    }

    // go through cells
    for (var i=1; i<table.rows.length; i++) {

        var tableRow = table.rows[i];
        var rowData = {};

        for (var j=0; j<tableRow.cells.length; j++) {

            rowData[ headers[j] ] = tableRow.cells[j].innerHTML;

        }

        data.push(rowData);
    }       

    return data;
}\n");
      return 0;
}

I have used two Compose actions to trim the email's content down to just the table in question:

 <table class="MsoNormalTable" border="1" cellspacing="0" cellpadding="0" width="100%" style="width:100.0%; border-collapse:collapse; border:none"> <tbody> <tr> <td colspan="3" valign="top" style="border:solid gray 1.0pt; background:#9CCC6B; padding:3.75pt.75pt 3.75pt 3.75pt"> <p class="MsoNormal"> <b> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black">MEMBERS NOTIFIED: The following owners of underground infrastructure in the area of your excavation site have been notified.</span> </b> </p> </td> </tr> <tr> <td width="50%" valign="top" style="width:50.0%; border-top:none; border-left:solid gray 1.0pt; border-bottom:solid gray 1.0pt; border-right:none; background:#9CCC6B; padding:3.75pt.75pt 3.75pt 3.75pt"> <p class="MsoNormal" align="center" style="text-align:center"> <b> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black">Member name</span> </b> </p> </td> <td width="25%" valign="top" style="width:25.0%; border-top:none; border-left:solid gray 1.0pt; border-bottom:solid gray 1.0pt; border-right:none; background:#9CCC6B; padding:3.75pt.75pt 3.75pt 3.75pt"> <p class="MsoNormal" align="center" style="text-align:center"> <b> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black">Station Code</span> </b> </p> </td> <td width="25%" valign="top" style="width:25.0%; border:solid gray 1.0pt; border-top:none; background:#9CCC6B; padding:3.75pt.75pt 3.75pt 3.75pt"> <p class="MsoNormal" align="center" style="text-align:center"> <b> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black">Initial Status</span> </b> </p> </td> </tr> <tr> <td valign="top" style="border:none; border-left:solid gray 1.0pt; padding:3.75pt.75pt 3.75pt 3.75pt"> <p class="MsoNormal" align="center" style="text-align:center"> <span class="value"> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">G-TEL FOR ENBRIDGE GAS (LEGACY UNION GAS) (ENOW01)</span> 
</span> <b> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span> </b> </p> </td> <td valign="top" style="border:none; border-left:solid gray 1.0pt; padding:3.75pt.75pt 3.75pt 3.75pt"> <p class="MsoNormal" align="center" style="text-align:center"> <span class="value"> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">ENOW01</span> </span> <b> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span> </b> </p> </td> <td valign="top" style="border-top:none; border-left:solid gray 1.0pt; border-bottom:none; border-right:solid gray 1.0pt; padding:3.75pt.75pt 3.75pt 3.75pt"> <p class="MsoNormal" align="center" style="text-align:center"> <span class="value"> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">Notification sent</span> </span> <b> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span> </b> </p> </td> </tr> <tr> <td valign="top" style="border:none; border-left:solid gray 1.0pt; padding:3.75pt.75pt 3.75pt 3.75pt"> <p class="MsoNormal" align="center" style="text-align:center"> <span class="value"> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">CITY OF STRATFORD (STRATWS01)</span> </span> <b> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span> </b> </p> </td> <td valign="top" style="border:none; border-left:solid gray 1.0pt; padding:3.75pt.75pt 3.75pt 3.75pt"> <p class="MsoNormal" align="center" style="text-align:center"> <span class="value"> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">STRATWS01</span> </span> <b> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span> </b> </p> </td> <td valign="top" style="border-top:none; border-left:solid gray 1.0pt; border-bottom:none; border-right:solid gray 1.0pt; padding:3.75pt.75pt 3.75pt 3.75pt"> <p class="MsoNormal" align="center" 
style="text-align:center"> <span class="value"> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">Notification sent</span> </span> <b> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span> </b> </p> </td> </tr> <tr> <td valign="top" style="border:none; border-left:solid gray 1.0pt; padding:3.75pt.75pt 3.75pt 3.75pt"> <p class="MsoNormal" align="center" style="text-align:center"> <span class="value"> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">FESTIVAL HYDRO (LOCAL HYDRO) (FESTH01)</span> </span> <b> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span> </b> </p> </td> <td valign="top" style="border:none; border-left:solid gray 1.0pt; padding:3.75pt.75pt 3.75pt 3.75pt"> <p class="MsoNormal" align="center" style="text-align:center"> <span class="value"> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">FESTH01</span> </span> <b> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span> </b> </p> </td> <td valign="top" style="border-top:none; border-left:solid gray 1.0pt; border-bottom:none; border-right:solid gray 1.0pt; padding:3.75pt.75pt 3.75pt 3.75pt"> <p class="MsoNormal" align="center" style="text-align:center"> <span class="value"> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">Notification sent</span> </span> <b> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span> </b> </p> </td> </tr> <tr> <td valign="top" style="border:none; border-left:solid gray 1.0pt; padding:3.75pt.75pt 3.75pt 3.75pt"> <p class="MsoNormal" align="center" style="text-align:center"> <span class="value"> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">WIGHTMAN TELECOM - FIBRE - LIMITED (WT01)</span> </span> <b> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span> </b> </p> </td> 
<td valign="top" style="border:none; border-left:solid gray 1.0pt; padding:3.75pt.75pt 3.75pt 3.75pt"> <p class="MsoNormal" align="center" style="text-align:center"> <span class="value"> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">WT01</span> </span> <b> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span> </b> </p> </td> <td valign="top" style="border-top:none; border-left:solid gray 1.0pt; border-bottom:none; border-right:solid gray 1.0pt; padding:3.75pt.75pt 3.75pt 3.75pt"> <p class="MsoNormal" align="center" style="text-align:center"> <span class="value"> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">Notification sent</span> </span> <b> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span> </b> </p> </td> </tr> <tr> <td valign="top" style="border:none; border-left:solid gray 1.0pt; padding:3.75pt.75pt 3.75pt 3.75pt"> <p class="MsoNormal" align="center" style="text-align:center"> <span class="value"> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">CLI FOR ROGERS (ROGWAT01)</span> </span> <b> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span> </b> </p> </td> <td valign="top" style="border:none; border-left:solid gray 1.0pt; padding:3.75pt.75pt 3.75pt 3.75pt"> <p class="MsoNormal" align="center" style="text-align:center"> <span class="value"> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">ROGWAT01</span> </span> <b> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span> </b> </p> </td> <td valign="top" style="border-top:none; border-left:solid gray 1.0pt; border-bottom:none; border-right:solid gray 1.0pt; padding:3.75pt.75pt 3.75pt 3.75pt"> <p class="MsoNormal" align="center" style="text-align:center"> <span class="value"> <span style="font-size:9.0pt; 
font-family:&quot;Arial&quot;,sans-serif">Cleared</span> </span> <b> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span> </b> </p> </td> </tr> <tr> <td valign="top" style="border-top:none; border-left:solid gray 1.0pt; border-bottom:solid gray 1.0pt; border-right:none; padding:3.75pt.75pt 3.75pt 3.75pt"> <p class="MsoNormal" align="center" style="text-align:center"> <span class="value"> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">G-TEL FOR BELL CANADA (BCOW01)</span> </span> <b> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span> </b> </p> </td> <td valign="top" style="border-top:none; border-left:solid gray 1.0pt; border-bottom:solid gray 1.0pt; border-right:none; padding:3.75pt.75pt 3.75pt 3.75pt"> <p class="MsoNormal" align="center" style="text-align:center"> <span class="value"> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">BCOW01</span> </span> <b> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span> </b> </p> </td> <td valign="top" style="border:solid gray 1.0pt; border-top:none; padding:3.75pt.75pt 3.75pt 3.75pt"> <p class="MsoNormal" align="center" style="text-align:center"> <span class="value"> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">Notification sent</span> </span> <b> <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span> </b> </p> </td> </tr> </tbody> </table>

This HTML becomes the input to an Azure Function action. The Function itself is the one I created from the C++ code produced by the conversion shown above.

I am getting an Internal Server Error 500 on the output.

I am hoping someone can point me in the right direction to solve this. I have obviously done something wrong!

I hope I'm on the right track. The code below, although quite specific to your use case, will read the HTML table and return a JSON representation of the data.

Just create a new Azure Function in .NET called ConvertHtmlTableToJson and paste it in.

#r "Newtonsoft.Json"

using System.Net;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.Primitives;
using System.Collections.Generic;
using Newtonsoft.Json;
using System.Xml;

public static async Task<IActionResult> Run(HttpRequest req, ILogger log)
{
    var outputTable = new List<List<String>>();

    string requestBody = String.Empty;

    using (StreamReader streamReader = new StreamReader(req.Body))
    {
        requestBody = await streamReader.ReadToEndAsync();
    }

    // The request body is JSON with a Base64-encoded "Content" property.
    dynamic data = JsonConvert.DeserializeObject(requestBody);
    string xmlString = System.Text.Encoding.UTF8.GetString(Convert.FromBase64String((string)data?.Content));

    var xmlDocument = new XmlDocument();
    xmlDocument.LoadXml(xmlString);

    // Get the rows
    var xmlRows = xmlDocument.DocumentElement.SelectNodes("//tr");

    foreach (XmlNode xmlRow in xmlRows)
    {
        // Now get the columns.
        var xmlColumns = xmlRow.SelectNodes(".//td");
        var row = new List<string>();

        foreach (XmlNode xmlColumn in xmlColumns)
        {
            var value = xmlColumn.SelectSingleNode(".//span[@class='value']");

            if (value != null)
                row.Add(value?.InnerText);
        }

        if (row.Count > 0)
            outputTable.Add(row);
    }
   
    return new OkObjectResult(outputTable);
}

It accepts a JSON body whose Content property is a Base64 string of the HTML you provided in your example.

A few things to note...

  • It's hardcoded to look for span elements with an attribute of class="value", which is in line with the email you receive and the HTML you provided.
  • It doesn't check for imbalanced columns, i.e. if a row is missing a value, you may get two columns for one row and three columns for another.
  • Headers are ignored because it only searches within the td elements where there's a span element with the attribute criteria as specified in the first point.
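
If the imbalanced-column caveat is a concern, a check along these lines could run on the returned array before anything is written to Excel. This is only a sketch in plain JavaScript; `findUnbalancedRows` and the sample data are hypothetical and not part of the function above.

```javascript
// Hypothetical helper: report rows whose column count differs from the expected one.
function findUnbalancedRows(rows, expectedColumns) {
  const problems = [];
  rows.forEach((row, index) => {
    if (row.length !== expectedColumns) {
      problems.push({ index, columns: row.length });
    }
  });
  return problems;
}

// Sample shaped like the function's output; the second row is missing its status cell.
const sample = [
  ["CITY OF STRATFORD (STRATWS01)", "STRATWS01", "Notification sent"],
  ["CLI FOR ROGERS (ROGWAT01)", "ROGWAT01"],
];

// Reports the row at index 1, which has 2 columns instead of 3.
console.log(findUnbalancedRows(sample, 3));
```

An empty result means every row has the expected three values.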

As long as your email stays the same and you pass in the same structure that you provided as an example, this will extract the data for you. Beyond that, it would need to be enhanced.

From there, you should be able to use the 2D array it returns to load your data to your Excel table.
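
Since the headers are not returned and you want to add your own reference number, one option is to reshape the 2D array into named records first. The sketch below is plain JavaScript under a few assumptions: the key names are my guess at sensible column names based on the header cells in your email, `rowsToRecords` is a hypothetical helper, and "REF-0001" is a placeholder.

```javascript
// Assumed column names, taken from the header cells in the sample email.
const HEADERS = ["memberName", "stationCode", "initialStatus"];

// Hypothetical helper: turn each row array into an object and attach a reference number.
function rowsToRecords(rows, headers, referenceNumber) {
  return rows.map((row) => {
    const record = { referenceNumber };
    headers.forEach((header, i) => {
      // Missing cells become empty strings so every record has the same keys.
      record[header] = row[i] !== undefined ? row[i] : "";
    });
    return record;
  });
}

const rows = [
  ["G-TEL FOR BELL CANADA (BCOW01)", "BCOW01", "Notification sent"],
  ["CLI FOR ROGERS (ROGWAT01)", "ROGWAT01", "Cleared"],
];

// "REF-0001" is a placeholder for your own reference number.
const records = rowsToRecords(rows, HEADERS, "REF-0001");
console.log(JSON.stringify(records, null, 2));
```

Each record then carries the reference number alongside the three table values, which should map onto one spreadsheet row.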

This is how I represented it in Logic Apps...

HTML Variable

This is the body in the request...

{
  "Content": "@{base64(variables('HTML'))}"
}
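
The `base64()` expression encodes the HTML so that it can travel inside a JSON string without its quotes needing to be escaped, and the function decodes it again as UTF-8. To illustrate the round trip outside Logic Apps, here is a rough Node.js sketch; `encodeHtml`, `decodeHtml` and the one-cell table are made up for the example, and `Buffer` is a Node.js global that browsers don't provide.

```javascript
// Encode a string as Base64, roughly what base64() does in the Logic App.
function encodeHtml(html) {
  return Buffer.from(html, "utf8").toString("base64");
}

// Decode Base64 back to a UTF-8 string, as the function does on its side.
function decodeHtml(content) {
  return Buffer.from(content, "base64").toString("utf8");
}

// A made-up one-cell table standing in for the real email HTML.
const html = '<table><tr><td><span class="value">WT01</span></td></tr></table>';

// Build a request body of the same shape as the one shown above.
const body = JSON.stringify({ Content: encodeHtml(html) });

// Parsing the body and decoding Content gives back the original HTML.
const roundTripped = decodeHtml(JSON.parse(body).Content);
console.log(roundTripped === html);
```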

Result

[
  [
    "G-TEL FOR ENBRIDGE GAS (LEGACY UNION GAS) (ENOW01)",
    "ENOW01",
    "Notification sent"
  ],
  [
    "CITY OF STRATFORD (STRATWS01)",
    "STRATWS01",
    "Notification sent"
  ],
  [
    "FESTIVAL HYDRO (LOCAL HYDRO) (FESTH01)",
    "FESTH01",
    "Notification sent"
  ],
  [
    "WIGHTMAN TELECOM - FIBRE - LIMITED (WT01)",
    "WT01",
    "Notification sent"
  ],
  [
    "CLI FOR ROGERS (ROGWAT01)",
    "ROGWAT01",
    "Cleared"
  ],
  [
    "G-TEL FOR BELL CANADA (BCOW01)",
    "BCOW01",
    "Notification sent"
  ]
]
