Convert Scala Code to Python for Reporting

I have this code in Scala and am not familiar enough with Python to convert it:

val formatterComma = java.text.NumberFormat.getIntegerInstance

def createTD(value: String) : String = {
  return s"""<td align="center" style="border:1px solid">${value}</td>"""
}

def createTD(value: BigInt) : String = {
  return createTD(value.toString)
}

def createTDDouble(value: Double) : String = {
  return createTD("$" + formatterComma.format(value))
}

def createTheLink(productId: String) : String = {
  return s"""<td align="center" style="border:1px solid"><a href="https://productLink/$productId">Link Here</a></td>"""
}

def createTH(value: String) : String = {
  return s"""<th class="gmail-highlight-red gmail-confluenceTh gmail-tablesorter-header gmail-sortableHeader gmail-tablesorter-headerUnSorted" tabindex="0" scope="col" style="width:1px;white-space:nowrap;border:1px solid #000000;padding:7px 15px 7px 10px;vertical-align:top;text-align:center;background:100% 50% no-repeat">
            <div class="gmail-tablesorter-header-inner" style="margin:0px;padding:0px"><h2 title="" style="margin:0.2px 0px 0px;padding:0px;font-size:20px;font-weight:normal;line-height:1.5;letter-spacing:-0.008em;border-bottom-color:rgb(50,199,208)"><strong>${value}</strong></h2>
            </div>
            </th>"""
}

final case class resultsOfReport (name: String, email: String, phone:String, productId : String, product: String,  cost : Double, reduction : Double);

def runReport(elements: Array[resultsOfReport]): String = {
  return elements.map {
        case resultsOfReport (name, email, phone, productId, product, cost, reduction)
         => s"""<tr>${createTD(name)}${createTD(email)}${createTD(phone)}${createTD(productId)}${createTD(product)}${createTDDouble(cost)}${createTDDouble(reduction)}${createTheLink(productId)}${createTD("Link to product")}</tr>"""
      }.mkString(s"""<table class="gmail-relative-table gmail-confluenceTable gmail-tablesorter gmail-tablesorter-default" style="border-collapse:collapse; margin:0px;overflow-x:auto;width:1200px"><tr>
               ${createTH("Name")}
                 ${createTH("Email")}
                 ${createTH("Phone")}
               ${createTH("ProductId")}
                ${createTH("Product")}
                 ${createTH("Cost")}
             ${createTH("Reduction")}
             ${createTH("Link")}
</tr>""","",
        "</table>")
}

It takes the data passed to the runReport method and maps it to the appropriate columns, creating a table from the data, which I then send out.

I need to call a Python method as part of this, and I cannot call a Python method from Scala in Databricks.

I started converting it but got stuck on how to make it work like the Scala method:

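from dataclasses import dataclass

# Named after the Scala case class so it does not clash with the function below
@dataclass
class ResultsOfReport:
    name: str
    email: str
    phone: str
    productId: str
    product: str
    cost: float
    reduction: float

def runReport(elements):
    ...  # unfinished: this is where I got stuck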

Edit: After trying things out, I think the only thing I still need is to work out how to do this part in Python:

 return elements.map {
        case resultsOfReport (name, email, phone, productId, product, cost, reduction)
         => s"""<tr>${createTD(name)}${createTD(email)}${createTD(phone)}${createTD(productId)}${createTD(product)}${createTDDouble(cost)}${createTDDouble(reduction)}${createTheLink(productId)}${createTD("Link to product")}</tr>"""
      }.mkString(s"""<table class="gmail-relative-table gmail-confluenceTable gmail-tablesorter gmail-tablesorter-default" style="border-collapse:collapse; margin:0px;overflow-x:auto;width:1200px"><tr>
               ${createTH("Name")}
                 ${createTH("Email")}
                 ${createTH("Phone")}
               ${createTH("ProductId")}
                ${createTH("Product")}
                 ${createTH("Cost")}
             ${createTH("Reduction")}
             ${createTH("Link")}
</tr>""","",
        "</table>")

The data comes in as [Row(Name='name', ...), ...], for example. I need to know how to map these Row key values to the column headers, as the Scala code above does.

Edit:

Expected input:

data = spark.sql("select * from test") 

Example data from the SQL DataFrame:

Row(name=jon, email=email.com, phone=324234, productId=1234, product=new, cost=500, stillInStock=y)

Calling the runReport method as written above in Scala:

html_returned = runReport(data)

The expected output is the HTML in the same format as the Scala code above produces.

Converting the first functions to their Python equivalents is fairly straightforward: create_td, create_td_double, create_the_link and create_th.
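For reference, here is one way those helpers might look. This is only a sketch based on the Scala in the question, with a few assumptions: the f-string format `,.0f` stands in for `NumberFormat.getIntegerInstance` (it always uses comma grouping rather than the JVM default locale), the link URL is the placeholder from the question, and the whitespace between tags in `create_th` is dropped.

```python
def create_td(value) -> str:
    """Wrap any value (str, int, ...) in a centered, bordered table cell."""
    return f'<td align="center" style="border:1px solid">{value}</td>'


def create_td_double(value: float) -> str:
    """Render a number as whole dollars with thousands separators."""
    # Assumption: ',.0f' approximates java.text.NumberFormat.getIntegerInstance.
    return create_td(f"${value:,.0f}")


def create_the_link(product_id: str) -> str:
    """Cell with a link to the product; the URL is the question's placeholder."""
    return (
        '<td align="center" style="border:1px solid">'
        f'<a href="https://productLink/{product_id}">Link Here</a></td>'
    )


def create_th(value: str) -> str:
    """Header cell with the same classes and inline styles as the Scala createTH."""
    return (
        '<th class="gmail-highlight-red gmail-confluenceTh gmail-tablesorter-header '
        'gmail-sortableHeader gmail-tablesorter-headerUnSorted" tabindex="0" scope="col" '
        'style="width:1px;white-space:nowrap;border:1px solid #000000;'
        'padding:7px 15px 7px 10px;vertical-align:top;text-align:center;'
        'background:100% 50% no-repeat">'
        '<div class="gmail-tablesorter-header-inner" style="margin:0px;padding:0px">'
        '<h2 title="" style="margin:0.2px 0px 0px;padding:0px;font-size:20px;'
        'font-weight:normal;line-height:1.5;letter-spacing:-0.008em;'
        'border-bottom-color:rgb(50,199,208)">'
        f'<strong>{value}</strong></h2></div></th>'
    )
```

Python has no overloading, so a single `create_td` covers both the `String` and `BigInt` variants from Scala. If the values can contain characters such as `<` or `&`, passing them through `html.escape` first would be worth considering.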

The function runReport can be written as below. You can use the type List[Row], because in Python you cannot convert a DataFrame into a dataclass the way Scala converts one to a case class:

from typing import List
from pyspark.sql.types import Row

def run_report(elements: List[Row]) -> str:
    table_header = f"""<table class="gmail-relative-table gmail-confluenceTable gmail-tablesorter gmail-tablesorter-default" style="border-collapse:collapse; margin:0px;overflow-x:auto;width:1200px"><tr>
                   {create_th("Name")}
                   {create_th("Email")}
                   {create_th("Phone")}
                   {create_th("ProductId")}
                   {create_th("Product")}
                   {create_th("Cost")}
                   {create_th("Reduction")}
                   {create_th("Link")}</tr>"""

    tables_tds = [
        f"""<tr>{create_td(el.name)}{create_td(el.email)}{create_td(el.phone)}{create_td(el.productId)}{create_td(el.product)}{create_td_double(el.cost)}{create_td_double(el.reduction)}{create_the_link(el.productId)}{create_td("Link to product")}</tr>"""
        for el in elements
    ]

    return table_header + "".join(tables_tds) + "</table>"

Using it:

data = spark.sql("select * from test").collect()
html_returned = run_report(data)

Note that collect should not be used on large DataFrames, since it pulls every row onto the driver (I assume the DataFrame is not very large in this use case).
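If the table could grow, one option is to cap the number of rows before rendering. The sketch below is only an illustration: `FakeRow`, `take_rows` and `render_rows` are hypothetical names, and a namedtuple stands in for `Row` so that it runs without a Spark session. With a real DataFrame `df`, the rows could come from `df.limit(n).collect()` or `df.toLocalIterator()` instead.

```python
from collections import namedtuple
from itertools import islice
from typing import Iterable

# Stand-in for pyspark.sql.Row so the sketch runs without Spark (assumption).
FakeRow = namedtuple("FakeRow", ["name", "cost"])


def take_rows(rows: Iterable, max_rows: int) -> list:
    """Return at most max_rows items without consuming the rest of the iterable."""
    return list(islice(rows, max_rows))


def render_rows(rows: Iterable) -> str:
    """Render each row as a simplified <tr>, reading columns by attribute name."""
    return "".join(
        f"<tr><td>{r.name}</td><td>${r.cost:,.0f}</td></tr>" for r in rows
    )


# A lazy source standing in for a large result set.
source = (FakeRow(f"user{i}", 100.0 * i) for i in range(1_000_000))
html = render_rows(take_rows(source, 3))
```

Only the first three rows are ever materialized here; the same cap applied to a real result keeps the report size and the driver memory bounded.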
