I have this code in Scala, and I'm not familiar enough with Python to convert it:
```scala
val formatterComma = java.text.NumberFormat.getIntegerInstance

def createTD(value: String): String = {
  return s"""<td align="center" style="border:1px solid">${value}</td>"""
}

def createTD(value: BigInt): String = {
  return createTD(value.toString)
}

def createTDDouble(value: Double): String = {
  return createTD("$" + formatterComma.format(value))
}

def createTheLink(productId: String): String = {
  return s"""<td align="center" style="border:1px solid"><a href="https://productLink/${productId}">Link Here</a></td>"""
}

def createTH(value: String): String = {
  return s"""<th class="gmail-highlight-red gmail-confluenceTh gmail-tablesorter-header gmail-sortableHeader gmail-tablesorter-headerUnSorted" tabindex="0" scope="col" style="width:1px;white-space:nowrap;border:1px solid #000000;padding:7px 15px 7px 10px;vertical-align:top;text-align:center;background:100% 50% no-repeat"><div class="gmail-tablesorter-header-inner" style="margin:0px;padding:0px"><h2 title="" style="margin:0.2px 0px 0px;padding:0px;font-size:20px;font-weight:normal;line-height:1.5;letter-spacing:-0.008em;border-bottom-color:rgb(50,199,208)"><strong>${value}</strong></h2></div></th>"""
}

final case class resultsOfReport(name: String, email: String, phone: String, productId: String, product: String, cost: Double, reduction: Double)

def runReport(elements: Array[resultsOfReport]): String = {
  return elements.map {
    case resultsOfReport(name, email, phone, productId, product, cost, reduction) =>
      s"""<tr>${createTD(name)}${createTD(email)}${createTD(phone)}${createTD(productId)}${createTD(product)}${createTDDouble(cost)}${createTDDouble(reduction)}${createTheLink(productId)}${createTD("Link to product")}</tr>"""
  }.mkString(
    s"""<table class="gmail-relative-table gmail-confluenceTable gmail-tablesorter gmail-tablesorter-default" style="border-collapse:collapse;margin:0px;overflow-x:auto;width:1200px"><tr>${createTH("Name")}${createTH("Email")}${createTH("Phone")}${createTH("ProductId")}${createTH("Product")}${createTH("Cost")}${createTH("Reduction")}${createTH("Link")}</tr>""",
    "",
    "</table>"
  )
}
```
It takes the data passed to the runReport method and maps it to the appropriate columns, building a table which I then send out.
I need to use a Python method inside this, and I cannot call a Python method from Scala in Databricks.
I've started to convert it, but got stuck on how to make it work like the Scala method:
```python
from dataclasses import dataclass

@dataclass
class resultsOfReport:
    name: str
    email: str
    phone: str
    productId: str
    product: str
    cost: float
    reduction: float

def runReport(elements):
```
Edit: So from trying things out, I guess the only thing I still need is how to do this part in Python:
```scala
return elements.map {
  case resultsOfReport(name, email, phone, productId, product, cost, reduction) =>
    s"""<tr>${createTD(name)}${createTD(email)}${createTD(phone)}${createTD(productId)}${createTD(product)}${createTDDouble(cost)}${createTDDouble(reduction)}${createTheLink(productId)}${createTD("Link to product")}</tr>"""
}.mkString(
  s"""<table class="gmail-relative-table gmail-confluenceTable gmail-tablesorter gmail-tablesorter-default" style="border-collapse:collapse;margin:0px;overflow-x:auto;width:1200px"><tr>${createTH("Name")}${createTH("Email")}${createTH("Phone")}${createTH("ProductId")}${createTH("Product")}${createTH("Cost")}${createTH("Reduction")}${createTH("Link")}</tr>""",
  "",
  "</table>"
)
```
The data comes in as `[Row(Name='name'` etc., for example. I need to know how to map these Row key values to column headers, the way the Scala above does.
Edit:
Expected input:
```python
data = spark.sql("select * from test")
```
Example row from the DataFrame:

```python
Row(name='jon', email='email.com', phone='324234', productId='1234', product='new', cost=500, stillInStock='y')
```
Calling the resultsOfReport method as written above in Scala:

```python
html_returned = resultsOfReport(data)
```
The expected output is the HTML I've shown above in the Scala.
Best Answer
I suppose it is quite simple to convert the first functions to their Python equivalents: `create_td`, `create_td_double`, `create_the_link` and `create_th`.
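For completeness, here is one sketch of those helpers (my translation of the Scala above, not from the original answer; I'm assuming Python's `:,.0f` format spec is an acceptable stand-in for `NumberFormat.getIntegerInstance`, and the long `<th>` styles are copied from the Scala version):

```python
def create_td(value) -> str:
    # Wrap a value in a centered, bordered table cell.
    return f'<td align="center" style="border:1px solid">{value}</td>'

def create_td_double(value: float) -> str:
    # Round to an integer with comma grouping, prefix "$", and wrap in a cell.
    return create_td(f"${value:,.0f}")

def create_the_link(product_id: str) -> str:
    # Cell containing a link to the product page.
    return (f'<td align="center" style="border:1px solid">'
            f'<a href="https://productLink/{product_id}">Link Here</a></td>')

def create_th(value: str) -> str:
    # Styled header cell, mirroring the Scala createTH markup.
    return (
        '<th class="gmail-highlight-red gmail-confluenceTh gmail-tablesorter-header '
        'gmail-sortableHeader gmail-tablesorter-headerUnSorted" tabindex="0" scope="col" '
        'style="width:1px;white-space:nowrap;border:1px solid #000000;'
        'padding:7px 15px 7px 10px;vertical-align:top;text-align:center;'
        'background:100% 50% no-repeat">'
        '<div class="gmail-tablesorter-header-inner" style="margin:0px;padding:0px">'
        '<h2 title="" style="margin:0.2px 0px 0px;padding:0px;font-size:20px;'
        'font-weight:normal;line-height:1.5;letter-spacing:-0.008em;'
        'border-bottom-color:rgb(50,199,208)">'
        f'<strong>{value}</strong></h2></div></th>'
    )
```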
The function `runReport` can be written as below. You can use the type `List[Row]` for the elements, since in Python you cannot convert a DataFrame into a dataclass the way Scala converts to a case class:
```python
from typing import List
from pyspark.sql.types import Row

def run_report(elements: List[Row]) -> str:
    table_header = f"""<table class="gmail-relative-table gmail-confluenceTable gmail-tablesorter gmail-tablesorter-default" style="border-collapse:collapse;margin:0px;overflow-x:auto;width:1200px"><tr>{create_th("Name")}{create_th("Email")}{create_th("Phone")}{create_th("ProductId")}{create_th("Product")}{create_th("Cost")}{create_th("Reduction")}{create_th("Link")}</tr>"""
    tables_tds = [
        f"""<tr>{create_td(el.name)}{create_td(el.email)}{create_td(el.phone)}{create_td(el.productId)}{create_td(el.product)}{create_td_double(el.cost)}{create_td_double(el.reduction)}{create_the_link(el.productId)}{create_td("Link to product")}</tr>"""
        for el in elements
    ]
    return table_header + "".join(tables_tds) + "</table>"
```
Using it:
```python
data = spark.sql("select * from test").collect()
html_returned = run_report(data)
```
Note that `collect` should not be used for large DataFrames (I assume it's not very large for this use case).
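If the table might grow, one alternative (my suggestion, not part of the original answer) is to accept any iterable of rows and build the string incrementally, so you can pass `df.toLocalIterator()` instead of `df.collect()` and avoid holding every row on the driver at once. A trimmed-down sketch with only two columns and a simplified cell helper:

```python
from collections import namedtuple

def create_td(value) -> str:
    # Simplified cell helper, repeated here so the sketch is self-contained.
    return f'<td align="center" style="border:1px solid">{value}</td>'

def run_report_iter(rows) -> str:
    """Build the HTML row by row from any iterable of Row-like objects."""
    parts = ["<table>"]  # real table and header markup omitted for brevity
    for el in rows:
        parts.append(f"<tr>{create_td(el.name)}{create_td(el.email)}</tr>")
    parts.append("</table>")
    return "".join(parts)

# Works the same on df.toLocalIterator() as on a plain list of rows:
SampleRow = namedtuple("SampleRow", "name email")
html = run_report_iter([SampleRow("jon", "email.com")])
```

With Spark this would be called as `run_report_iter(spark.sql("select * from test").toLocalIterator())`; `toLocalIterator` fetches partitions one at a time rather than the whole result set.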