Is there a function to format the index name in a pandas Styler (DataFrame.style.to_latex) so that it is LaTeX-escaped?

I am trying to format the index name so that it is LaTeX-escaped when using .to_latex(). Using .format_index() works only for the index values, not for the index names.

(Image: failing table)

Here is a minimal, reproducible example.

import pandas as pd
import numpy as np
import pylatex as pl

dict1= {
    'employee_w': ['John_Smith','John_Smith','John_Smith', 'Marc_Jones','Marc_Jones', 'Tony_Jeff', 'Maria_Mora','Maria_Mora'],
    'customer&client': ['company_1','company_2','company_3','company_4','company_5','company_6','company_7','company_8'],
    'calendar_week': [18,18,19,21,21,22,23,23],
    'sales': [5,5,5,5,5,5,5,5],
}

df1 = pd.DataFrame(data = dict1)

ptable = pd.pivot_table(
    df1,
    values='sales',
    index=['employee_w','customer&client'],
    columns=['calendar_week'],
    aggfunc=np.sum
)

mystyler = ptable.style
mystyler.format(na_rep='-', precision=0, escape="latex") 
mystyler.format_index(escape="latex", axis=0)
mystyler.format_index(escape="latex", axis=1)

latex_code1 = mystyler.to_latex(
    column_format='|c|c|c|c|c|c|c|',
    multirow_align="t",
    multicol_align="r",
    clines="all;data",
    hrules=True,
)

# latex_code1 = latex_code1.replace("employee_w", "employee")
# latex_code1 = latex_code1.replace("customer&client", "customer and client")
# latex_code1 = latex_code1.replace("calendar_week", "week")

doc = pl.Document(geometry_options=['a4paper'], document_options=["portrait"], textcomp = None) 

doc.packages.append(pl.Package('newtxtext,newtxmath')) 
doc.packages.append(pl.Package('textcomp')) 
doc.packages.append(pl.Package('booktabs'))
doc.packages.append(pl.Package('xcolor',options= pl.NoEscape('table')))
doc.packages.append(pl.Package('multirow'))

doc.append(pl.NoEscape(latex_code1))
doc.generate_pdf('file1.pdf', clean_tex=False, silent=True)

When I replace them using .replace(), as in the commented-out lines, it works (desired result):

(Image: desired table)

But I'm dealing with hundreds of tables with unknown index/column names.

The goal is to generate PDF files automatically using PyLaTeX, so any HTML option is not helpful for me.

Thanks in advance!

I coded all the Styler.to_latex features and I'm afraid the index names are currently not formatted, which also means that they are not escaped. So there is no direct function to do what you want. (By the way, it's great to see an example where many of the features, including the hrules table styles definition, are being used.) I actually just created an issue about this on the pandas GitHub.

However, the code itself contains an _escape_latex(s) function in pandas/io/formats/style_render.py:

def _escape_latex(s):
    r"""
    Replace the characters ``&``, ``%``, ``$``, ``#``, ``_``, ``{``, ``}``,
    ``~``, ``^``, and ``\`` in the string with LaTeX-safe sequences.

    Use this if you need to display text that might contain such characters in LaTeX.

    Parameters
    ----------
    s : str
        Input to be escaped

    Return
    ------
    str :
        Escaped string
    """
    return (
        s.replace("\\", "ab2§=§8yz")  # rare string for final conversion: avoid \\ clash
        .replace("ab2§=§8yz ", "ab2§=§8yz\\space ")  # since \backslash gobbles spaces
        .replace("&", "\\&")
        .replace("%", "\\%")
        .replace("$", "\\$")
        .replace("#", "\\#")
        .replace("_", "\\_")
        .replace("{", "\\{")
        .replace("}", "\\}")
        .replace("~ ", "~\\space ")  # since \textasciitilde gobbles spaces
        .replace("~", "\\textasciitilde ")
        .replace("^ ", "^\\space ")  # since \textasciicircum gobbles spaces
        .replace("^", "\\textasciicircum ")
        .replace("ab2§=§8yz", "\\textbackslash ")
    )

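So your best bet is to reformat the input DataFrame and escape the index name before you do any styling to it:

from pandas.io.formats.style_render import _escape_latex  # private helper; may change between pandas versions
df.index.name = _escape_latex(df.index.name)  # single-level index; a MultiIndex keeps its names in df.index.names
# then continue with your previous styling code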
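
The pivot table in the question has a two-level row index plus a named column index, so every entry of .names needs escaping, not just .name. Below is a minimal sketch of that idea. The helpers escape_latex and escape_names are hypothetical names introduced here for illustration, and escape_latex is a simplified stand-in for pandas' private _escape_latex (it skips the \space handling), so in real code you would probably pass the pandas function as the escape argument instead.

```python
# Simplified stand-in for pandas' private _escape_latex (an assumption for
# illustration): it maps each special character in one left-to-right pass and
# does NOT insert \space after \textbackslash, \textasciitilde or
# \textasciicircum when a space follows.
_LATEX_SPECIALS = {
    "\\": "\\textbackslash ",
    "&": "\\&",
    "%": "\\%",
    "$": "\\$",
    "#": "\\#",
    "_": "\\_",
    "{": "\\{",
    "}": "\\}",
    "~": "\\textasciitilde ",
    "^": "\\textasciicircum ",
}


def escape_latex(s):
    """Escape LaTeX special characters in a string."""
    return "".join(_LATEX_SPECIALS.get(ch, ch) for ch in s)


def escape_names(names, escape=escape_latex):
    """Return a list where every string name is escaped.

    Non-string entries (for example None for an unnamed level) are kept as-is.
    """
    return [escape(n) if isinstance(n, str) else n for n in names]


# Plain-Python demonstration with the names from the question.
print(escape_names(["employee_w", "customer&client"]))

# With pandas, before calling ptable.style (assumes ptable from the question):
# ptable.index.names = escape_names(ptable.index.names)
# ptable.columns.names = escape_names(ptable.columns.names)
```

Because the names are escaped on the DataFrame itself rather than in the generated string, this should work for tables whose index and column names are not known in advance.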
