UTF-8 characters are showing as boxes when converting HTML to PDF

I want to convert HTML that contains special characters to PDF, but the output does not show the special characters.

from io import BytesIO
from django.http import HttpResponse
from django.template.loader import get_template
from xhtml2pdf import pisa

def html2pdf(template_source, context_dict=None):
    # Render the Django template to an HTML string.
    template = get_template(template_source)
    html = template.render(context_dict or {})
    result = BytesIO()
    # Pass the HTML to xhtml2pdf as UTF-8 bytes; the PDF is written into result.
    pdf = pisa.CreatePDF(BytesIO(html.encode('utf-8')), result)
    if not pdf.err:
        return HttpResponse(result.getvalue(), content_type="application/pdf")
    return None

That is my pdf.py. I also have an HTML file, pdf.html:

<!DOCTYPE html>
<html lang="en">
<head>
    <style>
        body {font-family: 'Josefin Slab';
        font-size: large;
        background-color: beige;}
        </style>
    <meta charset="UTF-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document</title>
</head>
<body>
    <h2 class="utf">This is myŐ, Ű, ő or ű✅✅ pdf file with special char</h2>
</body>
</html>

When I convert this to a PDF, it shows:

This is my■, ■, ■ or ■■■■■■■■■■■■■ pdf file with special......

What should I do now?

As noted in the comments, you are using characters that do not exist in the font, so use a different font. However, see also the notes below.
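One way to switch fonts is to embed a TrueType file through an `@font-face` rule, which xhtml2pdf documents as supported. The sketch below is only an illustration under assumptions: the helper names `font_face_css` and `render_pdf` are made up for this example, the font path is hypothetical and must point at a real `.ttf` file on your machine, and DejaVu Sans is picked only because it covers Ő, Ű, ő and ű.

```python
from io import BytesIO
from typing import Optional


def font_face_css(family: str, font_path: str) -> str:
    """Build an @font-face rule and a body rule that selects that font."""
    # font_path must be a TrueType file that xhtml2pdf can read from disk.
    return (
        "@font-face {\n"
        f"    font-family: '{family}';\n"
        f"    src: url({font_path});\n"
        "}\n"
        f"body {{ font-family: '{family}'; }}\n"
    )


def render_pdf(html: str) -> Optional[bytes]:
    """Render an HTML string to PDF bytes; return None if xhtml2pdf reports an error."""
    # Imported lazily so the CSS helper above can be used without xhtml2pdf installed.
    from xhtml2pdf import pisa

    result = BytesIO()
    status = pisa.CreatePDF(html, dest=result, encoding="utf-8")
    return None if status.err else result.getvalue()


# Hypothetical font location (assumption): adjust it for your system.
css = font_face_css("DejaVu Sans", "/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf")
html = (
    "<html><head><meta charset='UTF-8'>"
    f"<style>{css}</style></head>"
    "<body><h2>This is my Ő, Ű, ő or ű pdf file</h2></body></html>"
)
```

Calling `render_pdf(html)` should then produce the PDF bytes, provided the font file exists at that path. The ✅ emoji is left out here on purpose, since this font is not expected to contain it.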

With the characters correctly embedded, the PDF still works in a browser's PDF viewer, but the characters are not handled well in a conventional PDF viewer.


Not all characters are available even in a full universal font. This applies in particular to coloured HTML objects such as emoji or your ✅: these are generated by browser fonts and so need conversion to an image with underlying text. That two-for-one combination is problematic in a PDF. Whether it works with a given font depends on the PDF writer, so a safer fudge is to use the square root symbol (√).
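The square-root fudge can be applied to the HTML before it reaches the PDF writer. This is a minimal sketch using only the standard library. The `FALLBACKS` table and both helper names are assumptions made up for this example, and the replacement characters still have to exist in whichever font you embed.

```python
import unicodedata

# Hypothetical fallback table: emoji-style marks mapped to plainer characters.
FALLBACKS = {
    "\u2705": "\u221a",  # WHITE HEAVY CHECK MARK -> SQUARE ROOT
    "\u2714": "\u221a",  # HEAVY CHECK MARK -> SQUARE ROOT
    "\u274c": "x",       # CROSS MARK -> plain letter x
}
_TABLE = str.maketrans(FALLBACKS)


def replace_unsupported(text: str) -> str:
    """Swap characters listed in FALLBACKS for their plainer substitutes."""
    return text.translate(_TABLE)


def describe_non_ascii(text: str) -> list:
    """List each distinct non-ASCII character with its Unicode name, in code point order."""
    return [
        (ch, unicodedata.name(ch, "UNKNOWN"))
        for ch in sorted(set(text))
        if ord(ch) > 127
    ]


sample = "This is myŐ, Ű, ő or ű✅✅ pdf file with special char"
cleaned = replace_unsupported(sample)

# Print what is left so each character can be checked against the font's coverage.
for ch, name in describe_non_ascii(cleaned):
    print(f"U+{ord(ch):04X} {name}")
```

The listing helper is there only for diagnosis: any character it reports that the chosen font lacks will still come out as a box.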

Side note: in some Scandinavian countries a tick can mean wrong, not right: https://en.wikipedia.org/wiki/Check_mark
