How to extract the CSS styles with Jsoup
I have HTML that comes from the system clipboard after copying data in MS Excel,
and I want to extract the data together with its styles. The HTML contains its CSS in a STYLE tag, as shown below:
<html xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv=Content-Type content="text/html; charset=utf-8">
<meta name=ProgId content=Excel.Sheet>
<meta name=Generator content="Microsoft Excel 15">
<link id=Main-File rel=Main-File
href="file:////Users/tikeshwar-1410/Library/Group%20Containers/UBF8T346G9.Office/TemporaryItems/msohtmlclip/clip.htm">
<link rel=File-List
href="file:////Users/tikeshwar-1410/Library/Group%20Containers/UBF8T346G9.Office/TemporaryItems/msohtmlclip/clip_filelist.xml">
<style>
<!--table
{mso-displayed-decimal-separator:"\.";
mso-displayed-thousand-separator:"\,";}
@page
{margin:.75in .7in .75in .7in;
mso-header-margin:.3in;
mso-footer-margin:.3in;}
tr
{mso-height-source:auto;}
col
{mso-width-source:auto;}
br
{mso-data-placement:same-cell;}
td
{padding-top:1px;
padding-right:1px;
padding-left:1px;
mso-ignore:padding;
color:black;
font-size:12.0pt;
font-weight:400;
font-style:normal;
text-decoration:none;
font-family:Calibri, sans-serif;
mso-font-charset:0;
mso-number-format:General;
text-align:general;
vertical-align:bottom;
border:none;
mso-background-source:auto;
mso-pattern:auto;
mso-protection:locked visible;
white-space:nowrap;
mso-rotate:0;}
.xl63
{font-weight:700;}
.xl64
{background:yellow;
mso-pattern:black none;}
.xl65
{color:red;}
.xl66
{text-decoration:underline;
text-underline-style:single;}
-->
</style>
</head>
<body link="#0563C1" vlink="#954F72">
<table border=0 cellpadding=0 cellspacing=0 width=174 style='border-collapse:
collapse;width:130pt'>
<!--StartFragment-->
<col width=87 span=2 style='width:65pt'>
<tr height=21 style='height:16.0pt'>
<td height=21 class=xl63 width=87 style='height:16.0pt;width:65pt'>a</td>
<td class=xl64 align=right width=87 style='width:65pt'>1</td>
</tr>
<tr height=21 style='height:16.0pt'>
<td height=21 class=xl65 style='height:16.0pt'>b</td>
<td class=xl66 align=right>2</td>
</tr>
<!--EndFragment-->
</table>
</body>
</html>
I can get the inline style of any element with a style selector, but here I cannot get the styles because they are inside the STYLE tag.
Is there any way to get the style for each TD and TR, including the class-specific styles? Thanks in advance.
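Inline CSS is stored in an element's style attribute, whereas internal CSS lives in a STYLE element. You can read the inline CSS with the attr() method. The rules inside the STYLE element are not copied into that attribute, so they have to be read from the STYLE element itself and matched to elements by tag or class name. For example, to read the inline style:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
Document htmlFile = Jsoup.parse(html); // html holds the clipboard HTML
Element firstTableElem = htmlFile.select("table").first();
String tableStyleValue = firstTableElem.attr("style"); // returns the table's inline style attribute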
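The rules from the STYLE tag can be read separately and matched to each cell by tag or class name. The following is a minimal sketch, not a full CSS parser: it assumes the stylesheet text has already been obtained with Jsoup, for example through htmlFile.select("style").first().data(), and it uses only the JDK. The names StyleBlockParser and parseRules are hypothetical, and nested blocks such as @media or comma-separated selector lists are not handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: turns the text of a <style> block into selector -> declarations maps.
class StyleBlockParser {

    // Matches "selector { declarations }" for flat rules only (no nested blocks).
    private static final Pattern RULE = Pattern.compile("([^{}]+)\\{([^{}]*)\\}");

    static Map<String, Map<String, String>> parseRules(String css) {
        // Excel wraps the stylesheet in an HTML comment; drop the markers first.
        String cleaned = css.replace("<!--", "").replace("-->", "");
        Map<String, Map<String, String>> rules = new LinkedHashMap<>();
        Matcher m = RULE.matcher(cleaned);
        while (m.find()) {
            String selector = m.group(1).trim();
            Map<String, String> declarations =
                    rules.computeIfAbsent(selector, k -> new LinkedHashMap<>());
            // Split "name:value;name:value" into individual declarations.
            for (String declaration : m.group(2).split(";")) {
                int colon = declaration.indexOf(':');
                if (colon > 0) {
                    declarations.put(declaration.substring(0, colon).trim(),
                                     declaration.substring(colon + 1).trim());
                }
            }
        }
        return rules;
    }

    public static void main(String[] args) {
        // A shortened copy of the stylesheet from the question.
        String css = "<!--td\n{color:black;\nfont-weight:400;}\n"
                   + ".xl63\n{font-weight:700;}\n.xl65\n{color:red;}\n-->";
        Map<String, Map<String, String>> rules = parseRules(css);
        System.out.println(rules.get("td"));    // prints {color=black, font-weight=400}
        System.out.println(rules.get(".xl65")); // prints {color=red}
    }
}
```

For a cell such as <td class=xl65>, one possible approach is to start from the map for td, overlay the map for "." + cell.className(), and finally overlay the declarations parsed from cell.attr("style"). This is a simplification of the CSS cascade that may be enough for the flat rules Excel emits.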