
How to extract the css style with JSOUP

I have HTML that comes from the system clipboard after copying data in MS Excel, and I want to extract the data together with its style. The HTML contains its CSS in a STYLE tag, as shown below:

<html xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns="http://www.w3.org/TR/REC-html40">

<head>
<meta http-equiv=Content-Type content="text/html; charset=utf-8">
<meta name=ProgId content=Excel.Sheet>
<meta name=Generator content="Microsoft Excel 15">
<link id=Main-File rel=Main-File
href="file:////Users/tikeshwar-1410/Library/Group%20Containers/UBF8T346G9.Office/TemporaryItems/msohtmlclip/clip.htm">
<link rel=File-List
href="file:////Users/tikeshwar-1410/Library/Group%20Containers/UBF8T346G9.Office/TemporaryItems/msohtmlclip/clip_filelist.xml">
<style>
<!--table
    {mso-displayed-decimal-separator:"\.";
    mso-displayed-thousand-separator:"\,";}
@page
    {margin:.75in .7in .75in .7in;
    mso-header-margin:.3in;
    mso-footer-margin:.3in;}
tr
    {mso-height-source:auto;}
col
    {mso-width-source:auto;}
br
    {mso-data-placement:same-cell;}
td
    {padding-top:1px;
    padding-right:1px;
    padding-left:1px;
    mso-ignore:padding;
    color:black;
    font-size:12.0pt;
    font-weight:400;
    font-style:normal;
    text-decoration:none;
    font-family:Calibri, sans-serif;
    mso-font-charset:0;
    mso-number-format:General;
    text-align:general;
    vertical-align:bottom; 
    border:none;
    mso-background-source:auto;
    mso-pattern:auto;
    mso-protection:locked visible;
    white-space:nowrap;
    mso-rotate:0;}
.xl63
    {font-weight:700;}
.xl64
    {background:yellow;
    mso-pattern:black none;}
.xl65
    {color:red;}
.xl66
    {text-decoration:underline;
    text-underline-style:single;}
-->
</style>
</head>

<body link="#0563C1" vlink="#954F72">

<table border=0 cellpadding=0 cellspacing=0 width=174 style='border-collapse:
 collapse;width:130pt'>
<!--StartFragment-->
 <col width=87 span=2 style='width:65pt'>
 <tr height=21 style='height:16.0pt'>
  <td height=21 class=xl63 width=87 style='height:16.0pt;width:65pt'>a</td>
  <td class=xl64 align=right width=87 style='width:65pt'>1</td>
 </tr>
 <tr height=21 style='height:16.0pt'>
  <td height=21 class=xl65 style='height:16.0pt'>b</td>
  <td class=xl66 align=right>2</td>
 </tr>
<!--EndFragment-->
</table>

</body>

</html>

I can get the inline style of any element with the style selector, but here I cannot get the style because it is inside the STYLE tag.

Is there any way to get the style for each TD and TR, and also the styles of the special classes? Thanks in advance.

Inline CSS is stored in an element's style attribute, not in a child element, so you can read it with the attr() method. For example:

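import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
Document htmlFile = Jsoup.parse(html); // html holds the clipboard markup as a String
Element firstTableElem = htmlFile.select("table").first(); // first <table> in the document
String tableStyleValue = firstTableElem.attr("style"); // value of the inline style attribute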
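The attr("style") call only covers inline styles, while the question is about the rules inside the STYLE element. With jsoup, the raw text of that element should be available through `htmlFile.select("style").first().data()`. As far as I know jsoup does not resolve stylesheet rules to elements, so that text has to be parsed separately. The following is a minimal sketch of one way to do that using only the Java standard library. The class `StyleBlockParser` and its methods `parseRules` and `styleFor` are hypothetical names introduced for illustration, and the parser assumes flat rules like the ones Excel emits (no nested blocks, no semicolons inside quoted values).

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of jsoup: parses the text of a <style> element.
class StyleBlockParser {

    // Matches "selector { declarations }"; assumes no nested braces.
    private static final Pattern RULE = Pattern.compile("([^{}]+)\\{([^{}]*)\\}");

    /** Turns stylesheet text into selector -> (property -> value). */
    static Map<String, Map<String, String>> parseRules(String css) {
        // Excel wraps the rules in an HTML comment; drop the markers first.
        String cleaned = css.replace("<!--", "").replace("-->", "");
        Map<String, Map<String, String>> rules = new LinkedHashMap<>();
        Matcher m = RULE.matcher(cleaned);
        while (m.find()) {
            String selector = m.group(1).trim();
            Map<String, String> props =
                    rules.computeIfAbsent(selector, k -> new LinkedHashMap<>());
            for (String decl : m.group(2).split(";")) {
                int colon = decl.indexOf(':');
                if (colon > 0) {
                    props.put(decl.substring(0, colon).trim(),
                              decl.substring(colon + 1).trim());
                }
            }
        }
        return rules;
    }

    /** Applies the tag rule first, then the class rule, so the class wins. */
    static Map<String, String> styleFor(Map<String, Map<String, String>> rules,
                                        String tag, String className) {
        Map<String, String> merged = new LinkedHashMap<>();
        merged.putAll(rules.getOrDefault(tag, Collections.emptyMap()));
        if (className != null && !className.isEmpty()) {
            merged.putAll(rules.getOrDefault("." + className, Collections.emptyMap()));
        }
        return merged;
    }

    public static void main(String[] args) {
        // Shortened version of the stylesheet from the question.
        String css = "<!--td\n {color:black;\n font-weight:400;}\n"
                + ".xl63\n {font-weight:700;}\n"
                + ".xl65\n {color:red;}\n-->";
        Map<String, Map<String, String>> rules = parseRules(css);
        // prints {color=black, font-weight=700}
        System.out.println(styleFor(rules, "td", "xl63"));
    }
}
```

For each TD selected with jsoup, `tagName()` and `className()` could be passed to `styleFor`, and the element's own `attr("style")` value applied on top, since inline declarations normally take precedence over stylesheet rules. This sketch ignores selector specificity beyond "class overrides tag", so it is not a general CSS cascade.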

