
How can I use Python to change CSS attributes of an HTML document?

I have a directory with many HTML documents. Most of them contain the code block

      .org-link {
        /* org-link */
        color: #b58900;
        font-weight: bold;
        text-decoration: underline;
      }

inside the <style type="text/css"> tag. I'd like to write a script that, in every file, removes the line text-decoration: underline; from this block and changes its color to #2aa198.

Is it possible to accomplish this with Python?

You could use regular expressions to make the necessary replacements as follows:

import re

test = """
      .org-link {
        /* org-link */
        color: #b58900;
        font-weight: bold;
        text-decoration: underline;
      }
"""

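def fix(org_link):
    # Replace the value of the color property inside the matched block.
    new_color = re.sub(r'(.*?color\s*?:\s*?)(.*?)(;)', r'\1#777\3', org_link.group(0), flags=re.S)
    # Drop the underline declaration together with its leading whitespace.
    return re.sub(r'(.*?)(\s+?text-decoration: underline;)(.*?)', r'\1\3', new_color, flags=re.S)

# Non-greedy .*? stops at the first closing brace, so later CSS rules are left alone.
print(re.sub(r'(org-link\s+\{.*?\})', fix, test, flags=re.S))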

This would convert the text as follows:

  .org-link {
    /* org-link */
    color:#777;
    font-weight: bold;
  }

It works by first identifying the org-link block, then replacing the color value, and finally removing the text-decoration: underline; entry.
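The answer's code writes #777, while the question asks for #2aa198. As a minimal sketch of the same three steps with the color passed in as a parameter, something like the following might work. The names ORG_LINK_RULE and restyle_org_link are hypothetical and not part of the original answer, and the patterns assume the block is written as a plain .org-link { ... } rule with no nested braces.

```python
import re

# Hypothetical helper, not from the original answer.
# Assumes the rule looks like ".org-link { ... }" with no nested braces.
ORG_LINK_RULE = re.compile(r'\.org-link\s*\{.*?\}', flags=re.S)

def restyle_org_link(html, new_color='#2aa198'):
    """Recolor the .org-link rule and drop its underline declaration."""
    def fix(match):
        rule = match.group(0)
        # Replace the value of the color property; the lookbehind skips
        # properties such as background-color.
        rule = re.sub(r'(?<![-\w])(color\s*:\s*)[^;]+;',
                      lambda m: m.group(1) + new_color + ';', rule)
        # Remove the underline declaration together with its leading whitespace.
        return re.sub(r'\s*text-decoration\s*:\s*underline\s*;', '', rule)
    return ORG_LINK_RULE.sub(fix, html)

sample = """
      .org-link {
        /* org-link */
        color: #b58900;
        font-weight: bold;
        text-decoration: underline;
      }
      a { color: red; text-decoration: underline; }
"""
print(restyle_org_link(sample))
```

Only the .org-link rule is changed here; the second rule in the sample keeps its own color and underline.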

The script could then be extended to carry this out on all of the HTML files in a given folder as follows:

import re
import glob

def fix(org_link):
    # Replace the color value, then drop the underline declaration.
    new_color = re.sub(r'(.*?color\s*?:\s*?)(.*?)(;)', r'\1#777\3', org_link.group(0), flags=re.S)
    return re.sub(r'(.*?)(\s+?text-decoration: underline;)(.*?)', r'\1\3', new_color, flags=re.S)

for html_file in glob.glob('*.html'):
    print(html_file)
    with open(html_file) as f_input:
        # Non-greedy .*? stops at the first closing brace of the org-link block.
        html = re.sub(r'(org-link\s+\{.*?\})', fix, f_input.read(), flags=re.S)

    with open(html_file, 'w') as f_output:
        f_output.write(html)

Tested using Python 2.7.9
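On Python 3, the file loop could be sketched with pathlib instead. This is only an illustration under stated assumptions: the function name rewrite_html_files is hypothetical, the files are assumed to be UTF-8 encoded, and the transform argument stands for whatever text-rewriting function you use (for example one built on re.sub as above). Unlike the loop above, it writes a file back only when its content actually changed.

```python
from pathlib import Path

def rewrite_html_files(folder, transform, encoding='utf-8'):
    """Apply transform(text) -> text to every .html file directly in folder.

    Hypothetical helper; assumes the files use the given encoding.
    Returns the list of paths that were actually modified.
    """
    changed = []
    for path in sorted(Path(folder).glob('*.html')):
        original = path.read_text(encoding=encoding)
        updated = transform(original)
        if updated != original:  # leave unchanged files untouched
            path.write_text(updated, encoding=encoding)
            changed.append(path)
    return changed
```

Calling rewrite_html_files('.', some_function) would process the current directory, like glob.glob('*.html') does; running it on a copy of the folder first is a cheap way to check the result before overwriting the originals.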
