How to print/get a specific line from an HTML file in Python 3

Question

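I want to print a specific line from my HTML file. The line I want is the one enclosed in the title tags. My test.html file is pasted at the bottom for reference.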

import codecs
import re
f = codecs.open("test.html", 'r')
f.read()
paragraphs = re.findall(r'<html>(.*?)</html>',str(f))
print(paragraphs)
f.close()

test.html looks like this:

<html>
<head>
<title>
Example
</title>
</head>
<body>
<h1>Hello, world</h1>
</body>
</html>

Answer 1

You can do something like this:

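# Read the whole file into a single string; the with block closes it for us
with open("test.html", "r") as g:
    f = g.read()
# Index just past the opening tag
start = f.find("<head>") + len("<head>")
# Index where the closing tag begins
end = f.find("</head>")
# Slice out the text between the tags and trim the surrounding newlines
paragraphs = f[start:end].strip()
print(paragraphs)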

This prints:

<title>
Example
</title>

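.find() returns the starting index of the substring within the string you are searching. We then use those indices (after some simple arithmetic to step past the opening tag) to slice the string with [:] and get the text in between.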
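Since the question asks for the title specifically, the same find-and-slice idea can be pointed at the title tags instead of the head tags. The following is only a minimal sketch: the helper name text_between is hypothetical (it is not part of the original answer), and the HTML is inlined as a string so the example runs without test.html on disk. It also checks for the -1 that .find() returns when a tag is missing.

```python
def text_between(text, open_tag, close_tag):
    """Return the stripped text between open_tag and close_tag.

    Hypothetical helper for illustration; returns None if either tag is missing.
    """
    start = text.find(open_tag)
    if start == -1:          # .find() returns -1 when the substring is absent
        return None
    start += len(open_tag)   # step past the opening tag itself
    end = text.find(close_tag, start)  # search only after the opening tag
    if end == -1:
        return None
    return text[start:end].strip()     # trim the newlines around the content


# Same content as test.html from the question, inlined for the example
html = """<html>
<head>
<title>
Example
</title>
</head>
<body>
<h1>Hello, world</h1>
</body>
</html>
"""

title = text_between(html, "<title>", "</title>")
print(title)  # Example
```

This kind of string slicing is fine for a small, known file like the one in the question. For arbitrary HTML (attributes on tags, different casing, nested markup) a real parser such as the standard library's html.parser module is usually the safer choice.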
