How to print/get a specific line from an HTML file in Python 3

Question

I want to print a specific line from my HTML file: the one enclosed in heading tags. My test.html file is posted at the bottom for reference.

import codecs
import re
f = codecs.open("test.html", 'r')
f.read()
paragraphs = re.findall(r'<html>(.*?)</html>',str(f))
print(paragraphs)
f.close()

test.html looks like this:

<html>
<head>
<title>
Example
</title>
</head>
<body>
<h1>Hello, world</h1>
</body>
</html>

Answer 1

You could do something like this:

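# Read the whole file into a single string
with open("test.html", "r") as g:
    f = g.read()

# find() returns the index where each tag starts (-1 if it is absent)
start = f.find("<head>") + len("<head>")  # skip past the opening tag
end = f.find("</head>")
paragraphs = f[start:end].strip()  # drop the surrounding newlines
print(paragraphs)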

This prints:

<title>
Example
</title>

.find() returns the starting index of the substring within the string being searched (or -1 if it is not found). We then adjust those indexes with some simple arithmetic and use them to slice the string with [:], which extracts the text between the two tags.
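Since the question asks for the heading line rather than the contents of the head element, the same find-and-slice idea can be pointed at the h1 tags instead. The following is only a minimal sketch: the helper name first_between is made up for illustration, the markup is pasted inline from the question's test.html so the snippet runs on its own, and it assumes the tags appear exactly as written (no attributes, lowercase). A regex variant is included for comparison. For anything beyond simple, well-known markup, a real HTML parser is likely a safer choice.

```python
import re

# Sample markup copied from test.html in the question; in practice you would
# load it with open("test.html").read() instead.
html = """<html>
<head>
<title>
Example
</title>
</head>
<body>
<h1>Hello, world</h1>
</body>
</html>
"""


def first_between(text, open_tag, close_tag):
    """Hypothetical helper: return the text between the first open_tag and
    the next close_tag, or None if either tag is missing."""
    start = text.find(open_tag)
    if start == -1:
        return None
    start += len(open_tag)  # move past the opening tag itself
    end = text.find(close_tag, start)  # search only after the opening tag
    if end == -1:
        return None
    return text[start:end].strip()


heading = first_between(html, "<h1>", "</h1>")
print(heading)

# Regex alternative: re.DOTALL lets "." match newlines, so the pattern also
# works when the tag contents span several lines.
match = re.search(r"<h1>(.*?)</h1>", html, re.DOTALL)
heading_re = match.group(1).strip() if match else None
print(heading_re)
```

Checking for -1 before slicing matters because a missing tag would otherwise silently produce a wrong slice rather than an error.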
