Extracting the body from an HTTP response in Bash

Question

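Can anyone help me figure out how to use a bash script to separate the body from the headers of the response below?

I have tried awk, sed, grep and so on, based on existing solutions on SO, with little success. Let me know if I need to provide any other information.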

HTTP/1.1 200 OK
Cache-Control: max-age=604800
Content-Type: text/html
Date: Mon, 24 Jul 2017 10:16:19 GMT
Etag: "359670651+gzip+ident"
Expires: Mon, 31 Jul 2017 10:16:19 GMT
Last-Modified: Fri, 09 Aug 2013 23:54:35 GMT
Server: ECS (iad/182A)
Vary: Accept-Encoding
X-Cache: HIT
Content-Length: 1270

<!doctype html>
<html>
<head>
    <title>Example Domain</title>

    <meta charset="utf-8" />
    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1" />
    <style type="text/css">
    body {
        background-color: #f0f0f2;
        margin: 0;
        padding: 0;
        font-family: "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;

    }
    div {
        width: 600px;
        margin: 5em auto;
        padding: 50px;
        background-color: #fff;
        border-radius: 1em;
    }
    a:link, a:visited {
        color: #38488f;
        text-decoration: none;
    }
    @media (max-width: 700px) {
        body {
            background-color: #fff;
        }
        div {
            width: auto;
            margin: 0 auto;
            border-radius: 0;
            padding: 1em;
        }
    }
    </style>    
</head>

<body>
<div>
    <h1>Example Domain</h1>
    <p>This domain is established to be used for illustrative examples in documents. You may use this
    domain in examples without prior coordination or asking for permission.</p>
    <p><a href="http://www.iana.org/domains/example">More information...</a></p>
</div>
</body>
</html>

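I did not explain this properly: I mean extracting the body of the HTTP response, not the HTML <body> element. In other words, how do I extract the body of an HTTP response in general (the part of the response after \r\n\r\n)? The response above is for demonstration purposes only...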
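For illustration, the split described here (everything after the first empty line) can be sketched as below. The function name `http_body` is hypothetical, and the sketch assumes a single raw response on stdin whose header block ends at the first empty line, whether the line endings are CRLF or plain LF. It does not decode chunked or compressed bodies.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print only the body of a raw HTTP response read from stdin.
# Assumption: the header block ends at the first empty line (bare CRLF or bare LF).
http_body() {
    awk '
        body { print; next }     # after the separator: pass body lines through unchanged
        { sub(/\r$/, "") }       # header line: drop a trailing carriage return
        $0 == "" { body = 1 }    # first empty line marks the end of the headers
    '
}

# Usage with a small inline response (printf emits real CRLF header endings).
printf 'HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<!doctype html>\n<html></html>\n' | http_body
```

Empty lines inside the body are kept, because only the first empty line is treated as the separator.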

Answer 1

sed -n '/<body>/,/<\/body>/p' filename

This prints everything from <body> to </body>.

Answer 2

If you want to extract everything between the body tags, including the <body> and </body> tags themselves, the following may help.

awk '/<body/,/<\/body>/'  Input_file

If you do not want the tags in the output, the following may help.

awk '/<\/body>/{a="";next} /<body>/{a=1;next} a' Input_file
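To see the difference between the two awk commands, here is a small sketch that runs both against a temporary sample file. The file contents and the variable names `sample`, `with_tags`, and `without_tags` are made up for illustration and stand in for Input_file.

```shell
#!/usr/bin/env bash
# Hypothetical sample file standing in for Input_file.
sample=$(mktemp)
printf '%s\n' '<html>' '<body>' '<h1>Example Domain</h1>' '</body>' '</html>' > "$sample"

# Range pattern: prints from the line matching <body through the line matching </body>.
with_tags=$(awk '/<body/,/<\/body>/' "$sample")

# Flag approach: a is set on the <body> line and cleared on the </body> line;
# "next" skips both tag lines, so only the lines in between are printed.
without_tags=$(awk '/<\/body>/{a="";next} /<body>/{a=1;next} a' "$sample")

printf 'with tags:\n%s\n\nwithout tags:\n%s\n' "$with_tags" "$without_tags"
rm -f "$sample"
```

Both commands work line by line, so they rely on the opening and closing tags sitting on their own lines, as they do in the response shown in the question.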

Answer 3

To output the inner HTML of body (without the body tags):

sed -n '/<body/,/<\/body>/{//!p}' file
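As a quick check of how this command behaves, the sketch below runs it on a made-up sample file (the names `sample` and `inner` are hypothetical). It assumes GNU sed. The empty pattern `//` reuses the most recently matched regular expression, so `//!p` prints every line in the range except the two lines that matched the range boundaries.

```shell
#!/usr/bin/env bash
# Hypothetical sample file standing in for "file"; assumes GNU sed.
sample=$(mktemp)
printf '%s\n' '<html>' '<body>' '<h1>Example Domain</h1>' '</body>' '</html>' > "$sample"

# Within the <body ... </body> range, skip the lines that match the boundary
# patterns themselves and print the rest.
inner=$(sed -n '/<body/,/<\/body>/{//!p}' "$sample")

printf '%s\n' "$inner"
rm -f "$sample"
```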

Answer 1
2 2017-07-24 11:37:45

Answer 2
0 2017-07-24 11:51:12

Answer 3
0 2017-07-24 12:09:50
