Parsing text with sed / awk

I am trying to parse an HTML table in order to obtain its values. See here:

    <tr>
            <th>CLI:</th>
            <td>0044123456789</td>
    </tr>

    <tr>
            <th>Call Type:</th>
            <td>New Enquiry</td>
    </tr>

    <tr>
            <th class=3D"nopaddingtop">Caller's Name:</th>
            <td class=3D"nopaddingtop">&nbsp;</td>
    </tr>

    <tr>
            <th class=3D"nopaddingmid"></th>
            <td class=3D"nopaddingmid">Mr</td>
    </tr>

    <tr>
            <th class=3D"nopaddingmid"></th>
            <td class=3D"nopaddingmid">Lee</td>
    </tr>

    <tr>
            <th class=3D"nopaddingbot"></th>
            <td class=3D"nopaddingbot">Butler</td>
    </tr>

I want to read the values associated with "CLI", "Call Type", and "Caller's Name" into separate variables using sed / awk.

For example:

cli="0044123456789"
call_type="New Enquiry"
caller_name="Mr Lee Butler"

How can I do this?

Many thanks, Neil.

An example for the CLI value:

var=$(xmllint --html --xpath '//th[contains(., "CLI")]/../td/text()' file.html)
echo "$var"
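
Where xmllint is not available, the same value can be pulled out with sed alone. This is only a minimal sketch, not a general HTML parser: it assumes the `<td>` sits on the line directly after the matching `<th>` line, as in the sample from the question. The temporary file and the shortened sample are illustrative.

```shell
# Shortened sample input in the layout shown in the question.
html=$(mktemp)
cat > "$html" <<'EOF'
<tr>
        <th>CLI:</th>
        <td>0044123456789</td>
</tr>
<tr>
        <th>Call Type:</th>
        <td>New Enquiry</td>
</tr>
EOF

# On the line holding the CLI header, load the next line (n) and keep
# only the text between <td ...> and </td>.
cli=$(sed -n '/<th[^>]*>CLI:<\/th>/{n;s/.*<td[^>]*>\(.*\)<\/td>.*/\1/p;}' "$html")
rm -f "$html"

echo "$cli"
```

Replacing `CLI:` with `Call Type:` in the address pattern should select the second row in the same way.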

For the value spread across multiple <tr> rows:

for key in {4..6}; do
    xmllint \
        --html \
        --xpath "//th[contains(., 'CLI')]/../../tr[$key]/td/text()" file.html
    printf ' '
done
echo

Output:

Mr Lee Butler
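
Since the question asks specifically for sed / awk, here is a rough awk sketch that fills all three variables. It is line-based and fragile compared with a real HTML parser: it assumes every `<th>` and `<td>` sits on a single line, and it treats rows with an empty `<th>` as continuations of the previous label. The helper name `get_field` and the temporary file are hypothetical, introduced only for illustration.

```shell
# Sample input copied from the question (the "=3D" is left as it appears there).
html=$(mktemp)
cat > "$html" <<'EOF'
<tr>
        <th>CLI:</th>
        <td>0044123456789</td>
</tr>
<tr>
        <th>Call Type:</th>
        <td>New Enquiry</td>
</tr>
<tr>
        <th class=3D"nopaddingtop">Caller's Name:</th>
        <td class=3D"nopaddingtop">&nbsp;</td>
</tr>
<tr>
        <th class=3D"nopaddingmid"></th>
        <td class=3D"nopaddingmid">Mr</td>
</tr>
<tr>
        <th class=3D"nopaddingmid"></th>
        <td class=3D"nopaddingmid">Lee</td>
</tr>
<tr>
        <th class=3D"nopaddingbot"></th>
        <td class=3D"nopaddingbot">Butler</td>
</tr>
EOF

# Hypothetical helper: get_field LABEL FILE
# Prints the <td> text of every row belonging to LABEL, joined by spaces.
get_field() {
    awk -v want="$1" '
        /<th[ >]/ {
            label = $0
            gsub(/<[^>]*>/, "", label)            # drop the tags
            gsub(/^[ \t]+|[ \t:]+$/, "", label)   # trim blanks and the trailing colon
            if (label != "") current = label      # an empty <th> keeps the previous label
        }
        /<td[ >]/ && current == want {
            value = $0
            gsub(/<[^>]*>/, "", value)            # drop the tags
            gsub(/&nbsp;/, "", value)             # ignore placeholder cells
            gsub(/^[ \t]+|[ \t]+$/, "", value)    # trim surrounding blanks
            if (value != "") out = (out == "" ? value : out " " value)
        }
        END { print out }
    ' "$2"
}

cli=$(get_field "CLI" "$html")
call_type=$(get_field "Call Type" "$html")
caller_name=$(get_field "Caller's Name" "$html")
rm -f "$html"

printf 'cli="%s"\ncall_type="%s"\ncaller_name="%s"\n' "$cli" "$call_type" "$caller_name"
```

Unlike the xmllint loop, this does not depend on the row positions 4 to 6, but it will break if a cell is wrapped across lines or contains nested tags with `>` inside attribute values.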
