In Linux, how to check if 2 different patterns are on consecutive lines
I have an ASCII text file which I'm validating. The file contains contexts of 2 types:
Necessary Context: One which should be present at least once in its exact position.
Optional Context: One which may or may not be present, but if present should hold its proper place.
The detailed layout of the file:
[INDEX] <-- optional context; if present, it must be the first context and be immediately followed by [FEATURE_ID], else the file is invalid.
[FEATURE_ID] <-- necessary context; it must always be immediately followed by [FEATURE_REV], else the file is invalid. If an [INDEX] context is present, this is the second context in the file, otherwise the first.
[FEATURE_REV] <-- necessary context (one per FEATURE_ID); it must always come immediately after [FEATURE_ID], else the file is invalid.
[PRL_ID] <-- optional context; if present, it must come immediately after [FEATURE_REV], else the file is invalid.
[NO_OF_BYTES] <-- optional context; if present, it must come immediately after [PRL_ID] when [PRL_ID] is present, otherwise immediately after [FEATURE_REV]. Anything else makes the file invalid.
[NO_OF_SIGNIF_BITS] <-- optional context, allowed only when [NO_OF_BYTES] is present; it must sit between [NO_OF_BYTES] and [CRC], else the file is invalid.
[CRC] <-- necessary context (one per FEATURE_ID and FEATURE_REV). This is always the last context.
Note that a valid file may contain multiple [FEATURE_ID] contexts, and in every case the contexts before and after each one must follow the same placement rules. Something like this:
Validfile_1:
[FEATURE_ID]
[FEATURE_REV]
[CRC]
[INDEX]
[FEATURE_ID]
[FEATURE_REV]
[CRC]
Validfile_2:
[FEATURE_ID]
[FEATURE_REV]
[NO_OF_BYTES]
[CRC]
[INDEX]
[FEATURE_ID]
[FEATURE_REV]
[PRL_ID]
[NO_OF_BYTES]
[NO_OF_SIGNIF_BITS]
[CRC]
Validfile_3:
[FEATURE_ID]
[FEATURE_REV]
[CRC]
Invalidfile_1 (order of contexts not ok):
[FEATURE_ID]
[INDEX]
[FEATURE_REV]
[NO_OF_BYTES]
[CRC]
[PRL_ID]
Invalidfile_2(FEATURE_REV or CRC can never exist without a FEATURE_ID):
[FEATURE_REV]
[NO_OF_BYTES]
[CRC]
Invalidfile_3 (NO_OF_SIGNIF_BITS cannot exist without NO_OF_BYTES):
[FEATURE_ID]
[FEATURE_REV]
[NO_OF_SIGNIF_BITS]
[CRC]
I'm trying to achieve this in a Linux shell script with multiple if/else statements and egrep calls, but the code is becoming more and more complex.
The code I have so far:
# count the lines holding each mandatory context
f_id_c=$(egrep "[ ]*\[FEATURE_ID=[0-9].*\][ ]*" "$1" | wc -l)
f_rev_c=$(egrep "[ ]*\[FEATURE_REV=[0-9].*\][ ]*" "$1" | wc -l)
crc_c=$(egrep "[ ]*\[CRC\][ ]*" "$1" | wc -l)
[[ $((f_id_c)) -eq 0 ]] && { echo "Invalid! No [FEATURE_ID=] context defined in profile file !"; exit 1; }
[[ $((f_rev_c)) -ne $((f_id_c)) ]] && { echo "Invalid! Not all [FEATURE_REV=] contexts have leading [FEATURE_ID=] defined"; exit 1; }
[[ $((crc_c)) -ne $((f_id_c)) ]] && { echo "Invalid! Not all [CRC] contexts have leading [FEATURE_ID=] defined"; exit 1; }
for ((i=0; i<f_id_c; i++))
do
    # Have a check with SED that will confirm there is a [FEATURE_REV=] immediately following [FEATURE_ID=]
    :   # placeholder so the otherwise empty loop body parses
done
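The check left as a placeholder in the loop could be done in a single awk pass instead of one sed call per [FEATURE_ID]. The following is only a sketch under assumptions: the function name check_rev_follows_id is made up for illustration, and a "context line" is assumed to be any line whose first non-blank character is "[", so ordinary content lines between contexts are skipped.

```shell
#!/bin/sh
# check_rev_follows_id FILE   (hypothetical helper name)
# Exit status 0 when every [FEATURE_ID...] context is directly followed
# by a [FEATURE_REV...] context, 1 otherwise.
check_rev_follows_id() {
    awk '
    # assumption: context lines start with "[", possibly indented
    /^[ \t]*\[/ {
        if (prev ~ /^[ \t]*\[FEATURE_ID/ && $0 !~ /^[ \t]*\[FEATURE_REV/) {
            print "line " NR ": [FEATURE_ID] not followed by [FEATURE_REV]"
            bad = 1
        }
        prev = $0          # remember the previous context line
    }
    END {
        # a trailing [FEATURE_ID] has no successor at all
        if (prev ~ /^[ \t]*\[FEATURE_ID/) {
            print "end of file: [FEATURE_ID] not followed by [FEATURE_REV]"
            bad = 1
        }
        exit bad + 0
    }
    ' "$1"
}

# demo on the Invalidfile_1 layout from the question
demo=$(mktemp)
printf '%s\n' '[FEATURE_ID]' '[INDEX]' '[FEATURE_REV]' \
              '[NO_OF_BYTES]' '[CRC]' '[PRL_ID]' > "$demo"
check_rev_follows_id "$demo" || echo "invalid"
rm -f "$demo"
```

In the script above this would replace the whole for loop with something like: check_rev_follows_id "$1" || exit 1. It only covers the FEATURE_ID/FEATURE_REV pair; the other ordering rules still need their own checks.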
Can someone suggest a compact awk script or sed manipulation with which I can achieve all of the above validation?
You're going to want an FSM, something like this:
$ cat tst.awk
BEGIN {
    # define the allowed state transitions
    ns["IDLE","INDEX"]
    ns["IDLE","FEATURE_ID"]
    ns["INDEX","FEATURE_ID"]
    ns["FEATURE_ID","FEATURE_REV"]
    ns["FEATURE_REV","PRL_ID"]
    ns["FEATURE_REV","NO_OF_BYTES"]
    ns["FEATURE_REV","CRC"]
    ns["PRL_ID","NO_OF_BYTES"]
    ns["PRL_ID","CRC"]
    ns["NO_OF_BYTES","NO_OF_SIGNIF_BITS"]
    ns["NO_OF_BYTES","CRC"]
    ns["NO_OF_SIGNIF_BITS","CRC"]
    ns["CRC","INDEX"]
    ns["CRC","FEATURE_ID"]

    # create a regexp of the state names for use in match()
    for (state in ns) {
        sub(SUBSEP".*","",state)
        if (!seen[state]++) {
            states = states (states ? "|" : "") state
        }
    }

    # set the initial state
    state = "IDLE"
}

# parse the input
match($0,states) {
    nextState = substr($0,RSTART,RLENGTH)
    if ( ! ((state,nextState) in ns) ) {
        print "ERROR", NR, state, nextState, $0 | "cat>&2"
        exit 1
    }
    state = nextState
}
When run against your posted sample input file:
$ cat file
....
[FEATURE_ID]
[FEATURE_REV]
...
...
[CRC]
[INDEX]
[FEATURE_ID]
[FEATURE_REV]
...
...
...
[CRC]
$
$ awk -f tst.awk file
$
it produces no output, as expected, since the sample you provided contains no errors for it to find.
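One case the transition table alone does not catch is input that stops in the middle of a record, for example a file holding only [FEATURE_ID] and [FEATURE_REV], because nothing is checked once the input runs out. The sketch below is a variation on the same FSM idea, not the script above: it adds an END rule requiring that the last context seen is CRC. The function name validate_contexts and the "from>to" string encoding are made up for illustration, and it assumes that every line starting with "[" is a context line, so unknown context names are rejected rather than ignored.

```shell
#!/bin/sh
# validate_contexts FILE   (hypothetical helper name)
# Exit status 0 if FILE passes, 1 otherwise; diagnostics go to stderr.
validate_contexts() {
    awk '
    BEGIN {
        # allowed transitions written as "from>to" (same table as the FSM)
        t =   "IDLE>INDEX IDLE>FEATURE_ID INDEX>FEATURE_ID FEATURE_ID>FEATURE_REV"
        t = t " FEATURE_REV>PRL_ID FEATURE_REV>NO_OF_BYTES FEATURE_REV>CRC"
        t = t " PRL_ID>NO_OF_BYTES PRL_ID>CRC NO_OF_BYTES>NO_OF_SIGNIF_BITS"
        t = t " NO_OF_BYTES>CRC NO_OF_SIGNIF_BITS>CRC CRC>INDEX CRC>FEATURE_ID"
        n = split(t, pair, " ")
        for (i = 1; i <= n; i++) ok[pair[i]] = 1
        state = "IDLE"
        bad = 0
    }
    # assumption: a context line starts with "[" (optionally indented),
    # e.g. [CRC] or [FEATURE_ID=12]; every other line is ignored
    /^[ \t]*\[/ {
        name = $0
        sub(/^[ \t]*\[/, "", name)     # drop the leading "["
        sub(/[^A-Z_].*$/, "", name)    # drop "=value]" or "]"
        if (!((state ">" name) in ok)) {
            print "ERROR line " NR ": " state " -> " name | "cat 1>&2"
            bad = 1
            exit
        }
        state = name
    }
    END {
        # extra check: the last context seen must be CRC
        if (!bad && state != "CRC") {
            print "ERROR: input ends after " state | "cat 1>&2"
            bad = 1
        }
        exit bad
    }
    ' "$1"
}

# demo: a record that stops before [CRC]
demo=$(mktemp)
printf '%s\n' '[FEATURE_ID]' '[FEATURE_REV]' > "$demo"
validate_contexts "$demo" || echo "invalid"
rm -f "$demo"
```

With this rule an empty file is rejected as well, since the state never leaves IDLE, which matches the requirement that at least one [FEATURE_ID] must exist.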