How to extract text that matches particular fields in a text file using Linux commands
Hi, below is my text file:
{"Author":"john"
"subject":"java"
"title":"java cook book.pdf"}
{"title":"Php book.pdf"
"Author":"Smith"
"subject":"PHP"}
{"Author":"Smith"
"title":"Java book.pdf"}
From the above data I want to extract all titles that contain the word "java". I should get the following output:
java cook book.pdf
Java book.pdf
Please suggest a solution.
Thanks
You can try something like this with awk:
awk -F: '$1~/title/&&tolower($2)~/java/{gsub(/["}]/,"",$2);print $2}' file
-F: sets the field separator to :
$1~/title/ checks whether the first column contains title
tolower($2)~/java/ checks the second column for java, case-insensitively
gsub(..) strips the surrounding punctuation, such as ", from the second column
print $2 prints the second column
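To check the command against the question's data, the sample can be written to a file and the one-liner run on it. This is only a minimal sketch: the file name sample.txt is an assumption, and the gsub pattern used here strips the closing brace as well as the quotes, because a title on the last line of a block is followed by }.

```shell
# Recreate the sample data from the question (sample.txt is a hypothetical name).
cat > sample.txt <<'EOF'
{"Author":"john"
"subject":"java"
"title":"java cook book.pdf"}
{"title":"Php book.pdf"
"Author":"Smith"
"subject":"PHP"}
{"Author":"Smith"
"title":"Java book.pdf"}
EOF

# Keep lines whose first field mentions title and whose second field
# contains "java" in any letter case; strip quotes and braces, then print.
titles=$(awk -F: '$1~/title/ && tolower($2)~/java/ {gsub(/["}]/,"",$2); print $2}' sample.txt)
printf '%s\n' "$titles"
```

On this input the two Java titles should be printed, one per line, and the Php title skipped.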
I would avoid any complex solution and rely on good old grep + awk + tr instead:
$ grep '"title":' test.txt | grep '[Jj]ava' | awk -F: '{print $2}' | tr -d '"}'
java cook book.pdf
Java book.pdf
which works as follows:
grep selects all lines containing "title":
from those, the second grep keeps the lines containing Java or java
awk splits each line on : and shows the second field
tr removes the " and } signs
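The same steps can also be collapsed into fewer processes. As a sketch only: sample.txt is a hypothetical file holding the question's data, and sed is swapped in for awk and tr here, which is not what the answer above uses. grep -i selects title lines that mention java in any letter case, and sed keeps only the quoted value.

```shell
# Recreate the sample data from the question (sample.txt is a hypothetical name).
cat > sample.txt <<'EOF'
{"Author":"john"
"subject":"java"
"title":"java cook book.pdf"}
{"title":"Php book.pdf"
"Author":"Smith"
"subject":"PHP"}
{"Author":"Smith"
"title":"Java book.pdf"}
EOF

# -i makes the whole pattern case-insensitive, so JAVA would match as well.
# The sed expression keeps only the text between the quotes after "title":.
result=$(grep -i '"title":.*java' sample.txt | sed 's/.*"title":"\([^"]*\)".*/\1/')
printf '%s\n' "$result"
```

Like the original pipeline, this still assumes that each title sits on a single line.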
You should definitely use a JSON parser to get flawless results. I like the one provided with PHP, and if your file is, as shown, a bunch of JSON blocks separated by blank lines:
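foreach ( explode("\n\n", file_get_contents('/your/file.json_blocks')) as $js_block ):
    // Decode to an associative array; json_decode() returns null for a block that is not valid JSON.
    $json = json_decode( trim($js_block), true );
    // stripos() returns 0 for a match at the start of the string, so compare against false.
    if ( isset( $json['title'] ) && stripos($json['title'], 'java') !== false ):
        echo trim($json['title']), PHP_EOL;
    endif;
endforeach;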
This will be a lot more surefire than doing the same with any combination of sed/awk/grep et al., simply because JSON follows a specific format and should be read with a parser. As an example, a simple newline in the 'title', which has no real meaning to the JSON, will break the solution provided by Jaypal. Please see this for a similar problem, parsing XHTML with regex and why you shouldn't do it: RegEx match open tags except XHTML self-contained tags
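The fragility described above is easy to reproduce. In this sketch the file name split.txt and its contents are made up for illustration: the title's key and value sit on separate lines, which a JSON parser treats as insignificant whitespace, yet a line-oriented awk filter no longer finds the title.

```shell
# A hypothetical block where the value follows its key on the next line.
cat > split.txt <<'EOF'
{"Author":"Smith",
"title":
"Java book.pdf"}
EOF

# A line-based filter like the one in the awk answer: "title" and
# "Java book.pdf" are on different lines, so no line passes both tests.
out=$(awk -F: '$1~/title/ && tolower($2)~/java/ {gsub(/["}]/,"",$2); print $2}' split.txt)
printf 'matches: [%s]\n' "$out"
```

The brackets come out empty here, even though the block contains a Java title.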