How can I use grep to search for years from 1900 to 2100?
For example, if I have a variable containing 20123320, I want to print 2012.
Fun ways using bash (sh users beware!):
If you want to match and print all of these years where they appear at the beginning of lines in a file named file:
printf "^%s\n" {1900..2100} | grep -of - file
If you have a variable named variable that contains 20123320:
variable=20123320
printf "^%s\n" {1900..2100} | grep -of - <(echo "$variable")
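To see why this works, it can help to look at the pattern list that the brace expansion produces. The following is only a sketch, assuming bash and a grep that supports -o (GNU or BSD grep); the function name leading_year is made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the leading year (1900-2100) of a string, if any.
leading_year() {
    # {1900..2100} expands to 201 words; printf reuses its format for each one,
    # emitting one anchored pattern per line for grep -f to read from stdin.
    printf '^%s\n' {1900..2100} | grep -of - <(echo "$1")
}

printf '^%s\n' {1900..1902}                      # prints ^1900, ^1901, ^1902, one per line
leading_year 20123320                            # prints 2012
leading_year 18991231 || echo "no year found"    # grep finds nothing and exits non-zero
```

Because every pattern is anchored with ^, a year that appears later in the string is not reported.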
Now, please describe in a bit more detail exactly what you want to do, so that we can give you the most appropriate answer.
Edit. Since other answers use tools other than bash and grep, here's a 100% bash solution:
variable="20123320"
# take the first 4 characters of variable:
year="${variable:0:4}"
# check that year is an integer and that it falls into the given range
# (10# forces base 10, so a leading zero is not parsed as octal)
if [[ "$year" =~ ^[[:digit:]]+$ ]] && (( 1900 <= 10#$year && 10#$year <= 2100 )); then
    echo "$year"
else
    # Do whatever you want here
    echo "You dumbo, I couldn't find a valid year in your string"
fi
awk 'BEGIN{FIELDWIDTHS="4 "}{if($1~/^[0-9]+$/&&$1>=1900&&$1<=2100)print $1}'
grep is not the best tool for this. Perl is more suitable, and it makes testing numeric ranges easier and more robust:
echo "$var" | perl -lne '
$year = substr($_, 0, 4);
print $year if $year <= 2100 && $year >= 1900 && $year =~ /^\d+$/
'
Or with awk, using the same logic:
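echo "$var" | awk '
{
    # substr() is 1-indexed in awk, so the first 4 characters start at position 1
    year = substr($0, 1, 4)
    # test the digits first, then add 0 to force a numeric comparison
    if (year ~ /^[0-9]+$/ && year + 0 <= 2100 && year + 0 >= 1900) {
        print year
    }
}'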
If you insist on using grep for this, you can.
I'll assume that you want to match a variable that starts with 4 digits in the range 1900 to 2100, and you want to print just those 4 digits.
echo "$var" | grep -Eo '^(((19|20)[0-9][0-9])|2100)'
This ignores whatever may follow those first 4 digits (because I can't think of a way to check the rest of the string without printing it).
But grep is not the obvious tool for this job, nor is a regular expression the best tool for matching a range of numbers. For example, if you needed to match numbers from 1950 to 2100, the regular expression would have to be substantially different.
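To illustrate how much the pattern changes, here is one possible expression for 1950 to 2100. It is a sketch only, and the function name year_1950_2100 is hypothetical. The 19xx branch must now be restricted to the decades 5 through 9, so the simple (19|20) grouping no longer works:

```shell
# Hypothetical helper: print the leading year if it lies in 1950-2100.
year_1950_2100() {
    # Three alternatives: 1950-1999, 2000-2099, and 2100 itself.
    echo "$1" | grep -Eo '^(19[5-9][0-9]|20[0-9][0-9]|2100)'
}

year_1950_2100 19491231 || echo "no match"   # 1949 is below the range
year_1950_2100 19501231                      # prints 1950
year_1950_2100 21001231                      # prints 2100
```

Every change to either bound means rewriting the alternatives by hand, which is why a numeric comparison is usually easier to maintain.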
Personally, I'd use Perl:
echo "$var" | perl -ne 'if (/^(\d{4})\d{4}$/ and $1 >= 1900 and $1 <= 2100) { print "$1\n" }'
This checks that $var contains exactly 8 decimal digits. If you want to check that they make up a valid date, you'll need some more code.
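As a rough idea of what that extra code might look like, here is a pure-bash sketch. It assumes the 8 digits are laid out as YYYYMMDD, which the question does not actually state, and the function name valid_yyyymmdd is invented for this example:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if $1 is 8 digits forming a real YYYYMMDD date
# (assumed layout) with a year between 1900 and 2100.
valid_yyyymmdd() {
    [[ $1 =~ ^([0-9]{4})([0-9]{2})([0-9]{2})$ ]] || return 1
    # 10# forces base 10 so that "08" and "09" are not rejected as octal.
    local y=$((10#${BASH_REMATCH[1]}))
    local m=$((10#${BASH_REMATCH[2]}))
    local d=$((10#${BASH_REMATCH[3]}))
    (( y >= 1900 && y <= 2100 && m >= 1 && m <= 12 && d >= 1 )) || return 1
    local days=(31 28 31 30 31 30 31 31 30 31 30 31)
    # Gregorian leap years: divisible by 4, except centuries not divisible by 400.
    if (( y % 4 == 0 && (y % 100 != 0 || y % 400 == 0) )); then
        days[1]=29
    fi
    (( d <= days[m-1] ))
}

valid_yyyymmdd 20120229 && echo "20120229 ok"
valid_yyyymmdd 20123320 || echo "20123320 is not a valid date"
```

Under that assumed layout, the 20123320 from the question has month 33, so it is rejected even though its year is in range.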
You could also do it fairly cleanly in awk, which might be a bit faster.
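One possible awk version of the Perl one-liner above, as a sketch. The function name year_from_8_digits is hypothetical, and the program sticks to basic awk features, avoiding the {4} interval syntax that some older awks lack:

```shell
# Hypothetical helper: print the year if $1 is exactly 8 digits
# and the first 4 fall in the range 1900-2100.
year_from_8_digits() {
    echo "$1" | awk '
        # Accept only lines made of exactly 8 decimal digits.
        length($0) == 8 && /^[0-9]+$/ {
            # substr() is 1-indexed; adding 0 turns the string into a number.
            year = substr($0, 1, 4) + 0
            if (year >= 1900 && year <= 2100) print year
        }'
}

year_from_8_digits 20123320   # prints 2012
year_from_8_digits 2012332    # 7 digits: prints nothing
```

No speed comparison was made here, so the claim about awk being faster remains untested.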