How can I use grep to search for years from 1900 to 2100?
For example, if I have a variable containing 20123320, I want to print 2012.
Funny ways using bash (sh users beware!):
If you want to match and print all these years that appear at the beginning of lines in a file named file:
printf "^%s\n" {1900..2100} | grep -of - file
If you have a variable named variable that contains 20123320:
variable=20123320
printf "^%s\n" {1900..2100} | grep -of - <(echo "$variable")
Now please detail a little bit more what you want to do exactly so that we can give you the most appropriate answer.
Edit: Since other answers use tools other than bash and grep, here's a 100% bash solution:
variable="20123320"
# take the first 4 characters of variable:
year="${variable:0:4}"
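# check that year is exactly four digits and that it falls into the given range
# (10# forces base 10, so a value with a leading 0 such as 0899 is not read as octal)
if [[ "$year" =~ ^[[:digit:]]{4}$ ]] && (( 1900 <= 10#$year && 10#$year <= 2100 )); then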
echo "$year"
else
# Do whatever you want here
echo "You dumbo, I couldn't find a valid year in your string"
fi
awk 'BEGIN{FIELDWIDTHS="4 "}{if($1~/^[0-9]+$/&&$1>=1900&&$1<=2100)print $1}'
Try this:
echo "$var" | grep -Eo '\b(((19|20)[0-9][0-9])|2100)'
Or see my Perl solution, since I think a regex is not the best path here.
grep is not the best tool for this. Perl is more suitable: testing a numeric range is easier and more robust there:
echo "$var" | perl -lne '
$year = substr($_, 0, 4);
print $year if $year <= 2100 && $year >= 1900 && $year =~ /^\d+$/
'
Or with awk, using the same logic:
echo "$var" | awk '
{
year = substr($0, 1, 4)   # awk strings are 1-indexed; a start of 0 yields only 3 characters in gawk and mawk
if (year <= 2100 && year >= 1900 && $1 ~ /^[0-9]+$/) {
print year
}
}'
If you insist on using grep for this, you can.
I'll assume that you want to match a variable that starts with 4 digits in the range 1900 to 2100, and you want to print just those 4 digits.
echo "$var" | grep -Eo '^(((19|20)[0-9][0-9])|2100)'
This ignores whatever may follow those first 4 digits (because I can't think of a way to check the rest of the string without printing it).
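One possible workaround, offered only as a sketch, is to chain two grep calls: the first checks the whole string and passes it through unchanged, the second prints just the leading year. The function name year_if_eight_digits is made up for illustration, and the "exactly 8 digits" requirement is an assumption about the input format.

```shell
# Hypothetical helper: print the leading year only if the whole input
# is exactly 8 digits and the first 4 fall in 1900..2100.
year_if_eight_digits() {
    printf '%s\n' "$1" |
        grep -E '^[0-9]{8}$' |
        grep -Eo '^((19|20)[0-9][0-9]|2100)'
}

year_if_eight_digits 20123320                      # prints 2012
year_if_eight_digits 2012abcd || echo "no match"   # prints no match
```

The first grep has no -o, so it acts purely as a filter; only the second one trims the output.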
But grep is not the obvious tool for this job, nor is a regular expression the best tool for matching a range of numbers. For example, if you needed to match numbers from 1950 to 2100, the regular expression would have to be substantially different.
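To make that concrete, here is a sketch of what the pattern might look like for 1950 to 2100. The range has to be split by digit position, so the (19|20) shortcut no longer works. The function name year_1950_to_2100 is hypothetical.

```shell
# Hypothetical helper for the range 1950..2100:
#   19[5-9][0-9]  covers 1950..1999
#   20[0-9][0-9]  covers 2000..2099
#   2100          covers 2100
year_1950_to_2100() {
    printf '%s\n' "$1" | grep -Eo '^(19[5-9][0-9]|20[0-9][0-9]|2100)'
}

year_1950_to_2100 20123320                      # prints 2012
year_1950_to_2100 19493320 || echo "no match"   # 1949 is below the range
```

Every change to either bound means rewriting the alternation by hand, which is why a numeric comparison is usually easier to maintain.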
Personally, I'd use Perl:
echo "$var" | perl -ne 'if (/^(\d{4})\d{4}$/ and $1 >= 1900 and $1 <= 2100) { print "$1\n" }'
This checks that $var contains exactly 8 decimal digits. If you want to check that they make up a valid date, you'll need some more code.
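As a rough idea of what that extra code could involve, here is a plain sh sketch. It assumes the 8 digits are laid out as YYYYMMDD, which the question does not actually state, and the function name is_valid_yyyymmdd is made up. Note that POSIX sh has no local variables, so y, m, d and max are globals here.

```shell
# Hypothetical helper: succeed if $1 is a valid YYYYMMDD date
# with the year in 1900..2100 (Gregorian leap-year rule).
is_valid_yyyymmdd() {
    case $1 in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;   # exactly 8 digits
        *) return 1 ;;
    esac
    y=${1%????}            # first 4 characters
    md=${1#????}           # last 4 characters
    m=${md%??}
    d=${md#??}
    m=${m#0}; d=${d#0}     # drop one leading zero: 08 -> 8, 00 -> 0
    [ "$y" -ge 1900 ] && [ "$y" -le 2100 ] || return 1
    case $m in
        1|3|5|7|8|10|12) max=31 ;;
        4|6|9|11)        max=30 ;;
        2)  if [ $((y % 4)) -eq 0 ] &&
               { [ $((y % 100)) -ne 0 ] || [ $((y % 400)) -eq 0 ]; }
            then max=29; else max=28; fi ;;
        *)  return 1 ;;    # month 0 or above 12
    esac
    [ "$d" -ge 1 ] && [ "$d" -le "$max" ]
}

is_valid_yyyymmdd 20120229 && echo "valid"        # prints valid
is_valid_yyyymmdd 20123320 || echo "not a date"   # prints not a date (month 33)
```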
You could also do it fairly cleanly in awk, which might be a bit faster.
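A minimal awk sketch of the same check, assuming the input must be exactly 8 digits as in the Perl version. The function name year_awk is hypothetical, and the speed difference has not been measured.

```shell
# Hypothetical helper: same check as the Perl one-liner, done in awk.
year_awk() {
    printf '%s\n' "$1" | awk '
        length($0) == 8 && /^[0-9]+$/ {
            year = substr($0, 1, 4) + 0     # 1-indexed; +0 forces a numeric value
            if (year >= 1900 && year <= 2100) print year
        }'
}

year_awk 20123320    # prints 2012
```

The length() test stands in for a {8} interval expression, which some older awk implementations do not support.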