How can I use grep to search for years from 1900 to 2100?

For example, if I have a variable containing 20123320, I want to print 2012.

Funny ways using bash (sh users beware!):

If you want to match and print all of these years that appear at the beginning of lines in a file named file:

printf "^%s\n" {1900..2100} | grep -of - file

If you have a variable named variable that contains 20123320:

variable=20123320
printf "^%s\n" {1900..2100} | grep -of - <(echo "$variable")

Now please describe in a bit more detail exactly what you want to do, so that we can give you the most appropriate answer.

Edit: since other answers use tools other than grep, here's a 100% bash solution:

variable="20123320"
# take the first 4 characters of variable:
year="${variable:0:4}"
# check that year is an integer and that it falls into the given range
# (10# forces base 10, so a value with a leading zero is not parsed as octal)
if [[ "$year" =~ ^[[:digit:]]+$ ]] && (( 1900 <= 10#$year && 10#$year <= 2100 )); then
    echo "$year"
else
    # Do whatever you want here
    echo "You dumbo, I couldn't find a valid year in your string"
fi

# gawk only: FIELDWIDTHS is a gawk extension; a width of 4 makes $1 the first four characters
awk 'BEGIN{FIELDWIDTHS="4 "}{if($1~/^[0-9]+$/&&$1>=1900&&$1<=2100)print $1}'

Try this:

echo "$var" | grep -Eo '\b(((19|20)[0-9][0-9])|2100)'

Or see my solution, since I think a regex is not the best approach here.

grep is not the best tool for this. Perl is more suitable, easier, and more robust for testing numeric ranges:

echo "$var" | perl -lne '
    $year = substr($_, 0, 4);
    print $year if $year <= 2100 && $year >= 1900 && $year =~ /^\d+$/
'

Or with awk, using the same logic:

echo "$var" | awk '
{
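    # awk strings are 1-indexed, so the first four characters start at position 1;
    # adding 0 forces a numeric rather than a string comparison
    year = substr($0, 1, 4)
    if (year ~ /^[0-9]+$/ && year + 0 >= 1900 && year + 0 <= 2100) {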
        print year
    }
}'

If you insist on using grep for this, you can.

I'll assume that you want to match a variable that starts with 4 digits in the range 1900 to 2100, and that you want to print just those 4 digits.

echo "$var" | grep -Eo '^(((19|20)[0-9][0-9])|2100)'

This ignores whatever may follow those first 4 digits (because I can't think of a way to check the rest of the string without printing it).
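One possible way around that limitation is to chain two grep calls: the first only filters, the second only extracts. The sketch below assumes the value should be exactly 8 digits, which is an assumption and not something the question states; the function name leading_year_strict is hypothetical.

```shell
# Hypothetical helper: the first grep lets the line through only if it is
# exactly 8 digits; the second grep prints just the leading year, and only
# if that year lies in 1900..2100.
leading_year_strict() {
    printf '%s\n' "$1" \
        | grep -E '^[0-9]{8}$' \
        | grep -Eo '^((19|20)[0-9][0-9]|2100)'
}

leading_year_strict 20123320   # prints 2012
```

Because the first grep prints the whole line or nothing, anything after the year is validated without ever reaching the output.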

But grep is not the obvious tool for this job, nor is a regular expression the best tool for matching a range of numbers. For example, if you needed to match numbers from 1950 to 2100, the regular expression would have to be substantially different.
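To illustrate how much the pattern changes, here is a sketch of the 1950 to 2100 variant. The range has to be split by hand into 1950-1999, 2000-2099, and 2100; the function name year_1950_2100 is hypothetical.

```shell
# Hypothetical helper: print the leading year if it lies in 1950..2100.
#   19[5-9][0-9]  covers 1950-1999
#   20[0-9][0-9]  covers 2000-2099
#   2100          covers the upper bound
year_1950_2100() {
    printf '%s\n' "$1" | grep -Eo '^(19[5-9][0-9]|20[0-9][0-9]|2100)'
}

year_1950_2100 19500101   # prints 1950
year_1950_2100 20123320   # prints 2012
```

Every change to either bound means re-deriving the alternatives, whereas a numeric comparison only needs the two numbers changed.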

Personally, I'd use Perl:

echo "$var" | perl -ne 'if (/^(\d{4})\d{4}$/ and $1 >= 1900 and $1 <= 2100) { print "$1\n" }'

This checks that $var contains exactly 8 decimal digits. If you want to check that they make up a valid date, you'll need some more code.
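As a rough idea of what that extra code might look like, here is a minimal sketch in plain POSIX shell. It assumes the 8 digits are laid out as YYYYMMDD, which the question does not confirm, and the function name is_valid_date is hypothetical. Its variables are global, because POSIX sh has no local.

```shell
# Hypothetical helper: succeed (exit status 0) only if $1 is a real calendar
# date written as YYYYMMDD (assumed layout) with a year in 1900..2100.
is_valid_date() {
    # Require exactly 8 digits.
    case $1 in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) return 1 ;;
    esac
    y=${1%????}              # first four characters
    md=${1#????}             # last four characters
    m=${md%??}; m=${m#0}     # month, one leading zero stripped
    d=${md#??}; d=${d#0}     # day, one leading zero stripped
    [ "$y" -ge 1900 ] && [ "$y" -le 2100 ] || return 1
    case $m in
        1|3|5|7|8|10|12) max=31 ;;
        4|6|9|11) max=30 ;;
        2)  # Gregorian leap-year rule
            if [ $((y % 4)) -eq 0 ] && { [ $((y % 100)) -ne 0 ] || [ $((y % 400)) -eq 0 ]; }; then
                max=29
            else
                max=28
            fi ;;
        *) return 1 ;;
    esac
    [ "$d" -ge 1 ] && [ "$d" -le "$max" ]
}

if is_valid_date 20120229; then echo "valid"; fi   # prints valid
```

With the value from the question, 20123320, the function fails because 33 is not a month.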

You could also do it fairly cleanly in awk, which might be a bit faster.
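A sketch of what the awk version could look like, mirroring the Perl one-liner above (exactly 8 digits, leading year in 1900 to 2100). The function name leading_year_awk is hypothetical, and no timing was measured here.

```shell
# Hypothetical helper: print the leading year of an 8-digit value if it
# lies in 1900..2100. Uses only POSIX awk features.
leading_year_awk() {
    printf '%s\n' "$1" | awk '
        /^[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]$/ {
            # substr is 1-indexed; adding 0 converts the string to a number
            year = substr($0, 1, 4) + 0
            if (year >= 1900 && year <= 2100) print year
        }'
}

leading_year_awk 20123320   # prints 2012
```

The digits are spelled out instead of written as [0-9]{8} because some older awk implementations do not support interval expressions.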
