
Trying to grep file paths from pipe output

I need to find the file paths in Perforce that do not follow our naming standard.

Our standard location for new files in Perforce is: //depot/project/name/content/<sub_project>/<version>/...

Here <sub_project> should be alphanumeric and <version> should be purely numeric, e.g. 1.0, 1.1, etc. I need to find the files that do not follow this standard. Here is the command I use to get the paths that do follow it. Is this the correct way to use egrep here?

p4 files //depot/project/name/content/... | egrep "//depot/project/name/content/.+/[[:alnum:]]+"

Let's say we have the following lines in the p4 command output:

//depot/project/name/content/cuda/sccm_2.1
//depot/project/name/content/cpla/test_3.1
//depot/project/name/content/ctest/arm_test
//depot/project/name/content/bfm/1.2
//depot/project/name/content/nvlog/1.0

I am interested only in the first three paths, i.e.

//depot/project/name/content/cuda/sccm_2.1
//depot/project/name/content/cpla/test_3.1
//depot/project/name/content/ctest/arm_test

It's not clear to me whether you want versions like 1 or 1.2.3 to count as invalid. This treats both as invalid and requires the version number to contain exactly one . with at least one digit on each side. The regex is easy to modify if needed:

awk '$NF !~ /^[0-9]+\.[0-9]+$/' FS=/ input
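As a quick check, here is a minimal sketch that runs this kind of filter over the five sample paths from the question. The function names sample_paths and nonstandard_versions are made up for illustration, and the version rule used (digits, one dot, digits) is an assumption about what the standard requires. In practice the output of p4 files would be piped in instead of the sample list.

```shell
# Sample paths copied from the question; stands in for real `p4 files` output.
sample_paths() {
  printf '%s\n' \
    '//depot/project/name/content/cuda/sccm_2.1' \
    '//depot/project/name/content/cpla/test_3.1' \
    '//depot/project/name/content/ctest/arm_test' \
    '//depot/project/name/content/bfm/1.2' \
    '//depot/project/name/content/nvlog/1.0'
}

# Print every line whose last /-separated field is NOT <digits>.<digits>.
# (Hypothetical helper name; the regex is an assumed version format.)
nonstandard_versions() {
  awk -F/ '$NF !~ /^[0-9]+\.[0-9]+$/'
}

sample_paths | nonstandard_versions
```

With the sample data this should leave only the sccm_2.1, test_3.1 and arm_test paths, since their last field is not a bare version number.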

Since you didn't give a list of possible names, I created a sample list:

printf "//depot/project/name/content/gaga/1.1\n//depot/project/name/content/chomp{}/1.1\n//depot/project/name/content/kaka/99.7\n//depot/project/name/content/kuku/1\n"

//depot/project/name/content/gaga/1.1
//depot/project/name/content/chomp{}/1.1
//depot/project/name/content/kaka/99.7
//depot/project/name/content/kuku/1

To find the 2 matches I used grep -P (because Perl regexps are much friendlier):

printf "//depot/project/name/content/gaga/1.1\n//depot/project/name/content/chomp{}/1.1\n//depot/project/name/content/kaka/99.7\n//depot/project/name/content/kuku/1\n" | grep -P "//depot/project/name/content/\w+/\d+\.\d+"

//depot/project/name/content/gaga/1.1
//depot/project/name/content/kaka/99.7

Now, if your version might be missing the dot, you can change the regexp to
"//depot/project/name/content/\w+/\d+\.?\d*"
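If grep -P is not available (it depends on how grep was built), roughly the same check can be written as a POSIX extended regex. The sketch below is an assumption-laden variant rather than a drop-in replacement: standard_paths is a hypothetical helper name, the pattern is anchored at both ends, and it is slightly stricter than the regexp above because a trailing dot with no digits after it is rejected.

```shell
# Keep only paths of the form .../<word chars>/<digits>[.<digits>].
# [[:alnum:]_] is the ERE spelling of \w, [0-9] of \d.
standard_paths() {
  grep -E '^//depot/project/name/content/[[:alnum:]_]+/[0-9]+(\.[0-9]+)?$'
}

printf '%s\n' \
  '//depot/project/name/content/gaga/1.1' \
  '//depot/project/name/content/chomp{}/1.1' \
  '//depot/project/name/content/kaka/99.7' \
  '//depot/project/name/content/kuku/1' |
  standard_paths
```

Here kuku/1 is accepted in addition to gaga/1.1 and kaka/99.7, while chomp{}/1.1 is still rejected because of the braces.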

Last but not least: since you already pass the full path to the p4 command, you can probably leave that prefix out of the regexp.

UPDATE
Given the input you posted, I updated the regexp to
grep -P "//depot/project/name/content/\w+/[a-zA-Z]\w+(\d\.\d+)?"
If file names might begin with non-alphabetic characters, add them to the square brackets.

> printf "//depot/project/name/content/cuda/sccm_2.1\n//depot/project/name/content/cpla/test_3.1\n//depot/project/name/content/ctest/arm_test\n//depot/project/name/content/bfm/1.2\n//depot/project/name/content/nvlog/1.0\n" | grep -P "//depot/project/name/content/\w+/[a-zA-Z]\w+(\d\.\d+)?"
//depot/project/name/content/cuda/sccm_2.1
//depot/project/name/content/cpla/test_3.1
//depot/project/name/content/ctest/arm_test

The following grep command uses a regex that matches your desired path, but also uses the -v option to invert the match. This has the effect of returning the unwanted paths:

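grep -v -E '//depot/project/name/content/[[:alnum:]]*/([0-9]+\.?)*[0-9]+(/|$)'

The slashes don't need escaping. The trailing (/|$) makes sure the <version> component ends right after the last digit, so the regular expression doesn't permit the <version> to start or end with a . character. Also, [:alnum:] doesn't include _ or -, so those would have to be added to the bracket expression if needed.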

So from this data:

//depot/project/name/content/cuda/sccm_2.1
//depot/project/name/content/cpla/test_3.1
//depot/project/name/content/ctest/arm_test
//depot/project/name/content/bfm/1.2
//depot/project/name/content/nvlog/1.0
//depot/project/name/content/nvlog/.0
//depot/project/name/content/bfm/10.2
//depot/project/name/content/bfm/10.2.1.7
//depot/project/name/content/nvlog/123545

It will return:

//depot/project/name/content/cuda/sccm_2.1
//depot/project/name/content/cpla/test_3.1
//depot/project/name/content/ctest/arm_test
//depot/project/name/content/nvlog/.0

I think that's what you want, but if not, let me know.
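The samples above end at the <version> component, whereas real p4 files output normally lists one file per line followed by a "#rev - action change N (type)" suffix. The following is a rough sketch for that case, assuming that line format and assuming a version means digits, one dot, digits. The function name nonstandard_dirs and the sample file names are invented for illustration. It checks both <sub_project> and <version> and prints each offending directory once.

```shell
# Report each distinct <sub_project>/<version> directory that breaks the rule.
# Splitting on "/": $7 is <sub_project>, $8 is <version> for paths under
# //depot/project/name/content/.
nonstandard_dirs() {
  awk -F/ '
    { sub(/#[0-9]+ - .*$/, "") }   # drop the "#rev - action ..." suffix, if any
    $7 !~ /^[A-Za-z0-9]+$/ || $8 !~ /^[0-9]+\.[0-9]+$/ {
      print "//depot/project/name/content/" $7 "/" $8
    }
  ' | LC_ALL=C sort -u
}

# Made-up lines in the usual `p4 files` shape, for demonstration only.
printf '%s\n' \
  '//depot/project/name/content/cuda/sccm_2.1/readme.txt#1 - add change 101 (text)' \
  '//depot/project/name/content/cuda/sccm_2.1/src/a.c#2 - edit change 102 (text)' \
  '//depot/project/name/content/bfm/1.2/b.c#1 - add change 103 (text)' \
  '//depot/project/name/content/chomp{}/1.1/c.c#1 - add change 104 (text)' |
  nonstandard_dirs
```

With real data the printf would be replaced by p4 files //depot/project/name/content/... piped into nonstandard_dirs.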
