Disclaimer (off-topic warning): This is not about listing the ignored files actually detected in the repo. It is about ignored paths, even when no file currently matches any of them.
Context: I'm trying to write a git alias that "flattens" all .gitignore patterns recursively and outputs a list of paths as they're seen from the top level.
What I mean, with an example:
├─ .git
├─ .gitignore
└─ dir1
   ├─ .gitignore
   ├─ file1.txt
   └─ file2.txt
With these contents in the .gitignore files:
# (currently pointing at top-level directory)
$ cat .gitignore
some_path
$ cat dir1/.gitignore
yet_another_path
*.txt
I'm trying to write an alias that outputs something along the lines of
$ git flattened-ignore-list
some_path
dir1/yet_another_path
dir1/*.txt
What do I have so far?
I know I can search for all .gitignore files in the repo with
find . -name ".gitignore"
which in this case would output
.gitignore
dir1/.gitignore
So I've tried to combine this with cat to get their contents (either of these works)
find . -name ".gitignore" | xargs cat
# or
cat $(find . -name ".gitignore")
with this result:
some_path
yet_another_path
*.txt
which is technically expected but unfortunately unhelpful for what I am trying to achieve. So to (at last!) arrive at my actual question:
How can I, for each result of find, refer to the current path? (in order to eventually prepend it to the line)
Note for people suspecting an XY problem: That might be the case; my approach might just be naive here, I'm unsure. For example, I didn't consider complex cases where nested .gitignore files could refer to upper levels, or special syntax with **. I've stuck to very simple structures for now, so if you see a flaw and/or can suggest a totally different way to achieve the same goal, I'll of course be happy to hear about it.
I try to have an alias to output something along the lines of
$ git flattened-ignore-list
some_path
dir1/yet_another_path
dir1/*.txt
Unfortunately, this approach is naive (and perhaps doomed, but maybe not) because entries in .gitignore files are a bit complicated.
The simple answer to the simple question you asked is to use something that prepends the directory name, relative to the top level. Since find never outputs unnecessarily complicated names, you can do this with direct string processing:
.gitignore
dir1/.gitignore
tells you that when reading the first file, prepend nothing, and when reading the second, prepend dir1 to each entry. Doing this in shell is a little tricky, but bash has the tools needed: you just take the line minus the /.gitignore at the end, either using regexp replacement or by removing 11 characters from anything that has a slash in it (that is, anything other than the literal 10-character string .gitignore). Take the directory from the part before the /.gitignore name and use sed or awk to insert it, and a slash, in front of non-comment entries (and remember to handle ! entries a little differently: the ! has to stay in front of the inserted directory).
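The prefixing step described above can be sketched in plain POSIX sh with parameter expansion instead of sed or awk. This is only a minimal sketch: the function names prefix_for, prefix_lines, and flatten_naive are made up for illustration, and it deliberately ignores the slash rules of gitignore syntax.

```shell
#!/bin/sh
# Hypothetical helper: turn a path printed by find into the prefix to prepend.
#   ./.gitignore       -> ""       (top level: prepend nothing)
#   ./dir1/.gitignore  -> "dir1/"
prefix_for() {
    path=${1#./}                       # drop a leading "./" if present
    case $path in
        .gitignore) printf '' ;;
        *) printf '%s/' "${path%/.gitignore}" ;;
    esac
}

# Hypothetical helper: read one .gitignore on stdin, write prefixed entries.
prefix_lines() {
    prefix=$1
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            ''|'#'*) ;;                                      # skip blanks and comments
            '!'*) printf '!%s%s\n' "$prefix" "${line#?}" ;;  # keep "!" in front
            *)    printf '%s%s\n' "$prefix" "$line" ;;
        esac
    done
}

# Naive flattening of every .gitignore below the current directory.
flatten_naive() {
    find . -name .gitignore | while IFS= read -r file; do
        prefix_lines "$(prefix_for "$file")" < "$file"
    done
}
```

Because every expansion is quoted, a pattern such as *.txt is passed through literally instead of being expanded by the shell.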
You are probably better off handling the top-level .gitignore separately (you can just copy it straight through, adding a final newline if necessary) and then dealing with subdirectory .gitignore files in a different code path.
Note that a subdirectory .gitignore cannot refer to something above it: nothing in dir1/.gitignore can change whether ./foo or dir2/foo is ignored or not. So that part is not a problem.
The part that is a problem is that, in dir1, the entry:
*.txt
implies that the top level should not only ignore untracked dir1/*.txt files, but also ignore dir1/sub/*.txt files, dir1/sub/sub2/*.txt, and so on. However, a dir1 entry reading:
sub/*.txt
means that the top level should ignore only untracked dir1/sub/*.txt files, without ignoring any dir1/sub/sub2/*.txt files!
You may be able to salvage this with yet more code: while reading a subdirectory .gitignore, check whether there are embedded slashes in any given line. An embedded slash is one that is not the final slash, because a final slash is removed before making this particular distinction.
If the entry contains an embedded slash, it applies only to that exact path relative to the subdirectory. You can therefore add dir1/ in front and be done, e.g.:
dir1/foo/*.txt
If the entry does not contain an embedded slash, it applies to the subdirectory and all of its nested sub-subdirectories. You will need to allow for any arbitrary number of subdirectories. This might be correct, but it's quite untested:
dir1/*.txt
dir1/**/*.txt
(In theory **/ should also match the empty list of subdirectories, so only the second line should be needed, but in practice I have seen this not happen for some cases. I do not recall whether this was in other pathspecs, .gitignore files, or both.)
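The embedded-slash rule above could be sketched as a small POSIX sh function. The name flatten_entry is made up for illustration, and the sketch is as untested against real Git behavior as the two-line expansion it emits. It also treats a leading slash as anchoring the entry to the subdirectory, which is what the gitignore documentation describes, and it does not handle comments, blank lines, or backslash escapes.

```shell
#!/bin/sh
# Hypothetical helper: flatten one entry read from <dir>/.gitignore.
# Usage: flatten_entry dir1 '*.txt'
flatten_entry() {
    dir=$1
    entry=$2
    neg=
    # Remember a leading "!" so it can be put back in front of the prefix.
    case $entry in
        '!'*) neg='!'; entry=${entry#?} ;;
    esac
    trimmed=${entry%/}      # a final slash does not count as embedded
    case $trimmed in
        */*)
            # Embedded (or leading) slash: anchored to dir, one line is enough.
            printf '%s%s/%s\n' "$neg" "$dir" "${entry#/}"
            ;;
        *)
            # No embedded slash: applies at any depth below dir.
            printf '%s%s/%s\n' "$neg" "$dir" "$entry"
            printf '%s%s/**/%s\n' "$neg" "$dir" "$entry"
            ;;
    esac
}
```

For example, flatten_entry dir1 'sub/*.txt' emits a single anchored line, while flatten_entry dir1 '*.txt' emits both the direct and the `**/` form.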
In general, most .gitignore entries seem not to contain embedded slashes, so any successful script you write will probably produce a nearly double-length "flattened" ignore file, compared to its input length.
You can produce a complete list of ignore patterns, with directory prefix like this:
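#!/usr/bin/env sh
# List every pattern from every .gitignore, prefixed with its directory.
# Requires GNU find for -printf.
find . -type f -name '.gitignore' -printf '%h\n' |
while IFS= read -r dir_name; do
    # Keep only lines that do not start with "#" or whitespace.
    sed -n '/^[^#[:space:]]/p' "$dir_name/.gitignore" |
    while IFS= read -r pattern; do
        # Quoted arguments: patterns such as *.txt are not glob-expanded
        # and entries containing spaces are not split.
        printf '%s/%s\n' "$dir_name" "$pattern"
    done
done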
The above code just lists all patterns found in .gitignore files across directories, adding the directory as a prefix to each pattern.
It does not reflect the gitignore syntax and behavior described in the git documentation: https://git-scm.com/docs/gitignore