
What does `cat-file` stand for in git?

In git, what does cat-file stand for in this command?

$ git cat-file <...>

My first thought is "concatenate file" because the Unix command cat stands for "concatenate", but this doesn't correspond to the function of git cat-file.

While cat does stand for "concatenate", what it actually does is simply display one or multiple files, in the order in which they appear in the command-line arguments to cat. The common pattern to view the contents of a file on Linux or *nix systems is:

cat <file>
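As a small illustration of that behavior, the sketch below passes two files to cat and captures the joined result. The temporary directory and the file names a.txt and b.txt are made up for this example.

```shell
# Create two small files in a throwaway directory (names are illustrative).
dir=$(mktemp -d)
printf 'first\n' > "$dir/a.txt"
printf 'second\n' > "$dir/b.txt"

# cat prints the files one after the other, in argument order.
joined=$(cat "$dir/a.txt" "$dir/b.txt")
echo "$joined"
```

With a single argument, the same command simply displays that one file, which is the everyday usage shown above.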

The main difference between cat and Git's cat-file is that the latter only displays a single file (hence the -file part). Git's cat-file doesn't really stand for "concatenate"; it is simply a reference to the behavior of the cat command.

git-cat-file - Provide content or type and size information for repository objects

Technically, you can use git cat-file to concatenate files, if you use Batch Output mode:

BATCH OUTPUT

If --batch or --batch-check is given, cat-file will read objects from stdin, one per line, and print information about them. By default, the whole line is considered as an object, as if it were fed to git-rev-parse[1].
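The batch behavior quoted above can be tried with a sketch like the following. It assumes git is installed; the throwaway repository and the two blob contents are invented for the example.

```shell
# Throwaway repository; nothing here touches an existing project.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Store two blobs and remember their object names.
a=$(printf 'one\n' | git hash-object -w --stdin)
b=$(printf 'three\n' | git hash-object -w --stdin)

# Feed both names on stdin, one per line; --batch-check prints
# "<name> <type> <size>" for each of them.
out=$(printf '%s\n%s\n' "$a" "$b" | git cat-file --batch-check)
echo "$out"
```

Replacing --batch-check with --batch would additionally print each object's contents after its header line, which is the sense in which cat-file can output several objects in a row.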

To read the content (or blob) of a Git object:

git cat-file -p <SHA1>

To read its type:

git cat-file -t <SHA1>
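A minimal way to try both options, assuming git is installed, is to store a file as a blob in a scratch repository and query it. The repository location and the file name greeting.txt are placeholders chosen for this sketch.

```shell
# Scratch repository with one file stored as a blob.
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo 'hello' > greeting.txt

# hash-object -w writes the blob and prints its object name.
blob=$(git hash-object -w greeting.txt)

type=$(git cat-file -t "$blob")     # object type
size=$(git cat-file -s "$blob")     # object size in bytes
content=$(git cat-file -p "$blob")  # pretty-printed content
echo "$type $size $content"
```

The size is 6 because echo appends a newline to the five letters of "hello".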

The main difference between cat and Git's cat-file is that the latter only displays a single file (hence the -file part)

A single file, ... or a list of single files.

In the second form, a list of objects (separated by linefeeds) is provided on stdin, and the SHA-1, type, and size of each object is printed on stdout.
The output format can be overridden using the optional <format> argument.

This is important when you consider git cat-file --batch, which prints object information and contents for each object provided on stdin.

See also git cat-file --batch-command with Git 2.36 (Q2 2022).

And with Git 2.34 (Q4 2021), the "ref-filter" machinery that drives the "--format" option of "git for-each-ref" and its friends evolves, to be used in "git cat-file --batch".

See commit bff9703 (01 Jul 2021) by Junio C Hamano (gitster).
See commit b9dee07, commit e85fcb3, commit 7121c4d, commit bd0708c, commit 311d0b8 (26 Jul 2021) by ZheNing Hu (adlternative).
(Merged by Junio C Hamano -- gitster -- in commit bda891e, 24 Aug 2021)

ref-filter: add %(rest) atom

Reviewed-by: Jacob Keller
Suggested-by: Jacob Keller
Mentored-by: Christian Couder
Mentored-by: Hariom Verma
Signed-off-by: ZheNing Hu

%(rest) is an atom used for cat-file batch mode, which splits the input lines at the first whitespace boundary: all characters before that whitespace are considered to be the object name; characters after that first run of whitespace (i.e., the "rest" of the line) are output in place of the %(rest) atom.

In order to let "cat-file --batch=%(rest)" use the ref-filter interface, add the %(rest) atom to ref-filter.

Introduce reject_atom() to reject the %(rest) atom for "git for-each-ref", "git branch", "git tag" and "git verify-tag".
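The %(rest) behavior described in that commit message can be observed with a sketch like this one. It assumes git is installed; the scratch repository, the blob content and the label "my-label" are invented for the illustration.

```shell
# Scratch repository with a single blob.
repo=$(mktemp -d)
cd "$repo"
git init -q .
blob=$(printf 'data\n' | git hash-object -w --stdin)

# The input line is "<object name> my-label". Because the format uses
# %(rest), the line is split at the first whitespace: the first part is
# the object name, and the remainder replaces %(rest) in the output.
out=$(printf '%s my-label\n' "$blob" |
      git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)')
echo "$out"
```

This is handy for carrying extra information, such as a path, alongside each object name through the batch pipeline.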

As a result, both of the following commands should return the same result:

git cat-file commit refs/tags/testtag^{} >expected &&
git for-each-ref --format="%(*raw)" refs/tags/testtag 

basic atom: refs/tags/testtag *raw

Same for:

git rev-parse refs/mytrees/first | git cat-file --batch >expected &&
git for-each-ref --format="%(objectname) %(objecttype) %(objectsize)%(raw)" refs/mytrees/first

Note that with Git 2.36 (Q2 2022), "git cat-file --help" is clearer.

See commit 5fb2490, commit 83dc443 (10 Jan 2022), and commit 245b948, commit 9ce6000, commit 57d6a1c, commit b3fe468, commit 485fd2c, commit 5a40417, commit 97fe725, commit fa476be, commit 68c69f9, commit ddf8420 (28 Dec 2021) by Ævar Arnfjörð Bjarmason (avar).
(Merged by Junio C Hamano -- gitster -- in commit 008028a, 05 Feb 2022)

cat-file: correct and improve usage information

Signed-off-by: Ævar Arnfjörð Bjarmason

Change the usage output emitted on "git cat-file -h" to group related options, making it clear to users which options go with which other ones.

The new output is:

    Check object existence or emit object contents
        -e                    check if <object> exists
        -p                    pretty-print <object> content

    Emit [broken] object attributes
        -t                    show object type (one of 'blob', 'tree', 'commit', 'tag', ...)
        -s                    show object size
        --allow-unknown-type  allow -s and -t to work with broken/corrupt objects

    Batch objects requested on stdin (or --batch-all-objects)
        --batch[=<format>]    show full <object> or <rev> contents
        --batch-check[=<format>]
                              like --batch, but don't emit <contents>
        --batch-all-objects   with --batch[-check]: ignores stdin, batches all known objects

    Change or optimize batch output
        --buffer              buffer --batch output
        --follow-symlinks     follow in-tree symlinks
        --unordered           do not order objects before emitting them

    Emit object (blob or tree) with conversion or filter (stand-alone, or with batch)
        --textconv            run textconv on object's content
        --filters             run filters on object's content
        --path blob|tree      use a <path> for (--textconv | --filters ); Not with 'batch'

The old usage was:

    <type> can be one of: blob, tree, commit, tag
        -t                    show object type
        -s                    show object size
        -e                    exit with zero when there's no error
        -p                    pretty-print object's content
        --textconv            for blob objects, run textconv on object's content
        --filters             for blob objects, run filters on object's content
        --batch-all-objects   show all objects with --batch or --batch-check
        --path <blob>         use a specific path for --textconv/--filters
        --allow-unknown-type  allow -s and -t to work with broken/corrupt objects
        --buffer              buffer --batch output
        --batch[=<format>]    show info and content of objects fed from the standard input
        --batch-check[=<format>]
                              show info about objects fed from the standard input
        --follow-symlinks     follow in-tree symlinks (used with --batch or --batch-check)
        --unordered           do not order --batch-all-objects output

While the old usage is shorter, I think the new one is easier to understand, as e.g. "--allow-unknown-type" is grouped with "-t" and "-s", as it can only be combined with those options.
The same goes for "--buffer", "--unordered" etc.


Still with Git 2.36 (Q2 2022), a strbuf_expand() call is optimized away by using hardcoded formatting logic specific to the default format of the --batch and --batch-check options of "git cat-file".

See commit eb54a33 (15 Mar 2022) by John Cai (john-cai).
(Merged by Junio C Hamano -- gitster -- in commit 889860e, 23 Mar 2022)

cat-file: skip expanding default format

Signed-off-by: Ævar Arnfjörð Bjarmason
Signed-off-by: John Cai

When a format is passed to --batch, --batch-check or --batch-command, the format gets expanded.
When nothing is passed in, the default format is set and expand_format() gets called.

We can save on these cycles by hardcoding how to print the information when nothing is passed as the format, or when the default format is passed.
There is no need for the fully expanded format with the default.
Since batch_object_write() happens on every object provided in batch mode, we get a nice performance improvement.

    git rev-list --all > /tmp/all-obj.txt
    git cat-file --batch-check </tmp/all-obj.txt

with HEAD^:

    Time (mean ± σ):      57.6 ms ± 1.7 ms    [User: 51.5 ms, System: 6.2 ms]
    Range (min … max):    54.6 ms … 64.7 ms    50 runs

with HEAD:

    Time (mean ± σ):      49.8 ms ± 1.7 ms    [User: 42.6 ms, System: 7.3 ms]
    Range (min … max):    46.9 ms … 55.9 ms    56 runs

If nothing is provided as a format argument, or if the default format is passed, skip expanding of the format and print the object info with a default format.

See this discussion.

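To add to @Matoeil's answer: you do not need to type the full <SHA1>. A prefix that identifies the object unambiguously is enough (Git requires at least four characters); the examples below use five.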

$ tree .git/

.git/
├── COMMIT_EDITMSG
├── HEAD
├── config
├── description
├── hooks
│   ├── applypatch-msg.sample
│   ├── commit-msg.sample
│   ├── fsmonitor-watchman.sample
│   ├── post-update.sample
│   ├── pre-applypatch.sample
│   ├── pre-commit.sample
│   ├── pre-push.sample
│   ├── pre-rebase.sample
│   ├── pre-receive.sample
│   ├── prepare-commit-msg.sample
│   └── update.sample
├── index
├── info
│   └── exclude
├── logs
│   ├── HEAD
│   └── refs
│       └── heads
│           ├── master
│           └── testBranch
├── objects
│   ├── 1e
│   │   └── e2a78c0b40dd8e5c6b08e31171a3ce1e8d931b
│   ├── 29
│   │   └── 33b9017f79a27ff5ad3c4e154f67b44ae8482c
│   ├── 4a
│   │   └── 6a376085b9b3b8e6e73d2cdcc5281cf6915c58
│   ├── 4b
│   │   └── 825dc642cb6eb9a060e54bf8d69288fbee4904
│   ├── 7e
│   │   └── 6965a8b2ff07da3e632a24ee024b9d2ec5245d
│   ├── ae
│   │   └── 853f7ece778281c463c1f0c603ef9d47a425b7
│   ├── info
│   └── pack
└── refs
    ├── heads
    │   ├── master
    │   └── testBranch
    └── tags

17 directories, 28 files

$ git cat-file -t ae853
tree 

$ git cat-file -p ae853
100644 blob 7e6965a8b2ff07da3e632a24ee024b9d2ec5245d    fil1.txt 
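The same idea can be reproduced in a scratch repository, assuming git is installed. Rather than hardcoding a prefix length, this sketch asks git rev-parse --short for an abbreviation that is unambiguous in that repository. The repository and blob content are made up for the example.

```shell
# Scratch repository with one blob.
repo=$(mktemp -d)
cd "$repo"
git init -q .
blob=$(printf 'abbrev demo\n' | git hash-object -w --stdin)

# Ask Git for a short, unambiguous abbreviation of the object name.
short=$(git rev-parse --short "$blob")

# Both the full name and the abbreviation resolve to the same object.
full_type=$(git cat-file -t "$blob")
short_type=$(git cat-file -t "$short")
echo "$short -> $short_type"
```

In a large repository a five-character prefix may match several objects, in which case Git reports the name as ambiguous and a longer prefix is needed.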

This is explained very well here.

git cat-file: named after the cat (concatenate) command. It reads data from the file and outputs the contents.
