Mirror of nyancrimew/goop@github.com
Tillie Kottmann 0b6ec74296 Increase queue size for recursive download worker 2020-11-01 18:40:19 +01:00
cmd Add the ability to pass in an entire list of domains 2020-11-01 17:55:19 +01:00
internal Fix html detection 2020-10-31 13:46:49 +01:00
pkg/goop Increase queue size for recursive download worker 2020-11-01 18:40:19 +01:00
.gitignore Basic implementation to support sites with AutoIndex on 2020-10-30 11:47:23 +01:00
LICENSE Add readme and license 2020-10-30 18:48:44 +01:00
README.md Add readme and license 2020-10-30 18:48:44 +01:00
go.mod Limit concurrency 2020-10-30 18:42:48 +01:00
go.sum Limit concurrency 2020-10-30 18:42:48 +01:00
main.go Basic implementation to support sites with AutoIndex on 2020-10-30 11:47:23 +01:00

goop

Yet another tool to dump a git repository from a website. The code structure and console output are heavily inspired by arthaud/git-dumper.

Usage

usage: goop [flags] url [dir]

Flags:
   -f, --force   overrides DIR if it already exists
   -h, --help    help for goop

Example

$ goop example.com

Installation

go get -u github.com/deletescape/goop

How does it work?

The tool first checks whether directory listing is available. If it is, it simply downloads the .git directory recursively (as you would with wget).
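As a rough illustration of the directory listing check, the sketch below scans the HTML returned for /.git/ and looks for a link to HEAD, which an auto-generated index of a git directory would normally contain. This is not goop's actual code: the names hrefPattern, listingLinks and looksLikeGitListing are hypothetical, and the regular expression is a simplification rather than a full HTML parser.

```go
package main

import (
	"regexp"
	"strings"
)

// hrefPattern captures the target of each <a href=...> attribute.
// Hypothetical helper: a simplification, not a full HTML parser.
var hrefPattern = regexp.MustCompile(`(?i)<a\s[^>]*?href\s*=\s*["']?([^"'\s>]+)`)

// listingLinks returns every link target found in an HTML page.
func listingLinks(body string) []string {
	var links []string
	for _, m := range hrefPattern.FindAllStringSubmatch(body, -1) {
		links = append(links, m[1])
	}
	return links
}

// looksLikeGitListing reports whether the page served at /.git/
// appears to be a directory index that links to HEAD.
func looksLikeGitListing(body string) bool {
	for _, link := range listingLinks(body) {
		if strings.TrimPrefix(link, "./") == "HEAD" {
			return true
		}
	}
	return false
}
```

When looksLikeGitListing returns true, the remaining entries from listingLinks (skipping parent links such as ../) could be queued for recursive download.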

If directory listing is not available, it will use several methods to find as many files as possible. Step by step, goop will:

  • Fetch all common files (.gitignore, .git/HEAD, .git/index, etc.);
  • Find as many refs as possible (such as refs/heads/master, refs/remotes/origin/HEAD, etc.) by analyzing .git/HEAD, .git/logs/HEAD, .git/config, .git/packed-refs and so on;
  • Find as many objects (sha1) as possible by analyzing .git/packed-refs, .git/index, .git/refs/* and .git/logs/*;
  • Fetch all objects recursively, analyzing each commit to find its parents;
  • Run git checkout . to recover the current working tree.
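The object discovery steps above can be sketched roughly as follows. This is a minimal illustration rather than goop's implementation: findHashes, objectPath and commitParents are hypothetical names, and commitParents assumes the commit object has already been downloaded and zlib-decompressed, with its "commit <size>" header stripped.

```go
package main

import (
	"regexp"
	"strings"
)

// sha1Pattern matches a standalone 40-character lowercase hex string.
var sha1Pattern = regexp.MustCompile(`\b[0-9a-f]{40}\b`)

// findHashes returns the distinct object hashes mentioned in text
// (for example the contents of .git/packed-refs or .git/logs/HEAD),
// in order of first appearance.
func findHashes(text string) []string {
	seen := map[string]bool{}
	var hashes []string
	for _, h := range sha1Pattern.FindAllString(text, -1) {
		if !seen[h] {
			seen[h] = true
			hashes = append(hashes, h)
		}
	}
	return hashes
}

// objectPath maps a 40-character hash to its loose-object path:
// the first two hex digits name the directory, the rest the file.
func objectPath(hash string) string {
	return ".git/objects/" + hash[:2] + "/" + hash[2:]
}

// commitParents returns the parent hashes listed in the header of a
// decompressed commit body. The header ends at the first blank line,
// so "parent " text inside the commit message is ignored.
func commitParents(commitBody string) []string {
	var parents []string
	for _, line := range strings.Split(commitBody, "\n") {
		if line == "" {
			break
		}
		if strings.HasPrefix(line, "parent ") {
			parents = append(parents, strings.TrimPrefix(line, "parent "))
		}
	}
	return parents
}
```

A crawler built on these helpers would fetch objectPath(h) for each hash from findHashes, then feed the parents of every commit back into the queue until no new hashes appear.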