Fossil: Artifact [bfb379a4b0]

Artifact bfb379a4b09944b89c83dfb23dbf7eb5305f74fca22d9ff6374e5d788812db61:

File www/fossil-v-git.wiki — part of check-in [bd7c47c3f8] at 2019-08-06 23:08:11 on branch trunk — Added RBAC point to the fossil-v-git article, and tweaked the surrounding text. (user: wyoung size: 26387)
<title>Fossil Versus Git</title>

<h2>1.0 Don't Stress!</h2>

If you start out using one DVCS and later decide you like the other better,
you can easily [./inout.wiki | move your content].¹

Fossil and [http://git-scm.com | Git] are very similar in many respects,
but they also have important differences.
See the table below for
a high-level summary and the text that follows for more details.

Keep in mind that you are reading this on a Fossil website, and though
we try to be fair, the information here
might be biased in favor of Fossil.  Ask around for second opinions from
people who have used <em>both</em> Fossil and Git.


<h2>2.0 Differences Between Fossil And Git</h2>

Differences between Fossil and Git are summarized by the following table,
with further description in the text that follows.

<blockquote><table border=1 cellpadding=5 align=center>
<tr><th width="50%">GIT</th><th width="50%">FOSSIL</th></tr>
<tr><td>File versioning only</td>
    <td>VCS, tickets, wiki, docs, notes, forum,
    [https://en.wikipedia.org/wiki/Role-based_access_control|RBAC]</td></tr>
<tr><td>Ad-hoc pile-of-files key/value database</td>
    <td>Relational SQL database</td></tr>
<tr><td>Bazaar-style development</td><td>Cathedral-style development</td></tr>
<tr><td>Designed for Linux kernel development</td>
    <td>Designed for SQLite development</td></tr>
<tr><td>Many contributors</td>
    <td>Select contributors</td></tr>
<tr><td>Focus on individual branches</td>
    <td>Focus on the entire tree of changes</td></tr>
<tr><td>Lots of little tools</td><td>Stand-alone executable</td></tr>
<tr><td>One check-out per repository</td>
    <td>Many check-outs per repository</td></tr>
<tr><td>Remembers what you should have done</td>
    <td>Remembers what you actually did</td></tr>
</table></blockquote>

<h3 id="features">2.1 Feature Set</h3>

Git provides file versioning services only, whereas Fossil adds
an integrated [./wikitheory.wiki | wiki],
[./bugtheory.wiki | ticketing &amp; bug tracking],
[./embeddeddoc.wiki | embedded documentation], 
[./event.wiki | technical notes], and a [./forum.wiki | web forum],
all protected by a fine-grained
[https://en.wikipedia.org/wiki/Role-based_access_control|role-based
access control system].
These additional capabilities are available for Git as 3rd-party
add-ons, but with Fossil they are integrated into
the design.  One way to describe Fossil is that it is
"[https://github.com/ | GitHub]-in-a-box."

If you clone [https://github.com/git/git|Git's self-hosting repository],
you get just Git's source code.
If you clone Fossil's self-hosting repository, you get the entire
Fossil website — source code, documentation, ticket history, and so forth.
That means you get a copy of this very article and all of its historical
versions, plus the same for all of the other public content on this site.

For developers who choose to self-host projects (rather than using a
3rd-party service such as GitHub) Fossil is much easier to set up, since
the stand-alone Fossil executable together with a [./server.wiki#cgi|2-line CGI script]
suffice to instantiate a full-featured developer website.  To accomplish
the same using Git requires locating, installing, configuring, integrating,
and managing a wide assortment of separate tools.  Standing up a developer
website using Fossil can be done in minutes, whereas doing the same using
Git requires hours or days.

<h3 id="database">2.2 Database</h3>

The baseline data structures for Fossil and Git are the same, modulo
formatting details.  Both systems store check-ins as immutable
objects referencing their immediate ancestors and named by a
cryptographic hash of the check-in content.

The difference is that Git stores its objects as individual files
in the ".git" folder or compressed into
bespoke "pack-files," whereas Fossil stores its objects in a
relational ([https://www.sqlite.org/|SQLite]) database file.  To put it
another way, Git uses an ad-hoc pile-of-files key/value database whereas
Fossil uses a proven, general-purpose SQL database.  This
difference is more than an implementation detail.  It
has important consequences.

With Git, one can easily locate the ancestors of a particular check-in
by following the pointers embedded in the check-in object, but it is
difficult to go the other direction and locate the descendants of a
check-in.  It is so difficult, in fact, that neither native Git nor
GitHub provide this capability.  With Git, if you are looking at some
historical check-in then you cannot ask
"What came next?" or "What are the children of this check-in?"

Fossil, on the other hand, parses essential information about check-ins
(parents, children, committers, comments, files changed, etc.)
into a relational database that can be easily
queried using concise SQL statements to find both ancestors and
descendents of a check-in.

Leaf check-ins in Git that lack a "ref" become "detached," making them
difficult to locate and subject to garbage collection.  This
"detached head" problem has caused untold grief for countless
Git users.  With Fossil, all check-ins are easily located using
a variety of attributes (parents, children, committer, date, full-text
search of the check-in comment) and so detached heads are simply not possible.

The ease with which check-ins can be located and queried in Fossil
has resulted in a huge variety of reports and status screens
([./webpage-ex.md|examples]) that show project state
in ways that help developers
maintain enhanced awareness and comprehension
and avoid errors.


<h3 id="vs-linux">2.3 Linux vs. SQLite</h3>

Fossil and Git promote different development styles because each one was
specifically designed to support the creator's main software
development project: [https://en.wikipedia.org/wiki/Linus_Torvalds|Linus
Torvalds] designed Git to support development of
[https://www.kernel.org/|the Linux kernel], and
[https://en.wikipedia.org/wiki/D._Richard_Hipp|D. Richard Hipp] designed
Fossil to support the development of [https://sqlite.org/|SQLite].
Both projects must rank high on any objective list of "most
important FOSS projects," yet these two projects are almost entirely unlike
one another. So, too, are these two DVCSes.

In the following sections, we will explain how three key differences
between Linux and SQLite dictated the design of each DVCS's low-friction
usage path.

When deciding between these two DVCSes, you should ask yourself, "Is my
project more like Linux or more like SQLite?"


<h4 id="devorg">2.3.1 Development Organization</h4>

Eric S. Raymond's seminal essay-turned-book
"[https://en.wikipedia.org/wiki/The_Cathedral_and_the_Bazaar|The
Cathedral and the Bazaar]" details the two major development
organization styles found in
[https://en.wikipedia.org/wiki/Free_and_open-source_software|FOSS]
projects. As it happens, Linux and SQLite fall on opposite sides of this
dichotomy. Differing development organization styles dictate a different
design and low-friction usage path in the tools created to support each
project.

Git promotes the Linux kernel's bazaar development style, in which a
loosely-associated mass of developers contribute their work through
[https://git-scm.com/book/en/v2/Distributed-Git-Distributed-Workflows#_dictator_and_lieutenants_workflow|a
hierarchy of lieutenants] who manage and clean up these contributions
for consideration by Linus Torvalds, who has the power to cherrypick
individual contributions into his version of the Linux kernel. Git
allows an anonymous developer to rebase and push specific locally-named
private branches, so that a Git repo clone often isn't really a clone at
all: it may have an arbitrary number of differences relative to the
repository it originally cloned from. Git encourages siloed development.
Select work in a developer's local repository may remain private
indefinitely.

All of this is exactly what one wants when doing bazaar-style
development.

Fossil's normal mode of operation differs on every one of these points,
with the specific designed-in goal of promoting SQLite's cathedral
development model:

<ul>
    <li><p><b>Personal engagement:</b> SQLite's developers know each
    other by name and work together daily on the project.</p></li>

    <li><p><b>Trust over hierarchy:</b> SQLite's developers check
    changes into their local repository, and these are immediately and
    automatically sync'd up to the central repository; there is no
    "[https://git-scm.com/book/en/v2/Distributed-Git-Distributed-Workflows#_dictator_and_lieutenants_workflow|dictator
    and lieutenants]" hierarchy as with Linux kernel contributions.  D.
    Richard Hipp rarely overrides decisions made by those he has trusted
    with commit access on his repositories. Fossil allows you to give
    [/doc/trunk/www/admin-v-setup.md|some users] more power over what
    they can do with the repository, but Fossil does not otherwise
    directly support the enforcement of a development organization's
    social and power hierarchies. Fossil is a great fit for
    [https://en.wikipedia.org/wiki/Flat_organization|flat
    organizations].</p></li>

    <li><p><b>No easy drive-by contributions:</b> Git
    [https://www.git-scm.com/docs/git-request-pull|pull requests] offer
    a low-friction path to accepting
    [https://www.jonobacon.com/2012/07/25/building-strong-community-structural-integrity/|drive-by
    contributions]. Fossil's closest equivalent is its unique
    [/help?cmd=bundle|bundle] feature, which requires higher engagement
    than firing off a PR.² This difference comes directly from the
    initial designed purpose for each tool: the SQLite project doesn't
    accept outside contributions from previously-unknown developers, but
    the Linux kernel does.</p></li>

    <li><p><b>No rebasing:</b> When your local repo clone syncs changes
    up to its parent, those changes are sent exactly as they were
    committed locally. [#history|There is no rebasing mechanism in
    Fossil, on purpose.]</p></li>

    <li><p><b>Sync over push:</b> Explicit pushes are uncommon in
    Fossil-based projects: the default is to rely on
    [/help?cmd=autosync|autosync mode] instead, in which each commit
    syncs immediately to its parent repository. This is a mode so you
    can turn it off temporarily when needed, such as when working
    offline. Fossil is still a truly distributed version control system;
    it's just that its starting default is to assume you're rarely out
    of communication with the parent repo.
    <br><br>
    This is not merely a reflection of modern always-connected computing
    environments. It is a conscious decision in direct support of
    SQLite's cathedral development model: we don't want developers going
    dark, then showing up weeks later with a massive bolus of changes
    for us to integrate all at once.
    [https://en.wikipedia.org/wiki/Jim_McCarthy_(author)|Jim McCarthy]
    put it well in his book on software project management,
    <i>[https://www.amazon.com/dp/0735623198/|Dynamics of Software
    Development]</i>: "[https://www.youtube.com/watch?v=oY6BCHqEbyc|Beware
    of a guy in a room]."</p></li>

    <li><p><b>Branch names sync:</b> Unlike in Git, branch names are not
    purely local labels. They sync along with everything else, so
    everyone sees the same set of branch names.</p></li>

    <li><p><b>Private branches are rare:</b>
    [/doc/trunk/www/private.wiki|Private branches exist in Fossil], but
    they're normally used to handle rare exception cases, whereas in
    many Git projects, they're part of the straight-line development
    process.</p></li>

    <li><p><b>Identical clones:</b> Fossil's autosync system tries to
    keep local clones identical to the repository it cloned
    from.</p></li>
</ul>

Where Git encourages siloed development, Fossil fights against it.
Fossil places a lot of emphasis on synchronizing everyone's work and on
reporting on the state of the project and the work of its developers, so
that everyone — especially the project leader — can maintain a better
mental picture of what is happening, leading to better situational
awareness.

Each DVCS can be used in the opposite style, but doing so works against
their low-friction paths.


<h4 id="scale">2.3.2 Scale</h4>

The Linux kernel has a far bigger developer community than that of
SQLite: there are thousands and thousands of contributors to Linux, most
of whom do not know each others names. These thousands are responsible
for producing roughly 89⨉ more code than is in SQLite. (10.7
[https://en.wikipedia.org/wiki/Source_lines_of_code|MLOC] vs. 0.12 MLOC
according to [https://dwheeler.com/sloccount/|SLOCCount].) The Linux
kernel and its development process were already uncommonly large back in
2005 when Git was designed, specifically to support the consequences of
having such a large set of developers working on such a large code base.

95% of the code in SQLite comes from just four programmers, and 64% of
it is from the lead developer alone. The SQLite developers know each
other well and interact daily. Fossil was designed for this development
model. As well, we think the fact of Fossil's birth a year later
than Git allowed it to learn from some of the key design mistakes in
Git.

We think you should ask yourself whether you have Linus Torvalds scale
software configuration management problems or D. Richard Hipp scale
problems when choosing your DVCS. An
[https://en.wikipedia.org/wiki/Impact_wrench|automotive air impact
wrench] running at 8000 RPM driving an M8 socket-cap bolt at 16 cm/s is
not the best way to hang a picture on the living room wall.


<h4 id="contrib">2.3.3 Accepting Contributions</h4>

As of this writing, Git has received about 4.5⨉ as many commits as
Fossil resulting in about 2.5⨉ as many lines of source code. The line
count excludes tests and in-tree third-party dependencies. It does not
exclude the default GUI for each, since it's integral for Fossil, so we
count the size of <tt>gitk</tt> in this.

It is obvious that Git is bigger in part because of its first-mover
advantage, which resulted in a larger user community, which results in
more contributions. But is that the <i>only</i> reason? We believe there
are other relevant differences that also play into this which fall out
of the "Linux vs. SQLite" framing: licensing, community structure, and
how we react to
[https://www.jonobacon.com/2012/07/25/building-strong-community-structural-integrity/|drive-by
contributions]. In brief, it's harder to get a new feature into Fossil
than into Git.

A larger feature set size is not necessarily a good thing. Git's command line
interface is famously arcane. Masters of the arcane are able to do
wizardly things, but only by studying their art deeply for years. This
strikes us as a good thing only in cases where use of the tool itself is
the primary point of that user's work.

Most DVCS users are not using a DVCS for its own sake, so we do not want
the DVCS with the most features, we want the one with a more easily
internalized behavior set, which we can pick up, use quickly, and then
set aside in order to get back to our
actual job as quickly as possible. There is some minimal set of features
required to achieve that, but there is a level beyond which more
features only slow us down while we're learning about the DVCS, as we
must plow through documentation on features we're not likely to ever
use. When the number of features grows
to the point where people of normal motivation cannot spend the time to
master them all, you make the tool <i>less</i> productive to use.

We achieve this balance between feature set size and ease of use by
carefully choosing which users to give commit bits to, then in being
choosy about which of the contributed feature branches to merge down to
trunk.

The end result is that Fossil more closely adheres to
[https://en.wikipedia.org/wiki/Principle_of_least_astonishment|the
principle of least astonishment] than Git does.


<h3 id="branches">2.4 Individual Branches vs. The Entire Change History</h3>

Both Fossil and Git store history as a directed acyclic graph (DAG)
of changes, but Git tends to focus more on individual branches of
the DAG, whereas Fossil puts more emphasis on the entire DAG.

For example, the default "sync" behavior in Git is to only sync
a single branch, whereas with Fossil the only sync option it to
sync the entire DAG.  Git commands,
GitHub, and GitLab tend to show only a single branch at
a time, whereas Fossil usually shows all parallel branches at
once.  Git has commands like "rebase" that help keep all relevant
changes on a single branch, whereas Fossil encourages a style of
many concurrent branches constantly springing into existance,
undergoing active development in parallel for a few days or weeks, then
merging back into the main line and disappearing.

This difference in emphasis arises from the different purposes of
the two systems.  Git focuses on individual branches, because that
is exactly what you want for a highly-distributed bazaar-style project
such as Linux.  Linus Torvalds does not want to see every check-in
by every contributor to Linux, as such extreme visibility does not scale
well.  But Fossil was written for the cathedral-style SQLite project
with just a handful of active committers.  Seeing all
changes on all branches all at once helps keep the whole team
up-to-date with what everybody else is doing, resulting in a more 
tightly focused and cohesive implementation.


<h3 id="executables">2.5 Lots of little tools vs. Self-contained system</h3>

Git consists of many small tools, each doing one small part of the job,
which can be recombined (by experts) to perform powerful operations.
Git has a lot of complexity and many dependencies and requires an "installer"
script or program to get it running.

Fossil is a single self-contained stand-alone executable with hardly
any dependencies.  Fossil can be (and often is) run inside a
minimally configured chroot jail.  To install Fossil,
one merely puts the executable somewhere in the $PATH.

The designer of Git says that the Unix philosophy is to have lots of
small tools that collaborate to get the job done.  The designer of
Fossil says that the Unix philosophy is "It just works."  Both
individuals have written their DVCSes to reflect their own view
of the "Unix philosophy."


<h3 id="checkouts">2.6 One vs. Many Check-outs per Repository</h3>

A "repository" in Git is a pile-of-files in the ".git" subdirectory
of a single check-out.  The check-out and the repository are located
together in the filesystem.

With Fossil, a "repository" is a single SQLite database file
that can be stored anywhere.  There
can be multiple active check-outs from the same repository, perhaps
open on different branches or on different snapshots of the same branch.
Long-running tests or builds can be running in one check-out while
changes are being committed in another.

Git version 2.5 adds a feature to emulate Fossil's decoupling of the
repository from the check-out tree, which it calls
"[https://git-scm.com/docs/git-worktree|git-worktree]." This command
sets up a series of links in the filesystem to
allow a single repository to host multiple check-outs.  However,
the interface is sufficiently difficult to use that most people
find it easier to create a separate clone for each check-out.
There are also practical consequences of the way it's implemented
that make worktrees not quite equivalent to the main Git repo + checkout
tree.

With Fossil, the complete decoupling of repository and check-out tree
means every working check-out tree is treated equally. It's common in
Fossil to have a check-out tree for each major working branch so that
you can switch branches with a "cd" command rather than replace the
current working file set with a different file set by updating in place,
as Git prefers.


<h3 id="history">2.7 What you should have done vs. What you actually did</h3>

Git puts a lot of emphasis on maintaining
a "clean" check-in history.  Extraneous and experimental branches by
individual developers often never make it into the main repository.  And
branches are often rebased before being pushed, to make
it appear as if development had been linear.  Git strives to record what
the development of a project should have looked like had there been no
mistakes.

Fossil, in contrast, puts more emphasis on recording exactly what happened,
including all of the messy errors, dead-ends, experimental branches, and
so forth.  One might argue that this
makes the history of a Fossil project "messy."  But another point of view
is that this makes the history "accurate."  In actual practice, the
superior reporting tools available in Fossil mean that the added "mess"
is not a factor.

Like Git, Fossil has an [/help?cmd=amend|amend command] for modifying
prior commits, but unlike in Git, this works not by replacing data in
the repository, but by adding a correction record to the repository that
affects how later Fossil operations present the corrected data. The old
information is still there in the repository, it is just overridden from
the amendment point forward. For extreme situations, Fossil adds the
[/doc/trunk/www/shunning.wiki|shunning mechanism], but it has strict
limitations that prevent global history rewrites.

One commentator characterized Git as recording history according to
the victors, whereas Fossil records history as it actually happened.


<h2 id="missing">3.0 Missing Features</h2>

Most of the capabilities found in Git are also available in Fossil and
the other way around. For example, both systems have local check-outs,
remote repositories, push/pull/sync, bisect capabilities, and a "stash."
Both systems store project history as a directed acyclic graph (DAG)
of immutable check-in objects.

But there are a few capabilities in one system that are missing from the
other.


<h3 id="missing-in-git">3.1 Features found in Fossil but missing from Git</h3>

  *  <b>The ability to show descendents of a check-in.</b>

   Both Git and Fossil can easily find the ancestors of a check-in.  But
   only Fossil shows the descendents.  (It is possible to find the
   descendents of a check-in in Git using the log, but that is sufficiently
   difficult that nobody ever actually does it.)

  *  <b>Wiki, Embedded documentation, Trouble-tickets, Tech-Notes, and Forum</b>

   Git only provides versioning of source code.  Fossil strives to provide
   other related project management services as well.

  *  <b>Named branches</b>

   Branches in Fossil have persistent names that are propagated
   to collaborators via [/help?cmd=push|push] and [/help?cmd=pull|pull].
   All developers see the same name on the same branch.  Git, in contrast,
   uses only local branch names, so developers working on the
   same project can (and frequently do) use a different name for the
   same branch.

  *  <b>The [/help?cmd=all|fossil all] command</b>

   Fossil keeps track of all repositories and check-outs and allows
   operations over all of them with a single command.  For example, in
   Fossil is possible to request a pull of all repositories on a laptop
   from their respective servers, prior to taking the laptop off network.
   Or it is possible to do "fossil all changes" to see if there are any
   uncommitted changes that were overlooked prior to the end of the workday.

  *  <b>The [/help?cmd=ui|fossil ui] command</b>

   Fossil supports an integrated web interface.  Some of the same features
   are available using third-party add-ons for Git, but they do not provide
   nearly as many features and they are not nearly as convenient to use.

  *  <b>The [/help?cmd=undo|fossil undo] command</b>

   Whenever Fossil is told to modify the local checkout in some
   destructive way ([/help?cmd=rm|fossil rm], [/help?cmd=update|fossil
   update], [/help?cmd=revert|fossil revert], etc.) Fossil remembers the
   prior state and is able to return the local check-out directory to
   its prior state with a simple "fossil undo" command. You
   [#history|cannot undo a commit], since writes to the actual
   repository — as opposed to the local check-out directory — are more
   or less permanent, on purpose, but as long as the change is simply
   staged locally, Fossil makes undo
   [https://git-scm.com/book/en/v2/Git-Basics-Undoing-Things|easier than
   in Git].


<h3 id="missing-in-fossil">3.2 Features found in Git but missing from Fossil</h3>

  *  <b>Rebase</b>

   Because of its emphasis on recording history exactly as it happened,
   rather than as we would have liked it to happen, Fossil deliberately
   does not provide a "rebase" command.  One can rebase manually in Fossil,
   with sufficient perserverence, but it is not something that can be done with
   a single command.

  *  <b>Push or pull a single branch</b>

   The [/help?cmd=push|fossil push], [/help?cmd=pull|fossil pull], and
   [/help?cmd=sync|fossil sync] commands do not provide the capability to
   push or pull individual branches.  Pushing and pulling in Fossil is
   all or nothing.  This is in keeping with Fossil's emphasis on maintaining
   a complete record and on sharing everything between all developers.

<hr/>

<h3>Asides and Digressions</h3>

<i><small><ol>
    <li><p>Git does not include a wiki, a ticket tracker, a forum, or a
    tech-note feature, so those elements will not transfer when
    exporting from Fossil to Git. GitHub adds some of these to stock
    Git, but because they're not part of Git proper,
    [./mirrortogithub.md|exporting a Fossil repository to GitHub] will
    still not include them; Fossil tickets do not become GitHub issues,
    for example.  See further discussion in the
    [./mirrorlimitations.md|Limitations on Git Mirrors] document</p></li>

    <li><p>Both Fossil and Git support
    [https://en.wikipedia.org/wiki/Patch_(Unix)|<tt>patch(1)</tt>
    files], a common way to allow drive-by contributions, but it's a
    lossy contribution path for both systems. Unlike Git PRs and Fossil
    bundles, patch files collapse mulitple checkins together, they don't
    include check-in comments, and they cannot encode changes made above
    the individual file content layer: you lose branching decisisions,
    tag changes, file renames, and more when using patch files.</p></li>
</ol></i></small>