Check-in 3 is derived from check-in 2, making
3 a child of 2. We say that 3 is a <i>descendant</i> of both 1 and 2 and that 1
and 2 are both <i>ancestors</i> of 3.
<h2 id="dag">DAGs</h2>
The graph of check-ins is a
[http://en.wikipedia.org/wiki/Directed_acyclic_graph | directed acyclic graph],
commonly shortened to <i>DAG</i>. Check-in 1 is the <i>root</i> of the DAG
since it has no ancestors. Check-in 4 is a <i>leaf</i> of the DAG since
it has no descendants. (We will give a more precise definition later of
"leaf.")
Alas, reality often interferes with the simple linear development of a
project. Suppose two programmers make independent modifications to check-in 2.
After both changes are committed, the check-in graph looks like Figure 2:
<table border=1 cellpadding=10 hspace=10 vspace=10 align="center">
<tr><td align="center">
<img src="branch02.svg"><br>
Figure 2
</td></tr></table>
The graph in Figure 2 has two leaves: check-ins 3 and 4. Check-in 2 has
two children, check-ins 3 and 4. We call this state a <i>fork</i>.
Fossil tries to prevent forks. Suppose two programmers named Alice and
Bob are each editing check-in 2 separately. Alice finishes her edits
first and commits her changes, resulting in check-in 3. Later, when Bob
attempts to commit his changes, Fossil verifies that check-in 2 is still
a leaf. Fossil sees that check-in 3 has occurred and aborts Bob's commit
attempt with a message "would fork." This allows Bob to do a "fossil
update" which pulls in Alice's changes, merging them into his own
changes. After merging, Bob commits check-in 4 as a child of check-in 3.
The result is a linear graph as shown in Figure 1. This is how CVS
works. This is also how Fossil works in [./concepts.wiki#workflow |
"autosync"] mode.
But perhaps Bob is off-network when he does his commit, so he
has no way of knowing that Alice has already committed her changes.
Or, it could be that Bob has turned off "autosync" mode in Fossil. Or,
maybe Bob just doesn't want to merge in Alice's changes before he has
saved his own, so he forces the commit to occur using the "--allow-fork"
option to the <b>fossil commit</b> command. For any of these reasons,
two commits against check-in 2 have occurred and now the DAG has two leaves.
So which version of the project is the "latest" in the sense of having
the most features and the most bug fixes? When there is more than
one leaf in the graph, you don't really know, so we like to have
check-in graphs with a single leaf.
Fossil resolves such problems using the check-in time on the leaves to
decide which leaf to use as the parent of new leaves. When a branch is
forked as in Figure 2, Fossil will choose check-in 4 as the parent for a
later check-in 5, but <i>only</i> if it has sync'd that check-in down
into the local repository. If autosync is disabled or the user is
off-network when that fifth check-in occurs, so that check-in 3 is the
latest on that branch at the time within that clone of the repository,
Fossil will make check-in 3 the parent of check-in 5!
Fossil also uses a forked branch's leaf check-in timestamps when
checking out that branch: it gives you the fork with the latest
check-in, which in turn selects which parent your next check-in will be
a child of. This situation means development on that branch can fork
into two independent lines of development, based solely on which branch
tip is newer at the time the next user starts his work on it. Because
of this, we strongly recommend that you do not intentionally create
forks on long-lived shared working branches with "--allow-fork". (Prime
example: trunk.)
Let us return to Figure 2. To resolve such situations before they can
become a real problem, Alice can use the <b>fossil merge</b> command to
merge Bob's changes into her local copy of check-in 3. Then she can
commit the results as check-in 5. This results in a DAG as shown in
Figure 3.
<table border=1 cellpadding=10 hspace=10 vspace=10 align="center">
<tr><td align="center">
<img src="branch03.svg"><br>
Figure 3
</td></tr></table>
Check-in 5 is a child of check-in 3 because it was created by editing
check-in 3. But check-in 5 also inherits the changes from check-in 4 by
virtue of the merge. So we say that check-in 5 is a <i>merge child</i>
of check-in 4 and that it is a <i>direct child</i> of check-in 3.
The graph is now back to a single leaf, check-in 5.
We have already seen that if Fossil is in autosync mode then Bob would
have been warned about the potential fork the first time he tried to
commit check-in 4. If Bob had updated his local check-out to merge in
Alice's check-in 3 changes, then committed, then the fork would have
never occurred. The resulting graph would have been linear, as shown
in Figure 1.
Realize that the graph of Figure 1 is a subset of Figure 3. Hold your
hand over the check-in 4 circle of Figure 3 and then Figure 3 looks
exactly like Figure 1, except that the leaf has a different check-in
number, but that is just a notational difference — the two check-ins
have exactly the same content. In other words, Figure 3 is really a
superset of Figure 1. The check-in 4 of Figure 3 captures additional
state which is omitted from Figure 1. Check-in 4 of Figure 3 holds a
copy of Bob's local checkout before he merged in Alice's changes. That
snapshot of Bob's changes, which is independent of Alice's changes, is
omitted from Figure 1. Some people say that the approach taken in
Figure 3 is better because it preserves this extra intermediate state.
Others say that the approach taken in Figure 1 is better because it is
much easier to visualize a linear line of development and because the
merging happens automatically instead of as a separate manual step. We
will not take sides in that debate. We will simply point out that
Fossil enables you to do it either way.
<h2 id="branching">The Alternative to Forking: Branching</h2>
Having more than one leaf in the check-in DAG is called a "fork." This
is usually undesirable and either avoided entirely,
as in Figure 1, or else quickly resolved as shown in Figure 3.
But sometimes, one does want to have multiple leaves. For example, a project
might have one leaf that is the latest version of the project under
development and another leaf that is the latest version that has been
tested.
When multiple leaves are desirable, we call this <i>branching</i>
instead of <i>forking</i>:
<blockquote>
<b>Key Distinction:</b> A branch is a <i>named, intentional</i> fork.
</blockquote>
Forks <i>may</i> be intentional, but most of the time, they're accidental.
Figure 4 shows an example of a project where there are two branches, one
for development work and another for testing.
<table border=1 cellpadding=10 hspace=10 vspace=10 align="center">
<tr><td align="center">
<img src="branch04.svg"><br>
Figure 4
</td></tr></table>
The hypothetical scenario of Figure 4 is this: The project starts and
progresses to a point where (at check-in 2)
it is ready to enter testing for its first release.
In a real project, of course, there might be hundreds or thousands of
check-ins before a project reaches this point, but for simplicity of
presentation we will say that the project is ready after check-in 2.
The project then splits into two branches that are used by separate
teams. The testing team, using the blue branch, finds and fixes a few
bugs. This is shown by check-ins 6 and 9. Meanwhile the development
team, working on the top uncolored branch,
is busy adding features for the second
release. Of course, the development team would like to take advantage of
the bug fixes implemented by the testing team. So periodically, the
changes in the test branch are merged into the dev branch. This is
shown by the dashed merge arrows between check-ins 6 and 7 and between
check-ins 9 and 10.
In both Figures 2 and 4, check-in 2 has two children. In Figure 2,
we call this a "fork." In Figure 4, we call it a "branch." What is
the difference? As far as the internal Fossil data structures are
concerned, there is no difference. The distinction is in the intent.
In Figure 2, the fact that check-in 2 has multiple children is an
accident that stems from concurrent development. In Figure 4, giving
check-in 2 multiple children is a deliberate act. So, to a good
approximation, we define forking to be by accident and branching to
be by intent. Apart from that, they are the same.
Fossil offers two primary ways to create named, intentional forks,
a.k.a. branches. First:
<pre>
$ fossil commit --branch my-new-branch-name
</pre>
This is the method we recommend for most cases: it creates a branch as
part of a check-in using the version in the current checkout directory
as its basis. (This is normally the tip of the current branch, though
it doesn't have to be. You can create a branch from an ancestor check-in
on a branch as well.) After making this branch-creating
check-in, your local working directory is switched to that branch, so
that further check-ins occur on that branch as well, as children of the
tip check-in on that branch.
The second, more complicated option is:
<pre>
$ fossil branch new my-new-branch-name trunk
$ fossil update my-new-branch-name
$ fossil commit
</pre>
This takes three commands instead of one, and the first is
longer than the entire simpler command above. You must also give the second command
before creating any check-ins: until you do, your local working
directory remains on the branch it was on when you issued
the first command, so the commit would otherwise put the new material on
the original branch instead of the new one.
In addition to those problems, the second method is a violation of the
[https://en.wikipedia.org/wiki/You_aren%27t_gonna_need_it|YAGNI
Principle]. We recommend that you wait until you actually need the
branch and create it using the first command above.
(Keep in mind that trunk is just another branch in Fossil. It is simply
the default branch name for the first check-in and every check-in made as
one of its direct descendants. It is special only in that it is Fossil's
default when it has no better idea of which branch you mean.)
<h2 id="forking">Justifications For Forking</h2>
The primary cases where forking is justified over branching are all when
it is done purely in software in order to avoid losing information:
<ol>
<li><p id="offline">By Fossil itself when two users check in children to the same
leaf of a branch, as in Figure 2. If the fork occurs because
autosync is disabled on one or both of the repositories or because
the user doing the check-in has no network connection at the moment
of the commit, Fossil has no way of knowing that it is creating a
fork until the two repositories are later synchronized.</p></li>
<li><p id="dist-clone">By Fossil when the cloning hierarchy is more
than 2 levels deep.
<br><br>
[./sync.wiki|Fossil's synchronization protocol] is a two-party
negotiation; syncs don't automatically propagate up the clone tree
beyond that. Because of that, if you have a master repository and
Alice clones it, then Bobby clones from Alice's repository, a
check-in by Bobby that autosyncs with Alice's repo will <i>not</i>
also autosync with the master repo. The master doesn't get a copy of
Bobby's checkin until Alice <i>separately</i> syncs with the master.
If Carol cloned from the master repo and checks something in that
creates a fork relative to Bobby's check-in, the master repo won't
know about that fork until Alice syncs her repo with the master.
Even then, realize that Carol still won't know about the fork until
she subsequently syncs with the master repo.
<br><br>
One way to deal with this is to just accept it as a fact of using a
[https://en.wikipedia.org/wiki/Distributed_version_control|Distributed
Version Control System] like Fossil.
<br><br>
Another option, which we recommend you consider carefully, is to
make it a local policy that checkins be made only against the master
repo or one of its immediate child clones so that the autosync
algorithm can do its job most effectively; any clones deeper than
that should be treated as read-only and thus get a copy of the new
state of the world only once these central repos have negotiated
that new state. This policy avoids a class of inadvertent fork you
might not need to tolerate. Since [#bad-fork|forks on long-lived
shared working branches can end up dividing a team's development
effort], a team may easily justify this restriction on distributed
cloning.</p></li>
<li><p id="automation">You've automated Fossil (e.g. with a shell script) and
forking is a possibility, so you write <b>fossil commit
--allow-fork</b> commands to prevent Fossil from refusing the
check-in because it would create a fork. It's better to write such
a script to detect this condition and cope with it (e.g. <b>fossil
update</b>) but if the alternative is losing information, you may
feel justified in creating forks that an interactive user must later
clean up with <b>fossil merge</b> commands.</p></li>
</ol>
That leaves only one case where we can recommend use of "--allow-fork"
by interactive users: when you're working on
a personal branch so that creating a dual-tipped branch isn't going to
cause any other user an inconvenience or risk forking the development.
Only one developer is involved, and the fork may be short-lived, so
there is no risk of [#bad-fork|inadvertently forking the overall development effort].
This is a good alternative to branching when you just need to
temporarily fork the branch's development. It avoids cluttering the
global branch namespace with short-lived temporary named branches.
There's a common generalization of that case: you're a solo developer,
so that the problems with branching vs forking simply don't matter. In
that case, feel free to use "--allow-fork" as much as you like.
<h2 id="fix">Fixing Forks</h2>
Check-in 3 is derived from check-in 2, making
3 a child of 2. We say that 3 is a <i>descendant</i> of both 1 and 2 and that 1
and 2 are both <i>ancestors</i> of 3.
<h2 id="dag">DAGs</h2>
The graph of check-ins is a
[http://en.wikipedia.org/wiki/Directed_acyclic_graph | directed acyclic graph],
commonly shortened to <i>DAG</i>. Check-in 1 is the <i>root</i> of the DAG
since it has no ancestors. Check-in 4 is a <i>leaf</i> of the DAG since
it has no descendants. (We will give a more precise definition later of
"leaf.")
Alas, reality often interferes with the simple linear development of a
project. Suppose two programmers make independent modifications to check-in 2.
After both changes are committed, the check-in graph looks like Figure 2:
<table border=1 cellpadding=10 hspace=10 vspace=10 align="center">
<tr><td align="center">
<img src="branch02.svg"><br>
Figure 2
</td></tr></table>
The graph in Figure 2 has two leaves: check-ins 3 and 4. Check-in 2 has
two children, check-ins 3 and 4. We call this state a <i>fork</i>.
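The terms above can be made concrete with a toy model. The following is a minimal sketch, not Fossil's actual data structures: it assumes a check-in graph stored as a plain dictionary mapping each check-in number to its parents, and it uses the simple "no children" definition of a leaf. The names <code>FIGURE_2</code>, <code>children</code>, <code>leaves</code>, and <code>forked_checkins</code> are hypothetical.

```python
# Toy model of the check-in graph in Figure 2 (an assumption for
# illustration, not Fossil's internal representation).
# Each check-in number maps to the list of its parent check-ins.
FIGURE_2 = {1: [], 2: [1], 3: [2], 4: [2]}


def children(graph, checkin):
    """Return the check-ins that list `checkin` as a parent, sorted."""
    return sorted(c for c, parents in graph.items() if checkin in parents)


def leaves(graph):
    """Return the check-ins that have no children, sorted."""
    return [c for c in sorted(graph) if not children(graph, c)]


def forked_checkins(graph):
    """Return the check-ins that have more than one child, sorted."""
    return [c for c in sorted(graph) if len(children(graph, c)) > 1]


print(leaves(FIGURE_2))           # [3, 4]
print(forked_checkins(FIGURE_2))  # [2]
```

This model only sees the shape of the graph, so it reports any check-in with two children in the same way, whether or not the split was intended.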
Fossil tries to prevent forks, primarily through its
"[./concepts.wiki#workflow | autosync]" mechanism.
Suppose two programmers named Alice and
Bob are each editing check-in 2 separately. Alice finishes her edits
and commits her changes first, resulting in check-in 3. When Bob later
attempts to commit his changes, Fossil verifies that check-in 2 is still
a leaf. Fossil sees that check-in 3 has occurred and aborts Bob's commit
attempt with a message "would fork." This allows Bob to do a "fossil
update" to pull in Alice's changes, merging them into his own
changes. After merging, Bob commits check-in 4 as a child of check-in 3.
The result is a linear graph as shown in Figure 1. This is how CVS
works. This is also how Fossil works in autosync mode.
But perhaps Bob is off-network when he does his commit, so he has no way
of knowing that Alice has already committed her changes. Or, it could
be that Bob has turned off "autosync" mode in Fossil. Or, maybe Bob
just doesn't want to merge in Alice's changes before he has saved his
own, so he forces the commit to occur using the "--allow-fork" option to
the <b>[/help?cmd=commit | fossil commit]</b> command. For any of these
reasons, two commits against check-in 2 have occurred, so the DAG now
has two leaves.
In such a condition, a person working with this repository has a
dilemma: which version of the project is the "latest" in the sense of
having the most features and the most bug fixes? When there is more
than one leaf in the graph, you don't really know, which is why we
would ideally prefer to have linear check-in graphs.
Fossil resolves such problems using the check-in time on the leaves to
decide which leaf to use as the parent of new leaves. When a branch is
forked as in Figure 2, Fossil will choose check-in 4 as the parent for a
later check-in 5, but <i>only</i> if it has sync'd that check-in down
into the local repository. If autosync is disabled or the user is
off-network when that fifth check-in occurs so that check-in 3 is the
latest on that branch at the time within that clone of the repository,
Fossil will make check-in 3 the parent of check-in 5! We show practical
consequences of this [#bad-fork | later in this article].
Fossil also uses a forked branch's leaf check-in timestamps when
checking out that branch: it gives you the fork with the latest
check-in, which in turn selects which parent your next check-in will be
a child of. This situation means development on that branch can fork
into two independent lines of development, based solely on which branch
tip is newer at the time the next user starts his work on it.
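The effect of choosing a tip by timestamp can be sketched with a small model. This is only an illustration under stated assumptions, not Fossil's implementation: the timestamps are invented, and <code>latest_leaf</code> is a hypothetical helper that considers only the leaves a given clone has already received.

```python
# Hypothetical check-in times for the two leaves of Figure 2.
# Check-in 4 is assumed to be newer than check-in 3.
LEAF_TIMES = {3: 100, 4: 200}


def latest_leaf(leaf_times, known):
    """Pick the newest leaf among those present in this clone.

    `known` is the set of check-ins the local repository has synced.
    """
    candidates = {c: t for c, t in leaf_times.items() if c in known}
    return max(candidates, key=candidates.get)


# A clone that has synced both leaves builds on check-in 4.
print(latest_leaf(LEAF_TIMES, {1, 2, 3, 4}))  # 4

# A clone that never received check-in 4 builds on check-in 3 instead.
print(latest_leaf(LEAF_TIMES, {1, 2, 3}))     # 3
```

Two users running the same selection rule against different local states get different parents, which is how a single fork can grow into two lines of development.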
Because of these potential problems, we strongly recommend that you do
not intentionally create forks on long-lived shared working branches
with "--allow-fork". (Prime example: trunk.) The inverse case —
intentional forks on short-lived single-developer branches — is far
easier to justify, since presumably the lone developer is never confused
about why there are two or more leaves on that branch. Further
justifications for intentional forking are [#forking | given below].
Let us return to Figure 2. To resolve such situations before they can
become a real problem, Alice can use the <b>[/help?cmd=merge | fossil
merge]</b> command to merge Bob's changes into her local copy of
check-in 3. Without arguments, that command merges all leaves on the
current branch. Alice can then verify that the merge is sensible and,
if so, commit the results as check-in 5. This results in a DAG as shown in
Figure 3.
<table border=1 cellpadding=10 hspace=10 vspace=10 align="center">
<tr><td align="center">
<img src="branch03.svg"><br>
Figure 3
</td></tr></table>
Check-in 5 is a child of check-in 3 because it was created by editing
check-in 3, but since check-in 5 also inherits the changes from check-in 4 by
virtue of the merge, we say that check-in 5 is a <i>merge child</i>
of check-in 4 and that it is a <i>direct child</i> of check-in 3.
The graph is now back to a single leaf, check-in 5.
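The distinction between a direct child and a merge child can be sketched by recording the two kinds of parent separately. This is a toy model under stated assumptions, not Fossil's storage format: the <code>primary</code> and <code>merged</code> keys and the helper names are hypothetical, and the leaf rule follows the statement above that the merge leaves check-in 5 as the only leaf.

```python
# Toy model of Figure 3. "primary" is the check-in that was edited;
# "merged" lists check-ins whose changes were merged in.
FIGURE_3 = {
    1: {"primary": None, "merged": []},
    2: {"primary": 1, "merged": []},
    3: {"primary": 2, "merged": []},
    4: {"primary": 2, "merged": []},
    5: {"primary": 3, "merged": [4]},
}


def direct_children(graph, checkin):
    """Check-ins created by editing `checkin`."""
    return sorted(c for c, p in graph.items() if p["primary"] == checkin)


def merge_children(graph, checkin):
    """Check-ins that merged in the changes from `checkin`."""
    return sorted(c for c, p in graph.items() if checkin in p["merged"])


def leaves(graph):
    """Check-ins with neither direct nor merge children."""
    return [
        c for c in sorted(graph)
        if not direct_children(graph, c) and not merge_children(graph, c)
    ]


print(direct_children(FIGURE_3, 3))  # [5]
print(merge_children(FIGURE_3, 4))   # [5]
print(leaves(FIGURE_3))              # [5]
```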
We have already seen that if Fossil is in autosync mode then Bob would
have been warned about the potential fork the first time he tried to
commit check-in 4. If Bob had updated his local check-out to merge in
Alice's check-in 3 changes, then committed, the fork would have
never occurred. The resulting graph would have been linear, as shown
in Figure 1.
Realize that the graph of Figure 1 is a subset of Figure 3. If you hold your
hand over the ④ in Figure 3, it looks
exactly like Figure 1 except that the leaf has a different check-in
number. That is just a notational difference: the two check-ins
have exactly the same content.
Inversely, Figure 3 is a
superset of Figure 1. The check-in 4 of Figure 3 captures additional
state which is omitted from Figure 1. Check-in 4 of Figure 3 holds a
copy of Bob's local checkout before he merged in Alice's changes. That
snapshot of Bob's changes, which is independent of Alice's changes, is
omitted from Figure 1.
Some people say that the development approach taken in
Figure 3 is better because it preserves this extra intermediate state.
Others say that the approach taken in Figure 1 is better because it is
much easier to visualize linear development and because the
merging happens automatically instead of as a separate manual step. We
will not take sides in that debate. We will simply point out that
Fossil enables you to do it either way.
<h2 id="branching">The Alternative to Forking: Branching</h2>
Having more than one leaf in the check-in DAG is called a "fork." This
is usually undesirable and either avoided entirely,
as in Figure 1, or else quickly resolved as shown in Figure 3.
But sometimes, one does want to have multiple leaves. For example, a project
might have one leaf that is the latest version of the project under
development and another leaf that is the latest version that has been
tested.
When multiple leaves are desirable, we call this <i>branching</i>
instead of <i>forking</i>.
Figure 4 shows an example of a project where there are two branches, one
for development work and another for testing.
<table border=1 cellpadding=10 hspace=10 vspace=10 align="center">
<tr><td align="center">
<img src="branch04.svg"><br>
Figure 4
</td></tr></table>
Figure 4 diagrams the following scenario: the project starts and
progresses to a point where (at check-in 2)
it is ready to enter testing for its first release.
In a real project, of course, there might be hundreds or thousands of
check-ins before a project reaches this point, but for simplicity of
presentation we will say that the project is ready after check-in 2.
The project then splits into two branches that are used by separate
teams. The testing team, using the blue branch, finds and fixes a few
bugs with check-ins 6 and 9. Meanwhile, the development
team, working on the top uncolored branch,
is busy adding features for the second
release. Of course, the development team would like to take advantage of
the bug fixes implemented by the testing team, so periodically the
changes in the test branch are merged into the dev branch. This is
shown by the dashed merge arrows between check-ins 6 and 7 and between
check-ins 9 and 10.
In both Figures 2 and 4, check-in 2 has two children. In Figure 2,
we call this a "fork." In Figure 4, we call it a "branch." What is
the difference? As far as the internal Fossil data structures are
concerned, there is no difference. The distinction is in the intent.
In Figure 2, the fact that check-in 2 has multiple children is an
accident that stems from concurrent development. In Figure 4, giving
check-in 2 multiple children is a deliberate act. To a good
approximation, we define forking to be by accident and branching to
be by intent. Apart from that, they are the same.
When the fork is intentional, it helps humans to understand what is
going on if we <i>name</i> the forks. This is not essential to Fossil's
internal data model, but humans have trouble working with a long-lived
branch identified only by the commit ID currently at its tip, which is a
long string of hex digits. Therefore, Fossil conflates two concepts:
branching as intentional forking and the naming of forks as branches.
They are in fact separate concepts, but since Fossil is intended to be
used primarily by humans, we combine them in Fossil's human user
interfaces.
<blockquote>
<b>Key Distinction:</b> A branch is a <i>named, intentional</i> fork.
</blockquote>
Unnamed forks <i>may</i> be intentional, but most of the time, they're
accidental and left unnamed.
Fossil offers two primary ways to create named, intentional forks,
a.k.a. branches. First:
<pre>
$ fossil commit --branch my-new-branch-name
</pre>
This is the method we recommend for most cases: it creates a branch as
part of a check-in using the version in the current checkout directory
as its basis. (This is normally the tip of the current branch, though
it doesn't have to be. You can create a branch from an ancestor check-in
on a branch as well.) After making this branch-creating
check-in, your local working directory is switched to that branch, so
that further check-ins occur on that branch as well, as children of the
tip check-in on that branch.
The second, more complicated option is:
<pre>
$ fossil branch new my-new-branch-name trunk
$ fossil update my-new-branch-name
$ fossil commit
</pre>
This takes three commands instead of one, and the first is
longer than the entire simpler command above. You must also give the second command
before creating any check-ins: until you do, your local working
directory remains on the branch it was on when you issued
the first command, so the commit would otherwise put the new material on
the original branch instead of the new one.
In addition to those problems, the second method is a violation of the
[https://en.wikipedia.org/wiki/You_aren%27t_gonna_need_it|YAGNI
Principle]. We recommend that you wait until you actually need the
branch before you create it using the first command above.
The "trunk" is just another named branch in Fossil. It is simply
the default branch name for the first check-in and every check-in made as
one of its direct descendants. It is special only in that it is Fossil's
default when it has no better idea of which branch you mean.
<h2 id="forking">Justifications For Forking</h2>
The primary cases where forking is justified over branching are all when
it is done purely in software in order to avoid losing information:
<ol>
<li><p id="offline">By Fossil itself when two users check in children to the same
leaf of a branch, as in Figure 2.
<br><br>
If the fork occurs because
autosync is disabled on one or both of the repositories or because
the user doing the check-in has no network connection at the moment
of the commit, Fossil has no way of knowing that it is creating a
fork until the two repositories are later synchronized.</p></li>
<li><p id="dist-clone">By Fossil when the cloning hierarchy is more
than 2 levels deep.
<br><br>
[./sync.wiki|Fossil's synchronization protocol] is a two-party
negotiation; syncs don't automatically propagate up the clone tree
beyond that. Because of that, if you have a master repository and
Alice clones it, then Bobby clones from Alice's repository, a
check-in by Bobby that autosyncs with Alice's repo will <i>not</i>
also autosync with the master repo. The master doesn't get a copy of
Bobby's check-in until Alice <i>separately</i> syncs with the master.
If Carol cloned from the master repo and checks something in that
creates a fork relative to Bobby's check-in, the master repo won't
know about that fork until Alice syncs her repo with the master.
Even then, realize that Carol still won't know about the fork until
she subsequently syncs with the master repo.
<br><br>
One way to deal with this is to just accept it as a fact of using a
[https://en.wikipedia.org/wiki/Distributed_version_control|Distributed
Version Control System] like Fossil.
<br><br>
Another option, which we recommend you consider carefully, is to
make it a local policy that check-ins be made only directly against the master
repo or one of its immediate child clones so that the autosync
algorithm can do its job most effectively. Any clones deeper than
that should be treated as read-only and thus get a copy of the new
state of the world only once these central repos have negotiated
that new state. This policy avoids a class of inadvertent fork you
might not need to tolerate. Since [#bad-fork|forks on long-lived
shared working branches can end up dividing a team's development
effort], a team may easily justify this restriction on distributed
cloning.</p></li>
<li><p id="automation">You've automated Fossil, so you use
<b>fossil commit --allow-fork</b> commands to prevent Fossil from
refusing the check-in simply because it would create a fork.
<br><br>
If you are writing such a tool — e.g. a shell script to make
multiple manipulations on a Fossil repo — it's better to make it
smart enough to detect this condition and cope with it, such as
by making a call to <b>[/help?cmd=update | fossil update]</b>
and checking for a merge conflict. That said, if the alternative is
losing information, you may feel justified in creating forks that an
interactive user must later manually clean up with <b>fossil merge</b>
commands.</p></li>
</ol>
That leaves only one case where we can recommend use of "--allow-fork"
by interactive users: when you're working on a personal branch so that
creating a dual-tipped branch isn't going to cause any other user an
inconvenience or risk [#bad-fork | inadvertently forking the development
effort]. In such a case, the lone developer working on that branch is
not confused, since the fork in development is intentional. Sometimes it
simply makes no sense to bother creating a name, cluttering the global
branch namespace, simply to convert an intentional fork into a "branch."
This is especially the case when the fork is short-lived.
There's a common generalization of that case: you're a solo developer,
so that the problems with branching vs forking simply don't matter. In
that case, feel free to use "--allow-fork" as much as you like.
<h2 id="fix">Fixing Forks</h2>