Fossil

Diff

Differences From Artifact [ce4ed43d67]:

To Artifact [9498881004]:


From [ce4ed43d67], lines 85-102:

 *  `[0-9a-fA-F]` Matches exactly one hexadecimal digit;
 *  `[a-]` Matches either `a` or `-`;
 *  `[][]` Matches either `]` or `[`;
 *  `[^]]` Matches exactly one character other than `]`;
 *  `[]^]` Matches either `]` or `^`; and
 *  `[^-]` Matches exactly one character other than `-`.

White space means the ASCII characters TAB, LF, VT, FF, CR, and SPACE.
Note that this does not include any of the many additional spacing
characters available in Unicode, and specifically does not include
U+00A0 NO-BREAK SPACE. 

Because both LF and CR are white space and leading and trailing spaces
are stripped from each glob in a list, a list of globs may be broken
into lines between globs when the list is stored in a file (as for a
versioned setting).

Similarly 'single quotes' and "double quotes" are the ASCII straight







To [9498881004], lines 85-102:

 *  `[0-9a-fA-F]` Matches exactly one hexadecimal digit;
 *  `[a-]` Matches either `a` or `-`;
 *  `[][]` Matches either `]` or `[`;
 *  `[^]]` Matches exactly one character other than `]`;
 *  `[]^]` Matches either `]` or `^`; and
 *  `[^-]` Matches exactly one character other than `-`.

White space means the specific ASCII characters TAB, LF, VT, FF, CR,
and SPACE.  Note that this does not include any of the many additional
spacing characters available in Unicode, and specifically does not
include U+00A0 NO-BREAK SPACE. 

Because both LF and CR are white space and leading and trailing spaces
are stripped from each glob in a list, a list of globs may be broken
into lines between globs when the list is stored in a file (as for a
versioned setting).
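
As a rough illustration of the white space rule above, the following Python sketch strips only the six listed ASCII characters from a glob. The names `FOSSIL_WHITESPACE` and `strip_glob` are invented for this example and are not part of Fossil; the sketch assumes nothing beyond what the paragraphs above state.

```python
# The six ASCII characters the text above lists as white space:
# TAB, LF, VT, FF, CR, and SPACE.
FOSSIL_WHITESPACE = "\t\n\v\f\r "


def strip_glob(glob: str) -> str:
    """Strip only the listed ASCII white space from both ends of a glob."""
    return glob.strip(FOSSIL_WHITESPACE)


# ASCII white space around the glob is removed.
print(repr(strip_glob("  *.o\r\n")))
# U+00A0 NO-BREAK SPACE is not in the set, so it stays part of the glob.
print(repr(strip_glob("\u00a0*.o\u00a0")))
```

Passing an explicit character set to `str.strip` matters here, because Python's default `strip()` also removes Unicode spacing characters such as U+00A0.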

Similarly 'single quotes' and "double quotes" are the ASCII straight

From [ce4ed43d67], lines 110-203:

Before it is compared to a glob pattern, each file name is transformed
to a canonical form. The glob must match the entire canonical file
name to be considered a match.

The canonical name of a file has all directory separators changed to
`/`, redundant slashes are removed, all `.` path components are
removed, and all `..` path components are resolved. (There are
additional details we won't go into here.)


The goal is a name that is the simplest possible for each particular
file, and will be the same on Windows, Unix, and any other platform
where fossil is run.

Beware, however, that all glob matching is case sensitive. This will
not be a surprise on Unix where all file names are also case
sensitive. However, most Windows file systems are case preserving and
case insensitive. On Windows, the names `ReadMe` and `README` are
names of the same file; on Unix they are different files.

Some example cases:
 
 *  The glob `README` matches only a file named `README` in the root of
    the tree. It does not match a file named `src/README` because it
    does not include any characters that consumed the `src/` part. 

 *  The glob `*/README` does match `src/README`. Unlike Unix file
    globs, it also matches `src/library/README`. However it does not
    match the file `README` in the root of the tree.





 *  The glob `src/README` does match the file named `src\README` on
    Windows because all directory separators are rewritten as `/` in
    the canonical name before the glob is matched. This makes it much
    easier to write globs that work on both Unix and Windows.
 *  The glob `*.[ch]` matches every C source or header file in the
    tree at the root or at any depth. Again, this is (deliberately)
    different from Unix file globs and Windows wild cards.



## Where Globs are Used

### Settings that are Globs

These settings are all lists of glob patterns:

 * `binary-glob`
 * `clean-glob`
 * `crlf-glob`
 * `crnl-glob`
 * `encoding-glob`
 * `ignore-glob`
 * `keep-glob`

All may be [versioned, local, or global][settings]. Use `fossil
settings` to manage local and global settings. For a versioned
setting, use a file named for that setting in the repository's
`.fossil-settings/` folder at the root of the tree.

  [settings]: /doc/trunk/www/settings.wiki

Using versioned settings for these not only has the advantage that
they are tracked in the repository just like the rest of your project,
but you can more easily keep longer lists of more complicated glob
patterns than would be practical in either local or global settings.

The `ignore-glob` is an example of one setting that frequently grows
to be an elaborate list of files that should be ignored by most
commands. This is especially true when one (or more) IDEs are used in
a project because each IDE has its own ideas of how and where to cache
information that speeds up its browsing and building tasks but which
need not be preserved in your project's history.


### Commands that Refer to Globs

Many of the commands that respect the settings containing globs have
options to override some or all of the settings. These options are
usually named to correspond to the setting they override, such as
`--ignore` to override the `ignore-glob` setting. These commands are:

 * [`add`][]
 * [`addremove`][]
 * [`changes`][]
 * [`clean`][]
 * [`extras`][]
 * [`merge`][]
 * [`settings`][] 
 * [`status`][]
 * [`unset`][]

The commands [`tarball`][] and [`zip`][] produce compressed archives of a
specific checkin. They may be further restricted by options that
specify glob patterns that name files to include or exclude rather
than archiving the entire checkin.

The commands [`http`][], [`cgi`][], [`server`][], and [`ui`][] that







To [9498881004], lines 110-207:

Before it is compared to a glob pattern, each file name is transformed
to a canonical form. The glob must match the entire canonical file
name to be considered a match.

The canonical name of a file has all directory separators changed to
`/`, redundant slashes are removed, all `.` path components are
removed, and all `..` path components are resolved. (There are
additional details we are ignoring here, but they cover rare edge
cases and also follow the principle of least surprise.)

The goal is to have a name that is the simplest possible for each
particular file, and that will be the same on Windows, Unix, and any
other platform where fossil is run.
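
The canonical form described above can be approximated in a few lines of Python. This is only a sketch of the stated rules, not Fossil's own code: the function name `canonical_name` is hypothetical, and treating every backslash as a directory separator is an assumption that suits Windows-style names.

```python
import posixpath


def canonical_name(name: str) -> str:
    """Approximate the canonical form described above (not Fossil's own code)."""
    # Rewrite Windows separators as '/', then let normpath drop redundant
    # slashes and '.' components and resolve '..' components.
    return posixpath.normpath(name.replace("\\", "/"))


print(canonical_name("src\\.\\lib//..//README"))
print(canonical_name("./a//b/../c.h"))
```

The additional details that the text mentions are not modelled here.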

Beware, however, that all glob matching is case sensitive. This will
not be a surprise on Unix where all file names are also case
sensitive. However, most Windows file systems are case preserving and
case insensitive. That is, on Windows, the names `ReadMe` and `README`
are names of the same file; on Unix they are different files.

Some example cases:
 
 *  The glob `README` matches only a file named `README` in the root of
    the tree. It does not match a file named `src/README` because it
    does not include any characters that consume (and match) the
    `src/` part. 
 *  The glob `*/README` does match `src/README`. Unlike Unix file
    globs, it also matches `src/library/README`. However it does not
    match the file `README` in the root of the tree.
 *  The glob `*README` does match `src/README` as well as the file
    `README` in the root of the tree as well as `foo/bar/README` or
    any other file named `README` in the tree. However, it also
    matches `A-DIFFERENT-README` and `src/DO-NOT-README`, or any other
    file whose name ends with `README`.
 *  The glob `src/README` does match the file named `src\README` on
    Windows because all directory separators are rewritten as `/` in
    the canonical name before the glob is matched. This makes it much
    easier to write globs that work on both Unix and Windows.
 *  The glob `*.[ch]` matches every C source or header file in the
    tree at the root or at any depth. Again, this is (deliberately)
    different from Unix file globs and Windows wild cards.
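
The example cases above can be tried out with Python's `fnmatch.fnmatchcase`, which happens to share the two properties that matter here: it is case sensitive, and its `*` also matches `/`. This is a sketch for experimentation only. Python's pattern syntax is not identical to Fossil's (for instance, it negates a character class with `!` rather than `^`), and the file name `src/deep/util.c` is made up for the example.

```python
from fnmatch import fnmatchcase

# (glob, canonical file name) pairs based on the example cases above.
cases = [
    ("README", "README"),
    ("README", "src/README"),
    ("*/README", "src/README"),
    ("*/README", "src/library/README"),
    ("*/README", "README"),
    ("*README", "foo/bar/README"),
    ("*README", "A-DIFFERENT-README"),
    ("*.[ch]", "src/deep/util.c"),
    ("README", "ReadMe"),
]

for glob, name in cases:
    # fnmatchcase requires the glob to match the entire name.
    print(f"{glob!r:12} {name!r:24} {fnmatchcase(name, glob)}")
```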



## Where Globs are Used

### Settings that are Globs

These settings are all lists of glob patterns:

 *  `binary-glob`
 *  `clean-glob`
 *  `crlf-glob`
 *  `crnl-glob`
 *  `encoding-glob`
 *  `ignore-glob`
 *  `keep-glob`

All may be [versioned, local, or global](settings.wiki). Use `fossil
settings` to manage local and global settings. For a versioned
setting, use a file named for that setting in the repository's
`.fossil-settings/` folder at the root of the tree.



Using versioned settings for these not only has the advantage that
they are tracked in the repository just like the rest of your project,
but you can more easily keep longer lists of more complicated glob
patterns than would be practical in either local or global settings.

The `ignore-glob` is an example of one setting that frequently grows
to be an elaborate list of files that should be ignored by most
commands. This is especially true when one (or more) IDEs are used in
a project because each IDE has its own ideas of how and where to cache
information that speeds up its browsing and building tasks but which
need not be preserved in your project's history.


### Commands that Refer to Globs

Many of the commands that respect the settings containing globs have
options to override some or all of the settings. These options are
usually named to correspond to the setting they override, such as
`--ignore` to override the `ignore-glob` setting. These commands are:

 *  [`add`][]
 *  [`addremove`][]
 *  [`changes`][]
 *  [`clean`][]
 *  [`extras`][]
 *  [`merge`][]
 *  [`settings`][] 
 *  [`status`][]
 *  [`unset`][]

The commands [`tarball`][] and [`zip`][] produce compressed archives of a
specific checkin. They may be further restricted by options that
specify glob patterns that name files to include or exclude rather
than archiving the entire checkin.

The commands [`http`][], [`cgi`][], [`server`][], and [`ui`][] that

From [ce4ed43d67], lines 249-268:

shells. Fossil glob patterns also have a quoting mechanism, discussed
above. Because other parts of your operating system may interpret glob
patterns and quotes separately from Fossil, it is often difficult to
give glob patterns correctly to Fossil on the command line. Quotes and
special characters in glob patterns are likely to be interpreted when
given as part of a `fossil` command, causing unexpected behavior.

These problems do not affect [versioned settings
files](/doc/trunk/www/settings.wiki) or Admin &rarr; Settings in Fossil
UI. Consequently, it is better to set long-term `*-glob` settings via
these methods than to use `fossil settings` commands.

That advice doesn't help you when you are giving one-off glob patterns
in `fossil` commands. The remainder of this section gives remedies and
workarounds for these problems.


## POSIX Systems

If you are using Fossil on a system with a POSIX-compatible shell







To [9498881004], lines 253-272:

shells. Fossil glob patterns also have a quoting mechanism, discussed
above. Because other parts of your operating system may interpret glob
patterns and quotes separately from Fossil, it is often difficult to
give glob patterns correctly to Fossil on the command line. Quotes and
special characters in glob patterns are likely to be interpreted when
given as part of a `fossil` command, causing unexpected behavior.

These problems do not affect [versioned settings files](settings.wiki)
or Admin &rarr; Settings in Fossil UI. Consequently, it is better to
set long-term `*-glob` settings via these methods than to use `fossil
settings` commands.

That advice does not help you when you are giving one-off glob patterns
in `fossil` commands. The remainder of this section gives remedies and
workarounds for these problems.


## POSIX Systems

If you are using Fossil on a system with a POSIX-compatible shell

From [ce4ed43d67], lines 283-318:

…which is compatible with the `fossil add` command's argument list,
which allows multiple files.

Now consider what happens instead if you say:

    $ fossil add --ignore RE* src/*.c

This *doesn't* do what you want because the shell will expand both `RE*`
and `src/*.c`, causing one of the two files matching the `RE*` glob
pattern to be ignored and the other to be added to the repository. You
need to say this in that case:

    $ fossil add --ignore 'RE*' src/*.c

The single quotes force a POSIX shell to pass the `RE*` glob pattern
through to Fossil untouched, which will do its own glob pattern
matching. There are other methods of quoting a glob pattern or escaping
its special characters; see your shell's manual.

Beware that Fossil's `--ignore` option doesn't override explicit file
mentions:

    $ fossil add --ignore 'REALLY SECRET STUFF.txt' RE*

You might think that would add everything beginning with `RE` *except*
for `REALLY SECRET STUFF.txt`, but when a file is both given explicitly
to Fossil and also matches an ignore rule, Fossil asks what you want to
do with it in the default case; it doesn't even ask if you gave the `-f`
or `--force` option along with `--ignore`.

The spaces in the ignored file name above bring us to another point:
such file names must be quoted in Fossil glob patterns, lest Fossil
interpret them as multiple glob patterns, but the shell interprets
quotation marks itself.

One way to fix both this and the previous problem is:







To [9498881004], lines 287-322:

…which is compatible with the `fossil add` command's argument list,
which allows multiple files.

Now consider what happens instead if you say:

    $ fossil add --ignore RE* src/*.c

This *does not* do what you want because the shell will expand both `RE*`
and `src/*.c`, causing one of the two files matching the `RE*` glob
pattern to be ignored and the other to be added to the repository. You
need to say this in that case:

    $ fossil add --ignore 'RE*' src/*.c

The single quotes force a POSIX shell to pass the `RE*` glob pattern
through to Fossil untouched, which will do its own glob pattern
matching. There are other methods of quoting a glob pattern or escaping
its special characters; see your shell's manual.
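
To see why the quotes matter, the following Python sketch roughly mimics what a POSIX shell does to the command line before Fossil ever runs. It is an illustration under stated assumptions, not a shell implementation: `shell_argv` and `demo` are hypothetical helpers, only patterns within a single directory are handled, and the file names `README.txt` and `RELEASE-NOTES.txt` are invented stand-ins for two files that match `RE*`.

```python
import fnmatch
import os
import tempfile


def shell_argv(words, cwd):
    """Roughly mimic POSIX pathname expansion for single-directory patterns.

    Each word is a (text, quoted) pair. Quoted words pass through untouched,
    and an unquoted pattern that matches nothing stays literal.
    """
    names = sorted(os.listdir(cwd))
    argv = []
    for text, quoted in words:
        matches = [] if quoted else fnmatch.filter(names, text)
        argv.extend(matches or [text])
    return argv


def demo():
    with tempfile.TemporaryDirectory() as cwd:
        # Hypothetical file names that both match RE*.
        for name in ("README.txt", "RELEASE-NOTES.txt"):
            open(os.path.join(cwd, name), "w").close()
        prefix = [("fossil", False), ("add", False), ("--ignore", False)]
        unquoted = shell_argv(prefix + [("RE*", False)], cwd)
        quoted = shell_argv(prefix + [("RE*", True)], cwd)
    return unquoted, quoted


unquoted, quoted = demo()
print(unquoted)  # --ignore is followed by two file names
print(quoted)    # --ignore is followed by the pattern itself
```

In the unquoted case the first expanded name becomes the argument to `--ignore` and the second becomes a file to add, which is the surprise described above.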

Beware that Fossil's `--ignore` option does not override explicit file
mentions:

    $ fossil add --ignore 'REALLY SECRET STUFF.txt' RE*

You might think that would add everything beginning with `RE` *except*
for `REALLY SECRET STUFF.txt`, but when a file is both given
explicitly to Fossil and also matches an ignore rule, Fossil asks what
you want to do with it in the default case; and it does not even ask
if you gave the `-f` or `--force` option along with `--ignore`.

The spaces in the ignored file name above bring us to another point:
such file names must be quoted in Fossil glob patterns, lest Fossil
interpret them as multiple glob patterns, but the shell interprets
quotation marks itself.

One way to fix both this and the previous problem is:

From [ce4ed43d67], lines 338-380:

    $ fossil add --ignore "'doc/REALLY SECRET STUFF.txt'" READ*

instead. The Fossil glob pattern still needs the `doc/` prefix because
Fossil always interprets glob patterns from the base of the checkout
directory, not from the current working directory as POSIX shells do.

When in doubt, use `fossil status` after running commands like the above
to make sure the right set of files were scheduled for insertion into
the repository before checking the changes in. You wouldn't want to
accidentally check something like a password, an API key, or the private
half of a public crypto key into a Fossil repository that can be read by
people who should not have such secrets.


## Windows

Neither standard Windows command shell &mdash; `cmd.exe` or PowerShell
&mdash; expands glob patterns the way POSIX shells do. Windows command
shells rely on the command itself to do the glob pattern expansion. The
way this works depends on several factors:

*   the version of Windows you're using
*   which OS upgrades have been applied to it
*   the compiler that built your Fossil executable
*   whether you're running the command interactively
*   whether the command is built against a runtime system that does this
    at all
*   whether the Fossil command is being run from a file named `*.BAT` vs
    being named `*.CMD`

These factors also affect how a program like `fossil.exe` interprets
quotation marks on its command line.

The fifth item above doesn't apply to `fossil.exe` when built with
typical tool chains, but we'll see an example below where the exception
applies in a way that affects how Fossil interprets the glob pattern.

The most common problem is figuring out how to get a glob pattern passed
on the command line into `fossil.exe` without it being expanded by the C
runtime library that your particular Fossil executable is linked to,
which tries to act like the POSIX systems described above. Windows is
not strongly governed by POSIX, so it has not historically hewed closely







To [9498881004], lines 342-384:

    $ fossil add --ignore "'doc/REALLY SECRET STUFF.txt'" READ*

instead. The Fossil glob pattern still needs the `doc/` prefix because
Fossil always interprets glob patterns from the base of the checkout
directory, not from the current working directory as POSIX shells do.
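
The effect of the nested quotes can be checked with Python's `shlex.split`, which follows POSIX quoting rules closely enough for this purpose. This is a sketch only: `shlex` handles quoting but does not expand `READ*`, which a real shell would replace with matching file names.

```python
import shlex

# The command line from the example above, exactly as typed at the prompt.
command = """fossil add --ignore "'doc/REALLY SECRET STUFF.txt'" READ*"""

argv = shlex.split(command)
for arg in argv:
    print(repr(arg))
```

The outer double quotes are consumed by the shell, while the inner single quotes survive as part of the argument, so Fossil receives a quoted pattern and treats the file name as a single glob.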

When in doubt, use `fossil status` after running commands like the
above to make sure the right set of files were scheduled for insertion
into the repository before checking the changes in. You never want to
accidentally check something like a password, an API key, or the
private half of a public cryptographic key into a Fossil repository that
can be read by people who should not have such secrets.


## Windows

Neither standard Windows command shell &mdash; `cmd.exe` or PowerShell
&mdash; expands glob patterns the way POSIX shells do. Windows command
shells rely on the command itself to do the glob pattern expansion. The
way this works depends on several factors:

 *  the version of Windows you are using
 *  which OS upgrades have been applied to it
 *  the compiler that built your Fossil executable
 *  whether you are running the command interactively
 *  whether the command is built against a runtime system that does this
    at all
 *  whether the Fossil command is being run from a file named `*.BAT` vs
    being named `*.CMD`

These factors also affect how a program like `fossil.exe` interprets
quotation marks on its command line.

The fifth item above does not apply to `fossil.exe` when built with
typical tool chains, but we will see an example below where the exception
applies in a way that affects how Fossil interprets the glob pattern.

The most common problem is figuring out how to get a glob pattern passed
on the command line into `fossil.exe` without it being expanded by the C
runtime library that your particular Fossil executable is linked to,
which tries to act like the POSIX systems described above. Windows is
not strongly governed by POSIX, so it has not historically hewed closely

From [ce4ed43d67], lines 414-428:

This works because the built-in command `echo` does not expand its
arguments, and the `--args -` option makes it read further command
arguments from Fossil's standard input, which is connected to the output
of `echo` by the pipe. (`-` is a common Unix convention meaning
"standard input.")

Another correct approach is:

    C:\...> fossil setting crlf-glob *,

This works because the trailing comma prevents the command shell from
matching any files, unless you happen to have files named with a
trailing comma in the current directory. If the pattern matches no
files, it is passed into Fossil's `main()` function as-is by the C







To [9498881004], lines 418-432:

This works because the built-in command `echo` does not expand its
arguments, and the `--args -` option makes it read further command
arguments from Fossil's standard input, which is connected to the output
of `echo` by the pipe. (`-` is a common Unix convention meaning
"standard input.")

Another (usually) correct approach is:

    C:\...> fossil setting crlf-glob *,

This works because the trailing comma prevents the command shell from
matching any files, unless you happen to have files named with a
trailing comma in the current directory. If the pattern matches no
files, it is passed into Fossil's `main()` function as-is by the C