Fossil

View Ticket

2010-10-05
03:29 Fixed ticket [2a1e8e3c4b]: content_get can recursive too deep plus 1 other change ... (artifact: 4a8112443e user: drh)
03:29
Merge the small-stack changes into the trunk. This completes the fix for ticket [2a1e8e3c4b0b39e08fdde] ... (check-in: b8f134bbbb user: drh tags: trunk)
03:24
Fix issues with the prior commit on this branch. The small-stack non-recursive implementation appears to be working. Ticket [2a1e8e3c4b0b39e08fdde]. ... (Closed-Leaf check-in: f93a54d0ba user: drh tags: small_stack)
02:46
An attempt to reduce the depth of recursion in order to run better on systems with limited stack space. Ticket [2a1e8e3c4b0b39e08fdde0]. This check-in compiles and runs but has issues. ... (check-in: 9664989c0f user: drh tags: small_stack)
2010-10-04
10:00 Ticket [2a1e8e3c4b] content_get can recursive too deep status still Open with 1 other change ... (artifact: f3c0c5e2dd user: anonymous)
2010-10-03
20:00
Dramatic performance improvement for "fossil deconstruct" and "fossil reconstruct" on large repositories. Add progress information for "fossil reconstruct". Possibly related to ticket [2a1e8e3c4b0b39e08fdde0]. Fix for ticket [76d3ecfdab577bdf843]. ... (check-in: 5f0201030c user: drh tags: trunk)
19:16 Ticket [2a1e8e3c4b] content_get can recursive too deep status still Open with 1 other change ... (artifact: 9918b41e2d user: drh)
19:01
For "fossil rebuild" increment the progress counter after each artifact is processed, rather than waiting for its delta children to be processed, in order to give a more uniform progress indication. Possibly related to ticket [2a1e8e3c4b0b39e08fdde]. ... (check-in: ae000c23fa user: drh tags: trunk)
18:32 Ticket [2a1e8e3c4b] content_get can recursive too deep status still Open with 1 other change ... (artifact: 413917f5ef user: drh)
18:02 Ticket [2a1e8e3c4b]: 1 change ... (artifact: 446360743b user: drh)
17:52 Ticket [2a1e8e3c4b]: 1 change ... (artifact: 9d4fe50867 user: anonymous)
12:52 Ticket [2a1e8e3c4b]: 3 changes ... (artifact: 6541ee201e user: anonymous)
02:40 New ticket [2a1e8e3c4b]. ... (artifact: 2531e176fb user: anonymous)

Ticket Hash: 2a1e8e3c4b0b39e08fdde0d24d9fb35fbc66d39a
Title: content_get can recursive too deep
Status: Fixed
Type: Code_Defect
Severity: Severe
Priority:
Subsystem:
Resolution: Fixed
Last Modified: 2010-10-05 03:29:45
Version Found In: c492eab395
Description:
I'm trying to convert an existing larger CVS repository into fossil. The progress bar from rebuild_db had reached 100% and fossil was spending several minutes without giving any indication of what it was doing. After attaching gdb, I saw
#2556 0x000000000040d048 in content_get (rid=11398, pBlob=<value optimized out>) at content_.c:256

in the bt output. So it seems like it is creating a stack frame for roughly every CVS revision of a file. That strongly suggests it should be done differently, without depending on huge stacks.


anonymous added on 2010-10-03 12:52:15:
This issue results in stack overflows with the default 2MB limit while cloning a medium-sized (40MB after vacuum) repository. It also seems to be responsible for the 6h spent in rebuild_db() after the equivalent of fossil reconstruct.


anonymous claiming to be Joerg Sonnenberger added on 2010-10-03 17:52:58:
Bumping the content cache from 50 entries to 2500 brings the rebuild time for this repository down from 6h to about 2 minutes. It doesn't fix the deep stack nesting, though.


drh added on 2010-10-03 18:02:19:
I don't understand your issue with the stack size. Yes, content_get() is deeply recursive, but each stack frame is only 52 bytes. What workstation these days can't handle 100K or more recursions of a 52-byte stack frame?

Content is stored as a sequence of deltas, so the extraction algorithm is inherently recursive. We could switch to using a loop and store the recursion information in memory obtained from the heap. But what does that really accomplish, other than making the code more obtuse?
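The loop-plus-heap alternative mentioned above can be sketched as follows. This is a toy model, not Fossil's real code: parent[], content_get_iter(), and the one-character "deltas" are invented for illustration. The point is only that an explicit chain array on the heap replaces the C-stack recursion, so stack depth stays constant no matter how long the delta chain is.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NREC 100000
static int parent[NREC];   /* parent[rid] = delta parent, -1 = stored as full text */

/* Iterative reconstruction: walk up the delta chain recording each rid,
 * then replay the "deltas" forward from the full-text base.  In this toy
 * model a delta just appends one character; real deltas are binary diffs. */
static char *content_get_iter(int rid, const char *baseText){
  int *chain = malloc(NREC * sizeof(int));
  int n = 0;
  while( parent[rid] >= 0 ){        /* climb to the full-text base */
    chain[n++] = rid;
    rid = parent[rid];
  }
  char *content = malloc(strlen(baseText) + (size_t)n + 1);
  strcpy(content, baseText);
  size_t len = strlen(content);
  while( n > 0 ){                   /* replay deltas: base -> target */
    int child = chain[--n];
    content[len++] = (char)('a' + (child % 26));  /* the toy "delta" */
  }
  content[len] = 0;
  free(chain);
  return content;
}

/* Build a linear chain 0 <- 1 <- ... <- depth and reconstruct the tip. */
static char *demo(int depth){
  parent[0] = -1;
  for(int i = 1; i <= depth; i++) parent[i] = i - 1;
  return content_get_iter(depth, "X");
}
```

A chain tens of thousands of records long runs at fixed stack depth here, which is exactly the case the backtrace above shows blowing the stack in the recursive version.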

We've not had performance issues with rebuild before. This suggests that your repository has a different structure than what we have seen in the past. Can we clone a copy of your repository for further study, so that we can get a better grip on the source of your performance problem?


drh added on 2010-10-03 18:32:30:
Changing the cache size from 50 to 2500 makes no measurable difference in the rebuild time for the 10.35-year, 8400 check-in, 46MB repository used by SQLite. This is further evidence that your repository structure is different from what we are used to and that we will need to see your repository in order to figure out what is slowing things down for you.


drh added on 2010-10-03 19:16:34:
Perhaps I have misunderstood the nature of the problem reported above. Are you trying to run "fossil rebuild" or "fossil reconstruct"? I've been assuming "fossil rebuild", but now I'm thinking I misunderstood.

There is an inefficiency in "fossil reconstruct", I think. And I think I have a simple fix. But I need to fix some inefficiencies in "fossil deconstruct" first, so that I can construct a large set of files from which to test "fossil reconstruct". I'll try to post a patch soon....


anonymous added on 2010-10-04 10:00:21:
I'm seeing this as part of the rebuild step of reconstruct and clone. I haven't tried "fossil rebuild". With "fossil clone", the stack nesting is deep enough to hit the stack limit. The backtrace starts with a few thousand content_get frames and ends with a few thousand of:

#14478 0x000000000040d6a0 in after_dephantomize (rid=25789, linkFlag=1) at content_.c:361

I can't share this repository, but the problematic part is a single file that has ~3000 commits, incrementally extending it. Think of a ChangeLog if you want.
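The recursion in after_dephantomize described above could, in principle, be flattened with an explicit worklist. Everything below (firstChild[], dephantomize_all(), demo_chain()) is a hypothetical toy, not Fossil's actual code: it only shows that draining a heap-allocated todo list keeps the C stack shallow for an arbitrarily long chain of dependent artifacts, such as the ~3000-commit file mentioned here.

```c
#include <assert.h>
#include <stdlib.h>

#define N 50000
static int firstChild[N];   /* -1 if none; toy model: each rid has at most one child */
static int processed[N];

/* Non-recursive processing: instead of recursing into each newly
 * resolved artifact, push it onto a todo list and drain the list in
 * a loop.  Returns the number of artifacts processed. */
static int dephantomize_all(int rid){
  int *todo = malloc(N * sizeof(int));
  int nTodo = 0, count = 0;
  todo[nTodo++] = rid;
  while( nTodo > 0 ){
    int cur = todo[--nTodo];
    processed[cur] = 1;
    count++;
    if( firstChild[cur] >= 0 ) todo[nTodo++] = firstChild[cur];
  }
  free(todo);
  return count;
}

/* Build a dependency chain 0 -> 1 -> ... -> n-1 and process it. */
static int demo_chain(int n){
  for(int i = 0; i < n; i++){
    firstChild[i] = (i + 1 < n) ? i + 1 : -1;
    processed[i] = 0;
  }
  return dephantomize_all(0);
}
```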

I think the best approach here is to avoid deep recursion by limiting the length of a delta chain. What do you think about storing the start and depth of the delta chain with each delta, and introducing an on-disk content cache to cut the chain when it reaches a specific limit? Phantoms could be processed in a forward pass, without recursion, by using a small todo bag.
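The proposed depth limit might look roughly like this toy sketch. Rec, store_child(), and MAX_CHAIN are invented names, not Fossil code: the idea is just that each record carries its chain depth, and a new child falls back to full-text storage once the chain would exceed the limit, bounding both recursion depth and reconstruction cost.

```c
#include <assert.h>

#define MAX_CHAIN 50   /* illustrative limit on delta-chain length */

typedef struct Rec {
  int isDelta;      /* 1 = stored as a delta against its parent */
  int chainDepth;   /* 0 for full text, parent's depth + 1 for deltas */
} Rec;

/* Decide how to store a child given its parent's record: delta if the
 * chain stays under the limit, otherwise cut the chain with full text. */
static Rec store_child(const Rec *parent){
  Rec r;
  if( parent->chainDepth + 1 >= MAX_CHAIN ){
    r.isDelta = 0;
    r.chainDepth = 0;
  }else{
    r.isDelta = 1;
    r.chainDepth = parent->chainDepth + 1;
  }
  return r;
}

/* Simulate n successive revisions of one file, starting from full
 * text, and report the deepest chain ever produced. */
static int max_depth_after(int n){
  Rec cur = {0, 0};
  int maxDepth = 0;
  for(int i = 0; i < n; i++){
    cur = store_child(&cur);
    if( cur.chainDepth > maxDepth ) maxDepth = cur.chainDepth;
  }
  return maxDepth;
}
```

Under this scheme a file with ~3000 incremental revisions never yields a chain longer than MAX_CHAIN, at the cost of storing an occasional extra full-text copy.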