From: Joaquín Cuenca Abela (cuenca@pacaterie.u-psud.fr)
Date: Sun Jul 21 2002 - 15:22:40 EDT
On Sat, 2002-07-20 at 14:56, Joaquín Cuenca Abela wrote:
> just to let you that I've found the problem. The layout redoes each
run
> n*(n + 1) / 2 times (n being the number of pages).
>
> That gives us ~ 205^2 / 2 * the number of runs that we should really
> have.
>
> I don't have enough time to explain all that right now. I will try to
> fix it tomorrow and send an explanation.
Unfortunatelly, I didn't have enough time to fix all that. I will give
now a detailed explanation of what is going on (at least that what I can
deduce with the limited tools that I have to do performance
measurements...)
The document has, as I said, has 205 pages. It has 165382 runs in the
whole document.
We load the whole document in memory, we put all the runs in the first
page, and then we call fl_DocSectionLayout::breakSection. At the end,
breakSection does:
// Bump content down the columns
while (pCurColumn != NULL && pCurColumn != pNextColumn)
{
pCurColumn->bumpContainers(pLastContainerToKeep);
[...]
pCurColumn = (fp_Column *) pCurColumn->getNext();
[...]
}
That's, for each column (as I have 1 column by page, that means for each
page), we bumpContainers. That takes all the runs that don't fit in the
current container, and bump them to the next container.
Things go like that:
1. Put every line in the first page.
2. i = 1
3. Get every line that don't fits in the i'th page, and put it in the
i+1'th page. If everything fits in the i'th page, we're done.
4. ++i, and go to 3.
If we measure the cost of the operation in number of lines "reflowed",
we get:
t: nb of lines in the document
n: nb of lines by page (approx.)
N: nb of pages
t = n * N
cost = t + (t - n) + (t - 2n) + ... + (t - n * N)
so
cost = N * t - (N + 1) * t / 2 = t * (N - (N + 1) / 2) ~= t * N / 2
we have N = 205, t = 6462, so that means cost ~= 662355 (when it should
be 6462).
Come back to the output given by the profiler (I've removed the names of
the classes to make it more readable here):
0.14 5.68 204/204 breakSection(fl_ContainerLayout *) [10]
0.14 5.68 204 bumpContainers(fp_ContainerObject *) [11]
0.11 5.40 656302/656302 addContainer(fp_Container *) [15]
0.11 0.10 656302/1997342 clearScreen(void) [61]
Indeed, this 100 times factor is quite important, because we end calling
some fp_Run methods (for each run in each line). We have ~25.6 runs by
line, so that means that we call 2560 times some fp_Run methods.
(In addition we have some stupid errors, like in fp_Line::clearScreen we
start with
bool bNeedsClearing = true;
when AFAICS it should be
bool bNeedsClearing = false;
)
I've not yet studied all the branches that this implies, but AFAICS,
this factor is the main responsable of the 6 millions and 9 millions
times that some functions are called.
If somebody is able to study all that in deep, be my guest. I've done
my best with the tools that I have. Btw, I've tried FunctionChecker and
done a program to make some sense out of its output + a script to
convert from the raw address it gives to the name of the functions that
are at these address, etc. but at the end it blocks when I try to
measure the performance reading big files (or it takes a looooooong time
to complete). So I'm still only with gprof.
And a last note, the breakSection (the root of this problem) is
convoluted (>500 lines for a single function?). If somebody knows
exactly how it works, please add some comments :)
Cheers,
-- Joaquín Cuenca Abela cuenca@pacaterie.u-psud.fr-- Joaquín Cuenca Abela cuenca@pacaterie.u-psud.fr
This archive was generated by hypermail 2.1.4 : Sun Jul 21 2002 - 15:28:00 EDT