Re: performance issues

From: Joaquín Cuenca Abela (
Date: Sun Jul 21 2002 - 15:22:40 EDT

  • Next message: Patrick Lam: "a pair of commits: xft and table fixes"

    On Sat, 2002-07-20 at 14:56, Joaquín Cuenca Abela wrote:
    > just to let you that I've found the problem. The layout redoes each
    > n*(n + 1) / 2 times (n being the number of pages).
    > That gives us ~ 205^2 / 2 * the number of runs that we should really
    > have.
    > I don't have enough time to explain all that right now. I will try to
    > fix it tomorrow and send an explanation.

    Unfortunatelly, I didn't have enough time to fix all that. I will give
    now a detailed explanation of what is going on (at least that what I can
    deduce with the limited tools that I have to do performance

    The document has, as I said, has 205 pages. It has 165382 runs in the
    whole document.

    We load the whole document in memory, we put all the runs in the first
    page, and then we call fl_DocSectionLayout::breakSection. At the end,
    breakSection does:

    // Bump content down the columns
    while (pCurColumn != NULL && pCurColumn != pNextColumn)
            pCurColumn = (fp_Column *) pCurColumn->getNext();

    That's, for each column (as I have 1 column by page, that means for each
    page), we bumpContainers. That takes all the runs that don't fit in the
    current container, and bump them to the next container.

    Things go like that:

    1. Put every line in the first page.
    2. i = 1
    3. Get every line that don't fits in the i'th page, and put it in the
    i+1'th page. If everything fits in the i'th page, we're done.
    4. ++i, and go to 3.

    If we measure the cost of the operation in number of lines "reflowed",
    we get:

    t: nb of lines in the document
    n: nb of lines by page (approx.)
    N: nb of pages

    t = n * N
    cost = t + (t - n) + (t - 2n) + ... + (t - n * N)


    cost = N * t - (N + 1) * t / 2 = t * (N - (N + 1) / 2) ~= t * N / 2

    we have N = 205, t = 6462, so that means cost ~= 662355 (when it should
    be 6462).

    Come back to the output given by the profiler (I've removed the names of
    the classes to make it more readable here):

    0.14 5.68 204/204 breakSection(fl_ContainerLayout *) [10]
    0.14 5.68 204 bumpContainers(fp_ContainerObject *) [11]
    0.11 5.40 656302/656302 addContainer(fp_Container *) [15]
    0.11 0.10 656302/1997342 clearScreen(void) [61]

    Indeed, this 100 times factor is quite important, because we end calling
    some fp_Run methods (for each run in each line). We have ~25.6 runs by
    line, so that means that we call 2560 times some fp_Run methods.

    (In addition we have some stupid errors, like in fp_Line::clearScreen we
    start with

    bool bNeedsClearing = true;

    when AFAICS it should be

    bool bNeedsClearing = false;

    I've not yet studied all the branches that this implies, but AFAICS,
    this factor is the main responsable of the 6 millions and 9 millions
    times that some functions are called.

    If somebody is able to study all that in deep, be my guest. I've done
    my best with the tools that I have. Btw, I've tried FunctionChecker and
    done a program to make some sense out of its output + a script to
    convert from the raw address it gives to the name of the functions that
    are at these address, etc. but at the end it blocks when I try to
    measure the performance reading big files (or it takes a looooooong time
    to complete). So I'm still only with gprof.

    And a last note, the breakSection (the root of this problem) is
    convoluted (>500 lines for a single function?). If somebody knows
    exactly how it works, please add some comments :)


    Joaquín Cuenca Abela

    -- Joaquín Cuenca Abela

    This archive was generated by hypermail 2.1.4 : Sun Jul 21 2002 - 15:28:00 EDT