Summer of Code weekly report #5

My project.

This post assumes that the reader has read my last post.

This was a week full of segfaults, failed assertions, changes in the user data and other crazy stuff. I started working on the allocation of memory on the freed lines. In the first glance, it seemed to be much harder than the small changes I’ve done to free memory in lines, because of the a never-seen-before-by-me datatypes being dealed with in alloc_for_copy(), from rts/sm/Evac.c: gen_workspace. But in a short time the code of alloc_for_copy() has shown itself to be simple, and it was easy to do an initial implementation. I went in the same path I was going on before, trying the simplest solution that works. For instance, the line representation used is not ideal, because groups of lines are not considered, but it was the simplest at that time, and I plan to improve it latter.

The gen_workspace data type contains a pointer to an area of a block that is not being used (todo_free) and a pointer to the end of that area (todo_lim). When a space for an object is requested, it’s allocated in this area, and todo_free is adjusted. If the area is full, a new block is requested. My intended change was to return the first free line of the generation when the object is smaller or equal to the size of a line, and use the current approach otherwise. This is the simplest way I could think, but has the problem that the only one object is allocated per line. This was a known issue.

    if(size <= BITS_IN(W_) && gen->first_line != NULL) {
        to = gen->first_line;
        gen->first_line = (StgPtr) *gen->first_line;
        return to;

    ws = &gct->gens[gen->no];

After that I got the first round of segfaults. One problem I could spot so far was in the code in rts/sm/Sweep.c, related to the liberation of the lines. The block is allocated by need, and there’s a pointer to the first free byte, free. This free space is used by alloc_for_copy(). So, we should only think of lines in the region already allocated in the block, that is, where the address is smaller than free.

                    if(bd->u.bitmap[i] == 0 && bd->u.bitmap[i - 1] == 0 &&
                       start + BITS_IN(W_) <= bd->free)

But the segfaults, memory corruptions, etc, kept going on. I tried some restrictions, like allocating in lines only objects with the exact size of a line, or very small objects (size == 2), or allocating just one object for generation, or only one object at all. I also tried checking for the type of the allocated object, to see a correlation with the problems I was having. Nothing helped by now.

So I thought I should work on other things, so that maybe my mind gets clearer and I can spot the problem. I followed the suggestion of using todo_free and todo_lim to make it possible to allocate more than one object in a line. I liked this change, since I had the impression that it fits better with the rest of the code than the initial implementation, and it will be easier to addapt it when I improve the representation of the free lines. As I said, I wanted something to work while I think in how I’ll solve the segfault problems. The bad side of this choice is that I’ll not be able to test it, since it’s just a reimplementation of the same idea in the allocation code, and is not expected to work. The good size is that it may work, and solve the problems I was having by accident.

    if (ws->todo_free > ws->todo_lim) {
        if (size <= BITS_IN(W_) && gen->first_line != NULL) {
            ws->todo_free = gen->first_line;
            gen->first_line = (StgPtr) *gen->first_line;
            to = ws->todo_free;
            ws->todo_lim = ws->todo_free + BITS_IN(W_);
        } else {
            to = todo_block_full(size, ws);

I got the same results as with the older implementation, and no ideas about how to solve it (yet). So I thought about another thing to work on, which I’m talking about all through this post: improve the free line representation. This has the advantage of being testable, since it’s unrelated to the allocation code, and may give me an idea about how I can fix the segfaults. The bad side is that it probably won’t fix my problems directly. I’m working on it right now.

Following a suggestion, I’ve started using gdb to see what was happening in the GC instead of including printfs everywhere, and it’s been useful. I’ve noticed that sometimes printf is more effective, but sometimes gdb is much better.


Deixe um comentário

Preencha os seus dados abaixo ou clique em um ícone para log in:

Logotipo do

Você está comentando utilizando sua conta Sair /  Alterar )

Foto do Google+

Você está comentando utilizando sua conta Google+. Sair /  Alterar )

Imagem do Twitter

Você está comentando utilizando sua conta Twitter. Sair /  Alterar )

Foto do Facebook

Você está comentando utilizando sua conta Facebook. Sair /  Alterar )


Conectando a %s

%d blogueiros gostam disto: