Archive for July 2010

Immix on GHC Summer of Code weekly report #10

21/7/2010 (Wednesday)

My project.

This post assumes that the reader has read my last post.

This weekly report is late because I was too absorbed in the project to stop and write. Next week’s report will have to come earlier, since I’m going to New York on Sunday and will have to take some time off from the project.

First of all, I thought all the problems were gone as of my last post, but after a lot of testing I noticed a segfault (or some kind of memory corruption) that happens about one time in five when I run the real/maillist benchmark. I tried many combinations of RTS parameters (-C0 -A4k -i0 -DS and others) to make it behave deterministically, and it did, but without segfaults: every one of these parameters made the segfault disappear, so I could not reproduce it deterministically. My old debugging technique, printing a log to see where the program had been before the segfault happened, also makes the segfault disappear, so this one is much harder to debug than the previous ones. I’m still looking into it.

At one point this week I decided to pull the latest patches from GHC, which meant rebuilding the whole system. I probably rebuilt with a different mk/ configuration, so I got very distorted benchmark results when comparing runs from before this build with runs after it. Because the distortion only showed up in a benchmark where I had introduced some improvements I was very optimistic about, I naively thought the -46% result was real. After some reasonable disbelief from my mentor, I remembered the complete rebuild and suspected it was the cause. I reran the pre-rebuild benchmarks and got updated results.

I kept on benchmarking all week. At first I was using my usual system, but I noticed some very unexpected results, so I decided to run the benchmarks in single-user mode, with nothing else running in parallel. Under these conditions I believe the distortions are minimized. I organized my changes into sequential patches in the repository to make them easier to measure. It takes a long time to run nofib, especially gc/gc_bench. While organizing the code, I noticed the segfault was gone. I still don’t know what was causing it, but I’m not too worried, since it’s gone.

I’m planning to write a bigger report next week with the complete results of the benchmarks.

Immix on GHC Summer of Code weekly report #9

7/7/2010 (Wednesday)

My project.

This post assumes that the reader has read my last post.

I’m posting this weekly report earlier this week because there is already a lot to tell. I’ve found the reason behind the segfault I had been chasing for so long. The last problem, which is the only one where I know exactly when it was fixed (because the programs started working), was allocating twice in the same region of memory. This happened because the list of free line groups is regenerated in each major GC, but my old code kept allocating in the line group left over from the previous collection. So the tail of that line group, which had not been used yet, would become part of a line group generated in the new collection, and it would be used for allocation twice: once while allocating in the current line group, and again when allocation started in the new line group.
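A minimal sketch of this double allocation, using a simplified model in which the heap is an array of words and a line group is a range of word indices. The LineGroup and LineCursor types and the cursor_* functions are hypothetical names made up for illustration; they are not the RTS code.

```c
#include <assert.h>
#include <stddef.h>

#define NO_SPACE ((size_t)-1)

/* Hypothetical model: a free line group is the half-open word range [start, end). */
typedef struct {
    size_t start;
    size_t end;
} LineGroup;

/* Hypothetical allocation cursor into the current line group. */
typedef struct {
    size_t free;  /* next word to hand out */
    size_t lim;   /* one past the last usable word */
} LineCursor;

/* Start allocating in the given line group. */
void cursor_open(LineCursor *c, const LineGroup *g) {
    assert(c != NULL && g != NULL && g->start <= g->end);
    c->free = g->start;
    c->lim = g->end;
}

/* Bump-allocate n words; returns the first word index, or NO_SPACE. */
size_t cursor_alloc(LineCursor *c, size_t n) {
    assert(c != NULL);
    if (c->lim - c->free < n) {
        return NO_SPACE;
    }
    size_t p = c->free;
    c->free += n;
    return p;
}

/* Forget the current group. In this model it must be called whenever the
 * free line group list is regenerated, so that no allocation continues in
 * a group that belongs to the previous collection. */
void cursor_reset(LineCursor *c) {
    assert(c != NULL);
    c->free = 0;
    c->lim = 0;
}
```

With the group [0, 8), three words already allocated, and a regenerated list that contains [3, 8), a stale cursor hands out word 3 and so does a cursor opened on the new group. Calling cursor_reset() when the list is regenerated makes the stale allocation fail instead, so word 3 is handed out only once.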

Allocating memory in lines is not very complicated to implement, but it has some details that need attention; they caused most of the trouble in the last few weeks and still need improvement. Initially I was allocating one object per line, just to see if it would work. As it didn’t, I kept refining the approach until I could find the problem. The next attempt was to set ws->todo_free and ws->todo_lim in alloc_for_copy() in rts/sm/Evac.c. I think this is not ideal, because I didn’t want the code to become too inconsistent with the way memory was allocated through these pointers before my changes. So I created new variables, line_free and line_lim, at first in the gen_workspace ws, the same place where todo_free and todo_lim live, but because of the problem described in the previous paragraph I moved them to generation. I’m still not sure where these pointers belong; this is something that can be improved.
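To make the two pairs of pointers concrete, here is a small sketch of bump allocation that tries the free line group first and falls back to the block. The names todo_free, todo_lim, line_free and line_lim come from the text; the AllocState struct that holds them and the alloc_words() function are assumptions for illustration and do not reproduce alloc_for_copy().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t Word;

/* Hypothetical container for the two bump-pointer pairs. */
typedef struct {
    Word *todo_free, *todo_lim;  /* current to-space block */
    Word *line_free, *line_lim;  /* current free line group */
} AllocState;

/* Allocate n words: prefer the free line group, fall back to the block.
 * Returns NULL when neither has room (a real collector would then fetch
 * a new line group or a new block). */
Word *alloc_words(AllocState *s, size_t n) {
    assert(s != NULL);
    if ((size_t)(s->line_lim - s->line_free) >= n) {
        Word *p = s->line_free;
        s->line_free += n;
        return p;
    }
    if ((size_t)(s->todo_lim - s->todo_free) >= n) {
        Word *p = s->todo_free;
        s->todo_free += n;
        return p;
    }
    return NULL;
}
```

Keeping a separate pair for lines leaves the meaning of todo_free and todo_lim unchanged, which is the consistency concern mentioned above.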

Another problem that took me a long time to understand is that an object needs to be scavenged after being allocated. When it was allocated at todo_free, it was scavenged by scavenge_block() in rts/sm/Scav.c, because the block it lives in, todo_bd, is scavenged by this function. I didn’t want the whole block containing the free line group to be scavenged again, so I didn’t want to send it to this function. I thought about creating a way to scavenge only part of a block, that is, the space in the free line group that was allocated. This is still a valid idea, but I noticed that it was easier to use the mark stack. So I mark the object that is allocated in the line and push it onto the mark stack. The main problem with this approach is that it’s only possible to allocate in lines during major GCs, since the mark stack is only active in this kind of GC. This is certainly where I can make the most improvement.
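The mark-and-push idea can be sketched with a toy mark stack. Everything here (Obj, MarkStack, mark_and_push, drain_mark_stack) is a hypothetical model, not the RTS mark stack; "scavenging" is reduced to setting a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MARK_STACK_MAX 64

/* Hypothetical object header: just the two bits this sketch needs. */
typedef struct {
    bool marked;
    bool scavenged;
} Obj;

/* Hypothetical mark stack; `active` is true only during a major GC. */
typedef struct {
    Obj *items[MARK_STACK_MAX];
    size_t top;
    bool active;
} MarkStack;

/* Mark an object just allocated in a line and queue it for scavenging.
 * Returns false when line allocation cannot be used, i.e. the mark stack
 * is inactive (minor GC) or full. */
bool mark_and_push(MarkStack *ms, Obj *o) {
    assert(ms != NULL && o != NULL);
    if (!ms->active) {
        return false;
    }
    if (o->marked) {
        return true;  /* already queued or processed */
    }
    if (ms->top == MARK_STACK_MAX) {
        return false;
    }
    o->marked = true;
    ms->items[ms->top++] = o;
    return true;
}

/* Pop and "scavenge" every queued object; returns how many were processed. */
size_t drain_mark_stack(MarkStack *ms) {
    assert(ms != NULL);
    size_t n = 0;
    while (ms->top > 0) {
        Obj *o = ms->items[--ms->top];
        o->scavenged = true;
        n++;
    }
    return n;
}
```

The `active` check is where the limitation shows up: in this model a minor GC has no mark stack, so mark_and_push() refuses and the caller has to allocate in a block instead.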

The patch with these changes and another one for the sanity checking explained in the last post are available.

I’m now benchmarking these changes with nofib, to see how much they affect performance.

Immix on GHC Summer of Code weekly report #8

5/7/2010 (Monday)

My project.

This post assumes that the reader has read my last post.

This week I could focus on the project again, since most of my classes are over. I investigated the segfault that was happening in GHC with +RTS -w -DS, and I noticed that the code in rts/sm/Sanity.c assumes that all objects in the allocated area of a block are in use:

    for (; bd != NULL; bd = bd->link) {
        p = bd->start;
        /* walk every word between the start of the block and bd->free */
        while (p < bd->free) {
            nat size = checkClosure((StgClosure *)p);
            /* This is the smallest size of closure that can live in the heap */
            ASSERT( size >= MIN_PAYLOAD_SIZE + sizeofW(StgHeader) );
            p += size;
            /* skip over slop */
            while (p < bd->free &&
                   (*p < 0x1000 || !LOOKS_LIKE_INFO_PTR(*p))) { p++; }
        }
    }

Since this is true for copying collection and mark-compact, the error only happened with mark-sweep. So far, the only way I have managed to make the segfault disappear is to mark the swept blocks with a new flag and avoid running this code on them.
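The workaround can be sketched as a heap walk that skips flagged blocks. The Block struct, the SKETCH_BF_SWEPT flag and both functions are hypothetical stand-ins; the real block descriptor and the flag added to the RTS are not shown here, and the closure check itself is reduced to a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit meaning "this block was swept and may contain holes". */
#define SKETCH_BF_SWEPT 0x1u

/* Hypothetical, stripped-down block descriptor. */
typedef struct Block {
    struct Block *link;
    unsigned flags;
} Block;

/* Called by the (hypothetical) sweeper on every block it sweeps. */
void mark_swept(Block *bd) {
    assert(bd != NULL);
    bd->flags |= SKETCH_BF_SWEPT;
}

/* Walk a block list the way the sanity check does, but skip swept blocks,
 * whose allocated area may contain dead objects. Returns the number of
 * blocks that would actually be checked. */
size_t check_heap_skipping_swept(const Block *bd) {
    size_t checked = 0;
    for (; bd != NULL; bd = bd->link) {
        if (bd->flags & SKETCH_BF_SWEPT) {
            continue;
        }
        /* the per-closure walk over the block would go here */
        checked++;
    }
    return checked;
}
```

The cost of this approach is that swept blocks get no sanity checking at all, which is why it reads as a stopgap rather than a fix.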

So I went back to my old problem with allocating memory in lines. I noticed that (one of) the problem(s) may be that an object allocated in a free line is not scavenged after evacuation. When an object is allocated in a block, using the current allocation method, it will eventually be scavenged, because all blocks that were used for allocation are scavenged. I’m planning to implement a list of lines that need to be scavenged, plus code to scavenge the lines in this list and the current line. It will be very similar to the code that does this with blocks.
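One possible shape for such a list, as a rough sketch only: the plan comes from the text, but the PendingLine and PendingList types, the fixed capacity and both functions are assumptions, and scavenging is reduced to counting the words that would be visited.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 32

/* Hypothetical record of an allocated word range [start, end) inside a line. */
typedef struct {
    size_t start;
    size_t end;
} PendingLine;

/* Hypothetical fixed-capacity list of lines waiting to be scavenged. */
typedef struct {
    PendingLine items[MAX_PENDING];
    size_t count;
} PendingList;

/* Remember a filled line for later; returns 0 on success, -1 if the list is full. */
int pending_add(PendingList *l, size_t start, size_t end) {
    assert(l != NULL && start <= end);
    if (l->count == MAX_PENDING) {
        return -1;
    }
    l->items[l->count].start = start;
    l->items[l->count].end = end;
    l->count++;
    return 0;
}

/* Scavenge every pending line plus the allocated part of the current line,
 * [cur_start, cur_free). Returns the number of words visited and empties
 * the list. */
size_t scavenge_pending(PendingList *l, size_t cur_start, size_t cur_free) {
    assert(l != NULL && cur_start <= cur_free);
    size_t words = 0;
    for (size_t i = 0; i < l->count; i++) {
        words += l->items[i].end - l->items[i].start;
    }
    l->count = 0;
    words += cur_free - cur_start;
    return words;
}
```

Only the allocated ranges are visited, so the rest of the block that contains each line is never scavenged a second time.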