I’ve tried to see how slower can ghc be compared to gcc in compiling a simple
imperative program. I’ve written this small test program in C:
/* loop.c */
#include
int
main(int argc, char **argv)
{
int i, j, k, t = 0;
for(i = 0; i < 1000; i++)
for(j = 0; j < 1000; j++)
for(k = 0; k < 1000; k++)
if(i * i + j * j + k * k % 7 == 0)
t++;
printf("%d\n", t);
return 0;
}
And a similar version in imperative Haskell, being as similar as possible to
the original C code:
-- loop.hs
import Control.Applicative
import Control.Monad
import Data.IORef
main :: IO ()
main
= do
i <- int
j <- int
k <- int
t <- int
for (i =: 0) (i <: 1000) (inc i)
$ for (j =: 0) (j <: 1000) (inc j)
$ for (k =: 0) (k <: 1000) (inc k)
$ do
i_ <- readIORef i
j_ <- readIORef j
k_ a -> IO ()
(=:) = writeIORef
( IORef a -> a -> IO Bool
i <: v = (< v) readIORef i
inc :: Enum a => IORef a -> IO ()
inc i = modifyIORef i succ
for :: IO () -> IO Bool -> IO () -> IO () -> IO ()
for preface condition step code
= preface >> for_ condition step code
for_ :: IO Bool -> IO () -> IO () -> IO ()
for_ condition step code
= condition >>= flip when (code >> step >> for_ condition step code)
A functional version:
-- functional.hs
main :: IO ()
main
= print
$ length
[i
| i <- [0 :: Int .. 999]
, j <- [0 .. 999]
, k <- [0 .. 999]
, i * i + j * j + k * k `mod` 7 == 0]
And a version using Template Haskell.
The results: The time to build the code with TH is not the best, but the execution time The results in a table:
{-# LANGUAGE TemplateHaskell #-}
import Language.Haskell.TH.Syntax
main :: IO ()
main
= print
$ $(lift $ length
[i
| i <- [0 :: Int .. 999]
, j <- [0 .. 999]
, k <- [0 .. 999]
, i * i + j * j + k * k `mod` 7 == 0])
$ time gcc -O3 -o loop -Wall loop.c
real 0m0.256s
user 0m0.148s
sys 0m0.040s
$ time ./loop
143
real 0m13.652s
user 0m11.629s
sys 0m0.052s
$ time ghc --make -O3 loop.hs
real 0m0.209s
user 0m0.160s
sys 0m0.020s
$ time ./loop
143
real 3m46.834s
user 3m24.313s
sys 0m0.712s
$ time ghc --make -O3 functional.hs
[1 of 1] Compiling Main ( functional.hs, functional.o )
Linking functional ...
real 0m1.420s
user 0m1.016s
sys 0m0.220s
$ time ./functional
143
real 1m29.057s
user 1m25.809s
sys 0m0.224s
$ time ghc --make -O3 th.hs
[1 of 1] Compiling Main ( th.hs, th.o )
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package array-0.3.0.0 ... linking ... done.
Loading package containers-0.3.0.0 ... linking ... done.
Loading package pretty-1.0.1.1 ... linking ... done.
Loading package template-haskell ... linking ... done.
real 90m54.226s
user 87m32.024s
sys 0m14.785s
marcot@zezinho:~/codigo/haskell$ time ./th
143
real 0m0.005s
user 0m0.000s
sys 0m0.004s
surely is. Maybe we could use this technic to improve the execution time of all
of the codes, interpreting a part of the code in build time automaticly,
without user indication.
Version
Build time (s)
Execution time (s)
C
0.256
13.652
Imperative Haskell (IORefs)
0.209
136.834
Functional Haskell
1.420
89.057
Template Haskell
5454.226
0.005
13/3/2010 (sábado) às 20:58 |
What’s the size of the template binary?
13/3/2010 (sábado) às 22:51 |
Your first Haskell code sample seems a bit mangled.
13/3/2010 (sábado) às 23:00 |
Yes, they’re all messed up, sorry by that. A link to them:
http://people.debian.org/~marcot/blog/loop.c
http://people.debian.org/~marcot/blog/loop.hs
http://people.debian.org/~marcot/blog/functional.hs
http://people.debian.org/~marcot/blog/th.hs
The sizes of the binary are: 6.6K (C), 675K (IORef), 676K (Functional) and 689K
(TH).
13/3/2010 (sábado) às 23:57 |
given the run time for template haskell it looks like the compiler unrolled the loops. I doubt you would get the same results with dynamic data instead of static.
14/3/2010 (domingo) às 4:40 |
Can you say what compiler versions and hardware you’re using?
14/3/2010 (domingo) às 8:58 |
Dale, you’re right, this can only be used with static data.
$ gcc –version
gcc (Debian 4.4.3-3) 4.4.3
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ ghc –version
The Glorious Glasgow Haskell Compilation System, version 6.12.1
$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Duo CPU T5750 @ 2.00GHz
stepping : 13
cpu MHz : 1994.810
cache size : 2048 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm lahf_lm
bogomips : 3989.62
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Duo CPU T5750 @ 2.00GHz
stepping : 13
cpu MHz : 1994.810
cache size : 2048 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm lahf_lm
bogomips : 3990.09
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
$ free -m
total used free shared buffers cached
Mem: 3014 1566 1447 0 102 786
-/+ buffers/cache: 677 2336
Swap: 2999 0 2999
$ uname -a
Linux zezinho 2.6.32-2-amd64 #1 SMP Fri Feb 12 00:01:47 UTC 2010 x86_64 GNU/Linux
I hope this answers the question.
14/3/2010 (domingo) às 9:09 |
[...] interesting code using Template Haskell While writing the post about performance, I got interested in Template Haskell, specially in the $(lift) construction. Investigating a [...]