So, I had this function going on (after macro expansion and some type substitution) to scale a vector by a constant:
(deftype float-type () 'double-float) (deftype vector-type () '(vector float-type)) (defun vs (scale vec) (declare (type float-type scale)) (declare (type vector-type vec)) (declare (optimize (speed 3) (safety 0))) (map vector-type #'(lambda (x) (declare (type float-type x)) (the float-type (* x scale))) vec)))
Straight forward enough, right? I've got to scale a vector, so I'm going to scale each component (with lots of type cruft) and plop the results back into a vector.
I did some timing tests comparing this to the version with
no type cruft. The results were dishearteningly similar.
But, having just read how terrible
is when you are trying to
(map-into 'list ...), I figured I'd see what I could
do there first even though I was using
(map vector-type ...).
It turned out that when I changed my
(deftype vector-type () 'vector)
Suddenly, both the typed and non-typed versions got 10x faster and the typed version was 20% faster.
In the non-typed version, the compiler whines and whines that it can't optimize this and can't optimize that. Some of it, I have no idea where it's getting the pieces it's trying to optimize. For example, where in the above code would this message get triggered?
; ; note: unable to ; convert x*2^k to shift ; due to type uncertainty: ; The first argument is a NUMBER, not a INTEGER. ; The second argument is a NUMBER, not a INTEGER. ;
Anyhow, that's not the point here. The point is, I cannot figure out how to get rid of this last warning in the typed version.
; ; note: doing float to pointer coercion (cost 13) to "
I even jumped through some extra hoops to use
after reading a section of some compiler's manual about the advantages
of local function calls. Alas, still no dice.
Alas, it seems that even where the result type is known the whole way through the calculation, the compiler still needs to have the type flagged for a return value and there are no bits left over in a float to co-opt for this purpose. Apparently, other compilers have some ways around this. *shrug*
But, we have a winner with
(map-into ...) wrapped
(flet ...). Holy Heap Handling, Batman. Here's
(defun vs (scale vec) (declare (type float-type scale)) (declare (type vector-type vec)) (declare (optimize (speed 3) (safety 0))) (let ((ret (make-array (length vec) :element-type 'float-type))) (flet ((scaler (x) (declare (type float-type x)) (* x scale))) (map-into ret #'scaler vec))))
So, the runtimes (for 10,000,000 calls) are:
- vs (non-typed,map): 3.548 seconds
- vs (typed,map): 2.894 seconds
- vs (typed,flet,map-into): 0.973 seconds
The big difference? Somehow or other, the memory usage of the
(map ...) is insanely worse. It looks like they are
dynamically resizing the result vector.
First, consider that I was using a vector of length 4. So, if we
resized each time, we'd alloc 1 + 2 + 3 + 4 = 10 double-floats. If we
alloced once, we'd just need 4 double-floats. Keep that 10 to 4 ratio
in mind when looking at these numbers:
- vs (non-typed,map): 1,040,005,296 bytes consed
- vs (typed,map): 1,040,003,784 bytes consed
- vs (typed,flet,map-into): 399,997,656 bytes consed
The real killer is that the HyperSpec page
(map-into ...) shows a
which does the single allocation. The same
method they use could easily be used to figure the output size for
could be implemented like this