Patrick (patrickwonders) wrote,

  • Mood:

Vector math optimization continues...

So, I had this function going on (after macro expansion and some type substitution) to scale a vector by a constant:

(deftype float-type  () 'double-float)
(deftype vector-type () '(vector float-type))
(defun vs (scale vec)
    (declare (type float-type  scale))
    (declare (type vector-type vec))
    (declare (optimize (speed 3) (safety 0)))
    (map vector-type
	#'(lambda (x) (declare (type float-type x))
			(the float-type (* x scale)))

Straight forward enough, right? I've got to scale a vector, so I'm going to scale each component (with lots of type cruft) and plop the results back into a vector.

I did some timing tests comparing this to the version with no type cruft. The results were dishearteningly similar. But, having just read how terrible sbcl's performance is when you are trying to (map-into 'list ...), I figured I'd see what I could do there first even though I was using (map vector-type ...).

It turned out that when I changed my vector-type to the weaker:

(deftype vector-type () 'vector)

Suddenly, both the typed and non-typed versions got 10x faster and the typed version was 20% faster.

In the non-typed version, the compiler whines and whines that it can't optimize this and can't optimize that. Some of it, I have no idea where it's getting the pieces it's trying to optimize. For example, where in the above code would this message get triggered?

; note: unable to
;   convert x*2^k to shift
; due to type uncertainty:
;   The first argument is a NUMBER, not a INTEGER.
;   The second argument is a NUMBER, not a INTEGER.

Anyhow, that's not the point here. The point is, I cannot figure out how to get rid of this last warning in the typed version.

; note: doing float to pointer coercion (cost 13) to ""

I even jumped through some extra hoops to use (flet ...) after reading a section of some compiler's manual about the advantages of local function calls. Alas, still no dice.

Alas, it seems that even where the result type is known the whole way through the calculation, the compiler still needs to have the type flagged for a return value and there are no bits left over in a float to co-opt for this purpose. Apparently, other compilers have some ways around this. *shrug*

But, we have a winner with (map-into ...) wrapped in (flet ...). Holy Heap Handling, Batman. Here's that code:

(defun vs (scale vec)
    (declare (type float-type  scale))
    (declare (type vector-type vec))
    (declare (optimize (speed 3) (safety 0)))
    (let ((ret (make-array (length vec) :element-type 'float-type)))
	    (flet ((scaler (x)
			(declare (type float-type x))
			(* x scale)))
		(map-into ret #'scaler vec))))

So, the runtimes (for 10,000,000 calls) are:

  • vs (non-typed,map): 3.548 seconds
  • vs (typed,map): 2.894 seconds
  • vs (typed,flet,map-into): 0.973 seconds

The big difference? Somehow or other, the memory usage of the (map ...) is insanely worse. It looks like they are dynamically resizing the result vector. First, consider that I was using a vector of length 4. So, if we resized each time, we'd alloc 1 + 2 + 3 + 4 = 10 double-floats. If we alloced once, we'd just need 4 double-floats. Keep that 10 to 4 ratio in mind when looking at these numbers:

  • vs (non-typed,map): 1,040,005,296 bytes consed
  • vs (typed,map): 1,040,003,784 bytes consed
  • vs (typed,flet,map-into): 399,997,656 bytes consed

The real killer is that the HyperSpec page for (map-into ...) shows a map-into could be implemented like this which does the single allocation. The same method they use could easily be used to figure the output size for map. Wheee....

Tags: lisp
  • Post a new comment


    default userpic

    Your reply will be screened

    When you submit the form an invisible reCAPTCHA check will be performed.
    You must follow the Privacy Policy and Google Terms of use.