Jakob Østergaard Hegelund

Tech stuff of all kinds

Bitten by quotes

2016-08-25

I got bittem by quotes twice in a day recently. When "obviously correct" code doesn't work, debugging quickly gets frustrating. In both cases I had super simple obviously correct code that malfunctioned.

Naturally, in both instances, my code was obviously wrong once I got to take a step back.

Modifying literals

So I was writing test cases for a newly written procedure which destructively modifies a parameter passed to it. This itself is an interesting concept in LISP and it's one of these pretty big "little differences" to C++ where my experience really is.

Let's digress. In C++ you can pass arguments by value or by reference; but can choose to pass a pointer to an object - this is technically passing by value (you pass the pointer by value) but the code that gets generated by the compiler is more like when you pass by reference (which passes a pointer). However, in C++ you would stick a const on every argument your function shouldn't modify (which can cause confusion when you stick a const on a pointer-type value but the function then modifies the non-const data that the constant-valued pointer is pointing at; but poor code and lack of insight can cause mistakes in any language of course).

In LISP on the other hand, you always pass references to data (except when you pass literals) and there's no const equivalent. Effectively that means any procedure can modify the arguments it's passed. Of course you don't generally want that, and the standard library has conventions on procedure names that modify versus those that don't (e.g. subst vs. nsubst) - so this is not really a problem in the real world.

Anyway, I wrote a function that modifies a subset of an s-expression representing an XML document; it modifies the given document in-place and returns the modified document. The implementation is not important but the snippet below shows that in one situation we modify the cdr of the document (sexp) given.

(defun sexp-add-parameter (sexp key value) " ... " (declare (type list sexp) (type string key) (type string value)) ... ;; We don't have an arg list; insert one (setf (cdr sexp) (cons (list :@ (cons key value)) (cdr sexp))))) sexp)

My test code uses the 5am framework and I had two tests that looked like:

;; Add parameter to empty element with non-empty parameter list (5am:is (equal '("foo" (:@ ("a" . "b") ("key" . "value"))) (sexp-add-parameter '("foo" '(:@ ("a" . "b"))) "key" "value"))) ;; Add parameter to non-empty element with empty parameter list (5am:is (equal '("foo" (:@ ("key" . "value")) "baz") (sexp-add-parameter '("foo" (:@) "baz") "key" "value")))

So in short these two tests supply S-expressions that represent simple XML documents and the sexp-add-parameter procedure is called to insert (or alter) a given parameter (or attribute if you will) on the top-level element.

On my developer machine, executing this code from the editor (C-x C-e from Emacs using SLIME integration to the running SBCL LISP system) just worked. But the CI system failed the build! Re-building locally with the build-system optimization settings also failed the test.

That just sucks; obviously correct code that works on one optimization setting and fails on another. Argh! I trust the SBCL implementation, it is my general impression that it is quite mature, so I didn't want to believe this was an optimizer error. But debugging stuff like this... Argh... The debugger stack traces showed some really strange things like my simple quoted parameters being different from what the code said. So for example, evaluating the call (sexp-add-parameter '("foo" (:@) "baz")) would result in a debugger stack trace with a top-level call being (sexp-add-parameter '("foo" (:@ ("a" . "b")) "baz")) for example. Super super strange - I mean, you do the most simple thing; you execute a call - then your debugger claims the top-level is different from the call you just evaluated. Ouch.

Anyway, after a bit of hair pulling, I eventually ended up locating this little tidbit in the CLHS about the quote operator. It says The consequences are undefined if literal objects (including quoted objects) are destructively modified. Well d'oh. What was I thinking? This simple change fixed everything:

;; Add parameter to empty element with non-empty parameter list (5am:is (equal '("foo" (:@ ("a" . "b") ("key" . "value"))) (sexp-add-parameter (list "foo" (list :@ '("a" . "b"))) "key" "value"))) ;; Add parameter to non-empty element with empty parameter list (5am:is (equal '("foo" (:@ ("key" . "value")) "baz") (sexp-add-parameter (list "foo" (list :@) "baz") "key" "value")))

Ever since I learned using the quote operator, I just saw it as a shorthand for list (and cons). It never once occurred to me that list returns a freshly consed (freshly allocated) list, whereas the quote operator produces a list literal. And in most situation I could get away with my ignorance because "usually" I don't provide literal lists to procedures that modify their arguments. Lesson learned. More importantly I think I need to rename this procedure to make it clear that it modifies its argument.

Redefining procedures

When working in the LISP REPL you quickly get used to redefining procedures and re-evaluating calls to them - this is probably the primary reason I like working in a LISP environment; the escape from the old edit-compile-run cycle. But you quicly get spoiled - I got used to redefining my procedures and calling them and expecting them to actually be redefined. Well... It turns out that this does not always work exactly like I had imagined.

Let's say we have a report generator that can generate a number of different reports. Let's further entertain the idea that it holds a data structure that describes the report generators and also refers to the actual report generator procedures (for easy invocation from some "driver" or scheduling procedure). It could look like this:

(defparameter *reporters* `(,(make-repgen ... :generator #'repgen/accounts ...) ,(make-repgen ... :generator #'repgen/accounts-full ...)))

So in this example I have a list that contains two repgen structure instances. Each instance has a number of members, one of them being generator which is initialized to one of two report generator functions; repgen/accounts and repgen/accounts-full respectively.

While working on this system, I did some corrections to my repgen/accounts procedure, redefined it, and then called the report generator. Much to my surprise my changes did not take effect. Of course, I tried this a number of times with variations by my changes kept not taking effect. Again, this was a case of the simplest thing not working - very difficult to debug.

The following example may shed some light on this; a simple REPL session:

* (defun foo () "hi") FOO * (defparameter fun #'foo) FUN * (funcall fun) "hi" * (defun foo () "there") STYLE-WARNING: redefining COMMON-LISP-USER::FOO in DEFUN FOO * (funcall fun) "hi" *

I would have expected the second call to return "there" of course. Again, the CLHS to the rescue; the page about the Sharpsign Single-Quote and the function. It turns out that using #'foo results in the definition of the function foo, not a "dynamic link" to whatever definition is currently pointed to by the function by that name. So in other terms it's akin to static linking.

Of course when you know it, it's simple enough. I just re-evaluate my defparameter as well, after redefining procedures it reference.

What to take away...

I didn't read the HyperSpec from start to finish before starting programming LISP. And if I had, I wouldn't remember it anyway. In general I think it has been pretty smooth sailing - the language being so syntactically small makes it relatively easy to get started and doing relatively complex things (simple macros are not syntactically hard - they can be hard to wrap your mind around, but that is because of the concept, not so much because of the syntax). Of course I get bitten by my own misunderstandings and lack of insight at times, but frankly I'm surprised at how well it goes in general. This day was special - getten bitten by the simplest things, twice. Special enough to get space here.