30 Aug 2013

Clojure puzzler #2

Seven lines of Clojure, numbered for easy reference. From which lines do you get a reflection warning?

(set! *warn-on-reflection* true)                          ; 1
                                                          ; 2
(Thread. (fn [] (prn 1)) "foo")                           ; 3
(Thread. :x "foo")                                        ; 4
(Thread. (if true (fn [] (prn 1)) (fn [] (prn 2))) "foo") ; 5
(Thread. (if true (fn [] (prn 1)) :x) "foo")              ; 6
(Thread. (if true :y :x) "foo")                           ; 7

31 Jul 2013

Clojure puzzler #1

Which of the following fails and why?
(hash-map (gensym "A") 1 (gensym "B") 2)
(hash-map (gensym) 1 (gensym) 2)
{(gensym "A") 1 (gensym "B") 2}
{(gensym) 1 (gensym) 2}
Hint (hover to reveal): Clojure works by first reading the code into a Clojure data structure and then evaluating it. What does the Clojure reader do when it encounters a map literal? What does (read-string "{(:a) 1 (:a) 2}") do?

25 Jun 2013

Clojure style: accessing maps and records

There are three ways of accessing the member :key of a map my-map:
(:key my-map)
(my-map :key)
(get my-map :key)
What are the drawbacks and advantages of each, and which one should you use? What should a style guide recommend?

Since Clojure is still a young language, there are not too many style guides available. Two popular sources for style guidance are Bozhidar Batsov's guide and clojure.org's Library Coding Standards. Both of them recommend the first form, (:key my-map). The ZenRobotics style guide (as yet unpublished) takes an opposing view: you should use the second form, (my-map :key).

What's wrong with (:key my-map)? Of course it works just fine:
user=> (def my-map {:foo 1 :bar 2})
user=> (:foo my-map)
The problem is that it works just as fine for too many cases:
user=> (:foo (java.util.LinkedList. [1 2 3]))
user=> (:foo [1 2 3])
user=> (:foo nil)
whereas (my-map :foo) gives you an error for these inputs:
user=> ((java.util.LinkedList. [1 2 3]) :foo)
ClassCastException java.util.LinkedList cannot be cast to clojure.lang.IFn  user/eval27 (NO_SOURCE_FILE:12)
user=> ([1 2 3] :foo)
IllegalArgumentException Key must be integer  clojure.lang.APersistentVector.invoke (APersistentVector.java:261)
user=> (nil :foo)
CompilerException java.lang.IllegalArgumentException: Can't call nil, compiling:(NO_SOURCE_PATH:14)
An error in each of these cases is good, because usually you shouldn't be looking for the key :foo in a linked list or a vector, and the sooner you discover it, the less time you'll spend debugging why some other code is getting a nil input. The (get my-map :foo) form has the same problem: it allows non-maps as input and just returns nil.

Of course, a style guide must balance various concerns, one of which is conciseness. Our code does have multiple instances of calls like
(filter :foo list-of-maps)
where #(% :foo) would be considered just too verbose. Similarly, sometimes you do want to make your code work with various different types and just get nil when the type is not applicable. A good thing about keeping (my-map :foo) as the general rule is that uses like this stand out, so if you're looking for why you're only getting nil values somewhere, you immediately think of list-of-maps being not actually maps at all - perhaps a case of making wrong code look wrong.

Often even better than calling maps as functions is to define a version of get that ensures errors for another kind of mistakes:
(defn get*
  "Like clojure.core/get but throws an error if
  result is not found."
  [m k]
  (let [result (get m k ::not-found)]
    (if (= result ::not-found)
      (throw (Error. (format
         "Key %s not found among %s" k (keys m))))
Now (get* {:foo 1 :bar 2} :baz) is an error, so using get* guards against typos in map keys.

Once code matures, it is common to replace hash-maps with records. With records the same rule won't work:
user=> (defrecord MyRecord [x y z size value material])
user=> (def r (->MyRecord 4.2 5.2 -0.9 7.0 9 :plastic))
user=> r
#user.MyRecord{:x 4.2, :y 5.2, :z -0.9, :size 7.0, :value 9, :material :plastic}
user=> (:x r)
user=> (r :x)
ClassCastException user.MyRecord cannot be cast to clojure.lang.IFn  user/eval108 (NO_SOURCE_FILE:23)
user=> (get r :x)
The get* function defined above works fine even for records, but an intriguing alternative is to take advantage of the Java interface associated with each record type and use Java interop to access fields:
user=> (.x r)
user=> (.x [1 2 3])
IllegalArgumentException No matching field found: x for class clojure.lang.PersistentVector  clojure.lang.Reflector.getInstanceField (Reflector.java:271)
The immediate problem with this is that some unrelated objects could have an x field or method, but since we are looking for type safety, it would be recommended to tag the record object with the interface - usually in the function signature, but we can demonstrate with let:
user=> (let [^MyRecord my-record r] (.x my-record))
user=> (defrecord OtherRecord [x y velocity])
user=> (def r' (->OtherRecord 1 2 3))
user=> (let [^MyRecord my-record r'] (.x my-record))
ClassCastException user.OtherRecord cannot be cast to user.MyRecord  user/eval158 (NO_SOURCE_FILE:46)
Since this can compile into a straightaway field access, it is actually very slightly faster than using the keyword as a function:
user=> (time (dotimes [_ 1000000]
        (let [^MyRecord my-record r] (.x my-record))))
"Elapsed time: 3.081 msecs"
user=> (time (dotimes [_ 1000000]
        (let [^MyRecord my-record r] (:x my-record))))
"Elapsed time: 6.24 msecs"
But there is a far more devious problem with this idea:
user=> r
#user.MyRecord{:x 4.2, :y 5.2, :z -0.9, :size 7.0, :value 9, :material :plastic}
user=> (.size r)
How did that happen? The value should be 7 but we get 6.

The reason is that MyRecord is, among other things, a java.util.Collection:
user=> (ancestors MyRecord)
#{clojure.lang.IMeta clojure.lang.Associative clojure.lang.Counted clojure.lang.IObj java.io.Serializable clojure.lang.ILookup java.lang.Object clojure.lang.IRecord java.lang.Iterable clojure.lang.Seqable clojure.lang.IPersistentCollection java.util.Map clojure.lang.IKeywordLookup clojure.lang.IPersistentMap}
... and Collection has a nullary method named size, and since apparently method calls take precedence over field accesses, that was a call to the size method, which told us that the record had six entries.

For records, we have so far decided to live with get* though some code does use the (.field record) form, which means we have to take care not to name any record field one of these:
user=> (defrecord Empty [])
user=> (sort (map :name (filter #(and (instance? clojure.reflect.Method %) (empty? (.parameter-types %))) ((reflect Empty) :members))))
(clear count empty entrySet getBasis hashCode isEmpty iterator keySet meta seq size values)
Out of these, mainly size and count are really plausible names for record fields. The danger, of course, is that future versions of Clojure add more methods.

13 May 2013

Our Interview Process

Our (Software Engineer) Interview Process

TL;DR Google style technical interviews (google for it)

In this post I try to explain the process you should expect if you are applying to work as a software engineer at ZenRobotics and why we are doing it that way. The process is different for HW engineers and interns (but those processes have similar goals).

Process structure and schedule

This process holds if we have official open positions. If we don't, the time scales may vary a lot - but do send us an application: we'll always consider people and normally new positions open up after a few months.
  1. You send us an application (for details, see zenrobotics.com/jobs)
  2. 2--4 weeks later we will contact you to either schedule an A-round interview or to let you down gently
  3. The A-round is one 1-hour interview in which we try to ascertain you can code and that there is a reasonable cultural fit (and you can ask questions about us)
  4. 1--2 weeks after the A-round we'll contact you to schedule B-round interviews or to let you down gently
  5. The B-round is 3--4 technical interviews, 1 hour each. Each interview is given by a different member of staff and consists of 1--2 technical problems you will get to solve on the whiteboard. Each interviewer gives an independent -2 .. +2 judgement which are combined into a hire/no-hire decision.
  6. 1--4 weeks after the B-round we'll contact you to schedule salary negotiations and signing of a contract or to let you down gently.
The cultural fit is mostly about contributing in a technology startup - "Smart and Gets Things Done" is probably a close-enough description.

Our technical questions

We work with intelligent robots. They are made intelligent by software, but to make the software intelligent you need some other technical skills (than software skills) too. In our interviews we try out the following kinds of problems (not necessarily all of them for everybody, and we may add some probes into skills you claim on your CV):
  • Actual nuts-and-bolts programming (think FizzBuzz) in one language you get to choose, most typically in the A-round.
  • Algorithms and data structures (think graphs, trees, hashes rather than bloom filters)
  • Software design (think a simple API for a simple component)
  • Math and physics (think high-school, rather than 3-body)
  • Statistics and probability theory (think Monty Hall and Bayes' rule, rather than convergence of estimators)
  • Ability to explain some basic technology (how does binary work, how do two processes communicate over a network)

Why do we do it this way?

We want to get several independent reasonably reliable datapoints. Several so that you don't fail because of a single mismatch with an interviewer. The reliability is supposed to come from actually getting you to solve problems rather than to make claims of your ability.

(Do note that for us the cost of hiring the wrong person is pretty high and the cost of not hiring a good candidate is pretty low; for you it's the other way round. Also: we have to be able to make judgements with a reasonable amount of effort).

We want very good people (but not necessarily (just) rock stars). The hire/no-hire doesn't depend on what team you'll start in - we want the bar to be high enough that you are a good fit to any team. However, different teams have different balances between the different technical skills named above and how well you do in the different questions will have an effect in where we try to start you out in.

We try to start easy but have ways of turning up the difficulty if you are doing well. This is so that we can gauge not only whether you are 'good enough' but we can distinguish between good and better.

19 Apr 2013

Bit banging WS2811 LED strips with PWM


Up until now, I have been a mild-mannered software developer, but recently I was bitten by the electronics bug. Now the spare room in our apartment is filled with resistors, capacitors and ICs with serial numbers looking like foreign license plates.

And LEDs. I didn't have a clear plan what I was going to do, but I have a 1.5 year old son, so I suspect making some blinkenlichts at home is going to get me at least umpteen million World's Best Daddy points. I bought different kinds: water-proof strips, plain strips, individual pixels, holiday decoration lights… all with some sort of digital control.
Which brings us to today's topic. Some of the LED strips use the WS2811 control chip. While its close relative, the WS2801, uses SPI, the WS2811 has its own serial protocol, which rules out most of the hardware features in microcontrollers. So, it's bit banging time.

I'd like to stress that I was doing this as a hobby project. There are very good solutions for the WS2811 control already, such as Alan Burlison's WS2811.h or the FastSPI_LED2 library. The inspiration for reinventing the wheel came from Alan's remark:
Hmm, that means that driving these with a simple C routine is unlikely to be sufficient. I spent a bit of time looking to see if there was any sort of hardware assist that could be brought to bear, but even SPI at 4MHz, close to the maximum that the MCU can support, wouldn't be fast enough as it would still be necessary to marshal each byte into a series of 5-bit patterns to get the timings right for the WS2811 protocol.
Challenge accepted.


The communications protocol is serial: the chip picks out the first 24 bits arriving after 50 μs reset pause, and forwards the following bits to the next chip in the chain. Each bit starts with a low-high transition, and the bit value is determined by the timing of the high-low transition. So we're sending a pulse of varying width; if only we could module pulse width somehow
After examining the datasheet for the Atmega328, I noticed this tidbit:
The OCR0x Registers are double buffered when using any of the Pulse Width Modulation (PWM) modes. For the normal and Clear Timer on Compare (CTC) modes of operation, the double buffering is disabled. The double buffering synchronizes the update of the OCR0x Compare Registers to either top or bottom of the counting sequence.
That's perfect for our needs.  The plan is simple: the timer module will be set for a 20 tick cycle (OCR2A) and the PWM limit value (OCR2B) is adjusted depending on the bit value. Thanks to the double-buffering, we have 20 cycles to prepare the next bit, so it becomes feasible writing this in C.

Of course, the devil is in the details. If we update the PWM limit value too fast, we'll lose the bits, so we need to synchronize the updates to the timer overflow. That's about 4 cycles. Then we need to do the update, clear the timer overflow and so on. In total, we use about 13 cycles, so we need to load the next bit in 7 cycles, including loop termination check.

To achieve that, I unrolled an update of a byte into 8 single-bit updates. This gets compiled rather nicely, as the Atmega assembly language has an opcode called SBRS: skip if register bit set. The byte can be kept in the register unchanged without any need for shifting or masking.

The loading of the next byte also takes time. If I used a regular for loop to iterate over an array, the compiler would happily put all the code at the end and mess up the timings between bytes. So I split the operations into smaller bits (advance pointer, dereference pointer, check for termination) and slotted them between bytes.

The C code, then, looks like this:
    bang_bit(b, 4);
    bang_bit(b, 3);
    nextByte = *data;

and the generated assembly looks like (sync and bang_bit are inlined):
    ;; Wait for overflow
    sbis    0x17, 0
    rjmp    .-4
    ;; Bit set or not?
    sbrc    r25, 4
    rjmp    .+4  
    ldi     r24, 0x04
    rjmp    .+2  
    ldi     r24, 0x10
    ;; Update OCR2B
    sts     0x00B4, r24
    ;; Reset overflow
    ldi     r24, 0x01
    out     0x17, r24
    ;; Advance pointer
    adiw    r28, 0x01
    ;; And the next bit 
    sbis    0x17, 0
    rjmp    .-4     
    sbrc    r25, 3
    rjmp    .+4  
    ldi     r24, 0x04
    rjmp    .+2
    ldi     r24, 0x10
    sts     0x00B4, r24
    ldi     r24, 0x01
    out     0x17, r24
    ;; Load next byte
    ld      r16, Y

After a few passes of checking the generated code and tweaking the C code in response, I ended up with the worst-case longest interval between bits being 19 cycles.

Testing on the hardware revealed some instability on the first LED; turns out I had mistyped one of the timing constants. Fix that, upload and hey presto! Blinkenlichts!

The code is up at GitHub; it involves some calls to Arduino library for delays and such. I was too lazy to look those up from the datasheet.

"But how is this related to ZenRobotics?" I hear you ask. I'll let you know after the Housewarming Party.


You'll note that the generated code is still less than optimal. In an effort to save registers, r24 gets reused. Keeping the magic constant 0x01 in a dedicated register would save a cycle per bit.
The "bit set" tests also waste cycles with all the jumping around.


Turns out the timings I had were slightly off. They worked fine when I was blasting at full power, but when there were just one or two bits set per byte, the whole thing would go dark. I had to change just one constant, and then things worked.

That's the good side of this approach. I can easily modify the pulse lengths and the length of the bit if needed. If I resort to writing assembly by hand, I can probably squeeze the minimum cycle length to 15 or so.

11 Apr 2013

Gerrit but not gerrit

We like code reviews. Well, at least officially we all like code reviews. Or anyway they are mandatory.

We evaluated a bunch of possible review tools at some point (rietveld, Review Board and gerrit). We would have been happy with rietveld from a user's point of view but we didn't want to ship our source code to AppEngine (and the gae2django project that would allow it to be run on our own server wasn't very mature then). Out of Review Board and gerrit we preferred gerrit.

We liked the gerrit UI and tools (ssh auth and commands over ssh with JSON output are fabulous) but we really didn't like the gerrit workflow.

Gerrit wants to run your main repository and handle the merges into it. The reviewable changes have to be single commits, not branches, so you basically need to rebase your branches for review and merge. (I can understand that, it's kind of nice to review commits as units of change).

How we have been working is that intermediate results are regularly pulled between people and developers are attached to their intermediate commits in their feature branches.

So we use gerrit without the workflow. This comes in two parts: synthetic review commits and merging outside gerrit.

Synthetic review commits

The commits we push to gerrit are created by doing something like:

branch=$(git rev-parse --abbrev-ref HEAD)

git fetch origin
git checkout -b gerrit-review origin/master
git merge --squash $branch
git commit -a
... edit commit message ...
git push gerrit HEAD:refs/for/master/$USER-$branch

(with error checking, checking for an existing review request and re-using the Change-ID, some related magic in the git hooks etc.)

Merging outside gerrit

Instead of pushing Submit on the gerrit review request, our merge master pulls the original branch that was used to create the review request (actually, we inject a field into the review request with the commit hash), merges that into a working copy, runs merge tests and pushes to our regular shared repository. There's a post-receive hook in that repo that force-pushes onto the gerrit repository and all is well.

28 Mar 2013

REPL for a robot

As a new employee of ZenRobotics, I wanted to share some first-hand experiences of controlling an industrial robot with a REPL.

One nice thing about programming in Lisp is the REPL, i.e. the interactive loop in which you can enter code and get immediate feedback about what the code does. One nice thing about programming with robots is that you get very physical feedback about what your code does. Naturally, in a Lisp shop that
programs robotic systems you want to combine these.

Every now and then one gets the opportunity to connect to our recycling robot with a REPL. The other day I got to do this with a ZenRobotics Recycler Heavy Picker. This was the first time I actually interacted with the robot, and it was such an interesting experience to sit in the recycler control room with blinkenlichten around, that I wanted to share some examples. Below I have three examples showing the Clojure REPL and for each, a video of the robot. There are many reasons why ZenRobotics is a cool place to work, and the robot REPL is certainly one of these!

Some simple movements

First, some simple movements. Here's a video, and the REPL input and output is below.

;; First, a couple of small movements to the right and left. The first three items in the parameter 
;; vector are x y z in meters. The x axis points towards the camera, the y axis to the right, and z is up.
repl=> (.moveToAxisPtpAsync *io* (.tryIk *io* [0 0.3 0 1 0 0 0]))
repl=> (.moveToAxisPtpAsync *io* (.tryIk *io* [0 0 0 1 0 0 0]))

;; Then a few moves in a sequence. This is much less verbose with a helper function.
repl=> (defn move-xyz [x y z] (.moveToAxisPtpAsync *io* (.tryIk *io* [x y z 1 0 0 0])))
repl=> (doseq [y [-0.5 0.5 0 0.2 -0.2]] (move-xyz 0 y 0))
;; We use quaternions to represent rotations:
;; http://en.wikipedia.org/wiki/Quaternion
;; Euler angles are probably more widely used for this
;; purpose but quaternions are much nicer because they
;; avoid the singularities and they are easier to compose.
;; But quaternions may not be as intuitive for simple
;; movements. I used a handy (deprecated) function to convert 
;; Euler angles to quaternions. The first parameter to this 
;; function is xyz, and  the the first item in the second 
;; parameter is roughly the robot wrist rotation angle in 
;; radians.

repl=> (transform-from-xyz-abc-deprecated [0 0 0] [0 0 0])
[0.0 0.0 0.0 1.0 0.0 0.0 0.0]
repl=> (transform-from-xyz-abc-deprecated [0 0 0] [0.4 0 0])
[0.0 0.0 0.0 0.9800665778412416 0.0 0.0 0.19866933079506122]
repl=> (.moveToAxisPtpAsync *io* (.tryIk *io* (transform-from-xyz-abc-deprecated [0 0 0] [0.4 0 0])))

;; Try out the gripper jaw with the close-jaw and open-jaw functions.
repl=> (use 'com.zenrobotics.control.jaw)
repl=> (close-jaw *io*)
repl=> (open-jaw *io*)

Picking and throwing some waste with the REPL

Then to get some action I actually picked up a piece of waste wood and threw it in the wood bin.

;; For these movements I had to set the position and angle, so I defined a new helper
;; function which also has a rotation angle parameter.
repl=> (defn move-xyza [x y z a] (.moveToAxisPtpAsync *io* (.tryIk *io* (transform-from-xyz-abc-deprecated [x y z] [a 0 0]))))

;; ...
;; Then I did quite a bit of boring moving around to find the coordinates
;; of the long piece of wood. After finding suitable coordinates I ran the
;; commands below in a sequence to show the picking and throwing action.

;; Move to a position above the target piece.

repl=> (move-xyza 0.5 0.4 0 0.75)
;; Go down.
repl=> (move-xyza 0.5 0.4 -0.36 0.75)
;; Close the gripper jaw.
repl=> (close-jaw *io*)
;; Move up a bit.
repl=> (move-xyza 0.5 0.4 -0.25 0.75)
;; Throw it in the wood bin. The throwing would be hard to get right with the 
;; simple movement commands, so I threw it with a test function.
repl=> (test-throw-trajectory-from-acc *io* :wood-bin 0.4 0.7)

Another throw

The little pile of waste looks interesting, so for another example, I tried to pick up some wood out of
the pile in a more random fashion.

repl=> (move-xyza 0.1 0.35 -0.25 0)
repl=> (close-jaw *io*)
repl=> (move-xyza 0.1 0.35 0 0)
repl=> (test-throw-trajectory-from-acc *io* :wood-bin 0.4 0.7)

19 Mar 2013

Introducing the ZenRobotics R&D Blog

Short and sweet: we are launching a blog, by the technical staff, for other technical people, to:
  • Share what it is like to work as an engineer at ZenRobotics so you'll want to come and join us (and some guidance on how you would actually do that).
  • Allow our staff to share karma with the company and vice versa by describing the cool things we do.
  • Create documentation where none exists: we think one of the best reasons to blog about something is when you had a problem, Google didn't come up with any answers and you figured it out yourself.