Random Distributed Scalar Encoder

Experiments on improving the Scalar Encoder

Author: Fergal Byrne (fergalbyrnedublin@gmail.com)

Library: v0.1.1-SNAPSHOT

Date: 18 July 2014

Website: https://github.com/fergalbyrne/clortex

Generated By: MidjeDoc

1 Background

Chetan Surpur gave a great talk on a new version of the Scalar Encoder for NuPIC (video). This document explores the implications of the new design and makes some recommendations about further improvements.

A Scalar Encoder is a function which converts a scalar value v into a bit map sdr with a given number bits of possible bits, of which width bits are 1 and the rest are 0. One important property of the SDRs is that values close together share many more bits than values further apart.

Here's a simple example:

(def enc (s/scalar-encoder :bits 12 :on 4 :minimum 1 :maximum 12))

(def to-bitstring (:encode-to-bitstring enc))

(vec (map to-bitstring (range 1 13)))

=> ["111100000000"
    "011110000000"
    "001111000000"
    "000111100000"
    "000111100000"
    "000011110000"
    "000001111000"
    "000000111100"
    "000000111100"
    "000000011110"
    "000000001111"
    "000000001111"]

1.1 The Current Scalar Encoder

The current scalar encoder (see example above) represents each scalar using a sliding window of on 1's. The current scalar encoder has the benefit of being instantly understandable (once you see an example like the one in the first section) and visually decodable by the user. It does, however, have a number of limitations.

The encoder is given a min and max when defined and it clamps its output when given values outside its initial range.

(to-bitstring -20) => "111100000000"
(to-bitstring 20) => "000000001111"

This means that the above two values come to represent many values, and the region receiving the encoding will not be able to discriminate between these values.

The second limitation is that the encoding is very inefficient in its use of available bits. The number of possible on-bits out of bits bits is given by the binomial coefficient:

(binomial 12 4) => 495

which is significantly larger than the number of distinct SDRs produced by the scalar-encoder:

(count (set (map to-bitstring (range -100 100)))) => 9

This encoder is only using 1.8% of the SDRs available. Perhaps this is an extreme example because of the very small number of bits. Let's check the numbers for a more typical encoder:

(def enc (s/scalar-encoder :bits 512 :on 21 :minimum 1 :maximum 512))
(def to-bitstring (:encode-to-bitstring enc))
(str (binomial 512 21)) => "10133758886507113849867996785041062400"
(count (set (map to-bitstring (range -100 1000)))) => 492

Oh dear.

1.2 The Random Distributed Scalar Encoder

Chetan's talk explains a new approach. The idea is to use a set of 'buckets' to represent a set of intervals in the encoders range, and to encode each one using a randomly chosen set of on bits. The property of semantic closeness is achieved by only changing one bit at a time when choosing the bits for the next bucket along.

Let's implement an RDSE. We'll need a utility function so we always get the same SDRs.

(def randomer (random-fn-with-seed 123456)) ; returns the same sequence of random numbers
(randomer 10) => 3
(randomer 10) => 7

1.2.1 Basic Random Distributed Scalar Encoder - Implementation

The following is a basic RDSE implemented in Clojure. It starts off empty, and adds buckets lazily as new values are encoded. Please let me know if there is anything unclear in this code!.

(defn bottom-sorter
    "sorts buckets by their bottom value"
    [x y]
    (let [c (compare (x :bottom) (y :bottom))]
         (if (zero? c)
             (compare x y)
             c)))

(defn ordered-bins [bins] (sort-by :bottom bins))

(defn find-bucket
    "returns the bucket which covers value. updates the 'read' slot of the bucket"
    [^double value buckets]
    (when-let [bucket (first (filter #(<= (:bottom %) value (:top %)) (:bins buckets)))]
        bucket)); removed from find-bucket to make it pure; (swap! buckets update-in [:bins (:index bucket) :read] inc)

(defn new-bucket
    "returns a bucket map given centre, radius and index"
    [^double value ^double radius ^long index]
    {:bottom (- value radius) :top (+ value radius) :index index :counter 1 :read 0})

(defn min-distance
    "used to accumulate the best nearby bucket when searching"
    [acc a-bucket]
    (let [diff (min (abs-diff (:mine acc) (:bottom a-bucket))
                    (:best acc))]
         (if (< diff (:best acc))
             (conj acc {:index (:index a-bucket) :best diff})
             acc)))

(defn sdrs [bins] (reduce conj #{} (map :sdr bins)))

(defn bottom-of-buckets [bins] (reduce min (map :bottom bins)))

(defn top-of-buckets [bins] (reduce max (map :top bins)))

(defn n-bins [buckets] (count (:bins buckets)))

(defn search-starter [bucket] {:index nil :best Integer/MAX_VALUE  :mine (:bottom bucket)})

(defn sdr->bitstring [sdr bits] (apply str (vec (map #(if (contains? (set sdr) %) 1 0) (range bits)))))

(defn new-sdr
    [bucket buckets]
    (let [bins (:bins buckets)
          ^int on (:on buckets)
          ^int bits (:bits buckets)
          randomer (:randomer buckets)]
         (if (empty? bins)
             (vec (range on))
             (let [sorted-bins (sort-by :bottom bins)
                   above? (> (:bottom bucket) (:bottom (first sorted-bins)))
                   nearest-buckets
                       (if above?
                           (vec (reverse (drop (- (count sorted-bins) on) sorted-bins)))
                           (vec (take on sorted-bins)))
                   nearest-bits (vec (sort (reduce #(union %1 (set (:sdr %2))) #{} nearest-buckets)))
                   previous-sdr (:sdr (first nearest-buckets))
                   previous-sdr (if above? previous-sdr (vec (reverse previous-sdr)))
                   remove-bit (previous-sdr (inc (randomer (dec on))))
                   remove-bit (previous-sdr (randomer on))
                   same-bits (vec (disj (set previous-sdr) remove-bit))
                   free-bits (vec (difference (set (range bits)) (set nearest-bits)))
                   new-bit-pos (randomer (count free-bits))
                   new-bit (free-bits new-bit-pos)
                   new-sdr (vec (sort (conj (set same-bits) new-bit)))]
                new-sdr))))

(defn add-to-buckets!
    [buckets bucket]
    (let [bits (:bits @buckets)
         sdr (new-sdr bucket @buckets)
         sdr-bucket (conj bucket {:sdr sdr})
         ;bitstring (sdr->bitstring sdr bits)
         ;sdr-bucket (conj bucket {:sdr sdr :bitstring bitstring})
         ]
       (swap! buckets update-in [:bins] conj sdr-bucket)
       sdr-bucket))

(defn add-bucket!
    [value buckets]
    (let [diameter (:diameter @buckets)
          radius (/ diameter 2.0)
          mn #(bottom-of-buckets (:bins @buckets))
	      mx #(top-of-buckets (:bins @buckets))]
      (if (empty? (:bins @buckets))
        (let [bucket (new-bucket value radius (n-bins @buckets))]
          (add-to-buckets! buckets bucket))
        (do (while (> value (mx))
	           (add-to-buckets! buckets (new-bucket (+ (mx) radius) radius (n-bins @buckets))))
	        (while (< value (mn))
	           (add-to-buckets! buckets (new-bucket (- (mn) radius) radius (n-bins @buckets))))))))

(defn random-sdr-encoder-1
    [& {:keys [^double diameter ^int bits ^int on] :or {diameter 1.0 bits 127 on 21}}]
    (let [randomer
            (random-fn-with-seed 123456)
          buckets
            (atom {:diameter diameter :bits bits :on on :randomer randomer :bins []})
          encode!
            (fn [^double x]
                (if-not (find-bucket x @buckets) (add-bucket! x buckets))
                (sort (:sdr (find-bucket x @buckets))))
          encode-to-bitstring!
            (fn [^double x]
                (sdr->bitstring (encode! x) bits))]
        {:buckets buckets
         :encode encode!
         :encode-to-bitstring! encode-to-bitstring!
         :bits bits
         :on on}))

(defn precalculate [x f & {:keys [center step] :or {center 0.0 step 10.0}}]
    (let [bands (inc (int (/ x step)))]
       (dorun (for [band (map inc (range bands))]
           (do (f (- center (* step band)))
               (f (+ center (* step band)))
;(println (str "precalculating [" (- 0 (* step band)) "," (* step band) "]"))
           ))))
    (f x))

(defn random-sdr-encoder
    [& {:keys [^double diameter ^int bits ^int on ^double center]
        :or {diameter 1.0 bits 127 on 21 center 0.0}}]
    (let [randomer
            (random-fn-with-seed 123456)
          buckets
            (atom {:diameter diameter :bits bits :on on :randomer randomer :bins []})
          encode!
            (fn [^double x]
                (if-not (find-bucket x @buckets) (add-bucket! x buckets))
                (sort (:sdr (find-bucket x @buckets))))
          encode (fn [^double x] (precalculate x encode! :center center :step (* 10 diameter)))
          encode-to-bitstring!
            (fn [^double x]
                (sdr->bitstring (encode! x) bits))]
        {:buckets buckets
         :encode encode
         :encode-to-bitstring! encode-to-bitstring!
         :bits bits
         :on on}))

1.2.2 Testing the RDSE on 4-of-12-bit encoding

Let's re-run the tests for the simple 12 bit encoder, where we had very quickly saturated the basic scalar-encoder.

(def encoder (r/random-sdr-encoder-1 :diameter 1.0 :bits 12 :on 4))
(def buckets (:buckets encoder))
(def to-bitstring (:encode-to-bitstring! encoder))

(to-bitstring 1) => "111100000000"

(vec (map to-bitstring (range -5 22)))
=> ["000010111000"
     "010010101000"
     "010010001001"
     "011010000001"
     "011001000001"
     "011100000001"
     "111100000000"
     "111000001000"
     "011000001010"
     "011000101000"
     "010000101100"
     "000100101100"
     "000101101000"
     "000101001010"
     "000101010010"
     "010001010010"
     "001001010010"
     "000001110010"
     "000000110110"
     "100000110100"
     "000000111100"
     "010000111000"
     "001000111000"
     "001000011001"
     "101000010001"
     "100000010011"
     "100000010101"]

(count (set (map to-bitstring (range -500 500)))) => 436

As we can see, this encoder is capable of storing 436 out of 495 passible encodings (note that each pair of encoded buckets differs only in one bit, so this is pretty good).

1.2.3 A larger encoding: 21-of-128 bits

Let's see how much capacity we can get with a more typical 128 bit encoding (standard 21 bits on).

(def encoder (r/random-sdr-encoder-1 :diameter 1.0 :bits 128 :on 21))
(def buckets (:buckets encoder))
(def to-bitstring (:encode-to-bitstring! encoder))
(str (binomial 512 21)) => "10133758886507113849867996785041062400"
(to-bitstring 0) => "11111111111111111111100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
(to-bitstring 1) => "11111111111111111111000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000000000000"

We'll put 10000 values into the encoder.

(def n 10000)
(to-bitstring n) => "10000000001010001100000000000100000000000010000011000000101010000000000000011100000001000110010000000000000000010000000000010000"
(println (time (count (set (map to-bitstring (range n))))))
; "Elapsed time: 34176.316 msecs"
(count (set (map to-bitstring (range n)))) => n

OK, everything's looking good. We could try using a smaller encoding and see if it still works.

(def encoder (r/random-sdr-encoder-1 :diameter 1.0 :bits 64 :on 21))

(def buckets (:buckets encoder))

(def to-bitstring (:encode-to-bitstring! encoder))

(str (binomial 64 21))

=> "41107996877935680"

(to-bitstring 0)

=> "1111111111111111111110000000000000000000000000000000000000000000"

(to-bitstring 1)

=> "1111011111111111111110000000000000000010000000000000000000000000"

(def n 1000)
(to-bitstring (dec n)) => "1010100100000010010000100001000000001000000110000011110001111101"
(to-bitstring n) => "1000100100000010010000100001000000001000001110000011110001111101"
(time (count (set (map to-bitstring (range n)))))
; "Elapsed time: 406.187 msecs"
(count (set (map to-bitstring (range n)))) => n
(count (set (map to-bitstring (range 20000)))) => 20000

1.2.4 Encoding is dependent on order of data

Observe the following

(def to-bitstring (:encode-to-bitstring! (r/random-sdr-encoder-1 :diameter 1.0 :bits 12 :on 4)))
(to-bitstring 0) => "111100000000"
(to-bitstring 1) => "111000000001"
(to-bitstring 2) => "101001000001" ;; first encoding of 2
;; reset
(def to-bitstring (:encode-to-bitstring! (r/random-sdr-encoder-1 :diameter 1.0 :bits 12 :on 4)))
(to-bitstring 0) =>  "111100000000"
(to-bitstring 1) =>  "111000000001"
(to-bitstring -1) => "110101000000"
(to-bitstring 2) =>  "101010000001" ;; a different encoding of 2

The encoding of 2 thus depends on the sequence of buckets already set up when 2 is presented. The only way to avoid this is to always build the buckets in the same order, regardless of what data is presented:

(def to-bitstring (:encode-to-bitstring! (r/random-sdr-encoder-1 :diameter 1.0 :bits 12 :on 4)))
(defn to-bitstring-pre [x] (precalculate x to-bitstring))

(to-bitstring-pre 0) => "001000110100" ;; causes (-10,10) to be encoded in advance
(to-bitstring-pre 1) => "100000110100"
(to-bitstring-pre 2) => "100001110000" ;; first encoding of 2
;; reset
(def to-bitstring (:encode-to-bitstring! (r/random-sdr-encoder-1 :diameter 1.0 :bits 12 :on 4)))
(defn to-bitstring-pre [x] (precalculate x to-bitstring))
(to-bitstring-pre 0) =>  "001000110100"
(to-bitstring-pre 1) =>  "100000110100"
(to-bitstring-pre -1) => "001010110000"
(to-bitstring-pre 2) =>  "100001110000" ;; the same encoding of 2

1.3 Conclusions and Further Improvements

It appears from the above tests that indeed the RDSE may be a significant improvement on the current scalar encoder. It comes a lot closer to exploiting the capacity of SDRs in the CLA.

We should investigate how we should measure the effectiveness and efficiency of our choice of encoders and encodings. While the RDSE looks promising, how can we be sure? Perhaps the swarming algorithms would be able to help us here.

This implementation, as a first draft, seems to be reasonably performant. There may be come tweaks which will speed it up a good bit (such as removing all the searching code in new-sdr and keeping a running cache of nearby SDRs).

Anyone interested in implementing this in C++ and Python, please do so and let us know how you get on.