# CljElixir: A Clojure-Syntax Language for the BEAM

## Overview

CljElixir is Clojure for the BEAM. It combines Clojure's syntax, data-driven philosophy, and core vocabulary with the BEAM runtime: OTP, supervision trees, hot code swapping, distributed computing, and lightweight processes.

CljElixir compiles to Elixir AST and delegates to the Elixir compiler for macro expansion, protocol consolidation, and BEAM bytecode generation. This means full interop with Elixir and Erlang libraries: any BEAM module is one `module/function` call away.

BEAM's native data structures are already persistent and immutable with structural sharing: maps (HAMT for >32 keys), lists (cons cells with tail sharing), and MapSets. CljElixir uses them directly. The one addition is PersistentVector (bit-partitioned trie for O(log32 n) indexed access), ported from ClojureScript. A core vocabulary (`get`, `assoc`, `dissoc`, `reduce`, `first`, `rest`, etc.) built on extensible protocols provides a uniform interface across all types.

---

## Architecture

```
.clje source files
      │
      ▼
┌─────────────┐
│   Reader    │  S-expressions → CljElixir AST (Clojure-shaped)
└─────┬───────┘
      │
      ▼
┌──────────────┐
│   Analyzer   │  AST validation (arity, recur position, map literals)
└─────┬────────┘
      │
      ▼
┌──────────────┐
│ Transformer  │  CljElixir AST → Elixir AST ({atom, meta, args} tuples)
└─────┬────────┘
      │
      ▼
┌──────────────┐
│   Elixir     │  Macro expansion, protocol consolidation,
│   Compiler   │  BEAM bytecode generation
└──────────────┘
```

Key insight: Elixir's AST is already a tagged s-expression: every node is `{operation, metadata, arguments}`. CljElixir's transformer maps from one s-expression format to another. The Elixir compiler handles everything hard.

Implementation: A Mix compiler that reads `.clje` files, transforms them to Elixir AST, and feeds them into the standard compilation pipeline.

### Analyzer

The analyzer validates the CljElixir AST before transformation. It catches structural errors early with source-mapped diagnostics:

- **Special form arity:** `defmodule`, `let`, `if`, `case`, `cond`, `loop`, `fn`, `defn` are checked for correct argument count
- **Map literals:** must have an even number of forms (key-value pairs)
- **Recur position:** `recur` must be in tail position within `loop` or `defn`
- **Binding vectors:** `let`/`loop` must have even-length binding pairs
- **Receive clauses:** validates pattern/body or pattern/guard/body structure

### Protocol Compilation

Both `extend-type` and `extend-protocol` compile to Elixir `defimpl` calls. The mapping is mechanical: `defimpl` is always a `(protocol, type, fns)` triple, and both Clojure forms decompose into a list of those triples:

```
extend-type:     type fixed, iterate over protocols → N defimpls
extend-protocol: protocol fixed, iterate over types → N defimpls
```

Example:

```clojure
;; CljElixir source:
(extend-type Map
  ILookup
  (-lookup
    ([m k] (Map/get m k))
    ([m k not-found] (Map/get m k not-found)))

  ICounted
  (-count [m] (map-size m)))
```

```elixir
# Emitted Elixir AST (two defimpl nodes):
defimpl ILookup, for: Map do
  def lookup(m, k), do: Map.get(m, k)
  def lookup(m, k, not_found), do: Map.get(m, k, not_found)
end

defimpl ICounted, for: Map do
  def count(m), do: map_size(m)
end
```

`defprotocol` compiles directly to Elixir's `defprotocol`. `reify` compiles to an anonymous struct + inline `defimpl` calls. The transformer hoists `defimpl` nodes to module scope when they appear inside function bodies.

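
The decomposition into `(protocol, type, fns)` triples described above can be sketched independently of the compiler. The Python below is only an illustration, not CljElixir's transformer: the helper names `extend_type` and `extend_protocol`, and the choice to represent method bodies as opaque strings, are assumptions made for this example.

```python
from typing import Dict, List, Tuple

# One triple corresponds to one emitted `defimpl`.
# Method bodies are kept as opaque strings (an assumption of this sketch).
Triple = Tuple[str, str, Dict[str, str]]


def extend_type(type_name: str, impls: Dict[str, Dict[str, str]]) -> List[Triple]:
    """Type fixed, iterate over protocols."""
    return [(protocol, type_name, fns) for protocol, fns in impls.items()]


def extend_protocol(protocol: str, impls: Dict[str, Dict[str, str]]) -> List[Triple]:
    """Protocol fixed, iterate over types."""
    return [(protocol, type_name, fns) for type_name, fns in impls.items()]


# Mirrors the extend-type example: one type, two protocols.
by_type = extend_type("Map", {
    "ILookup": {"-lookup": "(Map/get m k)"},
    "ICounted": {"-count": "(map-size m)"},
})

# One protocol, two types.
by_protocol = extend_protocol("ICounted", {
    "Map": {"-count": "(map-size m)"},
    "List": {"-count": "(length l)"},
})
```

Both helpers return the same shape, which is why a single code path can turn either form into `defimpl` nodes.
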

---

## Syntax Reference

### Modules and Functions

```clojure
;; ns declares the module: all top-level forms become module members
(ns Greeter
  (:require [clje.core :refer :all]))

(defn hello [name]
  (str "hello " name))

;; defmodule is available for scripts with multiple modules
(defmodule Greeter
  (defn hello [name]
    (str "hello " name)))

;; Multi-clause pattern matching (same shape as multi-arity)
(defn process
  ([[:ok data]] (handle data))
  ([[:error reason]] (log reason)))

;; Multi-arity
(defn greet
  ([name] (greet name "hello"))
  ([name greeting] (str greeting " " name)))

;; Both together: multi-clause + multi-arity
(defn handle
  ([[:ok data]] (process data))
  ([[:error reason]] (log reason))
  ([[:error reason] opts] (log reason opts)))

;; Private functions
(defn- internal-helper [x]
  (* x 2))

;; Anonymous functions
#(* % 2)            ;; single arg
#(+ %1 %2)          ;; multiple args
(fn [x y] (+ x y))  ;; explicit form
```

### Data Literals

```clojure
;; === BEAM-native data (maps, lists, sets: already persistent) ===
{:name "Ada" :age 30}   ;; Erlang map (HAMT for >32 keys)
'(1 2 3)                ;; Erlang list (cons cells, tail sharing)
#{:a :b :c}             ;; Elixir MapSet

;; === CljElixir-provided ===
[1 2 3]                 ;; PersistentVector (bit-partitioned trie)

;; === Raw BEAM tuple ===
#el[:ok value]          ;; Erlang tuple

;; Keywords
:ok
:error
:shutdown

;; Strings
"hello world"

;; Regex (Clojure style)
#"pattern"
#"^\d{3}-\d{4}$"
```

Maps, lists, and sets are BEAM-native types. They flow freely between CljElixir and Elixir/Erlang with zero conversion. PersistentVector is the one CljElixir-specific type. `#el[]` creates BEAM tuples directly.

Vectors in pattern position match BEAM tuples (since incoming messages and Elixir interop always produce tuples):

```clojure
[:ok data]   ;; in case/receive: matches tuple {:ok, data}
[x y]        ;; in defn: parameter list
[x 1 y 2]    ;; in let: binding pairs
```

### Module Calls (FFI)

All BEAM module calls use the same `module/function` syntax.
The compiler distinguishes Elixir from Erlang modules by case. This isn't an invented rule; it follows how module names already look on the BEAM. Erlang modules are lowercase atoms (`:crypto`, `:ets`). Elixir modules are capitalized aliases that expand to atoms with an `Elixir.` prefix (`:"Elixir.Enum"`, `:"Elixir.Map"`).

```clojure
;; Elixir modules (uppercase → :"Elixir.Enum".map(...))
(Enum/map list func)
(String/split s ",")
(Map/merge m1 m2)
(GenServer/start-link MyServer args)

;; Erlang modules (lowercase → :crypto.strong_rand_bytes(...))
(erlang/system-time :millisecond)
(crypto/strong-rand-bytes 16)
(ets/new :my-table [:set :public])
(io/format "hello ~s~n" ["world"])

;; No special syntax: it's all just module/function
```

### Dynamic Vars

```clojure
*self*   ;; current process (replaces self())
*node*   ;; current BEAM node (e.g. :"myapp@192.168.1.1")
```

### Metadata

```clojure
;; Module metadata via ^{} (defmodule form)
(defmodule ^{:author "Ada"} Greeter
  ...)
```

### Docstrings

Docstrings are metadata. The string between name and body is sugar for `^{:doc "..."}`:

```clojure
;; These are equivalent:
(defn hello
  "Greets someone by name"
  [name]
  (str "hello " name))

(defn ^{:doc "Greets someone by name"} hello [name]
  (str "hello " name))
```

Docstrings work everywhere metadata works (`defn`, `defmodule`, `defprotocol`, `defrecord`):

```clojure
(defmodule Greeter
  "A module for greeting people"

  (defn hello
    "Greets someone by name"
    [name]
    (str "hello " name)))

(defprotocol Describable
  "Protocol for human-readable descriptions"
  (describe [value] "Returns a description string"))

(defrecord User
  "A user in the system"
  [name age email])
```

---

## Records

CljElixir has `defrecord`. It does not have `deftype`.

Clojure's `deftype` exists for mutable fields: `set!` on fields inside method bodies, used to build data structure internals. BEAM has no in-place mutation of heap terms, so `deftype`'s core capability cannot be supported.

There is no `{:bare true}` mode either. On BEAM, a `defrecord` compiles to an Elixir struct, which is a map.
Stripping map interfaces from something that *is* a map just hides functionality for no reason. If it's immutable and it's a map, let it be a map.

### `defrecord`

`defrecord` compiles to an Elixir `defstruct` with auto-generated protocol implementations for `ILookup`, `IAssociative`, `IMap`, `ICounted`, `ISeqable`, `IEquiv`, `IHash`, `IMeta`, `IWithMeta`. You get keyword access, `assoc`, `dissoc`, equality, hashing, and destructuring for free.

```clojure
(defrecord User [name age email])

;; Positional constructor
(->User "Ada" 30 "ada@example.com")

;; Map constructor
(map->User {:name "Ada" :age 30 :email "ada@example.com"})

;; Keyword access (auto ILookup)
(:name user) ;; => "Ada"

;; Update (auto IAssociative)
(assoc user :age 31)

;; Equality (auto IEquiv)
(= (->User "Ada" 30 "ada@example.com")
   (->User "Ada" 30 "ada@example.com")) ;; => true

;; Destructuring
(let [{:keys [name email]} user]
  (str name " <" email ">"))
```

Records can implement additional protocols inline:

```clojure
(defrecord PriorityQueue [items comparator]
  ICounted
  (-count [_] (count items))

  ICollection
  (-conj [_ item]
    (let [new-items (sort-by comparator (cons item items))]
      (->PriorityQueue new-items comparator)))

  ISeqable
  (-seq [_] items))
```

Internal data structures use the same `defrecord`, since there's no reason to strip map interfaces from something that is a map:

```clojure
(defrecord VectorNode [edit arr])

(defrecord PersistentVector [meta cnt shift root tail]
  ICounted
  (-count [_] cnt)

  IIndexed
  (-nth [this n]
    (if (and (>= n 0) (< n cnt))
      (let [node (unchecked-array-for this n)]
        (nth node (bit-and n 0x01f)))
      (throw (str "Index " n " out of bounds"))))

  ICollection
  (-conj [this val]
    ;; ... append to tail or create new level
    )

  IMeta
  (-meta [_] meta)

  IWithMeta
  (-with-meta [_ new-meta]
    (->PersistentVector new-meta cnt shift root tail)))

;; VectorNode fields are accessible like any record
(:arr node)  ;; works
(:edit node) ;; works
```

### Impact of No `deftype`

**PersistentVector is buildable.** The persistent update operations (`assoc`, `conj`, `pop`) are path-copying by design, and `nth` only reads. You create new nodes along the changed path, sharing everything else. This doesn't require mutable fields:

```clojure
;; Path-copy: clone node, replace one slot
(defn clone-and-set [node i val]
  (let [arr (Map/get node :arr)
        new-arr (put-elem arr i val)]
    (->VectorNode (Map/get node :edit) new-arr)))
```

**Hash caching is eager.** ClojureScript caches hash codes lazily in a `^:mutable __hash` field. On BEAM, hashes must be computed at construction time or recomputed on every call.

---

## Schemas (Malli)

CljElixir uses Malli-style data-driven schemas instead of Elixir's `@spec` annotations. Schemas are plain data (maps, vectors, keywords), not macros or special syntax.

### Compile-Time Requirement

Schemas are "just data," but that data must be available at compile time to be useful. Top-level `def` forms are evaluated at compile time (same as Clojure), so named schemas are compile-time constants:

```clojure
;; These are compile-time values: the compiler can see them
(def User
  [:map
   [:name :string]
   [:age :int]
   [:email :string]])

(def PositiveInt [:and :int [:> 0]])

(def Status [:enum :active :inactive :pending])
```

`m/=>` is a macro that runs at compile time. It reads the schema data, emits an Elixir `@spec` attribute for Dialyzer, and optionally registers the schema for runtime validation. Because the schema is a compile-time constant, the macro can walk it immediately:

```clojure
(defn hello [name]
  (str "hello " name))

(m/=> hello [:=> [:cat :string] :string])
;; At compile time:
;; 1. Emits @spec hello(String.t()) :: String.t() for Dialyzer
;; 2. Registers schema for optional runtime validation
```

If a schema references another schema by name (`PositiveInt` inside `Config`), that name must resolve to a compile-time value, so it must be `def`'d before use:

```clojure
(def PositiveInt [:and :int [:> 0]])

(def Config
  [:map
   [:host :string]
   [:port PositiveInt] ;; resolved at compile time
   [:ssl? :boolean]])
```

### Function Schemas

```clojure
;; Single arity
(defn hello [name]
  (str "hello " name))
(m/=> hello [:=> [:cat :string] :string])

;; Multi-arity
(defn greet
  ([name] (greet name "hello"))
  ([name greeting] (str greeting " " name)))
(m/=> greet [:function
             [:=> [:cat :string] :string]
             [:=> [:cat :string :string] :string]])
```

### Validation and Coercion (Runtime)

These are runtime operations (the adapter phase) and are not in the initial implementation:

```clojure
;; Validate
(m/validate User {:name "Ada" :age 30 :email "ada@example.com"})
;; => true

;; Explain failures (only :age is invalid here)
(m/explain User {:name "Ada" :age "thirty" :email "ada@example.com"})
;; => {:errors [{:path [:age] :value "thirty" :schema :int}]}

;; Coerce
(m/coerce PositiveInt "42")
;; => 42
```

### Recursive Types

Malli handles recursion via `:ref` within a local `:registry`. Without `:ref`, schemas expand eagerly and overflow the stack:

```clojure
;; Recursive: linked list of ints
(def IntList
  [:schema {:registry {::cons [:maybe [:tuple :int [:ref ::cons]]]}}
   [:ref ::cons]])

;; Mutual recursion
(def PingPong
  [:schema {:registry {::ping [:maybe [:tuple [:= "ping"] [:ref ::pong]]]
                       ::pong [:maybe [:tuple [:= "pong"] [:ref ::ping]]]}}
   [:ref ::ping]])

;; Tree
(def Tree
  [:schema {:registry {::tree [:or :int [:tuple [:ref ::tree] [:ref ::tree]]]}}
   [:ref ::tree]])
```

The adapter maps these to Elixir recursive typespecs at compile time.
Registry entries become `@type` definitions, and `:ref` becomes a named type reference:

```clojure
(def Tree
  [:schema {:registry {::tree [:or :int [:tuple [:ref ::tree] [:ref ::tree]]]}}
   [:ref ::tree]])
;; emits:
;; @type tree :: integer | {tree, tree}
```

This works because Elixir/Dialyzer resolves named types lazily: the self-reference is fine as long as it's a named `@type`.

The initial adapter generates Elixir typespecs from Malli schemas at compile time:

```clojure
(m/=> hello [:=> [:cat :string] :string])
;; emits: @spec hello(String.t()) :: String.t()

(def User [:map [:name :string] [:age :int] [:email :string]])
;; emits: @type user :: %{name: String.t(), age: integer(), email: String.t()}
```

This gives you Dialyzer static analysis from day one. Full Malli (validation, coercion, generation) and clojure.spec support come later.

---

## Data Structures

### BEAM's Native Persistent Data Structures

Elixir/Erlang data structures are already persistent and immutable with structural sharing. This is not a Clojure-specific feature; it's how BEAM works:

**Maps (≤32 keys):** Flat sorted tuple of keys + contiguous values. Updates copy the entire structure, but it's small enough to be fast.

**Maps (>32 keys):** Hash Array Mapped Trie (HAMT), the same data structure Clojure uses for PersistentHashMap. Updates share common parts of the trie. Implemented in C inside the BEAM VM since OTP 18.

**Lists:** Singly-linked cons cells. Prepend is O(1) with tail sharing: `(cons x xs)` reuses the entire existing list. Same as Clojure's lists.

**Tuples:** Flat contiguous arrays. No structural sharing: any update copies the entire tuple. Used for small fixed-size groups (like `{:ok, value}`), not for collections.

This means CljElixir does **not** need to port Clojure's PersistentHashMap. BEAM already has one, implemented in C, battle-tested, and almost certainly faster than anything we'd write in CljElixir. The core vocabulary (`get`, `assoc`, `dissoc`, etc.)
dispatches through protocols directly to native BEAM operations.

### What BEAM Doesn't Have

**Persistent Vector.** BEAM has no equivalent of Clojure's PersistentVector, a bit-partitioned trie giving O(log32 n) indexed access, O(1) append, and structural sharing. Erlang tuples give O(1) indexed access but O(n) update. Erlang lists give O(1) prepend but O(n) indexed access. PersistentVector fills the gap for collections that need both. This is the one data structure worth porting from ClojureScript.

For batch-building maps, Erlang's `maps:from_list/1` (build a list of pairs, then convert in one shot) is already efficient. For batch-building vectors, `(into [] some-list)` is the idiomatic approach.

### CljElixir Data Literal Summary

```clojure
;; BEAM-native persistent data
{:name "Ada" :age 30}   ;; Erlang map (HAMT for >32 keys)
'(1 2 3)                ;; Erlang list (cons cells, tail sharing)
#{:a :b :c}             ;; Elixir MapSet (backed by map)

;; CljElixir-provided
[1 2 3]                 ;; PersistentVector (bit-partitioned trie)

;; Tuple literal
#el[:ok value]          ;; Erlang tuple

;; Keywords
:ok
:error
:shutdown

;; Strings
"hello world"

;; Regex (Clojure style)
#"pattern"
#"^\d{3}-\d{4}$"
```

Vectors in pattern position match BEAM tuples (since incoming messages and Elixir interop always produce tuples):

```clojure
[:ok data]   ;; in case/receive: matches tuple {:ok, data}
[x y]        ;; in defn: parameter list
[x 1 y 2]    ;; in let: binding pairs
```

Core vocabulary (`get`, `assoc`, `dissoc`, `first`, `rest`, `count`, etc.) works on all types through protocol dispatch. All persistent types implement the core protocols (`ILookup`, `IAssociative`, `ISeq`, `ICounted`, etc.), so the entire core vocabulary works on them.

### BEAM Interop Boundary

Since CljElixir maps, lists, and sets ARE BEAM-native types, there's no conversion boundary for them. Data flows freely between CljElixir and Elixir/Erlang code without any conversion. `(assoc m :k v)` calls `Map.put`, and the result is the same kind of map.
The only CljElixir-specific data structure is PersistentVector (`[]`).

For tuples, CljElixir provides both a reader macro and a function:

```clojure
;; Reader macro (literal)
#el[:ok "data"]

;; Function (n-ary, apply-compatible)
(tuple :ok "data")                    ;; => {:ok, "data"}
(tuple)                               ;; => {} (empty tuple)
(apply tuple args)                    ;; works
(map (fn [x] (tuple :ok x)) items)    ;; works
```

Conversions use `into` with an empty target:

```clojure
;; Vector → tuple
(into (tuple) [1 2 3])    ;; => {1, 2, 3}

;; Tuple → vector
(into [] some-tuple)      ;; => [elem0, elem1, ...]

;; Tuple → list
(into '() some-tuple)     ;; => (elemN, ... elem1, elem0)

;; List → vector
(into [] some-list)

;; Vector → list (reversed, as in the tuple case: conj prepends to lists)
(into '() some-vector)
```

`tuple` is a regular function in `clje.core`. Tuples implement `ISeqable` and `ICounted`, so `into`, `count`, `seq`, `first`, `rest`, `nth` all work on them.

### Runtime Conversion: `clojurify` and `elixirify`

Since maps and lists are already BEAM-native, conversion is only needed for vectors (CljElixir-specific) and tuples (BEAM-specific):

```clojure
(defprotocol IClojurify
  "Convert BEAM tuples/lists to CljElixir vectors where appropriate."
  (-clojurify [o]))

(defprotocol IElixirify
  "Convert CljElixir vectors to BEAM lists/tuples."
  (-elixirify [o]))
```

```clojure
;; Tuple → vector
(clojurify #el[:ok "data"])   ;; => [:ok "data"]

;; List → vector
(clojurify '(1 2 3))          ;; => [1 2 3]

;; Vector → list
(elixirify [1 2 3])           ;; => '(1 2 3)

;; Deep conversion walks nested structures
(clojurify #el[:ok #el[:nested "data"]])
;; => [:ok [:nested "data"]]
```

Since they're protocols, you can extend them to your own types:

```clojure
(defrecord User [name age]
  IElixirify
  (-elixirify [u]
    {:name name :age age :type "user"}))
```

### Implementation Note

BEAM maps >32 keys already use the same HAMT data structure as Clojure's PersistentHashMap, implemented in C inside the VM. There is no need to reimplement this. Erlang lists are already cons-cell linked lists with tail sharing.
The only data structure that needs porting from ClojureScript is PersistentVector (~600 lines in ClojureScript), which provides O(log32 n) indexed access and O(1) append, something BEAM has no native equivalent for.

---

## Core Protocols

CljElixir uses the ClojureScript protocol naming convention. These are real Elixir protocols under the hood, so users can extend them to custom types, but the compiler optimizes known cases to direct function calls.

### Protocol Hierarchy

```
;; Persistent (structural sharing, immutable)
ILookup:       -lookup
IAssociative:  -assoc, -contains-key?   extends ILookup
IMap:          -dissoc                  extends IAssociative
ICollection:   -conj
ICounted:      -count
ISeqable:      -seq
ISeq:          -first, -rest
IIndexed:      -nth
IFn:           -invoke
IMeta:         -meta
IWithMeta:     -with-meta
IStack:        -peek, -pop
IMapEntry:     -key, -val
IKVReduce:     -kv-reduce
```

Two tiers of data:

**BEAM-native (maps, lists, sets):** the default for maps, lists, and sets. Already persistent with structural sharing (maps >32 keys use HAMT, lists share tails). `assoc`, `dissoc`, `conj` return new values via BEAM operations.

**PersistentVector:** CljElixir-provided. Bit-partitioned trie for O(log32 n) indexed access and O(1) append with structural sharing. The one data structure BEAM doesn't have natively.

```clojure
;; Maps are BEAM-native, already persistent
(assoc {:a 1} :b 2)   ;; => BEAM map, structural sharing for >32 keys

;; Vectors are CljElixir PersistentVector
(assoc [1 2 3] 1 :x)  ;; => [1 :x 3], structural sharing
(conj [1 2 3] 4)      ;; => [1 2 3 4], O(1) append

;; Batch building uses into
(into [] (map (fn [x] (* x x)) items))
(into {} (map (fn [x] [x (* x x)]) items))
```

### Protocol Definitions

```clojure
(defprotocol ILookup
  "Protocol for looking up a value in a data structure."
  (-lookup [o k] [o k not-found]))

(defprotocol IAssociative
  "Protocol for adding associativity to collections."
  (-contains-key? [coll k])
  (-assoc [coll k v]))

(defprotocol IMap
  "Protocol for full map operations."
  (-dissoc [coll k]))

(defprotocol ICounted
  "Calculates the count of a collection in constant time."
  (-count [coll]))

(defprotocol ISeqable
  "Protocol for producing a sequence from a collection."
  (-seq [o]))

(defprotocol ISeq
  "Protocol for sequential access."
  (-first [coll])
  (-rest [coll]))

(defprotocol ICollection
  "Protocol for generic collection operations."
  (-conj [coll o]))

(defprotocol IIndexed
  "Protocol for numeric index access."
  (-nth [coll n] [coll n not-found]))

(defprotocol IFn
  "Protocol for invocable things."
  (-invoke [o] [o a] [o a b] [o a b c]))

(defprotocol IMeta
  "Protocol for accessing metadata."
  (-meta [o]))

(defprotocol IWithMeta
  "Protocol for adding metadata."
  (-with-meta [o meta]))

(defprotocol IStack
  "Protocol for stack operations."
  (-peek [coll])
  (-pop [coll]))

(defprotocol IMapEntry
  "Protocol for examining a map entry."
  (-key [coll])
  (-val [coll]))

(defprotocol IKVReduce
  "Protocol for key-value reduce."
  (-kv-reduce [coll f init]))
```

### Extending Types

Use `extend-type` (one type, many protocols) and `extend-protocol` (one protocol, many types), exactly as in Clojure:

```clojure
;; extend-type: extend one BEAM type with many protocols
(extend-type Map
  ILookup
  (-lookup
    ([m k] (Map/get m k))
    ([m k not-found] (Map/get m k not-found)))

  IAssociative
  (-contains-key? [m k] (Map/has-key? m k))
  (-assoc [m k v] (Map/put m k v))

  IMap
  (-dissoc [m k] (Map/delete m k))

  ICounted
  (-count [m] (map-size m))

  ISeqable
  (-seq [m] (Map/to-list m))

  ICollection
  (-conj [m entry] (Map/merge m entry))

  IFn
  (-invoke [m k] (Map/get m k))

  IKVReduce
  ;; Enum.reduce/3 calls its function with (element, accumulator)
  (-kv-reduce [m f init]
    (Enum/reduce m init (fn [[k v] acc] (f acc k v)))))

;; extend-type for lists
(extend-type List
  ISeq
  (-first [l] (hd l))
  (-rest [l] (tl l))

  ICounted
  (-count [l] (length l))

  ISeqable
  (-seq [l] l)

  ICollection
  (-conj [l o] (cons o l)))

;; extend-protocol: one protocol across many BEAM types
(extend-protocol ICounted
  Map
  (-count [m] (map-size m))

  List
  (-count [l] (length l))

  Tuple
  (-count [t] (tuple-size t))

  BitString
  (-count [s] (byte-size s)))
```

### Performance

Core functions dispatch through protocols. The protocols are real Elixir protocols, and dispatch optimization is handled natively (consolidated protocols use a lookup table, not dynamic dispatch).

---

## Core Vocabulary

Core functions dispatch through protocols. Interop calls (`Module/function`) are always available as an escape hatch.

### Data Access

| CljElixir            | Dispatches through       | Notes                          |
|----------------------|--------------------------|--------------------------------|
| `(get m k)`          | `ILookup`                | Maps, vectors, records         |
| `(get m k nf)`       | `ILookup`                | With not-found default         |
| `(get-in m ks)`      | `ILookup` (nested)       | Deep access                    |
| `(assoc m k v)`      | `IAssociative`           | Maps and vectors               |
| `(assoc-in m ks v)`  | `IAssociative` (nested)  | Deep update                    |
| `(dissoc m k)`       | `IMap`                   | Maps only                      |
| `(update m k f)`     | `IAssociative`           | Apply f to value at k          |
| `(update-in m ks f)` | `IAssociative` (nested)  | Deep apply                     |
| `(merge m1 m2)`      | `IMap`                   | Maps only                      |
| `(conj c x)`         | `ICollection`            | Type-dependent append          |
| `(into c1 c2)`       | `ICollection`            | Reduce c2 into c1              |
| `(count c)`          | `ICounted`               | All collections                |
| `(contains? m k)`    | `IAssociative`           | Key presence                   |
| `(keys m)`           | `IMap`                   | Maps only                      |
| `(vals m)`           | `IMap`                   | Maps only                      |
| `(select-keys m ks)` | `IMap`                   | Maps only                      |
| `(empty? c)`         | `ICounted`/`ISeqable`    | All collections                |
| `(nth v n)`          | `IIndexed`               | Vectors (O(log32 n))           |

### Keyword-as-Function

```clojure
(:name user)            ;; => ILookup.-lookup(user, :name)
(:name user "default")  ;; => ILookup.-lookup(user, :name, "default")
```

### Sequences

| CljElixir         | Compiles to              |
|-------------------|--------------------------|
| `map`             | `Enum.map`               |
| `filter`          | `Enum.filter`            |
| `reduce`          | `Enum.reduce`            |
| `reduce-kv`       | `IKVReduce.-kv-reduce`   |
| `first`           | `ISeq.-first`            |
| `rest`            | `ISeq.-rest`             |
| `seq`             | `ISeqable.-seq`          |
| `cons`            | `[h \| t]` construction  |
| `concat`          | `Enum.concat`            |
| `take`            | `Enum.take`              |
| `drop`            | `Enum.drop`              |
| `partition`       | `Enum.chunk_every`       |
| `sort`            | `Enum.sort`              |
| `sort-by`         | `Enum.sort_by`           |
| `group-by`        | `Enum.group_by`          |
| `frequencies`     | `Enum.frequencies`       |
| `distinct`        | `Enum.uniq`              |
| `mapcat`          | `Enum.flat_map`          |

### Arithmetic, Logic, and Type Checks

| CljElixir            | Notes                    |
|----------------------|--------------------------|
| `+`, `-`, `*`, `/`   | Variadic arithmetic      |
| `>`, `<`, `>=`, `<=` | Comparisons              |
| `=`                  | Value equality (IEquiv)  |
| `==`                 | Numeric equality         |
| `not=`, `!=`         | Inequality               |
| `inc`, `dec`         | Increment/decrement      |
| `rem`                | Remainder                |
| `not`                | Logical negation         |
| `and`, `or`          | Variadic logical ops     |
| `nil?`               | Nil check                |
| `is-pid`             | PID type check           |
| `is-binary`          | Binary type check        |
| `is-list`            | List type check          |
| `is-integer`         | Integer type check       |

### Vectors

| CljElixir         | Notes                                |
|-------------------|--------------------------------------|
| `vec`             | Collection → PersistentVector        |
| `vector`          | Args → PersistentVector              |
| `vector?`         | PersistentVector check               |
| `subvec`          | SubVector view (start, optional end) |
| `peek`            | Last element (IStack)                |
| `pop`             | Remove last (IStack)                 |

### Tuples

| CljElixir         | Notes                    |
|-------------------|--------------------------|
| `tuple`           | Args → BEAM tuple        |
| `tuple-size`      | Tuple element count      |
| `elem`            | Indexed access           |
| `put-elem`        | Immutable tuple update   |

### Lists

| CljElixir         | Notes                    |
|-------------------|--------------------------|
| `list`            | Args → list              |
| `hd`              | Head of list             |
| `tl`              | Tail of list             |

### Other Builtins

| CljElixir         | Notes                    |
|-------------------|--------------------------|
| `str`             | String concatenation     |
| `println`         | Print with newline       |
| `pr-str`          | EDN-like string repr     |
| `pr`              | Print EDN to stdout      |
| `prn`             | Print EDN + newline      |
| `print-str`       | Human-readable string    |
| `throw`           | Raise exception          |
| `apply`           | Dynamic function call    |

---

## Equality

CljElixir's `=` implements Clojure's value equality semantics, inspired by Baker's EGAL. On BEAM this is simpler than on JVM because everything is immutable: there are no mutable objects to special-case, and there is no pointer identity.

```clojure
;; = is deep value equality via IEquiv
(= {:a 1} {:a 1})                 ;; => true
(= [1 2 3] [1 2 3])               ;; => true
(= (->User "Ada" 30 "a@b")
   (->User "Ada" 30 "a@b"))       ;; => true

;; Cross-type sequential equality
(= [1 2 3] '(1 2 3))              ;; => true (both sequential, same elements)

;; == is numeric equality across types
(== 1 1.0)                        ;; => true
(== 1 2)                          ;; => false
```

Two operators:

**`=`:** value equality. Dispatches through `IEquiv`/`-equiv`. For BEAM-native maps and lists, delegates to Erlang's structural comparison. For PersistentVector, element-by-element. For `defrecord`, auto-generated `IEquiv` compares type + all fields. Cross-type sequential equality: `(= [1 2] '(1 2))` is `true`.

**`==`:** numeric equality across types. `(== 1 1.0)` is `true`. Throws on non-numbers.

There is no `identical?`. BEAM has no pointer identity, so all equality is value-based. This is a natural fit for EGAL: since nothing is mutable, value equality is the only equality that matters.

`hash` is consistent with `=`: equal values produce the same hash.
Dispatches through `IHash`/`-hash`.

---

## Printing

CljElixir follows ClojureScript's print model: an `IPrintWithWriter` protocol that controls how values are represented as text.

### Two Print Families

**Machine-readable (EDN):** `pr`, `prn`, `pr-str`. Round-trippable: `(read-string (pr-str x))` gives back `x`. This is what the REPL uses.

**Human-readable:** `print`, `println`. For display: strings print without quotes, no escaping.

```clojure
(pr-str "hello")                  ;; => "\"hello\""
(print-str "hello")               ;; => "hello"

(pr-str {:name "Ada"})            ;; => "{:name \"Ada\"}"
(pr-str [1 2 3])                  ;; => "[1 2 3]"
(pr-str '(1 2 3))                 ;; => "(1 2 3)"
(pr-str #el[:ok "data"])          ;; => "#el[:ok \"data\"]"
(pr-str (->User "Ada" 30 "a@b"))  ;; => "#User{:name \"Ada\", :age 30, :email \"a@b\"}"
```

### `IPrintWithWriter`

All printing dispatches through the `IPrintWithWriter` protocol:

```clojure
(defprotocol IPrintWithWriter
  (-pr-writer [o writer opts]))
```

Extend it to control how your types print:

```clojure
(defrecord Money [amount currency]
  IPrintWithWriter
  (-pr-writer [this writer opts]
    (write writer (str "#Money[" amount " " currency "]"))))

(pr-str (->Money 42.50 :USD)) ;; => "#Money[42.5 :USD]"
```

BEAM-native types have default implementations: maps as `{:k v}`, lists as `(1 2 3)`, tuples as `#el[:ok val]`, keywords as `:keyword`, strings as `"string"`.

### REPL

The Read-Eval-Print Loop uses Elixir's `Code.eval_quoted/3` to compile and execute CljElixir AST at runtime:

1. **Read:** parse CljElixir source text into forms (s-expressions)
2. **Eval:** transform to Elixir AST, pass to `Code.eval_quoted/3`
3. **Print:** call `pr-str` on the result
4. **Loop:** carry the environment forward so `def`s persist across evaluations

```
clje> (assoc {:a 1} :b 2)
{:a 1, :b 2}
clje> (defrecord Point [x y])
Point
clje> (->Point 3 4)
#Point{:x 3, :y 4}
clje> (+ 1 2)
3
```

No special eval infrastructure is needed: the compiler already produces Elixir AST, and Elixir can evaluate AST at runtime natively.

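
The four REPL steps, and in particular the way the environment is carried forward, can be sketched in a few lines. The Python below is a toy illustration under stated assumptions and is not how CljElixir evaluates code: the function names `read`, `evaluate`, and `rep` are hypothetical, the reader handles only integers, symbols, and lists, `repr` stands in for `pr-str`, and a plain dictionary stands in for the environment that the real REPL would thread through `Code.eval_quoted/3`.

```python
import operator
from functools import reduce
from typing import Any, Dict, List, Tuple

Env = Dict[str, Any]


def read(source: str) -> Any:
    """Read: parse one s-expression into nested lists, ints, and symbol strings."""
    tokens = source.replace("(", " ( ").replace(")", " ) ").split()

    def parse(pos: int) -> Tuple[Any, int]:
        token = tokens[pos]
        if token == "(":
            form: List[Any] = []
            pos += 1
            while tokens[pos] != ")":
                item, pos = parse(pos)
                form.append(item)
            return form, pos + 1
        try:
            return int(token), pos + 1
        except ValueError:
            return token, pos + 1  # anything that is not an int is a symbol

    form, _ = parse(0)
    return form


def evaluate(form: Any, env: Env) -> Tuple[Any, Env]:
    """Eval: return (value, new_env); the input environment is never mutated."""
    if isinstance(form, int):
        return form, env
    if isinstance(form, str):
        return env[form], env
    head, *args = form
    if head == "def":
        name, expr = args
        value, _ = evaluate(expr, env)
        return value, {**env, name: value}
    fn, _ = evaluate(head, env)
    values = [evaluate(arg, env)[0] for arg in args]
    return fn(*values), env


def rep(source: str, env: Env) -> Tuple[str, Env]:
    """One Read-Eval-Print step; the caller loops and threads env through."""
    value, env = evaluate(read(source), env)
    return repr(value), env


base_env: Env = {
    "+": lambda *xs: sum(xs),
    "*": lambda *xs: reduce(operator.mul, xs, 1),
}

out1, env1 = rep("(def x (+ 1 2))", base_env)  # defines x in the returned env
out2, env2 = rep("(* x 10)", env1)             # x is still visible here
```

Returning a new environment from each step, rather than mutating a global one, is what lets a `def` from one prompt stay visible at the next while keeping each evaluation a pure function of its inputs.
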

---

## Control Flow

### Binding Forms (use vectors)

```clojure
(let [x 1
      y 2]
  (+ x y))

(for [x (list 1 2 3 4 5)
      :when (> x 2)]
  (* x x))

(doseq [[name pid] members]
  (send pid :shutdown))

(if-let [val (get m :key)]
  (process val)
  :not-found)

(when-let [val (get m :key)]
  (process val))
```

### `loop`/`recur`: Tail Recursion

`loop` establishes bindings and a recursion point. `recur` jumps back to the nearest `loop` (or `defn`) with new binding values. It compiles to a tail-recursive function call; BEAM does tail call optimization natively, so the stack does not grow.

```clojure
;; loop with recur
(loop [i 0
       acc []]
  (if (< i 10)
    (recur (inc i) (conj acc (* i i)))
    acc))

;; loop/recur inside defn
(defn factorial [n]
  (loop [i n
         acc 1]
    (if (<= i 1)
      acc
      (recur (dec i) (* acc i)))))

;; recur to defn head (no loop needed)
(defn count-down [n]
  (when (> n 0)
    (println n)
    (recur (dec n))))
```

### `with`: Sequential Pattern Matching

`with` chains multiple pattern match bindings. Each binding matches the result of its expression. If any match fails, the non-matching value is returned immediately (short-circuit). This is not threading: no value is passed implicitly from one step to the next, although later bindings can reference names bound earlier.

```clojure
;; Basic: chain fallible operations
(with [[:ok config] (load-config path)
       [:ok conn] (connect config)
       [:ok user] (authenticate conn creds)]
  (start-session user))
;; if (load-config path) returns [:error :not-found],
;; the whole expression returns [:error :not-found]

;; With :else, handle the failure explicitly
(with [[:ok config] (load-config path)
       [:ok conn] (connect config)
       [:ok user] (authenticate conn creds)]
  (start-session user)
  :else
  [:error :not-found] (log "config file missing")
  [:error :timeout] (retry)
  [:error reason] (log (str "failed: " reason)))
```

Without `:else`, the non-matching value passes through as the return value. With `:else`, you can pattern match on the failure to handle it explicitly.
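
The short-circuit rule can be made concrete with a small model. The Python below is a sketch of the semantics only, not of how CljElixir compiles `with`: the names `with_chain`, `load_config`, `connect`, and `run` are hypothetical, and the pattern is simplified to the fixed shape `("ok", value)` instead of arbitrary patterns.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Bindings = Dict[str, Any]
Step = Tuple[str, Callable[[Bindings], Any]]


def with_chain(
    steps: List[Step],
    body: Callable[[Bindings], Any],
    else_handler: Optional[Callable[[Any], Any]] = None,
) -> Any:
    """Run steps in order, binding each ("ok", value) result under its name."""
    bindings: Bindings = {}
    for name, expr in steps:
        result = expr(bindings)  # later steps can read earlier bindings
        matched = isinstance(result, tuple) and len(result) == 2 and result[0] == "ok"
        if not matched:
            # Short-circuit: give the value to the else handler, or return it as is.
            return else_handler(result) if else_handler else result
        bindings[name] = result[1]
    return body(bindings)


def load_config(path: str) -> Tuple[str, Any]:
    if path == "app.conf":
        return ("ok", {"host": "localhost"})
    return ("error", "not-found")


def connect(config: Dict[str, str]) -> Tuple[str, Any]:
    return ("ok", "conn-to-" + config["host"])


def run(path: str, else_handler: Optional[Callable[[Any], Any]] = None) -> Any:
    return with_chain(
        [("config", lambda b: load_config(path)),
         ("conn", lambda b: connect(b["config"]))],
        body=lambda b: ("session", b["conn"]),
        else_handler=else_handler,
    )


ok_result = run("app.conf")
passthrough = run("missing.conf")
handled = run("missing.conf", else_handler=lambda failure: ("logged", failure[1]))
```

In this model `connect` is never called when `load_config` fails, which is the property that distinguishes the form from an ordinary sequence of `let` bindings.
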

`with` is distinct from `if-let`: `if-let` binds a single value and branches on nil/truthiness. `with` chains N bindings and short-circuits on structural pattern mismatch.

### Branch Forms (bare clauses)

```clojure
(case value
  [:ok x] x
  [:error _] nil)

(cond
  (> x 0) :positive
  (< x 0) :negative
  :else :zero)

(if (> x 0)
  :positive
  :non-positive)

(when (> x 0)
  (do-something)
  :positive)
```

### Threading Macros

`->` inserts the threaded value as the first argument. `->>` inserts as the last argument. Both rewrite CljElixir AST at compile time (not runtime macros):

```clojure
;; Thread-first
(-> "hello"
    (String/upcase)
    (str " WORLD"))
;; => "HELLO WORLD"

;; Thread-last
(->> (list 1 2 3 4 5)
     (filter (fn [x] (> x 2)))
     (map (fn [x] (* x x))))
;; => (9 16 25)
```

### Exception Handling

```clojure
;; Basic try/catch
(try
  (risky-operation)
  (catch e
    (str "error: " e)))

;; Typed catch (Elixir rescue)
(try
  (risky-operation)
  (catch RuntimeError e
    (str "runtime: " e))
  (catch ArgumentError e
    (str "argument: " e)))

;; Erlang-style catch (throw/exit/error)
(try
  (Kernel/throw :boom)
  (catch :throw val val)
  (catch :exit reason reason)
  (catch :error e e))

;; Finally (cleanup, doesn't affect return value)
(try
  (open-resource)
  (catch e (log e))
  (finally (close-resource)))
```

### Variadic Parameters

Functions can accept variable arguments with `& rest`:

```clojure
(defn greet [greeting & names]
  (str greeting " " (Enum/join names ", ")))

(greet "hello" "alice" "bob") ;; => "hello alice, bob"

;; Works in fn too
(fn [x & rest] (cons x rest))
```

### Destructuring

Works in `let`, `fn`, `defn`, `for`, and `doseq`:

```clojure
;; Map destructuring with :keys
(let [{:keys [name age]} {:name "Ada" :age 30}]
  (str name " is " age))

;; With :as to bind the whole map
(let [{:keys [name] :as person} {:name "Ada" :age 30}]
  person)

;; String keys with :strs
(let [{:strs [name]} {"name" "Ada"}]
  name)

;; Literal key binding
(let [{x :x y :y} {:x 1 :y 2}]
  (+ x y))

;; Sequential destructuring with & rest
(let [[a b & rest] (list 1 2 3 4 5)]
  rest) ;; => (3 4 5)

;; Nested destructuring
(let [{:keys [name] {:keys [city]} :address}
      {:name "Ada" :address {:city "London"}}]
  (str name " in " city))

;; In defn
(defn process [{:keys [name age]}]
  (str name " is " age))
```

### Guards

Guards work in `receive`, `case`, and multi-clause `defn`/`fn`:

```clojure
;; Guard in receive
(receive
  [:message from body] :guard [(is-binary body)]
  (handle body))

;; Guard in case
(case value
  x :guard [(> x 0)] :positive
  x :guard [(< x 0)] :negative
  _ :zero)
```

---

## Concurrency

### Process Primitives

```clojure
(spawn (fn [] (loop-fn initial-state)))
(spawn-link (fn [] (loop-fn initial-state)))

(send pid [:message data])

(receive
  [:join username pid]
  (handle-join state username pid)

  [:message from body] :guard [(is-binary body)]
  (handle-message state from body)

  :shutdown
  :ok

  :after 60000
  :timeout)

(monitor :process pid)
(link pid)
```

### Dynamic Vars

```clojure
*self*   ;; current process
*node*   ;; current BEAM node
```

---

## User-Defined Protocols

Protocols are defined with `defprotocol`, same as Clojure. They compile to Elixir protocols.

```clojure
(defprotocol Describable
  "Protocol for human-readable descriptions."
  (describe [value]))
```

Extend to built-in types with `extend-type`:

```clojure
(extend-type Integer
  Describable
  (describe [n] (str "the integer " n)))

(extend-type List
  Describable
  (describe [l] (str "a list with " (count l) " elements")))
```

Or extend across multiple types at once with `extend-protocol`:

```clojure
(extend-protocol Describable
  Integer
  (describe [n] (str "the integer " n))

  List
  (describe [l] (str "a list with " (count l) " elements"))

  Any
  (describe [x] (str "something: " x)))
```

### `reify`: Anonymous Protocol Implementations

`reify` creates a one-off instance implementing one or more protocols, with lexical closure over the surrounding scope.
It's the anonymous version of `defrecord`:

```clojure
;; Adapter wrapping an ETS table
(defn wrap-ets-table [table-id]
  (reify
    ILookup
    (-lookup
      ([_ k] (ets/lookup table-id k))
      ([_ k not-found] (or (ets/lookup table-id k) not-found)))

    ICounted
    (-count [_] (ets/info table-id :size))))
```

---

## GenServer (OTP)

```clojure
(ns MyServer
  (:require [clje.core :refer :all]))

(use GenServer)

(defn init [args]
  [:ok {:count 0}])

(defn handle-call
  ([:get _from state]
    [:reply (:count state) state])
  ([:increment _from state]
    (let [new-state (update state :count inc)]
      [:reply :ok new-state])))

(defn handle-cast
  ([:reset state]
    [:noreply (assoc state :count 0)]))
```

---

## Reference Example: ChatRoom

```clojure
(ns ChatRoom
  (:require [clje.core :refer :all]))

(defn loop [state]
  (receive
    [:join username pid]
      (let [members (assoc (:members state) username pid)]
        (send pid [:welcome username (count members)])
        (loop (assoc state :members members)))

    [:message from body] :guard [(is-binary body)]
      (do
        (doseq [[_name pid] (:members state)]
          (send pid [:chat from body]))
        (loop state))

    [:leave username]
      (loop (update state :members dissoc username))

    [:kick username reason] :guard [(!= username (:owner state))]
      (if-let [pid (get-in state [:members username])]
        (do
          (send pid [:kicked reason])
          (loop (update state :members dissoc username)))
        (loop state))

    :shutdown
      (do
        (doseq [[_name pid] (:members state)]
          (send pid :room-closed))
        :ok)

    :after 60000
      (if (== (count (:members state)) 0)
        :empty-timeout
        (loop state))))
```

### Usage

```clojure
(def room (spawn (fn [] (ChatRoom/loop {:owner "alice" :members {}}))))

;; alice joins
(send room [:join "alice" *self*])

;; bob joins
(send room [:join "bob" *self*])

;; carol joins
(send room [:join "carol" *self*])

;; bob says hi
(send room [:message "bob" "hey everyone"])

;; alice kicks bob
(send room [:kick "bob" "being rude"])

;; carol leaves
(send room [:leave "carol"])
```

### ChatRoom with Schemas

```clojure
(def ChatState
  [:map
   [:owner :string]
   [:members [:map-of :string :pid]]])

(def ChatMessage
  [:or
   [:tuple [:= :join] :string :pid]
   [:tuple [:= :message] :string :string]
   [:tuple [:= :leave] :string]
   [:tuple [:= :kick] :string :string]
   [:= :shutdown]])
```

---

## Project Structure

CljElixir follows the ClojureScript model: the compiler is written in the host language (Elixir), the runtime is written in CljElixir itself.

```
clj_elixir/
├── mix.exs
├── bb.edn                              # Babashka runner scripts
│
├── lib/                                # Bootstrap compiler (Elixir)
│   ├── clj_elixir/
│   │   ├── reader.ex                   # S-expression reader/parser
│   │   ├── analyzer.ex                 # AST validation
│   │   ├── transformer.ex              # CljElixir AST → Elixir AST
│   │   ├── compiler.ex                 # Compilation pipeline orchestrator
│   │   ├── printer.ex                  # EDN-like printer for all BEAM types
│   │   ├── repl.ex                     # REPL engine (eval, bindings, history)
│   │   ├── malli.ex                    # Malli schema → typespec adapter
│   │   └── nrepl/                      # nREPL server
│   │       ├── server.ex               # TCP server
│   │       ├── handler.ex              # Message handler (ops dispatch)
│   │       ├── session.ex              # Session manager (GenServer + Agents)
│   │       └── bencode.ex              # Bencode codec
│   └── mix/
│       └── tasks/
│           ├── compile.clj_elixir.ex   # Mix compiler plugin
│           ├── clje.repl.ex            # mix clje.repl
│           ├── clje.nrepl.ex           # mix clje.nrepl [--port PORT]
│           ├── clje.eval.ex            # mix clje.eval EXPR
│           ├── clje.run.ex             # mix clje.run FILE
│           └── clje.build.ex           # mix clje.build (compile to BEAM)
│
├── src/                                # Runtime library (CljElixir)
│   └── clje/
│       ├── core.clje                   # Core functions (get-in, assoc-in, etc.)
│       └── core/
│           ├── protocols.clje          # 16 protocols + type extensions
│           └── persistent_vector.clje  # VectorNode, PersistentVector, SubVector
│
├── stubs/                              # Editor completion stubs (.clj)
│   ├── Enum.clj, Map.clj, ...          # Elixir module stubs
│   ├── erlang.clj, gen_tcp.clj, ...    # Erlang module stubs
│   └── clje/core.clj                   # Core vocabulary stubs
│
├── test/                               # All ExUnit tests (.exs)
│   └── clj_elixir/
│       ├── reader_test.exs
│       ├── transformer_test.exs
│       ├── compiler_test.exs
│       ├── phase2_test.exs             # Protocols & core data ops
│       ├── phase3_test.exs             # PersistentVector
│       ├── phase4_test.exs             # clojurify/elixirify
│       ├── phase5_test.exs             # BEAM concurrency & GenServer
│       ├── phase6_test.exs             # Control flow, macros, destructuring
│       ├── phase7_test.exs             # Malli schemas & type specs
│       ├── malli_test.exs              # Malli unit tests
│       ├── phase8_test.exs             # Printing & source maps
│       ├── repl_test.exs               # REPL engine
│       └── nrepl_test.exs              # nREPL server
│
└── examples/
    ├── chat_room.clje                  # Single-VM actor-based chat room
    ├── tcp_chat_server.clje            # TCP chat server (gen_tcp)
    └── tcp_chat_client.clje            # TCP chat client
```

### Bootstrap Sequence

The compiler builds in two passes:

1. **Compile the compiler**: `mix compile` builds the Elixir-based reader, analyzer, transformer, and Mix compiler plugin.
2. **Compile the runtime**: the Mix compiler plugin reads `.clje` files from `src/`, transforms them to Elixir AST, and feeds them to the Elixir compiler. This produces BEAM modules for `CljElixir.Core`, `CljElixir.Core.Protocols`, `CljElixir.Core.PersistentVector`, etc.

User projects depend on the `clj_elixir` package, which provides both the compiler plugin and the precompiled runtime modules.
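
As a rough illustration of what "depending on the `clj_elixir` package" might look like in a consuming project, here is a minimal `mix.exs` sketch. The compiler name `:clj_elixir` is inferred from the `compile.clj_elixir` Mix task listed in the project structure. The app name, the path dependency, and the position of the compiler in the list are assumptions made for illustration, not documented settings.

```elixir
# mix.exs of a hypothetical user project (MyApp is an illustrative name)
defmodule MyApp.MixProject do
  use Mix.Project

  def project do
    [
      app: :my_app,
      version: "0.1.0",
      # Run the CljElixir compiler plugin in addition to the standard Mix
      # compilers. Placing it first is an assumption, not a documented rule.
      compilers: [:clj_elixir] ++ Mix.compilers(),
      deps: deps()
    ]
  end

  defp deps do
    [
      # Assumed local checkout; substitute however the package is actually published.
      {:clj_elixir, path: "../clj_elixir"}
    ]
  end
end
```

With a configuration along these lines, `mix compile` would invoke the plugin on the project's `.clje` sources before handing the result to the Elixir compiler.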
### Implementation Phases

**Phase 1: Bootstrap Compiler (Elixir)**

The compiler must support the primitives that everything else is built on:

- Reader: parse s-expressions into CljElixir AST
- Transformer: core forms → Elixir AST
  - **Primitives:** `defrecord`, `defprotocol`, `extend-type`, `extend-protocol`, `reify`
  - **Core forms:** `ns`, `defmodule`, `defn`, `defn-`, `fn`, `let`, `if`, `case`, `cond`, `do`, `loop`/`recur`
  - **Data:** maps `{}`, vectors `[]`, sets `#{}`, lists, tuples `#el[]`
  - **Interop:** `Module/function` calls (uppercase = Elixir, lowercase = Erlang)
  - **Naming:** hyphen-to-underscore conversion
- Mix compiler plugin: `.clje` → Elixir AST → BEAM bytecode
- Verify: compile and call `.clje` modules from Elixir

**Phase 2: Core Protocols (CljElixir)**

Written in CljElixir using the Phase 1 compiler:

- All core protocols: `ILookup`, `IAssociative`, `IMap`, `ICounted`, `ISeqable`, `ISeq`, `ICollection`, `IIndexed`, `IFn`, `IMeta`, `IWithMeta`, `IStack`, `IMapEntry`, `IKVReduce`, `IHash`, `IEquiv`
- Extend protocols to BEAM built-in types (Map, List, Tuple, etc.)
- Core functions: `get`, `assoc`, `dissoc`, `update`, `count`, `first`, `rest`, `seq`, `conj`, `into`, `keys`, `vals`, `merge`, `select-keys`, `reduce`, `map`, `filter`
- Keyword-as-function dispatch

**Phase 3: PersistentVector (CljElixir)**

Built with `defrecord` + protocols, ported from ClojureScript:

- `PersistentVector` (bit-partitioned trie), the one data structure BEAM doesn't have
- `SubVector` for efficient subvec
- Benchmark against Erlang tuples and lists for indexed-access workloads

**Phase 4: Domain Tools (CljElixir)**

- `clojurify` / `elixirify` protocols and functions
- `tuple` function (n-ary, apply-compatible)

**Phase 5: BEAM Concurrency (CljElixir)**

- `receive` with pattern matching, guards (`:guard`), `:after`
- `spawn`, `spawn-link`, `send`
- `*self*`, `*node*`
- GenServer integration via `(use GenServer)`

**Phase 6: Control Flow and Macros (CljElixir)**

- `with` (sequential pattern matching with `:else`)
- `for`, `doseq`, `if-let`, `when-let`, `if-some`, `when-some`
- `->` and `->>` threading macros
- `#()` anonymous function shorthand
- Destructuring in `let`, `fn`, `defn`, `for`, `doseq`
- `defmacro` with quasiquote, unquote, splice-unquote, auto-gensym

**Phase 7: Malli Schema Adapter (CljElixir)**

- Malli schema definitions as data
- `m/=>` function schema annotations
- Schema → Elixir typespec generation for Dialyzer

**Phase 8: REPL and Ecosystem Integration**

- REPL engine (`CljElixir.REPL`): eval with binding persistence, history, namespace tracking, balanced-input detection
- `mix clje.repl`: interactive REPL with multi-line input, `:help`/`:quit`/`:history` commands
- Printer (`CljElixir.Printer`): EDN-like repr for all BEAM types (maps, lists, tuples, sets, PIDs, etc.)
- Transformer builtins: `pr-str`, `pr`, `prn`, `print-str`
- `IPrintWithWriter` protocol for user-extensible printing
- nREPL server: TCP-based, Bencode protocol, `.nrepl-port` file
  - `mix clje.nrepl [--port PORT]` to launch
  - Ops: `clone`, `close`, `eval`, `describe`, `ls-sessions`, `load-file`, `interrupt`, `completions`
  - Session isolation via GenServer + per-session Agents
  - Stdout capture via `StringIO` + `Process.group_leader` swap
- Source-mapped metadata: `elixir_meta/1` propagates `%{line: L, col: C}` through the transformer to the Elixir AST
- Mix tasks: `mix clje.eval EXPR`, `mix clje.run FILE`, `mix clje.build`

---

## Open Design Questions

1. **Lazy sequences.** Elixir has `Stream` for laziness. How much of Clojure's lazy-seq model do we port? BEAM's process model often replaces what laziness does in Clojure.

2. **PersistentVector performance on BEAM.** The bit-partitioned trie is implemented but not yet benchmarked. BEAM's memory model and per-process GC may affect trie node allocation patterns differently than on the JVM or in JavaScript. It needs benchmarking against Erlang tuples and lists for indexed-access workloads to confirm the tradeoff is worth it.

3. **Full Malli port.** The current adapter generates Elixir typespecs from Malli schemas for Dialyzer. Full Malli (validation, coercion, generation) and clojure.spec support may come later.

4. **Vector-as-function.** `([1 2 3] 0)` doesn't work yet; the transformer needs to dispatch non-function call heads through `IFn`.
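
To make question 1 more concrete, the sketch below shows the laziness that Elixir's `Stream` already provides, which is the baseline any lazy-seq port would be measured against. `LazySketch` and `first_squares/1` are hypothetical names used only for illustration; nothing here is part of CljElixir, and whether a Clojure form such as `(take n (map f (iterate inc 0)))` would lower to this shape is an open assumption.

```elixir
defmodule LazySketch do
  # Hypothetical helper (not part of CljElixir): builds an infinite lazy
  # pipeline and realizes only the first n elements.
  def first_squares(n) do
    0
    |> Stream.iterate(&(&1 + 1))      # infinite sequence 0, 1, 2, ...; nothing computed yet
    |> Stream.map(fn x -> x * x end)  # still lazy
    |> Enum.take(n)                   # forces exactly n elements
  end
end

IO.inspect(LazySketch.first_squares(5))
# prints [0, 1, 4, 9, 16]
```

Because `Enum.take/2` pulls only as many elements as requested, the infinite `Stream.iterate/2` source never runs away. What `Stream` does not give directly is Clojure's cached, seq-shaped `first`/`rest` view over a lazy sequence, which is the part the question leaves open.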