Electronic Trading with Low Latency Java

We’re getting close to unleashing AC3, our new trading infrastructure powered by an ultra-low latency framework, implemented in Java 8 using various techniques to crank it up:

  • garbage free design (which meant we had to throw out 99% of the Java ecosystem),

  • off-heap memory allocation,

  • lock-free, single-writer algorithms,

  • NUMA-aware software affinity,

  • High end Intel CPUs, overclocked for good measure, and, for fun,

  • non-volatile memory (NVMe) to prevent any data loss while avoiding the performance penalty of syncing memory-mapped data, and

  • Other interesting features we will get into later.

Reactive Streams

One interesting part of the project was the creation of a reactive streams library using a garbage-free solution. (The libraries available on the market do produce Eden space garbage and thus could not be used for the project.) The reactive stream processing follows the idea of reactive processing to the events being pumped in, with the following components: 

  • Publisher: Generation of the events in the reactor 

  • Subscriber: The end listener of the events in the reactor

  • Map: Convert from one event to another in the event stream

  • Filter: Filter out events based on some acceptance criteria 

  • Selector: Select different processors based on some parameters in the event and pass the event to that processor

  • User Defined Processor: Any Processor can be built and can be plugged into the chain to process the stream of events and morph them and pass on to the next processor in the chain

  • Reducer: Coalesce or Combine many events to a single event and push them onto the stream of events

A simple example will illustrate:


    public final void testSyncReactorPerformance() {

        final long ITERATIONS = 1_000_000;

        SynchronousReactor reactor = new SynchronousReactor();

        reactor.subscribe((event, context) -> true)

                .process((event, context) -> true)

                .map((event) -> event)

                .filter((event) -> true)

                .publish(publisher ->


                    long start = System.nanoTime();

                    for (long i = 0; i < ITERATIONS; i++) {




                    long end = System.nanoTime();

                    System.out.println("Time taken for single transfer [" + (end - start) / ITERATIONS + "]ns");




This is how easy it is to build a reactive stream!

All the above processing operators also can be state-full along with being stateless by simply implementing a Context interface. This gives the power of maintaining the state based on a specific event. So, when the same event is pumped again the previous state that was saved as a memento is provided along with the event. The cache to maintain this state can be implemented by the user. An Off-heap & On-heap cache is provided, or a developer can implement a different plug-in.

Stay Tuned

It is well known that the standard to match is the disruptor queue made first available by LMAX team (Martin Thomson) to the public. So why bother with ours?

While for a single producer/consumer, ours works almost identical in that we use the single writer principle: a single thread is used to mutate the elements in the queue, and many readers can be reading the queue elements. Now it works all without any form of locks. How is this possible? That’s for another post!

Lindsey Edmonds