
Jan Kotek


Biography

Jan Kotek lives and works in Galway, Ireland. For the past 8 years he has been working on Java desktop applications. By day he works at APC; by night he is an amateur astronomer. He works on a number of open-source projects, including an embedded key-value database and astronomical applications. He also programs in Scala and Kotlin.

Lectures

JDBM3

JDBM3 provides TreeMap, HashMap and other collections backed by disk storage. Now you can handle billions of items without ever running out of memory. JDBM is probably the fastest and the simplest pure-Java database.

JDBM is tiny (a 160 KB jar with no dependencies), but packed with features such as transactions, an instance cache and space-efficient serialization. It also has outstanding performance: it is tightly optimized and has minimal overhead. It scales well from Android phones to multi-terabyte data sets.

In this presentation Jan will talk about performance optimizations. Outline of the presentation:

  • quick JDBM history (a 12-year-old project; the DBM idea goes back to 1970)
  • why a new project: fast persistence for desktop apps
  • quick performance charts (1 million inserts per second with sequential writes, 200,000 inserts per second with random access)
  • common optimization techniques (stack overhead, heap overhead, minimizing GC, primitive variables)
  • why multi-threaded NIO does not work well in practice, and why single-threaded NIO is faster
  • mapped byte buffers versus heap byte buffers, and mixing them for best performance
  • practical aspects of using mapped byte buffers (in H2 db and JDBM)
  • BTree implementation, self-balancing, and why it is not so easy to implement correctly
  • serialization in JDBM (it has a very fast serialization framework tightly integrated with the DB)
  • space usage minimization (BTree delta compression, record overhead)
  • implementing an instance cache with minimal overhead (GC thrashing, soft reference overhead, reference queue overhead)
  • minimalistic API design: how to put a lot of features into a minimal interface
  • database layers (store, cache, serialization) and why it is better to integrate everything tightly together
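
The outline item on mapped versus heap byte buffers can be illustrated with a small sketch that uses only the JDK's java.nio classes. It is not JDBM code and makes no claim about how JDBM reads its store: the class name BufferSketch and its methods are hypothetical, chosen only to show the two access styles side by side on the same file.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class; not part of JDBM.
public class BufferSketch {

    // Writes the longs 0..n-1 to a temporary file through a heap buffer.
    public static Path writeSample(int n) throws IOException {
        Path file = Files.createTempFile("buffer-sketch", ".bin");
        ByteBuffer heap = ByteBuffer.allocate(n * Long.BYTES);
        for (long i = 0; i < n; i++) {
            heap.putLong(i);
        }
        heap.flip();
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
            while (heap.hasRemaining()) {
                ch.write(heap);
            }
        }
        return file;
    }

    // Heap path: the file is copied chunk by chunk into an on-heap buffer,
    // and values are decoded from that copy.
    public static long sumViaHeap(Path file) throws IOException {
        long sum = 0;
        ByteBuffer heap = ByteBuffer.allocate(4096);
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            while (ch.read(heap) != -1) {
                heap.flip();
                while (heap.remaining() >= Long.BYTES) {
                    sum += heap.getLong();
                }
                heap.compact(); // keep any partial value for the next read
            }
        }
        return sum;
    }

    // Mapped path: the file is mapped into memory and read in place,
    // without an explicit copy into a heap buffer.
    public static long sumViaMapped(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                    ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            long sum = 0;
            while (mapped.remaining() >= Long.BYTES) {
                sum += mapped.getLong();
            }
            return sum;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = writeSample(1000);
        System.out.println("heap:   " + sumViaHeap(file));
        System.out.println("mapped: " + sumViaMapped(file));
    }
}
```

Both methods return the same sum; which one is faster depends on file size, access pattern and platform, which is presumably why the talk discusses mixing the two.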


