db4o Developer Community

db4o open source object database, native to Java and .NET

Product News from the Core Team

This blog features product news right from the core developer team, once new features and functions get checked into Subversion, available as Continuous Build every 2 hours.

Lazy Queries

How do you write a fast program? Easy: It should only do what is absolutely necessary with the smallest number of steps possible. Isn't that ultimate laziness?

Lazy == Fast !

With this idea in mind we looked for ways to do less work in our query processor and to let your application tell db4o to do less work. The solution we came up with is simple and straightforward: instead of fully processing a query when Query#execute() is called, we only choose the best index and create an iterator against it. All the rest of the query processing is done object by object while your application iterates through the ObjectSet and calls #next()/#MoveNext().
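
The idea can be modelled in plain Java. The following is only a minimal sketch, not db4o's query processor: the class names LazyFilterSketch and LazyResult are made up for illustration, a stream of integers stands in for the index, and "is even" stands in for the query constraint. It shows that the constraint is evaluated only as far as the caller actually iterates.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Predicate;
import java.util.stream.IntStream;

// Illustrative model only: plain Java, not db4o's query processor.
class LazyFilterSketch {

    // Wraps a candidate iterator (standing in for an index traversal) and
    // evaluates the constraint only when the caller asks for the next match.
    static final class LazyResult<T> implements Iterator<T> {
        private final Iterator<T> candidates;
        private final Predicate<T> constraint;
        private T pending;
        private boolean hasPending;
        int evaluated; // number of candidates tested so far

        LazyResult(Iterator<T> candidates, Predicate<T> constraint) {
            this.candidates = candidates;
            this.constraint = constraint;
        }

        @Override
        public boolean hasNext() {
            // Advance only until the next match is found, never further.
            while (!hasPending && candidates.hasNext()) {
                T candidate = candidates.next();
                evaluated++;
                if (constraint.test(candidate)) {
                    pending = candidate;
                    hasPending = true;
                }
            }
            return hasPending;
        }

        @Override
        public T next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            hasPending = false;
            T match = pending;
            pending = null;
            return match;
        }
    }

    // Returns how many candidates were tested to obtain the first `wanted` matches.
    static int candidatesEvaluatedFor(int wanted) {
        LazyResult<Integer> result = new LazyResult<>(
                IntStream.rangeClosed(1, 1_000_000).iterator(), n -> n % 2 == 0);
        for (int i = 0; i < wanted && result.hasNext(); i++) {
            result.next();
        }
        return result.evaluated;
    }

    public static void main(String[] args) {
        // Fetching 50 matches touches 100 of the 1,000,000 candidates.
        System.out.println(candidatesEvaluatedFor(50) + " candidates evaluated");
    }
}
```

An immediate evaluation of the same constraint would have to test all one million candidates before handing back the first result.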

This new feature is complete and in SVN, and we are very happy with the excellent benchmark results we are measuring. Lazy queries turn out to have even more advantages than we had anticipated:

(1) Lazy queries are extremely fast for partial result sets.

When you run a query, you are not always interested in iterating through the entire result ObjectSet. Sometimes you just want one good result, or perhaps only 50 good results to display in a GUI. In that case you do not want the query to run to completion before you get the first result back. You are perfectly OK with getting the first 50 results and stopping the query there.

 

(2) No more long running queries

Have you ever fired a query and ended up with a server that is blocked for seconds or even minutes? With lazy queries that just can't happen. Your application gets control back after every single object or after a configured number of objects, so it can decide how much CPU power it wants to spend on a specific query. If a query does not perform well enough because a user entry returns millions of results, you can time out query processing after any number of objects, whenever you want.
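
On the application side, such a cut-off could look roughly like the sketch below. It uses a plain java.util.Iterator in place of a lazy ObjectSet, and the helper takeWithin with its parameters maxObjects and budgetMillis is a hypothetical name chosen for this example, not part of the db4o API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.stream.IntStream;

// Illustrative only: the caller, not the database, decides when to stop.
class BoundedIterationSketch {

    // Pulls results until the iterator is exhausted, maxObjects results have
    // been collected, or the time budget has been used up.
    static <T> List<T> takeWithin(Iterator<T> results, int maxObjects, long budgetMillis) {
        long start = System.nanoTime();
        long budgetNanos = budgetMillis * 1_000_000L;
        List<T> taken = new ArrayList<>();
        while (taken.size() < maxObjects
                && System.nanoTime() - start < budgetNanos
                && results.hasNext()) {
            taken.add(results.next());
        }
        return taken;
    }

    public static void main(String[] args) {
        // An unbounded iterator stands in for a query with millions of hits.
        Iterator<Integer> hugeResult = IntStream.iterate(0, i -> i + 1).iterator();
        List<Integer> firstPage = takeWithin(hugeResult, 50, 200);
        System.out.println(firstPage.size() + " results fetched");
    }
}
```

Note that the budget is only checked between objects, so in this sketch a single slow step inside hasNext() can still overrun it.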

 

(3) Zero memory consumption

Our lazy queries do not need an intermediate representation as a set of IDs. We work exclusively with iterators. With this approach a lazy query ObjectSet does not have to cache a single object or ID. The memory consumption for a query is practically zero, no matter how large the result set is going to be.

 

(4) Parallel query execution and query result processing

Lazy queries start delivering query results before the entire query has completed. That's cool! You can start using these results already, while another thread completes the rest of the query.
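
One way to picture this overlap is a producer/consumer hand-off, sketched below with standard java.util.concurrent classes. This is an assumption-laden illustration rather than how db4o schedules its work: the producer thread merely stands in for a query that is still running, and the class name, the END sentinel and the queue size of 16 are arbitrary choices for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative only: results are consumed while they are still being produced.
class PipelinedResultsSketch {

    private static final int END = Integer.MIN_VALUE; // sentinel marking the end of the results

    static long sumOfSquares(int count) {
        BlockingQueue<Integer> handoff = new ArrayBlockingQueue<>(16);

        // Stands in for the query: delivers results one by one.
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= count; i++) {
                    handoff.put(i);
                }
                handoff.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        // The consumer starts processing as soon as the first result arrives.
        long sum = 0;
        try {
            for (int value = handoff.take(); value != END; value = handoff.take()) {
                sum += (long) value * value;
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while consuming results", e);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10)); // prints 385
    }
}
```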

 

So much for theory. Here are some concrete results from running our adapted Poleposition benchmark suite with db4o 5.7 and the db4o 6.0 preview in IMMEDIATE, SNAPSHOT and LAZY query mode:

http://www.db4o.com/downloads/Pp60.pdf

All results look very good.

Our marketing department loves to publish concrete "factor x" improvements. Please take a look at Indianapolis#getOneFromBigRangeQuery(): this one is a new high score. For this specific use case db4o performance has improved from 140166 milliseconds to 6 milliseconds. Let's convert the ratio: something that used to take more than six hours is now done in a single second.

(I can already see the headline when marketing releases this to the press: db4o v6 is now 23361 x faster.)

 

To use this new feature you only need to know one single configuration method:

Db4o.configure().queries().evaluationMode(QueryEvaluationMode mode);

(Our .NET team will surprise you with what they make of the above Java syntax. More news to come soon.)

 

You can pass three different constants to this method:

QueryEvaluationMode.IMMEDIATE

QueryEvaluationMode.SNAPSHOT

QueryEvaluationMode.LAZY

 

The behaviour can also be configured directly for an ObjectContainer, by calling

ObjectContainer#ext().configure().queries().evaluationMode()

Since the client configuration determines the mode to be used in a Client/Server setup, every single query can be configured individually.

For a more detailed explanation and a differentiation between LAZY mode and SNAPSHOT mode, please refer to the API documentation for QueryConfiguration#evaluationMode().

 

There are some points to be aware of when you use lazy queries:

- ObjectSet#size() is a very expensive method for lazy queries. It forces the query to be fully evaluated.

- LAZY mode is not fully compatible with concurrent Client/Server updates. For now we recommend SNAPSHOT mode for Client/Server if other transactions can possibly modify candidate objects while a lazy query is processing. Another option is to use a system of semaphores, for instance to prevent updates while lazy queries are being processed.

- If you iterate through a lazy ObjectSet and update or delete the returned objects, your changes may still influence the result of the query that is being evaluated. If you use lazy queries and want to update or delete objects, it can make sense to first put all the objects you want to touch into your own list, just as you would to prevent concurrent modifications in collections.
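
The "collect first, then modify" pattern from the last point can be sketched with ordinary Java collections. This is an analogy under stated assumptions, not db4o code: an ArrayList stands in for the database, a lazily filtered stream iterator stands in for the lazy ObjectSet, and deleteEvens is a made-up helper name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Illustrative only: plain collections model the store and the lazy result.
class CollectThenModifySketch {

    static List<Integer> deleteEvens(List<Integer> store) {
        // Step 1: drain the lazy result into our own list without touching the store.
        List<Integer> toDelete = new ArrayList<>();
        Iterator<Integer> lazyResult = store.stream().filter(n -> n % 2 == 0).iterator();
        while (lazyResult.hasNext()) {
            toDelete.add(lazyResult.next());
        }
        // Step 2: the result is fully evaluated, so modifying the store is now safe.
        store.removeAll(toDelete);
        return store;
    }

    public static void main(String[] args) {
        List<Integer> store = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
        System.out.println(deleteEvens(store)); // prints [1, 3, 5]
    }
}
```

Removing elements from the store inside the while loop instead would modify the source while it is still being traversed, which is exactly the situation the warning above describes.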

Because of the above issues we decided to make the behaviour configurable, so all our users can get what they really need. For now the default setting continues to be IMMEDIATE mode, so the default behaviour is 100% consistent with previous versions.

 

From looking at our benchmark results we expect that many users will experience huge performance improvements. Please let us know how lazy queries work for you!

The feature is in SVN and will be made available in a development build next Tuesday, November 14.

Published Friday, November 10, 2006 8:59 AM by Carl Rosenberger

Comments

 

Christof said:

> I can already see the headline when marketing releases this to the press: db4o v6 is now 23361 x faster.

We'll refrain from that...  Basically one can say:  It now works!

Great job!

Christof

November 10, 2006 3:21 AM
 

db4o en Español said:

db4objects, Inc. ( http://www.db4o.com ) has just released version 6 of db4o, its object database of

November 20, 2006 8:30 PM
 

db4o Newsletter said:

db4o Version 6.0 Debuts to the Community How to Contribute to db4o Seagate Personal Servers get a boost

November 21, 2006 1:10 AM
 

db4o auf Deutsch said:

db4o Version 6.0 ...has been released and is available to the community immediately as a development (beta) release for

November 21, 2006 10:42 AM
 

mikeon said:

I have read an entry on your blog about Lazy Queries. As good as it looks, I was wondering how well it would perform in a Client/Server scenario where the Client connection is over a network to a different Server machine. It seems that in this case there would be a large number of round trips. Am I right?

November 22, 2006 9:12 AM
 

db4o in Chinese said:

November-December event list: 11/18/2006 - Itasca, IL, USA 11/23/2006 - Vienna, Austria 11/24/2006 - Bangalore, India 11/28/2006

November 26, 2006 2:22 PM
 

db4o News and Press Releases said:

SAN MATEO, Calif., Dec. 14, 2006 - db4objects ( www.db4o.com ), creator of the open source object database,

December 14, 2006 4:40 AM
 

db4o Newsletter said:

Welcome to the January newsletter! db4o Version 6.0 Released as Production-Ready dOCL, the New db4o Open

December 14, 2006 4:58 AM
 

db4o in Chinese said:

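Happy holidays! db4objects wishes you a merry Christmas. We thank you for your support over the past year, and we hope 2007 will be an even more successful year! KUDO OF THE MONTH "Object-oriented databases... I did not expect them to be this simple and effective. Imagine, db4o is just one dll. Just one dll, that simple. No setup or configuration needed. It stores data efficiently.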

December 15, 2006 8:36 AM
 

db4o auf Deutsch said:

db4objects wishes everyone a wonderful Christmas season! We thank you for your valuable contributions and

December 19, 2006 6:27 AM
 

Kudos said:

http://www.infoq.com/news/2006/12/db40-6 Db4Object has released version 6.0 of their open source object

December 21, 2006 11:31 PM