Hi,
I'm currently evaluating db4o.
The API is very neat and I can see developer productivity getting a massive boost using an OODBMS with such a clean API.
Anyway, I got excited about db4o and ran some tests, which ended in disappointment. The query performance is bad; really bad. I must be doing something wrong!
My test is based on one persistent POJO:
package db4ostress;

import com.db4o.config.annotations.Indexed;

public class Person {
    @Indexed
    private String name;
    private String surname;
    int age;

    public Person(String name, String surname, int age) {
        this.name = name;
        this.surname = surname;
        this.age = age;
    }

    public String getName() {
        return name;
    }

    public String getSurname() {
        return surname;
    }

    public int getAge() {
        return age;
    }

    @Override
    public String toString() {
        return String.format("%s %s: %d", name, surname, age);
    }
}
I used a tool to generate a CSV file with realistic data for this class. I had it generate 300,000 records.
Then I wrote a simple import routine that reads the data from the CSV into a db4o ObjectContainer. I had problems with this too: I could not load all the objects in one transaction because the VM kept running out of heap space. I had to load the data using transactions limited to 1000 new objects each. This is something we might live with, but it's still a limitation. Btw, loading this data took more than 5 minutes, which is a bit slow.
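For anyone wanting to reproduce the batched import described above, here is a minimal sketch of the commit-every-N pattern. The `Store` interface is a hypothetical stand-in for the store and commit calls on the ObjectContainer, and `importInBatches` is a name made up for this example; it is not the actual import routine.

```java
import java.util.List;

public class BatchImportSketch {
    /** Hypothetical stand-in for the two container operations the import needs. */
    public interface Store {
        void store(Object obj);
        void commit();
    }

    /**
     * Stores every item, committing after each full batch and once more
     * for any leftover items. Returns the number of commits issued.
     */
    public static int importInBatches(Store store, List<?> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int commits = 0;
        int pending = 0;
        for (Object item : items) {
            store.store(item);
            pending++;
            if (pending == batchSize) {
                store.commit();   // end the current transaction
                commits++;
                pending = 0;
            }
        }
        if (pending > 0) {        // flush the final partial batch
            store.commit();
            commits++;
        }
        return commits;
    }
}
```

With 300,000 records and a batch size of 1000 this would issue 300 commits.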
Anyway, the major problem I hit was when I tried to retrieve all Person instances having an age >= 25. The query (using SODA) takes about 15 seconds!! Way too long for it to be feasible in a production environment. I had tried Native Queries using a Predicate, but this was much worse - it seemed to me that it was walking through the whole list of Person instances and invoking my predicate for each one, ignoring any indexes I had set up (using the @Indexed annotation). So after reading the docs, I tried SODA, which improved things, but is still far from what I would consider acceptable. The same query against the same data on a MySQL database takes about 250ms.
One thing I noticed was that if I changed the query to find objects having age == 25, the query executed in approx 1.2 seconds!! Surely this must indicate that something is wrong, no?
This is the code for my SODA query:
ObjectSet r;
Query q = oc.query(); //oc is the ObjectContainer
q.constrain(Person.class);
q.descend("age").constrain(new Integer(24)).greater();
long now = System.currentTimeMillis();
System.out.println("Starting query...");
r = q.execute();
System.out.println("Query Duration: " + (System.currentTimeMillis() - now));
Please tell me that I am doing something wrong! I am impressed with the db4o API and how easy it is to work with. But such poor performance is surely a stumbling block for our adoption of it.
I did the tests with v7.12.
Tks.
- Adrian.
Thanks for your answer.
I have actually analysed the data and found that it is evenly distributed with regard to the 'age' field. There are roughly 3000 records for each age, e.g. about 3000 each for age 12, 13, 14 etc. up to 99.
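To put rough numbers on that distribution, here is a back-of-the-envelope sketch. It assumes exactly 3000 records per age for ages 12 to 99, which is only an approximation of the figures above; the class and method names are made up for illustration.

```java
public class ResultSizeEstimate {
    // Assumptions taken from the rough figures in the post, not exact counts.
    static final int MIN_AGE = 12;
    static final int MAX_AGE = 99;
    static final int RECORDS_PER_AGE = 3000;

    /** Estimated number of records with age >= minAge. */
    public static long estimateAtLeast(int minAge) {
        int from = Math.max(minAge, MIN_AGE);
        if (from > MAX_AGE) {
            return 0;
        }
        return (long) (MAX_AGE - from + 1) * RECORDS_PER_AGE;
    }

    /** Estimated number of records with age == age. */
    public static long estimateEquals(int age) {
        return (age >= MIN_AGE && age <= MAX_AGE) ? RECORDS_PER_AGE : 0;
    }

    public static void main(String[] args) {
        System.out.println("age >= 25: " + estimateAtLeast(25)); // 75 ages * 3000 = 225000
        System.out.println("age == 25: " + estimateEquals(25));  // 3000
    }
}
```

Under these assumptions the age >= 25 query returns about 75 times as many objects as the age == 25 query.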
I have experimented with what you said by trying queries for age >= 99. This returns the results very fast. When I ran the query for age >= 95, it slowed down significantly. The lower the age I specify (and hence the larger the result set), the longer the query takes to execute.
This, imho, is somewhat of a flaw. With a B-tree index, when one queries the data in the above scenario for, say, age >= 45, locating the first record having age >= 45 via the 'age' index should be very fast - B-trees are very fast. Then, having located this first matching record, there should be no need to re-analyse the candidates to the right of that tree node because, following the B-tree logic, all records to the right of the matching record will always have age >= 45. The flaw, therefore, is re-checking every candidate record to the right for age >= 45 when this is already known from the index.
I might be wrong, but this correlates perfectly with my findings that the lower the age I specify, the longer the query takes to return.
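To illustrate the kind of range lookup I mean, here is a small sketch using `java.util.TreeMap` as a stand-in for a sorted index. It is a red-black tree, not a B-tree, and this is in no way db4o's implementation; the class and method names are made up. The point is only that an ordered structure lets you seek to the first key >= 45 and then take everything to the right without comparing each record again.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class AgeIndexSketch {
    // Hypothetical index on 'age': age -> ids of the records with that age.
    private final NavigableMap<Integer, List<Integer>> index = new TreeMap<>();

    public void add(int age, int recordId) {
        index.computeIfAbsent(age, k -> new ArrayList<>()).add(recordId);
    }

    /** Ids of all records with age >= minAge, in ascending age order. */
    public List<Integer> idsWithAgeAtLeast(int minAge) {
        List<Integer> ids = new ArrayList<>();
        // tailMap seeks to the first key >= minAge; every key after it
        // is already known to match, so no per-record check is needed.
        for (List<Integer> bucket : index.tailMap(minAge, true).values()) {
            ids.addAll(bucket);
        }
        return ids;
    }

    public static void main(String[] args) {
        AgeIndexSketch sketch = new AgeIndexSketch();
        int id = 0;
        for (int age = 12; age <= 99; age++) {
            for (int i = 0; i < 3; i++) {
                sketch.add(age, id++);
            }
        }
        System.out.println(sketch.idsWithAgeAtLeast(95).size()); // 5 ages * 3 records = 15
    }
}
```

Collecting the ids is of course still proportional to the size of the result, so some growth with larger result sets is expected either way; the sketch only shows that the matching itself needs no per-record comparison.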
Is this something that you would consider looking at?
Surely others here must have come across this issue.
Any comments?
Performance improvements like this one certainly have priority in the near future. If you vote for the issue, you can increase the likelihood that we do it:
tracker.db4o.com/browse/COR-1133
So far there is only one vote.
Phew! Good to hear I'm not the only one who has come across this.
I've just voted for it to be fixed.
Tks - Adrian.