db4o Developer Community

db4o open source object database, native to Java and .NET

Your Requirements for Persistent Collections ?

Last post 07-17-2008, 07:26 AM by Marek Istvanek. 14 replies.
  •  05-19-2007, 12:50 PM 36773

    Your Requirements for Persistent Collections ?

    Hi all,

    We are now starting to work on new collection implementations for db4o. The project
    is dubbed 'Fast Collections', and it has been in Jira for quite some time as #COR-113.

    This feature has received the most votes in our issue tracker.

    Some thoughts about the requirements are already posted to our Wiki here:
    http://developer.db4o.com/ProjectSpaces/view.aspx/Db4o_Product_Design/Fast_Collections

    We would love to hear how you as our users expect Persistent Collections to work.
    Please let us know about your requirements.
    Thank you!

  •  05-21-2007, 02:19 PM 36820 in reply to 36773

    Re: Your Requirements for Persistent Collections ?

    I think the wiki summarises the important points nicely. Some comments follow.

    Regarding the "Easy of Use Options", I prefer "Return db4o-specific implementations the first time a persistent collection is loaded from db4o (requires users to declare interfaces in their classes only)". This allows the entity objects to be used completely independent from db4o until persistence is required. I don't think the requirement to declare interfaces for collections is onerous.

    One important use-case for our app (RSSOwl2) is dealing with the case where a user has a lot of News for a given Feed. At the moment, we simply load the whole feed (with all the news) into memory because we do not expose db4o or an activation API outside the persistence layer. This works well for the case where a feed has a small or medium amount of news, but it can be a problem for heavy users.

    With transparent activation and fast collections (if I understand correctly), we could instead load the feed and rely on the news being loaded into memory in pages. This is great. One problem, however, is sorting. Depending on the user settings, the order in which the news is loaded would vary (at runtime). I don't know if it's possible, but it would be very nice for us to be able to set the sorting criteria for a child collection dynamically while activating/querying the parent object.

    This is not an essential feature because we can instead query for the News directly and sort/limit the amount there, but it would certainly make things a lot cleaner and easier. The kind of thing that db4o is known for. :)
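
    A minimal sketch of the fallback described above (query the News directly, then sort and limit), assuming hypothetical `News` and `NewsQuery` classes and an in-memory list standing in for the database. None of these names are db4o or RSSOwl API.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical entity used only for illustration.
class News {
    final String feedId;
    final String title;
    final long publishedAt;

    News(String feedId, String title, long publishedAt) {
        this.feedId = feedId;
        this.title = title;
        this.publishedAt = publishedAt;
    }
}

class NewsQuery {
    // Returns at most `limit` news items of one feed, ordered by a
    // comparator that the caller picks at runtime (e.g. from user settings).
    static List<News> topNews(List<News> all, String feedId,
                              Comparator<News> order, int limit) {
        return all.stream()
                .filter(n -> n.feedId.equals(feedId))
                .sorted(order)
                .limit(limit)
                .collect(Collectors.toList());
    }
}
```

    The wish in this post is for the same ordering and limit to be applied to the child collection itself while the parent Feed is activated, so that this separate query is not needed.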

    Ismael
     

  •  05-22-2007, 06:07 PM 36864 in reply to 36820

    Re: Your Requirements for Persistent Collections ?

    I think ijuma touched on it, but one thing that would be really useful would be the ability to query a collection.  I.e. if object Foo has a collection field Bar, we could query Bar for any object class/field value within that collection.  As a use case, say for example the racing team Ferrari had five drivers and Honda had four.  I'd like to be able to query for a given driver from the Ferrari team (Ferrari.Drivers collection) without worrying about getting drivers from the Honda team (Honda.Drivers collection).
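
    A rough sketch of the scoped query semantics asked for here, assuming hypothetical `Team` and `Driver` classes and a made-up `queryDrivers` method. It runs in memory only; a persistent collection would presumably evaluate the predicate inside the database rather than over loaded objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical classes for illustration only.
class Driver {
    final String name;
    final int points;

    Driver(String name, int points) {
        this.name = name;
        this.points = points;
    }
}

class Team {
    final String name;
    final List<Driver> drivers = new ArrayList<>();

    Team(String name) {
        this.name = name;
    }

    // The query is scoped to this team's collection, so drivers that
    // belong to other teams can never appear in the result.
    List<Driver> queryDrivers(Predicate<Driver> condition) {
        List<Driver> hits = new ArrayList<>();
        for (Driver d : drivers) {
            if (condition.test(d)) {
                hits.add(d);
            }
        }
        return hits;
    }
}
```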
  •  05-29-2007, 11:52 PM 37084 in reply to 36864

    Re: Your Requirements for Persistent Collections ?

    For me the support of Sets would be very helpful. As ijuma already said, it would be perfect if we did not need to use the ObjectContainer to obtain a new List, Map or Set.

    Additionally, paging of Collections is important as well. There must be a way to sort a collection without instantiating all the Objects it contains. On the sorted result we must be able to select a range of entries (e.g. entries 10 to 20).
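
    One way to picture this requirement, as a sketch under assumptions: keep a small index of (object id, sort key) pairs, sort only that index, and instantiate just the objects of the requested page. `SortedPager`, `Entry` and the `loader` callback are hypothetical names, not db4o API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.LongFunction;

class SortedPager<T> {
    // Index entry: an object id plus the single field value used for ordering.
    static final class Entry {
        final long id;
        final String sortKey;

        Entry(long id, String sortKey) {
            this.id = id;
            this.sortKey = sortKey;
        }
    }

    private final List<Entry> sorted;
    private final LongFunction<T> loader; // instantiates one object by id

    SortedPager(List<Entry> index, LongFunction<T> loader) {
        // Only the lightweight index entries are sorted, not the objects.
        this.sorted = new ArrayList<>(index);
        this.sorted.sort(Comparator.comparing((Entry e) -> e.sortKey));
        this.loader = loader;
    }

    // Instantiates only the objects at positions [fromInclusive, toExclusive).
    List<T> page(int fromInclusive, int toExclusive) {
        int end = Math.min(toExclusive, sorted.size());
        List<T> result = new ArrayList<>();
        for (int i = Math.max(0, fromInclusive); i < end; i++) {
            result.add(loader.apply(sorted.get(i).id));
        }
        return result;
    }
}
```

    With 100 indexed entries, `page(10, 20)` calls the loader ten times, so the other 90 objects are never instantiated.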

    Another issue is querying. We must be able to query keys and values in a Map.

    Since Collections store references to Objects, it would also be helpful to have some support for referential integrity. So if I delete a collection, only the contained objects that are no longer referenced elsewhere should be deleted. (At the moment I have to run a very ugly code fragment to avoid deleting objects that are still referenced.)
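
    The behaviour described here could be approximated with reference counting. The sketch below is only an illustration under that assumption: `RefCountingStore` is a hypothetical in-memory class, and it counts references from collections only, not from other fields.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class RefCountingStore {
    // Number of stored collections that reference each object (by identity).
    private final Map<Object, Integer> refCounts = new IdentityHashMap<>();

    void storeCollection(Collection<?> collection) {
        for (Object o : collection) {
            refCounts.merge(o, 1, Integer::sum);
        }
    }

    // Deletes the collection and returns only those elements that no other
    // stored collection references any more.
    List<Object> deleteCollection(Collection<?> collection) {
        List<Object> deleted = new ArrayList<>();
        for (Object o : collection) {
            Integer count = refCounts.get(o);
            if (count == null) {
                continue; // element was never stored
            }
            if (count <= 1) {
                refCounts.remove(o);
                deleted.add(o);
            } else {
                refCounts.put(o, count - 1);
            }
        }
        return deleted;
    }

    boolean isStored(Object o) {
        return refCounts.containsKey(o);
    }
}
```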

  •  05-31-2007, 03:38 PM 37143 in reply to 37084

    Re: Your Requirements for Persistent Collections ?

    Hi,

    Sorry if this is a little off track, but when I read:


    • .NET Generic Collections
      • Currently db4o stores the internal implementation of the concrete object provided by .NET with no special handling at all
      • This approach is inherently broken because it bloats the database, is slow, is not safe for concurrent access and some collection implementations do not work correctly at all.

    Does that mean I should not use generic list, as _usCollection = new List<US>();, in the objects I want to save with db4o?


    Richard
  •  06-01-2007, 01:37 PM 37162 in reply to 37143

    Re: Your Requirements for Persistent Collections ?

    wrote:
    > Does that mean I should not use generic list, as
    > _usCollection = new List<US>();, in the objects I
    > want to save with db4o?

    This is for .NET only; there are no known issues with
    generic collections on the Java side.

    Functionality does work as expected in single-user mode,
    but if you want to be on the safe side with collections
    and if you can use non-generic collections just as well,
    I would recommend staying away from generic collections
    until the new fast collections are available.

    http://tracker.db4o.com/browse/COR-113
  •  06-04-2007, 08:13 AM 37200 in reply to 37162

    Re: Your Requirements for Persistent Collections ?

    Ok, thank you.

    Richard
  •  06-12-2007, 10:40 AM 37408 in reply to 36773

    Re: Your Requirements for Persistent Collections ?

    I was wondering if you could indicate roughly when this feature might be available?  Even when it would be available as a beta release.

    The reason for asking is that we use collections extensively and I would like to investigate how db4o could help us manage and query large amounts of data.  I think features such as lazy loading could be of great value to us.

    I'm also excited about the possibility of using LINQ with "fast collections". Would I be right in thinking that the design of "fast collections" considers integration with LINQ? For example, if I query a "fast collection" using LINQ or Native Query, then only the objects in the result set are brought back from the database, rather than filling the entire collection in memory and then sifting those.

    Thanks,
    Richard
  •  06-12-2007, 01:57 PM 37409 in reply to 37408

    Re: Your Requirements for Persistent Collections ?

    rich257:
    I was wondering if you could indicate roughly when this feature might be available? Even when it would be available as a beta release.

    It always goes faster if you contribute to the db4o Project, either in time/code or money...

    Christof 


    Christof Wittig » db4objects
  •  07-07-2007, 02:13 PM 39697 in reply to 36773

    Re: Your Requirements for Persistent Collections ?

    Hi,

    I have some ideas:

    • implement own Set<T> collection (so that database is binary compatible between C# and Java)
    • make queries (native+LINQ?) possible on it
    • basically, current database is like one Set<object> collection where db.Set/db.Delete are set.Add/set.Delete
    • introduce top-level object and implement garbage collection (no need for explicit db.Delete ())
    • implement transparent object modification (no need for db.Set ())
    • the database would consist of one top-level object (Set<object> when migrating from old version) and everything reachable through it, where Set<T> objects could be queried, indexed, ...
    • my last thing in db4o wishlist is serializable transaction isolation level with MVCC :)
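
    The "top-level object plus garbage collection" idea in the list above amounts to a reachability check. A minimal sketch, assuming a hypothetical `Reachability` helper and a map from object ids to the ids they reference; this is not how db4o stores or deletes objects:

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class Reachability {
    // Returns the ids that cannot be reached from `root` by following
    // references; these would be candidates for automatic deletion.
    static Set<String> unreachable(Map<String, List<String>> refs, String root) {
        Set<String> seen = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            String current = stack.pop();
            if (!seen.add(current)) {
                continue; // already visited
            }
            for (String next : refs.getOrDefault(current, Collections.emptyList())) {
                stack.push(next);
            }
        }
        Set<String> garbage = new HashSet<>(refs.keySet());
        garbage.removeAll(seen);
        return garbage;
    }
}
```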

    Jan

  •  07-10-2007, 06:57 AM 39754 in reply to 36773

    Re: Your Requirements for Persistent Collections ?

    I am not that familiar with the underlying design of Db4Objects. (Where are the design docs?) I have used many databases over the last 25 years, including hierarchical (Raima), relational (Oracle, Sybase, PostgreSQL), B-Tree and ISAM (FairCom, Vermont), using Perl, C/C++, C#, Modula-3 and Haskell.

    My Db4Objects needs will focus mainly on a few essential requirements:

    • Need to address memory as well as performance issues for small (00) medium (0000) and large (000000) collections. This will be an ongoing exercise in efficiency.

    • Need to allow user-defined Predicate<T>, IComparer<T> and Enumerator definitions with multiple constraints and global external search strings (usually primary unique keys). These constraints may be included in many user-defined overloaded methods (e.g. BinarySearch, Exists, FindAll, FindIndex, FindLast, FindLastIndex, GetEnumerator, Sort). Maybe this can be integrated with the ‘Query’ processor?

    • Need to allow user-defined search and sort algorithms to be applied against specific persistent objects. Thus the user can control the memory chunks, which act as a constraint on memory resources. The purpose is to allow the user to design efficient iterators against specific persistent objects. (There may be a need to externalize the handles.) May also want to give consideration to concurrency, coprocessing and partitioning (i.e. future design enhancements).

    • Need to align the set of persisted collection classes with the current transient collection sets, with which many developers are very familiar. Some of the persistent classes should be abstract to allow user inheritance.

    • Need to use ‘Event’ and ‘Exception’ handlers (callbacks) where possible to allow fine-grained control of the ‘add’, ‘update’ and ‘removal’ process. May want to differentiate ‘dirty’ vs ‘destructive’ updates along with a test for using unique keys. Currently DB4Objects combines add and update functionality, whereas using collections with unique keys requires separation.

    My suggestion is to try not to rush this difficult project! Upon each milestone, try to get as much user feedback as possible. Also, please publish the process; otherwise it may wilt on the vine due to a lack of user participation. This takes time but will pay big dividends later on. I am sure that collisions and conflict will occur with the legacy artifact, resulting in deprecation and outright abandonment of some prior code.

    The idea of manufacturing ‘Fast Collections’ is very important but ‘Memory’ considerations are just as important. It’s a delicate balancing act!

    “Lean manufacturing is the continuous process (kaizen) of reducing ‘muda’ (waste), ‘mura’ (unevenness of work load) and ‘muri’ (overburden of man and machine) in manufacturing operations to improve overall customer value by focusing on speed, flexibility and quality.” – food for thought.

    I am willing to test each stage against my ongoing ‘Dependency Matrix’ tracking project which encapsulates millions of items with many large inter-related collections. (still in development)

    Best of luck,  John Sealey

  •  01-12-2008, 10:48 PM 45798 in reply to 36773

    Re: Your Requirements for Persistent Collections ?

    I just found this thread, so I thought I would collect here some questions I had previously posted elsewhere relating to collection needs:

    Re: Request: lazy activation for collections
    http://developer.db4o.com/forums/post/45692.aspx
    - about collections needing to load/activate elements on demand, instead of the entire set:
    - from the fast collections project space, it seems like you guys are already quite aware of this issue

    Re: ArrayMap4 is not a hash map. linear key lookup performance instead of "constant time" lookup:
    http://developer.db4o.com/forums/post/45633.aspx
    - it occurs to me that this could be easily fixed with a transient private HashMap that gives speedy access to in-memory elements.
    - the _keys and _values fields are then used primarily for persistence
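
    A sketch of that suggestion, assuming a hypothetical `IndexedArrayMap` class rather than the real ArrayMap4 source: the parallel key and value lists are the persisted state, and a transient HashMap is rebuilt on first use to give hash-based lookups. Removal is left out to keep the example short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class IndexedArrayMap<K, V> {
    // State that would be persisted: parallel lists of keys and values.
    private final List<K> keys = new ArrayList<>();
    private final List<V> values = new ArrayList<>();

    // In-memory only: maps each key to its position in the lists.
    private transient Map<K, Integer> index;

    // Rebuilds the index lazily, e.g. after the object was loaded from storage.
    private Map<K, Integer> index() {
        if (index == null) {
            index = new HashMap<>();
            for (int i = 0; i < keys.size(); i++) {
                index.put(keys.get(i), i);
            }
        }
        return index;
    }

    // Stores a value and returns the previous value for the key, or null.
    V put(K key, V value) {
        Integer pos = index().get(key);
        if (pos != null) {
            return values.set(pos, value);
        }
        keys.add(key);
        values.add(value);
        index.put(key, keys.size() - 1);
        return null;
    }

    // Hash lookup instead of a linear scan over the key list.
    V get(K key) {
        Integer pos = index().get(key);
        return pos == null ? null : values.get(pos);
    }

    int size() {
        return keys.size();
    }
}
```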

    Re: ObjectContainer.set() is expensive? [for collections]
    http://developer.db4o.com/forums/post/45776.aspx
    - could/should the set() call be made more efficient for repeated (and unpredictable) occasions of needing to store the same collection?

    Lastly, how do I contribute possible candidates for com.db4o.collections? I don't know if any "fast collections" are in the current 7.x sources yet, but perhaps some lazy collection implementations would be useful? I am still working out some issues with them, but so far it seems promising.
  •  05-21-2008, 05:14 PM 49176 in reply to 36773

    Re: Your Requirements for Persistent Collections ?

    Fast batch activation for collections (even in a C/S environment) would definitely be a big achievement.


  •  07-15-2008, 11:26 AM 50144 in reply to 36773

    Re: Your Requirements for Persistent Collections ?

    Carl,

    I have been watching the Jira entry for a while, but haven't seen any updates to COR-113. Subversion logs show some activity around collections, but I am not sure whether it is connected to 'Fast Collections'.

    Can you please provide some info about what can be expected in the near future? Will Fast Collections (at least partially) show up in upcoming releases, or will they only be available in later releases, e.g. db4o 8.x?

    Any comment (even a negative one) would be appreciated.

    Thanks,
    Andras

  •  07-17-2008, 07:26 AM 50176 in reply to 50144

    Re: Your Requirements for Persistent Collections ?

    Yes, Carl, please, tell us more about fast collections.
    Best regards
    Marek