Pencils Down

This weblog is about my experiences in software development

Browsing Posts tagged Hibernate

I have recently been working on a project looking for memory leaks. (BTW, I am extremely happy with the tools available in the JDK: jconsole, jmap, jhat.) While poking around at some changes we could make in our code I kept finding quite a few Hibernate objects hanging around. One thing in particular was the number of logging objects (we log user actions in the db) that were in memory just for a login step. Examining the code revealed no apparent flaw. Putting in some debugging statements showed a tremendous number of logging objects being instantiated during the parent.getChildren() type of statement in the code that adds the logging event to the database table.

If you find the documentation somewhere (I have a personal nit that I think most of the Hibernate documentation is atrocious.  I can NEVER find what I am looking for) you will see that the lazy attribute for a set has three choices:

  1. false
  2. true (default)
  3. extra

false means load the collection when the primary object is loaded.  I think everyone expected that.

true means wait until some code asks for a collection. Then load the entire set!  I don’t think most people expected that.

extra means return a Hibernate interceptor for the collection to the caller and only load individual members of the set when explicitly called for.  I think this is the one most people expect to have happen.

We have been running with Hibernate for over a year. Like everyone else we thought we were taking advantage of lazy loading of collections.  We have code all over the application that looks like:

Child child = new Child();

child.setParent(parent);

parent.getChildren().add(child);

session.flush();

Of particular note is the parent.getChildren() call. Hibernate intercepts it, and we assumed it would load only what was needed. Our mapping for the set in the Parent mapping had no setting for lazy, as the documentation says that lazy is on by default.

This means in our code where we just wanted to add a logging event we were loading the entire logging event table every time we added a record to the table!
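A toy model may make the cost difference easier to see. It is plain Java with no Hibernate involved; the class name, the row counter and the assumption that extra needs no full load on add() are all illustrative, not Hibernate's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (no Hibernate involved) of how many rows each lazy setting materializes. */
final class LazyCollectionModel {

    enum Lazy { FALSE, TRUE, EXTRA }

    private final Lazy mode;
    private final int rowsInTable;                 // child rows already stored for the parent
    private int rowsLoaded = 0;                    // rows turned into objects so far
    private final List<String> pendingInserts = new ArrayList<>();

    LazyCollectionModel(Lazy mode, int rowsInTable) {
        this.mode = mode;
        this.rowsInTable = rowsInTable;
        if (mode == Lazy.FALSE) {
            rowsLoaded = rowsInTable;              // loaded together with the parent
        }
    }

    /** Models parent.getChildren().add(child). */
    void add(String child) {
        if (mode == Lazy.TRUE) {
            rowsLoaded = rowsInTable;              // first touch initializes the whole set
        }
        // Lazy.EXTRA: assumed here to add the element without a full load
        pendingInserts.add(child);
    }

    int rowsLoaded() {
        return rowsLoaded;
    }

    int pendingInserts() {
        return pendingInserts.size();
    }
}
```

With 100,000 existing rows, one add() in TRUE mode leaves rowsLoaded() at 100,000, while the same call in EXTRA mode leaves it at 0.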

We have changed all of our one-to-many sets to use extra.  So far we are guessing that the many-to-many sets will likely traverse every matching member so the default true setting is good.

Yes, I know, technically HQL doesn’t support derived tables or columns.  However, you can count on the underlying SQL to do some of the work for you.

For example, if you needed to order by any of 3 columns in left joined tables, depending on whether each join found a matching row, you could do something like:

select distinct t1,
case when t2.t2Id is null
then
(case when t3.t3Id is null
then t1.orderingColumn
else t3.orderingColumn
end)
else t2.orderingColumn
end
from table1 t1
left join t1.t2s t2
left join t1.t3s t3
order by 2, t1.description

This builds on allowing case in the select clause (thereby giving you your pseudo-derived column) and the result order number (2) which falls directly back to the underlying SQL.
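For readers who find the nested case hard to scan, the same fallback can be written as a few lines of plain Java. This is only a sketch of the precedence (t2 first, then t3, then t1); the class name, the parameter names and the use of String for the ordering column are assumptions made for illustration.

```java
/** Plain-Java equivalent of the nested CASE expression that produces the sort key. */
final class OrderingKey {

    private OrderingKey() {
    }

    /**
     * t2Id and t3Id are null when the corresponding left join found no row,
     * which mirrors the "is null" tests in the query.
     */
    static String of(Long t2Id, String t2Order, Long t3Id, String t3Order, String t1Order) {
        if (t2Id != null) {
            return t2Order;   // outer "else" branch: a t2 row exists
        }
        if (t3Id != null) {
            return t3Order;   // inner "else" branch: no t2 row, but a t3 row exists
        }
        return t1Order;       // neither join matched
    }
}
```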

This is a really ugly exception thrown by Oracle from the bowels of Hibernate. In my case I had just added a table to the mapping so I at least had some idea where to look.

Looking at the integer values in the table I went through and verified each as correct in the db, mapping, entity and sequence. No luck there.

Poking around the internet though told me the error probably had nothing to do with a LONG mapping and could be ANY size mapping on any field. It’s a generic sizing error.

Luckily the table I created only had a few fields so I was able to narrow down the problem pretty quickly. In my case the code was trying to store far more than 4000 characters in a VARCHAR2(4000) column. So, the String in the Java entity had to be controlled to make sure it was small enough.
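One way to control the String is a small guard in the entity's setter. The sketch below assumes the column limit is counted in bytes and that the database character set behaves like UTF-8; both are assumptions to check against your own schema, and the class and method names are made up for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that keeps a String within a byte budget for a VARCHAR2 column. */
final class Varchar2Guard {

    static final int MAX_BYTES = 4000;   // assumed limit of the column

    private Varchar2Guard() {
    }

    /** Returns value unchanged if it fits, otherwise the longest prefix that fits. */
    static String truncateToBytes(String value, int maxBytes, Charset charset) {
        if (value == null || value.getBytes(charset).length <= maxBytes) {
            return value;
        }
        StringBuilder out = new StringBuilder();
        int used = 0;
        for (int i = 0; i < value.length(); ) {
            int codePoint = value.codePointAt(i);
            String piece = new String(Character.toChars(codePoint));
            int size = piece.getBytes(charset).length;
            if (used + size > maxBytes) {
                break;                    // the next character would not fit
            }
            out.append(piece);            // whole code points only, never half a pair
            used += size;
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    /** Convenience overload using the assumed defaults. */
    static String truncate(String value) {
        return truncateToBytes(value, MAX_BYTES, StandardCharsets.UTF_8);
    }
}
```

Whether to truncate silently or reject the value is a design choice; truncation is used here only because it is the shortest thing to show.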

It’s not much, but thought I would pass the info along and maybe save you some time.

We have recently run across a couple of situations that are really batch operations. You know the kind: read N records of type Y or write M records of type Z, all within a nice Hibernate session/transaction of course.

If you do the brain-dead coding looping over calls to object.setX(), object.save(), you are making a round trip from your logic layer all the way to the database (including the cache) for every loop step.

There is also the slight problem that you may run out of memory with a large operation as all of this has to sit in the cache until flushed out.

Looking at some literature points to the Hibernate batch size, but others say this really doesn’t work.

Others are more extreme and point to using raw SQL to perform your batch.

Why aren’t there batch calls sitting in Hibernate?  This is not a new persistence problem.
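For the memory side of the problem, a commonly described pattern is to flush and clear the session every N saves, usually paired with the hibernate.jdbc.batch_size setting. The sketch below shows only the loop. BatchSession is a stand-in interface written for this example that mirrors the save(), flush() and clear() methods of a Hibernate Session, so the code runs without Hibernate; whether JDBC batching actually kicks in depends on your configuration.

```java
import java.util.List;

/** Stand-in for the three Session methods the pattern relies on (hypothetical interface). */
interface BatchSession<T> {
    void save(T entity);

    void flush();

    void clear();
}

final class BatchWriter {

    private BatchWriter() {
    }

    /**
     * Saves every entity, flushing and clearing after each full batch so the
     * session cache never holds more than batchSize new objects.
     * Returns the number of flushes performed.
     */
    static <T> int saveAll(BatchSession<T> session, List<T> entities, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int flushes = 0;
        int pending = 0;
        for (T entity : entities) {
            session.save(entity);
            pending++;
            if (pending == batchSize) {
                session.flush();   // push the pending statements to the database
                session.clear();   // release the objects held in the session cache
                flushes++;
                pending = 0;
            }
        }
        if (pending > 0) {         // final partial batch
            session.flush();
            session.clear();
            flushes++;
        }
        return flushes;
    }
}
```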

This is one of those errors that shouldn’t happen.  However, due to lazy-loading Hibernate is smart enough to realize you have loaded an object once before during a session and does not attempt to do so again – even though you have deleted it during the session and it should not be a part of any collection anymore.

A workaround I have started to use is to set the version of objects I am deleting to -1, and then later, when I am about to persist or merge an object, I check to see if the version is still valid first. This includes objects I delete by hand and objects that may be about to be cascade deleted that I may later on attempt to persist/merge.

I am sure there is a smarter way to do this, but this works well.
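A minimal sketch of that workaround, in plain Java, might look like the following. The class, field and method names are invented for the example, and the real entity would of course carry its Hibernate mapping as well.

```java
/** Hypothetical entity base that uses version == -1 as a "deleted in this session" flag. */
final class DeletableEntity {

    static final int DELETED_VERSION = -1;

    private int version;

    DeletableEntity(int version) {
        this.version = version;
    }

    int getVersion() {
        return version;
    }

    /** Call this at the point where the object is deleted (by hand or by cascade). */
    void markDeleted() {
        version = DELETED_VERSION;
    }

    boolean isDeleted() {
        return version == DELETED_VERSION;
    }

    /** Check to run just before persist or merge. */
    static boolean safeToPersist(DeletableEntity entity) {
        return entity != null && !entity.isDeleted();
    }
}
```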

One thing I was toying with was the error message that comes up for this exception. It usually mentions the object using the standard notation of <object>@<hashcode>, where the hashcode that prints out is null. This is a little weird as the hashcode that I have in my entities is an int not an Integer, so I can’t tell how to poke at a null int to get the same effect as my invalid version/deleted flag.

We have several wizards in the application we are developing under JBoss/ICEfaces/Spring. We had decided some time ago to provide long running sessions to allow the user to move between wizard screens and then finalize (commit) at a later time.

This all works pretty well until you get to a delete operation where there is some non-primary constraint involved.  The typical culprit is a unique ‘name’ for an object.  As you all know Hibernate takes all of the underlying SQL operations and lines them up in roughly the order of INSERT, UPDATE and then the DELETEs.  (There are a few other cases that do not apply here).

So, in a wizard, if the user does something that boils down to add a thing called X, delete a thing called X and then add a thing called X again, your app will die. Using the above Hibernate ordering you end up with INSERT X, INSERT X, and DELETE X, and the second INSERT throws a constraint violation exception.
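The failure is easy to reproduce in a toy model with no Hibernate at all. The sketch below treats the table as a set of unique names and compares the user's order of operations with the reordered one; all class and method names are made up for the example, and UPDATEs are left out because they play no part here.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of a flush that runs all INSERTs before all DELETEs. */
final class FlushOrderModel {

    enum Kind { INSERT, DELETE }

    static final class Op {
        final Kind kind;
        final String name;

        Op(Kind kind, String name) {
            this.kind = kind;
            this.name = name;
        }
    }

    private FlushOrderModel() {
    }

    /** Applies ops to an empty table with a unique constraint on name; false means a violation. */
    static boolean apply(List<Op> ops) {
        Set<String> names = new HashSet<>();
        for (Op op : ops) {
            if (op.kind == Kind.INSERT) {
                if (!names.add(op.name)) {
                    return false;          // name already present: unique constraint violated
                }
            } else {
                names.remove(op.name);
            }
        }
        return true;
    }

    /** Reorders the queue the way the post describes: INSERTs first, DELETEs last. */
    static List<Op> flushOrder(List<Op> ops) {
        List<Op> ordered = new ArrayList<>();
        for (Op op : ops) {
            if (op.kind == Kind.INSERT) {
                ordered.add(op);
            }
        }
        for (Op op : ops) {
            if (op.kind == Kind.DELETE) {
                ordered.add(op);
            }
        }
        return ordered;
    }
}
```

Applying INSERT X, DELETE X, INSERT X in the user's order succeeds; applying the result of flushOrder() to the same list fails on the second INSERT.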

We have played with a few ideas here but none of them ‘feel’ like we are using Hibernate correctly:

  • Don’t delete, just mark the record – doesn’t solve the unique constraint
  • Cache deletes by hand – painful coding
  • Delete forces a flush – violates the whole point of the long transaction/session

Leaning towards the last one now, but client may want #2.

The normal Hibernate entity collection returns a Set<?>.  This is slightly deceptive for Java developers because it would appear that you can perform standard Set operations on the entity collection and have Hibernate do the right thing.  This is an invalid assumption.

For example, Set.clear() would appear to remove all elements in the collection.  In reality it is not clear what Hibernate may do when you perform this step.

Once you realize the clear() call did not work, the typical workaround might be to create a Set in memory from the entity.getCollection() call, assuming that your in-memory Set exists as a separate object. Again, this is an invalid assumption. The statement Set<?> set = entity.getCollection() does not copy anything; it only gives you another reference to the same Hibernate-managed collection. As such it will be maintained just as if you had left it in the original entity in the first place.

The solution is to copy all the materialized objects out of the Hibernate Set into some other collection, as in:

List<Object> list = new ArrayList<Object>();

list.addAll(hibernateSet);

You now finally have a collection that will NOT be maintained by Hibernate that you can operate on directly.
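The difference between holding a second reference and holding a copy can be checked with ordinary collections. In this sketch a HashSet stands in for the Hibernate-managed Set, and the class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Contrasts a second reference to a collection with a detached copy of its elements. */
final class CollectionCopyDemo {

    private CollectionCopyDemo() {
    }

    /** What "Set<?> set = entity.getCollection()" gives you: the very same object. */
    static <T> Set<T> alias(Set<T> source) {
        return source;
    }

    /** A new list holding the same elements; later changes to source do not show here. */
    static <T> List<T> detachedCopy(Set<T> source) {
        return new ArrayList<>(source);
    }
}
```

Clearing the original set empties the alias as well, while the detached copy keeps its elements. Note that only the collection is copied; the elements themselves are still the same entity objects.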

This error message usually means your database has bad references between tables.  As Hibernate will enforce the integrity of these relations this is usually caused via some data load operation.

A recurring situation for us was a data load script that was O/S dependent. It would work great on Windows, and then the same script running on Red Hat would put spurious data into elements.

Particularly hard to detect are String keys where trailing spaces do not normally display when using SQL tools, but are present and cause a mismatch on keys.
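A quick check after a data load can make such keys visible. The helper below is a hypothetical sketch: it flags any key with leading or trailing whitespace (String.trim() also strips a stray carriage return, which is one plausible leftover of a script moved between operating systems) and wraps it in brackets so the padding shows up in a log.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical post-load check for String keys padded with invisible whitespace. */
final class KeyWhitespaceCheck {

    private KeyWhitespaceCheck() {
    }

    /** Returns the offending keys, each wrapped in [] so the padding is visible. */
    static List<String> suspiciousKeys(List<String> keys) {
        List<String> result = new ArrayList<>();
        for (String key : keys) {
            if (key != null && !key.equals(key.trim())) {
                result.add("[" + key + "]");
            }
        }
        return result;
    }
}
```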

One of our developers had the following type of code in place:

Parent parent = new Parent();
dao.create(parent);
Child child = new Child();
parent.getChildren().add(child);
dao.saveOrUpdate(parent);
dao.create(child);

All of this code worked!  We never noticed the error until we put a Hibernate wrapper in play that has asserts for all kinds of extraneous conditions.  In the above code the child has not been created in Hibernate yet.

So, why did it work in straight Hibernate?  We think it didn’t work and only gave a silent error/exception.  All of this would be cleaned up later as a cleanup task by the create(child) call or the even later flush() call.

Of course, once we switched the order of the create(child) call with the saveOrUpdate(parent) it all worked.

We had noticed similar silent errors for delete() calls as well.

One of our developers came up with some sample code like the following:

//create an entity

Entity e = new Entity();

e.setEntityName("x");

EntityId id = dao.create(e);

//attempt to find it

Entity e2 = (Entity) dao.createQuery("from Entity where entityName='x'").list().get(0);

//???? e2 is null !!!!!!

//look a different way

Entity e3 = dao.findById(id);

//???? e3 is correctly loaded

The Hibernate session cache is keyed by id. So, HQL queries bypass the cache when looking around for matches and go direct to the db (e2), while lookups by id are served from the cache (e3).

The choices are then:

- If you need to reference cache objects by other than id (as above), keep a map as you progress

Map<String, EntityId> map = new HashMap<String, EntityId>();

map.put("x", id);

Entity e4 = dao.findById(map.get("x"));

- or flush() as needed so the HQL will ‘work’

I think most of the time this (keeping ids around) works. For example most DTOs and UI representations are maintaining the id of the object and later retrieving by that id. Unfortunately, I know there are a couple of pain points where the id is not the key of interest so additional coding will have to be used.

Also, flushing continually defeats the whole purpose of having a long running session/transaction.  So, that option is of limited use as well.
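The behaviour described in this post can be reproduced with a toy model in plain Java. It assumes a session that is not flushed automatically before queries, which seems to match the long running session setup described here; the class and its methods are invented for the example and are not Hibernate APIs.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of an id-keyed session cache sitting in front of a database table. */
final class SessionCacheModel {

    private final Map<Long, String> database = new HashMap<>();      // id -> name, rows already written
    private final Map<Long, String> sessionCache = new HashMap<>();  // id -> name, objects the session knows
    private long nextId = 1;

    /** Registers a new object with the session; nothing is written yet. */
    long create(String name) {
        long id = nextId++;
        sessionCache.put(id, name);
        return id;
    }

    /** Query by a non-id property: modelled as seeing only rows already in the database. */
    Long queryIdByName(String name) {
        for (Map.Entry<Long, String> row : database.entrySet()) {
            if (row.getValue().equals(name)) {
                return row.getKey();
            }
        }
        return null;
    }

    /** Lookup by id: the session cache is consulted first, then the database. */
    String findById(long id) {
        String cached = sessionCache.get(id);
        return cached != null ? cached : database.get(id);
    }

    /** Writes everything the session knows about to the database. */
    void flush() {
        database.putAll(sessionCache);
    }
}
```

After create("x"), queryIdByName("x") finds nothing while findById(id) succeeds; once flush() has run, the query by name finds the row too.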