Hibernate Performance-Tuning
Today, I spent some time to dive deeply into the heart of the Semtinel Core to fix a performance leak that occured since I did some experiments with rather large annotation sets. I ran into several difficulties that might be interesting to other developers as well, so this post covers the following topics:
- Tracing of SQL statements for performance tuning of the H2 database.
- Generating statistics for H2 in order to see, where the time is lost.
- How to configure a single-value property in Hibernate for lazy fetching using annotations.
- Extending the Netbeans build.xml, so that the necessary bytecode instrumentation is done automatically at build time.
First of all, I had to find the reason for the weak performance of the running analysis in Semtinel. In Semtinel, Hibernate is used to access a H2 database, where all the data about thesauri, record sets and annotations is stored. Thanks to Hibernate, the data access is almost transparent throughout the application and there are only very rare situations, where JDBC/SQL has to be used. The drawback is, that the actual SQL statements are generated automatically by Hibernate and if performance is somewhat critically, a closer look on these statements is strongly recommended.
H2 statement tracing and analysis:
In order to view the actual SQL statements that are performed on the database, you have to adjust the trace level. This can be done programmatically or by appending an option to the database URL. This is, what I did, the URL has to look like this:
jdbc:h2:~/test;TRACE_LEVEL_FILE=3;
3 is the most verbose level (DEBUG). In this case, trace information is written to the standard trace file, which is located in your database directory and named database.trace.db.
More information about the trace level adjustment can be found here: Trace Options
Then, you can run your application as usual and have a look into the tracefile to see, what happens. But for performance tuning, it is useful to get some accumulated statistics for the statements. This is achieved by the ConvertTraceFile utility, that is part of the H2 distribution. So just enter your database directory and execute the following:
java -cp PATH_TO_LIB/h2.jar org.h2.tools.ConvertTraceFile -traceFile database.trace.db -script out.sql
Afterwards, you find all SQL statements in out.sql and at the end of the file a summarising statistic like this one (I inserted [...] where I skipped something for brevity):
----------------------------------------- -- SQL Statement Statistics -- time: total time in milliseconds (accumulated) -- count: how many times the statement ran -- result: total update count or row count ----------------------------------------- -- self accu time count result sql -- 78% 78% 6773 4509 4509 select conceptann0_.concept_id as concept5_7_, [...], select count(*) from Annotation a where a.annotationSet_id=annotation2_.id as formula1_1_ [...] from ConceptAnnotationSetAnalysisValue conceptann0_ [...] where conceptann0_.concept_id=?; -- 13% 91% 1198 3601 10659 select narrower0_.broader_id as broader2_3_ [...] -- 4% 95% 351 4509 4509 select [...] -- 1% 97% 158 12729 12729 SELECT VALUE FROM INFORMATION_SCHEMA.SETTINGS WHERE NAME=?; [...]
Here, I found the reason for the performance-leak. I used a Hibernate formula to calculate the actual size of an annotation-set by a select count(*) statement. As part of the join, this statement is executed on every single select of a concept. So I have to adjust this property to lazy-fetching, so that it is only calculated, if the size of the annotation-set is actually needed.
Configuring a single-value property for lazy-fetching:
This part seemed to be rather easy, I just had to add another annotation for the fetching strategy:
@Entity @Table(name="AnnotationSet") public class AnnotationSetImpl implements AnnotationSet { [...] @Formula(value="select count(*) from Annotation a where a.annotationSet_id=id") @Basic(fetch=FetchType.LAZY) private int size;
So far, so easy, unfortunately, Hibernate completely ignored my wish for lazyness and a short glimpse in the documentation reveals the reason:
Lazy property loading requires buildtime bytecode instrumentation! If your persistent classes are not enhanced, Hibernate will silently ignore lazy property settings and fall back to immediate fetching.
Automate bytecode instrumentation in Netbeans.
The last step for today. Extending the build-process in Netbeans should be quite easy, as it is totally based on ant. It should… First of all, here is, what I did and what worked for me: I extended the build.xml of my Netbeans module in question and inserted the following task:
<target name="jar-prep"> <taskdef name="instrument" classname="org.hibernate.tool.instrument.cglib.InstrumentTask"> <classpath refid="cp"/> </taskdef> <instrument verbose="true"> <fileset dir="${build.classes.dir}/org/semtinel/core/data/hibernate/"> <exclude name="DatabaseSettings*.class"/> <exclude name="*.properties"/> <exclude name="*.java"/> </fileset> </instrument> <mkdir dir="${cluster}/${module.jar.dir}"/> <tstamp> <format property="buildnumber" pattern="yyMMdd" timezone="UTC"/> </tstamp> </target>
The target “jar-prep” is executed after the build, directly before the jar-creation. Worth to mention is the definition of the classpath, which is globally defined via the variable cp in the Netbeans build process, and the declaration of the fileset, which should only contain Hibernate data beans, here realised via an exclude of all other files in the package.
The mkdir and tstamp commands are copied from the build-impl.xml, as this target definition overwrites the original “jar-prep” target and thus have to execute these commands. And this is ugly, of course! Normally, you should overwrite the target “-post-compile” or “-pre-jar”, which exist exactly for this purpose. Unfortunately, this did not work for me, the targets were completely ignored. Maybe you have some hints for me, but for now, I can live with my “dirty little workaround”.








