Hibernate: Much faster with @NamedEntityGraph - A Noteworthy Annotation

A Java application running on fast hardware, with a fast connection to the database cluster - and yet, the SOAP service keeps hitting the 60-second time limit at which the load balancer cuts any open connections. Where are the time-sinks?

The Java application in question manages units in a database that it accesses via Hibernate (as a JPA implementation). Each unit has an m:n association with authorizations, which are held in a separate table.

The SOAP service to be optimized executed the following steps:

  • user authentication by user name and password
  • request about 22,000 unit data sets including their authorizations from the database
  • convert data sets for distribution via SOAP
  • marshal and distribute the response

Some of those steps, like marshalling or authentication, were not newly implemented for this service. The list is just meant to illustrate the amount of work that has to be done in order to complete a request.

The 60-second time limit

The application we are talking about is installed on a cluster within our application platform appNG. A load balancer distributes incoming requests among all nodes within the cluster. The load balancer is configured to interrupt any open connections after a maximum of 60 seconds.

That means that transmission of the service response necessarily has to begin within those 60 seconds. And that was exactly what the application repeatedly failed at - fast hardware and a fast connection to the database cluster notwithstanding.

In my development environment (not entirely comparable in terms of runtime), I initially used SoapUI to take a few measurements, so as to record the initial situation.

[...Soap11.unitExport:Request 8] in 44950ms (17755730 bytes)
[...Soap11.unitExport:Request 8] in 37042ms (17755730 bytes)
[...Soap11.unitExport:Request 8] in 37160ms (17755730 bytes)
[...Soap11.unitExport:Request 8] in 40836ms (17755730 bytes)

In my development environment, the process including the transmission of all data took an average of 39.975 seconds. This, consequently, was the reference value for judging the success of all my optimization measures. For various reasons, my local readings are lower than those in the production environment, but that makes no difference for a qualitative assessment of the optimizations.

Looking for the time-sinks

If you want to optimize this kind of service, your first task is to find out which steps consume the most time. For that, I find it convenient to augment the logging with one or several StopWatches - just to get a rough idea. This measurement was integrated into the method executed by the SOAP request. Here, authentication stands for the user's log-in, fetch for loading all the necessary data from the database, and convert for converting the data for SOAP distribution.
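
For illustration, such a measurement can be integrated roughly as in the following sketch, using Spring's org.springframework.util.StopWatch; the called methods and the response type are placeholders for the application's real code:

// Sketch only: authenticate(), fetchUnits() and convert() stand in for the
// application's actual methods.
StopWatch stopWatch = new StopWatch();

stopWatch.start("authentication");
authenticate(userName, password);
stopWatch.stop();

stopWatch.start("fetch");
List<Unit> units = fetchUnits();
stopWatch.stop();

stopWatch.start("convert");
UnitExportResponse response = convert(units);
stopWatch.stop();

// prettyPrint() produces the task overview shown below
LOGGER.info(stopWatch.prettyPrint());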

StopWatch '': running time (millis) = 37444
-----------------------------------------
ms     %     Task name
-----------------------------------------
00875  002%  authentication
36521  098%  fetch
00048  000%  convert

It turned out that requesting the data from the database consumed by far the most time. So this was where I directed my efforts.

As mentioned above, the application is linked to a MySQL database via Hibernate. The good thing about Hibernate is that, initially at least, the Java developer doesn't have to care about exactly how data sets are read or written. Sadly, even though Hibernate does a great job, you can't automatically be certain that all the queries Hibernate builds for you are ideal.

Oftentimes, the unsuspecting developer is simply not familiar with the important annotations and tricks that make Hibernate create efficient queries - and the developer doesn't exactly see what kinds of queries are being executed anyway. The latter, though, can be brought to light by raising the log level in the log4j configuration file:

log4j.logger.org.hibernate.SQL=DEBUG
log4j.logger.org.hibernate.type=TRACE

Thus, what had been opaque to the developer so far suddenly becomes crystal clear. As mentioned above, each unit has a list of authorizations. The authorizations' definitions are stored in a separate table, and there is an m:n relationship between units and authorizations.
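
In code, such a mapping typically looks roughly like the following sketch; the Authorization entity and the field names are assumptions based on the description above, and the Integer ID matches the repository shown further below:

import java.util.List;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.ManyToMany;

@Entity
public class Unit {

    @Id
    @GeneratedValue
    private Integer id;

    // m:n association held in a separate join table; @ManyToMany is fetched
    // lazily by default, so the authorizations of each unit are loaded with a
    // separate select as soon as they are accessed
    @ManyToMany
    private List<Authorization> authorizations;

    // further direct attributes of the unit ...

    public List<Authorization> getAuthorizations() {
        return authorizations;
    }
}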

Now, if Hibernate is asked to load 22,000 unit data sets including their authorizations, it first runs one query to read the units with their direct attributes, and then another one per data set (that is, 22,000 times) to determine the respective unit's authorizations. That does take time! This is the well-known n+1 select problem. Instead of blaming Hibernate, though, you might want to take the time to investigate potential mistakes or omissions on the developer's side.

At that point, without too much effort, we had already gained quite some insight, so no further measurements and examinations seemed necessary. Nevertheless, the JDBC-Performance-Logger deserves to be mentioned here, as it is another viable option for a deeper look at database queries that might benefit from optimization. It gives you far more options for analysing individual queries with respect to their duration and frequency. Furthermore, it is able to show you the query as it was actually sent, including the actual parameter values - quite different from Hibernate's log output.

Named Entity Graphs

If you were to implement the query described above yourself, without the help of abstraction layers and frameworks, you would use a more or less complex join in order to avoid unnecessary round trips to the database. With a few well-placed annotations, you can make Hibernate optimize its queries in the same way. The feature of choice in this case is called @NamedEntityGraph.

@Entity
@NamedEntityGraph(
    name = "unitWithAuthorizations",
    attributeNodes = { @NamedAttributeNode("authorizations") })
public class Unit {
...
}

The @NamedEntityGraph annotation on the class Unit defines a named graph in which the unit is loaded together with all its authorizations. By referencing this graph at the query method in the corresponding Spring Data JPA repository, Hibernate is told to actually use it:

public interface UnitRepository extends JpaRepository<Unit,Integer> {
    @EntityGraph("unitWithAuthorizations")
    List<Unit> findUnitsByCountry(Country c);
}

Now Hibernate uses one single query for all the data - and that saves a lot of time.

[...Soap11.unitExport:Request 8] in 11357ms (17755730 bytes)
[...Soap11.unitExport:Request 8] in 6258ms (17755730 bytes)
[...Soap11.unitExport:Request 8] in 5019ms (17755730 bytes)
[...Soap11.unitExport:Request 8] in 5928ms (17755730 bytes)
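
From the application's point of view, nothing changes apart from the annotations; the associated authorizations can now be traversed without triggering any further selects. A small sketch, assuming the accessor from the mapping above:

// Sketch: the entity graph loads everything in one query, so accessing the
// association no longer causes one select per unit.
List<Unit> units = unitRepository.findUnitsByCountry(country);
for (Unit unit : units) {
    unit.getAuthorizations().size(); // already loaded, no extra query
}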

Persistence Context

With the optimization above complete, another thing came up. One StopWatch measured the individual steps within the method that processes the SOAP request; another one exclusively measured the database query itself. The difference between the two was 300 ms. That doesn't sound like much, but nothing happens in that gap apart from the method being exited.

The reason for this is that the method that reads the data from the database belongs to a class annotated with @Transactional. Thus, every call to that method takes place in a so-called persistence context. When the method is exited, it has to be ensured that changes to any of the objects read from the database are written back to the database as well. Therefore, every single object has to be examined accordingly.

Since this method exclusively reads and processes data from the database, it makes perfect sense to tell the EntityManager so. In this case, the objects don't have to be examined. The correct annotation for that is:

@Transactional(readOnly = true)
public List<Unit> getUnits() {
...
}

Researching this topic, you'll find the recommendation in the Spring documentation to annotate the entire class like that by default. Only methods that are definitely supposed to change data should then be annotated accordingly. I like this idea and will definitely follow it in my next project.
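
A minimal sketch of that approach, with a made-up service class on top of the repository shown earlier; only saveUnit() really writes data:

import java.util.List;

import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

// Illustrative only: UnitService and saveUnit() are not the application's real code.
@Service
@Transactional(readOnly = true)
public class UnitService {

    private final UnitRepository unitRepository;

    public UnitService(UnitRepository unitRepository) {
        this.unitRepository = unitRepository;
    }

    // inherits readOnly = true: the loaded objects are not checked for
    // changes when the transaction completes
    public List<Unit> getUnits(Country country) {
        return unitRepository.findUnitsByCountry(country);
    }

    // overrides the class-level setting for a method that actually writes data
    @Transactional
    public Unit saveUnit(Unit unit) {
        return unitRepository.save(unit);
    }
}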

Conclusion

While digging into this matter, I found room for improvement that I hadn't thought possible before. For one thing, until then it had never been necessary to squeeze maximum performance out of the application, since in our case only a few units at a time were usually requested; for another, the existing frameworks offer good and easy data access that works very well for most use cases in the first place.

With the above course of action, we were able to reduce the time for processing and transmitting the SOAP request from 39.975 seconds to 7.12 seconds on average - an improvement by a factor of 5.6.

What's much more important in this context, though, is that the service became usable in the production system in the first place, while sticking with the existing infrastructure.

Are you interested? Please get in touch!