On the one hand, JPA and Hibernate provide an abstraction to do the CRUD operations against various databases, enable us to develop database applications quickly
On the other hand, JPA and Hibernate also create some performance issues that we need to be taken care of
This article will explore some common problems you may encounter when working with JPA and Hibernate in Spring Boot and Spring Data applications
Let's get started!
Problem 1: EAGER FetchType
There are two types of fetching strategy in JPA and Hibernate to load data for the associated (child) entities, EAGER and LAZY. While LAZY only fetches data when they are needed (at the first time accessed of the associated entities), EAGER always fetches child entities along with its parent
While LAZY can cause LazyInitializationException, EAGER causes Hibernate to generate and execute unnecessary SELECT SQLs to the database and so increasing the memory consumption for unnecessary data
Let's take a look at the following example
@Entity
class Company {
@OneToMany(fetch = FetchType.EAGER)
private Set<Employee> employees;
@OneToOne(fetch = FetchType.EAGER)
private Address address;
//...
}
interface CompanyRepository extends JpaRepository<Company, Integer> {
List<Company> findFirst10ByOrderByNameAsc();
}
@Service
class CompanyService {
//...
public List<String> findCompanyNames() {
List<Company> companies = companyRepository.findFirst10ByOrderByNameAsc();
return companies.stream()
.map(Company::getName)
.collect(Collectors.toList());
}
}
The above findCompanyNames()
method only needs company names. However, besides fetching Company data, Hibernate also eagerly fetches data in the background for other EAGER FetchType associations including Set<Employee>
and Address
. The problem would become more significant when you have more EAGER associations or more entities in each
Apart from consuming memory for unnecessary data, EAGER FetchType can also cause the N+1 SQL problem which you can find more details in the latter part of this article
Approach 1: LAZY FetchType
Avoid using EAGER fetch in JPA and Hibernate. EAGER can fix the LazyInitializationException caused by the LAZY but it creates a performance pitfall especially when you have multiple EAGER associations in the entity
EAGER FetchType is the default in JPA ToOne mappings (@ManyToOne, @OneToOne), make sure you change them to LAZY explicitly
@ManyToOne(fetch = FetchType.LAZY)
private BookCategory bookCategory;
@OneToOne(fetch = FetchType.LAZY)
private BookDetail bookDetail;
Problem 2: LazyInitializationException
LazyInitializationException is caused by uninitialized LAZY FetchType associations when they are accessed outside the transactional context
@Entity
public class Quote {
@ManyToOne(fetch = FetchType.LAZY)
private Author author;
//...
}
interface QuoteRepository extends JpaRepository<Quote, Integer>{
List<Quote> findFirst10ByOrderByName();
}
@Service
class QuoteService
//...
public List<QuoteWithAuthor> findQuotesWithAuthor() {
return of(quoteRepository.findFirst10ByOrderByName());
}
List<QuoteWithAuthor> of(List<Quote> quotes) {
return quotes.stream().map(q -> {
Author a = q.getAuthor();
AuthorIdName ain = new AuthorIdName(a.getId(), a.getName());
return new QuoteWithAuthor(q.getId(), q.getName(), ain);
}).collect(Collectors.toList());
}
}
In the above findQuotesWithAuthor()
method, LazyInitializationException
is thrown as the LAZY associated entity author
is not initialized, the quoteRepository.findFirst10ByOrderByName()
only fetchs the parent and EAGER associations then close the transaction
There are several ways to resolve the LazyInitializationException. However, some of them are pitfalls
Pitfall 2.1: EAGER FetchType
Use EAGER FetchType for the author
association, so its data will be available along with its parent. However, this approach can create a performance pitfall as we learned in the previous section. It also can create the N+1 problem
Pitfall 2.2: @Transactional
Add @Transacional
annotation to findQuotesWithAuthor()
method can keep the associations fetching executed in database transactional context (so resolving the LazyInitializationException) but it can create the N+1 problem. You can learn more about this problem in the latter part of this tutorial
Pitfall 2.3: Hibernate.initialize(Object)
Hibernate.initialize(Object) forces initialization of a proxy or persistent collection. It helps initialize LAZY FetchType associations manually but also brings some drawbacks
It generates and executes an additional SQL against the database each time
It does not initialize child entities of the input object
Try to fix the symptoms and do outside the repository (data access layer) are the common theme of pitfall approaches
On the opposites, the right approaches will try to do inside the repository and fix the root cause by hinting JPA and Hibernate to generate and execute only 1 SQL to fetch the necessary association's data as there are no throwing LazyInitializationException errors if all the necessary data are initialized before using
Approach 2.1: Projection
Spring Data can help retrieve partial view of a JPA @Entity with interface-based or class-based projection (DTO classes)
interface QuoteProjection {
Integer getId();
String getName();
AuthorProjection getAuthor();
}
interface AuthorProjection {
Integer getId();
String getName();
}
interface QuoteRepository extends JpaRepository<Quote, Integer> {
List<QuoteProjection> findFirst10ByOrderByNameDesc();
}
QuoteProjection and AuthorProjection are the partial views of 2 JPA entities, Quote and Author respectively
In findFirst10ByOrderByNameDesc()
method of QuoteRepository
, Spring Data JPA fetchs and converts the return data to List<QuoteProjection>
Approach 2.2: @EntityGraph
@EntityGraph annotation is used on the repository methods, providing a declarative way to fetch associations in a single query
interface QuoteRepository extends JpaRepository<Quote, Integer> {
@EntityGraph(attributePaths = "author")
List<Quote> findFirst10ByOrderByName();
}
Approach 2.3: JPQL JOIN FETCH
JPQL supports JOIN FETCH to fetch associations data in a single query
interface QuoteRepository extends JpaRepository<Quote, Integer> {
@Query(value="FROM Quote q LEFT JOIN FETCH q.author")
List<Quote> findWithAuthorAndJoinFetch(Pageable pageable);
}
Problem 3: N+1 SQL statements
The N+1 SQL problem occurs on the associated (child) entities when JPA and Hibernate have to execute N additional SQLs to do the operation
The issue can happen on both EAGER and LAZY FetchType and on reading or deleting operation against a database
Problem 3.1: N+1 on SELECT
Let's take a look at the following example
@Entity
class Company {
@OneToMany(fetch = LAZY, mappedBy = "company", cascade = ALL)
private Set<Employee> employees;
@OneToMany(fetch = EAGER, mappedBy = "company", cascade = ALL)
private Set<Department> departments;
//...
}
interface CompanyRepository extends JpaRepository<Company, Integer> {
List<Company> findFirst10ByOrderByNameAsc();
}
@Service
class CompanyService {
//...
public List<String> findCompanyNames() {
List<Company> companies = companyRepository.findFirst10ByOrderByNameAsc();
return companies.stream()
.map(Company::getName)
.collect(Collectors.toList());
}
@Transactional
public List<CompanyIdName> findCompanyIdNames() {
return companyRepository.findFirst10ByOrderByNameAsc().stream()
.map(CompanyIdName::of)
.collect(Collectors.toList());
}
}
Suppose you have 10 Companies and each Company has some Employees in the database. With JPA show-sql
and Hibernate generate_statistics
settings are enabled in your Spring Boot application properties
spring.jpa.show-sql=true
spring.jpa.properties.hibernate.generate_statistics=true
When running the findCompanyNames()
method, you may end up with the following output in the console
... nanoseconds spent preparing 11 JDBC statements;
... nanoseconds spent executing 11 JDBC statements;
The first SQL statement is executed to load the list of 10 companies
List<Company>
The 10 consecutive SQLs, corresponding to the number of elements returned from the result of the first SQL statement, are executed to fetch the
Set<Department>
EAGER associations
Separately, when running the findCompanyIdNames()
method, you may end up with the following output in the console
... nanoseconds spent preparing 21 JDBC statements;
... nanoseconds spent executing 21 JDBC statements;
The first SQL statement is executed to load the list of 10 companies
List<Company>
The 20 consecutive SQLs, corresponding to the number of elements returned from the result of the first SQL statement, are executed to fetch the EAGER
Set<Department>
and LAZYSet<Employee>
associations
Approach 3.1.1: @EntityGraph and Projection
@EntityGraph annotation is used on the repository methods, providing a declarative way to fetch associations in a single query
Spring Data can help to retrieve partial view of a JPA @Entity with interface-based or class-based projection (DTO classes)
interface CompanyProjection {
int getId();
String getName();
List<EmployeeProjection> getEmployees();
}
interface EmployeeProjection {
int getId();
String getName();
}
@EntityGraph(attributePaths = "employees")
List<CompanyProjection> findFirst10ByOrderByIdDesc();
CompanyProjection and EmployeeProjection are the partial views of 2 JPA entities, Company and Employee respectively
In findFirst10ByOrderByIdDesc()
method, Spring Data JPA fetchs and converts the return data to List<CompanyProjection>
Approach 3.1.2: JOIN FETCH
JPQL supports JOIN FETCH to fetch associations data in a single query
@Query(value="FROM Company c LEFT JOIN FETCH c.employees")
List<Company> findWithEmployeesAndJoinFetch(Pageable pageable);
Try to hint JPA and Hibernate to generate and execute only 1 SQL is the common pattern of the above approaches. Now
when you trigger the findFirst10ByOrderByIdDesc()
or findWithEmployeesAndJoinFetch()
method, you would observe the expected output
... nanoseconds spent preparing 1 JDBC statements;
... nanoseconds spent executing 1 JDBC statements;
Problem 3.2: N+1 on DELETE
N+1 may also occur on Delete operations. Suppose you'd like to build a repository method to delete employees by a company id
interface EmployeeRepository extends JpaRepository<Employee, Integer> {
@Transactional
void deleteByCompanyId(int companyId);
}
When triggering deleteByCompanyId()
method and looking into the output console, you would find out that Hibernate deletes one by one employee. It generates N+1 queries to do the operation
The first query selects a list of employees of companies with id as companyId
The second query selects company data with id as companyId
Then N consecutive queries are generated and executed to delete one by one employee, N is the size of the result list of the first query
Approach 3.2: Define the DELETE query explicitly
In the repository, define the DELETE query explicitly
interface EmployeeRepository extends JpaRepository<Employee, Integer> {
@Modifying
@Transactional
@Query("DELETE FROM Employee e WHERE e.company.id = ?1")
void deleteInBulkByCompanyId(int companyId);
@Transactional
void deleteByCompanyId(int categoryId);
}
With the deleteInBulkByCompanyId()
method, only 1 DELETE query is executed to delete employes in bulk instead of one by one
Conclusion
In this article, we explored some of the common problems when using JPA and Hibernate in Spring Boot and Spring Data applications. We also learned about the pitfalls and the right approaches to resolving the issues