Bulk Updation/Insertion of Database Tables in Java using Hibernate — Optimized Way

Sahil Aggarwal
Jun 26, 2021

Hibernate is one of the most popular ORM frameworks for working with databases in Java. In this article we look at several ways to select and update rows of a table in bulk, and at which of them is the most effective when using Hibernate.

I experimented with three approaches:

  • Using Hibernate’s Query.list() method.
  • Using ScrollableResults with FORWARD_ONLY scroll mode.
  • Using ScrollableResults with FORWARD_ONLY scroll mode in a StatelessSession.

To decide which one performs best for our use case, I ran the same test with each of the three approaches: select 1000 PersonEntity rows and update the name of every row to a random string.

Let's look at the code and the results for each approach in turn.
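The code in the following sections calls a helper named randomAlphaNumeric(int) that the article does not show. A minimal sketch of what such a helper might look like, assuming it only has to return a random string of letters and digits of the requested length (the class name RandomStrings and the exact alphabet are assumptions, not the author's code):

```java
import java.util.concurrent.ThreadLocalRandom;

final class RandomStrings {
    // Assumed alphabet: upper-case letters, lower-case letters and digits.
    private static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

    private RandomStrings() {
    }

    // Hypothetical stand-in for the article's randomAlphaNumeric(int) helper.
    static String randomAlphaNumeric(int length) {
        if (length < 0) {
            throw new IllegalArgumentException("length must not be negative");
        }
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            int index = ThreadLocalRandom.current().nextInt(ALPHABET.length());
            sb.append(ALPHABET.charAt(index));
        }
        return sb.toString();
    }
}
```

Any generator that produces a fresh value per row would do for this test; the point is only that every selected row really changes and therefore has to be written back.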

Using Hibernate’s Query.list() method.

Code Executed:

Session session = getSession();
Transaction transaction = session.beginTransaction();
try {
    Query query = session.createQuery("FROM PersonEntity WHERE id > :maxId ORDER BY id")
            .setParameter("maxId", MAX_ID_VALUE);
    query.setMaxResults(1000);
    // list() loads all selected rows into memory at once
    List rows = query.list();
    int count = 0;
    for (Object row : rows) {
        PersonEntity personEntity = (PersonEntity) row;
        personEntity.setName(randomAlphaNumeric(30));
        session.saveOrUpdate(personEntity);
        // Always flush and clear the session after updating 50 rows
        // (50 is the hibernate.jdbc.batch_size specified in hibernate.properties)
        if (++count % 50 == 0) {
            session.flush();
            session.clear();
        }
    }
    transaction.commit();
} catch (RuntimeException e) {
    // Roll back instead of committing a partially applied update
    transaction.rollback();
    throw e;
} finally {
    if (session.isOpen()) {
        session.close();
    }
}
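The comment in the code above ties the flush interval of 50 to the JDBC batch size in hibernate.properties. The Hibernate setting for this is hibernate.jdbc.batch_size. The sketch below shows what that line of the properties file might look like and reads it back with plain java.util.Properties, so that the flush interval and the batch size can come from one place. The class name BatchSizeConfig and the value 50 are illustrative assumptions.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class BatchSizeConfig {
    // Assumed content of hibernate.properties; only the batch size line is shown.
    static final String HIBERNATE_PROPERTIES = "hibernate.jdbc.batch_size=50\n";

    private BatchSizeConfig() {
    }

    // Parses properties text and returns the configured JDBC batch size.
    static int jdbcBatchSize(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String value = props.getProperty("hibernate.jdbc.batch_size");
        if (value == null) {
            throw new IllegalStateException("hibernate.jdbc.batch_size is not set");
        }
        return Integer.parseInt(value.trim());
    }
}
```

Using the returned value in place of the literal 50 in the loop keeps the two numbers from drifting apart when the configuration changes.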

Test Results:

  • Time taken: 360s to 400s
  • Heap pattern: gradually increased from 13 MB to 51 MB (measured using jconsole)

Using ScrollableResults with FORWARD_ONLY scroll mode.

We expect this approach to consume less memory than the first one. Let's see the results.

Code Executed:

Session session = getSession();
Transaction transaction = session.beginTransaction();
// scroll() exposes the result as a cursor instead of materializing a List
ScrollableResults scrollableResults = session
        .createQuery("FROM PersonEntity WHERE id > :maxId ORDER BY id")
        .setParameter("maxId", MAX_ID_VALUE)
        .setMaxResults(1000)
        .scroll(ScrollMode.FORWARD_ONLY);
int count = 0;
try {
    while (scrollableResults.next()) {
        PersonEntity personEntity = (PersonEntity) scrollableResults.get(0);
        personEntity.setName(randomAlphaNumeric(30));
        session.saveOrUpdate(personEntity);
        // Flush and clear the session after every 50 updates
        if (++count % 50 == 0) {
            session.flush();
            session.clear();
        }
    }
    transaction.commit();
} catch (RuntimeException e) {
    // Roll back instead of committing a partially applied update
    transaction.rollback();
    throw e;
} finally {
    scrollableResults.close();
    if (session.isOpen()) {
        session.close();
    }
}

Test Results:

  • Time taken: 185s to 200s
  • Heap pattern: gradually increased from 13 MB to 41 MB (measured using jconsole)

Using ScrollableResults with FORWARD_ONLY scroll mode in a StatelessSession.

A stateless session does not implement a first-level cache nor interact with any second-level cache, nor does it implement transactional write-behind or automatic dirty checking, nor do operations cascade to associated instances. Collections are ignored by a stateless session. Operations performed via a stateless session bypass Hibernate’s event model and interceptors.

This type of session is generally recommended for bulk updates, since we do not need the overhead of these Hibernate features in such use cases.

Code Executed:

StatelessSession session = getStatelessSession();
Transaction transaction = session.beginTransaction();
ScrollableResults scrollableResults = session
        .createQuery("FROM PersonEntity WHERE id > :maxId ORDER BY id")
        .setParameter("maxId", MAX_ID_VALUE)
        .setMaxResults(TRANSACTION_BATCH_SIZE)
        .scroll(ScrollMode.FORWARD_ONLY);
try {
    while (scrollableResults.next()) {
        PersonEntity personEntity = (PersonEntity) scrollableResults.get(0);
        personEntity.setName(randomAlphaNumeric(20));
        // No dirty checking in a stateless session, so the update is issued explicitly
        session.update(personEntity);
    }
    transaction.commit();
} catch (RuntimeException e) {
    // Roll back instead of committing a partially applied update
    transaction.rollback();
    throw e;
} finally {
    scrollableResults.close();
    if (session.isOpen()) {
        session.close();
    }
}

Test Results:

  • Time taken: 185s to 200s
  • Heap pattern: gradually increased from 13 MB to 39 MB

I also ran the same tests with 2000 rows. The results were as follows:

  • Using list(): time taken approx. 750s; heap gradually increased from 13 MB to 74 MB
  • Using ScrollableResults: time taken approx. 380s; heap gradually increased from 13 MB to 46 MB
  • Using ScrollableResults in a StatelessSession: time taken approx. 380s; heap gradually increased from 13 MB to 43 MB
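To compare the 1000-row and 2000-row runs, it helps to convert the totals into time per row. The small sketch below does that arithmetic for the 2000-row numbers reported above; the class and method names are only illustrative. With list(), 750s over 2000 rows is 0.375s per row, which falls inside the 0.36s to 0.40s per row of the 1000-row run, so the time appears to grow roughly linearly with the number of rows. With ScrollableResults, 380s over 2000 rows is 0.19s per row, about half of that.

```java
final class PerRowTiming {
    private PerRowTiming() {
    }

    // Average time spent per row, in seconds.
    static double secondsPerRow(double totalSeconds, int rows) {
        if (rows <= 0) {
            throw new IllegalArgumentException("rows must be positive");
        }
        return totalSeconds / rows;
    }

    public static void main(String[] args) {
        // Totals taken from the 2000-row results reported in the article.
        double list = secondsPerRow(750, 2000);
        double scroll = secondsPerRow(380, 2000);
        System.out.println("list():   " + list + " s/row");
        System.out.println("scroll(): " + scroll + " s/row");
        System.out.println("ratio:    " + (list / scroll));
    }
}
```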

Blocking Problem with All of the Above Approaches

ScrollableResults performs almost the same in a regular session and in a StatelessSession, and in both cases much better than Query.list(). But there is still one problem with all of the above approaches: locking. All of them select and update the data in the same transaction. This means that for as long as the transaction is running, the rows that have been updated stay locked, and any other operation on them has to wait for the transaction to finish.

Solution:

There are two things we should do to solve this problem:

  • Select and update the data in different transactions.
  • Perform the updates in batches.

So I ran the same tests again, but this time the updates were performed in a separate transaction that was committed in batches of 50.

Note: with ScrollableResults, in both a regular and a stateless session, we also need a separate session for the updates, because the original session and transaction are still needed to scroll through the results.
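The commit-every-N pattern described above can be sketched independently of Hibernate. In the sketch below, Tx is a hypothetical one-method stand-in for a transaction (it is not Hibernate's Transaction API), and processInBatches is an illustrative name. It commits after every full batch and once more for a trailing partial batch, which is easy to forget when the row count is not a multiple of the batch size.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class BatchCommitSketch {
    // Hypothetical stand-in for a transaction; NOT Hibernate's Transaction API.
    interface Tx {
        void commit();
    }

    private BatchCommitSketch() {
    }

    // Applies update to every item, committing after each full batch and once
    // more for a trailing partial batch. Returns the number of commits issued.
    static <T> int processInBatches(Iterable<T> items, int batchSize,
                                    Consumer<T> update, Tx tx) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int pending = 0;
        int commits = 0;
        for (T item : items) {
            update.accept(item);
            if (++pending == batchSize) {
                tx.commit();
                commits++;
                pending = 0;
            }
        }
        if (pending > 0) {
            // Without this, the last partial batch would never be committed.
            tx.commit();
            commits++;
        }
        return commits;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 1; i <= 120; i++) {
            ids.add(i);
        }
        List<Integer> updated = new ArrayList<>();
        int commits = processInBatches(ids, 50, updated::add, () -> { });
        // prints "3 commits for 120 rows"
        System.out.println(commits + " commits for " + updated.size() + " rows");
    }
}
```

Each commit releases the locks held on that batch, so other operations wait for at most one batch of updates rather than for the whole run.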

Results using Batch Processing

  • Using list(): time taken approx. 400s; heap gradually increased from 13 MB to 61 MB
  • Using ScrollableResults: time taken approx. 380s; heap gradually increased from 13 MB to 51 MB
  • Using ScrollableResults in a StatelessSession: time taken approx. 190s; heap gradually increased from 13 MB to 44 MB

Observation: the time taken by ScrollableResults in a regular session got worse and became almost equal to that of Query.list(), while the performance of the StatelessSession remained almost the same.

Summary and Conclusion

Based on the experiments above, when we need to select and update rows in bulk, the best approach in terms of both memory consumption and time is:

  • Use ScrollableResults in a StatelessSession.
  • Perform the selection and the updates in different transactions, and commit the updates in batches of 20 to 50 (batch processing). Note: the right batch size depends on the use case.

Sample Code with the best approach

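// One session scrolls through the results, a second one performs the updates
StatelessSession session = getStatelessSession();
StatelessSession updateSession = getStatelessSession();
Transaction transaction = session.beginTransaction();
Transaction updateTransaction = updateSession.beginTransaction();
ScrollableResults scrollableResults = session
        .createQuery("FROM PersonEntity WHERE id > :maxId ORDER BY id")
        .setParameter("maxId", MAX_ID_VALUE)
        .setMaxResults(TRANSACTION_BATCH_SIZE)
        .scroll(ScrollMode.FORWARD_ONLY);
int count = 0;
try {
    while (scrollableResults.next()) {
        PersonEntity personEntity = (PersonEntity) scrollableResults.get(0);
        personEntity.setName(randomAlphaNumeric(5));
        updateSession.update(personEntity);
        // Commit every 50 updates so that row locks are released early
        if (++count % 50 == 0) {
            updateTransaction.commit();
            updateTransaction = updateSession.beginTransaction();
        }
    }
    // Commit the last, possibly partial, batch
    updateTransaction.commit();
    transaction.commit();
} catch (RuntimeException e) {
    // Batches that were already committed stay committed
    if (updateTransaction.isActive()) {
        updateTransaction.rollback();
    }
    if (transaction.isActive()) {
        transaction.rollback();
    }
    throw e;
} finally {
    scrollableResults.close();
    updateSession.close();
    session.close();
}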

With Java frameworks such as Spring, this code can be even shorter, since you do not need to take care of things like closing the session yourself. The code above is written in plain Java using Hibernate.

Please try this with larger data sets and share your results in the comments. If you know a better approach, please comment as well.

Thank you for reading the article.

Originally published at http://hello-worlds.in on June 26, 2021.
