
28/10/2020 Spring Batch Tutorial – The ULTIMATE Guide | Java Code Geeks - 2020


ABOUT DANI BUIZA


Daniel Gutierrez Diez holds a Master in Computer Science Engineering from the University of Oviedo (Spain) and a Post Grade as
Specialist in Foreign Trade from the UNED (Spain). Daniel has been working for different clients and companies in several Java projects
as programmer, designer, trainer, consultant and technical lead.

Spring Batch Tutorial – The ULTIMATE Guide


Posted by: Dani Buiza in Enterprise Java, March 17th, 2015
This is a tutorial about Spring Batch, which is part of the Spring framework. Spring Batch provides reusable functions that are essential in
processing large volumes of records, including logging/tracing, transaction management, job processing statistics, job restart, skip, and
resource management. It also provides more advanced technical services and features that will enable extremely high-volume and high
performance batch jobs through optimization and partitioning techniques.

Here, you can find a clear explanation of its main components and concepts and several working examples. This tutorial is not about the
Spring framework in general; it is expected that you are familiar with mechanisms like Inversion of Control and Dependency Injection, which are
the main pillars of the Spring framework. It is also assumed that you know how to configure the Spring framework context for basic
applications and that you are used to working with both annotation based and configuration file based Spring projects.

If this is not the case, I would really recommend going to the official Spring framework page and following the basic tutorials before starting to
learn what Spring Batch is and how it works. Here is a very good one: http://docs.spring.io/docs/Spring-MVC-step-by-step/.


At the end of this tutorial, you can find a compressed file with all the examples listed and some extras.
The software used in the elaboration of this tutorial is listed below:

Java update 8 Version 3.1
Apache Maven 3.2.5
Eclipse Luna 4.4.1
Spring Batch 3.0.3 and all its dependencies (I really recommend to use Maven or Gradle to resolve all the required dependencies and avoid headaches)
Spring Boot 1.2.2 and all its dependencies (I really recommend to use Maven or Gradle to resolve all the required dependencies and avoid headaches)
MySQL Community Server version 5.6.22
MongoDB 2.6.8
HSQLDB version 1.8.0.10

https://www.javacodegeeks.com/2015/03/spring-batch-tutorial.html

This tutorial will not explain how to use Maven although it is used for solving dependencies, compiling and executing the examples provided.
More information can be found in the following article http://examples.javacodegeeks.com/enterprise-java/maven/log4j-maven-example/.

The module Spring boot is also heavily used in the examples, for more information about it please refer to the official Spring Boot
documentation: http://projects.spring.io/spring-boot/.

Table Of Contents
1. Intro
2. Concepts
3. Use Cases
4. Controlling flow
5. Custom Writers, Readers and Processors
6. Flat file example
7. MySQL example
8. In Memory example
9. Unit testing
10. Error Handling
11. Parallel Processing
12. Repeating jobs
13. JSR 352
14. Summary
15. Resources
16. Download

1. Intro
Spring Batch is an open source framework for batch processing. It is built as a module within the Spring framework and depends on this
framework (among others). Before continuing with Spring Batch we are going to put here the definition of batch processing:

“Batch processing is the execution of a series of programs (“jobs”) on a computer without manual intervention” (From the Wikipedia).

So, for our matter, a batch application executes a series of jobs (iterative or in parallel), where input data is read, processed and written
without any interaction. We are going to see how Spring Batch can help us with this purpose.

Spring Batch provides mechanisms for processing large amounts of data, such as transaction management, job processing, resource management,
logging, tracing, conversion of data, interfaces, etc. These functionalities are available out of the box and can be reused by applications
containing the Spring Batch framework. By using these diverse techniques, the framework takes care of the performance and the scalability
while processing the records.

Normally a batch application can be divided into three main parts:

Reading the data (from a database, file system, etc.)
Processing the data (filtering, grouping, calculating, validating…)
Writing the data (to a database, reporting, distributing…)

Spring Batch contains features and abstractions (as we will explain in this article) for automating these basic steps and allowing the
application programmers to configure them, repeat them, retry them, stop them, execute them as a single element or grouped (transaction
management), etc.

It also contains classes and interfaces for the main data formats, industry standards and providers like XML, CSV, SQL, Mongo DB, etc.

In the next chapters of this tutorial we are going to explain and provide examples of all these steps and the different possibilities that Spring
Batch offers.

2. Concepts
Here are the most important concepts in the Spring Batch framework:

Jobs
Jobs are abstractions to represent batch processes, that is, sequences of actions or commands that have to be executed within the batch
application.

Spring Batch contains the following interface to represent Jobs:
http://docs.spring.io/spring-batch/apidocs/org/springframework/batch/core/Job.html. Simple Jobs contain a list of steps and these are executed sequentially or in parallel.

In order to configure a Job it is enough to initialize the list of steps. This is an example of an XML based configuration for a dummy Job:

<job id="eatJob" xmlns="http://www.springframework.org/schema/batch">
    <step id="stepCook" next="stepEntries">
        <tasklet>
            <chunk reader="cookReader" processor="cookProcessor"
                writer="cookWriter" commit-interval="1" />
        </tasklet>
    </step>
    <step id="stepEntries" next="stepMeat">
        <tasklet>
            <chunk reader="entriesReader" processor="entriesProcessor"
                writer="entriesWriter" commit-interval="1" />
        </tasklet>
    </step>
    <step id="stepMeat" next="stepWine">
        <tasklet ref="drinkSomeWine" />
    </step>
    <step id="stepWine" next="clean">
        <tasklet>
            <chunk reader="wineReader" processor="wineProcessor"
                writer="wineWriter" commit-interval="1" />
        </tasklet>
    </step>
    <step id="clean">
        <tasklet ref="cleanTheTable" />
    </step>
</job>
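
To make the role of the next attribute more tangible, here is a minimal plain Java sketch of how a purely sequential job could derive its execution order from those links. It is not Spring Batch code and not how the framework is implemented; the class StepChainSketch and its method are hypothetical names introduced only for illustration, and only the step ids are taken from the configuration above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: models the "next" attribute as a map
// from a step id to the id of the step that follows it.
class StepChainSketch {

    // Returns the step ids in execution order, starting at firstStep and
    // following the "next" links until a step declares no successor.
    static List<String> executionOrder(String firstStep, Map<String, String> next) {
        List<String> order = new ArrayList<>();
        String current = firstStep;
        while (current != null) {
            if (order.contains(current)) {
                throw new IllegalStateException("cycle detected at step " + current);
            }
            order.add(current);
            current = next.get(current); // null when no "next" is declared
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, String> next = new LinkedHashMap<>();
        next.put("stepCook", "stepEntries");
        next.put("stepEntries", "stepMeat");
        next.put("stepMeat", "stepWine");
        next.put("stepWine", "clean");
        // prints [stepCook, stepEntries, stepMeat, stepWine, clean]
        System.out.println(executionOrder("stepCook", next));
    }
}
```

The last step, clean, has no next attribute, which is what ends the sequence in this sketch.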

Job launcher
The interface http://docs.spring.io/spring-batch/apidocs/org/springframework/batch/core/launch/JobLauncher.html represents a Job
Launcher. Implementations of its run() method take care of starting job executions for the given jobs and job parameters.

Job instance
This is an abstraction representing a single run for a given Job. It is unique and identifiable. The class representing this abstraction is
http://docs.spring.io/spring-batch/apidocs/org/springframework/batch/core/JobInstance.html.

Job instances can be restarted if they were not completed successfully and the Job is restartable. Otherwise an error will be raised.

Steps
Steps are mainly the parts that compose a Job (and a Job instance). A Step is a part of a Job and contains all the necessary information to
execute the batch processing actions that are expected to be done at that phase of the job. Steps in Spring Batch are composed of an
ItemReader, an ItemProcessor and an ItemWriter, and can be very simple or extremely complicated depending on the complexity of their members.

Steps also contain configuration options for their processing strategy, commit interval, transaction mechanism or job repositories that may be
used. Spring Batch normally uses chunk-oriented processing: items are read one at a time and processed, and the results are collected into a
“chunk” that is written out within a single transaction once the number of items read reaches a preconfigured size, called the commit interval.

Here is a very basic example of an XML based step configuration using an interval of 10:

<step id="step" next="nextStep">
    <tasklet>
        <chunk reader="customItemReader" writer="customItemWriter" processor="customItemProcessor" commit-interval="10" />
    </tasklet>
</step>

The following snippet is the annotation based version. It defines the readers, writers and processors involved, a chunk processing strategy
and a commit interval of 10 (this is the one that we are using in the majority of examples in this tutorial):

@Bean
public Step step1(StepBuilderFactory stepBuilderFactory,
        ItemReader<CustomPojo> reader, ItemWriter<CustomPojo> writer,
        ItemProcessor<CustomPojo, CustomPojo> processor) {
    /* it handles bunches of 10 units */
    return stepBuilderFactory.get("step1")
            .<CustomPojo, CustomPojo> chunk(10).reader(reader)
            .processor(processor).writer(writer).build();
}
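
To make the effect of the commit interval more concrete, the following is a plain Java sketch of a chunk-oriented loop. It is only an illustration under simplifying assumptions (no transactions, no skip or retry logic) and it does not use Spring Batch classes; ChunkLoopSketch and its run method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for chunk-oriented processing; not Spring Batch code.
class ChunkLoopSketch {

    // Reads up to commitInterval items, processes each of them, and then
    // "writes" the processed items as one chunk. Repeats until the input is
    // exhausted and returns the chunks in the order they were written.
    static <I, O> List<List<O>> run(Iterator<I> reader,
                                    Function<I, O> processor,
                                    int commitInterval) {
        List<List<O>> written = new ArrayList<>();
        while (reader.hasNext()) {
            // read phase: collect at most commitInterval items
            List<I> items = new ArrayList<>();
            while (items.size() < commitInterval && reader.hasNext()) {
                items.add(reader.next());
            }
            // process phase: transform every item of the chunk
            List<O> outputs = new ArrayList<>();
            for (I item : items) {
                outputs.add(processor.apply(item));
            }
            // write phase: one write (and, in a real step, one commit) per chunk
            written.add(outputs);
        }
        return written;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
        // With a commit interval of 3, seven items are written as 3 + 3 + 1.
        List<List<Integer>> chunks = run(input.iterator(), n -> n * 10, 3);
        System.out.println(chunks); // prints [[10, 20, 30], [40, 50, 60], [70]]
    }
}
```

A larger commit interval means fewer write calls and commits, at the price of more work being repeated if a chunk fails.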


Job Repositories
Job repositories are abstractions responsible for storing and updating the metadata related to Job instance executions and Job
contexts. The basic interface that has to be implemented in order to configure a Job Repository is
http://docs.spring.io/spring-batch/apidocs/org/springframework/batch/core/repository/JobRepository.html.

Spring Batch stores as metadata information about the Job executions, the results obtained, the Job instances, the parameters used for the Jobs executed
and the context where the processing runs. The table names are very intuitive and similar to their domain class counterparts; in this link
there is an image with a very good summary of these tables: http://docs.spring.io/spring-batch/reference/html/images/meta-data-erd.png.

For more information about the Spring Batch metadata schema, please visit
http://docs.spring.io/spring-batch/reference/html/metaDataSchema.html.

Item Readers
Readers are abstractions responsible for data retrieval. They provide batch processing applications with the needed input data. We will see
in this tutorial how to create custom readers and how to use some of the most important predefined ones. Here is a
list of some readers provided by Spring Batch:

AmqpItemReader
AggregateItemReader
FlatFileItemReader
HibernateCursorItemReader
HibernatePagingItemReader
IbatisPagingItemReader
ItemReaderAdapter
JdbcCursorItemReader
JdbcPagingItemReader
JmsItemReader
JpaPagingItemReader
ListItemReader
MongoItemReader
Neo4jItemReader
RepositoryItemReader
StoredProcedureItemReader
StaxEventItemReader

We can see that Spring Batch already provides readers for many common data formats and database providers. It is
recommended to use the abstractions provided by Spring Batch in your applications rather than creating your own.

Item Writers
Writers are abstractions responsible for writing the data to the desired output database or system. What we explained for Readers also
applies to Writers: Spring Batch already provides classes and interfaces to deal with many of the most used databases, and these should be
used. Here is a list of some of the provided writers:

AbstractItemStreamItemWriter
AmqpItemWriter
CompositeItemWriter
FlatFileItemWriter
GemfireItemWriter
HibernateItemWriter
IbatisBatchItemWriter
ItemWriterAdapter
JdbcBatchItemWriter
JmsItemWriter
JpaItemWriter
MimeMessageItemWriter
MongoItemWriter
Neo4jItemWriter
StaxEventItemWriter
RepositoryItemWriter

In this article we will show how to create custom writers and how to use some of the listed ones.

Item Processors

Processors are in charge of modifying the data records, converting them from the input format to the desired output format. The main interface
used for Item Processors configuration is
http://docs.spring.io/spring-batch/trunk/apidocs/org/springframework/batch/item/ItemProcessor.html.

In this article we will see how to create our custom item processors. 
The following picture (from the Spring batch documentation) gives a very good summary of all these concepts and how the basic Spring Batch
architecture is designed:

Spring Batch Reference Model

3. Use Cases
Although it is difficult to categorize the use cases where batch processing can be applied in the real world, I am going to try to list in this
chapter the most important ones:

Conversion Applications: These are applications that convert input records into the required structure or format. These applications can
be used in all the phases of the batch processing (reading, processing and writing).
Filtering or validation applications: These are programs with the goal of filtering valid records for further processing. Normally
validation happens in the first phases of the batch processing.
Database extractors: These are applications that read data from a database or input files and write the desired filtered data to an
output file or to another database. There are also applications that update large amounts of data in the same database where the input
records come from. As a real life example we can think of a system that analyzes log files with different end user behaviors and, using this
data, produces reports with statistics about most active users, most active periods of time, etc.
Reporting: These are applications that read large amounts of data from a database or input files, process this data and produce
formatted documents based on that data that are suitable for printing or sending via other systems. Accounting and Legal Banking
systems can be part of this category: at the end of the business day, these systems read information from the databases, extract the data
required and write this data into legal documents that may be sent to different authorities.

Spring Batch provides mechanisms to support all these scenarios. With the elements and components listed in the previous chapter,
programmers can implement batch applications for conversion of data, filtering records, validation, extracting information from databases or
input files and reporting.

4. Controlling flow
Before talking about specific Jobs and Steps I am going to show what a Spring Batch configuration class looks like. The next snippet
contains a configuration class with all the components needed for batch processing using Spring Batch. It contains readers, writers,
processors, job flows, steps and all other needed beans.

During this tutorial we will show how to modify this configuration class in order to use different abstractions for our different purposes. The
class below is pasted without comments and specific code; for the working class example please go to the download section in this tutorial
where you can download all the sources:

import java.sql.SQLException;

import javax.sql.DataSource;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jdbc.datasource.DriverManagerDataSource;

@Configuration
@EnableBatchProcessing
public class SpringBatchTutorialConfiguration {

    @Bean
    public ItemReader<CustomPojo> reader() {
        return new CustomItemReader();
    }

    @Bean
    public ItemProcessor<CustomPojo, CustomPojo> processor() {
        return new CustomItemProcessor();
    }

    @Bean
    public ItemWriter<CustomPojo> writer(DataSource dataSource) {
        return new CustomItemItemWriter(dataSource);
    }

    @Bean
    public Job job1(JobBuilderFactory jobs, Step step1) {
        return jobs.get("job1").incrementer(new RunIdIncrementer())
                .flow(step1).end().build();
    }

    @Bean
    public Step step1(StepBuilderFactory stepBuilderFactory,
            ItemReader<CustomPojo> reader, ItemWriter<CustomPojo> writer,
            ItemProcessor<CustomPojo, CustomPojo> processor) {
        /* it handles bunches of 10 units */
        return stepBuilderFactory.get("step1")
                .<CustomPojo, CustomPojo> chunk(10).reader(reader)
                .processor(processor).writer(writer).build();
    }

    @Bean
    public JdbcTemplate jdbcTemplate(DataSource dataSource) {
        return new JdbcTemplate(dataSource);
    }

    @Bean
    public DataSource mysqlDataSource() throws SQLException {
        final DriverManagerDataSource dataSource = new DriverManagerDataSource();
        dataSource.setDriverClassName("com.mysql.jdbc.Driver");
        dataSource.setUrl("jdbc:mysql://localhost/spring_batch_annotations");
        dataSource.setUsername("root");
        dataSource.setPassword("root");
        return dataSource;
    }

    ...

In order to launch our spring context and execute the configured batch shown before we are going to use Spring Boot. Here is an example of
a program that takes care of launching our application and initializing the Spring context with the proper configuration. This program is used
with all the examples shown in this tutorial:

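import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class SpringBatchTutorialMain implements CommandLineRunner {

    public static void main(String[] args) {
        // starts the Spring context, which in turn launches the configured job
        SpringApplication.run(SpringBatchTutorialMain.class, args);
    }

    @Override
    public void run(String... strings) throws Exception {
        System.out.println("running...");
    }

}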

I am using Maven to resolve all the dependencies and launching the application using Spring Boot. Here is the pom.xml used:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.danibuiza.javacodegeeks</groupId>
    <artifactId>Spring-Batch-Tutorial-Annotations</artifactId>
    <version>0.1.0</version>

    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>1.2.1.RELEASE</version>
    </parent>

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-batch</artifactId>
        </dependency>
        <dependency>
            <groupId>org.hsqldb</groupId>
            <artifactId>hsqldb</artifactId>
        </dependency>
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>

And the goal used is:

mvn spring-boot:run

Now we are going to go through the configuration file shown above step by step. First of all we are going to explain how Jobs and Steps
are executed and what rules they follow.

In the example application pasted above we can see how a Job and a first step are configured. Here we extract the related piece of code:

@Bean
public Job job1(JobBuilderFactory jobs, Step step1) {
    return jobs.get("job1").incrementer(new RunIdIncrementer())
            .flow(step1).end().build();
}

@Bean
public Step step1(StepBuilderFactory stepBuilderFactory,
        ItemReader<CustomPojo> reader, ItemWriter<CustomPojo> writer,
        ItemProcessor<CustomPojo, CustomPojo> processor) {
    /* it handles bunches of 10 units */
    return stepBuilderFactory.get("step1")
            .<CustomPojo, CustomPojo> chunk(10).reader(reader)
            .processor(processor).writer(writer).build();
}

We can observe how a Job with the name “job1” is configured using just one step, in this case a step called “step1”. The class
JobBuilderFactory creates a job builder and initializes the job repository. The method flow() of the class JobBuilder creates an instance of
the class JobFlowBuilder using the step1 method shown. This way the whole context is initialized and the Job “job1” is executed.

The step processes (using the processor) in chunks of 10 units the CustomPojo records provided by the reader and writes them using the
given writer. All dependencies are injected at runtime; Spring takes care of that since the class where all this happens is marked as a
configuration class using the annotation org.springframework.context.annotation.Configuration.

5. Custom Writers, Readers and Processors


As we already mentioned in this tutorial, Spring Batch applications consist basically of three steps: reading data, processing data and writing
data. We also explained that in order to support these 3 operations Spring Batch provides 3 abstractions in the form of interfaces:

ItemReader
ItemWriter
ItemProcessor

Programmers should implement these interfaces in order to read, process and write data in their batch application jobs and steps. In this
chapter we are going to explain how to create custom implementations for these abstractions.

Custom Reader
The abstraction provided by Spring Batch for reading records of data is the interface ItemReader. It only has one method, read(), which is
supposed to be executed several times. It does not need to be thread safe, a fact that applications using this method must take into account.

The method read() of the interface ItemReader has to be implemented. This method expects no input parameters, is supposed to read one record
of data from the desired source and return it. This method is not supposed to do any transformation or data processing. If null is returned,
no further data has to be read or analyzed.

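import java.util.Iterator;
import java.util.List;

import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.NonTransientResourceException;
import org.springframework.batch.item.ParseException;
import org.springframework.batch.item.UnexpectedInputException;

public class CustomItemReader implements ItemReader<CustomPojo> {

    private List<CustomPojo> pojos;

    private Iterator<CustomPojo> iterator;

    @Override
    public CustomPojo read() throws Exception, UnexpectedInputException,
            ParseException, NonTransientResourceException {

        // hand out the next element, or null once the list is exhausted
        if (getIterator().hasNext()) {
            return getIterator().next();
        }
        return null;

    }
    . . .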

The custom reader above reads the next element in the internal list of pojos. This is only possible if the iterator is initialized or injected
once, when the custom reader is created. If the iterator is instantiated every time the read() method is called, read() keeps returning the
first element and never returns null, so the job using this reader will never end.
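
The contract described above (return one item per call, return null when the data is exhausted, create the iterator only once) can be sketched in plain Java as follows. This is an illustration that does not depend on Spring Batch: SketchReader is a local stand-in for the shape of ItemReader, and ListBackedReader and ReaderSketchDemo are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Local stand-in mirroring the shape of Spring Batch's ItemReader
// (the real read() method also declares checked exceptions).
interface SketchReader<T> {
    T read();
}

// Hypothetical list-backed reader: the iterator is created once, in the
// constructor, so every read() call advances through the same list.
class ListBackedReader<T> implements SketchReader<T> {
    private final Iterator<T> iterator;

    ListBackedReader(List<T> items) {
        this.iterator = items.iterator();
    }

    @Override
    public T read() {
        return iterator.hasNext() ? iterator.next() : null; // null = no more data
    }
}

class ReaderSketchDemo {
    // Calls read() until it returns null and collects every item it saw,
    // which is how a caller is expected to detect the end of the input.
    static <T> List<T> drain(SketchReader<T> reader) {
        List<T> seen = new ArrayList<>();
        T item;
        while ((item = reader.read()) != null) {
            seen.add(item);
        }
        return seen;
    }

    public static void main(String[] args) {
        SketchReader<String> reader = new ListBackedReader<String>(Arrays.asList("a", "b", "c"));
        System.out.println(drain(reader)); // prints [a, b, c]
    }
}
```

If the constructor logic were moved into read(), the loop in drain would never see null for a non-empty list, which is the endless job described above.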

Custom Processor
The interface provided by Spring Batch for data processing expects one input item and produces one output item. The types of both
can be different but do not have to be. Returning null means that the item is filtered out and is not required for any further processing,
for example by the next processor in a chain or by the writer.

In order to implement this interface, it is only necessary to implement the process() method. Here is a dummy example:

import org.springframework.batch.item.ItemProcessor;

public class CustomItemProcessor implements ItemProcessor<CustomPojo, CustomPojo> {

    @Override
    public CustomPojo process(final CustomPojo pojo) throws Exception {
        final String id = encode(pojo.getId());
        final String desc = encode(pojo.getDescription());

        // build a new item instead of modifying the input one
        final CustomPojo encodedPojo = new CustomPojo(id, desc);

        return encodedPojo;

    }

    // "encodes" a word by reversing its characters
    private String encode(String word) {
        StringBuilder str = new StringBuilder(word);
        return str.reverse().toString();
    }

}

The class above may not be useful in any real life scenario but shows how to implement the ItemProcessor interface and do whatever actions
(in this case reversing the input pojo members) are needed in the process method.
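
The use of null as a filter signal can be illustrated with a small plain Java sketch. It does not use Spring Batch: SketchProcessor is a local stand-in for the shape of ItemProcessor (the real process() method also declares throws Exception), and BlankLineFilter and FilterSketchDemo are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Local stand-in that mirrors the shape of Spring Batch's ItemProcessor;
// it is declared here only so the sketch compiles without the library.
interface SketchProcessor<I, O> {
    O process(I item);
}

// Hypothetical processor: drops blank lines and upper-cases the rest.
class BlankLineFilter implements SketchProcessor<String, String> {
    @Override
    public String process(String line) {
        if (line == null || line.trim().isEmpty()) {
            return null; // signals "filter this item out"
        }
        return line.trim().toUpperCase(Locale.ROOT);
    }
}

class FilterSketchDemo {
    // Applies the processor to every item and keeps only the non-null
    // results, so filtered items never reach the writing phase.
    static <I, O> List<O> processAll(List<I> items, SketchProcessor<I, O> processor) {
        List<O> kept = new ArrayList<>();
        for (I item : items) {
            O out = processor.process(item);
            if (out != null) {
                kept.add(out);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> result = processAll(Arrays.asList("a", "  ", "b "), new BlankLineFilter());
        System.out.println(result); // prints [A, B]
    }
}
```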

Custom Writer
In order to create a custom writer programmers need to implement the interface ItemWriter. This interface only contains one method, write(),
that expects a list of items (the current chunk) and returns void. The write method can do whatever actions are wanted: writing to the
database, writing to a csv file, sending an email, creating a formatted document, etc. The implementations of this interface are in charge of
flushing the data and leaving structures in a safe state.

Here is an example of a custom writer where the input items are written to the standard console:

import java.util.List;

import org.springframework.batch.item.ItemWriter;

public class CustomItemWriter implements ItemWriter<CustomPojo> {

    @Override
    public void write(List<? extends CustomPojo> pojos) throws Exception {
        // receives all the items of the current chunk at once
        System.out.println("writing Pojo " + pojos);
    }

}

Also not very useful in real life, only for learning purposes.

It is also important to mention that for almost all real life scenarios Spring Batch already provides specific abstractions that cope with most of
the problems. For example Spring Batch contains classes to read data from MySQL databases, or to write data to a HSQLDB database, or to
convert data from XML to CSV using JAXB, and many others. The code is clean, fully tested, standard and adopted by the industry, so I can
only recommend using them.

These classes can also be extended in our applications in order to fulfil our wishes without the need to reimplement the whole logic.
Extending the classes provided by Spring may also be useful for testing, debugging, logging or reporting purposes. So before reinventing
the wheel again and again, it is worth checking the Spring Batch documentation and tutorials because we will probably find a better
and cleaner way to solve our specific problems.

6. Flat file example


Using the example above, we are going to modify the readers and writers in order to be able to read from a csv file and write into a flat file as
well. The following snippet shows how we should configure the reader in order to provide a reader that extracts the data from a flat file, csv
in this case. For this purpose Spring already provides the class FlatFileItemReader that needs a resource property where the data should be
coming from and a line mapper to be able to parse the data contained in that resource. The code is quite intuitive:

@Bean
public ItemReader<CustomPojo> reader() {
    if ("flat".equals(this.mode)) {
        // flat file item reader (using a csv extractor)
        FlatFileItemReader<CustomPojo> reader = new FlatFileItemReader<CustomPojo>();
        // setting resource and line mapper
        reader.setResource(new ClassPathResource("input.csv"));
        reader.setLineMapper(new DefaultLineMapper<CustomPojo>() {
            {
                // default line mapper with a line tokenizer and a field mapper
                setLineTokenizer(new DelimitedLineTokenizer() {
                    {
                        setNames(new String[] { "id", "description" });
                    }});
                setFieldSetMapper(new BeanWrapperFieldSetMapper<CustomPojo>() {
                    {
                        setTargetType(CustomPojo.class);
                    }});
            }
        });
        return reader;
    }
    else {
        . . .

The following piece of code shows the modifications that are needed in the writer. In this case we are going to use a writer of the class
FlatFileItemWriter that needs an output file to write to and an extractor mechanism. The extractor can be configured as shown in the snippet:

@Bean
public ItemWriter<CustomPojo> writer(DataSource dataSource) {
    ...
    else if ("flat".equals(this.mode)) {
        // FlatFileItemWriter writer
        FlatFileItemWriter<CustomPojo> writer = new FlatFileItemWriter<CustomPojo>();
        writer.setResource(new ClassPathResource("output.csv"));
        // the field extractor picks the named properties from each item
        BeanWrapperFieldExtractor<CustomPojo> fieldExtractor = new CustomFieldExtractor();
        fieldExtractor.setNames(new String[] { "id", "description" });
        // the line aggregator joins the extracted fields with the delimiter
        DelimitedLineAggregator<CustomPojo> delLineAgg = new CustomDelimitedAggregator();
        delLineAgg.setDelimiter(",");
        delLineAgg.setFieldExtractor(fieldExtractor);
        writer.setLineAggregator(delLineAgg);
        return writer;
    }
    else {
        . . .
    }
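
Conceptually, the reader configuration tokenizes each line and maps the fields onto an object, while the writer configuration extracts the fields again and joins them with a delimiter. The plain Java sketch below shows that round trip under simplifying assumptions (no quoted fields, exactly two columns). It does not use the Spring Batch classes; SketchPojo stands in for CustomPojo, and FlatFileSketch and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical stand-in for the tutorial's CustomPojo.
class SketchPojo {
    final String id;
    final String description;

    SketchPojo(String id, String description) {
        this.id = id;
        this.description = description;
    }
}

class FlatFileSketch {

    // Reader side: tokenize one line on the delimiter and map the fields,
    // in the declared order ("id", "description"), onto a new object.
    static SketchPojo mapLine(String line, String delimiter) {
        String[] fields = line.split(Pattern.quote(delimiter), -1);
        if (fields.length != 2) {
            throw new IllegalArgumentException("expected 2 fields but got " + fields.length);
        }
        return new SketchPojo(fields[0].trim(), fields[1].trim());
    }

    // Writer side: extract the fields in the same order and join them
    // with the delimiter to build one output line.
    static String aggregate(SketchPojo pojo, String delimiter) {
        return String.join(delimiter, pojo.id, pojo.description);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("1,first", "2,second");
        List<String> out = lines.stream()
                .map(line -> aggregate(mapLine(line, ","), ";"))
                .collect(Collectors.toList());
        System.out.println(out); // prints [1;first, 2;second]
    }
}
```

The order of the names passed to the tokenizer and to the field extractor plays the role of the fixed field order in this sketch.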

7. MySQL example
In this chapter we are going to see how to modify our writer and our data source in order to write processed records to a local MySQL DB.

If we want to read data from a MySQL DB we first need to modify the configuration of the data source bean with the needed connection
parameters:

@Bean
public DataSource dataSource() throws SQLException {
    . . .
    else if ("mysql".equals(this.mode)) {
        // mysql data source
        final DriverManagerDataSource dataSource = new DriverManagerDataSource();
        dataSource.setDriverClassName("com.mysql.jdbc.Driver");
        dataSource.setUrl("jdbc:mysql://localhost/spring_batch_annotations");
        dataSource.setUsername("root");
        dataSource.setPassword("root");
        return dataSource;
    } else {
        . . .

Here is how the writer can be modified using an SQL statement and a JdbcBatchItemWriter that gets initialized with the data source shown above:

@Bean
public ItemWriter<CustomPojo> writer(DataSource dataSource) {
    ...
    else if ("mysql".equals(this.mode)) {
        JdbcBatchItemWriter<CustomPojo> writer = new JdbcBatchItemWriter<CustomPojo>();
        // the named parameters are filled from the item properties of the same name
        writer.setSql("INSERT INTO pojo (id, description) VALUES (:id, :description)");
        writer.setDataSource(dataSource);
        writer.setItemSqlParameterSourceProvider(
                new BeanPropertyItemSqlParameterSourceProvider<CustomPojo>());
        return writer;
    }
    ...

It is worth mentioning here that there is a known problem with the required Jettison library:
http://stackoverflow.com/questions/28627206/spring-batch-exception-cannot-construct-java-util-mapentry.

8. In Memory DB (HSQLDB) example


As a third example we are going to show how to create readers and writers in order to use an in memory database, which is very useful for
testing scenarios. By default, if nothing else is specified, Spring Batch chooses HSQLDB as data source.

The data source to be used is in this case the same one as for a MySQL DB but with different parameters (containing the HSQLDB
configuration):

@Bean
public DataSource dataSource() throws SQLException {
    . . .
    } else {
        // hsqldb datasource
        final DriverManagerDataSource dataSource = new DriverManagerDataSource();
        dataSource.setDriverClassName("org.hsqldb.jdbcDriver");
        dataSource.setUrl("jdbc:hsqldb:mem:test");
        dataSource.setUsername("sa");
        dataSource.setPassword("");
        return dataSource;
    }
}

The writer does not differ (almost) from the MySQL one:

@Bean
public ItemWriter<CustomPojo> writer(DataSource dataSource) {
    if ("hsqldb".equals(this.mode)) {
        // hsqldb writer using JdbcBatchItemWriter (the difference is the
        // datasource)
        JdbcBatchItemWriter<CustomPojo> writer = new JdbcBatchItemWriter<CustomPojo>();
        writer.setItemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<CustomPojo>());
        writer.setSql("INSERT INTO pojo (id, description) VALUES (:id, :description)");
        writer.setDataSource(dataSource);
        return writer;
    } else
        . . .

If we want Spring to take care of initializing the database, we can create a script named schema-all.sql (for all
providers; schema-hsqldb.sql for HSQLDB, schema-mysql.sql for MySQL, etc.) in the resources folder of our project:

DROP TABLE IF EXISTS POJO;

CREATE TABLE POJO (
    id VARCHAR(20),
    description VARCHAR(20)
);

This script is also provided in the download section at the end of the tutorial.

9. Unit testing
In this chapter we are going to see briefly how to test batch applications using the Spring Batch testing capabilities. This chapter does not
explain how to test Java applications in general or Spring-based ones in particular. It only covers end-to-end testing of Spring Batch
applications, that is, testing of Jobs or Steps. Unit testing of single elements like item processors, readers or writers is
excluded, since it does not differ from normal unit testing.

The Spring Batch Test Project contains abstractions that facilitate the unit testing of batch applications.

Two annotations are basic when running unit tests (using JUnit in this case) in Spring:

@RunWith(SpringJUnit4ClassRunner.class): JUnit annotation to execute all methods marked as tests. By passing the
SpringJUnit4ClassRunner class as parameter we indicate that this class can use all Spring testing capabilities.
@ContextConfiguration(locations = {. . .}): we will not use the “locations” property because we are not using XML configuration files but
configuration classes directly.


Instances of the class JobLauncherTestUtils
(http://docs.spring.io/spring-batch/trunk/apidocs/org/springframework/batch/test/JobLauncherTestUtils.html) can be
used for launching jobs and single steps inside the unit test methods (among many other functionalities). Its method launchJob()
executes a Job and its method launchStep("name") executes a step from end to end. In the following example you can see how to use
these methods in real JUnit tests:

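import static org.junit.Assert.assertEquals;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.test.JobLauncherTestUtils;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;
import org.springframework.test.context.support.AnnotationConfigContextLoader;

@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(classes=SpringBatchTutorialConfiguration.class, loader=AnnotationConfigContextLoader.class)
public class SpringBatchUnitTest {

    @Autowired
    private JobLauncherTestUtils jobLauncherTestUtils;

    @Autowired
    JdbcTemplate jdbcTemplate;

    @Test
    public void testLaunchJob() throws Exception {
        // test a complete job
        JobExecution jobExecution = jobLauncherTestUtils.launchJob();
        assertEquals(BatchStatus.COMPLETED, jobExecution.getStatus());
    }

    @Test
    public void testLaunchStep() {
        // test an individual step
        JobExecution jobExecution = jobLauncherTestUtils.launchStep("step1");
        assertEquals(BatchStatus.COMPLETED, jobExecution.getStatus());
    }
}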

You can validate the tests by checking the status of the Job execution for complete Job tests, or by asserting the results of the writer
for single Step tests. In the example shown we do not use any XML configuration file; instead we use the already mentioned configuration
class. To tell the unit test to load this configuration, the annotation ContextConfiguration with the properties “classes” and
“loader” is used:

@ContextConfiguration(classes=SpringBatchTutorialConfiguration.class,
    loader=AnnotationConfigContextLoader.class)

More information about Spring Batch unit testing can be found in the following tutorial:
http://docs.spring.io/spring-batch/trunk/reference/html/testing.html.

10. Error handling and retrying Jobs


Spring provides mechanisms for retrying Jobs, but since release 2.2.0 this functionality is no longer part of the Spring Batch framework; it is
included in Spring Retry: http://docs.spring.io/spring-retry/docs/api/current/. A very good tutorial can be found here:
http://docs.spring.io/spring-batch/trunk/reference/html/retry.html.

Retry policies, callbacks and recovery mechanisms are part of this library.
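
To make the idea of a retry policy with a recovery step more concrete, here is a minimal, framework-free sketch in plain Java. It is not the Spring Retry API: the class SimpleRetrySketch, its execute method and the parameter names are hypothetical and only illustrate the pattern of retrying an operation a limited number of times and falling back to a recovery callback when all attempts fail.

```java
import java.util.concurrent.Callable;
import java.util.function.Function;

/** Framework-free sketch of retry with recovery; hypothetical names, not the Spring Retry API. */
final class SimpleRetrySketch {

    /**
     * Calls the action until it succeeds or maxAttempts is reached.
     * When every attempt fails, the recovery function receives the last exception.
     */
    static <T> T execute(Callable<T> action, int maxAttempts, Function<Exception, T> recovery) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
        }
        return recovery.apply(last);
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // The action fails twice and succeeds on the third attempt.
        String result = execute(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("transient failure");
            }
            return "succeeded on attempt " + calls[0];
        }, 5, e -> "recovered");
        System.out.println(result);
    }
}
```

In a real application these roles would be played by the retry policy, retry callback and recovery callback that Spring Retry provides.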

11. Parallel Processing


Spring Batch supports parallel processing in two possible variations (single process and multi process) that we can separate into the following
categories. In this chapter we are just going to list these categories and explain briefly how Spring Batch provides solutions to them:

Multi-threaded Step (single process): Programmers can implement their readers and writers in a thread-safe way, so that multithreading
can be used and the step processing can be executed in different threads. Spring Batch provides several ItemWriter and ItemReader
implementations out of the box. Their descriptions normally state whether they are thread safe or not. In case this information is not
provided, or the implementations clearly state that they are not thread safe, programmers can always synchronize the call to the read()
method. This way, several records can be processed in parallel.
Parallel Steps (single process): If application modules can be executed in parallel because their logic does not collide, these
modules can be executed in different steps in parallel. This is different from the scenario explained in the previous point, where
each step execution processes different records in parallel; here, different steps run in parallel.
Spring Batch supports this scenario with the element split. Here is an example configuration that may help to understand it better:

<job id="havingLunchJob">
    <split id="split1" task-executor="taskExecutor" next="cleanTableStep">
        <flow>
            <step id="step1" parent="s1" next="eatCakeStep"/>
            <step id="eatCakeStep" parent="s2"/>
        </flow>
        <flow>
            <step id="drinkWineStep" parent="s3"/>
        </flow>
    </split>
    <step id="cleanTableStep" parent="parentStep1"/>
    . . .

Remote Chunking of Step (multi process): In this mode, the step processing is split across different processes that communicate with
each other using some middleware system (for example JMS). Basically there is a master component running locally and several
remote processes, called slaves. The master component is a normal Spring Batch Step whose writer knows how to send chunks of items as
messages using the middleware mentioned before. The slaves are implementations of item writers and item processors with the ability to
process the messages. The master component should not be a bottleneck; the standard way to implement this pattern is to leave the
expensive parts in the processors and writers and the light parts in the readers.
Partitioning a Step (single or multi process): Spring Batch offers the possibility to partition Steps and execute them remotely. The
remote instances are Steps.
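
The synchronization mentioned in the multi-threaded step point can be sketched without any framework. The following plain Java example assumes a hypothetical SimpleReader interface (it is not Spring Batch's ItemReader) and wraps a reader that is not thread safe in a decorator whose read() method is synchronized, so that several threads can pull items without reading the same item twice.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch of serializing read() calls; hypothetical types, not the Spring Batch interfaces. */
final class SynchronizedReaderSketch {

    /** Minimal reader contract used only in this sketch: returns null when the input is exhausted. */
    interface SimpleReader<T> {
        T read();
    }

    /** A reader that is NOT thread safe: it advances a plain iterator. */
    static final class ListReader<T> implements SimpleReader<T> {
        private final Iterator<T> iterator;

        ListReader(List<T> items) {
            this.iterator = items.iterator();
        }

        @Override
        public T read() {
            return iterator.hasNext() ? iterator.next() : null;
        }
    }

    /** Decorator that lets only one thread at a time call the delegate's read(). */
    static final class SynchronizedReader<T> implements SimpleReader<T> {
        private final SimpleReader<T> delegate;

        SynchronizedReader(SimpleReader<T> delegate) {
            this.delegate = delegate;
        }

        @Override
        public synchronized T read() {
            return delegate.read();
        }
    }

    /** Reads itemCount items with the given number of threads and returns how many items were read. */
    static int readAllInParallel(int itemCount, int threads) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < itemCount; i++) {
            items.add(i);
        }
        SimpleReader<Integer> reader = new SynchronizedReader<>(new ListReader<>(items));
        AtomicInteger count = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                while (reader.read() != null) {
                    count.incrementAndGet(); // stands in for processing and writing the item
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return count.get();
    }

    public static void main(String[] args) {
        System.out.println(readAllInParallel(1000, 4));
    }
}
```

Only the read() call is serialized here; whatever each thread does with the item afterwards still runs in parallel.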

These are the main options that Spring Batch offers programmers for running their batch applications in parallel.
Parallelism in general, and parallelism in batch processing in particular, is a very deep and complicated topic that is out of the scope of this
document.
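
As a rough illustration of what the split configuration in the parallel steps point expresses, the following plain Java sketch runs two flows on an executor and starts the final step only after both have finished. It assumes nothing about Spring Batch internals; the class SplitFlowSketch is hypothetical and the steps are reduced to log entries named after the steps of the XML example.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Sketch of "two flows in parallel, then one final step"; hypothetical, not Spring Batch code. */
final class SplitFlowSketch {

    /** Runs both flows concurrently and returns the order in which the steps were logged. */
    static List<String> runHavingLunchJob() {
        List<String> log = new CopyOnWriteArrayList<>();
        ExecutorService taskExecutor = Executors.newFixedThreadPool(2);
        try {
            // First flow: two steps executed one after the other.
            Future<?> firstFlow = taskExecutor.submit(() -> {
                log.add("step1");
                log.add("eatCakeStep");
            });
            // Second flow: a single step that may run at the same time as the first flow.
            Future<?> secondFlow = taskExecutor.submit(() -> {
                log.add("drinkWineStep");
            });
            // The split is complete only when every flow has finished.
            firstFlow.get();
            secondFlow.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            taskExecutor.shutdown();
        }
        // The step referenced by next="cleanTableStep" runs after the split.
        log.add("cleanTableStep");
        return log;
    }

    public static void main(String[] args) {
        System.out.println(runHavingLunchJob());
    }
}
```

The relative order of drinkWineStep and the steps of the first flow is not fixed, which is exactly the point of running the flows in parallel; only the order inside a flow and the position of the final step are guaranteed.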

12. Repeating jobs


Spring Batch offers the possibility to repeat Jobs and Tasks in a programmatic and configurable way. In other words, it is possible to configure
our batch applications to repeat Jobs or Steps until specific conditions are met (or for as long as specific conditions are not yet met). Several
abstractions are available for this purpose:

Repeat Operations: The interface RepeatOperations is the basis for all the repeat mechanism in Spring Batch. It contains a method to
be implemented where a callback is passed. This callback is executed in each iteration. It looks like the following:

public interface RepeatOperations {

    RepeatStatus iterate(RepeatCallback callback) throws RepeatException;

}

The RepeatCallback interface contains the functional logic that has to be repeated in the Batch:

public interface RepeatCallback {

    RepeatStatus doInIteration(RepeatContext context) throws Exception;

}

The RepeatStatus returned by iterate() and doInIteration() respectively should be RepeatStatus.CONTINUABLE in case the batch should
continue iterating, or RepeatStatus.FINISHED in case the batch processing should be terminated.

Spring already provides some basic implementations for the RepeatCallback interface.

Repeat Templates: The class RepeatTemplate is a very useful implementation of the RepeatOperations interface that can be used as a
starting point in our batch applications. It contains basic functionality and default behavior for error handling and finalization
mechanisms. Applications that do not want this default behavior should implement their own completion policies.
Here is an example of how to use a repeat template with a fixed chunk size completion policy and a dummy callback:

RepeatTemplate template = new RepeatTemplate();
// stop after 10 iterations, whatever the callback returns
template.setCompletionPolicy(new SimpleCompletionPolicy(10));
template.iterate(new RepeatCallback() {

    public RepeatStatus doInIteration(RepeatContext context) {
        // dummy work standing in for the real batch logic
        int x = 10;
        x *= 10;
        x /= 10;
        return RepeatStatus.CONTINUABLE;
    }

});

In this case the batch will terminate after 10 iterations, since the doInIteration() callback always returns CONTINUABLE
and leaves the responsibility for termination to the completion policy.

Repeat Status: Spring contains an enumeration with the possible continuation statuses, RepeatStatus.CONTINUABLE and
RepeatStatus.FINISHED, indicating that the processing should continue or that it is finished (successfully or unsuccessfully). See
http://docs.spring.io/spring-batch/trunk/apidocs/org/springframework/batch/repeat/RepeatStatus.html
Repeat Context: It is possible to store transient data in the repeat context, which is passed as parameter to the RepeatCallback
doInIteration() method. Spring Batch provides the abstraction RepeatContext for this purpose. After the iterate() method returns, the
context no longer exists. A repeat context has a parent context in case iterations are nested; in these cases, it is
possible to use the parent context to store information that can be shared between different iterations, like counters or decision
variables.
Repeat Policy: The termination mechanism of the repeat template is determined by a CompletionPolicy. This policy is also in charge of
creating a RepeatContext and passing it to the callback in every iteration. Once an iteration is completed, the template calls the
completion policy and updates its state, which is stored in the repeat context. After that, the template asks the policy to check whether
the processing is complete. Spring contains several implementations of this interface; one of the simplest is SimpleCompletionPolicy,
which executes the batch for just a fixed number of iterations.
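
The interplay between callback, status and completion policy can be sketched in a few lines of plain Java. The types below (Status, Callback, Policy) are simplified, hypothetical stand-ins and not the Spring Batch interfaces: the loop ends either when the callback reports that it is finished or when the policy decides that enough iterations have run.

```java
/** Framework-free sketch of a repeat loop; hypothetical types, not the Spring Batch API. */
final class RepeatSketch {

    /** Mirrors the two continuation statuses described in the text. */
    enum Status { CONTINUABLE, FINISHED }

    /** The logic to repeat; receives the zero-based iteration index. */
    interface Callback {
        Status doInIteration(int iteration);
    }

    /** Decides, from the number of completed iterations, whether the loop is done. */
    interface Policy {
        boolean isComplete(int completedIterations);
    }

    /** Repeats the callback until it returns FINISHED or the policy reports completion. */
    static int iterate(Callback callback, Policy policy) {
        int completed = 0;
        Status status = Status.CONTINUABLE;
        while (status == Status.CONTINUABLE && !policy.isComplete(completed)) {
            status = callback.doInIteration(completed);
            completed++;
        }
        return completed; // number of iterations that actually ran
    }

    public static void main(String[] args) {
        // The callback never asks to stop, so the policy ends the loop.
        int byPolicy = iterate(i -> Status.CONTINUABLE, done -> done >= 10);
        // The callback asks to stop during its third iteration (index 2).
        int byCallback = iterate(i -> i == 2 ? Status.FINISHED : Status.CONTINUABLE, done -> done >= 10);
        System.out.println(byPolicy + " " + byCallback);
    }
}
```

Keeping the stop condition in a separate policy object is what lets the same callback be reused with different termination rules.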

13. JSR 352 Batch Applications for the Java Platform


Since Java EE 7, batch processing is included in the Java Platform. JSR 352 (Batch Applications for the Java Platform) specifies a model for
batch applications and a runtime for scheduling and executing jobs. At the time of writing this tutorial, Spring Batch (3.0)
fully implements the JSR 352 specification.

The domain model and the vocabulary used are pretty similar to the ones used in Spring Batch. Jobs, Steps, Chunks, Items, ItemReaders,
ItemWriters, ItemProcessors etc. are present in the JSR 352 (Batch Applications for the Java Platform) model as well. The differences
between both frameworks are minor and the configuration files look almost the same.

This is a good thing for both programmers and the industry. The industry profits from the fact that a standard has been created in the
Java Platform, based on a very good library like Spring Batch, which is widely used and well tested. Programmers benefit because, in case
Spring Batch is discontinued or cannot be used for any reason in their applications (compatibility, company policies, size restrictions…), they
can choose the Java standard implementation for batch processing without many changes in their systems.

For more information about how Spring Batch has been adapted to JSR 352, please visit
http://docs.spring.io/spring-batch/reference/html/jsr-352.html.

14. Summary
So that’s it. I hope you have enjoyed it and that you are now able to configure and implement batch applications using Spring Batch. Here I
summarize the most important points explained in this article:

Spring Batch is a batch processing framework built upon the Spring Framework.
Mainly (simplifying!) it is composed of Jobs, containing Steps, where Readers, Processors and Writers are configured and concatenated
to execute the desired actions.
Spring Batch contains mechanisms that allow programmers to work with the main providers like MySQL and MongoDB and with formats like SQL,
CSV or XML out of the box.
Spring Batch contains features for error handling, and for repeating and retrying Jobs and Steps.
It also offers possibilities for parallel processing.
It contains classes and interfaces for unit testing batch applications.

In this tutorial I used no XML files (apart from some examples) for configuring the Spring context; everything was done via annotations. I did it
this way for clarity, but I do not recommend doing this in real-life applications, since XML configuration files may be useful in specific
scenarios. As I said, this was a tutorial about Spring Batch and not about Spring in general.

15. Resources
The following links contain a lot of information and theoretical examples where you can learn all the features of the Spring Batch module:

http://docs.spring.io/spring-batch/reference/html/index.html
https://jcp.org/en/jsr/detail?id=352
https://spring.io/guides/gs/batch-processing/
https://kb.iu.edu/d/afrx

16. Download Spring Batch Tutorial source Code


Download
You can download the full source code of this Spring Batch Tutorial here: spring_batch_tutorial.


17 Comments

Ashutosh Sharma, 5 years ago:
Good work… few corrections needed. Correct the mySQL connection string as:
jdbc:mysql://localhost:3306/spring_batch_annotations in this class: SpringBatchTutorialConfiguration.
Also refer to this:
https://github.com/spring-projects/spring-batch/blob/master/spring-batch-core/src/main/resources/org/springframework/batch/core/schema-mysql.sql
for creating the mysql tables which are needed for Springbatchframework metadata/product tables

balakishan, 5 years ago:
it is very useful to devolper

balakishan, 5 years ago:
nice website

raghave shukla, 5 years ago:
One of the most well structured technical blog. I think you should write a book. There are lot of
great developers but very few have the ability to express to clearly in most effective and least
number of words. Very nice explanation. Well, coming back to the topic. There are certain doubts
which result from probably overlapping of the batch process concept with other technologies. Like its
necessary to explain difference of use-cases between Batch Processing, Scheduling and Messaging
Services. Batch processes are sequential, Messaging services are asynchronous, Scheduling can be
sequential or asynchronous. Say i need to process thousands…

Neha, 5 years ago:
Could you please only share the Spring Batch mongodb Item Reader and Writter example? regards,
Neha

Prasanna Kumar, 5 years ago:
Its very clear and helpful for Initiators like me Im very thank ful to u.

kalyan, 5 years ago:
how can i get the pdf document?

dasdasd, 5 years ago:
May I just say what a relief to discover somebody that genuinely understands what they’re talking about on the internet.
You definitely know how to bring an issue to light and make it important. More people really need to read this and understand
this side of your story. I was surprised that you are not more popular since you surely have the gift.

Devendra, 4 years ago:
Very nice tutorial

Manu Gupta, 4 years ago:
How can I get this tutorial in pdf format. Please help.

Savani, 4 years ago:
Could you please provide XML version of this same project ? I also see current code don’t have DS
configuration details? please do the needful.
Thank & Regards,
Savani

Sundar, 4 years ago (reply to Savani):
How can i get the PDF version here?

Goshlive, 4 years ago:
Hello
Could you please share the full example including the Spring Batch’s XML configuration as well?

Savani, 4 years ago:
Could you please developed code for the Mongo DB to XML Spring Batch example ?

Matiass, 3 years ago:
How we can use Oracle Database instead of Mysql ? please Help

(name not shown), 1 year ago:
where to download?

Eleftheria Drosopoulou (Editor), 1 year ago (reply):
The zip file is at the end of the article.
