
Migrating AS400-COBOL to Java

A Report from the Field

Harry M. Sneed
Anecon GmbH, Vienna, Austria
harry.sneed@t-online.de

Katalin Erdoes
Anecon GmbH, Budapest, Hungary
Katalin.erdos1@t-online.hu

Abstract: This paper describes an industrial project aimed at migrating legacy COBOL programs running on an IBM-AS400 to Java for running in an open environment. The unique aspect of this migration is the reengineering of the COBOL code prior to migration. The programs were in their previous form hardwired to the AS400 screens as well as to the AS400 file system. The goal of the reengineering project was to free the code from these proprietary dependencies and to reduce it to the pure business logic. Disentangling legacy code from its physical environment is a major prerequisite to converting that code to another environment. The goal is the virtualization of program interfaces. That was accomplished here in a multistep automated process which led to small, environment independent COBOL modules which could be readily converted into Java packages. The pilot project has been completed for a sample subset of the production planning and control system. The conversion to Java is pending the test of the reengineered COBOL modules.

Keywords: Reengineering, IBM-I-Series, COBOL, Modularization, Restructuring, Business logic, Refactoring.

I. REASONS FOR THIS MIGRATION

Migration of existing software systems is an evergreen subject within the IT world. Software is packaged knowledge and knowledge is the basis of any enterprise. As pointed out by the Austrian economist Friedrich Hayek, the preservation of capital in the form of corporate knowledge is a key concern to all enterprises [1]. The knowledge contained within IT systems is embedded in the code. Preserving that knowledge means preserving the code or at least the logic contained therein.

Unfortunately software is dependent on the environment in which it is operating. It is wired to a particular machine and to supporting software systems such as database and communication systems. Machines become obsolete and the supporting software systems, i.e. the framework in which the code exists, are replaced, thus making the code unusable. It has always been propagated that software application systems should be independent of the platforms on which they run. Portability has been a major quality goal from the very beginning. However, vendors of hardware and software systems have paid little heed to this noble goal. Their interest is to trap the users of their products and to make it very difficult for them to escape their trap. Such is the case with the IBM AS400 machine, now the I-series. Scores of middle size enterprises throughout the world have invested their corporate knowledge into application systems running in this environment and now they are trapped in it. Fortunately IBM continues to support it, but no one knows for how long.

The AS400 application in question is for planning and allocating resources for the production of ultra heat resistant steel melting ovens. The main business goal of the system is to ensure that resources required for the oven production are delivered just in time just where they are needed. The software was originally developed in the second half of the 1980s by a firm specialized in that field and has been enhanced by the user over the past 20 years. As of last year the system consisted of:

900 COBOL programs with 2.065.865 statements = 137 KFpts
121 RPG programs with 58.000 statements = 3.8 KFpts
1.608 Database files with 48.336 data attributes
498 AS400 user interface screens with 4316 display fields

COBOL was intended to be a portable language but like Java it cannot operate on its own. It is embedded in a technical framework consisting of user interfaces, system interfaces, database accesses and utility routines. The I-series COBOL is bound to the I-series machine. The interface and the database handling are purely I-series specific, and that makes up at least one third of the COBOL code. Therefore, even if the user would choose to remain in COBOL but move to another platform, he would still have to convert the programs in this application.

The other problem with the COBOL language is that it is no longer taught in the technical high schools and universities. Therefore, it is very difficult to recruit COBOL programmers. In Austria it is almost impossible to find programmers with knowledge of COBOL. Those few that are left are all close to retirement. That leaves the enterprise with few options when it comes to staffing the maintenance project. On the other hand, the market is abundant with young Java programmers, who are much cheaper than the COBOL programmers available.

Finally, there is yet another reason for migrating to Java. This middle size enterprise is operating worldwide. It has subsidiaries in the U.S.A., India and China where it has production sites. The current application system is tuned to optimize the distributed production processes. There are now many standard software packages for supporting this process, packages which also support the local language. It is cheaper to use such a standard ERP system than to maintain one's own.
But here too there is a hitch. The company has special features in their current production planning and control systems which are not covered by the standard systems offered. If a standard system were to be introduced, the enterprise specific functionality would have to be added on to the standard functionality. That is only possible if the add-on components are compatible with the standard components. For this they have to be implemented in Java [2].

II. PROBLEMS OF THIS MIGRATION

The dependence of the current COBOL programs on the I-Series operating system for communication and data management is the main obstacle to migration. The I-Series grew out of the former AS400 architecture in which hardware and software were tightly integrated in a medium-sized computer for small and middle-size enterprises. The software is made up of user interface maps generated from screen prototypes, database tables generated from fixed record descriptions and programs written in either RPG or COBOL containing the business rules. On top of these there are control procedures written in the CL language for controlling the execution of the programs. All four artifact types are tightly coupled with each other (see Figure 1).

[Figure 1: Integrated AS400 Environment. AS400 programs are composed of maps, files and rules.]

The user interfaces are fixed screens defined in separate MAP macros. A MAP macro describes the position of fields on the screen. Every field has a unique identifier, a position and a length. The position is defined by the line number and column number. Fields can be either variable or constant. Constant fields are titles whose value is set in the screen definition. The contents of the variable fields are set by the program at runtime or typed in by the user. There are also field attributes for defining color, boldness, underlining etc. (See Sample 1)

    A R AFMT00
    A 98 CSRLOC(APZL APSP)
    A CHGINPDFT
    A 1 29'Verpackungsstruktur'
    A 2 30'AA Kurzbezeichnung'
    A 4 2'Lnr Ebene Pos'
    A 4 19'VP-Menge ME'
    A APROG 4A B 1 4 OVRDTA
    A DSPATR(UL)
    A COLOR(YLW)
    A 03 DSPATR(RI PC)
    A ACOD 6A B 1 17 OVRDTA
    A DSPATR(ND)
    A ABILD 1A B 1 26 OVRDTA
    A DSPATR(UL)
    A COLOR(YLW)
    A 04 DSPATR(RI PC)

These MAP macros are generated from a prototype screen designed by the user. A pre-compiler then converts them to a COBOL data structure contained in a COPY member. The COBOL program moves data in and out of the variable fields. It reads the entire screen as a single object with an ACCEPT command and writes the entire screen with a DISPLAY command. ACCEPT and DISPLAY statements are standard COBOL instructions for sending and receiving messages. Here they are used for reading and displaying the user interface.

Of course a Java program will not work in this way. The DISPLAY and ACCEPT statements have to be replaced by calls to a Java class which describes and handles the GUI with set and get operations. These are referred to as screen proxies and can be traced to the work of Bodhuin and Tortorella at RCOST [3]. Every reference to a map field in the COBOL program must also be converted to a reference to the set or get operation for that particular field. Since all of the data in the I-series panels is in string format, it is up to the get and set operations to convert the strings to the proper Java data type.

Besides creating a Java class for each I-series map, an HTML template has to be generated to depict the data as a web page. In so doing the attributes of the original map are converted to HTML attributes. To make this GUI acceptable to the user, the conversion engineer has to enhance the HTML source manually and add additional attributes. This is an example of where automated conversion reaches its limits. It is not possible to go from a primitive panel construction to a modern GUI without adding features that were not in the original user interface. The same problem was encountered by a project in Italy to migrate the software of a local government [4]. (See Sample 2)

    <param name="bgcolor" value="255,255,255">
    <param name="onsbtext" value="PPS-System">
    <param name="mheight" value="-1">
    <param name="topoffset" value="0">
    <param name="mindent" value="10">
    <param name="mfont" value="Helvetica, bold, 12">
    <param name="miconindent" value="7">
    <param name="mtextcolor" value="153,153,153">
    <!--Specific Settings-->
    <param name="maindesc0" value="AFMT00">
    <param name="maindesc1" value="Verpackungsstruktur">
    <param name="desc1-1" value="AA_Kurzbezeichnung">
    <param name="desturl1-2" value="N_Blatt">
    <param name="desc1-3" value="Lnr_Ebene_Pos">
    <param name="desturl1-3" value="VP_Menge_ME">
    <param name="desc1-4" value="BD/ZB_Ben">
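The screen proxy idea described in this section can be sketched in Java as follows. This is only an illustration under stated assumptions, not the code produced in the project: the class name `ScreenProxy`, the helper `pad` and the numeric field `AMENGE` are hypothetical, while `APROG` with its length of four characters is taken from Sample 1. The panel content is held as strings, and the get and set operations convert to and from Java types.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical screen proxy: panel fields are stored as strings, as on the I-series panel. */
class ScreenProxy {
    private final Map<String, String> panel = new HashMap<>();

    // Stands in for a COBOL "MOVE value TO APROG" on the map copy member (APROG is 4A in Sample 1).
    void setAPROG(String value) { panel.put("APROG", pad(value, 4)); }

    String getAPROG() { return panel.getOrDefault("APROG", "").trim(); }

    // AMENGE is an assumed numeric field with three decimals; it is not part of Sample 1.
    void setAMENGE(BigDecimal value) {
        panel.put("AMENGE", value.setScale(3, RoundingMode.HALF_UP).toPlainString());
    }

    BigDecimal getAMENGE() {
        String raw = panel.getOrDefault("AMENGE", "").trim();
        return raw.isEmpty() ? BigDecimal.ZERO : new BigDecimal(raw);
    }

    /** Returns the string exactly as it would be sent to the panel. */
    String raw(String fieldName) { return panel.getOrDefault(fieldName, ""); }

    // Pads with blanks or truncates to the fixed field length of the map definition.
    private static String pad(String s, int len) {
        String t = (s == null) ? "" : s;
        if (t.length() > len) {
            return t.substring(0, len);
        }
        StringBuilder sb = new StringBuilder(t);
        while (sb.length() < len) {
            sb.append(' ');
        }
        return sb.toString();
    }
}
```

A converted statement would then call `proxy.setAPROG(...)` or `proxy.getAMENGE()` where the COBOL program moved data to or from the map field, and the string handling stays inside the proxy.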
The second migration problem is that of data management. Modern I-series programs now use embedded SQL to access a relational data base, but this application still used indexed sequential files. The file descriptions are given in the COBOL Environment Division. The record structures are contained in COPY members included in the COBOL File Section. These COPY members are generated from the AS400 file descriptions where there is a line for each field. In this line there is a field-id, a field length and the title of the field. In case of numeric fields with a decimal point, there is also the number of digits after the decimal point. These data definition macros are, just as the map definition macros, generated from a record description panel. In this respect the AS400 development environment was highly advanced at the time it appeared on the market. It was intended to be used by end users with rudimentary programming skills. They could design their database structures and their user interfaces by creating prototypes on the screen. The fact that the AS400 developing environment is so highly integrated makes it all the harder to migrate. (See Sample 3)

    A DBDMDK 6 0 TEXT('Record-ID')
    A DBDFM1 11 3 TEXT('Fertigmaß 1')
    A DBDFM2 11 3 TEXT('Fertigmaß 2')
    A DBDFGG 11 3 TEXT('Fertigmasse in kg ge')
    A DBDSVL 4 TEXT('Steuerungsvorgangsnu')
    A DBDRM1 11 3 TEXT('Rohmaß 1')
    A DBDRM2 11 3 TEXT('Rohmaß 2')
    A DBDRGG 11 3 TEXT('Rohmasse in kg gesan')
    A DBDWGG 11 3 TEXT('Werkstättengewicht')
    A DBDRBL 5 0 TEXT('Restblechnummer')
    A DBDFOL 5 0 TEXT('Format Länge')
    A DBDFOB 4 0 TEXT('Format Breite')

The statements for accessing these database files are included in the procedural code. They are standard COBOL file processing statements:

    OPEN, READ, WRITE, REWRITE, DELETE and CLOSE

Fortunately in this application the developers designed special access modules, one for each file, in which the access operations for that file were collected together. Unfortunately, these modules were not implemented as separate subprograms which are called, but as COPY members included at compile time. Each access operation is a paragraph which is performed from within the compiled program. The access logic is very much intertwined with the AS400 database model, so it was clear from the start that these modules would have to be rewritten.

In converting to Java it is necessary to create a new access class for every file with embedded SQL statements. These classes must include the record structure as data attributes and the access operations as methods. Only the structure of these classes was generated; the access operations had to be rewritten. For every COBOL IO-statement there was found a corresponding SQL statement. READs became SELECTs, WRITEs became INSERTs, REWRITEs became UPDATEs and DELETEs remained DELETEs. The conversion of the COBOL file accesses to SQL was made by hand because of the need to optimize them. (See Sample 4)

    ***********************************************
    LFBD SECTION.
    MOVE PMDK TO DBDMDK OF IDBDPD-REC.
    * READ IDBDPD FIRST RECORD FORMAT IS "BDPD"
    AT END GO TO ………….
    -> SELECT IDBDPD-REC FROM IBDDPD
    WHERE RECORD-KEY = PMDK
    IF SQLCODE NOT = ZERO
    MOVE "10" TO PDBSTAT
    END-IF.
    EXIT.
    ************************************************

The conversion of the AS400 record descriptions to SQL tables is a very straightforward transformation. For every record a corresponding table was generated. For files with multiple records several tables were generated. They were tied to a master record by means of foreign keys. The techniques involved here have been well covered by the literature on data reengineering and applied in previous projects [5]. The results are CREATE TABLE statements, one for each converted table. (See Sample 5)

    CREATE TABLE IDBDPD-REC (
    DBDMDK NUMBER (6) NOT NULL,
    Fertigmass_1 NUMBER (11,3) NOT NULL,
    Fertigmass_2 NUMBER (11,3) NOT NULL,
    Fertigmasse_in_kg NUMBER (11,3) NOT NULL,
    Steuerungsvorgangsnu CHAR (4) NOT NULL,
    Rohmass_1 NUMBER (11,3) NOT NULL,
    Rohmass_2 NUMBER (11,3) NOT NULL,
    Rohmasse_in_kg_gesan NUMBER (11,3) NOT NULL,
    Werkstättengewicht NUMBER (11,3) NOT NULL,
    Restblechnummer CHAR (5) NOT NULL,
    Format_Länge CHAR (5) NOT NULL,
    Format_Breite CHAR (4) NOT NULL,
    …………………………………………………………………………………………………………
    CONSTRAINT Record-Key
    PRIMARY KEY (DBDMDK),
    CONSTRAINT Schlüssel_Materialke
    FOREIGN KEY (Tafeln_Soll, Tafeln_Ist)
    REFERENCES IDBDPX-REC,

The greater problem is that of converting the processing logic. Here as in many cases of legacy code, the code is not structured, but is driven by GOTO statements. There is no justification for this. The COBOL code could have been written with PERFORM UNTIL loops, EVALUATE cases and IF ELSE END-IF statements, but it was not because the programmers were accustomed to thinking in terms of a test and branch. As a result the program control logic was in the usual spaghetti mode. As a prerequisite to converting to Java the GOTO statements have to be removed [6].

Besides being unstructured, the code also contained many hard-wired data constants which make it difficult to change the code. The number 30 may be a date at one location and an age at another. Changing all 30's to 31's would in this case affect both dates and ages and cause an error. It also makes the code inflexible and non reusable. By carrying over the hard-wired data into Java it would only make the Java code equally
inflexible and non reusable. It also makes the code difficult to relocate.

Another detriment to conversion are the cryptic data names. In the older AS400 COBOL programs the eight-character names from the Map and File macro descriptions were often used to designate the COBOL variables even though this is not required. Programmers could have used 30-character names but did not. As with the GOTO control logic they were accustomed to thinking in terms of short names. Therefore it was typical to find statements like

    MOVE HBDWGG TO ABDWWG

It would not make sense to convert such statements to a Java environment without first changing the names. Fortunately, in the map and file descriptions long names are available as titles and can be taken from there to replace the short names in the statements. This is yet another task to be done prior to converting the code.

Finally there is a problem with incompatible data types. The map and file fields are all in character format, but inside the programs, programmers can use binary and fixed-decimal fields. To ensure compatibility and to avoid data type errors these data variables are converted to standard character format. Then all data within the COBOL code is in character format, either strings or decimal numeric. This eases the conversion to Java, especially in the case of redefinitions, a problem addressed by Tonelli and his colleagues in Trento [7].

Thus, in summarizing the problems to be solved, it is clear that most of the obstacles to a COBOL to Java conversion have to do with the way the COBOL code is written. It is senseless to try and solve them at conversion time. They have to be solved within the COBOL before conversion can begin. The reengineering measures required include:

• renaming the variables
• replacing the hard-wired data
• revising the incompatible data types
• refining the control structure and
• removing the GOTO branches

III. REENGINEERING THE AS400 COBOL CODE

The tool for reengineering the COBOL Code is COBREDO. The original REDO tool was presented at the second CSMR conference in 1998 [8]. Since then it has gone through many transitions. The current version has a number of separate functions, each with the same basic COBOL parser but a different interpreter built upon it. It is implemented as a chain of optional code transformation steps. These steps are

• reformatting the COBOL code
• refining and cleaning the code
• renaming data variables
• revising incompatible data types
• replacing hard-wired data
• relocating IO-operations
• restructuring the control logic
• refactoring the procedures (see Figure 2)

[Figure 2: SoftRedo COBOL Reengineering Process. Code is stepwise refined via multiple transformations, starting with the formatting of the code and ending with the refactoring.]

The advantage of this tool is that the user can select which transformations he wants to make on the code and configure his reengineering process accordingly. Not all of the functions were used in this project. It was not necessary to relocate the IO-operations since these were already collected in separate COPY members. The refactoring was not necessary since this was left to be done in the converted Java code.

A. Refining and cleaning the code

In the COBOL code involved here there are no obsolete statements such as ALTER, EXAMINE and NEXT SENTENCE but there are PERFORM THRUs and missing END-IFs. There are also comments missing where they should be, such as at the beginning of each paragraph. The Refine Step converts the PERFORM THRU statement to a sequence of PERFORMs in which each paragraph within the perform range is invoked separately, e.g.

    PERFORM A THRU C.

becomes

    PERFORM A.
    PERFORM B.
    PERFORM C.

Where an IF statement is terminated only by a period, an END-IF is inserted.

    IF a < b
       MOVE a TO c
       GO TO x.

is converted to

    IF a < b
       MOVE a TO c
       GO TO x
    END-IF.
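The PERFORM THRU expansion of the Refine step can be illustrated with a small Java sketch. It is not the COBREDO implementation, which works on a parsed COBOL source; the class `PerformThruExpander` and its method `expand` are hypothetical names, and the sketch assumes that the paragraph names are already available in source order and that the statement has the simple form `PERFORM x THRU y.`

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: expands "PERFORM A THRU C." into one PERFORM per paragraph. */
final class PerformThruExpander {

    /**
     * @param statement  a single COBOL statement, e.g. "PERFORM A THRU C."
     * @param paragraphs paragraph names of the program in source order (assumed input)
     * @return the expanded statements, or the unchanged statement if it is not a PERFORM THRU
     */
    static List<String> expand(String statement, List<String> paragraphs) {
        List<String> out = new ArrayList<>();
        // Drop the terminating period before splitting into words.
        String[] tok = statement.trim().replaceAll("\\.$", "").split("\\s+");
        boolean isThru = tok.length == 4
                && tok[0].equalsIgnoreCase("PERFORM")
                && (tok[2].equalsIgnoreCase("THRU") || tok[2].equalsIgnoreCase("THROUGH"));
        if (!isThru) {
            out.add(statement);
            return out;
        }
        int from = paragraphs.indexOf(tok[1]);
        int to = paragraphs.indexOf(tok[3]);
        if (from < 0 || to < from) {
            out.add(statement); // unknown range: leave the statement for manual inspection
            return out;
        }
        for (int i = from; i <= to; i++) {
            out.add("PERFORM " + paragraphs.get(i) + ".");
        }
        return out;
    }
}
```

Called with `"PERFORM A THRU C."` and the paragraph list A, B, C, D, the method returns the three separate PERFORM statements shown in the example above.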
Where a paragraph name is neither preceded nor followed by a comment, an empty comment block is inserted for the responsible programmer to fill in. The result of the refinement step is an easier to convert source member that is better structured and easier to read. (See Sample 6)

    CALL "IPBPRF" USING PARA1, PARA2, PARA3.
    IF AFK = "A" GO TO ST30A.
    IF AFK = "E" GO TO ST30E.
    MOVE 1 TO PTXNR.
    PERFORM FTXT.
    GO TO ST10.
    ST30I.
    PERFORM I000.
    IF AFK = "E" GO TO ST98.
    GO TO ST10.

is transformed to:

    CALL "IPBPRF" USING PARA1, PARA2, PARA3.
    IF AFK = "A"
       GO TO ST30A
    END-IF.
    IF AFK = "E"
       GO TO ST30E
    END-IF.
    MOVE 1 TO PTXNR.
    PERFORM FTXT.
    GO TO ST10.
    ST30I.
    PERFORM I000.
    IF AFK = "E"
       GO TO ST98
    END-IF.

B. Renaming data variables

Prior to this step the AS400 maps and file descriptions are parsed to collect the titles associated with the cryptic data names. These are then stored in an Excel table together with the short names. At reengineering time that table is read into main storage. When this step is executed each variable name in a COBOL statement is checked against the short names in the name table. If it is found, the long name is converted to a valid COBOL name and inserted in the statement in place of the short name. The result of the renaming step is a code full of long, speaking data names equivalent to the titles in the AS400 file description. (See Sample 7)

    * MOVE PTEXT TO MTEXT. ->
    MOVE Fertigmasse-in-kg-Gewicht TO Satzformatname.
    * MOVE MHF TO MCHF. ->
    MOVE Auftragsposition TO Mandant.
    * MOVE PPSP TO MPSP. ->
    MOVE Restblechnummer TO Bedarfsnummer.
    * MOVE PPZL TO MPZL. ->
    MOVE Sollmenge TO Auftragsnumme.

C. Revising data types

Wherever data types occur which are neither character nor decimal numeric, they are overwritten by one of these types. In this AS400 code only binary and packed decimal types are affected. Both are converted to a fixed decimal type and the length of the field extended. If the data field happens to be in a redefining or redefined data structure, the length of the shorter structure must be padded to coincide with the length of the longer structure. The result of the type revision step is a code containing only string types. This type conversion has been the subject of much debate, but it has proven itself in previous conversions. (See Sample 8)

    01 ZOBSTG.
       05 FILLER PIC X(0042).
    *  05 NSL PIC 9(0004) COMP-4.
    REMOVE 05 NSL PIC 9(0004).
    01 ZIBSTG.
       05 FILLER PIC X(0147).
       05 ZWZLN PIC X(0001).
       05 ZWSPN PIC X(0001).
    *  05 ZBIN PIC 9(0004) COMP-4.
    REMOVE 05 ZBIN PIC 9(0004).
       05 FILLER REDEFINES ZBIN.
          10 FILLER PIC X(0001).
          10 ZBIN2 PIC X(0001).

D. Replacing hard-wired data

The COBOL code involved here is full of hard-wired constants in procedural statements, either numbers or literals. They could be left as they are but that would only have pushed the problem off to the Java code. Here it is possible to replace the constants with variable names. The variable names are defined in a COBOL data structure and assigned an initial value taken from the constant referred to in the procedural statement. The statement is changed from

    MOVE "blue" TO COLOR

to

    MOVE X-LIT-001 TO COLOR

The X-LIT-001 is placed in a list of literals and given a picture and an initial value.

    05 X-LIT-001 PIC X(4) VALUE "blue".

The same applies to numbers such as in the statement

    ADD 100 TO SALARY.

which becomes

    ADD X-CON-001 TO SALARY.

The X-CON-001 is placed in a list of constants as depicted below:

    05 X-CON-001 PIC 999 VALUE "100".

The result of the constant replacement function is a COBOL source free of hard-wired data and a COPY member for each source in which the literals and constants are defined. (See Sample 9)

    IF ABDFGG = SPACE
       MOVE XLIT013 TO ABDFGG
    END-IF
    -----------------------------------------------
    01 REDO-LITERAL-TABLE.
    REPLAC 05 XLIT012 PIC X(006) VALUE
    REPLAC "AFK ".
    REPLAC 05 XLIT013 PIC X(006) VALUE
    REPLAC "ABDSPB".
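The replacement of hard-wired literals by named data items can be sketched in Java as follows. This is a simplified illustration, not the tool's own code: the class `LiteralReplacer` and its methods are hypothetical, only quoted alphanumeric literals are handled (numeric constants such as the 100 in the ADD example are left out), and the naming scheme X-LIT-nnn follows the example in the text.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: replaces quoted literals by generated names and collects their definitions. */
final class LiteralReplacer {
    private static final Pattern LITERAL = Pattern.compile("\"([^\"]*)\"");

    // literal value -> generated name, kept in the order of first appearance
    private final Map<String, String> names = new LinkedHashMap<>();

    /** Returns the statement with every quoted literal replaced by its X-LIT-nnn name. */
    String replace(String statement) {
        Matcher m = LITERAL.matcher(statement);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String literal = m.group(1);
            String name = names.get(literal);
            if (name == null) {
                name = String.format("X-LIT-%03d", names.size() + 1);
                names.put(literal, name);
            }
            m.appendReplacement(sb, Matcher.quoteReplacement(name));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    /** Returns the data definitions that would go into the generated COPY member. */
    List<String> copyMember() {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> e : names.entrySet()) {
            int length = Math.max(1, e.getKey().length());
            lines.add(String.format("05 %s PIC X(%d) VALUE \"%s\".", e.getValue(), length, e.getKey()));
        }
        return lines;
    }
}
```

A literal that occurs in several statements is given one name only, so that a later change of its value has to be made in a single place, the COPY member.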
E. Restructuring the Control Logic

The purpose of the last step is to remove the GOTO branches from the COBOL code so that it can be converted to Java. This is done by inserting a finite state control loop at the beginning of the COBOL program. Each paragraph is performed from here depending on the current value of the X-NEXT-LABEL variable. The loop is executed until a paragraph is performed which contains an exit command such as GOBACK or STOP RUN. This amounts to a finite state machine where each elementary code unit is invoked by a central control unit to which control is returned after the elementary unit has been executed. (See Figure 3)

[Figure 3: GOTO Conversion to PERFORM Subroutine (central control loop, mainline operations, performed operations, access operations)]

All GOTO statements in the COBOL code units are replaced by a statement which sets the next label to the label referenced by the GOTO. The statement

    GO TO LABEL-X

is replaced by the statement

    MOVE "LABEL-X" TO X-NEXT-LABEL

All of the statements following that GO TO are then skipped by inserting an IF statement

    IF X-NEXT-LABEL = SPACES
       <subsequent statements>
    END-IF

If the GO TO was conditional and was not executed, then the X-NEXT-LABEL variable will not be set and the subsequent statements will be executed. Otherwise, they will be skipped over.

The result of the restructuring step is a COBOL program with a central control loop at the beginning and an ordered set of elementary code units (paragraphs) which set where the control should go next. This solution has proven to work in several previous migrations and it also works here. It may not be the most elegant solution but it avoids the many errors that come up when the GOTO branches are replaced by nested IF and PERFORM UNTIL statements. (See Sample 10)

    RESTRU MOVE 'ST01' TO X-NEXT-LABEL.
    RESTRU PERFORM UNTIL X-NEXT-LABEL = 'STOP'
    RESTRU OR X-NEXT-LABEL = SPACES
    RESTRU    IF X-NEXT-LABEL = 'ST01'
    RESTRU       PERFORM ST01
    RESTRU    END-IF
    RESTRU    IF X-NEXT-LABEL = 'ST02'
    RESTRU       PERFORM ST02
    RESTRU    END-IF
    RESTRU    IF X-NEXT-LABEL = 'ST05'
    RESTRU       PERFORM ST05
    RESTRU    END-IF
    RESTRU    IF X-NEXT-LABEL = 'ST98A'
    RESTRU       PERFORM ST98A
    RESTRU    END-IF
    RESTRU ……………………………………………………………
    RESTRU END-PERFORM.

The mainline operations are the paragraphs performed by the central control loop. In place of their GOTOs the label variable is set to the name of the next paragraph to be performed by the control loop. Control is returned to the control loop when the next paragraph is declared.

    *******************************************
    A001.
    RESTRU MOVE SPACES TO X-NEXT-LABEL.
           MOVE ZERO TO MFE.
           PERFORM PRUEFE.
    RESTRU MOVE SPACES TO X-NEXT-LABEL
    REFORM IF MFE = 1
    REFORM*   GO TO A802
    RESTRU    MOVE 'A802' TO X-NEXT-LABEL
    REFINE END-IF.
    RESTRU IF X-NEXT-LABEL = SPACES
    REFORM    IF MFE = 2
    REFORM*      GO TO A822
    RESTRU       MOVE 'A822' TO X-NEXT-LABEL
    REFINE    END-IF
    REFINE END-IF
    RESTRU IF X-NEXT-LABEL = SPACES
    RESTRU    MOVE 'A002' TO X-NEXT-LABEL
    RESTRU END-IF
    *******************************************
    A002.

IV. THE COBOL TO JAVA CONVERSION

The conversion to Java is performed in three steps:

• Step1: The COBOL record descriptions are converted to classes
• Step2: The COBOL procedures, i.e. paragraphs, are converted to methods
• Step3: The methods are merged with the classes

Step1: Converting the Data: This conversion process has been described in a previous paper [9]. The method for converting data records to classes has remained basically the same. Each record becomes a class and each data field becomes a pair of get and set operations.

The next sample is that of an AS400 record description which is converted to a Java class. Here it is possible to see how the COBOL data declarations are converted to Java set and get operations. There is a set and get operation for each
COBOL variable that performs the data conversion to and
from the string format in which it is stored. By storing all data statements are simply converted on a 1:1 basis. There is a Java
in string format and giving then a starting position and a length pattern for every type of COBOL processing operation, e.g.
it is possible to process the original AS400 records without MOVE, COMPUTE, ADD, etc. The conversion tool fills in the
having to restructure them. The problems of redefined data and pattern with the proper data names (see Sample 13)
multiple arrays is circumvented here by defining the record as
public String IPBSTG_ST02() {
a whole as a single character string. (See Sample 11) String xNextMethod = Spaces();
IPBSTG.WORK_DATA.setSATZFORMATNAME(FERTIGMASS_2.getF
public class DBS_AF00 extends COBOLObject { ERTIGMASSE_IN_KG_GEWICHT());
private static DBS_AF00 instance = null; IPBSTG.WORK_DATA.setMANDANT(IPBSTG.WORK_DATA.getAUFT
private static IPBSTG IPBSTG ; RAGSPOSITION());
public static char[] AF00; IPBSTG.WORK_DATA.setBEDARFSNUMMER(IPBSTG.WORK_DATA.g
private DBS_AF00(Object program) { etRESTBLECHNUMMER());
AF00 = new char[636 ]; IPBSTG.WORK_DATA.setAUFTRAGSNR(IPBSTG.WORK_DATA.getS
AF00 = this.xSpaces(636 ).toCharArray(); OLLMENGE());
IPBSTG = (IPBSTG)program; IPBSTG.WORK_DATA.setAUFTRAGPOSITION(Zeros);
initDBS_AF00(); this.setWERKAUFTRAGSNUMMER(Zeros);
} this.setBESTELLNUMMER(Zeros);
public static DBS_AF00 getInstance(Object program) this.setTEILENUMMER(Zeros);
{ this.setZUSATZKENNZEICHEN(Zeros);
if (instance == null) {
instance = new DBS_AF00(program); }
return instance;
A good part of the code in the pattern is devoted to
} qualifying the sending and receiving variables. If the variables
/*<Attr name = "FERTIGMASS-1" type = "X(0006) " pos being processed happen to be in the class where the method is
= "0002" lng = "0006“/> <br> */ located, as is the case when Java code is designed from the
/**GetAttributeMethod<br>*/
public String getFERTIGMASS_1() {
start as object-oriented, there is no need to qualify them.
return getString(AF00,2,6); } However, if the variables are scattered in foreign classes, as
/**SetAttributeMethod<br>*/ they often are in procedural code, they have to be qualified by
public void setFERTIGMASS_1(String inStr) { their class names. IPBSTG is the master class of this particular
setAsChar(AF00,inStr,2,6); }
program. WORK_DATA is the class containing the local data
Step2: The second sample is that of an AS400 COBOL paragraph containing GOTOs and hard-wired data. This paragraph has been converted to a Java method free of GOTOs and embedded data. The price for this conversion is the return to the central controller, which guides control to the next method in place of the former GOTO statements. The important thing is that the Java code units are reusable and flexible. They can now be used in any context, including as wrapped web services. The method headers which have been inserted help to document the interface to the method by defining the input and output parameters. (See Sample 12)

    public String IPBSTG_ST30() {
        String xNextMethod = Spaces();
        //* Determine authorizations for BSTG
        this.setSOLLTERMIN(this.getAFK());
        this.setVERSANDMENGE(this.getFERTIGMASS_1());
        xNextMethod = IPBSTG.External.IPBPRF(PARA1);
        if (this.getAFK().compareTo("A") == 0) {
            xNextMethod = "$$Class.IPBSTG_ST30A";
            return xNextMethod;
        }
        if (this.getAFK().compareTo("E") == 0) {
            xNextMethod = "$$Class.IPBSTG_ST30E";
            return xNextMethod;
        }
        if (this.getAFK().compareTo("Q") == 0) {
            xNextMethod = "$$Class.IPBSTG_ST30Q";
            return xNextMethod;
        }
        $$Class.IPBSTG_ST30Q.setTEILENUMMER(1);
        xNextMethod = $&Class.IPBSTG_FTXT();
        xNextMethod = "$$Class.IPBSTG_ST10";
        return xNextMethod;
    }

The conversion of the processing paragraphs is straightforward. There are no branches out of the paragraph, so the [...] of that program. This is the price to be paid when the methods and variables are in separate classes.

Step3: In the third and final step, the generated methods are allocated to a class. There are two alternative solutions to this merging problem. One is to put all of the methods into one single processing class for each program. This is referred to as the God class [10]. It is the simplest method to implement and is often preferred by the old COBOL programmers, since it keeps the programs much as they were. However, the code is far from being object-oriented. The program remains a COBOL program with Java syntax. The only advantage of this approach is that the code is in Java and can be compiled and executed on a Java platform. In this case it is better to generate Java byte code from the COBOL source and to keep the COBOL program as it is. This solution is offered by MicroFocus, the world’s largest COBOL vendor.

The other solution is somewhat more object-oriented. The methods are assigned to classes based on locality of reference. A data usage analysis is made to determine which attributes from which class are used most by each method. The method is then assigned to the class whose attributes it references most. In this way real objects are created. Each user interface becomes a transient object. Each former AS400 file, now an SQL table, becomes a persistent data object. The other records in the working storage of the COBOL program become internal objects. Those COBOL variables which do not belong to a structure are collected together in a common object. The former COBOL program becomes a package consisting of many atomic classes, each with its own attributes and methods.
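The assignment of methods to classes by locality of reference can be illustrated with a minimal sketch. It assumes that the data usage analysis has already produced, for each generated method, a count of attribute references per candidate class. The class names `OrderRecord`, `ScreenMap` and `Common`, the reference counts, and the tie-breaking rule are hypothetical and are not taken from the project's tooling.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: assigns each method to the class whose attributes it references most.
final class LocalityAssigner {

    // usage: method name -> (class name -> number of attribute references).
    // Returns: method name -> owning class.
    // Assumption of this sketch: on a tie the class listed first wins, and a
    // method that references no attributes at all is mapped to null.
    public static Map<String, String> assign(Map<String, Map<String, Integer>> usage) {
        Map<String, String> owner = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, Integer>> method : usage.entrySet()) {
            String best = null;
            int bestCount = -1;
            for (Map.Entry<String, Integer> ref : method.getValue().entrySet()) {
                if (ref.getValue() > bestCount) {
                    best = ref.getKey();
                    bestCount = ref.getValue();
                }
            }
            owner.put(method.getKey(), best);
        }
        return owner;
    }

    public static void main(String[] args) {
        // Hypothetical result of a data usage analysis for two generated methods.
        Map<String, Integer> st30 = new LinkedHashMap<>();
        st30.put("ScreenMap", 1);
        st30.put("OrderRecord", 4);
        Map<String, Integer> ftxt = new LinkedHashMap<>();
        ftxt.put("ScreenMap", 3);
        ftxt.put("Common", 1);

        Map<String, Map<String, Integer>> usage = new LinkedHashMap<>();
        usage.put("IPBSTG_ST30", st30);
        usage.put("IPBSTG_FTXT", ftxt);

        System.out.println(assign(usage));
    }
}
```

A class that is never chosen as owner simply ends up without processing methods and remains a data container.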
Classes whose attributes are seldom referenced may contain no processing methods. They remain as data containers to be accessed by the other classes in the package. Just as in COBOL there is no reference to data in another program without going through a call interface, no class in one package can directly reference a class in another package. If a class needs data from a foreign class, then it must send a request message and get back a response. The COBOL call mechanism for passing data between separately compiled modules is imitated in Java by messaging. (see Figure 4)

[Figure 4: Data exchange between packages. Package X (Classes A, B, C) sends a request to Package Y (Classes D, B, C) and receives a response. The same classes, B and C, appear in different packages with different methods.]

V. PENDING WORK
By converting each COBOL program into a separate Java package the conversion is simplified and the resulting package can be easily tested against the original COBOL program. But as proven in previous projects there is a lot of code redundancy as the same classes appear in many packages, just as the same COBOL data structures are copied into many programs. However, in every package the methods contained in the class can be slightly different. To solve this problem any class which appears in at least two different packages must be identified and extracted from those packages. Then all of the variants of that class have to be examined to collect all of the different methods assigned to that class in all of the packages. The class then contains the original data attributes and in addition all methods of that class in all packages. These common classes are then placed in a separate common package. In this way the volume of code can be reduced by at least 50%. This approach is similar to clone removal but somewhat different since here classes are being merged. Since many of the methods from different packages may be similar, clone removal can be performed on the common package once it has been created. This is also planned for future conversion projects [11].

Another more serious problem to be solved is that of the GOTO driven control logic. In this AS400-COBOL code there are three types of COBOL paragraphs:
• Control paragraphs
• Processing paragraphs
• Service paragraphs.

The processing and service paragraphs can, as has been shown, be converted on a statement-by-statement basis. This is not true of the control paragraphs. Their control logic is far from the way a Java programmer would make it. The many nested branches here and there can never be depicted in a Java style. Therefore it is best if these methods are manually rewritten. They can be sorted out and given to the Java developers who rewrite them and put them back into the package from which they were taken. Thus, the conversion of old COBOL to Java cannot be fully automated. It is possible to automatically convert the data structures into classes and the processing and service paragraphs into methods but the control paragraphs such as the one depicted in Sample 10 have to be reimplemented to become truly Java-like [12].

VI. RELATED WORK
There has been an immense amount of work done on trying to solve the problem of converting procedural legacy code into object-oriented systems. It goes back to the very beginning of the object-oriented movement. At an early OOPSLA conference in New Orleans the question came up as to what should be done with all of the procedural code that had already been written. The answer of Wally Dietrich from IBM was to wrap it [13]. Wrapping may be good as a temporary solution but it does not do away with the legacy code. That code is preserved and has to be maintained. The problems associated with legacy code maintenance remain. To escape those problems the legacy code must be replaced. The main motivation in Europe for software migration is the lack of personnel to maintain the older software systems. That is also the case in this case study. There are hardly any AS400-COBOL programmers around to maintain the system and those that are there are too expensive. On the other hand there is an abundance of young Java programmers who are prepared to take over the system if it were only in their language.

The work on converting legacy code to an object-oriented language has taken place in two fields of endeavor: the academic field and the industrial field. In the academic field work on transforming programs into object-oriented languages goes back to the work of Martin Ward and his colleagues at the University of Durham with their tool Fermat, which generates a wide spectrum language (WSL) from the original code and then converts it into a target language [14]. A parallel development was at the University of Oxford where Kenneth Lano and his team converted COBOL code into another intermediate language (UNIFORM) and then converted that to an object-oriented COBOL [15]. At Durham, Hongji Yang came up with the concept of a God-Class for converting procedural C programs into object-oriented C++ [16]. Kloesch and Gall picked up this idea at the Technical University of Vienna and devised a tool supported process for converting C systems into object-oriented C++ [17].

The RCOST institute at the University of Benevento in Italy has a long tradition in software reengineering going back to the mid 1980’s. This work on reengineering has led to further work on software migration and source transformation
which has been published in many papers and conferences. Their particular strength has been in converting user interface software to other platforms as performed in a project for the local government [18]. The work of RCOST has been picked up and advanced by the University of Salerno in recent years with a number of significant papers by De Lucia and his colleagues there [19], [20]. In Canada, the University of Waterloo has been doing research in program transformation for many years [21]. The work of Ceccato and Tonella at the IRST Institute in Trento should also be cited for their contribution to the conversion of complex COBOL data structures into Java classes [7]. Finally, the research of the author Harry Sneed must be mentioned, who has been working on the source transformation problem for 20 years now and who has published three books and more than 40 technical articles on this subject [22].

The work in the industrial field began in the early 1990’s at the IBM Laboratory in Toronto. Mara Tomic reported in 1994 on an approach to the OO-Reengineering of COBOL programs [23]. This work led to the development of commercial tools for transforming COBOL and PL/I sources into C++. Later this work was picked up by Relativity Technology in North Carolina, which built the tool Rescueware for converting legacy COBOL programs to Java [24]. In Germany, the company SES developed a tool for converting procedural COBOL programs into OO-COBOL. It was the subject of the German book Object-oriented Software Migration [25]. At the same time MicroFocus offered a product to convert COBOL85 to OO-COBOL [26].

The step by Relativity Technology was much greater and won more attention. That firm developed a full scale COBOL to Java conversion tool. The problem with it was that it failed to make a complete transformation. The Java software had to be finished by hand by a subcontractor in Siberia [27]. The resulting Java code was error prone and far from what Java programmers expected. Later Relativity Technology gave up the venture and was taken over by MicroFocus [28]. In Japan Hitachi Software announced a product in 1997 that would create networking Java objects from legacy code with a minimum of human interaction [29]. Nothing more was heard of that product afterwards. In Austria the company Shark Software developed a tool for converting the COBOL programs of the Austrian social security agency into Java classes. The solution was to place the entire procedural code of a program into one big master class which often exceeded 10000 lines of code [30]. Inspired by the Rescueware project, Sneed developed the CodeTran tool for converting COBOL and PL/I code to Java. It was first used for a migration project at the Vienna Airport to convert 400,000 lines of COBOL code. This tool distributed the procedural logic among several classes and created a separate package for each program. The solution worked, but it led to an enormous volume of redundant code, as the same methods and same data structures came up in many packages.

In the meantime several tools of this sort are being offered on the commercial market. Here are only a few of the many solutions offered. Fermat Ltd, a spinoff of the University of Durham, offers a tool supported service for converting legacy code to Java. This is actually a continuation of the work of Martin Ward. In Manassas, Virginia, the Software Mining Company offers a tool set to extract Java business rules from COBOL code, but this solution is only partially automated. The tool set was used in a large scale migration for the state of Maine [31]. The French company GPL Inc. offers the NacaTrans tools to convert CICS-COBOL programs running on the IBM mainframe to Java. The tool has been used to help convert the software of a large bank with 4 million lines of code [32]. In the U.S. the well known legacy software company Micro-Processing Services (MSP) offers a tool, COB2J, which extracts an AST from the COBOL program and generates Java classes from it [33]. The Canadian company FreeSoft even offers a tool for converting AS400 systems over into Java-J2EE systems including user interfaces and relational databases [34].

The problem with all of these tools is that none of them produces a complete Java application which can be readily turned over to a team of Java programmers. The results always have to be manually completed and the resulting code has to be reengineered. What is worse, if the incoming COBOL code is messy, unstructured and unreadable, then messy, unstructured, unreadable Java code comes out. That is why the emphasis in this paper has been on reengineering and cleaning up the COBOL code prior to migration. It is absolutely necessary to first reengineer and then convert. Not doing so leads to the “garbage in and garbage out” syndrome [35].

VII. CONCLUSIONS
The industrial pilot project described here demonstrates that it is feasible to automatically convert portions of a legacy COBOL system running on the IBM I-series platform into Java but it is not possible to convert the system as a whole. Even converting portions of the legacy code is a major endeavor. Not only the programs but also the user interfaces and the database schemas have to be converted. On top of that the COBOL programs have to be reengineered before they are converted. All of the reengineering and conversion steps can be automated but the results of the automation are not always what a Java developer would like. The processing code units and the data accessing units can be transformed to acceptable Java methods. The control code can also be transformed but the result is suboptimal. Therefore the control logic has to be rewritten. The converted methods are executable but have to be manually refined to become maintainable. On top of that the redundant code in different packages needs to be removed by extracting common classes and similar methods, i.e. clones, into a single common class library.
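The extraction of common classes into a single library might be sketched as follows. This is only an illustration under simplifying assumptions: each class variant is represented by the set of its method names, methods are treated as identical when their names match, and the package, class and method names are borrowed from the labels of Figure 4. The names `CommonClassExtractor` and `extractCommon` are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Sketch only: collects classes that occur in two or more packages.
final class CommonClassExtractor {

    // packages: package name -> (class name -> method names of that variant).
    // Returns the common package: every class that occurs in at least two
    // packages, mapped to the union of the methods of all of its variants.
    public static Map<String, Set<String>> extractCommon(
            Map<String, Map<String, Set<String>>> packages) {
        Map<String, Integer> occurrences = new LinkedHashMap<>();
        Map<String, Set<String>> merged = new LinkedHashMap<>();
        for (Map<String, Set<String>> classes : packages.values()) {
            for (Map.Entry<String, Set<String>> cls : classes.entrySet()) {
                occurrences.merge(cls.getKey(), 1, Integer::sum);
                merged.computeIfAbsent(cls.getKey(), k -> new LinkedHashSet<>())
                      .addAll(cls.getValue());
            }
        }
        // Classes found in only one package stay in that package.
        merged.keySet().removeIf(name -> occurrences.get(name) < 2);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Set<String>>> packages = new LinkedHashMap<>();
        packages.put("X", Map.of(
                "A", Set.of("A1", "A2", "A3"),
                "B", Set.of("B1", "B21", "B3"),
                "C", Set.of("C1", "C2", "C3")));
        packages.put("Y", Map.of(
                "D", Set.of("D1", "D2", "D3"),
                "B", Set.of("B4", "B5", "B6"),
                "C", Set.of("C1", "C4", "C5")));

        // Classes B and C occur in both packages and are merged; A and D are not.
        System.out.println(extractCommon(packages).keySet());
    }
}
```

Clone removal on the resulting common package would be a separate step, since two methods with different names may still have nearly identical bodies.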
Thus, working building blocks (relational databases, HTML screens and individual Java classes) can be provided from the old system but the developers of the new system have to fit them together and implement the encompassing framework. The same could also have been achieved by wrapping the COBOL modules but the code would then remain in COBOL and COBOL programmers would be required to maintain it. The reason for this conversion is that there are no COBOL programmers available. Otherwise the whole system could have been left in COBOL. The ultimate alternative is to replace the user’s own system by a standard package. Technically, this would be the cleanest solution but as yet the end users still insist on keeping their old production process as implemented by the old AS400 system.

The main lesson learned is that you cannot transform a donkey into a race horse. If a system is designed to work in a particular environment it can be copied over into another environment but it will never really fit that environment. To really fit, a new system has to be designed which can be constructed from elementary building blocks taken from the old system. Here the reused converted Java code makes up roughly 60% of the code in the new system. The remaining 40% will have to be coded by hand. Still it is more economical to reuse the old code than to rewrite everything. The best solution in the long run would be for the user to give up developing and maintaining his own software and to fit his business processes to a standard business solution.

REFERENCES
[1] Hayek, F.: “The Maintenance of Capital” in Profits, Interest and Investment, Routledge & Sons, London, 1939
[2] Jacobson, I.: “Re-engineering of old systems to an object-oriented architecture”, Proc. of OOPSLA-91, ACM Press, New Orleans, 1991, p. 340
[3] Bodhuin, T., Tortorella, M.: “Migration of non-decomposable software systems to the web using screen proxies”, Proceedings of 10th WCRE, Victoria, BC., IEEE CS Press, 2003, p. 165
[4] Mecella, M., Batini, C.: “Enabling Italian E-Government through a cooperative Architecture”, IEEE Computer Magazine, Feb. 2001, p. 40
[5] Henrard, J., Thiran, P., Hainaut, J.: “Strategies for Data Reengineering”, Proceedings of 9th WCRE, Richmond, VA., IEEE CS Press, Oct. 2002, p. 211
[6] Sneed, H.: “Migration of procedurally oriented COBOL programs in an Object-oriented Architecture”, Proc. of 8th ICSM-1992, IEEE Computer Society Press, Orlando, Nov. 1992, p. 105
[7] Ceccato, M., Dean, T., Tonella, P., Marchignoli, D.: “Migrating legacy data structures based on variable overlay to Java”, Journal of Software Maintenance and Evolution, Vol. 22, No. 3, April 2010, p. 211
[8] Sneed, H.: “Architecture and Functions of a Commercial Reengineering Workbench”, Proceedings of 2nd CSMR, IEEE CS Press, Florence, March 1998, p. 11
[9] Sneed, H.: “Migrating from COBOL to Java”, Proceedings of ICSM2010, IEEE CS Press, Temisvar, RO., Sept. 2010, p. 192
[10] Pidaparthi, S., Luker, P., Zedan, H.: “Conceptual Foundations for the Design Transformation of Procedural Software to object-oriented Architecture”, in Proc. of 6th IWPC, IEEE Computer Society Press, Ischia, June 1998, p. 162
[11] Goede, N., Harder, J.: “Clone Stability”, Proceedings of 15th CSMR, IEEE CS Press, Oldenburg, March 2011, p. 65
[12] Mossienko, M.: “Automated COBOL to Java Recycling”, Proc. of 7th CSMR, IEEE Computer Society Press, Benevento, March 2003, p. 40
[13] Dietrich, W.: “Saving a Legacy System with Objects”, Proc. of OOPSLA-88, ACM Press, New York, 1989, p. 54
[14] Ward, M.P.: “Assembler to C Migration using the FermaT Transformation System”, Proceedings of ICSM1999, IEEE CS Press, Oxford, p. 67
[15] Lano, K., Breuer, P., Haughton, H.: “Reverse Engineering COBOL via Formal Methods”, Journal of Software Maintenance, Vol. 5, No. 1, March 1993, p. 13
[16] Zuylen, H. (Editor): The REDO Compendium, A special ESPRIT Report, John Wiley and Sons, Chichester, G.B., 1993
[17] Gall, H., Klösch, R.: “Finding Objects in Procedural Programs – An alternative Approach”, Proc. of 2nd WCRE, IEEE Computer Society Press, Toronto, July 1995, p. 208
[18] Aversano, L., Canfora, G., De Lucia, A.: “Migrating Legacy System to the Web”, in Proc. of CSMR-2001, IEEE Computer Society Press, Lisbon, March 2001, p. 148
[19] De Lucia, A., Francese, R., Scanniello, G., Tortora, G.: “Developing legacy system migration methods and tools for technology transfer”, Softw., Pract. Exper. 38(13), 2008, p. 1333
[20] Scanniello, G., De Lucia, A., Mennella, M., Tagliamonte, G.: “An approach and an eclipse based environment for data migration”, Proc. of ICSM 2008, Beijing, Sept. 2008, p. 237
[21] Tahvildari, L., Kontogiannis, K.: “A Methodology for developing Transformations using the Maintainability Graph”, IEEE Workshop on Reverse Eng, IEEE Computer Society Press, Richmond, 2002, p. 77
[22] Sneed, H.: “Transforming procedural Program Structures to object-oriented Class Structures”, Proc. of ICSM-2002, IEEE Computer Society Press, Montreal, 2002, p. 286
[23] Tomic, M.: “A Possible Approach to OO-Reengineering of COBOL Programs”, ACM Software Eng. Notes, Vol. 19, No. 2, April 1994, p. 29
[24] Triangle Technologies: Relativity – A Tool for the automated Translation of COBOL to Java, Raleigh, N.C., 2001
[25] Sneed, H.: Object-Oriented Software Migration, Addison-Wesley, Bonn, 1999
[26] Topper, A.: Object-oriented Development in COBOL, McGraw-Hill, New York, 1995
[27] Terekhov, A.: “Automating Language Conversion – A Case Study”, Proc. of ICSM-2001, IEEE Computer Society Press, Toronto, Oct. 2001, p. 654
[28] www.microfocus.com/products/
[29] Computer Weekly: “COBOL to Java Conversion”, CW-Nr 46, Munich, Nov. 1997
[30] www.shark-soft.com
[31] www.SoftwareMining.com
[32] http://www.infoq.com/news/2009/07/cobol-to-java
[33] www.mpsinc.com
[34] http://www.freesoft-gmbh.de
[35] Terekhov, A., Verhoef, C.: “The Realities of Language Conversion”, IEEE Software, Nov. 2000, p. 111
