Identifying Data Transfer Objects in EJB Applications
Abstract

Data Transfer Object (DTO) is a design pattern that is commonly used in Enterprise Java applications. Identification of DTOs has a range of uses for program comprehension, optimization, and evolution. We propose a dynamic analysis for identifying DTOs in Enterprise Java applications. The analysis tracks the reads and writes of object fields, and maintains information about the application tier that initiates the field access. The lifecycle of a DTO is represented by a finite state automaton that captures the relevant run-time events and the location of the code that triggers these events. We implemented the proposed approach using the JVMTI infrastructure in Java 6, and performed a study on a real-world Enterprise Java application which is deployed for commercial use. Our results indicate that the dynamic analysis achieves high precision and has acceptable overhead.

1 Introduction

Data Transfer Object (DTO) [6, 1, 25] is a design pattern that is commonly used in distributed systems in general and Enterprise Java applications in particular. Every method call made to a business object in an enterprise system is potentially remote. In older Enterprise JavaBeans (EJB) applications such remote invocations use the network layer regardless of the proximity of the client to the server, creating network overhead. Such method calls may permeate the network layers of the system even if the client and the enterprise application layer are both running in the same Java Virtual Machine (JVM). When multiple attribute values need to be obtained, using multiple calls to getX methods (one per attribute) is highly inefficient.

A DTO, also called a transfer object or a value object (footnote 1), encapsulates a set of values, allowing remote clients to request and receive the entire value set with a single remote call. In EJB applications, in most cases a DTO is an exact replica of the state of an entity bean; such beans serve as the liaisons between the application layer and the database.

Footnote 1. Note that a DTO is different from the GoF Value Object pattern [11].

The goal of our work is to define a dynamic program analysis that identifies classes which implement the DTO pattern. The identification of DTOs is useful in several contexts. First, it can assist program comprehension by pointing out instances of this Enterprise Java pattern; this can also be used to create additional documentation (e.g., JavaDoc comments). Second, DTOs may involve serialization, which can create performance bottlenecks [20, 17]. As a performance optimization, identified DTOs can be subjected to customized serialization mechanisms (i.e., type-aware serialization and acyclic-graph serialization [27, 22]). Finally, identifying DTO instances is an important step towards software evolution for migrating Java Enterprise applications based on the older EJB 2 specifications to the new EJB 3 model. In EJB 3, entity beans can be detached from the persistence context related to a database, modified elsewhere in the application, and merged back to the respective persistence context. As DTOs in older EJB applications usually mirror the state of entity beans, it is highly desirable to simplify the design by using the same entity bean object to represent state throughout the application layer stack.

Contributions. Our work has two specific contributions:

• Dynamic analysis. We propose a dynamic analysis for identifying DTOs in Enterprise Java applications. The analysis tracks the reads and writes of object fields, and maintains information about the application tier that initiates the field access. The lifecycle of a DTO is represented by a finite state automaton (FSA) that captures the relevant run-time events and the location of the code that triggers these events.

• Experimental study. We implemented the proposed approach using the JVMTI infrastructure in Java 6, and performed a study on a real-world Enterprise Java application (the J2EE Certificate Authority, which is deployed for commercial use on a number of web sites). Our results indicate that the dynamic analysis achieves high precision and has acceptable overhead.
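The lifecycle automaton named in the first contribution can be sketched in a few lines of Java. This is a minimal, hypothetical sketch and not the implementation used in the study: the class, enum, and method names are invented for illustration, tiers are represented as plain strings, and the treatment of accesses that the lifecycle description in Section 2.2 leaves unspecified (for example, a read before any field has been populated) is an assumption.

```java
/** Hypothetical sketch of a per-object DTO lifecycle automaton; all names are illustrative. */
final class DtoLifecycleSketch {

    /** Lifecycle states, numbered 1 to 5 in Section 2.2; the enum names are assumptions. */
    enum State { CREATED, POPULATED, IN_OTHER_TIER, BACK_IN_ORIGIN_TIER, END }

    /** Kinds of observed run-time events. */
    enum Kind { READ, WRITE, FREE }

    /** Tracks the lifecycle of one candidate object. */
    static final class Tracker {
        private final String originTier;          // tier where the object was created (Tier 1)
        private State state = State.CREATED;
        private boolean visitedOtherTier = false; // set once a different tier touches the object

        Tracker(String originTier) {
            this.originTier = originTier;
        }

        State state() {
            return state;
        }

        /** Advances the automaton on one event triggered by code located in the given tier. */
        void onEvent(Kind kind, String tier) {
            if (state == State.END) {
                return; // terminal state, nothing left to track
            }
            if (kind == Kind.FREE) {
                state = State.END; // object is being prepared for garbage collection
                return;
            }
            boolean inOrigin = originTier.equals(tier);
            switch (state) {
                case CREATED:
                    // Assumption: only a write in the creating tier counts as "populate fields".
                    if (inOrigin && kind == Kind.WRITE) {
                        state = State.POPULATED;
                    }
                    break;
                case POPULATED:
                    if (!inOrigin) {
                        state = State.IN_OTHER_TIER;
                        visitedOtherTier = true;
                    }
                    break;
                case IN_OTHER_TIER:
                    if (inOrigin) {
                        state = State.BACK_IN_ORIGIN_TIER;
                    }
                    break;
                case BACK_IN_ORIGIN_TIER:
                    if (!inOrigin) {
                        state = State.IN_OTHER_TIER;
                    }
                    break;
                default:
                    break;
            }
        }

        /** Assumption: a match requires reaching END after at least one access from another tier. */
        boolean matchesDtoPattern() {
            return state == State.END && visitedOtherTier;
        }
    }

    public static void main(String[] args) {
        Tracker t = new Tracker("ejb");
        t.onEvent(Kind.WRITE, "ejb"); // populate fields in Tier 1
        t.onEvent(Kind.READ, "web");  // read in Tier 2
        t.onEvent(Kind.WRITE, "web"); // modified in Tier 2
        t.onEvent(Kind.READ, "ejb");  // passed back to Tier 1
        t.onEvent(Kind.FREE, "ejb");  // end of the use case
        System.out.println("matches DTO pattern: " + t.matchesDtoPattern());
    }
}
```

Keeping one tracker per object mirrors the per-object matching described later in the paper. An object that is populated and freed without ever being touched from a second tier ends in END but is not reported as a match in this sketch.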
2 Background and Problem Statement

A Java Enterprise Edition (JEE) application consists of layers, or tiers. The data tier manages the organized information that the application needs. The usual participants in this tier are database management systems and LDAP repositories. The next layer is the EJB tier, which contains the entire business logic of the application: it manipulates data obtained from the data tier and passes along the results. The web tier, which is not always present, obtains the processed data from the EJB tier and presents it to the client’s web browser. This structure is depicted in Figure 1.

[Figure 1. Structure of a JEE application. Diagram labels: Clients, HTML, Web Tier, DTOs, EJB Tier, Entity Beans, Database, JVM / Application Server Context.]

The EJB tier usually provides various means for clients to access it. There are interfaces both to clients within the same JVM (i.e., the web tier) and to remote clients. Thus it is possible for a user to access the functionality of a well-designed JEE application by going to the web site of the application and browsing its web tier, or by running a desktop application designed specifically for that purpose, which connects directly to the EJB tier through Java RMI or a similar remoting mechanism.

2.1 Uses of Data Transfer Objects

The information that a client (be it the web tier or a remote application) receives from the EJB tier could be a primitive value, a Java Collection object, or a DTO that is specific to the particular enterprise application. The client may also create DTOs and pass them to the EJB tier in order to decrease the network overhead. A DTO may even be created in one tier (EJB, web, or remote client), carry its state to another tier, and then the same object may be used to carry new state back to the tier where it was created. This more complex pattern is referred to as Updatable DTO.

The single most important part of the definition of a DTO is that it must be capable of being passed over the wire. Indeed, there are multiple explanations and/or definitions of the DTO design pattern [6, 1, 25], but the only property they all strictly require of a DTO class is that it must implement either interface java.io.Serializable or (in extremely rare cases) java.io.Externalizable. Figure 2 shows a part of a DTO class from the EJB Certificate Authority (EJBCA [9]) application used in our experimental study.

    public class UserDataVO implements Serializable {
        private String username;
        private String subjectDN;
        private int caid;
        private String subjectAltName;
        private String subjectEmail;
        private String password;
        private int status;
        private int type;
        private int endentityprofileid;
        private int certificateprofileid;
        private Date timecreated;
        private Date timemodified;
        private int tokentype;
        private int hardtokenissuerid;
        private ExtendedInformation extendedinformation;
        public void setUsername(String user) { ... }
        public String getUsername() { ... }
        public void setDN(String dn) { ... }
        public String getDN() { ... }
        public int getCAId() { ... }
        public void setCAId(int caid) { ... }
        ...
    }

Figure 2. A DTO example from EJBCA.

2.2 DTO Lifecycle

A DTO passes through several states during its lifecycle. In State 1, the object has just been created, and it does not yet carry any application-specific state, meaning that its fields have not been initialized. Populating the just-created DTO with application state by initializing its fields leads to State 2. When in State 1 or State 2, the DTO exists within the same JEE tier (we will refer to it as Tier 1); this is usually the EJB tier. The fields of the DTO may be accessed and/or modified in Tier 1 after initialization for various reasons, for example reading a field indicating the application-specific status of the DTO.

The DTO enters State 3 when it is passed to an object belonging to a different tier (Tier 2). The application state carried by the DTO is read and/or written in Tier 2. The DTO might then be passed back to Tier 1, entering State 4, where its data is read and potentially modified again. In fact, the object can “oscillate” between State 3 and State 4. Finally, when the use case ends, the DTO is prepared for garbage collection (in either Tier 1 or Tier 2) and enters State 5. The FSA for this lifecycle is shown in Figure 3.

2.3 Problem Definition

The goal of our work is to perform dynamic analysis of the execution of a JEE application in order to identify
classes whose instances implement the DTO pattern. To achieve this goal, the analysis tracks the lifecycle events of potential DTO objects, and matches the observed sequence of events (on a per-object basis) with the DTO FSA. The key research questions that need to be answered are as follows:

• What specific kinds of run-time information should be collected during the dynamic analysis, and how should this information be used to identify DTO objects?

• What is the false positive rate? That is, how often does the analysis observe objects whose behavior matches the DTO state transition diagram, even though these objects do not actually implement the DTO pattern?

• What is the false negative rate? In other words, how many DTO objects violate the pattern described above, and therefore are not reported by the analysis?

• What is the run-time overhead of the analysis?

[Figure 3. Lifecycle of a DTO. State diagram spanning Tier 1 and Tier 2, with transitions labeled "populate fields", "read or write fields", and "prepare for garbage collection"; the states are described in Section 2.2.]

3 Dynamic Analysis for DTO Identification

DTOs always carry state, and sometimes also implement application logic. Our approach is focused on the state, which makes it unnecessary to track method call/return or entry/exit events. To identify DTOs, the analysis tracks field accesses (reads and writes) for objects of potential DTO classes, together with the location of the code that triggers the events. The relevant run-time events are observed with the help of the JVM Tool Interface (JVMTI), which provides a portable and standardized infrastructure for implementing our dynamic analysis.

Processing at class loading. Classes that are DTO candidates are all Java classes that (1) belong to the enterprise application, and (2) implement interface java.io.Serializable. The analysis intercepts all class load events within the application server’s JVM, considers whether the loaded class is a DTO candidate, and if so, tags all its fields. Tagging enables run-time events related to tagged fields: whenever a field in such a class is accessed (read or written) anywhere within the same JVM, a JVMTI event is generated and processed by our analysis. Events are generated for a tagged field regardless of whether the field is static (i.e., the field directly pertains to a DTO candidate class) or instance (i.e., the field belongs to an object that is an instance of a DTO candidate class). Figure 4 illustrates this component of the analysis.

[Figure 4. Processing at class loading. Flowchart: ClassLoad event caught; Class part of JEE application? (No: End); Class Serializable? (No: End); Flag all class fields to send read / write events; End.]

Processing of field reads and writes. When a field read or a field write event is observed, the analysis identifies the object that caused that event and the location (i.e., application tier) of that object. Figure 5 summarizes this processing. To find the object that caused the event, the analysis obtains a handle to the call stack of the current thread. A traversal of the stack frames, starting from the most recent one, identifies the first method that belongs to an application class different from the class that the field belongs to. The receiver object of that method is the one accessing the candidate DTO.

The analysis has to consider the state of the run-time call stack because it is very common for DTOs to implement getter and setter methods. If a field is read through a getter, or written through a setter, the top call stack frame would be within the context of a method belonging to the same class as the field. However, we are interested in the object that caused the field read or write to happen in the first place.

Note that the traversal of the call stack must identify a method that belongs to an application class. This is necessary due to the frequent use of reflection in JEE applications and application servers. We match the current frame method’s class against the packages of the JEE application to ensure that we do not identify the cause
[Flowchart for the processing of field reads and writes (Figure 5). Node labels: Field read / write event caught; method belongs to JEE app?; method belongs to the class of the field?; set location of creation to method's location; method location different from location of object creation?; Set the "moved" flag of the field's object; End.]

[Figure 6. Processing at garbage collection. Flowchart node labels: Object Free event caught; Try accessing the custom data structure related to the field's object; object moved from the location of its creation?; End.]