Java Bytecode


Introduction to Java Bytecode
Amin Borjian
Fall 2019
Outline
- Operand Stack
- Java Bytecode
- ASM
- Abstract Syntax Tree
Operand Stack
Example intermediate code

Source           Three-address code
d = a + b + c    add a, b, t1
                 add t1, c, d
g = d * f        mult d, f, g
Operand Stack Mode
d = a + b + c
g = d * f

Evaluating a + b + c on the operand stack:

Instruction    Stack after (top first)
push a         a
push b         b, a
add            a+b
push c         c, a+b
add            a+b+c

Including the assignment to d:

Instruction    Stack after (top first)
push d         d
push a         a, d
push b         b, a, d
add            a+b, d
push c         c, a+b, d
add            a+b+c, d
assign         (empty)
Example intermediate code: d = a + b + c

Assembly         Java Bytecode
add a, b, t1     push d
add t1, c, d     push a
                 push b
                 add
                 push c
                 add
                 assign
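The push/add sequence above can be tried out with a toy interpreter. This is only a sketch: the class name StackMachineSketch and the textual instruction format are made up for illustration, the instructions are the slide's pseudo-instructions rather than real JVM opcodes, and the "push d" / "assign" pair is left out so that the result simply stays on top of the stack.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical toy interpreter for the slide's push/add/mult pseudo-instructions. */
final class StackMachineSketch {

    /** Runs the program and returns the value left on top of the operand stack. */
    static int run(List<String> program, Map<String, Integer> variables) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String instruction : program) {
            String[] parts = instruction.trim().split("\\s+");
            switch (parts[0]) {
                case "push":
                    // Push the current value of the named variable.
                    stack.push(variables.get(parts[1]));
                    break;
                case "add": {
                    // Pop two operands, push their sum.
                    int right = stack.pop();
                    int left = stack.pop();
                    stack.push(left + right);
                    break;
                }
                case "mult": {
                    // Pop two operands, push their product.
                    int right = stack.pop();
                    int left = stack.pop();
                    stack.push(left * right);
                    break;
                }
                default:
                    throw new IllegalArgumentException("unknown instruction: " + instruction);
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        Map<String, Integer> variables = new HashMap<>();
        variables.put("a", 1);
        variables.put("b", 2);
        variables.put("c", 3);
        List<String> program = Arrays.asList("push a", "push b", "add", "push c", "add");
        System.out.println(run(program, variables)); // value of a + b + c
    }
}
```

Note that no temporary such as t1 is needed: intermediate results live on the stack.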
Operand Stack
Be More Precise
int a;
long b;
float c, d;
d = a + b + c;

Untyped version: push d, push a, push b, add, push c, add, assign

With types, each operation needs a typed instruction and explicit conversions:

Instruction    Stack after (top first)
push d         d
push a         a (int), d
i2l            a (long), d
push b         b, a (long), d
ladd           a+b (long), d
l2f            a+b (float), d
push c         c, a+b (float), d
fadd           a+b+c, d
assign         (empty)

In real bytecode, push corresponds to a load instruction and assign to a store instruction.
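The conversions in the typed sequence are the ones Java applies implicitly through numeric promotion. The sketch below writes them out as casts; the class and method names are invented for illustration, and the opcode names in the comments are what one would expect from javac, which is worth confirming with javap -c.

```java
/** Spells out the widening conversions behind int + long + float (illustrative names). */
final class MixedTypeAddSketch {

    /** What the programmer writes; the conversions are implicit. */
    static float implicit(int a, long b, float c) {
        return a + b + c;
    }

    /** The same computation with every conversion written as a cast. */
    static float explicit(int a, long b, float c) {
        long aAsLong = (long) a;         // i2l: int to long
        long sum = aAsLong + b;          // ladd: long addition
        float sumAsFloat = (float) sum;  // l2f: long to float
        return sumAsFloat + c;           // fadd: float addition
    }

    public static void main(String[] args) {
        // Both variants perform the same operations, so the results are equal.
        System.out.println(implicit(1, 2L, 0.5f) == explicit(1, 2L, 0.5f));
    }
}
```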
Operand Stack
Array
int a[5];
int b[10];
b[3] = a[2];

Instruction    Stack after (top first)
aload b        b (address)
iconst_3       3, b (address)
aload a        a (address), 3, b (address)
iconst_2       2, a (address), 3, b (address)
iaload         a[2], 3, b (address)
iastore        (empty)
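For reference, the same statement in Java source. This is a sketch under assumptions: the class and method names are invented, the arrays are passed as parameters so that a and b sit in local variable slots 0 and 1 of a static method, and the instruction listing in the comment is the expected javac output, to be checked with javap -c.

```java
/** Illustrative only: the array copy from the slides as a Java method. */
final class ArrayCopySketch {

    /** Copies a[2] into b[3]. */
    static void copyElement(int[] a, int[] b) {
        // Expected bytecode for the statement below (assumption):
        //   aload_1    push reference to b
        //   iconst_3   push index 3
        //   aload_0    push reference to a
        //   iconst_2   push index 2
        //   iaload     pop a and 2, push a[2]
        //   iastore    pop b, 3 and a[2], store the value into b[3]
        b[3] = a[2];
    }

    public static void main(String[] args) {
        int[] a = {10, 11, 12, 13, 14};
        int[] b = new int[10];
        copyElement(a, b);
        System.out.println(b[3]);
    }
}
```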
Java Bytecode
Java Virtual Machine (JVM)
JVM From Top To Bottom

Threads
- Each thread holds a stack of method frames (Frame 1, Frame 2, Frame 3, Frame 4, ...)
- Frames are handled by the JVM (specific opcodes for call or return from function)

Frame
- Array of local variables (indexed 0, 1, 2, 3, ...)
- Operand stack
- Constant pool

Array of local variables (constructor or instance method)
- 0: reference of object
- 1: first parameter

Array of local variables (static method)
- 0: first parameter
- 1: second parameter
Frame size is unlimited???
No! It is calculated at compile time.
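Because the frame size is fixed at compile time, the compiler has to know the maximum operand stack depth a method can reach. The sketch below shows the idea for straight-line code only, using the instructions from the array example. The class name and the small stack-effect table are assumptions made for illustration; a real compiler also has to handle branches, and long and double values count as two stack units.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative max-stack computation for a straight-line instruction sequence. */
final class MaxStackSketch {

    /** Net stack effect (pushes minus pops) of the few opcodes used here. */
    private static final Map<String, Integer> STACK_EFFECT = new HashMap<>();
    static {
        STACK_EFFECT.put("aload", 1);     // push a reference
        STACK_EFFECT.put("iconst_2", 1);  // push the int constant 2
        STACK_EFFECT.put("iconst_3", 1);  // push the int constant 3
        STACK_EFFECT.put("iaload", -1);   // pop array reference and index, push element
        STACK_EFFECT.put("iastore", -3);  // pop array reference, index and value
    }

    /** Returns the largest stack depth reached while executing the sequence. */
    static int maxStack(List<String> opcodes) {
        int depth = 0;
        int max = 0;
        for (String opcode : opcodes) {
            Integer effect = STACK_EFFECT.get(opcode);
            if (effect == null) {
                throw new IllegalArgumentException("unknown opcode: " + opcode);
            }
            depth += effect;
            if (depth < 0) {
                throw new IllegalStateException("stack underflow at " + opcode);
            }
            max = Math.max(max, depth);
        }
        return max;
    }

    public static void main(String[] args) {
        // b[3] = a[2]
        List<String> code = Arrays.asList(
                "aload", "iconst_3", "aload", "iconst_2", "iaload", "iastore");
        System.out.println(maxStack(code));
    }
}
```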
Java Bytecode
Example 1
public class Student {
    private String name;

    public String getName() {
        return name;
    }
}
public java.lang.String getName();
Code:
0: aload_0
1: getfield #2
4: areturn
- First local variable is reference
- What is number #2?
- Line numbers?
Bytecode array of method

Index:  0        1         2   3   4
Byte:   aload_0  getfield  00  02  areturn

- The numbers before each instruction (0, 1, 4) are byte offsets into this array, not line numbers.
- getfield is followed by a two-byte constant pool index (00 02). That is why a byte holding 0 appears: #2 refers to entry 2 of the constant pool.
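To make the offsets concrete, here is a minimal sketch that decodes the five bytes of getName. The class name is invented, and only the three opcodes that occur in this method are handled; their numeric values (0x2A, 0xB4, 0xB0) are the standard JVM encodings.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative decoder for the three opcodes used by getName(). */
final class BytecodeOffsetsSketch {

    /** Returns one "offset: mnemonic" line per instruction. */
    static List<String> disassemble(byte[] code) {
        List<String> lines = new ArrayList<>();
        int pc = 0; // byte offset of the current instruction
        while (pc < code.length) {
            int opcode = code[pc] & 0xFF;
            switch (opcode) {
                case 0x2A: // aload_0: no operand bytes
                    lines.add(pc + ": aload_0");
                    pc += 1;
                    break;
                case 0xB0: // areturn: no operand bytes
                    lines.add(pc + ": areturn");
                    pc += 1;
                    break;
                case 0xB4: { // getfield: two operand bytes form the constant pool index
                    int index = ((code[pc + 1] & 0xFF) << 8) | (code[pc + 2] & 0xFF);
                    lines.add(pc + ": getfield #" + index);
                    pc += 3;
                    break;
                }
                default:
                    throw new IllegalArgumentException("opcode not handled in this sketch: " + opcode);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] getName = {0x2A, (byte) 0xB4, 0x00, 0x02, (byte) 0xB0};
        for (String line : disassemble(getName)) {
            System.out.println(line);
        }
    }
}
```

The jump from offset 1 to offset 4 in the output comes from the two operand bytes of getfield.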


Java Bytecode
Descriptors
- Internal (like java/lang/String)
- Type
- Method
Type Descriptor

Java type    Descriptor
boolean      Z
char         C
byte         B
short        S
int          I
float        F
long         J
double       D
Object       Ljava/lang/Object;
int[]        [I

Method Descriptor

Method declaration            Descriptor
void m(int i, float f)        (IF)V
int m(Object o)               (Ljava/lang/Object;)I
int[] m(int i, String s)      (ILjava/lang/String;)[I
Object m(int[] i)             ([I)Ljava/lang/Object;
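The method descriptors in the table can be reproduced from the JDK itself. The sketch below uses java.lang.invoke.MethodType, whose toMethodDescriptorString() returns the descriptor form; the wrapper class and the helper name descriptor are invented for this example.

```java
import java.lang.invoke.MethodType;

/** Illustrative helper that builds a method descriptor from Java types. */
final class DescriptorSketch {

    /** Returns the JVM method descriptor for the given return and parameter types. */
    static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(descriptor(void.class, int.class, float.class));   // void m(int i, float f)
        System.out.println(descriptor(int.class, Object.class));              // int m(Object o)
        System.out.println(descriptor(int[].class, int.class, String.class)); // int[] m(int i, String s)
        System.out.println(descriptor(Object.class, int[].class));            // Object m(int[] i)
    }
}
```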


Java Bytecode
Example 2
public class Course {
    private String name;
    private int grade;

    public Course(String name, int grade) {
        this.name = name;
        this.grade = grade;
        storeInDB(name, grade);
    }

    private void storeInDB(String name, int grade) {
    }
}
With the implicit this argument written out (illustration only, not valid Java):

public class Course {
    private String name;
    private int grade;

    public Course(this, String name, int grade) {
        this.name = name;
        this.grade = grade;
        storeInDB(this, name, grade);
    }
}
Java Code:
public Course(String name, int grade) {
...
}

Java Bytecode:
public Course(java.lang.String, int);
Code:
...
Java Code:
super(this);

Java Bytecode:
0: aload_0
1: invokespecial #1

Constant Pool:
#1 java/lang/Object."<init>":()V
Java Code:
this.name = name;

Java Bytecode:
4: aload_0
5: aload_1
6: putfield #2

Constant Pool:
#2 Field name:Ljava/lang/String;
Java Code:
this.grade = grade;

Java Bytecode:
9: aload_0
10: iload_2
11: putfield #3

Constant Pool:
#3 Field grade:I
Java Code:
storeInDB(this, name, grade);

Java Bytecode:
14: aload_0
15: aload_1
16: iload_2
17: invokespecial #4
Constant Pool:
#4 storeInDB:(Ljava/lang/String;I)V
Java Code:
return;

Java Bytecode:
20: return
Java Bytecode
Notes
- Frames are not covered here
- See full list of opcodes
https://en.wikipedia.org/wiki/Java_bytecode_instruction_listings
Get Java Bytecode
javac Student.java
javap -c Student.class
javap -c -v Student.class
ASM
- Java bytecode manipulation
- Modify existing classes or dynamically generate new classes
- A Java library (use Maven dependency management or put the jar on the classpath)

Download: https://asm.ow2.io/
Features of ASM
- Handles labels for you
- Creates bytecode instructions easily
- Handles the constant pool for you
Java Code:
public class Course extends java.lang.Object {
}

ASM Code:
ClassWriter classWriter =
new ClassWriter(ClassWriter.COMPUTE_FRAMES);
classWriter.visit(Opcodes.V1_8,
Opcodes.ACC_PUBLIC | Opcodes.ACC_SUPER,
"Course", null, "java/lang/Object", null);
...
classWriter.visitEnd();
Java Code:
private String name;
private int grade;
ASM Code:
classWriter.visitField(ACC_PRIVATE, "name",
    "Ljava/lang/String;", null, null).visitEnd();
classWriter.visitField(ACC_PRIVATE, "grade",
    "I", null, null).visitEnd();

The last two arguments are the generic signature and the default value, in that order.
Java Bytecode:
public Course(java.lang.String, int);
Code:
...
ASM Code:
MethodVisitor methodVisitor = classWriter.visitMethod(
    ACC_PUBLIC, "<init>", "(Ljava/lang/String;I)V",
    null, null);
methodVisitor.visitCode();
...
// Arguments are maxStack and maxLocals. They are recomputed by ASM
// because the ClassWriter was created with COMPUTE_FRAMES.
methodVisitor.visitMaxs(0, 0);
methodVisitor.visitEnd();
Java Bytecode:
0: aload_0
1: invokespecial #1
Constant Pool:
#1 java/lang/Object."<init>":()V
ASM Code:
methodVisitor.visitVarInsn(ALOAD, 0);
methodVisitor.visitMethodInsn(INVOKESPECIAL,
"java/lang/Object", "<init>", "()V", false);
Java Bytecode:
4: aload_0
5: aload_1
6: putfield #2
Constant Pool:
#2 Field name:Ljava/lang/String;

ASM Code:
methodVisitor.visitVarInsn(ALOAD, 0);
methodVisitor.visitVarInsn(ALOAD, 1);
methodVisitor.visitFieldInsn(PUTFIELD, "Course",
"name", "Ljava/lang/String;");
Java Bytecode:
9: aload_0
10: iload_2
11: putfield #3
Constant Pool:
#3 Field grade:I

ASM Code:
methodVisitor.visitVarInsn(ALOAD, 0);
methodVisitor.visitVarInsn(ILOAD, 2);
methodVisitor.visitFieldInsn(PUTFIELD, "Course",
"grade", "I");
Java Bytecode:
14: aload_0
15: aload_1
16: iload_2
17: invokespecial #4 //storeInDB:(Ljava/lang/String;I)V

ASM Code:
methodVisitor.visitVarInsn(ALOAD, 0);
methodVisitor.visitVarInsn(ALOAD, 1);
methodVisitor.visitVarInsn(ILOAD, 2);
methodVisitor.visitMethodInsn(INVOKESPECIAL, "Course",
"storeInDB", "(Ljava/lang/String;I)V", false);
Java Bytecode:
20: return

ASM Code:
methodVisitor.visitInsn(RETURN);
See ASM code of .class file
java -classpath ".;asm-7.2.jar;asm-util-7.2.jar" org.objectweb.asm.util.ASMifier YourClass.class
(The ; classpath separator is for Windows. Use : on Linux and macOS.)
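Abstract Syntax Tree (AST)

AST
Expression: 1 * ( 2 + 5 )

The parse tree has an E node for every sub-expression. The AST keeps only the operators and operands:

    *
   / \
  1   +
     / \
    2   5

AST hierarchy example

Node classes: int_const, plus and multiply all derive from expr.

multiply (*)
  e1: int_const 1
  e2: plus (+)
        e1: int_const 2
        e2: int_const 5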
AST hierarchy example

Each node class has its own generate():

multiply:
  e1.generate()
  e2.generate()
  imul

plus:
  e1.generate()
  e2.generate()
  iadd

int_const:
  ldc constant

Calling generate() on the root (multiply) of 1 * ( 2 + 5 ) emits, in order:

ldc 1
ldc 2
ldc 5
iadd
imul
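The generate() scheme above can be sketched in plain Java. All names here (AstCodegenSketch, Expr, IntConst, Plus, Multiply, compile) are invented for the example, and instructions are collected as text lines instead of being written to a class file.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative AST with a generate() method per node type. */
final class AstCodegenSketch {

    /** Base type of the hierarchy (expr in the slides). */
    interface Expr {
        void generate(List<String> out);
    }

    static final class IntConst implements Expr {
        private final int value;
        IntConst(int value) { this.value = value; }
        @Override public void generate(List<String> out) {
            out.add("ldc " + value); // push the constant
        }
    }

    static final class Plus implements Expr {
        private final Expr e1;
        private final Expr e2;
        Plus(Expr e1, Expr e2) { this.e1 = e1; this.e2 = e2; }
        @Override public void generate(List<String> out) {
            e1.generate(out); // code that leaves e1 on the stack
            e2.generate(out); // code that leaves e2 on the stack
            out.add("iadd");  // pop both, push the sum
        }
    }

    static final class Multiply implements Expr {
        private final Expr e1;
        private final Expr e2;
        Multiply(Expr e1, Expr e2) { this.e1 = e1; this.e2 = e2; }
        @Override public void generate(List<String> out) {
            e1.generate(out);
            e2.generate(out);
            out.add("imul");  // pop both, push the product
        }
    }

    /** Generates the instruction list for a whole expression tree. */
    static List<String> compile(Expr root) {
        List<String> out = new ArrayList<>();
        root.generate(out);
        return out;
    }

    public static void main(String[] args) {
        // 1 * ( 2 + 5 )
        Expr tree = new Multiply(new IntConst(1), new Plus(new IntConst(2), new IntConst(5)));
        for (String instruction : compile(tree)) {
            System.out.println(instruction);
        }
    }
}
```

The traversal is post-order: operands are emitted before the operator that consumes them, which is exactly what a stack machine needs.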
AST hierarchy example

for (E-init, E-cond, E-step)
begin
    STL-body
end

Semantic stack while the loop is parsed (top first):

After E-init:      E-init
After E-cond:      E-cond, E-init
After E-step:      E-step, E-cond, E-init
After STL-body:    STL-body, E-step, E-cond, E-init
After end:         a semantic token at this point builds the for-loop node from these four entries

Final AST result

for-loop
  E-init ...
  E-cond ...
  E-step ...
  STL-body ...
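The step triggered by the semantic token can be sketched as a reduction on the semantic stack. This is one possible reading of the slides, not necessarily the exact scheme intended: the four entries are popped and replaced by a single for-loop node. The class, the Node type and the method name reduceForLoop are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

/** Illustrative semantic-stack reduction that builds a for-loop AST node. */
final class SemanticStackSketch {

    /** Minimal AST node: a label and its children. */
    static final class Node {
        final String label;
        final List<Node> children;

        Node(String label, List<Node> children) {
            this.label = label;
            this.children = children;
        }

        Node(String label) {
            this(label, Collections.<Node>emptyList());
        }
    }

    /** Pops body, step, cond and init (in that order) and pushes one for-loop node. */
    static void reduceForLoop(Deque<Node> semanticStack) {
        Node body = semanticStack.pop();
        Node step = semanticStack.pop();
        Node cond = semanticStack.pop();
        Node init = semanticStack.pop();
        semanticStack.push(new Node("for-loop", Arrays.asList(init, cond, step, body)));
    }

    public static void main(String[] args) {
        Deque<Node> semanticStack = new ArrayDeque<>();
        // Entries are pushed in the order the parser finishes them.
        semanticStack.push(new Node("E-init"));
        semanticStack.push(new Node("E-cond"));
        semanticStack.push(new Node("E-step"));
        semanticStack.push(new Node("STL-body"));
        reduceForLoop(semanticStack);
        Node loop = semanticStack.peek();
        System.out.println(loop.label + " with " + loop.children.size() + " children");
    }
}
```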
The End
