ECS658 U04 Collection Types

Further
Object Oriented Programming

“FOOP”
Java’s Collection Types

ECS658U
Matthew Huntbach
matthew.huntbach@qmul.ac.uk
Arrays (1)
• Java has special symbols for using arrays; these come from
C++, the language that Java was based on when it was first
introduced, which added object oriented features to the C language
• An array represents a fixed block of memory in a computer
• Type[] is the type of an array of values of type Type, it can be
used like any other type
• An array is an object, so a variable of type Type[] refers to an
object which is an array of elements of type Type
• Type could be a primitive or an object type
• The size of an array is fixed: new Type[val] creates an array
with val places to store elements of type Type, where val
could be an actual integer, a variable of type int, or an
expression that evaluates to an integer
• Then arr[i]=exp; sets position i to exp, where exp is of
type Type, and i is of type int, again these could be
expressions that evaluate to values of these types
Arrays (2)
• If arr is of type Type[] then arr[i] can be used in any place
where a variable of type Type could be used
• However, where a variable of type Type represents a fixed
position in computer memory, what arr[i] represents changes
if i is a variable that gets its value changed, or an expression
which evaluates to a different value when it is called again
• When new Type[val] is used to create a new array, each
position in it stores 0 if Type is a numeric primitive type (false
if it is boolean), or null if it is an object type
• So to make use of an array, an array of a fixed size has to be
created, then its positions set to values
• A variable of type Type[] can be changed to refer to another
array of a different size, but that is not changing the actual array
it used to refer to
• As with any other object type, two or more variables can refer to
the same array, with any of them used to change its content
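• The points above about default values, fixed size and shared references can be checked with a short sketch; the class ArrayDemo and its method names are made up for illustration:

```java
// Hypothetical demo class; the names here are not from the lecture.
class ArrayDemo {
    // Creates an int array of the given size without setting any position.
    static int[] fresh(int size) {
        return new int[size];          // every position starts as 0
    }

    // Shows that two variables can refer to the same array object.
    static int aliasing() {
        int[] a = new int[3];
        int[] b = a;                   // b refers to the same array as a
        b[1] = 42;                     // changes the one shared array
        return a[1];                   // so this is 42
    }

    public static void main(String[] args) {
        String[] names = new String[2];
        System.out.println(names[0]);      // null: default for an object type
        System.out.println(fresh(4)[3]);   // 0: default for int
        System.out.println(aliasing());    // 42

        int[] arr = new int[2];
        int[] old = arr;
        arr = new int[5];                  // arr now refers to a different array
        System.out.println(old.length + " " + arr.length);   // 2 5
    }
}
```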
ArrayList
• Arrays are the only collection type in Java with special symbols
used to manipulate them
• Java provides several built-in collection types, manipulated like
any other objects, that is with normal method calls on them
• An object of type ArrayList<Type> is similar to an array of
type Type[]
• If alist is of type ArrayList<Type> then the equivalent of
the assignment arr[i]=val;, where arr is of type Type[], is
alist.set(i,val);
• The equivalent of arr[i] used elsewhere like a variable to get
the value at position i in the collection is alist.get(i)
• However, while arrays are of a fixed size, there are further
methods that can be called on an ArrayList<Type> that will
change its size
• Putting import java.util.ArrayList; at the top of the file
containing a class enables you to use it, or use
import java.util.*; for all the collection types
add in ArrayList (1)
• If alist is of type ArrayList<Type> then alist.add(val)
will increase the size of the list that alist refers to by one, and
put what is given by val in the new last place
• val could be a variable of type Type or a subtype of Type, or a
method call whose return type is Type or a subtype of Type
• If val was not of type Type or a subtype of it, that would lead to
a compiler error, meaning the code would not compile and so
could not be run
• So if alist refers to the equivalent of the array [a,b,c,d],
then alist.add(val), where val evaluates to e, changes it
to [a,b,c,d,e]
• We are using the single characters here to mean any reference
to an object of type Type
• As with arrays and any other object, what is stored in it is a
reference to an object, so if val is a variable, then after this
alist.get(4) will return a reference to the same object that
val refers to
add in ArrayList (2)
• As with arrays in Java, the first position is 0, the second position 1
and so on, that is why when alist refers to [a,b,c,d,e] the
call alist.get(4) returns e not d
• The equivalent to arr.length, where arr is of an array type, is
alist.size()
• With arrays, length is a variable not a method, which is why there
is no (), but it is a final variable so it cannot be assigned
another value, arr.length=5 would give a compiler error
• The method add in ArrayList is overloaded, there is one with
the same name that takes two arguments
• If alist refers to [a,b,c,d] then alist.add(2,e) changes it
to refer to [a,b,e,c,d]
• This is different from alist.set(2,e) which changes it to
[a,b,e,d]
• set(i,val) on an ArrayList changes position i to val and
returns what it replaced, so alist.set(2,e) also returns c
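• A minimal sketch of the difference between add(i,val) and set(i,val), using String elements to stand for a, b, c, d and e; the class AddSetDemo and its helper abcd are invented for this illustration:

```java
import java.util.ArrayList;

// Hypothetical demo class; not part of the lecture code.
class AddSetDemo {
    // Builds the list [a, b, c, d].
    static ArrayList<String> abcd() {
        ArrayList<String> alist = new ArrayList<>();
        alist.add("a");
        alist.add("b");
        alist.add("c");
        alist.add("d");
        return alist;
    }

    public static void main(String[] args) {
        ArrayList<String> alist = abcd();
        alist.add(2, "e");                     // insert: later elements move up
        System.out.println(alist);             // [a, b, e, c, d]
        System.out.println(alist.get(4));      // d

        ArrayList<String> other = abcd();
        String replaced = other.set(2, "e");   // overwrite: size stays at 4
        System.out.println(other);             // [a, b, e, d]
        System.out.println(replaced);          // c
        System.out.println(other.size());      // 4
    }
}
```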
remove in ArrayList (1)
• The method remove in ArrayList is also overloaded, it has two
versions, both with one argument
• A method call alist.remove(i), where i is of type int, will
remove what is at position i, or if i is less than 0 or equal to or
greater than alist.size() it will throw an exception of type
IndexOutOfBoundsException
• A method call alist.remove(val), where val is of an object
type, removes the first object stored in alist that is equal to val
• If alist.remove(val) is called and there is no object in the list
equal to val, it will not throw an exception, it will just end and
return false, otherwise it returns true
• The call alist.remove(i) returns what was at position i that it
removed
• In both cases, removing an element does not replace it by null,
instead it moves all elements following it down by one position
• So, if alist refers to [a,b,c,d,e,b,f,d], the call
alist.remove(4) returns e and changes the list to
[a,b,c,d,b,f,d]
• If alist refers to [a,b,c,d,e,b,f,d], the call
alist.remove(b) returns true and changes the list to
[a,c,d,e,b,f,d]
• The call alist.remove(val) where val evaluates to something
not equal to any of a, b, c, d, e or f returns false and leaves the
list unchanged
• An important thing to note is that alist.remove(val) removes
the first element in alist that is equal to val according to the
method equals, so that is the lowest value of i where
alist.get(i).equals(val) returns true
• So with this, and other methods in the built-in collection classes,
overriding equals affects how it works
• If you wanted to remove every occurrence of val in the list
referred to by alist (which means every element where calling
equals(val) on it returns true) would the following do it?
for(int i=0; i<alist.size(); i++)
if(alist.get(i).equals(val)) alist.remove(i);
• An important thing to understand is that it would not, because
alist.remove(i) causes what was at position i+1 to move to
position i, so then as i++ happens after alist.remove(i) the
element that comes next after the one that was removed does not
get checked to see if it should be removed
• So, if val is equal to b, and alist is [a,b,c,d,b,b,e,f,b]
the above loop will change alist to [a,c,d,b,e,f] rather than
[a,c,d,e,f]
• The following will work, it is a while loop with no content:
while(list.remove(val));
but is very inefficient
• The reason while(list.remove(val)); works is because
list.remove(val) both changes the collection that list refers to
and returns a boolean, with the boolean being false when all
occurrences of val have been removed
• It is very inefficient because each call of list.remove(val)
starts again at position 0 and checks each element again, this is
unnecessary as it only needs to check the elements after the last
one that was removed
• Anyone who wants to be considered as someone who can properly
program should realise this, and not write such inefficient code
• Here is a correct way to remove all occurrences of val from list:
for(int i=0; i<list.size(); )
if(list.get(i).equals(val)) list.remove(i);
else i++;
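• The two loops can be run side by side on the example list to confirm the results stated above; this is a sketch with String elements, and the class and method names (RemoveAllDemo, removeSkipping, removeEvery, sample) are invented for it:

```java
import java.util.ArrayList;
import java.util.Arrays;

// Hypothetical demo class; not part of the lecture code.
class RemoveAllDemo {
    // The flawed loop: i++ runs even after a removal, so the element
    // that slides into position i is never checked.
    static <E> void removeSkipping(ArrayList<E> list, E val) {
        for (int i = 0; i < list.size(); i++)
            if (list.get(i).equals(val)) list.remove(i);
    }

    // The corrected loop: i only advances when nothing was removed.
    static <E> void removeEvery(ArrayList<E> list, E val) {
        for (int i = 0; i < list.size(); )
            if (list.get(i).equals(val)) list.remove(i);
            else i++;
    }

    // The example list [a, b, c, d, b, b, e, f, b].
    static ArrayList<String> sample() {
        return new ArrayList<>(
            Arrays.asList("a", "b", "c", "d", "b", "b", "e", "f", "b"));
    }

    public static void main(String[] args) {
        ArrayList<String> first = sample();
        removeSkipping(first, "b");
        System.out.println(first);    // [a, c, d, b, e, f]

        ArrayList<String> second = sample();
        removeEvery(second, "b");
        System.out.println(second);   // [a, c, d, e, f]
    }
}
```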
Methods in ArrayList
• So these are the methods we have seen so far in ArrayList:
• list.size() returns the size of list
• list.get(i) returns what is at position i in list
• list.add(val) adds val to the end of list
• list.add(i,val) puts val at position i in list and moves what
was there and everything after it up by one position
• list.set(i,val) replaces what is at position i by val in list,
and returns what was replaced
• list.remove(i) removes what is at position i in list, moves
everything after it down by one position and returns what was
removed
• list.remove(val) removes the first occurrence of val in list and
returns true, or returns false if val does not occur in list
• There are more methods than these in ArrayList, for the
purpose of this module you will not need to know every method it
has, but you should get familiar with these commonly used
methods
Generic Classes: Type Variables
• The correct name for the class is ArrayList<E>
• Here E is a type variable, and what this means is that an
ArrayList object needs to have E set to a type, that means that
ArrayList is a generic class
• So, for example, a variable of type ArrayList<String> can only
refer to objects of type ArrayList<String>, that is with E set to
String
• E can be set to any object type, so that is interface types as well
as class types, and types you have defined
• However, E cannot be set to a primitive type, so an ArrayList to
store integers has to be type ArrayList<Integer> rather than
ArrayList<int>
• A type variable can have any name, so E has no special meaning,
but by convention type variables are given single capital letters as
names, E is used for collection types, T is commonly used in other
cases
Defining a Generic Class
• You can write your own generic class by putting a type variable
in its header, as in:
class MyClass<T>
which gives T as the name for the type variable used in the code
you write for MyClass
• A class can have more than one type variable, just list them
separated by commas between the < and >, as in <T1,T2>
• Having declared the type variable T in the class header, you can
use T as a parameter type for constructors and methods in the
class, as the return type for a method, and as the type for instance
and local variables
• However, as T could be set to any object type, the only methods
you can call on a variable of type T are those from class
Object, and you also cannot use a constructor to create new
objects of type T
Creating an object of a generic class
• If this is the start of the class, declaring two instance variables, and
a constructor:
class MyClass<T> {
private T myObj;
private int num;
public MyClass(T arg) {
myObj=arg;
}
…
then new MyClass<Account>(acc) would create a MyClass
object with T set to Account, myObj referring to the Account
object that acc refers to, and num set to 0
• A variable myAcc could be given type MyClass<Account>, and it
can be declared and assigned an initial value by:
MyClass<Account> myAcc = new MyClass<>(acc);
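• Here is a runnable sketch of the same idea; it sets T to String and Integer rather than Account so that it needs no other classes, and the accessor methods getObj and getNum are additions made only so the demo can look inside the object:

```java
// The class from the slide, completed with two hypothetical accessors.
class MyClass<T> {
    private T myObj;
    private int num;              // not set in the constructor, so it starts as 0

    public MyClass(T arg) {
        myObj = arg;
    }

    public T getObj() { return myObj; }   // added for this demo
    public int getNum() { return num; }   // added for this demo
}

// Hypothetical demo class.
class MyClassDemo {
    public static void main(String[] args) {
        MyClass<String> box = new MyClass<>("hello");   // T set to String
        String s = box.getObj();                        // no cast needed
        System.out.println(s + " " + box.getNum());     // hello 0

        // MyClass<int> bad;  // would not compile: T cannot be a primitive type
        MyClass<Integer> ibox = new MyClass<>(7);       // 7 is boxed to an Integer
        System.out.println(ibox.getObj() + 1);          // 8
    }
}
```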
An issue with type arguments
• The reason:
MyClass<Account> myAcc = new MyClass<>(acc);
can be used rather than:
MyClass<Account> myAcc = new MyClass<Account>(acc);
is that with T set to Account in myAcc, it can only refer to an
object of type MyClass<Account>
• Suppose there is a class SaveAccount which is a subclass of
Account, a variable of type Account can refer to an object of
actual type SaveAccount
• But a variable of type MyClass<Account> cannot refer to an
object of actual type MyClass<SaveAccount>
• An example of why this is the case is that if a variable accList of
type ArrayList<Account> could refer to a list whose actual type
was ArrayList<SaveAccount> then you could call
accList.add(acc) where acc refers to an object that is not a
SaveAccount, and that should not be allowed
Generic Methods
• A type variable can be declared just for use in an individual method
rather than for use in a whole class, in that case it is declared by
naming it between < and > before the return type of the method
• For example, here is a method that takes an ArrayList of objects
and two objects of its content type, and returns true if the second
object occurs only in a later position than the first object:
static <E> boolean occursAfter(ArrayList<E> list,
E obj1, E obj2) {
int i=0;
for(; i<list.size(); i++) {
E next = list.get(i);
if(next.equals(obj2)) return false;
if(next.equals(obj1)) break;
}
for(i++ ; i<list.size(); i++)
if(list.get(i).equals(obj2)) return true;
return false;
}
ArrayList<E> Processing Issues
• If a method call occursAfter(accs,acc1,acc2) is made
where accs has type ArrayList<Account> then E is set to
Account
• A variable of a generic type E can refer to an object of the type
that E is set to or of any subtype of it, so acc1 and acc2 must be
either of type Account or a subtype, such as SaveAccount
• There are many ways in which the basic requirement could be met,
the code given shows a way of doing it which is efficient as it only
goes through the ArrayList once
• i has to be declared outside the for-loop as it needs to be used
with its same value in the next for-loop
• The possibility of the same object occurring more than once in
an ArrayList needs to be considered, so it is not enough just to find
the position of obj1 and search after that for obj2
• In the first loop a variable is set to what list.get(i) returns, to
avoid the unnecessary call of list.get(i) twice
For-each loop
• Here is a version that performs the same task using a for-each
loop:
static <E> boolean occursAfter(ArrayList<E> list,
E obj1, E obj2) {
boolean obj1found=false;
for(E next : list)
if(next.equals(obj2)) return obj1found;
else if(next.equals(obj1)) obj1found=true;
return false;
}
• This shows that although a for-each loop goes through a whole list,
it can be halted by a return (or also a break) inside it
• return false at the end is reached only if return obj1found
was never reached
• An important issue with a for-each loop is that if you change the
content of the list inside it, an exception (of type
ConcurrentModificationException for ArrayList and LinkedList) is
normally thrown when the loop continues
• So add or remove can be called on the list inside the for-each
loop, but only if it then exits without going to the next element
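• The following sketch illustrates the issue with changing a list inside a for-each loop; the class ForEachDemo and its two methods are invented for the illustration, and it catches the exception only so that the outcome can be printed:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical demo class; not part of the lecture code.
class ForEachDemo {
    // Removes val inside a for-each loop and lets the loop carry on.
    // Returns true if the loop was stopped by the exception.
    static boolean removeAndContinue(List<String> list, String val) {
        try {
            for (String next : list)
                if (next.equals(val)) list.remove(val);   // loop continues afterwards
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes the first match and leaves the loop at once, which is safe.
    static boolean removeAndExit(List<String> list, String val) {
        for (String next : list)
            if (next.equals(val)) {
                list.remove(val);
                return true;       // exits before the loop moves to the next element
            }
        return false;
    }

    public static void main(String[] args) {
        List<String> one = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        System.out.println(removeAndContinue(one, "a"));   // true

        List<String> two = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        System.out.println(removeAndExit(two, "a"));       // true
        System.out.println(two);                           // [b, c, d]
    }
}
```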
indexOf(obj)
• Here is the more obvious way to write a method that takes an
ArrayList and two objects of its content type, and returns true
if the second occurs only in a later position than the first:
static <E> boolean occursAfter(ArrayList<E> list,
E obj1, E obj2) {
int pos1 = list.indexOf(obj1);
if(pos1 == -1) return false;
int pos2 = list.indexOf(obj2);
if(pos2 == -1) return false;
return pos1<pos2;
}
• This uses another method in ArrayList<E>, indexOf, which
returns the first position of its argument, or -1 if it does not occur
• So list.indexOf(obj) will return the lowest value of i where
list.get(i).equals(obj) returns true, or -1
• This version of occursAfter has to go through the list twice
unless obj1 does not occur at all
Deal with all possibilities
• Having been asked to write code that does what was asked for,
you might first think of writing:
static <E> boolean occursAfter(ArrayList<E> list,
E obj1, E obj2) {
return list.indexOf(obj1)<list.indexOf(obj2);
}
• The issue here is that it does not deal with the possibility of obj1
or obj2 not occurring, and the way it works means that it would
return true if obj1 did not occur but obj2 did
• When writing code always consider all possibilities
• For any that is not mentioned in the requirement, think of a
sensible way of dealing with it, and also make a note of it so that
the person who made the requirement can confirm that is the way
they would prefer it to be dealt with
• Here it did not say return false if obj1 or obj2 does not occur,
but it makes sense to do that
LinkedList<E>
• Java has a class LinkedList<E> which works exactly the same
as ArrayList<E> as described so far, with the same methods
get, set, add, remove, size
• The reason for having these two classes is efficiency: the methods
work the same in terms of how they interact with other code, but
the time they take to perform differs
• Inside an ArrayList<E> object there is an array arr used to
implement it, so if alist refers to an ArrayList<E> object then
alist.get(i) works very quickly as it just returns arr[i]
• However, alist.remove(i) can work slowly, as it has to do:
for(int j=i+1; j<size; j++)
arr[j-1]=arr[j];
• The class LinkedList<E> uses a linked list to represent the list,
which means it can work more quickly on some methods
• llist.get(i) can take a long time if llist refers to a
LinkedList<E> object, as it has to send a pointer through the list
one cell at a time to get to position i
List<E>
• Java has an interface List<E> which both ArrayList<E> and
LinkedList<E> implement
• An interface has method headers, not code for them, so classes
that implement them must provide code for them
• All the methods described so far as called on ArrayList<E> also
work with LinkedList<E> in the same way in terms of their
interaction with other code and are in List<E> as headers
• So, rather than write a method with ArrayList<E> as a
parameter type, it is better to use List<E> because the same
method could then also take a LinkedList<E> as its argument
and work in the same way in terms of interacting with other code
• The only place where the actual class names ArrayList<E> and
LinkedList<E> need to be used is when a constructor is used
to create a new list, a constructor has to use an actual class type
• You can then choose to use LinkedList<E> rather than
ArrayList<E> if you know the way it is going to be used means
that LinkedList<E> would work more efficiently
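• As a sketch of writing against the interface, here is a method with a List<E> parameter called with both kinds of list; the class ListParamDemo and the method countOf are made up for this example:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical demo class; not part of the lecture code.
class ListParamDemo {
    // The parameter type is the interface, so any List<E> implementation is accepted.
    static <E> int countOf(List<E> list, E val) {
        int count = 0;
        for (E next : list)
            if (next.equals(val)) count++;
        return count;
    }

    public static void main(String[] args) {
        List<String> alist = new ArrayList<>();    // class name needed only here
        List<String> llist = new LinkedList<>();   // and here
        for (String s : new String[] {"x", "y", "x"}) {
            alist.add(s);
            llist.add(s);
        }
        System.out.println(countOf(alist, "x"));   // 2
        System.out.println(countOf(llist, "x"));   // 2
    }
}
```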
Set<E>
• Java also has an interface type Set<E> which has the following
methods:
• set.size() returns the size of set
• set.add(val) adds val to the set and returns true, or leaves it
unchanged and returns false if val is already in the set
• set.remove(val) removes val from the set and returns true, or
returns false if val does not occur in set
• There are more methods than these in Set<E>, for the purpose of
this module you will not need to know every method it has, these
are the most important ones
• A Set<E> is a collection where elements are not stored in
numerically indexed positions, and no element is stored more than
once, so Set<E> does not have the methods of List<E> that
take integer arguments giving positions
• The interface Set<E> is implemented by classes HashSet<E>
and TreeSet<E>
HashSet<E> and TreeSet<E>
• Like ArrayList<E> and LinkedList<E>, the classes
HashSet<E> and TreeSet<E> use very different internal data
structures to implement the same methods in the interface
• However, they do work differently in the way they interact with
code, because HashSet<E> seems to store its elements in a random
order, while TreeSet<E> stores them in their sorted order
• As they do not have methods that access elements by an integer
giving their position, you cannot go through a HashSet<E> or a
TreeSet<E> using a for-loop that counts through positions
• But you can go through them using a for-each loop:
for(E element : set)
System.out.println(element);
• Here, if E is set to String, then if set refers to a
HashSet<String> its elements are printed in a random order,
but if it refers to a TreeSet<String> they are printed in
alphabetical order
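• A small sketch of the difference in iteration order; the class SetOrderDemo, the method fill and the words used are all invented for the example, and no particular output order is claimed for the HashSet:

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class; not part of the lecture code.
class SetOrderDemo {
    // Adds the same words to whichever Set it is given.
    static Set<String> fill(Set<String> set) {
        set.add("pear");
        set.add("apple");
        set.add("fig");
        set.add("apple");     // already present: returns false, set unchanged
        return set;
    }

    public static void main(String[] args) {
        Set<String> tset = fill(new TreeSet<String>());
        for (String element : tset)
            System.out.println(element);      // apple, fig, pear on separate lines

        Set<String> hset = fill(new HashSet<String>());
        for (String element : hset)
            System.out.println(element);      // the same three words, order not defined
        System.out.println(hset.size());      // 3
    }
}
```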
TreeSet<E>
• TreeSet<E> actually implements interface SortedSet<E>
• SortedSet<E> extends Set<E>, what that means is that it
inherits the method headers from Set<E> and also adds some
extra method headers
• The extra method headers in SortedSet<E> are those that deal
with sets where elements have an ordering
• So, if tset was set to new TreeSet<String>() and str is of
type String, then tset.headSet(str) returns a
SortedSet<String> whose elements are those in tset which are
less than str alphabetically
• tset.tailSet(str) returns a SortedSet<String> whose
elements are those in tset which are equal to or greater than str
alphabetically
• What these return are actually views of tset, so if we do
lset=tset.headSet(str) then lset.add(str1) will cause
str1 to be added to tset as well as to lset, provided str1 is less
than str (otherwise an IllegalArgumentException is thrown)
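• The view behaviour can be seen in a short sketch; the class ViewDemo, the method sample and the words stored are assumptions made for the illustration:

```java
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical demo class; not part of the lecture code.
class ViewDemo {
    // Builds a TreeSet holding ant, cat, dog and emu.
    static TreeSet<String> sample() {
        TreeSet<String> tset = new TreeSet<>();
        tset.add("ant");
        tset.add("cat");
        tset.add("dog");
        tset.add("emu");
        return tset;
    }

    public static void main(String[] args) {
        TreeSet<String> tset = sample();
        SortedSet<String> lset = tset.headSet("dog");   // elements less than "dog"
        SortedSet<String> rset = tset.tailSet("dog");   // elements equal to or greater
        System.out.println(lset);    // [ant, cat]
        System.out.println(rset);    // [dog, emu]

        lset.add("bee");             // "bee" is less than "dog", so within the view
        System.out.println(tset);    // [ant, bee, cat, dog, emu]
    }
}
```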
Type variable extends type (1)
• Before looking further at Set<E>, we need to go back and
consider the issue that a variable of type List<Account> cannot
refer to an object of type List<SaveAccount> where
SaveAccount is a subclass of Account
• Suppose we want to write code that takes a list of Account
objects and returns the sum of the balance of each of them
• How could we do that for a list of SaveAccount objects?
• It would go against the principle of avoiding repeating code if that
meant separate methods had to be written whose only difference is
that one had a parameter type of List<Account> and another
had a parameter type of List<SaveAccount>, and even
worse if there were many subclasses of Account that needed a
method to perform that task
• The way to resolve this is to declare a type variable like
<A extends Account>, that means that instead of being able to
be set to any object type, A can only be set to Account or a
subtype of Account
Type variable extends type (2)
• So here is code that takes a List<A> object, where A could be set
to Account or a subclass of Account:
<A extends Account> int addBalance(List<A> list) {
int sum=0;
for(int i=0; i<list.size(); i++) {
A acc = list.get(i);
sum = sum + acc.balance();
}
return sum;
}
• Because an object that a variable of type A refers to must be of
type Account or a subtype of Account, we can call methods from
class Account on variables whose type is that type variable
• A variable of type List<A> can refer to an object of type
ArrayList<A> or LinkedList<A>
• The type variable A is set to the element type of the list passed
as the argument, which can be of any class that implements the
interface List<E>
Anonymous Type Variable
• Here is another way of dealing with it:
int addBalance(List<? extends Account> list) {
int sum=0;
for(int i=0; i<list.size(); i++) {
Account acc = list.get(i);
sum = sum + acc.balance();
}
return sum;
}
• A variable of type List<? extends Account> can refer to an
object of any type that implements List<E> and has E set to
Account or a subclass of Account
• It can also be done with a for-each loop:
int addBalance(List<? extends Account> list) {
int sum=0;
for(Account acc : list)
sum = sum + acc.balance();
return sum;
}
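• To try this out, a sketch is given below with minimal stand-in classes; the lecture's actual Account and SaveAccount classes are not shown in this section, so the constructors and field used here are assumptions, as are the class name BalanceDemo and the balances:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Minimal stand-ins: assumed definitions, not the lecture's real classes.
class Account {
    private int bal;
    Account(int initial) { bal = initial; }
    int balance() { return bal; }
}

class SaveAccount extends Account {
    SaveAccount(int initial) { super(initial); }
}

// Hypothetical demo class.
class BalanceDemo {
    static int addBalance(List<? extends Account> list) {
        int sum = 0;
        for (Account acc : list)
            sum = sum + acc.balance();
        return sum;
    }

    public static void main(String[] args) {
        List<Account> accs = new ArrayList<>();
        accs.add(new Account(10));
        accs.add(new SaveAccount(5));       // a SaveAccount is an Account

        List<SaveAccount> saves = new LinkedList<>();
        saves.add(new SaveAccount(20));
        saves.add(new SaveAccount(30));

        // List<Account> wrong = saves;     // would not compile
        System.out.println(addBalance(accs));    // 15
        System.out.println(addBalance(saves));   // 50
    }
}
```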
Type variable as return type
• Here is a method that uses a type variable as its return type:
<A extends Account> A find(List<A> list,
int val, A mine) {
for(A acc : list)
if(acc.balance()>=val) return acc;
return mine;
}
• This means that the third argument to the method must be of the
same type as the list content, and states that what the method
returns must also be of that type (or a subtype of it)
• Whereas with:
Account find(List<? extends Account> list,
int val, Account mine) {
for(Account acc : list)
if(acc.balance()>=val) return acc;
return mine;
}
list could be of type List<SaveAccount>, but the third
argument and what is returned may not be of type SaveAccount
Class Rectangle
We gave this class previously as a simple class to illustrate things:
class Rectangle {
private int height, width;
public Rectangle(int h, int w) {
height=h;
width=w;
}
public int getHeight() {
return height;
}
public void setHeight(int h) {
height=h;
}
public int getWidth() {
return width;
}
public void setWidth(int w) {
width=w;
}
public int area() {
return height*width;
}
}
Using equals
• Suppose we had the following code:
List<Rectangle> recs = new ArrayList<>();
Rectangle rec1 = new Rectangle(10,4);
recs.add(rec1);
recs.add(rec2);
recs.add(rec3);
recs.add(rec4);
Rectangle trec = new Rectangle(5,8);
int pos = recs.indexOf(trec);
• What would pos be set to? Answer is -1
• Recall that list.indexOf(obj) will return the lowest value of i
where list.get(i).equals(obj) returns true, otherwise -1
• recs.get(2).equals(trec) does not return true
because although trec refers to an object which is identical in
content with what is at position 2 in recs, it is not the same object
Default equals
• If we had this code:
List<Rectangle> recs = new ArrayList<>();
recs.add(rec1);
recs.add(rec2);
recs.add(rec3);
recs.add(rec4);
Rectangle trec = rec3;
int pos = recs.indexOf(trec);
• Then pos would be set to 2 because trec and
recs.get(2) do refer to the same Rectangle object
• The class Rectangle as given does not override the method
equals, so it uses the one inherited from Object
• That means obj1.equals(obj2) returns what obj1==obj2
gives, which is true only if they refer to the same object
Overriding equals
• If we wanted recA.equals(recB) to return true when they are
identical in content, so that then:
int pos = recs.indexOf(trec);
in the code where trec was set to new Rectangle(5,8) would result
in pos set to 2, we need to override equals in class Rectangle
• Here is code which does that:
public boolean equals(Object obj) {
if(obj==this)
return true;
if(obj instanceof Rectangle) {
Rectangle that = (Rectangle) obj;
return (this.height==that.height) &&
(this.width==that.width);
}
return false;
}
Overloading equals
• Note that the method equals to be put in class Rectangle needs
to have parameter type Object rather than Rectangle
• The reason for that is that the method in class Object that it is
overriding has parameter type Object
• If only the following was put in:
public boolean equals(Rectangle that) {
return (this.height==that.height) &&
(this.width==that.width);
}
that would be overloading equals rather than overriding it,
meaning that Rectangle has two equals methods, the other
with header public boolean equals(Object obj)
• When methods are overloaded, the choice of which method to use
is done when the code is compiled, and because
recs.indexOf(trec) is using the generalised code in
ArrayList<E> it would use the version with Object as its
parameter
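• The effect on indexOf can be checked with a sketch that uses two cut-down rectangle classes, one that only overloads equals and one that overrides it; the class names RectOverload, RectOverride and EqualsDemo are invented here, and the @Override annotation is an optional addition that asks the compiler to confirm the method really overrides an inherited one:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical cut-down rectangle that only overloads equals.
class RectOverload {
    final int height, width;
    RectOverload(int h, int w) { height = h; width = w; }
    public boolean equals(RectOverload that) {       // overload, not override
        return height == that.height && width == that.width;
    }
}

// Hypothetical cut-down rectangle that overrides equals(Object).
class RectOverride {
    final int height, width;
    RectOverride(int h, int w) { height = h; width = w; }
    @Override
    public boolean equals(Object obj) {
        if (obj == this) return true;
        if (obj instanceof RectOverride) {
            RectOverride that = (RectOverride) obj;
            return height == that.height && width == that.width;
        }
        return false;
    }
    @Override
    public int hashCode() { return 31 * height + width; }   // kept consistent with equals
}

// Hypothetical demo class.
class EqualsDemo {
    static int findOverload() {
        List<RectOverload> recs = new ArrayList<>();
        recs.add(new RectOverload(10, 4));
        recs.add(new RectOverload(5, 8));
        return recs.indexOf(new RectOverload(5, 8));   // the list calls equals(Object)
    }

    static int findOverride() {
        List<RectOverride> recs = new ArrayList<>();
        recs.add(new RectOverride(10, 4));
        recs.add(new RectOverride(5, 8));
        return recs.indexOf(new RectOverride(5, 8));
    }

    public static void main(String[] args) {
        System.out.println(findOverload());   // -1
        System.out.println(findOverride());   // 1
    }
}
```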
Using TreeSet<E>
• If you tried to do:
Set<Rectangle> rset = new TreeSet<>();
with Rectangle as defined so far, the set would be created, but
adding a Rectangle to it would fail at run time with a
ClassCastException
• The reason for this is that a TreeSet requires its elements to have
a defined comparison to show whether one is greater than or less
than another, in order for them to be stored in order
• That can be done by adding to the class Rectangle a method:
public int compareTo(Rectangle that) {
return this.area()-that.area();
}
which orders Rectangles by their area
• If rect1.compareTo(rect2) returns a negative value, rect1 is
considered less than rect2, if what it returns is positive, rect1 is
considered greater than rect2, and if 0 is returned they are
considered equal by TreeSet
Using Comparable<T>
• If you try to add a Rectangle to a TreeSet<Rectangle>
constructed as given, and the Rectangle has the same area as
one already in it, it will not be added, because according to the
method compareTo, it is equal to one that is already there
• TreeSet<E> uses compareTo called on its elements rather than
equals
• The header of class Rectangle must be:
class Rectangle implements Comparable<Rectangle>
for compareTo to be used on it, because compareTo is the
method in the interface Comparable<T>, and the code for
TreeSet<E> uses variables of type Comparable<E> which need
to be able to refer to objects of the type that E was set to
• The method equals can be called on any object, because it is in
class Object, but the method compareTo is not in class Object,
so it is not in every object
• If a class implements Comparable<T>, the ordering given by its
compareTo method is called its natural order
Implementing compareTo
• If you wanted Rectangles to be ordered by area in natural
order, but not considered equal unless they have the same
height and width, you would need to implement compareTo in
Rectangle in a way that did more comparison when the areas
are equal:
public int compareTo(Rectangle that) {
int areaDiff = this.area()-that.area();
if(areaDiff == 0) {
int heightDiff = this.height-that.height;
if(heightDiff == 0)
return this.width-that.width;
else
return heightDiff;
}
return areaDiff;
}
• You can use that.height rather than that.getHeight(),
and similar with width, because this is inside class Rectangle
where the private variables height and width are declared
Using TreeSet<E>
• With compareTo as given for Rectangle and this code:
Set<Rectangle> recs = new TreeSet<>();
recs.add(rec1);
recs.add(rec2);
recs.add(rec3);
recs.add(rec4);
• int pos = recs.indexOf(trec) could not be used because
there is no indexOf method in Set<E>
• recs.contains(trec) would return true because trec has
the same area as rec4
• recs.size() would return 3 because rec3 has the same area
as rec1, so recs.add(rec3) does not cause it to be added
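• Here is a sketch of how a compareTo that looks only at area affects a TreeSet; it uses a cut-down class AreaRect rather than the full Rectangle, and the class names and the dimensions chosen are assumptions for the illustration:

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical cut-down rectangle whose natural order is by area only.
class AreaRect implements Comparable<AreaRect> {
    final int height, width;
    AreaRect(int h, int w) { height = h; width = w; }
    int area() { return height * width; }
    public int compareTo(AreaRect that) {
        return this.area() - that.area();
    }
    public String toString() { return height + "x" + width; }
}

// Hypothetical demo class.
class TreeSetDemo {
    static Set<AreaRect> build() {
        Set<AreaRect> recs = new TreeSet<>();
        recs.add(new AreaRect(10, 4));    // area 40
        recs.add(new AreaRect(3, 3));     // area 9
        recs.add(new AreaRect(5, 8));     // area 40: compareTo gives 0, so not added
        return recs;
    }

    public static void main(String[] args) {
        Set<AreaRect> recs = build();
        System.out.println(recs);           // [3x3, 10x4]
        System.out.println(recs.size());    // 2
        // contains also uses compareTo, so any rectangle of area 40 is "in" the set
        System.out.println(recs.contains(new AreaRect(20, 2)));   // true
    }
}
```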
Using Comparator<T> (1)
• Another way of creating a TreeSet<E> is like this:
Set<Rectangle> rset = new TreeSet<>(comp);
where comp is defined as of type Comparator<Rectangle>
• Comparator<T> is an interface which contains the method
header: public int compare(T obj1, T obj2)
• What this means is that in this case a separate object of type
Comparator<Rectangle> is passed to be used by the
TreeSet<E> with E set to Rectangle
• One way of doing this is to declare comp by:
Comparator<Rectangle> comp = new Comparator<>() {
public int compare(Rectangle rec1, Rectangle rec2) {
return rec1.getHeight()-rec2.getHeight();
}
};
• This is using an anonymous class, with the one method required in
a Comparator object declared in the call of a constructor for it
• Otherwise a separate class could be declared
class HeightComp implements Comparator<Rectangle> {
public int compare(Rectangle rec1, Rectangle rec2) {
return rec1.getHeight()-rec2.getHeight();
}
}
with comp declared as:
Comparator<Rectangle> comp = new HeightComp();
• Then passing comp as the argument to a constructor of a
TreeSet<Rectangle> object sets it as one where the
Rectangles are stored in order of their height
• There would not be two with the same height as two Rectangles
whose height are the same will be considered as equal, and a Set
does not contain two objects that it considers to be equal
• So note that TreeSet<E> does not use equals to check whether an
element being added to it is equal to one already there, it uses the
same method used to check the ordering of its elements
• Here is a generalised method that goes through a list and returns
the highest element in that list according to a Comparator<T>
argument:
static <E> E most(List<E> list,
Comparator<? super E> comp) {
E biggest=list.get(0);
for(int i=1; i<list.size(); i++) {
E next=list.get(i);
if(comp.compare(next,biggest)>0) biggest=next;
}
return biggest;
}
• Using Comparator<? super E> means that E could be set to a
type, and the Comparator could be for a supertype of that type
• For example, it could take a List<SaveAccount> and a
Comparator<SaveAccount>, but it could also take a
List<SaveAccount> and a Comparator<Account>
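• A runnable sketch of calling such a method with a comparator for a supertype; it uses String elements with a comparator for CharSequence (which String implements) in place of the account classes, and the names MostDemo and ByLength are invented for the example:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class; not part of the lecture code.
class MostDemo {
    static <E> E most(List<E> list, Comparator<? super E> comp) {
        E biggest = list.get(0);               // assumes the list is not empty
        for (int i = 1; i < list.size(); i++) {
            E next = list.get(i);
            if (comp.compare(next, biggest) > 0) biggest = next;
        }
        return biggest;
    }

    // A comparator for the supertype CharSequence, ordering by length.
    static class ByLength implements Comparator<CharSequence> {
        public int compare(CharSequence s1, CharSequence s2) {
            return s1.length() - s2.length();
        }
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("fig", "banana", "kiwi");
        // E is set to String; the comparator is for its supertype CharSequence
        System.out.println(most(words, new ByLength()));   // banana
    }
}
```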
• Here is an example of a Comparator class which does require an
argument in its constructor:
class ClosestTo implements Comparator<Account> {
private int val;
public ClosestTo(int amount) {
val=amount;
}
public int compare(Account acc1, Account acc2) {
int diff1 = Math.abs(acc1.balance()-val);
int diff2 = Math.abs(acc2.balance()-val);
return diff2-diff1;
}
}
• Then if comp is set to new ClosestTo(bal) a comparison
comp.compare(accA,accB) will give as the highest of the
Accounts referred to by accA and accB the one whose balance
is closest to bal
Using Natural Order
• To write generalised code that works for any type that has a natural
order, you use a type variable E which must be declared as set to a
class that has a method (in interface Comparable<T>) with header:
int compareTo(T obj)
• But E could instead be set to a subclass of a class S which has a
method with header
int compareTo(S obj)
• Here is a version of the previous method most where the element
returned is the biggest by natural order rather than by a
Comparator:
static <E extends Comparable<? super E>> E most(List<E> list){
E biggest=list.get(0);
for(int i=1; i<list.size(); i++) {
E next=list.get(i);
if(next.compareTo(biggest)>0) biggest=next;
}
return biggest;
}
How TreeSet<E> works underneath
• A TreeSet<E> created by a zero-argument constructor must be
one where E has a natural order, and its elements are compared
with each other by obj1.compareTo(obj2) where obj1 and
obj2 are of type E
• Or a TreeSet<E> is created by a constructor which provides an
object comp of type Comparator<? super E>, where obj1 and
obj2 are compared by comp.compare(obj1,obj2)
• Comparator<? super E> means the comparator could be of
type Comparator<S> where E is a subclass of S
• Inside a TreeSet<E> object is a binary tree storing E objects
• When an object is added, it is put in the tree on the left side if it is
less than the object in the tree root, on the right side if it is greater,
in the root if it is an empty tree, and not stored if it is equal to what
is in the root
• The binary tree is reorganised to keep it balanced, meaning the
number of checks needed to find if an element is in a TreeSet<E>
that stores N elements grows in proportion to log2 N rather than N
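The two ways of creating a TreeSet described above can be sketched as follows. The class name TreeSetDemo and the sample strings are made up; the second set uses a by-length comparator, so a string with the same length as one already present counts as equal and is not stored.

```java
import java.util.Comparator;
import java.util.TreeSet;

class TreeSetDemo {
    // Zero-argument constructor: elements compared by compareTo
    static String naturalOrder() {
        TreeSet<String> set = new TreeSet<>();
        set.add("pear");
        set.add("fig");
        set.add("apple");
        set.add("fig"); // already present, so no change
        return set.toString();
    }

    // Constructor with a Comparator: elements compared by comp.compare
    static String byLength() {
        Comparator<String> byLen = Comparator.comparingInt(String::length);
        TreeSet<String> set = new TreeSet<>(byLen);
        set.add("pear");
        set.add("fig");
        set.add("apple");
        // "plum" has the same length as "pear", so compare gives 0
        boolean added = set.add("plum");
        return set + " " + added;
    }

    public static void main(String[] args) {
        System.out.println(naturalOrder()); // [apple, fig, pear]
        System.out.println(byLength());     // [fig, pear, apple] false
    }
}
```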
44
How HashSet<E> works underneath
• A HashSet<E> works by storing elements in an array
• The position of an obj element is obj.hashCode()%length in
its array, where length is the length of the array
• In Java m%n evaluates to the remainder when m is divided by n, for
example, if m is 523 and n is 100 then m%n is 23
• As this means several elements could be stored at the same
position, each position in the array stores a list of elements
• If an element would be added to the list at a position, but an object
stored in the list at that position is equal to it according to the
equals method, then the element is not added
• Every object has the method hashCode(), as it is another method
that is inherited from Object
• The default method for hashCode() typically returns a different
value for different objects, so that is why they are distributed
evenly in the array inside a HashSet<E>
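The idea can be sketched with a much simplified set class. This is not the code of the real java.util.HashSet, which differs in detail: SimpleHashSet is a hypothetical name, a list of lists stands in for the array of lists, and Math.floorMod is used in place of % so that a negative hashCode() still gives a valid position.

```java
import java.util.ArrayList;
import java.util.List;

class SimpleHashSet<E> {
    // Each position holds a list of the elements whose hash maps there
    private final List<List<E>> buckets = new ArrayList<>();
    private int size = 0;

    SimpleHashSet(int length) {
        for (int i = 0; i < length; i++) buckets.add(new ArrayList<>());
    }

    private List<E> bucketFor(Object obj) {
        return buckets.get(Math.floorMod(obj.hashCode(), buckets.size()));
    }

    // Adds obj unless an equal element is already at its position
    boolean add(E obj) {
        List<E> bucket = bucketFor(obj);
        if (bucket.contains(obj)) return false; // contains uses equals
        bucket.add(obj);
        size++;
        return true;
    }

    boolean contains(Object obj) {
        return bucketFor(obj).contains(obj);
    }

    int size() { return size; }

    static String demo() {
        SimpleHashSet<Integer> set = new SimpleHashSet<>(100);
        boolean a = set.add(523); // hashCode 523, position 23
        boolean b = set.add(23);  // also position 23, but not equal to 523
        boolean c = set.add(523); // equal element already there
        return a + " " + b + " " + c + " " + set.size();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // true true false 2
    }
}
```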
45
Overriding hashCode()
• A Set<E> referred to by set should work by set.add(obj2) not
adding anything to the set if what obj2 refers to is equal to what
obj1 which is already contained in the set refers to
• As noted, a TreeSet<E> tests that by whether
obj1.compareTo(obj2) returns 0, or by whether
comp.compare(obj1,obj2) returns 0, so it does that rather
than by testing whether obj1.equals(obj2) returns true
• A HashSet<E> does test whether obj1.equals(obj2), but only
in the list at position obj2.hashCode()%length in its array
• So for HashSet<E> to work properly, obj1.hashCode() and
obj2.hashCode() should return the same value when obj1 and
obj2 are equal
• However, the hashCode() method inherited from Object would
mean obj1.hashCode() and obj2.hashCode() would return
different values
• So for HashSet<E> to work properly, when equals is overridden,
hashCode should also be overridden so that it gives the same
value for obj1 and obj2 if obj1.equals(obj2) returns true
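A minimal sketch of overriding both methods together, using a hypothetical Point class (not one from these slides). Objects.hash is one convenient way of building a hashCode from the same fields that equals compares.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class HashCodeDemo {
    // Hypothetical class whose equality is based on its content
    static class Point {
        private final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Point)) return false;
            Point p = (Point) other;
            return x == p.x && y == p.y;
        }

        // Equal Points give the same hashCode value
        @Override
        public int hashCode() {
            return Objects.hash(x, y);
        }
    }

    static int demo() {
        Set<Point> set = new HashSet<>();
        set.add(new Point(1, 2));
        set.add(new Point(1, 2)); // separate object, equal content
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 1
    }
}
```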
46
Mutable Objects
• An issue that needs to be considered is whether it makes sense to
override equals for objects that may have their content changed,
when that content is used in equals
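• For example, if rec1 and rec2 refer to Rectangle objects that
have the same content except that rec1 has height 10 and rec2
has height 12, then rec1.equals(rec2) returns false, but if we do
rec2.setHeight(10) then rec1.equals(rec2) returns true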
• If we had a variable rset referring to a HashSet<Rectangle>
object, if we did rset.add(rec1) and rset.add(rec2) before
the call of rec2.setHeight(10) then rec2 would be in the set,
but if we did rset.add(rec2) after, it would not
• If rset.add(rec2) was before the call which changed its height
from 12 to 10, it would continue to be in the position in the set
where it was put when its height was 12, and the set would then
have two Rectangles in it that are equal
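The situation can be reproduced with a small sketch. The Rectangle class here is an assumed minimal version (a width and a height, with equals and hashCode based on both), and the width of 5 is made up.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class MutableDemo {
    // Assumed minimal Rectangle with content-based equals and hashCode
    static class Rectangle {
        private int width, height;
        Rectangle(int width, int height) { this.width = width; this.height = height; }
        void setHeight(int h) { height = h; }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Rectangle)) return false;
            Rectangle r = (Rectangle) other;
            return width == r.width && height == r.height;
        }

        @Override
        public int hashCode() { return Objects.hash(width, height); }
    }

    // Both added while different, then rec2 is changed to equal rec1
    static int sizeWhenAddedBeforeChange() {
        Rectangle rec1 = new Rectangle(5, 10);
        Rectangle rec2 = new Rectangle(5, 12);
        Set<Rectangle> rset = new HashSet<>();
        rset.add(rec1);
        rset.add(rec2);
        rec2.setHeight(10);
        return rset.size(); // the set now holds two equal Rectangles
    }

    // rec2 is changed first, so the add finds an equal element
    static boolean addedAfterChange() {
        Rectangle rec1 = new Rectangle(5, 10);
        Rectangle rec2 = new Rectangle(5, 12);
        Set<Rectangle> rset = new HashSet<>();
        rset.add(rec1);
        rec2.setHeight(10);
        return rset.add(rec2);
    }

    public static void main(String[] args) {
        System.out.println(sizeWhenAddedBeforeChange()); // 2
        System.out.println(addedAfterChange());          // false
    }
}
```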
47
Map<K,V>
• The interface type Map<K,V> is the generalised type of
variables to refer to collections where elements of whatever type
V is set to are indexed by elements of whatever type K is set to
• For example, Map<String,Account> accs is a variable that
refers to a collection of Account objects indexed by Strings
• Then if name is of type String, and acc is of type Account,
accs.put(name,acc) stores acc in accs indexed by name
• accs.get(name) returns the Account indexed by name, or
null if there is no Account indexed by name
• In a Map<K,V> only one V object can be indexed by a particular
K object, so if there is already an Account object in accs
indexed by name, accs.put(name,acc) will replace it by the
one referred to by acc, it also returns the one that was replaced
• accs.remove(name) will return the Account indexed by
name and remove it from accs, or return null if no Account is
indexed in accs with name
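A short sketch of put, get and remove. To keep it self-contained, Integer balances stand in for Account objects; the names and amounts are made up.

```java
import java.util.HashMap;
import java.util.Map;

class MapDemo {
    static String demo() {
        Map<String, Integer> accs = new HashMap<>();
        Integer first = accs.put("ann", 100);    // nothing replaced: null
        Integer replaced = accs.put("ann", 250); // returns the replaced 100
        Integer got = accs.get("ann");           // 250
        Integer missing = accs.get("bob");       // no such index: null
        Integer removed = accs.remove("ann");    // returns 250 and removes it
        return first + " " + replaced + " " + got + " "
             + missing + " " + removed + " " + accs.size();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // null 100 250 null 250 0
    }
}
```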
48
HashMap<K,V> and TreeMap<K,V>
• In Java’s built-in collection code, the interface type Map<K,V> is
implemented by HashMap<K,V> and TreeMap<K,V>
• These are implemented and work similarly to HashSet<K>
and TreeSet<K>, but also store a value of type V at the
position where a value of type K is stored
• So TreeMap<K,V> stores elements of type V in the natural
order of their index K, but like TreeSet<E> can also be set to
store elements in an order given by a Comparator<? super K>
(that is Comparator<K> or Comparator<S> where S is a
supertype of K) passed as an argument to the constructor
• HashMap<K,V> stores elements in what seems to be a random
order, and like HashSet<E> a check whether an element is
stored in it can be done in one step, providing that hashCode()
for K is set so that elements are unlikely to have the same
hashCode value, unless they are equal, in which case they must
have the same hashCode value
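The effect of passing a Comparator to the TreeMap constructor can be sketched as below, using Comparator.reverseOrder() as a simple example of an alternative order. The class name and contents are made up.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

class TreeMapOrderDemo {
    private static void fill(Map<String, Integer> m) {
        m.put("pear", 3);
        m.put("apple", 1);
        m.put("fig", 2);
    }

    // Natural order of the String indexes
    static String natural() {
        Map<String, Integer> m = new TreeMap<>();
        fill(m);
        return m.toString();
    }

    // Order given by a Comparator passed to the constructor
    static String reversed() {
        Comparator<String> rev = Comparator.reverseOrder();
        Map<String, Integer> m = new TreeMap<>(rev);
        fill(m);
        return m.toString();
    }

    public static void main(String[] args) {
        System.out.println(natural());  // {apple=1, fig=2, pear=3}
        System.out.println(reversed()); // {pear=3, fig=2, apple=1}
    }
}
```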
49
keySet() and values() (1)
• If map refers to a Map<K,V> then map.keySet() returns a
Set<K> that stores the K values used as indexes in what map
refers to
• If what map refers to is a TreeMap<K,V> then the Set<K>
returned by map.keySet() is such that going through it using a
for-each loop will go through it in the order of the K objects
• So to print each element in a map in the order they are stored,
the following, with K the type used when map was declared:
for(K index : map.keySet())
System.out.println(map.get(index));
would do that
• Also, map.values() returns a collection of the elements of type
V stored in the map, so the following will also print them all in the
order they are stored, with V what was used for map:
for(V obj : map.values())
System.out.println(obj);
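The two loops can be tried on a small TreeMap. In this sketch the results are collected into a String rather than printed inside the loop, and the map contents are made up.

```java
import java.util.Map;
import java.util.TreeMap;

class KeySetValuesDemo {
    private static Map<String, String> sample() {
        Map<String, String> map = new TreeMap<>();
        map.put("b", "banana");
        map.put("a", "avocado");
        map.put("c", "cherry");
        return map;
    }

    // Goes through the indexes in order and looks each value up
    static String viaKeys() {
        Map<String, String> map = sample();
        StringBuilder sb = new StringBuilder();
        for (String index : map.keySet())
            sb.append(map.get(index)).append(' ');
        return sb.toString().trim();
    }

    // Goes through the values directly, in the order of their indexes
    static String viaValues() {
        Map<String, String> map = sample();
        StringBuilder sb = new StringBuilder();
        for (String obj : map.values())
            sb.append(obj).append(' ');
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(viaKeys());   // avocado banana cherry
        System.out.println(viaValues()); // avocado banana cherry
    }
}
```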
50
Collection<E>
• What is returned by map.keySet() is of type Set<K> because
each object of type K can only be stored in it once
• What is returned by map.values() is not of type Set<V>
because the same object could be stored more than once in the
map if indexed by more than one K object
• What is returned by map.values() is of type Collection<V>,
which is an interface that is a supertype of both List<V> and
Set<V>
• Generalised code that could work for both Set<E>s and
List<E>s can be given by using Collection<E> as its
parameter type
• Only the methods that are in Set<E> can be called on a variable
of type Collection<E>, but if it refers to a List<E> object
they will work as List<E> methods
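A sketch of generalised code with a Collection parameter: the hypothetical method total below can be given a List<Integer> or a Set<Integer>. The sample values are made up; the set drops the duplicate, so the totals differ.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CollectionParamDemo {
    // Works for any collection of Integers, whether a list or a set
    static int total(Collection<Integer> coll) {
        int sum = 0;
        for (int n : coll) sum += n;
        return sum;
    }

    static int listTotal() {
        List<Integer> list = Arrays.asList(2, 2, 3);
        return total(list);
    }

    static int setTotal() {
        Set<Integer> set = new HashSet<>(Arrays.asList(2, 2, 3));
        return total(set); // the set stores 2 only once
    }

    public static void main(String[] args) {
        System.out.println(listTotal()); // 7
        System.out.println(setTotal());  // 5
    }
}
```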
51
keySet() and values() (2)
• If what map refers to is a TreeMap<K,V> then the Set<K> that
keys is set to by keys=map.keySet() is such that going
through it using a for-each loop will go through it in the order of
the K objects, similarly with vals set by vals=map.values()
• Also, what is returned by map.keySet() and map.values()
remains linked to the TreeMap<K,V> that they have come from, so
keys.remove(index) will remove from map what is indexed
by index, and vals.remove(obj) will remove obj from the
map, but only one occurrence if it occurs more than once
• If you call keys.add(index) or vals.add(obj) it will cause
an exception to be thrown when that code is reached, because it
is not possible to add an index without a value or a value without
an index
• There are other methods that can be called on a Map<K,V> object,
what has been covered here are the most important ones to use
and get a general understanding of how a Map<K,V> works
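The link between a map and its key set and values collection can be seen in a small sketch. The map contents are made up; UnsupportedOperationException is the exception thrown here by the attempt to add.

```java
import java.util.Collection;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class MapViewsDemo {
    static String demo() {
        Map<String, Integer> map = new TreeMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 2);
        Set<String> keys = map.keySet();
        Collection<Integer> vals = map.values();

        keys.remove("a");                // removes a=1 from the map
        vals.remove(Integer.valueOf(2)); // removes one occurrence of 2 (b=2)

        boolean threw = false;
        try {
            keys.add("z");               // an index without a value
        } catch (UnsupportedOperationException e) {
            threw = true;
        }
        return map + " " + threw;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // {c=2} true
    }
}
```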
52
Other methods in Collection<E>
• Here are some of the other methods that are in Collection<E>
and so can be called on any List<E> or Set<E> object:
o contains(E obj) returns true if the collection contains
what obj refers to (or equal to it), otherwise false
o containsAll(Collection<?> c) returns
true if the collection contains all the objects that are in the
collection that c refers to, otherwise false
o addAll(Collection<? extends E> c) adds to the
collection all the elements from the collection that c refers to
o removeAll(Collection<?> c) removes
from the collection all the elements from the collection that c
refers to
• Actually, the argument of contains is type Object rather than
E, so it can take any argument of any object type, though it would
always return false if the object it referred to was not of type E
or a subtype of E
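A brief sketch of these methods on a list of Integers, with made-up values. The last call passes a String to contains, which compiles because the parameter type is Object.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CollectionMethodsDemo {
    static String demo() {
        List<Integer> nums = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        List<Integer> some = Arrays.asList(2, 4);

        boolean hasAll = nums.containsAll(some); // 2 and 4 are both present
        nums.removeAll(some);                    // leaves 1 and 3
        nums.addAll(some);                       // adds 2 and 4 at the end
        boolean hasString = nums.contains("2");  // a String is never in it

        return hasAll + " " + nums + " " + hasString;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // true [1, 3, 2, 4] false
    }
}
```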
53
Iterator<E>
• The final method in Collection<E> we will look at is
iterator() which returns an object of type Iterator<E>
• With an Iterator<E> object created by
it=coll.iterator(), each time it.next() is called, it
returns a different object from the collection referred to by coll
• it.hasNext() returns true if there are further elements to be
returned by calls of it.next(), otherwise it returns false
• So:
for(Iterator<E> it=coll.iterator(); it.hasNext(); )
{E obj=it.next(); ... }
is equivalent to the for-each loop:
for(E obj : coll) { ... }
• it.next() will throw an exception if it is called when there are
no more elements in the collection still to be returned, so
it.hasNext() should always be used to do the check
necessary to stop that from happening
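The explicit iterator loop can be sketched as a complete method. The method name sum and the values are made up; the second method shows that the exception thrown by next() when nothing is left is a NoSuchElementException.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.NoSuchElementException;

class IteratorDemo {
    // Adds up the elements using hasNext() and next() directly
    static int sum(Collection<Integer> coll) {
        int total = 0;
        for (Iterator<Integer> it = coll.iterator(); it.hasNext(); ) {
            Integer obj = it.next();
            total += obj;
        }
        return total;
    }

    static int demo() {
        return sum(new ArrayList<>(Arrays.asList(1, 2, 3)));
    }

    // Calling next() without checking hasNext() on an empty collection
    static boolean nextPastEndThrows() {
        Iterator<Integer> it = new ArrayList<Integer>().iterator();
        try {
            it.next();
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());              // 6
        System.out.println(nextPastEndThrows()); // true
    }
}
```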
54
remove() in Iterator<E>
• As noted previously, in
for(E obj : coll) { ... }
when it returns to the header, it will throw an exception if ...
caused the collection referred to by coll to be changed
• That is because a for-each loop works underneath by using an
Iterator, and if it was set by it=coll.iterator() then
it.next() will throw an exception if what is inside the collection
referred to by coll was changed after that
• However, there is one way a collection can be changed when it is
being processed by an iterator, which is calling it.remove()
• The call it.remove() will remove from the collection coll the
element that was returned by the last call of it.next()
• Some forms of collections cannot be changed at all, and in that
case a call it.remove() will throw an exception
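The three cases above can be sketched together, with made-up values: removal through the iterator, removal directly from the collection inside a for-each loop, and an attempt at removal from a collection that cannot be changed (here one made by Collections.unmodifiableList).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class IteratorRemoveDemo {
    // Safe: the iterator itself removes the element it last returned
    static String removeEvens() {
        List<Integer> nums = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5, 6));
        for (Iterator<Integer> it = nums.iterator(); it.hasNext(); ) {
            if (it.next() % 2 == 0) it.remove();
        }
        return nums.toString();
    }

    // Unsafe: the list is changed directly during a for-each loop
    static boolean forEachRemoveThrows() {
        List<Integer> nums = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        try {
            for (Integer n : nums) {
                if (n == 1) nums.remove(n);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // A collection that cannot be changed rejects it.remove()
    static boolean unmodifiableThrows() {
        List<Integer> fixed = Collections.unmodifiableList(Arrays.asList(1, 2));
        Iterator<Integer> it = fixed.iterator();
        it.next();
        try {
            it.remove();
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(removeEvens());         // [1, 3, 5]
        System.out.println(forEachRemoveThrows()); // true
        System.out.println(unmodifiableThrows());  // true
    }
}
```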
55
ListIterator<E>
• There is a method listIterator() in interface List<E> which
returns an object of type ListIterator<E>
• ListIterator<E> is a subtype of Iterator<E>, it has the
same methods as Iterator<E> and some additional ones
• The additional methods allow the iterator to be moved backwards
as well as forwards on the list it refers to
• If listIt refers to a ListIterator<E> then
listIt.previous() moves the iterator back one position and
returns the element it moved over, so if it is called just after
listIt.next() it returns the same object that call returned,
and listIt.next() would then return that object again
• listIt.nextIndex() gives the index in the list of the element
that would be returned if listIt.next() was then called
• listIt.set(val) changes to val what is stored in the
position of the element returned by the last call of
listIt.next() or listIt.previous()
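A small sketch of moving a ListIterator forwards and backwards over a made-up list of Strings. In this run the call of previous() hands back the element that the preceding call of next() returned, and set replaces that element in the list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

class ListIteratorDemo {
    static String demo() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        ListIterator<String> listIt = list.listIterator();

        String first = listIt.next();    // "a"
        String second = listIt.next();   // "b"
        String back = listIt.previous(); // moves back over "b"
        int idx = listIt.nextIndex();    // position of the element next() would give
        listIt.set("B");                 // replaces the element last returned
        String again = listIt.next();    // the replaced element

        return first + " " + second + " " + back + " "
             + idx + " " + again + " " + list;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // a b b 1 B [a, B, c]
    }
}
```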
56
Primitive Wrapper Types
• As has been mentioned, a type variable in Java cannot be set to a
primitive type, so a collection of integers has to use type Integer,
rather than type int, as in ArrayList<Integer>
• Similarly, an ArrayList of char values would have to be
ArrayList<Character> rather than ArrayList<char>
• These are known as wrapper types, they refer to objects which
contain a variable of the primitive type inside them
• The wrapper type of int is Integer and of char is Character, and
for all other primitive types the wrapper type is the same name but
with an initial upper case letter, so the wrapper type of double is
Double
• The primitive wrapper classes also have some static methods that
deal with issues of their types, Integer.parseInt(str) is an
example of using a static method in Integer
• Character has methods like Character.isUpperCase(ch) that
returns a boolean saying whether what ch stores is an upper case
character
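A short sketch using both static methods mentioned above, together with an ArrayList<Character>. The strings "Java FOOP" and "42" are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

class WrapperDemo {
    static String demo() {
        // Collect the upper case characters of a String
        List<Character> letters = new ArrayList<>();
        for (char ch : "Java FOOP".toCharArray()) {
            if (Character.isUpperCase(ch)) letters.add(ch);
        }
        // Convert a String of digits to an int
        int n = Integer.parseInt("42");
        return letters + " " + (n + 1);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [J, F, O, O, P] 43
    }
}
```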
57
Integer and int
• If num is a variable of type Integer, then num.intValue()
returns the int value that it wraps
• It rarely needs to be used, because if a variable of type
Integer is used where a variable of type int is required, the
value is automatically converted to an int value and vice versa
• For example, the header of the get method in List<E> is:
E get(int index)
• Which means if nums is of type List<Integer> then it takes an
argument of type int and returns a reference to an Integer
object, but you could call int p = nums.get(num);
• However, the two different remove methods:
E remove(int index)
boolean remove(Object object)
means that if nums is of type List<Integer> then
nums.remove(val) does the first of these if val is of type int
and the second if val is of type Integer
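The difference between the two remove methods can be seen by starting from the same made-up list twice, once with an int argument and once with an Integer argument of the same value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveOverloadDemo {
    static String demo() {
        List<Integer> start = Arrays.asList(10, 20, 1);
        int i = 1;
        Integer val = 1;

        List<Integer> byIndex = new ArrayList<>(start);
        byIndex.remove(i);    // int argument: removes what is at position 1

        List<Integer> byObject = new ArrayList<>(start);
        byObject.remove(val); // Integer argument: removes the element equal to 1

        return byIndex + " " + byObject;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [10, 1] [10, 20]
    }
}
```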
58
Summary: overriding equals
• What has been covered here are basic aspects of programming in
Java that you need to be familiar with in order to be able to say
that you really are someone who knows how to program in Java
• To properly explain how built-in collection types can be used, we
needed to cover some other important aspects of Java
• One of these was the method equals, it is important to
understand the difference between obj1==obj2 and
obj1.equals(obj2)
• If the method equals has not been overridden in the class of the
object that obj1 refers to (or in a superclass) then obj1==obj2
and obj1.equals(obj2) will work the same, evaluating to true
only if obj1 and obj2 refer to the same actual object
• You can override equals so that obj1.equals(obj2) returns
true if they refer to separate objects that have the same content
• Built-in collection code, for example list.indexOf(obj),
makes use of equals, so overriding equals affects how it works
59
Summary: ordering objects
• It is common to want to have a way of ordering objects so that one
may be considered greater than or less than another
• There are two ways of doing this which are used in built-in code,
such as the collection classes TreeSet<E> and TreeMap<K,V>
• One is to put a method compareTo in a class so that
obj1.compareTo(obj2) returns a negative integer if what obj1
refers to is considered less than what obj2 refers to, a positive
integer if obj1 is considered greater than obj2, and 0 if they are
considered equal
• Another is to use a separate object of type Comparator<T> to do
the comparison, so that if comp refers to that object then
comp.compare(obj1,obj2) gives the comparison
• Using a Comparator<E> means the same objects can be ordered
in different ways, a simple example is a Comparator<String>
that orders Strings in order of length rather than alphabetic order
60
Summary: Type Variables
• Type variables are a way of generalising code
• If you want code that works with several objects that need to be of
the same type, but it doesn’t make any particular use of the type,
you can declare a type variable E at the head of a class or before
the return type of an individual method by <E>
• ArrayList<E> is an example of this, it means an ArrayList
can only store objects of the same type, so you know that if accs
is a variable of type ArrayList<Account>, that is with E set to
Account, accs.get(i) will return an object of type Account
• When a generalised method is called, with parameter types that
are type variables or use type variables, it will be checked when
the code is compiled that the arguments are such that the type
variable can be set to a particular type
• A type variable declared as <E> can be used as the type of
variables that refer to objects, but only methods from Object,
which is the superclass of all classes, can be called on them
61
Summary: Type variables with extends
• A type variable declared as <E extends Type> where Type is
any object type can only be set to Type or a subtype of Type
• Then when E is used as a type for a variable that refers to objects,
methods from class Type can be called on them
• This is needed because although a variable of type Type can refer
to an object of actual type SubType where SubType is a subtype
of Type, a variable of type Gen<Type> cannot refer to an object of
actual type Gen<SubType> (for any type Gen with a type variable)
• A variable of type Gen<Type> can refer to an object of type
SubGen<Type> where SubGen<T> is a subclass of Gen<T>
• A variable of type Gen<? extends Type> is another way of
dealing with Gen<E> with E set as <E extends Type>
• A variable of type Gen<? super Type> can refer to an object of
type Gen<Type> or Gen<UpType> where Type is a subtype of
UpType, this can be used for objects whose task is to perform
particular operations on objects of type Type, such as
Comparator objects
62
Summary: Collection Classes
• You can use classes like ArrayList<E> or TreeSet<E> or
HashMap<K,V> for general collection use, you don’t need to know
the details of how they work underneath, that is you only need to
know WHAT they do, not HOW they do it
• ArrayList<E> works like an array, objects are stored at a
position given by an integer value
• However, where arrays are of a fixed size, the size of an
ArrayList<E> can change by adding or removing objects, also
when an object is added or removed those following it change their
position going up by one or down by one
• A TreeSet<E> stores elements in order, and if an element is
added that is already present, no change is made
• There is no index in a TreeSet<E>, but an Iterator<E> can go
through its elements one at a time
• HashMap<K,V> stores elements of type V indexed by elements of
type K
63
Summary: HOW collection classes work
• We have mentioned only briefly what is inside the collection
classes that makes them work, that is something that can be looked
at in more detail later
• Although to use a class you don’t need to know the details of what
is in it to make it work, it can help to do so in order to make sure
your code works efficiently
• LinkedList<E> is the same as ArrayList<E> in WHAT it
does, but differs in HOW it does it, you can choose to create an
object of type LinkedList<E> if you want code that works like
an array, but you know that the methods called on it most often are
those where LinkedList<E> works more efficiently
• List<E> is an interface type implemented by ArrayList<E>
and LinkedList<E>, it should be used as the general type so
that the same code will work for ArrayList<E> and
LinkedList<E> objects
• This efficiency only becomes an issue with collections that store
many thousands of elements
64
