
DAA Experiential Learning

Submitted By: Bhaskar Vivek Agarwal

About Assignment:

I have attempted the first question of the assignment, which can be divided into
three parts:

Assumption: The two log files are treated as two strings.

1. The first part consists of a pattern-searching algorithm followed by the
use of hash maps. The time complexity of the first part is O(n log n) + c, where
c is some constant work done. The space complexity is O(n).

2. The second part is based on the Longest Common Subsequence (LCS)
algorithm, which in my case has a time complexity of O(n^2). The space
complexity is also O(n^2), because the full 2-D LCS matrix is stored.

3. The third part is based on the Huffman encoding algorithm. The time
complexity of this algorithm is O(n log n) and the space complexity is O(n).
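As a side note on the space figure for the second part: when only the length of the LCS is needed, two rows of the table are enough, which gives O(n) extra space. Recovering the subsequence itself, as the code in this report does, needs the full table. A minimal sketch of the two-row idea follows; the class and method names are hypothetical and not part of the submitted code.

```java
public class LcsLengthSketch {
    // Returns the length of the longest common subsequence of a and b,
    // keeping only two rows of the DP table at any time.
    public static int lcsLength(String a, String b) {
        int[] prev = new int[b.length() + 1]; // row i-1 of the table
        int[] curr = new int[b.length() + 1]; // row i of the table
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    curr[j] = prev[j - 1] + 1;
                } else {
                    curr[j] = Math.max(prev[j], curr[j - 1]);
                }
            }
            // The finished row becomes the previous row for the next iteration.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(lcsLength("AGGTAB", "GXTXAYB")); // prints 4 ("GTAB")
    }
}
```

The time cost is unchanged at O(n^2); only the memory footprint shrinks.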

Data Structures USED:


In the 1st part I have used the Array/String data structure followed by hash maps.
The built-in functions of both strings and hash maps were very useful.

In the 2nd part I have used the Array data structure, which was very useful in
creating the 2-D matrix for the LCS.

Figure: 2-D LCS matrix schematic diagram

In the 3rd part the main data structures used are the tree, the priority queue and
hash maps.

Figure: An example of a Huffman tree.

Figure: An example of the priority queue used in encoding the characters.

Algorithms/Pseudocode

1. Pattern_Search and hash_maps:

1. Create two String lists.
2. Use a for-each loop to traverse the words and collect the common words.
3. Create a hash map that counts how often each common word occurs.
4. Using another String list (cand), sort the stored words by frequency.
5. Using the subList function, return the top 3 most frequent words.
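The five steps above can be sketched as follows. This is only an illustration under stated assumptions: the class name, the method name and the parameter k are hypothetical, a HashSet is swapped in for the membership test, and words are counted in the first sentence only, as in the submitted code.

```java
import java.util.*;

public class CommonWordsSketch {
    // Returns up to k words of `first` that also occur in `second`,
    // most frequent first; ties are broken alphabetically.
    public static List<String> topCommonWords(String first, String second, int k) {
        Set<String> secondWords = new HashSet<>(Arrays.asList(second.split(" ")));
        Map<String, Integer> counts = new HashMap<>();
        for (String word : first.split(" ")) {
            if (secondWords.contains(word)) {
                counts.merge(word, 1, Integer::sum); // count each common word
            }
        }
        List<String> cand = new ArrayList<>(counts.keySet());
        cand.sort((w1, w2) -> {
            int byCount = Integer.compare(counts.get(w2), counts.get(w1));
            return byCount != 0 ? byCount : w1.compareTo(w2);
        });
        // Math.min guards against inputs with fewer than k common words.
        return cand.subList(0, Math.min(k, cand.size()));
    }

    public static void main(String[] args) {
        // prints [red, the, fox]
        System.out.println(topCommonWords("the red fox and the red hen", "the fox is red", 3));
    }
}
```

Using a set for the lookup keeps each membership test at expected constant time, so the sort of the candidate list dominates the running time.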
2. Longest Common Subsequence (LCS): (Pseudocode)

Let X and Y be the two given sequences

Initialize a table LCS of dimension (X.length + 1) * (Y.length + 1)
X.label = X
Y.label = Y
LCS[0][] = 0
LCS[][0] = 0
Start from LCS[1][1]
Compare X[i] and Y[j]
    If X[i] = Y[j]
        LCS[i][j] = 1 + LCS[i-1][j-1]
        Point an arrow to LCS[i][j]
    Else
        LCS[i][j] = max(LCS[i-1][j], LCS[i][j-1])
        Point an arrow to max(LCS[i-1][j], LCS[i][j-1])

3. Huffman Coding:

1. Input: n messages (symbols) with their frequency counts.
2. Output: Huffman merge tree.
3. Begin
4. Let Q be the priority queue,
5. Q = {initialize priority queue with frequencies of all symbols or messages}
6. Repeat n-1 times
7. Create a new node Z
8. X = extract_min(Q)
9. Y = extract_min(Q)
10. Frequency(Z) = Frequency(X) + Frequency(Y); Left(Z) = X; Right(Z) = Y
11. Insert(Z, Q)
12. End repeat
13. Return (extract_min(Q))
14. End.
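The pseudocode above can be turned into a compact, self-contained Java sketch. The class HuffmanSketch, its nested Node type and the codes method are hypothetical names chosen for illustration, and giving a single-symbol input the code "0" is an assumption, since the pseudocode does not cover that case.

```java
import java.util.*;

public class HuffmanSketch {
    // A tree node: leaves carry a symbol, internal nodes only a combined frequency.
    static final class Node {
        final char symbol;
        final int freq;
        final Node left, right;
        Node(char symbol, int freq, Node left, Node right) {
            this.symbol = symbol;
            this.freq = freq;
            this.left = left;
            this.right = right;
        }
        boolean isLeaf() { return left == null && right == null; }
    }

    // Builds the merge tree as in the pseudocode and returns symbol -> bit string.
    public static Map<Character, String> codes(String text) {
        Map<Character, Integer> freq = new TreeMap<>();
        for (char ch : text.toCharArray()) {
            freq.merge(ch, 1, Integer::sum);
        }
        PriorityQueue<Node> q = new PriorityQueue<>(Comparator.comparingInt((Node n) -> n.freq));
        for (Map.Entry<Character, Integer> e : freq.entrySet()) {
            q.add(new Node(e.getKey(), e.getValue(), null, null));
        }
        Map<Character, String> out = new TreeMap<>();
        if (q.isEmpty()) {
            return out; // empty input, nothing to encode
        }
        while (q.size() > 1) {   // runs n-1 times for n symbols
            Node x = q.poll();   // extract_min
            Node y = q.poll();   // extract_min
            q.add(new Node('\0', x.freq + y.freq, x, y));
        }
        assign(q.poll(), "", out);
        return out;
    }

    // Walks the tree: "0" for a left edge, "1" for a right edge.
    private static void assign(Node node, String prefix, Map<Character, String> out) {
        if (node.isLeaf()) {
            // Assumption: a lone symbol (tree of one node) gets the code "0".
            out.put(node.symbol, prefix.isEmpty() ? "0" : prefix);
            return;
        }
        assign(node.left, prefix + "0", out);
        assign(node.right, prefix + "1", out);
    }

    public static void main(String[] args) {
        System.out.println(codes("aaaabbc")); // prints {a=1, b=01, c=00}
    }
}
```

The most frequent symbol ends up closest to the root and therefore gets the shortest code, which is the point of the greedy merge.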

CODES:
Since the 1st and 2nd parts were done in the same class, I am attaching one code listing for both.
There are commented explanations as well in some places.
Programming language used: Java, IDE: Eclipse.

import java.util.*;

public class hg {
    // Returns the three words that are most frequent among the words of
    // `first` that also occur in `second`, and prints the longest pairwise LCS.
    public static List<String> Pattern_Search(String first, String second)
    {
        List<String> arr_second = Arrays.asList(second.split(" "));
        List<String> list = new ArrayList<String>();
        for (String word : first.split(" ")) {
            if (arr_second.contains(word))
            {
                list.add(word);
            }
        }
        if (list.size() == 0) {
            return new ArrayList<String>();
        }

        // Count how often each common word occurs.
        Map<String, Integer> map = new HashMap<>();
        for (String s : list) {
            map.put(s, map.getOrDefault(s, 0) + 1);
        }

        // Sort by descending frequency; ties are broken alphabetically.
        // The Integer counts are compared with equals(), because != compares references.
        List<String> cand = new ArrayList<>(map.keySet());
        Collections.sort(cand, (w1, w2) -> !map.get(w1).equals(map.get(w2)) ?
                map.get(w2) - map.get(w1) : w1.compareTo(w2));

        // With fewer than three distinct common words the sub-lists below do not exist.
        if (cand.size() < 3) {
            return cand;
        }

        String St1 = LCS(cand.subList(0, 1), cand.subList(1, 2));
        String St2 = LCS(cand.subList(1, 2), cand.subList(2, 3));
        String St3 = LCS(cand.subList(0, 1), cand.subList(2, 3));

        // Print the longest of the three pairwise LCS strings.
        if (St1.length() >= St2.length() && St1.length() >= St3.length())
        {
            System.out.println(St1);
        }
        else if (St2.length() >= St3.length())
        {
            System.out.println(St2);
        }
        else
        {
            System.out.println(St3);
        }

        return cand.subList(0, 3); // returns the 3 words that are most frequent in the two strings.
    }
    // Returns the longest common subsequence of the two given words.
    public static String LCS(List<String> str1, List<String> str2)
    {
        // Note: List.toString() wraps the content in brackets, e.g. "[fox]",
        // so '[' and ']' also take part in the comparison.
        String s1 = str1.toString();
        String s2 = str2.toString();
        int l1 = s1.length();
        int l2 = s2.length();
        int dp[][] = new int[l1 + 1][l2 + 1];

        // Making LCS Matrix
        for (int i = 0; i <= l1; i++)
        {
            dp[i][0] = 0;
        }
        for (int i = 0; i <= l2; i++)
        {
            dp[0][i] = 0;
        }
        for (int i = 1; i <= l1; i++)
        {
            for (int j = 1; j <= l2; j++)
            {
                if (s1.charAt(i - 1) == s2.charAt(j - 1))
                {
                    dp[i][j] = 1 + dp[i - 1][j - 1];
                }
                else
                {
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }

        // Storing LCS
        String lcs = "";
        int i = l1, j = l2;
        while (i > 0 && j > 0)
        {
            // If the current characters in both strings are the same, then the
            // current character is part of the LCS
            if (s1.charAt(i - 1) == s2.charAt(j - 1))
            {
                lcs = lcs + s1.charAt(i - 1);
                i--;
                j--;
            }
            // If the current characters in X and Y are different, move upwards
            else if (dp[i - 1][j] > dp[i][j - 1])
            {
                i--;
            }
            // If the current characters in X and Y are different, move leftwards
            else
            {
                j--;
            }
        }
        // The characters were collected from the end, so reverse them.
        lcs = new StringBuilder(lcs).reverse().toString();
        return lcs;
    }

    public static void main(String args[])
    {
        // The Scanner has to be created before it is used to read the input.
        Scanner sc = new Scanner(System.in);
        System.out.println("Enter sentence 1");
        String first = sc.nextLine();
        System.out.println("Enter sentence 2");
        String second = sc.nextLine();
        System.out.println(Pattern_Search(first, second));
    }
}

Taking two strings as input, the program gives the following output:

The first list in the output is the result of the 2nd part and the second list is the result of the 1st part.
The three most common words here are thef, fox and red, and the LCS of
these strings taken pairwise is ef.

This [ef] is passed to the Huffman class, and the code for the Huffman part is as follows.

CODE:

import java.util.*;

class Huffman {

    // Frequency of each character of the word that is being encoded.
    static Map<String, Integer> Chars = new HashMap<>();

    // Counts the characters of the given word into Chars.
    public static void huff(String word)
    {
        Scanner sc = new Scanner(word).useDelimiter("");
        while (sc.hasNext()) {
            String c = sc.next();
            // checks if a character is present in the HashMap and updates its count.
            if (Chars.containsKey(c)) {
                Chars.put(c, Chars.get(c) + 1);
            }
            // adds character to HashMap if it is not already present in the HashMap
            else {
                Chars.put(c, 1);
            }
        }
    }
    // Recursive function to print the Huffman code through the tree traversal.
    // Here s is the Huffman code generated so far.
    public static void printCode(HuffmanNode root, String s)
    {
        // nothing to print for an empty tree
        if (root == null) {
            return;
        }

        // base case: if the left and right are null then it is a leaf node
        // and s is the code generated by traversing the tree.
        if (root.left == null && root.right == null) {
            // c is the character in the node; only letters are printed
            if (Character.isLetter(root.c)) {
                System.out.println(root.c + ":" + s);
            }
            return;
        }

        // if we go to the left then add "0" to the code.
        // if we go to the right then add "1" to the code.
        // recursive calls for the left and right sub-trees of the generated tree.
        printCode(root.left, s + "0");
        printCode(root.right, s + "1");
    }

    // Builds the Huffman tree from the counts in Chars and prints the codes.
    public static void xyz()
    {
        // creating a priority queue q.
        // makes a min-priority queue (min-heap).
        PriorityQueue<HuffmanNode> q
            = new PriorityQueue<HuffmanNode>(100, new MyComparator());

        for (String x : Chars.keySet()) {
            // creating a Huffman node object
            // and adding it to the priority queue.
            HuffmanNode hn = new HuffmanNode();

            hn.c = x.charAt(0);
            hn.data = Chars.get(x);

            hn.left = null;
            hn.right = null;

            q.add(hn);
        }

        // the root node; with a single distinct character the only leaf is the root
        HuffmanNode root = q.peek();

        // Here we extract the two minimum values from the heap each time
        // until its size reduces to 1.
        while (q.size() > 1) {
            // first min extract.
            HuffmanNode x = q.poll();

            // second min extract.
            HuffmanNode y = q.poll();

            // new node f whose frequency is equal
            // to the sum of the frequencies of the two nodes
            HuffmanNode f = new HuffmanNode();
            f.data = x.data + y.data;
            f.c = '-';

            // first extracted node as left child.
            f.left = x;

            // second extracted node as the right child.
            f.right = y;

            // marking the f node as the root node.
            root = f;

            // add this node to the priority-queue.
            q.add(f);
        }

        // print the codes by traversing the tree
        printCode(root, "");
    }
}

// node class is the basic structure
// of each node present in the Huffman tree.
class HuffmanNode {

    int data; // frequency of the node
    char c;   // character stored in the node

    HuffmanNode left;
    HuffmanNode right;
}

// comparator class helps to compare the nodes
// on the basis of one of their attributes.
// Here the nodes are compared
// on the basis of their data values.
class MyComparator implements Comparator<HuffmanNode> {
    public int compare(HuffmanNode x, HuffmanNode y)
    {
        return x.data - y.data;
    }
}

OUTPUT:

The longest common subsequence is encoded as 01 using Huffman encoding.
