Bhaskar - CSE-2 - DAA Experiential Learning

DAA Experiential Learning
Submitted By: Bhaskar Vivek Agarwal
About Assignment:
I have attempted the first question of the assignment, which can be divided into three parts.
Assumption: The two log files are treated as two strings.
1. The first part consists of a pattern-searching algorithm followed by the use of hash maps. Its time complexity is O(n log n) for sorting the candidate words (any constant extra work is absorbed by the O notation), assuming the membership test on the second string is a hash-based lookup. Its space complexity is O(n).
2. The second part is based on the Longest Common Subsequence (LCS) algorithm, which in my case has time complexity O(n^2). The space complexity is O(n^2) for the full 2-D table used in the code; it drops to O(n) only if just the LCS length is needed and two rows are kept.
3. The third part is based on the Huffman encoding algorithm. Its time complexity is O(n log n) and its space complexity is O(n).
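The O(n) space figure for LCS applies when only the length of the subsequence is needed, because each row of the table depends only on the row above it. The following is a minimal sketch of that idea, not the code used in this assignment; the class name LcsTwoRows and the method name lcsLength are hypothetical.

```java
// Hypothetical helper for illustration; not part of the original submission.
class LcsTwoRows {
    // Returns the length of the longest common subsequence of a and b,
    // keeping only two rows of the DP table (O(b.length()) extra space).
    static int lcsLength(String a, String b) {
        int[] prev = new int[b.length() + 1]; // row i-1
        int[] curr = new int[b.length() + 1]; // row i
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    curr[j] = prev[j - 1] + 1;
                } else {
                    curr[j] = Math.max(prev[j], curr[j - 1]);
                }
            }
            // The row just filled becomes the "previous" row of the next pass.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

Recovering the subsequence itself (not just its length) by walking back through the table still needs the full 2-D matrix.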
Data Structures Used:
In the 1st part I have used the Array/String data structures followed by hash maps.
The built-in functions of both strings and hash maps were very useful.
In the 2nd part I have used the Array data structure, which was very useful in
creating the 2-D matrix for the LCS table.
Figure: 2-D LCS matrix schematic diagram.
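To make the shape of the 2-D LCS matrix concrete, the sketch below fills the table for two short strings and renders it as text. It is an illustration under assumed names (LcsTableDemo, buildTable, render), not the author's diagram.

```java
// Hypothetical demo class; the names are assumptions made for this sketch.
class LcsTableDemo {
    // Fills the (x.length()+1) by (y.length()+1) LCS table.
    // Row 0 and column 0 stay 0 because Java zero-initializes int arrays.
    static int[][] buildTable(String x, String y) {
        int[][] dp = new int[x.length() + 1][y.length() + 1];
        for (int i = 1; i <= x.length(); i++) {
            for (int j = 1; j <= y.length(); j++) {
                if (x.charAt(i - 1) == y.charAt(j - 1)) {
                    dp[i][j] = 1 + dp[i - 1][j - 1];
                } else {
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }
        return dp;
    }

    // Renders the table one row per line, values separated by spaces.
    static String render(int[][] dp) {
        StringBuilder sb = new StringBuilder();
        for (int[] row : dp) {
            for (int j = 0; j < row.length; j++) {
                if (j > 0) sb.append(' ');
                sb.append(row[j]);
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(buildTable("ACB", "AB")));
    }
}
```

The bottom-right cell of the table holds the length of the LCS of the two whole strings.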
In the 3rd part the main data structures used are a tree, a priority queue and
hash maps.
Figure: An example of a Huffman tree.
Figure: An example of the priority queue used in encoding the characters.
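The role of the priority queue can be sketched in a few lines: symbols are inserted with their frequencies, and the queue always hands back the least frequent one first. This is only an illustration; the class name MinQueueDemo, the method extractionOrder and the sample frequencies are assumptions.

```java
import java.util.AbstractMap;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical demo; not part of the original submission.
class MinQueueDemo {
    // Inserts every (symbol, frequency) pair into a min-priority queue and
    // returns the symbols in the order in which the queue releases them.
    static String extractionOrder(Map<Character, Integer> freq) {
        PriorityQueue<Map.Entry<Character, Integer>> q =
            new PriorityQueue<Map.Entry<Character, Integer>>(
                (p, r) -> Integer.compare(p.getValue(), r.getValue()));
        for (Map.Entry<Character, Integer> e : freq.entrySet()) {
            q.add(new AbstractMap.SimpleEntry<Character, Integer>(e.getKey(), e.getValue()));
        }
        StringBuilder order = new StringBuilder();
        while (!q.isEmpty()) {
            order.append(q.poll().getKey()); // smallest frequency comes out first
        }
        return order.toString();
    }
}
```

With distinct frequencies the order is fully determined; for equal frequencies java.util.PriorityQueue makes no promise about which entry is released first.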
Algorithms/Pseudocode
1. Pattern_Search and hash maps:

   1. Split both strings into lists of words.
   2. Use a for-each loop over the first list to collect the words that also occur in the second.
   3. Create a hash map from each common word to the number of times it occurs.
   4. Copy the stored words into another String list (cand) and sort it by decreasing frequency, breaking ties alphabetically.
   5. Use the subList function to return the top 3 most frequent words.
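The five steps above can be sketched compactly as follows. This is a simplified variant for illustration, assuming words are separated by single spaces; the names CommonWords and topCommon are hypothetical and the hash set for the lookup is a choice made for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; a sketch of the steps, not the submitted class.
class CommonWords {
    // Returns up to k words that occur in both strings, most frequent first
    // (frequency counted in the first string, ties broken alphabetically).
    static List<String> topCommon(String first, String second, int k) {
        Set<String> inSecond = new HashSet<>(Arrays.asList(second.split(" ")));
        Map<String, Integer> freq = new HashMap<>();
        for (String w : first.split(" ")) {
            if (inSecond.contains(w)) {
                freq.merge(w, 1, Integer::sum); // count each common word
            }
        }
        List<String> cand = new ArrayList<>(freq.keySet());
        cand.sort((w1, w2) -> {
            int byCount = Integer.compare(freq.get(w2), freq.get(w1));
            return byCount != 0 ? byCount : w1.compareTo(w2);
        });
        // Copy the prefix so the result does not depend on the backing list.
        return new ArrayList<>(cand.subList(0, Math.min(k, cand.size())));
    }
}
```

For example, topCommon("red fox red dog the fox red", "the red fox", 3) yields the words red, fox and the, in that order.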
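2. Longest Common Subsequence (LCS): (Pseudocode)
Let X and Y be the two given sequences
Initialize a table LCS of dimension (X.length + 1) * (Y.length + 1)
Label the rows with X and the columns with Y
LCS[0][j] = 0 for every j
LCS[i][0] = 0 for every i
Start from LCS[1][1]
For i = 1 to X.length
    For j = 1 to Y.length
        Compare X[i] and Y[j]
        If X[i] = Y[j]
            LCS[i][j] = 1 + LCS[i-1][j-1]
            Point an arrow from LCS[i][j] to LCS[i-1][j-1]
        Else
            LCS[i][j] = max(LCS[i-1][j], LCS[i][j-1])
            Point an arrow from LCS[i][j] to the larger of LCS[i-1][j] and LCS[i][j-1]
Follow the arrows back from LCS[X.length][Y.length]; each diagonal arrow contributes one character of the LCS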
3. Huffman Coding:
1. Input: n symbols (messages) with their frequency counts.
2. Output: Huffman merge tree.
3. Begin
4. Let Q be the priority queue,
5. Q = {initialize priority queue with the frequencies of all symbols or messages}
6. Repeat n-1 times
7.     Create a new node Z
8.     X = extract_min(Q)
9.     Y = extract_min(Q)
10.    Left(Z) = X; Right(Z) = Y
11.    Frequency(Z) = Frequency(X) + Frequency(Y)
12.    Insert(Z, Q)
13. End repeat
14. Return (extract_min(Q))
15. End.
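As a compact illustration of the pseudocode above, the sketch below builds the merge tree and returns a table of codes instead of printing them. It is one possible rendering, not the listing used in this assignment; HuffmanSketch, Node, buildCodes and assign are hypothetical names, and giving a lone symbol the code "0" is an assumption of this sketch.

```java
import java.util.Map;
import java.util.PriorityQueue;
import java.util.TreeMap;

// Hypothetical sketch of the Huffman merge loop; not the submitted class.
class HuffmanSketch {
    static class Node {
        final char symbol;
        final int freq;
        final Node left, right;
        Node(char symbol, int freq, Node left, Node right) {
            this.symbol = symbol;
            this.freq = freq;
            this.left = left;
            this.right = right;
        }
        boolean isLeaf() { return left == null && right == null; }
    }

    // Builds the tree with a min-priority queue and returns symbol -> code.
    static Map<Character, String> buildCodes(Map<Character, Integer> freq) {
        PriorityQueue<Node> q =
            new PriorityQueue<Node>((a, b) -> Integer.compare(a.freq, b.freq));
        for (Map.Entry<Character, Integer> e : freq.entrySet()) {
            q.add(new Node(e.getKey(), e.getValue(), null, null));
        }
        while (q.size() > 1) {          // runs n-1 times for n symbols
            Node x = q.poll();          // first minimum becomes the left child
            Node y = q.poll();          // second minimum becomes the right child
            q.add(new Node('-', x.freq + y.freq, x, y));
        }
        Map<Character, String> codes = new TreeMap<>();
        if (!q.isEmpty()) {
            assign(q.poll(), "", codes);
        }
        return codes;
    }

    // Left edges append "0", right edges append "1".
    private static void assign(Node n, String prefix, Map<Character, String> codes) {
        if (n.isLeaf()) {
            codes.put(n.symbol, prefix.isEmpty() ? "0" : prefix);
            return;
        }
        assign(n.left, prefix + "0", codes);
        assign(n.right, prefix + "1", codes);
    }
}
```

More frequent symbols end up closer to the root and therefore receive shorter codes.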
CODES:
Since the 1st and 2nd parts were done in the same class, I am attaching one code listing for both.
There are explanatory comments in some places as well.
Programming language used: Java; IDE: Eclipse.
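import java.util.*;
public class hg {
    // Returns up to three words common to both strings, most frequent first.
    public static List<String> Pattern_Search(String first, String second)
    {
        // A hash set gives constant-time membership tests for the words of the second string.
        Set<String> arr_second = new HashSet<>(Arrays.asList(second.split(" ")));
        List<String> list = new ArrayList<>();
        for (String word : first.split(" ")) {
            if (arr_second.contains(word))
            {
                list.add(word);
            }
        }
        if (list.isEmpty()) {
            return new ArrayList<>();
        }
        // Count how often each common word occurs.
        Map<String, Integer> map = new HashMap<>();
        for (String s : list) {
            map.put(s, map.getOrDefault(s, 0) + 1);
        }
        List<String> cand = new ArrayList<>(map.keySet());
        // Sort by decreasing frequency, ties alphabetically.
        // intValue() avoids comparing Integer objects by reference.
        Collections.sort(cand, (w1, w2) -> map.get(w1).intValue() != map.get(w2).intValue() ?
            map.get(w2) - map.get(w1) : w1.compareTo(w2));
        List<String> top = cand.subList(0, Math.min(3, cand.size()));
        // St2 and St3 were not defined in the original listing; they are assumed
        // here to be the LCS of the remaining pairs of the top three words.
        if (top.size() == 3) {
            String St1 = LCS(top.subList(0, 1), top.subList(1, 2));
            String St2 = LCS(top.subList(1, 2), top.subList(2, 3));
            String St3 = LCS(top.subList(0, 1), top.subList(2, 3));
            // Print the longest of the three pairwise LCS strings.
            String longest = St1;
            if (St2.length() > longest.length()) longest = St2;
            if (St3.length() > longest.length()) longest = St3;
            System.out.println(longest);
        }
        return top; // the (up to) 3 words that are most frequent in the two strings
    }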
    // Returns the longest common subsequence of the two words, character by character.
    public static String LCS(List<String> str1, List<String> str2)
    {
        // String.join avoids the "[" and "]" that List.toString() would add to the text.
        String s1 = String.join("", str1);
        String s2 = String.join("", str2);
        int l1 = s1.length();
        int l2 = s2.length();
        // Making LCS matrix. Java zero-initializes the array,
        // so row 0 and column 0 already hold 0.
        int[][] dp = new int[l1 + 1][l2 + 1];
        for (int i = 1; i <= l1; i++)
        {
            for (int j = 1; j <= l2; j++)
            {
                if (s1.charAt(i - 1) == s2.charAt(j - 1))
                {
                    dp[i][j] = 1 + dp[i - 1][j - 1];
                }
                else
                {
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }
        // Storing LCS: walk back from the bottom-right corner of the matrix.
        StringBuilder lcs = new StringBuilder();
        int i = l1, j = l2;
        while (i > 0 && j > 0)
        {
            // If the current characters in both strings are the same,
            // the character is part of the LCS.
            if (s1.charAt(i - 1) == s2.charAt(j - 1))
            {
                lcs.append(s1.charAt(i - 1));
                i--;
                j--;
            }
            // Characters differ and the larger value is above: move upwards.
            else if (dp[i - 1][j] > dp[i][j - 1])
            {
                i--;
            }
            // Characters differ and the value to the left is at least as large: move leftwards.
            else
            {
                j--;
            }
        }
        // The characters were collected back to front, so reverse them.
        return lcs.reverse().toString();
    }
    public static void main(String args[])
    {
        // The Scanner must be created before it is used to read the input.
        Scanner sc = new Scanner(System.in);
        System.out.println("Enter sentence 1");
        String first = sc.nextLine();
        System.out.println("Enter sentence 2");
        String second = sc.nextLine();
        // Pattern_Search is static, so no hg object is needed to call it.
        System.out.println(Pattern_Search(first, second));
    }
}
Taking two strings as input, the program prints two results:
The first line printed is the output of the 2nd part (the LCS) and the second, a List<String>, is the output of the 1st part (the most frequent common words).
In the sample run the three most common words are thef, fox and red, and the LCS of
these strings taken pairwise is ef.
This [ef] is passed to the Huffman class, whose code is given below.
CODE:
// Separate listing; java.util provides Map, HashMap, Scanner, PriorityQueue and Comparator.
import java.util.*;
class Huffman {
    // Frequency of every character of the word to be encoded.
    static Map<String, Integer> Chars = new HashMap<>();
    // Counts how often each character occurs in word.
    public static void huff(String word)
    {
        // An empty delimiter makes the Scanner return one character at a time.
        Scanner sc = new Scanner(word).useDelimiter("");
        while (sc.hasNext()) {
            String c = sc.next();
            // adds the character with count 1 if it is new,
            // otherwise increments its existing count.
            Chars.put(c, Chars.getOrDefault(c, 0) + 1);
        }
    }
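    // Recursive function to print the Huffman code through the tree traversal.
    // Here s is the Huffman code generated so far.
    public static void printCode(HuffmanNode root, String s)
    {
        // An empty tree has nothing to print.
        if (root == null) {
            return;
        }
        // base case; if the left and right are null
        // then it is a leaf node and we print
        // the code s generated by traversing the tree.
        if (root.left == null && root.right == null) {
            // c is the character in the node; only letters are printed.
            if (Character.isLetter(root.c)) {
                // a tree consisting of a single leaf gets the code "0".
                System.out.println(root.c + ":" + (s.isEmpty() ? "0" : s));
            }
            return;
        }
        // if we go to the left then add "0" to the code.
        // if we go to the right then add "1" to the code.
        // recursive calls for the left and
        // right sub-trees of the generated tree.
        printCode(root.left, s + "0");
        printCode(root.right, s + "1");
    }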
    // main function: builds the Huffman tree from the counts in Chars and prints the codes.
    public static void xyz()
    {
        // creating a priority queue q.
        // makes a min-priority queue (min-heap).
        PriorityQueue<HuffmanNode> q
            = new PriorityQueue<HuffmanNode>(100, new MyComparator());
        for (String x : Chars.keySet()) {
            // creating a Huffman node object
            // and adding it to the priority queue.
            HuffmanNode hn = new HuffmanNode();
            hn.c = x.charAt(0);
            hn.data = Chars.get(x);
            hn.left = null;
            hn.right = null;
            // the add function adds
            // the Huffman node to the queue.
            q.add(hn);
        }
        // create a root node
        HuffmanNode root = null;
        // Here we extract the two minimum values
        // from the heap each time until
        // its size reduces to 1.
        while (q.size() > 1) {
            // first min extract.
            HuffmanNode x = q.poll();
            // second min extract.
            HuffmanNode y = q.poll();
            // new node f whose frequency is equal
            // to the sum of the frequencies of the two nodes.
            HuffmanNode f = new HuffmanNode();
            f.data = x.data + y.data;
            f.c = '-';
            // first extracted node as the left child.
            f.left = x;
            // second extracted node as the right child.
            f.right = y;
            // marking the f node as the root node.
            root = f;
            // add this node to the priority queue.
            q.add(f);
        }
        // with a single distinct character the loop never runs,
        // so the only leaf in the queue becomes the root.
        if (root == null) {
            root = q.poll();
        }
        // print the codes by traversing the tree
        printCode(root, "");
    }
}
// node class is the basic structure
// of each node present in the Huffman tree.
class HuffmanNode {
    int data;           // frequency of the character (or of the merged sub-tree)
    char c;             // the character; '-' for internal nodes
    HuffmanNode left;
    HuffmanNode right;
}
// comparator class helps to compare the nodes
// on the basis of one of their attributes.
// Here the nodes are compared
// on the basis of their data values.
class MyComparator implements Comparator<HuffmanNode> {
    public int compare(HuffmanNode x, HuffmanNode y)
    {
        // Integer.compare avoids the overflow that x.data - y.data can cause.
        return Integer.compare(x.data, y.data);
    }
}
OUTPUT:
The longest common subsequence is encoded as 01 using Huffman encoding.
