
91.102: Honors Computing II Data Structures

Willie Boag
Spring 2013 May 10, 2013

Table of Contents
Kruskal's MST
Abstract, Description Algorithm, Challenges Testing C Implementation

Simple Line Editor


Abstract, Description Drawbacks, Reflection C Implementation


Topological Sort
Abstract, Description Algorithm, Testing Reflection C Implementation


Bloom Filters
Abstract, Description Benefits, Drawbacks Implementation Choices, Motivation Results C Implementation


Fast Fourier Transform


Abstract Fourier Transform Original Algorithm, Fast Fourier Transform A Taste of Recursion Why It Matters, Conclusion C Implementation


Appendices
A: Kruskal's MST Testing Code
B: Topological Sort Testing Code


Kruskal's MST
Abstract: A minimum spanning tree (MST) of a connected, weighted graph is a subgraph that connects all of the vertices of the original graph while keeping the sum of its edge weights as small as possible. Many algorithms have been developed to find an MST efficiently. Kruskal's algorithm is one of them, and it is an example of a greedy algorithm.

Description: A graph is a set of vertices and the edges that connect them. The edges can be either one-directional (directed) or two-directional (undirected). Often, every edge also has an associated weight. For instance, if you represent a road map as a graph, then the cities would be vertices, the roads would be edges, and the time it takes to drive from one city to another would be the weight of the edge that joins them. It is often useful to find the minimum spanning tree of a given graph. An MST contains no cycles, which means that it leaves out some of the edges of the original graph. One necessary condition for an MST to exist is that the graph must be connected. Since an MST can only be formed by removing edges, it would be impossible to connect every vertex if the graph starts out disconnected. This issue had to be addressed when I was generating test cases for my program.

Algorithm: Kruskal's algorithm is greedy. A greedy algorithm is one that decides what action to take based on the best immediate choice. This MST algorithm is greedy because it decides which edge to consider next by selecting the remaining edge of lowest weight. At any point in the process, the graph that stores the answer is a forest (a collection of trees that are not connected to one another). At each iteration, the next minimum-weight edge is selected, and a decision is made as to whether to add it to the forest: if the edge joins two trees into one larger tree, it is added, but if it would create a cycle within a current tree, it is discarded. Once the forest has been unified into one tree, the MST has been found.

Challenges: While writing this program, the two most challenging functions to write were KruskalMST() and set_union(). Because we were given partially completed code to start from, some of the implementation choices were made for us. As a result, I found that Collection (the array of sets) was a little clumsy to work with. I think I would have preferred a set of sets, because removing an element was less intuitive when using the array. Replacing two sets with their union created holes in the array, which seemed more awkward than it needed to be. The biggest problem that I had during this assignment was freeing all of my allocated space. I probably spent three times as long trying to find all of my un-freed space as I did actually writing the program. I knew that I had un-freed space because I ran my program under the command valgrind --leak-check=full --track-origins=yes -v, which monitored my allocated memory and reported how many allocations were still unfreed when my program terminated. This command was shown to me last semester, although I do not know much about how it actually works. That being said, it was very helpful! After a few days of searching for my memory leaks, I finally fixed them all. The most subtle allocation bug came from set_union(): for any element that was in both sets S1 and S2, one copy of the data was stored in the union while the other copy was forgotten. I could have fixed the problem by freeing the extra copy inside set_union(), but I did not want to modify the arguments. Ultimately, I decided to copy all of the data into the new set rather than just passing pointers. This made freeing the data much more straightforward.

Testing: To test my MST program, I needed lots of test graphs, so I wrote a program that generated random graphs. Unfortunately, my first attempt produced graphs that were not always connected. Since an MST requires a connected graph, that approach failed. The solution, however, was very simple. After generating a random graph, I added an edge with a very large weight from vertex 0 to every other vertex. The algorithm still functions normally, but the back-up edges connect everything to vertex 0 if need be. This ensured that my graphs were always connected.

C Implementation:

Makefile: Makefile

Header Files: globals.h, graph.h, heap.h, queue.h, queue_interface.h, set.h, setinterface.h

Source Files: main.c, globals.c, graph.c, heap.c, queue_interface.c, set.c, setinterface.c

#
# Programmer: Willie Boag
#
# Makefile for Kruskal's Minimum Spanning Tree
#
mst: main.o globals.o graph.o heap.o queue_interface.o set.o setinterface.o
	gcc -g -o mst main.o globals.o graph.o heap.o queue_interface.o set.o setinterface.o

main.o: main.c graph.h globals.h
	gcc -g -ansi -pedantic -Wall -c main.c

globals.o: globals.c globals.h
	gcc -g -ansi -pedantic -Wall -c globals.c

graph.o: graph.c graph.h queue.h set.h queue_interface.h setinterface.h globals.h
	gcc -g -ansi -pedantic -Wall -c graph.c

heap.o: heap.c heap.h globals.h
	gcc -g -ansi -pedantic -Wall -c heap.c

queue_interface.o: queue_interface.c queue_interface.h queue.h graph.h globals.h
	gcc -g -ansi -pedantic -Wall -c queue_interface.c

set.o: set.c set.h globals.h
	gcc -g -ansi -pedantic -Wall -c set.c

setinterface.o: setinterface.c setinterface.h set.h globals.h
	gcc -g -ansi -pedantic -Wall -c setinterface.c

clean:
	rm -f *.o

/************************************************************/
/* Programmer: Willie Boag                                  */
/*                                                          */
/* globals.h (Kruskal's MST)                                */
/************************************************************/
#ifndef _globals
#define _globals

#define DATA( L ) ( ( L ) -> datapointer )
#define NEXT( L ) ( ( L ) -> next )

typedef enum { OK, ERROR } status ;
typedef enum { FALSE=0 , TRUE=1 } bool ;
typedef void *generic_ptr ;

extern int compare_vertex( generic_ptr *a, generic_ptr *b ) ;

#endif

/************************************************************/
/* Programmer: Willie Boag                                  */
/*                                                          */
/* graph.h (Kruskal's MST)                                  */
/************************************************************/
#ifndef _graph
#define _graph

#include "globals.h"

typedef int vertex ;
typedef struct { int weight ; int from ; vertex vertex_number ; } edge ;

#define UNUSED_WEIGHT (32767)
#define WEIGHT(p_e) ((p_e) -> weight )
#define VERTEX(p_e) ((p_e) -> vertex_number )

typedef enum { directed, undirected } graph_type ;
typedef struct { graph_type type ; int number_of_vertices; edge **matrix ; } graph_header, *graph ;

extern status init_graph( graph *p_G, int vertex_cnt, graph_type type ) ;
extern void destroy_graph( graph *p_G ) ;
extern status add_edge( graph G, vertex vertex1, vertex vertex2, int weight );
extern void graph_size( graph G, int *p_vertex_cnt, int *p_edge_cnt );
extern status KruskalMST( graph G, graph *T ) ;
extern status write_graph( graph G ) ;
extern int min_weight( graph G ) ;

#endif

/************************************************************/
/* Programmer: Willie Boag                                  */
/*                                                          */
/* heap.h (Kruskal's MST)                                   */
/************************************************************/
#ifndef _heap
#define _heap

#define HEAPINCREMENT 128

#include "globals.h"

typedef struct { generic_ptr *base; int nextelement; int heapsize ; } heap ;

extern status init_heap( heap *p_H ) ;
extern bool empty_heap( heap *p_H ) ;
extern status heap_insert( heap *p_H , generic_ptr data , int (*p_cmp_f) () ) ;
extern status heap_delete( heap *p_H, int element, generic_ptr *p_data, int (*p_cmp_f)() ) ;

#endif

/************************************************************/
/* Programmer: Willie Boag                                  */
/*                                                          */
/* queue.h (Kruskal's MST)                                  */
/************************************************************/
#ifndef _queue
#define _queue

#include "heap.h"

typedef heap queue ;

#define init_queue( p_Q ) init_heap( (heap *) p_Q)
#define empty_queue( p_Q) empty_heap( (heap *) p_Q)
#define qadd(p_Q, data, p_cmp_f) heap_insert( (heap *) p_Q, data, p_cmp_f )
#define qremove(p_Q, p_data, p_cmp_f) heap_delete( (heap *) p_Q, 0, p_data, p_cmp_f)

#endif

/************************************************************/
/* Programmer: Willie Boag                                  */
/*                                                          */
/* queue_interface.h (Kruskal's MST)                        */
/************************************************************/
#ifndef _queueinterface
#define _queueinterface

#include "queue.h"

extern status qadd_edge(queue *p_Q , int from, int to, int weight, int (*p_cmp_func)() ) ;
extern status qremove_edge(queue *p_Q, int *from, int *to, int *weight, int (*p_cmp_func)() ) ;

#endif

/************************************************************/
/* Programmer: Willie Boag                                  */
/*                                                          */
/* set.h (Kruskal's MST)                                    */
/************************************************************/
#ifndef _set
#define _set

#include "globals.h"

typedef struct { generic_ptr *base ; generic_ptr *free ; int universe_size ; } set ;

extern status set_insert( set *p_S, generic_ptr element, int (*p_cmp_f)() ) ;
extern status init_set( set *p_S, int size ) ;
extern bool set_member( set *p_S, generic_ptr element, int (*p_cmp_f)() ) ;
extern status set_write( set *p_S, status (*p_write_f)() ) ;
extern status set_union( set *p_S1, set *p_S2, set *p_S3, int (*p_func_cmp)(), int sizeofelement ) ;

#endif

/************************************************************/
/* Programmer: Willie Boag                                  */
/*                                                          */
/* setinterface.h (Kruskal's MST)                           */
/************************************************************/
#ifndef _setinterface
#define _setinterface

#include "globals.h"
#include "set.h"

extern status vertex_set_insert( set *p_S , int v ) ;

#endif

/************************************************************/
/* Programmer: Willie Boag                                  */
/*                                                          */
/* main.c (Kruskal's MST)                                   */
/************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include "graph.h"
#include "globals.h"

int main( int argc, char *argv[] ) {
    FILE *fileptr ;
    int weight, from, to, numberofvertices ;
    graph G, T ;

    /* The input file holds the vertex count, then "from to weight" triples. */
    if ( argc < 2 || (fileptr = fopen( argv[1], "r")) == NULL ) {
        fprintf( stderr, "usage: mst graphfile\n" ) ;
        return 1 ;
    }
    if ( fscanf( fileptr, "%d", &numberofvertices ) != 1 ) {
        fclose( fileptr ) ;
        return 1 ;
    }

    init_graph( &G, numberofvertices, undirected );
    init_graph( &T, numberofvertices, undirected );

    /* Stop at end of file or at the first malformed triple. */
    while ( fscanf( fileptr, "%d %d %d", &from, &to, &weight ) == 3 )
        add_edge( G, from, to, weight ) ;

    printf("\nThe edges of the original graph are: \n" ) ;
    write_graph( G ) ;

    KruskalMST( G , &T ) ;

    printf("\nThe edges of the MST are: \n" ) ;
    write_graph( T ) ;

    printf("\n The minimum total weight is %d.\n ", min_weight( T ) ) ;

    destroy_graph( &G ) ;
    destroy_graph( &T ) ;
    fclose( fileptr ) ;
    return 0 ;
}

/************************************************************/
/* Programmer: Willie Boag                                  */
/*                                                          */
/* globals.c (Kruskal's MST)                                */
/************************************************************/
#include "globals.h"

extern int compare_vertex( generic_ptr *a, generic_ptr *b ) {
    /* Both arguments point to ints; the sign of the difference orders them. */
    return *(int *)a - *(int *)b ;
}

/************************************************************/
/* Programmer: Willie Boag                                  */
/*                                                          */
/* graph.c (Kruskal's MST)                                  */
/************************************************************/
#include "globals.h"
#include "queue.h"
#include "graph.h"
#include "set.h"
#include "queue_interface.h"
#include "setinterface.h"
#include <stdlib.h>
#include <stdio.h>

extern status init_graph( graph *p_G, int vertex_cnt, graph_type type ) {
    graph G ;
    int i, j ;

    G = (graph) malloc ( sizeof(graph_header)) ;
    if ( G == NULL ) return ERROR ;
    G->number_of_vertices = vertex_cnt ;
    G->type = type ;
    G->matrix = (edge **) malloc ( vertex_cnt * sizeof(edge *));
    if ( G->matrix == NULL ) { free(G) ; return ERROR ; }

    /* The matrix is one contiguous block; row i starts at offset vertex_cnt * i. */
    G->matrix[0] = (edge *) malloc(vertex_cnt*vertex_cnt*sizeof(edge)) ;
    if (G->matrix[0] == NULL ) { free ( G->matrix ); free( G ); return ERROR ; }
    for ( i = 1 ; i < vertex_cnt; i++ )
        G->matrix[i] = G->matrix[0] + vertex_cnt * i ;

    for ( i = 0 ; i < vertex_cnt ; i++ )
        for ( j = 0 ; j < vertex_cnt ; j++ ) {
            G->matrix[i][j].weight = UNUSED_WEIGHT ;
            G->matrix[i][j].vertex_number = j ;
            G->matrix[i][j].from = i ;
        }

    *p_G = G ;
    return OK ;
}

extern void destroy_graph( graph *p_G ) {
    free((*p_G)->matrix[0] ) ;
    free((*p_G)->matrix ) ;
    free(*p_G) ;
    *p_G = NULL ;    /* clear the caller's pointer, not the local parameter */
}

extern status add_edge ( graph G, vertex vertex1, vertex vertex2, int weight ) {
    /* Valid vertices are 0 .. number_of_vertices - 1. */
    if ( vertex1 < 0 || vertex1 >= G->number_of_vertices ) return ERROR ;
    if ( vertex2 < 0 || vertex2 >= G->number_of_vertices ) return ERROR ;
    if ( weight <= 0 || weight >= UNUSED_WEIGHT ) return ERROR ;

    G->matrix[vertex1][vertex2].weight = weight ;
    if( G->type == undirected )
        G->matrix[vertex2][vertex1].weight = weight ;
    return OK ;
}

extern void graph_size ( graph G, int *p_vertex_cnt , int *p_edge_cnt ) {
    int i, j, edges ;

    *p_vertex_cnt = G->number_of_vertices ;
    edges = 0 ;
    for ( i = 0 ; i < G->number_of_vertices ; i++ )
        for ( j = 0 ; j < G->number_of_vertices ; j++ )
            if ( G->matrix[i][j].weight != UNUSED_WEIGHT ) edges++ ;
    /* An undirected edge is stored in both directions. */
    if ( G->type == undirected ) edges /= 2 ;
    *p_edge_cnt = edges ;
}

extern edge *edge_iterator( graph G, vertex vertex_number, edge *p_last_return ) {
    /*
     * Return the next edge leaving vertex_number after p_last_return,
     * or the first one if p_last_return is NULL. Return NULL when done.
     */
    vertex other_vertex ;

    if (vertex_number < 0 || vertex_number >= G->number_of_vertices) return NULL ;
    if (p_last_return == NULL)
        other_vertex = 0 ;
    else
        other_vertex = VERTEX(p_last_return) + 1 ;

    for ( ; other_vertex < G->number_of_vertices ; other_vertex++) {
        if (G->matrix[vertex_number][other_vertex].weight != UNUSED_WEIGHT)
            return &G->matrix[vertex_number][other_vertex] ;
    }
    return NULL ;
}

static int What_Set_Am_I_In( set *Collection, int v, int n ) {
    /* Return the index of the set that contains v, or -1 if there is none. */
    int i ;

    for ( i = 0 ; i < n ; i++ ) {
        if ( Collection[i].base == NULL ) continue ;
        if ( set_member( &Collection[i], (generic_ptr) &v, compare_vertex ) == TRUE ) break ;
    }
    if ( i == n ) return -1 ;
    return i ;
}

static int collection_size( set *Collection, int numberofvertices ) {
    /* Count the sets that have not been merged away. */
    int i, size = 0 ;

    for( i = 0 ; i < numberofvertices ; i++ )
        if ( Collection[i].base != NULL ) size++ ;
    return size ;
}

static int compare_weight( generic_ptr a, generic_ptr b ) {
    return WEIGHT((edge *) a) - WEIGHT((edge *) b) ;
}

extern status KruskalMST( graph G, graph *T ) {
    int i, S1, S2, numberofvertices, numberofedges ;
    int from, to, weight ;
    status rc = OK ;
    queue Q ;
    set *Collection, S3 ;
    edge *p_edge ;
    generic_ptr *item ;

    graph_size( G, &numberofvertices, &numberofedges ) ;

    /* Special case: graph has only one vertex. */
    if (numberofvertices == 1) return OK ;

    /* Construct a priority queue Q containing all edges. */
    init_queue( &Q ) ;
    for (i = 0 ; i < numberofvertices ; i++) {
        p_edge = NULL ;
        while ( (p_edge = edge_iterator( G, i, p_edge)) != NULL)
            qadd_edge( &Q, p_edge->from, VERTEX(p_edge), WEIGHT(p_edge), compare_weight ) ;
    }

    /* Create an array of sets, one singleton set per vertex. */
    Collection = (set *) malloc( sizeof(set) * numberofvertices ) ;
    for (i = 0 ; i < numberofvertices ; i++) {
        init_set( &Collection[i], 1 ) ;
        vertex_set_insert( &Collection[i], i ) ;
    }

    /* While the tree is not fully formed. */
    while ( collection_size(Collection, numberofvertices) > 1 ) {
        /* If the queue runs out first, the graph is not connected. */
        if ( qremove_edge( &Q, &from, &to, &weight, compare_weight ) == ERROR ) {
            rc = ERROR ;
            break ;
        }
        S1 = What_Set_Am_I_In( Collection, from, numberofvertices ) ;
        S2 = What_Set_Am_I_In( Collection, to, numberofvertices ) ;
        if ( S1 != S2 ) {
            init_set( &S3, numberofvertices ) ;
            set_union( &Collection[S1], &Collection[S2], &S3, compare_vertex, sizeof(vertex) ) ;

            /* Free data of S1 and S2. */
            for (item = Collection[S1].base ; item < Collection[S1].free ; item++)
                free( *item ) ;
            free( Collection[S1].base ) ;
            for (item = Collection[S2].base ; item < Collection[S2].free ; item++)
                free( *item ) ;
            free( Collection[S2].base ) ;

            Collection[S2].base = NULL ;
            Collection[S1] = S3 ;

            /* Record the endpoints of the accepted edge (not the set indices). */
            add_edge( *T, from, to, weight ) ;
        }
    }

    /* Free all reserved space. */
    while ( empty_queue( &Q ) == FALSE )
        qremove_edge( &Q, &from, &to, &weight, compare_weight ) ;
    free( Q.base ) ;
    for ( i = 0 ; i < numberofvertices ; i++) {
        if ( Collection[i].base == NULL ) continue ;
        for (item = Collection[i].base ; item < Collection[i].free ; item++)
            free( *item ) ;
        free( Collection[i].base ) ;
    }
    free( Collection ) ;
    return rc ;
}

extern status write_graph( graph G ) {
    int i, j, numberofvertices, numberofedges ;

    graph_size( G, &numberofvertices, &numberofedges );
    for ( i = 0 ; i < numberofvertices ; i++ ) {
        for ( j = 0 ; j < numberofvertices ; j++ ) {
            if( G->matrix[i][j].weight != UNUSED_WEIGHT ) {
                printf( "\n") ;
                printf( "%d ", G->matrix[i][j].from ) ;
                printf( "%d ", G->matrix[i][j].vertex_number ) ;
                printf( "%d ", G->matrix[i][j].weight ) ;
                printf( "\n" ) ;
            }
        }
    }
    return OK ;
}

extern int min_weight( graph G ) {
    int sum = 0, i, j, numberofvertices, numberofedges ;

    graph_size( G, &numberofvertices, &numberofedges );
    for ( i = 0 ; i < numberofvertices ; i++ ) {
        for ( j = 0 ; j < numberofvertices ; j++ ) {
            if( G->matrix[i][j].weight != UNUSED_WEIGHT )
                sum = sum + G->matrix[i][j].weight ;
        }
    }
    /* Each undirected edge was counted twice, once per direction. */
    return sum/2 ;
}

/************************************************************/
/* Programmer: Willie Boag                                  */
/*                                                          */
/* heap.c (Kruskal's MST)                                   */
/************************************************************/
#include "globals.h"
#include "heap.h"
#include <stdlib.h>

static void siftdown( heap *p_H, int parent, int (*p_cmp_f)() ) ;
static void siftup( heap *p_H, int element, int (*p_cmp_f)() );

extern status init_heap( heap *p_H) {
    p_H->base = (generic_ptr *) malloc(HEAPINCREMENT*sizeof(generic_ptr)) ;
    if ( p_H->base == NULL ) return ERROR ;
    p_H->nextelement = 0 ;
    p_H->heapsize = HEAPINCREMENT ;
    return OK;
}

extern bool empty_heap ( heap *p_H ) {
    return (p_H->nextelement == 0) ? TRUE : FALSE ;
}

extern status heap_insert( heap *p_H, generic_ptr data, int (*p_cmp_f)() ) {
    generic_ptr *newbase ;
    /*
     * Insert data into p_H, p_cmp_f() is a comparison function that
     * returns a value less than 0 if its first argument is less than
     * its second, 0 if the arguments are equal. Otherwise, p_cmp_f()
     * returns a value greater than 0.
     *
     * The data is inserted in the heap by placing it at the end and
     * using siftup() to find its proper position.
     */
    if (p_H->nextelement == p_H->heapsize) {
        /*
         * Not enough space in the array, so more must be allocated.
         */
        newbase = (generic_ptr *) realloc( p_H->base,
                      (p_H->heapsize + HEAPINCREMENT) * sizeof(generic_ptr)) ;
        if (newbase == NULL) return ERROR ;
        p_H->base = newbase ;
        p_H->heapsize += HEAPINCREMENT ;
    }
    p_H->base[p_H->nextelement] = data ;
    siftup( p_H, p_H->nextelement, p_cmp_f ) ;
    p_H->nextelement++ ;
    return OK ;
}

static void siftup( heap *p_H, int element, int (*p_cmp_f)() ) {
    /*
     * Move element toward the root while it is smaller than its parent.
     */
    int parent ;
    int cmp_result;
    generic_ptr tmpvalue ;

    if ( element == 0 ) return ;
    parent = (element - 1)/2;
    cmp_result = (*p_cmp_f)(p_H->base[element], p_H->base[parent] );
    if (cmp_result >= 0 ) return ;
    tmpvalue = p_H->base[element] ;
    p_H->base[element] = p_H->base[parent] ;
    p_H->base[parent] = tmpvalue ;
    siftup(p_H, parent, p_cmp_f );
    return;
}

extern status heap_delete( heap *p_H, int element, generic_ptr *p_data, int (*p_cmp_f)() ) {
    if ( element >= p_H->nextelement ) return ERROR ;
    *p_data = p_H->base[element] ;
    p_H->nextelement-- ;
    if ( element != p_H->nextelement ) {
        /* Fill the hole with the last item, then restore the heap order. */
        p_H->base[element] = p_H->base[p_H->nextelement] ;
        siftdown(p_H, element, p_cmp_f );
    }
    return OK ;
}

static void siftdown ( heap *p_H, int parent, int (*p_cmp_f)() ) {
    /*
     * p_H is a heap except for parent. Find the correct place for parent
     * by swapping it with the smaller of its children. If a swap is
     * made, p_H is a heap except for the child's position, so call
     * siftdown() recursively.
     */
    int leftchild, rightchild, swapelement ;
    int leftcmp, rightcmp, leftrightcmp ;
    generic_ptr tmpvalue ;

    leftchild = 2 * parent + 1 ;
    rightchild = leftchild + 1 ;
    if (leftchild >= p_H->nextelement)
        /*
         * No children.
         */
        return ;

    leftcmp = (*p_cmp_f)(p_H->base[parent], p_H->base[leftchild] ) ;
    if (rightchild >= p_H->nextelement) {
        /*
         * No right child.
         */
        if (leftcmp > 0) {
            tmpvalue = p_H->base[parent] ;
            p_H->base[parent] = p_H->base[leftchild] ;
            p_H->base[leftchild] = tmpvalue ;
        }
        return ;
    }

    rightcmp = (*p_cmp_f)( p_H->base[parent], p_H->base[rightchild] ) ;
    if (leftcmp > 0 || rightcmp > 0) {
        /*
         * Two children. Swap with smaller child.
         */
        leftrightcmp = (*p_cmp_f)( p_H->base[leftchild], p_H->base[rightchild] ) ;
        swapelement = (leftrightcmp < 0) ? leftchild : rightchild ;
        tmpvalue = p_H->base[parent] ;
        p_H->base[parent] = p_H->base[swapelement] ;
        p_H->base[swapelement] = tmpvalue ;
        siftdown( p_H, swapelement, p_cmp_f ) ;
    }
    return ;
}

/************************************************************/
/* Programmer: Willie Boag                                  */
/*                                                          */
/* queue_interface.c (Kruskal's MST)                        */
/************************************************************/
#include "queue_interface.h"
#include "globals.h"
#include "queue.h"
#include "graph.h"
#include <stdlib.h>

extern status qadd_edge( queue *p_Q , int from, int to, int weight, int (*p_cmp_func)() ) {
    /* Package the edge in heap-allocated storage and add it to the queue. */
    edge *p_edge = ( edge * ) malloc(sizeof( edge ) ) ;

    if ( p_edge == NULL ) return ERROR ;
    p_edge->from = from ;
    p_edge->vertex_number = to ;
    p_edge->weight = weight ;
    if (qadd(p_Q, (generic_ptr) p_edge, p_cmp_func) == ERROR) {
        free(p_edge) ;
        return ERROR ;
    }
    return OK ;
}

extern status qremove_edge( queue *p_Q, int *from, int *to, int *weight, int (*p_cmp_func)() ) {
    /* Remove the minimum edge, copy out its fields, and free its storage. */
    edge *p_edge ;

    if ( qremove( p_Q, (generic_ptr *) &p_edge, p_cmp_func ) == ERROR )
        return ERROR ;
    *from = p_edge->from ;
    *to = p_edge->vertex_number ;
    *weight = p_edge->weight ;
    free( p_edge ) ;
    return OK ;
}

/************************************************************/
/* Programmer: Willie Boag                                  */
/*                                                          */
/* set.c (Kruskal's MST)                                    */
/************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include "set.h"
#include "globals.h"

#define MAX(a , b) ( ( (a) > (b) ) ? (a) : (b) )
#define MINIMUM_INCREMENT 100
#define member_count(p_S) ( (int) ((p_S)->free - (p_S)->base) )

typedef char byte ;

static status memcopy( byte *to, byte *from, int count ) {
    /* Copy count bytes from one buffer to the other. */
    while ( count-- > 0 )
        *to++ = *from++ ;
    return OK ;
}

extern status init_set( set *p_S, int size ) {
    /*
     * Initialize a set of size elements. This set implementation
     * uses a dynamic array.
     */
    p_S->universe_size = MAX(size, MINIMUM_INCREMENT) ;
    p_S->base = (generic_ptr *) malloc( p_S->universe_size * sizeof(generic_ptr)) ;
    if (p_S->base == NULL) return ERROR ;
    p_S->free = p_S->base ;
    return OK ;
}

extern status set_insert( set *p_S , generic_ptr element, int (*p_cmp_f)() ) {
    generic_ptr *newset ;
    /*
     * Insert element into the set. The dynamic array should
     * grow if needed.
     */
    if ( set_member( p_S, element, p_cmp_f ) == TRUE ) return OK ;
    if ( p_S->universe_size == member_count(p_S) ) {
        newset = (generic_ptr *) realloc( p_S->base,
                     (p_S->universe_size + MINIMUM_INCREMENT) * sizeof(generic_ptr) ) ;
        if (newset == NULL) return ERROR ;
        p_S->base = newset ;
        /* The array was full, so the first free slot is at the old size. */
        p_S->free = p_S->base + p_S->universe_size ;
        p_S->universe_size += MINIMUM_INCREMENT ;
    }
    *p_S->free = element ;
    p_S->free++ ;
    return OK ;
}

extern bool set_member( set *p_S, generic_ptr element, int (*p_cmp_f)() ) {
    /*
     * Determine if element is in the set (using the passed comparison
     * function p_cmp_f()). Search the set sequentially.
     */
    generic_ptr *item ;

    for (item = p_S->base ; item < p_S->free ; item++)
        if ( (*p_cmp_f)(*item, element) == 0) return TRUE ;
    return FALSE ;
}

extern status set_union( set *p_S1, set *p_S2, set *p_S3, int (*p_cmp_f)(), int sizeofelement ) {
    /*
     * Store the union of sets *p_S1 and *p_S2 into the set *p_S3.
     * Every element is copied, so *p_S3 owns its own data.
     */
    generic_ptr *item, tmp ;

    for (item = p_S1->base ; item < p_S1->free ; item++) {
        tmp = malloc( sizeofelement ) ;
        if (tmp == NULL) return ERROR ;
        memcopy( (byte *) tmp, (byte *) *item, sizeofelement ) ;
        set_insert( p_S3, tmp, p_cmp_f) ;
    }
    for (item = p_S2->base ; item < p_S2->free ; item++) {
        /* Skip elements that were already copied from *p_S1. */
        if (set_member( p_S3, *item, p_cmp_f) == TRUE) continue ;
        tmp = malloc( sizeofelement ) ;
        if (tmp == NULL) return ERROR ;
        memcopy( (byte *) tmp, (byte *) *item, sizeofelement ) ;
        set_insert( p_S3, tmp, p_cmp_f) ;
    }
    return OK ;
}

extern status set_write( set *p_S, status (*p_write_f)( ) ) {
    generic_ptr *item ;

    for ( item = p_S->base ; item < p_S->free; item++ )
        (*p_write_f)(*item) ;
    return OK ;
}

/************************************************************/
/* Programmer: Willie Boag                                  */
/*                                                          */
/* setinterface.c (Kruskal's MST)                           */
/************************************************************/
#include "globals.h"
#include "setinterface.h"
#include "set.h"
#include <stdlib.h>

extern status vertex_set_insert( set *p_S , int v ) {
    /* Store a heap-allocated copy of v in the set. */
    int *p_int = ( int * ) malloc( sizeof( int ) ) ;

    if ( p_int == NULL ) return ERROR ;
    *p_int = v ;
    if ( set_insert( p_S, (generic_ptr) p_int, compare_vertex ) == ERROR ) {
        free( p_int ) ;
        return ERROR ;
    }
    return OK ;
}

Simple Line Editor


Abstract: This is a program that can edit text files. Rather than being a full screen editor, it interacts with the data one line at a time.

Description: The Simple Line Editor is a case study in Data Structures: An Advanced Approach Using C by Esakov and Weiss. It utilizes Doubly-Linked Lists. The operations that it can accomplish include insert, delete, print, cut & paste, save, and quit.

Drawbacks: This program cannot add lines to the end of a file. In addition, if you try to cut & paste lines of text to the front of the file, the lines are lost/deleted. Although it is not really a bug, the interface for the program is not user-friendly. The driver cannot process commands that have spaces, which makes the interface messier to deal with.

Reflection: Honestly, I did not like this application at all. I first started it in January, when I was going through all of Esakov & Weiss on my own. It was the fourth application that relied on some form of linked list, and it had a third set of primitives to copy (ordinary, circular, double). Since I had been going through the whole book in about two weeks, I found the program very boring, especially since I had just finished working on the LISP interpreter the day before. As a result, I began to dislike the Simple Line Editor. I decided to stop working my way through Esakov & Weiss and start working on other schoolwork. I revisited the code for this program in the first few days of May, and I still did not like it. I felt that it was neither a learning experience (unlike LISP, which was) nor an actually useful application. Finishing the code was not an exciting process. Fortunately, I did manage to have some fun with the code. There were many primitive functions not in the book, which I needed to write myself. They all dealt with traversing the list. I decided that I could make the program more fun by writing all of these functions recursively. That was my favorite part of the assignment.

C Implementation

Makefile: Makefile

Header Files: globals.h, dlists.h, interface.h, user.h

Source Files: main.c, dlists.c, interface.c, user.c

#
# Programmer: Willie Boag
#
# Makefile for Simple Line Editor
#
sle: main.o dlists.o user.o interface.o
	gcc -ansi -pedantic -Wall -o sle main.o dlists.o user.o interface.o

main.o: main.c dlists.h user.h
	gcc -ansi -pedantic -Wall -c -g main.c

dlists.o: dlists.c dlists.h
	gcc -ansi -pedantic -Wall -c -g dlists.c

user.o: user.c user.h dlists.h interface.h globals.h
	gcc -ansi -pedantic -Wall -c -g user.c

interface.o: interface.c interface.h dlists.h
	gcc -ansi -pedantic -Wall -c -g interface.c

clean:
	rm -f *.o

/************************************************************/
/* Programmer: Willie Boag                                  */
/*                                                          */
/* globals.h (Simple Line Editor)                           */
/************************************************************/
#ifndef _globals
#define _globals

#define DATA( L ) ( ( L ) -> datapointer )
#define NEXT( L ) ( ( L ) -> next )
#define PREV( L ) ( ( L ) -> previous )

typedef enum { OK, ERROR } status ;
typedef enum { FALSE=0, TRUE=1 } bool ;
typedef void *generic_ptr ;

#define E_IO     1
#define E_SPACE  2
#define E_LINES  3
#define E_BADCMD 4
#define E_DELETE 5
#define E_MOVE   6
#define MAXERROR 7

#define BUFSIZE 80

#endif

/************************************************************/
/* Programmer: Willie Boag                                  */
/*                                                          */
/* dlists.h (Simple Line Editor)                            */
/************************************************************/
#ifndef _dlists
#define _dlists

#include "globals.h"

typedef struct double_node double_node, *double_list;
struct double_node {
    generic_ptr datapointer;
    double_list previous;
    double_list next;
} ;

extern status allocate_double_node( double_list *p_L, generic_ptr data ) ;
extern void free_double_node( double_list *p_L ) ;
extern status init_double_list( double_list *p_L ) ;
extern bool empty_double_list( double_list L ) ;
extern status double_insert( double_list *p_L, generic_ptr data ) ;
extern status double_append( double_list *p_L, generic_ptr data) ;
extern status double_delete( double_list *p_L, generic_ptr *p_data ) ;
extern status double_delete_node( double_list *p_L, double_list node ) ;
extern void cut_list( double_list *p_L, double_list *p_start, double_list *p_end ) ;
extern void paste_list( double_list *p_target, double_list *p_source ) ;
extern int double_length( double_list L ) ;
extern double_list nth_double_node( double_list L, int n ) ;
extern status double_traverse( double_list L, status (*p_func_f)() ) ;
extern int double_node_number( double_list L ) ;
extern double_list nth_relative_double_node( double_list L, int n ) ;
extern void destroy_double_list( double_list *p_L, void (*p_func_f)() ) ;

#endif

/************************************************************/
/* Programmer: Willie Boag                                  */
/*                                                          */
/* interface.h (Simple Line Editor)                         */
/************************************************************/
#ifndef _interface
#define _interface

#include "dlists.h"

extern status string_double_append( double_list *p_L, char *buffer ) ;

#endif

/************************************************************/
/* Programmer: Willie Boag                                  */
/*                                                          */
/* user.h (Simple Line Editor)                              */
/************************************************************/
#ifndef _user
#define _user

#include "globals.h"
#include "dlists.h"

extern int readfile( char *filename, double_list *p_L );
extern int writefile( char *filename, double_list *p_L );

extern int insertlines( char *linespec, double_list *p_head, double_list *p_current ) ;
extern int deletelines( char *linespec, double_list *p_head, double_list *p_current ) ;
extern int movelines( char *linespec, double_list *p_head, double_list *p_current ) ;
extern int printlines( char *linespec, double_list *p_head, double_list *p_current ) ;

#endif

/************************************************************/
/* Programmer: Willie Boag                                  */
/*                                                          */
/* main.c (Simple Line Editor)                              */
/************************************************************/
#include "dlists.h"
#include "user.h"
#include <stdlib.h>
#include <ctype.h>
#include <string.h>
#include <stdio.h>

void printerror( int errnum ) ;

static bool read_line( char *buffer, int size ) {
    /*
     * Read one line from standard input into buffer, dropping the
     * trailing newline. Unlike gets(), this cannot overflow buffer.
     * Return FALSE at end of input.
     */
    if (fgets(buffer, size, stdin) == NULL) {
        buffer[0] = '\0';
        return FALSE;
    }
    buffer[strcspn(buffer, "\n")] = '\0';
    return TRUE;
}

int main( int argc, char *argv[] ) {
    /*
     * A simple text editor.
     */
    char filename[BUFSIZ];
    char buffer[BUFSIZ];
    double_list linelist, currentline;
    bool file_edited, exit_flag;
    int rc;

    init_double_list(&linelist);
    printf("Enter the name of the file to edit: ");
    read_line(filename, sizeof filename);
    if ((rc = readfile(filename, &linelist)) != 0) {
        printerror(rc);
        exit(1);
    }
    printf("%d lines read.\n", double_length(linelist));
    currentline = nth_double_node(linelist, 1);
    file_edited = FALSE;
    exit_flag = FALSE;

    while (exit_flag == FALSE) {
        printf("cmd: ");
        if (read_line(buffer, sizeof buffer) == FALSE)
            break;    /* end of input: leave the command loop */
        /*
         * Implement the following commands:
         *   p  print
         *   d  delete
         *   i  insert
         *   m  move
         *   w  write
         *   q  quit
         */
        switch (toupper(buffer[0])) {
        case '\0':
            break;
        case 'P':
            rc = printlines(&buffer[1], &linelist, &currentline);
            if (rc) printerror(rc);
            break;
        case 'D':
            file_edited = TRUE;
            rc = deletelines(&buffer[1], &linelist, &currentline);
            if (rc) printerror(rc);
            break;
        case 'I':
            file_edited = TRUE;
            rc = insertlines(&buffer[1], &linelist, &currentline);
            if (rc) printerror(rc);
            break;
        case 'M':
            file_edited = TRUE;
            rc = movelines(&buffer[1], &linelist, &currentline);
            if (rc) printerror(rc);
            break;
        case 'W':
            if (buffer[1] != '\0')
                strcpy(filename, &buffer[1]);
            rc = writefile(filename, &linelist);
            if (rc != 0)
                printerror(rc);
            else
                printf("%d lines written\n", double_length(linelist));
            file_edited = FALSE;
            break;
        case 'Q':
            /*
             * If text has been modified, can't quit without writing
             * unless you enter q two times in a row.
             */
            if (file_edited == TRUE) {
                printf("File modified. Enter W to save, Q to discard.\n");
                file_edited = FALSE;
            } else
                exit_flag = TRUE;
            break;
        default:
            printerror(E_BADCMD);
            break;
        }
    }
    return 0;
}

void printerror( int errnum ) {
    /*
     * Print error message to standard output. Error numbers start
     * at 1, so errnum - 1 indexes the message table.
     */
    static char *errmsg[] = {
        "io error",
        "out of memory space",
        "invalid line specification",
        "invalid command",
        "error deleting lines",
        "error moving lines"
    };

    if (errnum < 1 || errnum >= MAXERROR) {
        printf("System Error. Invalid error number: %d\n", errnum);
        return;
    }
    printf("%s\n", errmsg[errnum - 1]);
    return;
}

/********************************************************/
/* Programmer: Willie Boag */
/* */
/* dlists.c (Simple Line Editor) */
/********************************************************/
#include <stdio.h>
#include <stdlib.h>
#include "dlists.h"
#include "globals.h"

extern status allocate_double_node( double_list *p_L, generic_ptr data )
{
    double_list L ;

    L = (double_list) malloc(sizeof(double_node));
    if (L == NULL)
        return ERROR;
    *p_L = L;
    DATA(L) = data;
    PREV(L) = NULL;
    NEXT(L) = NULL;
    return OK;
}

extern void free_double_node( double_list *p_L )
{
    /* Free the node itself, not the pointer variable that refers to it. */
    free(*p_L);
    *p_L = NULL;
    return;
}

extern status init_double_list( double_list *p_L )
{
    /*
     * Initialize *p_L by setting the list pointer to NULL.
     * Always return OK (a different implementation
     * may allow errors to occur).
     */
    *p_L = NULL;
    return OK;
}

extern bool empty_double_list( double_list L )
{
    /* Return TRUE if L is an empty list, FALSE otherwise. */
    return (L == NULL) ? TRUE : FALSE;
}

extern status double_insert( double_list *p_L, generic_ptr data )
{
    /* Insert a new node containing data as the first item in *p_L. */
    double_list L;

    if (allocate_double_node(&L, data) == ERROR)
        return ERROR;
    if (empty_double_list(*p_L) == TRUE) {
        PREV(L) = NEXT(L) = NULL;
    } else {
        NEXT(L) = *p_L;
        PREV(L) = PREV(*p_L);
        PREV(*p_L) = L;
        if (PREV(L) != NULL)
            NEXT(PREV(L)) = L;
    }
    *p_L = L;
    return OK;
}

extern status double_append( double_list *p_L, generic_ptr data)
{
    /* Append a node to the end of a double_list. */
    double_list L, temp;

    if (allocate_double_node(&L, data) == ERROR)
        return ERROR;
    if (*p_L == NULL) {
        *p_L = L;
    } else {
        for ( temp = *p_L ; NEXT(temp) != NULL ; )
            temp = NEXT(temp);
        NEXT(temp) = L;
        PREV(L) = temp;
    }
    return OK;
}

extern status double_delete( double_list *p_L, generic_ptr *p_data )
{
    /*
     * Delete the first node in *p_L and return the DATA in p_data.
     */
    if (empty_double_list(*p_L) == TRUE)
        return ERROR;
    *p_data = DATA(*p_L);
    return double_delete_node(p_L, *p_L);
}

extern status double_delete_node( double_list *p_L, double_list node )
{
    /*
     * Delete node from *p_L.
     */
    double_list prev, next;

    if (empty_double_list(*p_L) == TRUE)
        return ERROR;
    prev = PREV(node);
    next = NEXT(node);
    if (prev != NULL)
        NEXT(prev) = next;
    if (next != NULL)
        PREV(next) = prev;
    if (node == *p_L) {
        if (next != NULL)
            *p_L = next;
        else
            *p_L = prev;
    }
    /* Free the unlinked node, not the (possibly updated) list head. */
    free_double_node(&node);
    return OK;
}

extern void cut_list( double_list *p_L, double_list *p_start, double_list *p_end )
{
    /*
     * Extract the range of nodes *p_start through *p_end from *p_L.
     */
    double_list start, end ;

    start = *p_start ;
    end = *p_end ;
    if (PREV(start))
        NEXT(PREV(start)) = NEXT(end) ;
    if (NEXT(end))
        PREV(NEXT(end)) = PREV(start) ;
    if (*p_L == start)
        *p_L = NEXT(end) ;
    PREV(start) = NEXT(end) = NULL ;
}

extern void paste_list( double_list *p_target, double_list *p_source )
{
    /*
     * Take *p_source and put it after *p_target. Assumes
     * *p_source is the first node in the list.
     */
    double_list target, source, lastnode ;

    if (empty_double_list(*p_source) == TRUE)
        /*
         * Nothing to do.
         */
        return ;

    if (empty_double_list(*p_target) == TRUE)
        *p_target = *p_source ;
    else {
        source = *p_source ;
        target = *p_target ;
        lastnode = nth_double_node(source, -1) ;
        NEXT(lastnode) = NEXT(target) ;
        if (NEXT(target) != NULL)
            PREV(NEXT(target)) = lastnode ;
        PREV(source) = target ;
        NEXT(target) = source ;
    }
    *p_source = NULL ;
}

extern int double_length( double_list L )
{
    if (L == NULL)
        return 0 ;
    return double_length( NEXT(L) ) + 1 ;
}

extern double_list nth_double_node( double_list L, int n )
{
    /* n == -1 selects the last node; otherwise nodes are numbered from 1. */
    if (L == NULL)
        return NULL ;
    if (n == -1) {
        for ( ; NEXT(L) != NULL ; L = NEXT(L) )
            ;
        return L ;
    }
    if (n == 1)
        return L ;
    return nth_double_node( NEXT(L), n - 1 ) ;
}

extern status double_traverse( double_list L, status (*p_func_f)() )
{
    if (L == NULL)
        return OK ;
    if ((*p_func_f)(DATA(L)) == ERROR)
        return ERROR ;
    return double_traverse( NEXT(L), p_func_f ) ;
}

extern int double_node_number( double_list L )
{
    if (L == NULL)
        return 0 ;
    return double_node_number( PREV(L) ) + 1 ;
}

extern double_list nth_relative_double_node( double_list L, int n )
{
    /* Walk n nodes forward (n > 0) or backward (n < 0); NULL if off the list. */
    if (n == 0 || L == NULL)
        return L ;
    if (n == -1)
        return PREV(L) ;
    if (n > 0)
        return nth_relative_double_node( NEXT(L), n - 1 ) ;
    return nth_relative_double_node( PREV(L), n + 1 ) ;
}

extern void destroy_double_list( double_list *p_L, void (*p_func_f)() )
{
    if (empty_double_list(*p_L) == TRUE)
        return ;
    destroy_double_list( &NEXT(*p_L), p_func_f ) ;
    (*p_func_f)( DATA(*p_L) ) ;
    free_double_node( p_L ) ;
    *p_L = NULL ;
}

/**********************************************************/
/* Programmer: Willie Boag */
/* */
/* interface.c (Simple Line Editor) */
/**********************************************************/
#include "dlists.h"
#include "globals.h"
#include <string.h>
#include <stdlib.h>

extern status string_double_append( double_list *p_L, char *buffer )
{
    /* Append a heap-allocated copy of buffer to the list. */
    char *str ;

    str = (char *) malloc( sizeof(char) * (strlen((char *)buffer) + 1) ) ;
    if (str == NULL)
        return ERROR ;
    strcpy(str, (char *) buffer) ;
    if (double_append(p_L, (generic_ptr) str) == ERROR) {
        free(str) ;
        return ERROR ;
    }
    return OK ;
}

/*********************************************************************/
/* Programmer: Willie Boag */
/* */
/* user.c (Simple Line Editor) */
/*********************************************************************/
#include "globals.h"
#include "dlists.h"
#include "user.h"
#include "interface.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

static FILE *outputfd;

static status writeline( char *s ) ;
static int parse_linespec( char *linespec, double_list head, double_list current,
                           double_list *p_start, double_list *p_end ) ;
static int parse_number( char *numberspec, double_list head, double_list current,
                         double_list *p_node ) ;

extern int readfile( char *filename, double_list *p_L )
{
    /*
     * Read data from filename and put in the linked list *p_L.
     */
    char buffer[BUFSIZ];
    FILE *fd;

    if ((fd = fopen(filename, "r")) == NULL)
        return 0;
    while (fgets(buffer, BUFSIZ, fd) != NULL) {
        if (string_double_append(p_L, buffer) == ERROR) {
            fclose(fd) ;
            return E_SPACE;
        }
    }
    fclose(fd);
    return 0;
}

extern int writefile( char *filename, double_list *p_L )
{
    /*
     * Output the data in *p_L to the output file, filename.
     * Use the static global variable outputfd to store the output
     * file descriptor so that it can be used by writeline().
     */
    status rc;

    if ((outputfd = fopen(filename, "w")) == NULL)
        return E_IO;
    rc = double_traverse(*p_L, writeline);
    fclose(outputfd);
    return (rc == ERROR) ? E_IO : 0;
}

static status writeline( char *s )
{
    /*
     * Write a single line of output to outputfd. Outputfd
     * must point to a file previously opened with fopen (as
     * is done in writefile()).
     */
    if (fputs(s, outputfd) == EOF)
        return ERROR;
    return OK;
}

extern int insertlines( char *linespec, double_list *p_head, double_list *p_current )
{
    /*
     * Insert new lines before the current line.
     */
    double_list newdata, startnode, endnode, lastnode;
    status rc;
    int cmp, parseerror;
    char buffer[BUFSIZ];

    /*
     * If the list is empty, no linespec is allowed.
     */
    if (empty_double_list(*p_head) == TRUE) {
        if (strlen(linespec) != 0)
            return E_LINES;
        startnode = endnode = NULL;
    } else {
        /*
         * If a linespec is given, it had better be a single line number.
         */
        parseerror = parse_linespec(linespec, *p_head, *p_current, &startnode, &endnode);
        if (parseerror)
            return parseerror;
        if (startnode != endnode)
            return E_LINES;
    }

    /*
     * Collect the new lines in newdata. Then "paste" the list before
     * startnode. A line containing only "." ends the input.
     */
    init_double_list(&newdata);
    do {
        printf("insert>");
        if (fgets(buffer, BUFSIZ, stdin) == NULL)
            break;                      /* end of input also ends insertion */
        cmp = strcmp(buffer, ".\n");
        if (cmp != 0) {
            rc = string_double_append(&newdata, buffer);
            if (rc == ERROR)
                return E_SPACE;
        }
    } while (cmp != 0);

    if (empty_double_list(newdata) == TRUE)
        return 0;

    if (startnode == NULL) {
        /*
         * Empty list.
         */
        *p_head = newdata;
        *p_current = nth_double_node(newdata, 1);
    } else if (PREV(startnode) == NULL) {
        /*
         * Insert before the first line.
         */
        lastnode = nth_double_node(newdata, -1);
        paste_list(&lastnode, p_head);
        *p_head = newdata;
        *p_current = startnode;
    } else {
        /*
         * Insert in the middle of the list.
         */
        paste_list(&PREV(startnode), &newdata);
        *p_current = startnode;
    }
    return 0;
}

extern int deletelines( char *linespec, double_list *p_head, double_list *p_current )
{
    /*
     * Delete some lines (according to linespec) from p_head.
     * Update p_current to be after last line deleted.
     * If the last line is deleted, make p_current be before first line.
     */
    double_list startnode, endnode, tmplist;
    double_list new_current;
    int startnumber, endnumber;
    int rc;

    rc = parse_linespec(linespec, *p_head, *p_current, &startnode, &endnode);
    if (rc)
        return rc;

    startnumber = double_node_number(startnode);
    endnumber = double_node_number(endnode);
    if (startnumber > endnumber) {
        tmplist = startnode;
        startnode = endnode;
        endnode = tmplist;
    }

    new_current = nth_relative_double_node(endnode, 1);
    if (new_current == NULL)
        new_current = nth_relative_double_node(startnode, -1);

    cut_list(p_head, &startnode, &endnode);
    *p_current = new_current;
    destroy_double_list(&startnode, free);
    return 0;
}

extern int movelines( char *linespec, double_list *p_head, double_list *p_current )
{
    /*
     * Move lines to after p_current. Make sure the lines moved
     * do not include p_current.
     */
    double_list startnode, endnode;
    double_list tmpnode;
    int startnumber, endnumber;
    int rc, currentnumber;
    int tmp;

    rc = parse_linespec(linespec, *p_head, *p_current, &startnode, &endnode );
    if (rc)
        return rc;

    startnumber = double_node_number(startnode);
    endnumber = double_node_number(endnode);
    currentnumber = double_node_number(*p_current);

    /*
     * Make sure start < end.
     */
    if (startnumber > endnumber) {
        tmp = startnumber;
        startnumber = endnumber;
        endnumber = tmp;
        tmpnode = startnode;
        startnode = endnode;
        endnode = tmpnode;
    }

    /*
     * Do not include the current line in the ones being moved.
     */
    if (currentnumber >= startnumber && currentnumber <= endnumber)
        return E_LINES;

    cut_list(p_head, &startnode, &endnode);
    paste_list(&PREV(*p_current), &startnode);
    return 0;
}

extern int printlines( char *linespec, double_list *p_head, double_list *p_current )
{
    /*
     * Print out lines. Direction indicates whether going forward or
     * backward.
     */
    double_list startnode, endnode ;
    int startnumber, endnumber, count, direction ;
    int rc ;

    rc = parse_linespec( linespec, *p_head, *p_current, &startnode, &endnode ) ;
    if (rc)
        return rc ;

    startnumber = double_node_number( startnode ) ;
    endnumber = double_node_number( endnode ) ;
    direction = (startnumber < endnumber) ? 1 : -1 ;
    count = (endnumber - startnumber) * direction + 1 ;

    while ( count-- ) {
        printf("%d %s", startnumber, (char *) DATA(startnode) ) ;
        startnumber += direction ;
        startnode = nth_relative_double_node(startnode, direction) ;
    }
    *p_current = endnode ;
    return 0 ;
}

static int parse_linespec( char *linespec, double_list head, double_list current,
                           double_list *p_start, double_list *p_end )
{
    /*
     * Parse linespec (consisting of numberspec,numberspec).
     * Set p_start to the starting line and p_end to the ending line.
     */
    int rc ;
    char *nextnumber ;

    if (*linespec == '\0')
        *p_start = current ;
    else {
        rc = parse_number(linespec, head, current, p_start) ;
        if (rc)
            return rc ;
    }

    nextnumber = strchr(linespec, ',') ;
    if (nextnumber == NULL)
        *p_end = *p_start ;
    else {
        rc = parse_number(nextnumber + 1, head, current, p_end) ;
        if (rc)
            return rc ;
    }

    if (*p_start == NULL || *p_end == NULL)
        return E_LINES ;
    return 0 ;
}

static int parse_number( char *numberspec, double_list head, double_list current,
                         double_list *p_node )
{
    /*
     * Parse a single numberspec.
     */
    char numberbuffer[BUFSIZ], *p_num ;
    int nodenumber ;
    int direction ;

    if (*numberspec == '.') {
        /*
         * Start with the current line.
         */
        *p_node = current ;
        numberspec++ ;
    } else if (*numberspec == '$') {
        /*
         * Start with the last line.
         */
        *p_node = nth_double_node(head, -1) ;
        if (*p_node == NULL) {
            return E_LINES ;
        }
        numberspec++ ;
    } else if (isdigit(*numberspec)) {
        /*
         * Have a line number.
         */
        p_num = numberbuffer ;
        while (isdigit(*numberspec))
            *p_num++ = *numberspec++ ;
        *p_num = '\0' ;
        nodenumber = atoi( numberbuffer ) ;
        *p_node = nth_double_node(head, nodenumber) ;
        if (*p_node == NULL)
            return E_LINES ;
    } else
        return E_LINES ;

    /*
     * Any plusses or minuses?
     */
    if (*numberspec == '+') {
        direction = 1 ;
        numberspec++ ;
    } else if (*numberspec == '-') {
        direction = -1 ;
        numberspec++ ;
    } else
        direction = 0 ;

    /*
     * If a digit and previously saw a plus or minus, figure
     * offset from p_node.
     */
    if (isdigit(*numberspec) && direction != 0) {
        p_num = numberbuffer ;
        while ( isdigit(*numberspec))
            *p_num++ = *numberspec++ ;
        *p_num = '\0' ;
        nodenumber = atoi( numberbuffer) * direction ;
        *p_node = nth_relative_double_node(*p_node, nodenumber) ;
        if (*p_node == NULL)
            return E_LINES ;
        direction = 0 ;
    }

    /*
     * If direction is 0 (meaning no offset or offset was parsed ok)
     * and at end of this numberspec, then everything is ok.
     */
    if (direction == 0 && (*numberspec == '\0' || *numberspec == ','))
        return 0 ;
    else
        return E_LINES ;
}

Topological Sort
Abstract: A topological sort of a partial ordering is an arrangement of the elements of a set that satisfies the rules imposed by a given comparison function. Such arrangements come up in many fields of study. One example of a partial ordering is a set of courses and their prerequisites. The prerequisite rules restrict the order in which classes can be taken. A topological sort of that partial ordering is simply a list of classes arranged so that you always take the prerequisites of a class first.

Description: In mathematics, a set together with a comparison function forms a partial ordering if the comparison function is reflexive, antisymmetric, and transitive. A topological sort of a partial ordering is a permutation of the elements of the set that never conflicts with the order established by the comparison function. As mentioned above, two necessary conditions for a partial ordering to exist are that the comparison function is antisymmetric and transitive. In terms of a graph, this means that the graph cannot have cycles. As a result, only Directed Acyclic Graphs (DAGs) can be topologically sorted. If a cycle were to exist, it would be the equivalent of a never-ending circle of prerequisites.

Algorithm: The general idea behind my chosen topological sorting algorithm is as follows. Because there are no cycles in the graph, I know that there is at least one minimum element, that is, a vertex with no predecessors. I loop through the vertices of the graph until I find such an element. I print that element and remove it from consideration, which reduces the predecessor count of each of its successors by one. The remaining graph is still acyclic, so I am back in the situation where I can find a new minimum element. This process repeats until I have visited every node in my graph exactly once. At that point, a valid topological sort has been found.
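The steps above can be sketched in isolation as follows. This is a minimal illustration rather than the report's implementation: the function name topo_order and the flat adjacency-matrix layout are assumptions made here so that the example stands on its own, and it reports a cycle instead of assuming the input is a DAG.

```c
#include <stdlib.h>

/*
 * Hypothetical helper (not part of the report's code): topologically
 * sort a graph with n vertices stored as a row-major n*n adjacency
 * matrix, where adj[u * n + v] != 0 means there is an edge u -> v.
 * Writes the order into out[0..n-1]. Returns 0 on success and -1 if
 * memory runs out or the graph contains a cycle.
 */
int topo_order(int n, const int *adj, int *out)
{
    int *pred;
    int u, v, count;

    pred = (int *) calloc(n > 0 ? (size_t) n : 1, sizeof(int));
    if (pred == NULL)
        return -1;

    /* pred[v] = number of edges that point into v. */
    for (u = 0; u < n; u++)
        for (v = 0; v < n; v++)
            if (adj[u * n + v])
                pred[v]++;

    for (count = 0; count < n; count++) {
        /* Find a minimum element: a vertex with no remaining predecessors. */
        for (u = 0; u < n && pred[u] != 0; u++)
            ;
        if (u == n) {               /* none left, so there is a cycle */
            free(pred);
            return -1;
        }
        out[count] = u;
        pred[u] = -1;               /* remove u from consideration */
        for (v = 0; v < n; v++)     /* each successor loses a predecessor */
            if (adj[u * n + v])
                pred[v]--;
    }
    free(pred);
    return 0;
}
```

Each of the n rounds scans the predecessor array and one matrix row, so this sketch runs in O(n^2) time, the same shape as Selection Sort.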

Testing: In order to verify that my program produced a valid topological sort, I wrote an automated program that generated a random DAG, topologically sorted it using my program, and then compared the sort to the ordering imposed by the edges of the graph. This whole process ran in a loop that executed as many times as I chose. Once it passed 1000/1000 cases, I accepted that it was (likely) a correct algorithm. The hardest part of my testing program was generating the DAG. My naive attempt involved generating a graph of random numbers in such a way that any two vertices had a chance of being connected. Unfortunately, this resulted in cyclic graphs (and topological sorts do not exist for cyclic graphs). My next attempt involved trying to build the graph so that I would never insert edges that caused cycles. This, too, proved to be very difficult to manage, and I was forced to find another way. Eventually, I got the idea to generate a completely random graph (as I first did) and then run Kruskal's Minimum Spanning Tree (MST) algorithm in order to eliminate cycles. This was pretty much what I ended up using, except that an MST only exists for connected graphs. My randomly generated graphs were not guaranteed to be connected. As a result, I had to modify the algorithm to find a minimum spanning forest rather than a tree.
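The comparison step of such a test can be sketched as below. This is an illustrative checker written for this section, not the testing code from the appendix; the name is_valid_topo_order and the parallel from/to edge arrays are assumptions.

```c
#include <stdlib.h>

/*
 * Hypothetical checker (not the report's appendix code): order[] is a
 * claimed topological sort of the vertices 0..n-1, and the m directed
 * edges are from[i] -> to[i]. Returns 1 if order is a permutation of
 * the vertices in which every edge's source appears before its target,
 * and 0 otherwise (including when memory cannot be allocated).
 */
int is_valid_topo_order(int n, const int *order,
                        int m, const int *from, const int *to)
{
    int *pos;
    int i, ok = 1;

    pos = (int *) malloc((n > 0 ? (size_t) n : 1) * sizeof(int));
    if (pos == NULL)
        return 0;
    for (i = 0; i < n; i++)
        pos[i] = -1;

    /* Record where each vertex appears; reject repeats and bad values. */
    for (i = 0; i < n && ok; i++) {
        int v = order[i];
        if (v < 0 || v >= n || pos[v] != -1)
            ok = 0;
        else
            pos[v] = i;
    }

    /* Every edge must point from an earlier position to a later one. */
    for (i = 0; i < m && ok; i++)
        if (pos[from[i]] >= pos[to[i]])
            ok = 0;

    free(pos);
    return ok;
}
```

Checking positions rather than re-sorting matters because a DAG usually has many valid topological sorts, so the test cannot simply compare against one expected answer.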

Reflection: I really enjoyed this problem. I think that it is really interesting to try to sort a partial ordering. When we first learn the sorting problem, we sort integers, which have a total ordering. Although it seems like it would be simpler to sort a less-constrained partial ordering (and maybe it is), it seems to me that it is harder. The algorithm that I used for this topological sort (in which I find the minimum element, remove it, and repeat) is the partial ordering analog of Selection Sort. I have tried thinking of how other sorting algorithms would be implemented on partial orderings, but that is where I run into issues. For instance, quicksort works by partitioning an array into two subarrays: one full of elements less than the pivot and one full of elements greater than the pivot. So how could we apply this method to a partial ordering? We cannot partition the array into the two subarrays, because we do not have the luxury of every two elements being comparable. Surprisingly (surprising to me, at least), our problem has become harder because of our less-constraining comparison function.

C Implementation:

Makefile
    Makefile

Header Files
    globals.h
    graph.h
    list.h
    queue.h

Source Files
    main.c
    graph.c
    list.c
    queue.c

#
# Programmer: Willie Boag
#
# Makefile for Topological Sort
#
tsort: main.o graph.o queue.o list.o
	gcc -o tsort main.o graph.o queue.o list.o

main.o: main.c globals.h graph.h
	gcc -ansi -pedantic -Wall -c -g main.c

graph.o: graph.c graph.h globals.h queue.h
	gcc -ansi -pedantic -Wall -c -g graph.c

queue.o: queue.c queue.h globals.h list.h
	gcc -ansi -pedantic -Wall -c -g queue.c

list.o: list.c list.h globals.h
	gcc -ansi -pedantic -Wall -c list.c

clean:
	rm -f *.o

/*************************************************************/
/* Programmer: Willie Boag */
/* */
/* globals.h (Topological Sort) */
/*************************************************************/
#ifndef _globals
#define _globals

#define DATA( L ) ( ( L ) -> datapointer )
#define NEXT( L ) ( ( L ) -> next )

#define RIGHT(T) ( (T) -> right )
#define LEFT(T) ( (T) -> left )

typedef enum { OK, ERROR } status ;
typedef enum { FALSE=0 , TRUE=1 } bool ;
typedef void *generic_ptr ;

#endif

/********************************************************/
/* Programmer: Willie Boag */
/* */
/* graph.h (Topological Sort) */
/********************************************************/
#ifndef _graph
#define _graph

#include "globals.h"

typedef int vertex ;
typedef struct { int weight; vertex vertex_number; } edge ;

#define UNUSED_WEIGHT (32767)
#define WEIGHT(p_e) ((p_e) -> weight)
#define VERTEX(p_e) ((p_e) -> vertex_number)

typedef enum {directed, undirected } graph_type ;
typedef enum {DEPTH_FIRST, BREADTH_FIRST, TOPOLOGICAL} searchorder ;

typedef struct {
    graph_type type ;
    int number_of_vertices ;
    edge **matrix ;
} graph_header, *graph ;

extern status traverse_graph( graph G, searchorder order, status (*p_func_f)() ) ;
extern status init_graph( graph *p_G, int vertex_cnt, graph_type type ) ;
extern void destroy_graph( graph *p_G ) ;
extern status add_edge( graph G, vertex vertex1, vertex vertex2, int weight ) ;
extern status delete_edge( graph G, vertex vertex1, vertex vertex2 ) ;
extern bool isadjacent( graph G, vertex vertex1, vertex vertex2 ) ;
extern void graph_size( graph G, int *p_vertex_cnt, int *p_edge_cnt ) ;
extern edge *edge_iterator( graph G, vertex vertex_number, edge *p_last_return ) ;

#endif

/**************************************************************/
/* Programmer: Willie Boag */
/* */
/* list.h (Topological Sort) */
/**************************************************************/
#ifndef _list
#define _list

#include "globals.h"

typedef struct node node, *list ;
struct node { generic_ptr datapointer; list next; } ;

extern status allocate_node( list *p_L, generic_ptr data ) ;
extern void free_node( list *p_L ) ;

extern status init_list( list *p_L ) ;
extern bool empty_list( list L ) ;
extern status insert( list *p_L, generic_ptr data ) ;
extern status append( list *p_L, generic_ptr data ) ;
extern status delete( list *p_L, generic_ptr *p_data ) ;
extern status delete_node( list *p_L, list node ) ;

extern status traverse( list L, status (*p_func_f) () ) ;
extern status find_key( list L, generic_ptr key, int (*p_cmp_f)(), list *p_keynode ) ;

extern list list_iterator( list L, list lastreturn ) ;
extern void destroy( list *p_L, void (*p_func_f)() ) ;

#endif

/***************************************************/
/* Programmer: Willie Boag */
/* */
/* queue.h (Topological Sort) */
/***************************************************/
#ifndef _queue
#define _queue

#include "globals.h"
#include "list.h"

typedef struct { node *front ; node *rear ; } queue ;

#define FRONT(Q) ((Q) -> front)
#define REAR(Q) ((Q) -> rear)

status init_queue( queue *p_Q ) ;
bool empty_queue( queue *p_Q ) ;
status qadd( queue *p_Q, generic_ptr data ) ;
status qremove( queue *p_Q , generic_ptr *p_data ) ;
void qprint( queue Q, status (*p_func_f)() ) ;

#endif

/******************************************************************/
/* Programmer: Willie Boag */
/* */
/* main.c (Topological Sort) */
/******************************************************************/
#include "graph.h"
#include "globals.h"
#include <stdio.h>

status write_vertex( int a )
{
    printf( " %d ", a ) ;
    return OK ;
}

int main( int argc, char *argv[] )
{
    FILE *fileptr ;
    int weight ;
    int from ;
    int to ;
    int numberofvertices ;
    graph G ;

    /* The input file holds the vertex count, then "from to weight" triples. */
    if (argc < 2) {
        fprintf(stderr, "usage: %s graphfile\n", argv[0]) ;
        return 1 ;
    }
    fileptr = fopen( argv[1], "r" ) ;
    if (fileptr == NULL) {
        fprintf(stderr, "cannot open %s\n", argv[1]) ;
        return 1 ;
    }
    if (fscanf(fileptr, "%d", &numberofvertices ) != 1 ||
        init_graph( &G, numberofvertices, directed ) == ERROR) {
        fclose(fileptr) ;
        return 1 ;
    }

    /* Stop at end of file or at the first malformed triple. */
    while (fscanf( fileptr, "%d %d %d", &from, &to, &weight) == 3)
        add_edge( G, from, to, weight ) ;

    printf("\n Topological Traversal: ") ;
    traverse_graph( G, TOPOLOGICAL, write_vertex ) ;
    printf("\n\n") ;

    destroy_graph( &G ) ;
    fclose(fileptr) ;
    return 0 ;
}

/********************************************************/
/* Programmer: Willie Boag */
/* */
/* graph.c (Topological Sort) */
/********************************************************/
#include <stdlib.h>
#include "globals.h"
#include "graph.h"
#include "queue.h"
#include <stdio.h>

extern status init_graph( graph *p_G, int vertex_cnt, graph_type type )
{
    /* Allocate the header and one contiguous vertex_cnt x vertex_cnt matrix. */
    graph G ;
    int i, j ;

    G = (graph) malloc(sizeof(graph_header)) ;
    if (G == NULL)
        return ERROR ;

    G -> number_of_vertices = vertex_cnt ;
    G -> type = type ;
    G -> matrix = (edge **) malloc(vertex_cnt * sizeof(edge *)) ;
    if (G -> matrix == NULL) {
        free(G) ;
        return ERROR ;
    }

    G -> matrix[0] = (edge *) malloc(vertex_cnt * vertex_cnt * sizeof(edge)) ;
    if (G -> matrix[0] == NULL) {
        free(G -> matrix) ;
        free(G) ;
        return ERROR ;
    }

    for (i = 1 ; i < vertex_cnt ; i++)
        G -> matrix[i] = G -> matrix[0] + vertex_cnt * i ;

    for (i = 0 ; i < vertex_cnt ; i++) {
        for (j = 0 ; j < vertex_cnt ; j++) {
            G -> matrix[i][j].weight = UNUSED_WEIGHT ;
            G -> matrix[i][j].vertex_number = j ;
        }
    }

    *p_G = G ;
    return OK ;
}

extern void destroy_graph( graph *p_G )
{
    free((*p_G) -> matrix[0] ) ;
    free((*p_G) -> matrix ) ;
    free(*p_G) ;
    *p_G = NULL ;
}

extern status add_edge( graph G, vertex vertex1, vertex vertex2, int weight )
{
    /* Valid vertices are 0 .. number_of_vertices - 1. */
    if (vertex1 < 0 || vertex1 >= G -> number_of_vertices)
        return ERROR ;
    if (vertex2 < 0 || vertex2 >= G -> number_of_vertices)
        return ERROR ;
    if (weight <= 0 || weight >= UNUSED_WEIGHT)
        return ERROR ;

    G -> matrix[vertex1][vertex2].weight = weight ;
    if (G -> type == undirected)
        G -> matrix[vertex2][vertex1].weight = weight ;
    return OK ;
}

extern status delete_edge( graph G, vertex vertex1, vertex vertex2 )
{
    if (vertex1 < 0 || vertex1 >= G -> number_of_vertices)
        return ERROR ;
    if (vertex2 < 0 || vertex2 >= G -> number_of_vertices)
        return ERROR ;

    G -> matrix[vertex1][vertex2].weight = UNUSED_WEIGHT ;
    if (G -> type == undirected)
        G -> matrix[vertex2][vertex1].weight = UNUSED_WEIGHT ;
    return OK ;
}

extern bool isadjacent( graph G, vertex vertex1, vertex vertex2 )
{
    if (vertex1 < 0 || vertex1 >= G -> number_of_vertices)
        return FALSE ;
    if (vertex2 < 0 || vertex2 >= G -> number_of_vertices)
        return FALSE ;
    return (G -> matrix[vertex1][vertex2].weight == UNUSED_WEIGHT) ? FALSE : TRUE ;
}

extern void graph_size( graph G, int *p_vertex_cnt, int *p_edge_cnt )
{
    int i , j , edges ;

    *p_vertex_cnt = G -> number_of_vertices ;
    edges = 0 ;
    for (i = 0 ; i < G -> number_of_vertices ; i++)
        for (j = i + 1 ; j < G -> number_of_vertices ; j++)
            if (G -> matrix[i][j].weight != UNUSED_WEIGHT)
                edges++ ;
    *p_edge_cnt = edges ;
    return ;
}

extern edge *edge_iterator( graph G, vertex vertex_number, edge *p_last_return )
{
    /* Return the next edge out of vertex_number after p_last_return. */
    vertex other_vertex ;

    if (vertex_number < 0 || vertex_number >= G -> number_of_vertices)
        return NULL ;

    if (p_last_return == NULL)
        other_vertex = 0 ;
    else
        other_vertex = VERTEX(p_last_return) + 1 ;

    for ( ; other_vertex < G -> number_of_vertices ; other_vertex++) {
        if (G -> matrix[vertex_number][other_vertex].weight != UNUSED_WEIGHT)
            return &G -> matrix[vertex_number][other_vertex] ;
    }
    return NULL ;
}

static status breadth_first_search( graph G, vertex vertex_number, bool visited[],
                                    status (*p_func_f)() )
{
    edge *tmp, *p_edge ;
    queue Q ;

    visited[vertex_number] = TRUE ;
    if ((*p_func_f)(vertex_number) == ERROR)
        return ERROR ;

    init_queue(&Q) ;
    p_edge = NULL ;
    while ( (p_edge = edge_iterator(G, vertex_number, p_edge)) != NULL)
        qadd( &Q, (generic_ptr) p_edge ) ;

    while ( empty_queue(&Q) == FALSE ) {
        qremove( &Q, (generic_ptr *) &tmp) ;
        if (visited[VERTEX(tmp)] == FALSE) {
            visited[VERTEX(tmp)] = TRUE ;
            if ((*p_func_f)(VERTEX(tmp)) == ERROR)
                return ERROR ;
            p_edge = NULL ;
            while ( (p_edge = edge_iterator(G, VERTEX(tmp), p_edge)) != NULL)
                qadd( &Q, (generic_ptr) p_edge ) ;
        }
    }
    return OK ;
}

static status depth_first_search( graph G, vertex vertex_number, bool visited[],
                                  status (*p_func_f)() )
{
    edge *p_edge ;
    status rc ;

    visited[vertex_number] = TRUE ;
    if ((*p_func_f)(vertex_number) == ERROR)
        return ERROR ;

    p_edge = NULL ;
    while ( (p_edge = edge_iterator(G, vertex_number, p_edge)) != NULL)
        if (visited[VERTEX(p_edge)] == FALSE) {
            rc = depth_first_search(G, VERTEX(p_edge), visited, p_func_f) ;
            if (rc == ERROR)
                return ERROR ;
        }
    return OK ;
}

static int *count_predecessors( graph G )
{
    /* pred[v] = number of edges that point into vertex v. */
    int vertex_cnt, edge_cnt, *pred ;
    int i ;
    edge *p_edge ;

    graph_size( G, &vertex_cnt, &edge_cnt) ;
    pred = (int *) malloc( sizeof(int) * vertex_cnt ) ;
    if (pred == NULL)
        return NULL ;

    for (i = 0 ; i < vertex_cnt ; i++)
        pred[i] = 0 ;

    for (i = 0 ; i < vertex_cnt ; i++) {
        p_edge = NULL ;
        while ( (p_edge = edge_iterator( G, i, p_edge)) != NULL )
            pred[VERTEX(p_edge)]++ ;
    }
    return pred ;
}

static int extract_min( int pred[], int n )
{
    int i ;

    /* Uses assumption that at least one element has a value of zero. */
    for (i = 0 ; i < n ; i++)
        if (pred[i] == 0) {
            pred[i] = -1 ;      /* mark as removed */
            return i ;
        }

    /* Should never get here (only happens if the graph has a cycle). */
    return -1 ;
}

static status topological_sort( graph G, status (*p_func_f)() )
{
    int vertex_cnt, edge_cnt, *pred ;
    int count = 0, ind ;
    edge *p_edge ;

    graph_size( G, &vertex_cnt, &edge_cnt) ;
    pred = count_predecessors( G ) ;
    if (pred == NULL)
        return ERROR ;

    while ( count < vertex_cnt ) {
        ind = extract_min( pred, vertex_cnt ) ;
        if (ind < 0 || (*p_func_f)( ind ) == ERROR) {
            free(pred) ;
            return ERROR ;
        }
        count++ ;

        /* Removing ind takes one predecessor away from each successor. */
        p_edge = NULL ;
        while ( (p_edge = edge_iterator( G, ind, p_edge)) != NULL )
            pred[VERTEX(p_edge)]-- ;
    }
    free(pred) ;
    return OK ;
}

extern status traverse_graph( graph G, searchorder order, status (*p_func_f)() )
{
    status rc ;
    bool *visited ;
    int vertex_cnt, edge_cnt ;
    int i ;

    graph_size( G, &vertex_cnt, &edge_cnt) ;
    visited = (bool *) malloc(sizeof(bool) * vertex_cnt);
    if (visited == NULL)
        return ERROR ;

    for (i = 0 ; i < vertex_cnt ; i++)
        visited[i] = FALSE ;

    for ( rc = OK, i = 0 ; i < vertex_cnt && rc == OK ; i++) {
        if (visited[i] == FALSE) {
            switch (order) {
            case DEPTH_FIRST:
                rc = depth_first_search(G, i, visited, p_func_f) ;
                break ;
            case BREADTH_FIRST:
                rc = breadth_first_search(G, i, visited, p_func_f) ;
                break ;
            case TOPOLOGICAL:
                /* The sort handles every vertex itself, so end the loop. */
                i = vertex_cnt ;
                rc = topological_sort( G, p_func_f ) ;
                break ;
            }
        }
    }
    free(visited) ;
    return rc ;
}

/**********************************************************/
/* Programmer: Willie Boag */
/* */
/* list.c (Topological Sort) */
/**********************************************************/
#include <stdlib.h>
#include "list.h"
#include "globals.h"

status allocate_node( list *p_L, generic_ptr data )
{
    list L = (list) malloc(sizeof(node));

    if (L == NULL)
        return ERROR;
    *p_L = L;
    DATA(L) = data;
    NEXT(L) = NULL;
    return OK;
}

void free_node( list *p_L )
{
    free(*p_L);
    *p_L = NULL;
}

status init_list( list *p_L )
{
    *p_L = NULL;
    return OK;
}

bool empty_list( list L )
{
    return (L == NULL) ? TRUE : FALSE;
}

status insert( list *p_L, generic_ptr data )
{
    /* Insert data at the front of the list. */
    list L;

    if (allocate_node(&L, data) == ERROR)
        return ERROR;
    NEXT(L) = *p_L;
    *p_L = L ;
    return OK;
}

status append( list *p_L, generic_ptr data )
{
    /* Add data at the end of the list. */
    list L, tmplist;

    if (allocate_node(&L, data) == ERROR)
        return ERROR;
    if (empty_list(*p_L) == TRUE)
        *p_L = L;
    else {
        for (tmplist = *p_L; NEXT(tmplist) != NULL; tmplist = NEXT(tmplist))
            ;
        NEXT(tmplist) = L;
    }
    return OK;
}

status delete( list *p_L, generic_ptr *p_data )
{
    /* Remove the first node and hand its data back through p_data. */
    if (empty_list(*p_L))
        return ERROR;
    *p_data = DATA(*p_L);
    return delete_node(p_L, *p_L);
}

status delete_node( list *p_L, list node )
{
    list L;

    if (empty_list(*p_L) == TRUE)
        return ERROR;
    if (*p_L == node)
        *p_L = NEXT(*p_L);
    else {
        for (L = *p_L; L != NULL && NEXT(L) != node; L = NEXT(L))
            ;
        if (L == NULL)
            return ERROR;
        else
            NEXT(L) = NEXT(node);
    }
    free_node(&node);
    return OK;
}

status traverse( list L, status (*p_func_f) () )
{
    if (empty_list(L))
        return OK;
    if ((*p_func_f)(DATA(L)) == ERROR)
        return ERROR;
    return traverse(NEXT(L), p_func_f);
}

status find_key( list L, generic_ptr key, int (*p_cmp_f)(), list *p_keynode )
{
    list curr = NULL;

    while ( ( curr = list_iterator(L, curr)) != NULL ) {
        if ((*p_cmp_f)(key, DATA(curr)) == 0 ) {
            *p_keynode = curr;
            return OK;
        }
    }
    return ERROR;
}

list list_iterator( list L, list lastreturn )
{
    return (lastreturn == NULL) ? L : NEXT(lastreturn);
}

void destroy( list *p_L, void (*p_func_f) () )
{
    if (empty_list(*p_L) == FALSE) {
        destroy(&NEXT(*p_L), p_func_f);
        if (p_func_f != NULL)
            (*p_func_f)(DATA(*p_L));
        free_node(p_L);
    }
}

/****************************************************/
/* Programmer: Willie Boag */
/* */
/* queue.c (Topological Sort) */
/****************************************************/
#include <stdlib.h>
#include "globals.h"
#include "queue.h"
#include "list.h"
#include <stdio.h>

extern status init_queue( queue *p_Q )
{
    /*
     * Initialize the queue to empty.
     */
    FRONT( p_Q ) = NULL;
    REAR( p_Q ) = NULL ;
    return OK ;
}

extern bool empty_queue( queue *p_Q )
{
    /*
     * Return TRUE if queue is empty, FALSE otherwise.
     */
    return (FRONT(p_Q) == NULL) ? TRUE : FALSE ;
}

extern status qadd( queue *p_Q, generic_ptr data )
{
    /*
     * Add data to p_Q.
     */
    list newnode ;

    if (allocate_node(&newnode, data) == ERROR)
        return ERROR;

    if (empty_queue(p_Q) == FALSE) {
        NEXT(REAR(p_Q)) = newnode ;
        REAR(p_Q) = newnode ;
    } else {
        FRONT(p_Q) = REAR(p_Q) = newnode ;
    }
    return OK ;
}

extern status qremove( queue *p_Q, generic_ptr *p_data )
{
    /*
     * Remove a value from p_Q and put in p_data.
     */
    list nodeinfront ;

    if (empty_queue(p_Q) == TRUE)
        return ERROR;

    nodeinfront = FRONT(p_Q) ;
    *p_data = DATA(nodeinfront) ;

    if (REAR(p_Q) == FRONT(p_Q))
        REAR(p_Q) = FRONT(p_Q) = NULL ;
    else
        FRONT(p_Q) = NEXT(nodeinfront) ;

    /* Release the dequeued node so it is not leaked. */
    free_node(&nodeinfront) ;
    return OK ;
}

extern void qprint( queue Q, status (*p_func_f)() )
{
    /*
     * Apply p_func_f to each item from front to rear. The queue is
     * walked rather than emptied, so the caller's queue is unchanged.
     */
    node *temp ;

    for (temp = FRONT(&Q) ; temp != NULL ; temp = NEXT(temp))
        (*p_func_f)( DATA(temp) ) ;
    return ;
}

Bloom Filters
Abstract: Data is often stored in sets. Two of the most important operations on sets are insertion and testing for membership. When time and space are the most important aspects of data look-up, the Bloom Filter data structure can be used. The drawback of Bloom Filters is that they are probabilistic by nature: they can report a false positive for membership. However, they never report false negatives.

Description: There are many ways to represent a set in memory. If there exists a one-to-one correspondence between your data and the natural numbers, then the obvious choice would be a bit vector. However, such a correspondence is not always feasible, for instance when storing strings. Another implementation could be a hash table (which has a function that maps your data to an index into your table). The good thing about a hash table is that it strives to locate data very quickly. When data is stored in the table, its location is determined via the hashing function. The goal is to quickly compute the index returned by the hashing function, which takes only a constant amount of time. However, one of the most significant disadvantages of a hash table results from collisions (when different pieces of data map to the same index). One solution to this problem is chaining pieces of data together when they collide. In this solution, although data would hash to the index in a constant amount of time, there would still be the possibility of traversing a chain of data in order to find a particular element. As a result, constant search time is not guaranteed.

Bloom Filters are similar to hash tables, except that they have guaranteed constant search time (but at the price of certainty). They are implemented as bit vectors, but not with the usual one-to-one correspondence. When data needs to be stored, multiple hash functions are calculated using that data, and each hash produces its own index into the bit vector. The bit at each hashed index is then set to true. An important point here is that the inserted data is not actually stored. Instead, resultant data is generated from it, and that resultant data is stored (in the bit vector). Consequently, when you test whether previously inserted data is in the set, the hash functions generate the same indices that have already been set to true, and the filter reports that the data is (probably) in the set.
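The insert and membership steps described above can be sketched with a packed bit vector. This is only an illustration under stated assumptions: the names bitbloom and bb_* are hypothetical, and the two string hashes are simple stand-ins chosen for brevity, not the hashes used later in this report.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical names: bitbloom and bb_* are illustrative only. */
typedef struct {
    unsigned char *bits;   /* packed bit vector */
    size_t nbits;          /* number of bits in the filter */
} bitbloom;

/* Two simple string hashes (assumptions made for this sketch). */
static size_t bb_hash_a(const char *s) {
    size_t h = 5381;                       /* multiply-and-add */
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h;
}
static size_t bb_hash_b(const char *s) {
    size_t h = 0;                          /* shift-and-xor */
    while (*s) h = (h << 5) ^ (h >> 3) ^ (unsigned char)*s++;
    return h;
}

/* Returns 1 on success, 0 if memory could not be allocated. */
int bb_init(bitbloom *b, size_t nbits) {
    assert(nbits > 0);
    b->bits = calloc((nbits + CHAR_BIT - 1) / CHAR_BIT, 1);
    b->nbits = nbits;
    return b->bits != NULL;
}

static void set_bit(bitbloom *b, size_t i) {
    b->bits[i / CHAR_BIT] |= (unsigned char)(1u << (i % CHAR_BIT));
}
static int get_bit(const bitbloom *b, size_t i) {
    return (b->bits[i / CHAR_BIT] >> (i % CHAR_BIT)) & 1;
}

/* Only the hashed bits are set; the string itself is never stored. */
void bb_insert(bitbloom *b, const char *s) {
    set_bit(b, bb_hash_a(s) % b->nbits);
    set_bit(b, bb_hash_b(s) % b->nbits);
}

/* Returns 1 for "maybe present", 0 for "definitely absent". */
int bb_member(const bitbloom *b, const char *s) {
    return get_bit(b, bb_hash_a(s) % b->nbits)
        && get_bit(b, bb_hash_b(s) % b->nbits);
}

void bb_free(bitbloom *b) {
    free(b->bits);
    b->bits = NULL;
}
```

A caller would run bb_init once, call bb_insert for every element, and treat a 0 from bb_member as a definite "no". A 1 only means that every hashed bit happens to be set.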

Benefits: As mentioned above, Bloom Filters are implemented as bit vectors, which means that they take up very little space. This is a significant gain over hash tables, which not only store the inserted data, but also have lots of unused space in order to probabilistically avoid collisions. In addition, Bloom Filters have guaranteed constant look-up time. Unlike hash tables, Bloom Filters never chain in the event of a collision. Consequently, the time required for the yes or no to be generated by searching for a member is independent of the number of collisions. Furthermore, there can never be false negatives, because once a member has been added to the set, its hash-generated indices will always be set to true. Therefore, when it is tested for membership at a later time, the test will never fail.

Drawbacks: If a piece of data that was never added to the set just happens to hash to indices that have all been set to true by other members of the set, then the Bloom Filter will mistakenly believe that the data was inserted, and report a false positive for membership. Just as with hash tables, the higher the number of collisions, the less ideal the performance of the data structure. Despite the low memory cost and guaranteed constant search time, collisions result in false positives. As a result, Bloom Filters would not be appropriate for situations that require high accuracy.

Implementation Choices: For my implementation, I chose to use an array of integers (instead of bits) in order to simplify the code. My code focuses more on the concept than on space efficiency. Furthermore, my hashing functions are very simple. More sophisticated functions (usually chosen according to the kind of data to be stored) would be used in practice in order to avoid collisions.

I chose to represent my data structure as a struct with two fields: the bit array and the size of that array. My functions are able to initialize the structure, insert, test for membership, delete, and free the structure. My Bloom Filter was implemented in a polymorphic manner. As a result, I created very basic interface functions. In addition, I separated the hashing functions from the primitives, allowing them to be modified without re-compilation of the primitive functions.

I was able to delete from my Bloom Filter because my array (of integers) could hold more values than just true or false. I chose to store how many inserted elements have mapped to each index. As a result, I could safely delete from the array without the fear of zeroing out the index of a member that also hashed to the same index as the element to be removed.

I used three hashing functions. I made the decision to apply the modulo to the calculated index inside of my primitives, as opposed to inside of my hashing functions, so that the hashing functions do not need to know how large the Bloom Filter bit array is. For the purposes of dealing with strings, the three chosen functions calculate:
1. The sum of the ASCII values of the characters of the string.
2. The product of the ASCII values of the characters of the string.
3. The length of the string.
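The counting scheme and the three string hashes described above can be sketched as follows. This is a minimal illustration, not the report's implementation: the names cbloom, cb_* and hash_* are hypothetical, and unsigned arithmetic is an assumption made here so that the product hash wraps around instead of overflowing a signed int.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NHASH 3

/* Hypothetical names (cbloom, cb_*), used only for this sketch. */
typedef struct {
    unsigned *count;   /* count[i] = insertions that hashed to slot i */
    unsigned size;     /* number of slots */
} cbloom;

/* The three hashes: character sum, character product, length. */
static unsigned hash_sum(const char *s) {
    unsigned h = 0;
    while (*s) h += (unsigned char)*s++;
    return h;
}
static unsigned hash_prod(const char *s) {
    unsigned h = 1;
    while (*s) h *= (unsigned char)*s++;   /* wraps, never overflows */
    return h;
}
static unsigned hash_len(const char *s) {
    return (unsigned)strlen(s);
}

/* The modulo is applied here, so the hashes never see the table size. */
void cb_indices(const cbloom *b, const char *s, unsigned idx[NHASH]) {
    idx[0] = hash_sum(s)  % b->size;
    idx[1] = hash_prod(s) % b->size;
    idx[2] = hash_len(s)  % b->size;
}

/* Returns 1 on success, 0 if memory could not be allocated. */
int cb_init(cbloom *b, unsigned size) {
    assert(size > 0);
    b->count = calloc(size, sizeof *b->count);
    b->size = size;
    return b->count != NULL;
}

void cb_insert(cbloom *b, const char *s) {
    unsigned idx[NHASH];
    int i;
    cb_indices(b, s, idx);
    for (i = 0; i < NHASH; i++)
        b->count[idx[i]]++;
}

/* Returns 1 for "maybe present", 0 for "definitely absent". */
int cb_member(const cbloom *b, const char *s) {
    unsigned idx[NHASH];
    int i;
    cb_indices(b, s, idx);
    for (i = 0; i < NHASH; i++)
        if (b->count[idx[i]] == 0)
            return 0;
    return 1;
}

/* Returns 0 and changes nothing if s is definitely not a member. */
int cb_delete(cbloom *b, const char *s) {
    unsigned idx[NHASH];
    int i;
    if (!cb_member(b, s))
        return 0;
    cb_indices(b, s, idx);
    for (i = 0; i < NHASH; i++)
        if (b->count[idx[i]] > 0)      /* guard against underflow */
            b->count[idx[i]]--;
    return 1;
}

void cb_free(cbloom *b) {
    free(b->count);
    b->count = NULL;
}
```

Because each slot holds a count rather than a single bit, deleting one element only lowers the counters it raised, so other members that share a slot stay visible. Deleting a string that was never inserted but tests as a false positive would still corrupt the counters, which is a limitation of any counting filter of this kind.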

Motivation: I wanted to implement and discuss Bloom Filters because I find the idea of probabilistic data structures fascinating. Just like with the randomized pivot selection of quicksort, there is a seemingly mystical quality of randomness that can allow our algorithms and data structures to perform better (in expectation) when we don't know their exact behavior before runtime. This concept seems counterintuitive, yet powerful.

Results: In this example, I inserted the words of poem.txt into my Bloom Filter. I then tested whether 20 very common English words were (probably) in my set. The words "to", "of", and "you" (which actually are in the file) were correctly reported as found by the Bloom Filter. In addition, the words "be", "in", and "have" were also reported to be in my set, because they hashed to indices that had already been set to true by the words of poem.txt. One false positive is explored in more detail below, showing which words caused the false positive for "in".

word      hash1   hash2   hash3
in        23      30      2
to         3      12      2
belong    23      16      6
base      27      30      4

C Implementation:

Makefile: Makefile

Header Files: globals.h, bloom.h, bloom_interface.h, hash.h

Source Files: main.c, bloom.c, bloom_interface.c, hash.c

#
# Programmer: Willie Boag
#
# Makefile for Bloom Filters
#

bloom: bloom.o main.o hash.o bloom_interface.o
	gcc -o bloom bloom.o main.o hash.o bloom_interface.o

bloom.o: bloom.c bloom.h globals.h hash.h
	gcc -ansi -pedantic -Wall -c bloom.c

main.o: main.c bloom.h globals.h bloom_interface.h
	gcc -ansi -pedantic -Wall -c main.c

hash.o: hash.c hash.h globals.h
	gcc -ansi -pedantic -Wall -c hash.c

bloom_interface.o: bloom_interface.c bloom.h globals.h
	gcc -ansi -pedantic -Wall -c bloom_interface.c

clean:
	rm -f *.o

/*************************************************************/
/* Programmer: Willie Boag */
/* */
/* globals.h (Bloom Filters) */
/*************************************************************/
#ifndef _globals
#define _globals

typedef enum { OK, ERROR } status ;
typedef enum { FALSE=0 , TRUE=1 } bool ;
typedef void *generic_ptr ;

#endif

/*****************************************************/
/* Programmer: Willie Boag */
/* */
/* bloom.h (Bloom Filters) */
/*****************************************************/
#ifndef _bloom
#define _bloom

#include "globals.h"

typedef struct {
    int * base ;
    int size ;
} bloom ;

extern status init_bloom( bloom *p_B, int size ) ;
extern void bloom_insert( bloom *p_B, generic_ptr data ) ;
extern bool bloom_member( bloom *p_B, generic_ptr data ) ;
extern status bloom_delete( bloom *p_B, generic_ptr data ) ;
extern void destroy_bloom( bloom *p_B ) ;

#endif

/*****************************************************/
/* Programmer: Willie Boag */
/* */
/* bloom_interface.h (Bloom Filters) */
/*****************************************************/
#ifndef _bloom_interface
#define _bloom_interface

#include "globals.h"
#include "bloom.h"

extern void str_bloom_insert( bloom *p_B, char *data ) ;
extern bool str_bloom_member( bloom *p_B, char *data ) ;
extern status str_bloom_delete( bloom *p_B, char *data ) ;

#endif

/*********************************************/
/* Programmer: Willie Boag */
/* */
/* hash.h (Bloom Filters) */
/*********************************************/
#ifndef _hash
#define _hash

/* generic_ptr is defined in globals.h */
#include "globals.h"

extern int hash1( generic_ptr str ) ;
extern int hash2( generic_ptr str ) ;
extern int hash3( generic_ptr str ) ;

#endif

/********************************************************/
/* Programmer: Willie Boag */
/* */
/* main.c (Bloom Filters) */
/********************************************************/
#include "globals.h"
#include "bloom.h"
#include "bloom_interface.h"
#include <stdio.h>
#include <stdlib.h>

#define CALL_USAGE "\n\tCall Usage: ./bloom input_file comparison_file\n\n"

int main( int argc, char *argv[] ) {
    FILE *fid1, *fid2 ;
    char word[20] ;
    bloom B ;

    if (argc != 3) {
        fprintf( stderr, CALL_USAGE ) ;
        exit(1) ;
    }

    fid1 = fopen( argv[1], "r" ) ;
    fid2 = fopen( argv[2], "r" ) ;

    /* Errorcheck file opens. */
    if (fid1 == NULL) {
        fprintf( stderr, "\n\tERROR: Could not open file: %s\n\n", argv[1] ) ;
        exit(1) ;
    }
    if (fid2 == NULL) {
        fprintf( stderr, "\n\tERROR: Could not open file: %s\n\n", argv[2] ) ;
        exit(1) ;
    }

    init_bloom( &B, 32 ) ;

    /* Read input data into Bloom Filter. */
    /* The field width keeps long words from overflowing word[20]. */
    while (fscanf( fid1, "%19s ", word ) != EOF)
        str_bloom_insert( &B, word ) ;

    /* Check comparison data for membership. */
    while (fscanf( fid2, "%19s ", word ) != EOF)
        if ( str_bloom_member(&B, word) )
            printf("\n\"%s\": maybe", word ) ;
        else
            printf("\n\"%s\": NO", word ) ;
    printf("\n\n\n" ) ;

    destroy_bloom( &B ) ;
    fclose( fid1 ) ;
    fclose( fid2 ) ;
    return 0 ;
}

/*****************************************************/
/* Programmer: Willie Boag */
/* */
/* bloom.c (Bloom Filters) */
/*****************************************************/
#include "globals.h"
#include "bloom.h"
#include "hash.h"
#include <stdlib.h>
#include <string.h>
#include <stdio.h>

extern status init_bloom( bloom *p_B, int size ) {
    int i ;
    bloom B ;

    B.base = (int *) malloc( sizeof(int) * size ) ;
    if (B.base == NULL)
        return ERROR ;
    B.size = size ;

    for (i = 0 ; i < size ; i++)
        B.base[i] = 0 ;

    *p_B = B ;
    return OK ;
}

extern void bloom_insert( bloom *p_B, generic_ptr data ) {
    int h1, h2, h3 ;

    h1 = hash1( data ) % p_B->size ;
    h2 = hash2( data ) % p_B->size ;
    h3 = hash3( data ) % p_B->size ;

    /* Each slot counts how many inserted elements map to it. */
    p_B->base[h1]++ ;
    p_B->base[h2]++ ;
    p_B->base[h3]++ ;
}

extern bool bloom_member( bloom *p_B, generic_ptr data ) {
    int h1, h2, h3 ;

    h1 = hash1( data ) % p_B->size ;
    h2 = hash2( data ) % p_B->size ;
    h3 = hash3( data ) % p_B->size ;

    if (p_B->base[h1] == 0) return FALSE ;
    if (p_B->base[h2] == 0) return FALSE ;
    if (p_B->base[h3] == 0) return FALSE ;
    return TRUE ;
}

extern status bloom_delete( bloom *p_B, generic_ptr data ) {
    int h1, h2, h3 ;

    h1 = hash1( data ) % p_B->size ;
    h2 = hash2( data ) % p_B->size ;
    h3 = hash3( data ) % p_B->size ;

    /* A zero count means the data was never inserted; refuse to */
    /* decrement so that no counter can become negative. */
    if (p_B->base[h1] <= 0) return ERROR ;
    if (p_B->base[h2] <= 0) return ERROR ;
    if (p_B->base[h3] <= 0) return ERROR ;

    p_B->base[h1]-- ;
    p_B->base[h2]-- ;
    p_B->base[h3]-- ;
    return OK ;
}

extern void destroy_bloom( bloom *p_B ) {
    free( p_B->base ) ;
    p_B->base = NULL ;
}

/*****************************************************/
/* Programmer: Willie Boag */
/* */
/* bloom_interface.c (Bloom Filters) */
/*****************************************************/
#include "globals.h"
#include "bloom.h"

extern void str_bloom_insert( bloom *p_B, char *data ) {
    bloom_insert( p_B, (generic_ptr) data ) ;
}

extern bool str_bloom_member( bloom *p_B, char *data ) {
    return bloom_member( p_B, (generic_ptr) data ) ;
}

extern status str_bloom_delete( bloom *p_B, char *data ) {
    return bloom_delete( p_B, (generic_ptr) data ) ;
}

/*******************************************************/
/* Programmer: Willie Boag */
/* */
/* hash.c (Bloom Filters) */
/*******************************************************/
#include "globals.h"
#include <string.h>

/* Sum of the character values of the string. */
extern int hash1( generic_ptr data ) {
    int sum = 0 ;
    char *str = (char *) data ;

    while (*str != '\0')
        sum += (int) *str++ ;
    return sum > 0 ? sum : -sum ;
}

/* Product of the character values of the string. */
/* Note: the product can overflow an int for longer strings. */
extern int hash2( generic_ptr data ) {
    int prod = 1 ;
    char *str = (char *) data ;

    while (*str != '\0')
        prod *= (int) *str++ ;
    return prod > 0 ? prod : -prod ;
}

/* Length of the string. */
extern int hash3( generic_ptr data ) {
    return (int) strlen( (char *) data ) ;
}

Fast Fourier Transform


Abstract: This is the algorithm that changed the world. The Fourier Transform is a tool that makes analyzing waves and signals very easy, and waves are everywhere (light, sound, radar, etc.). Unfortunately, although the output of the transform is easy to work with, it used to be very expensive to actually apply the transform, which defeated its purpose altogether. But in 1965, James Cooley and John Tukey published an efficient way to calculate the Fourier Transform of a signal. Their algorithm is called the Fast Fourier Transform (FFT), because it is much more efficient than the original method for computing the Discrete Fourier Transform (DFT). As a result, massive amounts of data can be analyzed and processed in near-real time, profoundly impacting a wide range of technology, including MRIs and police radar guns.

Fourier Transform: Using such a transform, any wave can be broken down into its fundamental building blocks: sine waves. The idea is that any given wave can be represented as the sum of sine waves of different frequencies. Such a representation allows for easy calculations in noise reduction and convolution of waves. The concept of breaking a wave down into its natural building blocks is better understood using a simpler example. Consider two 3-dimensional vectors, which we can call a and b. The goal is to add these two vectors together. There are many different methods that can be used to add them (such as physically moving them for the tail-to-tip method), but the most natural way to do it is to break the two vectors down into their x, y, and z components and add the corresponding components together. The important step here was breaking our vectors down into their x, y, and z coordinates. This made the addition of a and b very easy. To relate this analogy back to the Fourier Transform, it is the transform that actually breaks a signal into its corresponding sine waves. The output of the transform is the set of coefficients of those waves.

Original Algorithm: The original method for calculating the Fourier Transform involves generating a matrix of complex numbers. The data representing the signal is then stored in a vector and multiplied by the complex matrix. This process is very inefficient, because the matrix holds n by n values (where n is the length of the signal vector), so both the space needed to store it and the time needed to multiply by it grow quadratically as n becomes large. This poor scalability is what makes the algorithm so impractical. Computers need to analyze large amounts of signal data very quickly, and that is just not possible with the complex matrix method.
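As a sketch, the matrix method can be written out as follows. The symbols are notation assumed here for illustration: x is the signal vector, X is its transform, F is the complex matrix, and ω is a primitive n-th root of unity. The positive sign in the exponent follows the convention of the dft.c listing in this report; many texts use the opposite sign.

```latex
X_k = \sum_{j=0}^{n-1} F_{kj}\, x_j ,
\qquad F_{kj} = \omega^{jk} ,
\qquad \omega = e^{2\pi i / n} ,
\qquad k = 0, 1, \dots, n-1
```

Each of the n outputs is a sum of n products, so the method stores n^2 matrix entries and performs n^2 complex multiplications.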

Fast Fourier Transform: Cooley and Tukey realized that they could take advantage of the special structure of the complex matrix, which relies on the periodic nature of the complex entries. By separating the odd rows from the even rows of the calculated output vector, they saw an incredible pattern! The grouped rows could be arranged in such a way that each group was formed by the Fourier Transform of a smaller vector. I will say that once again, because of how critical that discovery is. They found a recursive formula that calculates the FFT of a vector of length n by calculating the FFTs of two vectors of length (n/2). The calculation of each (n/2)-sized vector would in turn generate two (n/4)-sized vectors. This process of solving simpler problems could continue until the problem becomes very easy to solve. Without getting too technical in my analysis, I will just say that repeatedly cutting the input size in half allowed for very efficient calculations. In addition, there was no longer the need to calculate and store the very large complex matrix for this algorithm. As a result, the FFT greatly reduced the cost of performing the Fourier Transform on a signal vector.
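The halving step can be sketched as a recurrence. The notation is assumed here for illustration and does not appear in the original text: X is the transform of a length-n vector, E and O are the length-(n/2) transforms of its even-indexed and odd-indexed entries, ω_n = e^{2πi/n} (the sign convention of this report's code), and T(n) is the running time for length n.

```latex
X_k = E_k + \omega_n^{k}\, O_k ,
\qquad
X_{k+n/2} = E_k - \omega_n^{k}\, O_k ,
\qquad k = 0, \dots, n/2 - 1

T(n) = 2\,T(n/2) + O(n) \;\Longrightarrow\; T(n) = O(n \log n)
```

The minus sign in the second formula comes from the periodicity of the complex entries: ω_n^{k+n/2} = -ω_n^{k}. The same scaled odd term is therefore reused for two outputs.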

A Taste of Recursion: Without going into too much detail, this section will point out some of the repetitive structures of the Fast Fourier Transform calculations. The goal of this section is not to understand every little detail, but instead to begin to see the overall picture. Colors have been used as an aid when referencing the equations. Even without a full understanding of the mechanics of the algorithm, the important take-away is that we are able to divide the problem into smaller subproblems. The written expression on the upper-left-hand side of the page describes the components of a length-4 signal vector after applying the Fourier Transform matrix multiplication. In this picture, the terms from the complex matrix have already been evaluated, which is where the complex numbers involving e came from. On the upper-right-hand side of the page, I have grouped the odd rows and the even rows and separated the two groups with a black line. In the even grouping, notice that the terms (x0 + x2) and (x1 + x3) are the same across the top two rows. In addition, the odd grouping has (x0 - x2) and (x1 - x3) in the same situation for the bottom two rows. The important idea behind this observation is that the large system of equations for calculating the DFT of a length-4 vector can be re-arranged into two smaller groups that exhibit remarkable similarities in structure. Furthermore, the two terms (x0 + x2) and (x0 - x2) define the DFT of the length-2 vector {x0, x2}. Likewise, (x1 + x3) and (x1 - x3) form the DFT of the vector {x1, x3}. As promised, the length-4 DFT can be written in terms of two length-2 DFTs. For the curious learner, it can be shown that the coefficients (that is, the terms involving e) are identical within groups. The reason for this involves the periodic structure of complex numbers.
In other words, both groups have the same form f(x0,x2) + scale * f(x1,x3), except that the top row of each group has a plus sign and the bottom row has a minus sign. There is a picture at the bottom of the page to clarify the above statements. We can obtain equal coefficients for the even vector by rewriting the second row, changing the + (as it has above) to -. We can do the same for the coefficients of the odd vector, making sure to again change the + to - in the second row. This concludes the analysis for the length-4 FFT.
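Written out, the length-4 case looks as follows. This is a sketch under an assumed sign convention, ω = e^{2πi/4} = i, which matches the dft.c listing in this report; with the opposite convention the two factors of i change sign. X_0 through X_3 are names assumed here for the four output components.

```latex
\begin{aligned}
X_0 &= (x_0 + x_2) + (x_1 + x_3) \\
X_2 &= (x_0 + x_2) - (x_1 + x_3) \\
X_1 &= (x_0 - x_2) + i\,(x_1 - x_3) \\
X_3 &= (x_0 - x_2) - i\,(x_1 - x_3)
\end{aligned}
```

The pair (x_0 + x_2, x_0 - x_2) is the length-2 DFT of {x0, x2}, and the pair (x_1 + x_3, x_1 - x_3) is the length-2 DFT of {x1, x3}. Within each group the scale factor is the same (1 for the even rows, i for the odd rows) and only the sign differs.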

Why It Matters: Although that seems like a lot of work for simplifying an algorithm, the gains in efficiency of the FFT over the original DFT algorithm are incredible. The important thing to consider is how fast the computation time grows as the length of the signal vector increases. I have implemented both algorithms for the Fourier Transform, and compared the computation difference for signal vectors of very large length.

When reading the output of the time command, the important number to consider is the one on the middle line that says "user". That is the time that was required for the actual algorithm to run. By comparing these two times, we can see that for a vector of length 8192, the FFT took 0.148 seconds. Compare that to the 60.392 seconds required for the ordinary DFT algorithm. In case you do not have your calculator with you, the FFT was about 408 times faster.
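A rough operation count gives a sense of where a speedup of this size comes from. This is only an estimate under assumed cost models (n^2 multiplications for the matrix method, n log2 n for the FFT) and it ignores constant factors, memory allocation, and the cost of building the matrix.

```latex
n = 8192 = 2^{13}, \qquad
n^2 = 67{,}108{,}864, \qquad
n \log_2 n = 8192 \cdot 13 = 106{,}496

\frac{n^2}{n \log_2 n} = \frac{8192}{13} \approx 630,
\qquad
\frac{60.392\ \text{s}}{0.148\ \text{s}} \approx 408
```

The measured ratio has the same order of magnitude as the estimate.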

Conclusion: The FFT is tremendously useful for breaking down signals and waves into their natural building blocks. Once signals are in this more natural form, computations such as combining two overlapping signals become very easy. As a result, signals can be processed at incredibly high speeds. Imagine trying to meaningfully interpret MRI results if it took 400 times longer to process the data.

C Implementation:

Makefile: Makefile

Header Files: complex.h

Source Files: fft.c, dft.c, complex.c

#
# Programmer: Willie Boag
#
# Makefile for Fast Fourier Transform
#

all: dft fft

fft: fft.o complex.o
	gcc -o fft fft.o complex.o -lm

dft: dft.o complex.o
	gcc -o dft dft.o complex.o -lm

dft.o: dft.c complex.h
	gcc -ansi -pedantic -Wall -c dft.c -D_GNU_SOURCE

complex.o: complex.c complex.h
	gcc -ansi -pedantic -Wall -c complex.c

clean:
	rm -f *.o

/***************************************************************/
/* Programmer: Willie Boag */
/* */
/* complex.h (Fast Fourier Transform) */
/***************************************************************/
#ifndef _complex
#define _complex

typedef struct {
    double real ;
    double imaginary ;
} complex ;

typedef enum { FALSE=0 , TRUE=1 } bool ;
typedef enum { ERROR, OK } status ;

complex load_complex( double real, double imaginary ) ;
complex add_complex( complex a, complex b ) ;
complex multiply_complex( complex a, complex b ) ;
complex subtract_complex( complex a, complex b ) ;

#endif

/***************************************************************/
/* Programmer: Willie Boag */
/* */
/* fft.c (Fast Fourier Transform) */
/***************************************************************/
#include "complex.h"
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

typedef enum { FORWARD, INVERSE } direction ;

/* Global variable. */
complex *omega ;

complex *create_vector( int n ) {
    return (complex *) malloc( sizeof(complex) * n ) ;
}

/* Precompute the n complex roots of unity. */
void fill_omega_vector( int n ) {
    int i ;
    double real, imag ;

    omega = create_vector( n ) ;
    for ( i = 0 ; i < n ; i++ ) {
        real = cos( (2 * M_PI/n) * i ) ;
        imag = sin( (2 * M_PI/n) * i ) ;
        omega[i] = load_complex( real, imag ) ;
    }
}

void print_vector( complex v[], int n ) {
    int i ;

    printf( "\n" ) ;
    for ( i = 0 ; i < n ; i++ )
        printf( "\t%f %f\n", v[i].real, v[i].imaginary ) ;
    printf( "\n\n" ) ;
}

/* Transform the length-2^k vector p. m is the stride into omega */
/* (1 at the top level, doubled at each level of recursion). */
complex *FFT( complex *p, int k , int m, direction dir ) {
    int n, i ;
    complex *transform ;
    complex *evens, *odds ;
    complex *p_e, *p_o ;
    complex scale, scaled_odd ;

    n = pow(2, k ) ;
    transform = create_vector( n ) ;

    if (k == 0) {
        transform[0] = p[0] ;
        return transform ;
    }

    p_e = create_vector( n/2 ) ;
    p_o = create_vector( n/2 ) ;

    /* collect evens */
    for ( i = 0 ; i < n/2 ; i++ )
        p_e[i] = p[2*i] ;

    /* collect odds */
    for ( i = 0 ; i < n/2 ; i++ )
        p_o[i] = p[2*i + 1] ;

    /* Two n/2 FFTs */
    evens = FFT( p_e, k-1, 2*m, dir ) ;
    odds = FFT( p_o, k-1, 2*m, dir ) ;

    for ( i = 0 ; i < n/2 ; i++ ) {
        /* Forward or Inverse transform? */
        scale.real = omega[m*i].real ;
        scale.imaginary = (((dir == FORWARD) ? 1 : -1) * omega[m*i].imaginary) ;

        /* scale * odd */
        scaled_odd = multiply_complex( scale, odds[i] ) ;

        /* even + (scale * odd) */
        transform[ i ] = add_complex( evens[i], scaled_odd ) ;

        /* even - (scale * odd) */
        transform[ n/2 + i ] = subtract_complex( evens[i], scaled_odd ) ;
    }

    /* Scale result by 1/n for inverse FFT. */
    if (dir == INVERSE && m == 1)
        for (i = 0 ; i < n ; i++) {
            transform[i].real /= (double) n ;
            transform[i].imaginary /= (double) n ;
        }

    free( p_e ) ;
    free( p_o ) ;
    free( evens ) ;
    free( odds ) ;

    return transform ;
}

complex *pointwise_complex_multiply( complex *u, complex*v, int n ) {
    int i ;
    complex *w ;

    w = (complex *) malloc( sizeof(complex) * n ) ;
    for ( i = 0 ; i < n ; i++ )
        w[i] = multiply_complex( u[i], v[i] ) ;
    return w ;
}

int main( int argc, char *argv[] ) {
    int i, n , k ;
    complex *u, *v, *w ;
    complex *uf, *vf, *wf ;

    /* n is assumed to be a power of two. */
    n = atoi( argv[1] ) ;
    k = ceil( log( (double) n ) / log( 2.0 ) ) ;

    fill_omega_vector( n ) ;

    u = create_vector( n ) ;
    v = create_vector( n ) ;

    for ( i = 0 ; i < n/2 ; i++ ) {
        u[i] = load_complex( i + 1.0, 0.0 ) ;
        u[i+n/2] = load_complex( 0.0 , 0.0 ) ;

        v[i] = load_complex( 1.0 , 0.0 ) ;
        v[i+n/2] = load_complex( 0.0 , 0.0 ) ;
    }

    uf = FFT(u, k, 1, FORWARD ) ;
    vf = FFT(v, k, 1, FORWARD ) ;

    wf = pointwise_complex_multiply( uf, vf, n ) ;
    w = FFT(wf, k, 1, INVERSE) ;

    printf("\n\nw:") ;
    print_vector(w, n) ;

    free( omega ) ;
    free( u ) ;
    free( uf ) ;
    free( v ) ;
    free( vf ) ;
    free( w ) ;
    free( wf ) ;

    return 0 ;
}

/*****************************************************/
/* Programmer: Willie Boag */
/* */
/* dft (vs. Fast Fourier Transform) */
/*****************************************************/
#include "complex.h"
#include <stdlib.h>
#include <stdio.h>
#include <math.h>

complex *complex_matrix_vector_multiply( complex *A, complex *x, int n ) ;
complex *pointwise_complex_multiply( complex *u, complex*v, int n ) ;

void print_vector( complex v[], int n ) {
    int i ;

    printf( "\n" ) ;
    for ( i = 0 ; i < n ; i++ )
        printf( "\t%f %f\n", v[i].real, v[i].imaginary ) ;
    printf( "\n\n" ) ;
}

int main( int argc, char *argv[] ) {
    int n, i, j ;
    double real, imag ;
    complex *F, *IF ;
    complex *u, *uf ;
    complex *v, *vf ;
    complex *w, *wf ;

    n = atoi( argv[1] ) ;

    F = (complex *) malloc( sizeof(complex) * n * n ) ;
    IF = (complex *) malloc( sizeof(complex) * n * n ) ;
    u = (complex *) malloc( sizeof(complex) * n ) ;
    v = (complex *) malloc( sizeof(complex) * n ) ;

    for ( i = 0 ; i < n ; i++ ) {
        /* Fourier Transform matrix */
        for ( j = 0 ; j < n ; j++ ) {
            real = cos(((2 * M_PI)/n) * ((i * j) % n) ) ;
            imag = sin(((2 * M_PI)/n) * ((i * j) % n) ) ;
            F[i*n + j] = load_complex( real, imag ) ;
        }

        /* Inverse Fourier Transform matrix: conjugate entries, */
        /* scaled by 1/n. */
        for ( j = 0 ; j < n ; j++ ) {
            real = (1.0/n) * cos(((2 * M_PI)/n) * ((i * j) % n) ) ;
            imag = (-1.0/n) * sin(((2 * M_PI)/n) * ((i * j) % n) ) ;
            IF[i*n + j] = load_complex( real, imag ) ;
        }
    }

    /* Fill vectors with data */
    for ( i = 0 ; i < n/2 ; i++ ) {
        u[i] = load_complex( i + 1.0, 0.0 ) ;
        u[i+n/2] = load_complex( 0.0 , 0.0 ) ;

        v[i] = load_complex( 1.0 , 0.0 ) ;
        v[i+n/2] = load_complex( 0.0 , 0.0 ) ;
    }

    /* Perform Fourier Transform */
    uf = complex_matrix_vector_multiply( F, u, n ) ;
    vf = complex_matrix_vector_multiply( F, v, n ) ;

    wf = pointwise_complex_multiply( uf, vf, n ) ;
    w = complex_matrix_vector_multiply( IF, wf, n ) ;

    printf("\n\nw:") ;
    print_vector(w, n) ;

    free( u ) ;
    free( uf ) ;
    free( v ) ;
    free( vf ) ;
    free( w ) ;
    free( wf ) ;
    free( F ) ;
    free( IF ) ;

    return 0 ;
}

complex *complex_matrix_vector_multiply( complex *A, complex *x, int n ) {
    int i, j ;
    complex *b ;
    complex sum, prod ;

    b = (complex *) malloc( sizeof(complex) * n ) ;
    for ( i = 0 ; i < n ; i++ ) {
        sum = load_complex( 0.0, 0.0 ) ;
        for ( j = 0 ; j < n ; j++ ) {
            prod = multiply_complex( A[i*n + j], x[j] ) ;
            sum = add_complex( sum, prod ) ;
        }
        b[i] = sum ;
    }
    return b ;
}

complex *pointwise_complex_multiply( complex *u, complex*v, int n ) {
    int i ;
    complex *w ;

    w = (complex *) malloc( sizeof(complex) * n ) ;
    for ( i = 0 ; i < n ; i++ )
        w[i] = multiply_complex( u[i], v[i] ) ;
    return w ;
}

/***************************************************************/
/* Programmer: William George Boag */
/* */
/* complex.c (Fast Fourier Transform) */
/***************************************************************/
#include "complex.h"
#include <stdio.h>

extern complex load_complex( double real, double imaginary ) {
    complex c ;

    c.real = real ;
    c.imaginary = imaginary ;
    return c ;
}

extern complex add_complex( complex a, complex b ) {
    complex sum ;

    sum.real = a.real + b.real ;
    sum.imaginary = a.imaginary + b.imaginary ;
    return sum ;
}

extern complex multiply_complex( complex a, complex b ) {
    complex prod ;
    double ar, ai, br, bi ;

    ar = a.real ;
    ai = a.imaginary ;
    br = b.real ;
    bi = b.imaginary ;

    /* (ar + ai*i)(br + bi*i) = (ar*br - ai*bi) + (ar*bi + ai*br)i */
    prod.real = (ar * br) - (ai * bi) ;
    prod.imaginary = (ar * bi) + (ai * br) ;
    return prod ;
}

extern complex subtract_complex( complex a, complex b ) {
    complex diff ;

    diff.real = a.real - b.real ;
    diff.imaginary = a.imaginary - b.imaginary ;
    return diff ;
}

Appendices
Appendix A: Kruskal's MST Testing Code

Appendix B: Topological Sort Testing Code

/****************************************************************/
/* Programmer: Willie Boag */
/* */
/* create.c */
/* */
/* Task: Create a randomly generated graph that is guaranteed */
/* to be connected. The graph is stored in the file */
/* tmp.txt */
/****************************************************************/
#include <stdlib.h>
#include <stdio.h>
#include <time.h>

int main( int argc, char *argv[] ) {
    FILE *fid ;
    int vertices, edges ;
    int i ;
    int a, b , weight ;

    /* Set new seed. */
    srand(time(NULL)) ;

    fid = fopen( "tmp.txt", "w" ) ;
    if (fid == NULL) {
        fprintf( stderr, "\n\tERROR: Could not open file: tmp.txt\n\n" ) ;
        exit(1) ;
    }

    /* Randomly determine number of edges and vertices. */
    vertices = rand() % 20 + 1 ;
    edges = rand() % 50 ;

    /* Create a graph file. */
    fprintf( fid, "%d\n", vertices ) ;
    for (i = 0 ; i < edges ; i++) {
        /* Random edge */
        a = rand() % vertices ;
        b = rand() % vertices ;
        weight = rand() % 30 ;

        /* No self-loops. */
        if (a == b)
            b = (b + 1) % vertices ;

        fprintf( fid, "%d %d %d\n", a, b, weight ) ;
    }

    /* Guarantee that graph is connected. */
    for (i = 0 ; i < vertices ; i++)
        fprintf( fid,"0 %d 1000\n", i ) ;

    fclose( fid ) ;
    return 0 ;
}

/************************************************************************/
/* Programmer: Willie Boag */
/* */
/* Program: compare.c */
/* */
/* Task: Find a topological sort of a set of randomly created graphs. */
/* Then, verify that the computed solution is correct. */
/************************************************************************/
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include <string.h>

#define CALL_USAGE "\n\tCall Usage: ./compare [n]\n\n"

void create_graph( void ) ;
int is_topo( void ) ;

int main( int argc, char *argv[] ) {
    int n = 20, failures = 0 ;
    int i ;
    char command[70] ;

    /* Call Usage errorcheck */
    if (argc != 1 && argc != 2) {
        fprintf( stderr, CALL_USAGE ) ;
        exit(1) ;
    }

    /* Change number of iterations, if desired. */
    if (argc == 2)
        n = atoi(argv[1]) ;

    for (i = 0 ; i < n ; i++) {
        /* Create random graph (possibly containing cycles). */
        create_graph() ;

        /* Convert the graph into a DAG using the Minimum Spanning Forest algorithm. */
        system( "~wboag/Public/msf .tmp.txt > .tmp2.txt" ) ;

        /* Find a topological sort of the DAG. */
        sprintf( command, "tsort .tmp2.txt > .tmp.txt" ) ;
        system( command ) ;

        /* Verify if topological sort is correct. */
        if ( !is_topo() ) {
            failures++ ;
            fprintf(stderr, "\n\tFAILURE #%d", failures ) ;
            fprintf(stderr, "\n\tFiles: %d, %d\n\n", failures*2+1, failures*2 + 2 ) ;
            sprintf( command, "cp .tmp.txt fail%d.txt",2*failures + 1 ) ;
            system( command ) ;
            sprintf( command, "cp .tmp2.txt fail%d.txt",2*failures + 2 ) ;
            system( command ) ;
        }
    }

    /* Analysis of tests. */
    printf("\n\nPassed on %d/%d random graphs.\n\n\n", n - failures, n ) ;
    return 0 ;
}

void create_graph( void ) {
    FILE *fid ;
    int vertices, edges ;
    int i ;
    int a, b , weight ;

    /* Set new seed. */
    srand(time(NULL)) ;

    fid = fopen( ".tmp.txt", "w" ) ;
    if (fid == NULL) {
        fprintf( stderr, "\n\tERROR: Could not open file: .tmp.txt\n\n" ) ;
        exit(1) ;
    }

    /* Randomly determine number of edges and vertices. */
    vertices = rand() % 20 + 1 ;
    edges = rand() % 50 ;

    /* Create a graph file. */
    fprintf( fid, "%d\n", vertices ) ;
    for (i = 0 ; i < edges ; i++) {
        /* Random edge */
        a = rand() % vertices ;
        b = rand() % vertices ;
        weight = rand() % 30 ;

        /* No self-loops. The parentheses are needed: without them, */
        /* % binds tighter than + and b could leave the vertex range. */
        if (a == b)
            b = (b + 1) % vertices ;

        fprintf( fid, "%d %d %d\n", a, b, weight ) ;
    }
    fclose( fid ) ;
}

int is_topo( void ) {
    FILE *fid ;
    int vertices ;
    int i, j ;
    int *visited, **depends ;
    int a, b, weight ;

    /* Update dependency matrix for given DAG. */
    fid = fopen(".tmp2.txt", "r") ;
    fscanf( fid, "%*[^:]:" ) ;
    fscanf( fid, "%d", &vertices ) ;

    visited = (int *) malloc( sizeof(int) * vertices ) ;
    depends = (int **) malloc( sizeof(int *) * vertices ) ;
    for (i = 0 ; i < vertices ; i++)
        depends[i] = (int *) malloc( sizeof(int) * vertices ) ;

    /* Initialize the visited "bit vector" */
    for (i = 0 ; i < vertices ; i++)
        visited[i] = 0 ;

    /* Initialize the dependency matrix. */
    for (i = 0 ; i < vertices ; i++)
        for (j = 0 ; j < vertices ; j++)
            depends[i][j] = 0 ;

    while (fscanf(fid, "%d %d %d", &a, &b, &weight) != EOF)
        depends[b][a] = 1 ;
    fclose( fid ) ;

    /* Sweep through alleged topological sort. Check dependencies */
    fid = fopen( ".tmp.txt", "r" ) ;
    fscanf( fid, "%d", &a ) ;
    for (i = 0 ; i < vertices ; i++)
        for ( j = 0 ; j < vertices ; j++)
            /* Case: dependency is present, but they are out of order. */
            if ( depends[i][j] == 1 && visited[j] == 0)
                return 0 ;
            else
                visited[i] = 1 ;
    fclose(fid ) ;
    return 1 ;
}
