PDS Project
Fall 2008
Data Structures Project
GENERAL PROJECT INFORMATION
Course Name
Project Name
Project Description
PDS is a data store for personal use. PDS supports all the basic
operations needed for managing persistent data.
Project Type
Individual
Due Date
17-Oct-2008, 10 AM
PROJECT COMPONENTS
Data Files
Data files used by PDS are binary files containing multiple data objects where each data
object has a fixed size. Each data object may contain many fields with some
restrictions as given below. The project uses two data files corresponding to two
different data object types as explained below. The two data object types are related to
each other such that every base_object is related to exactly one related_object. For
example, employee is an example of a base_object and department is an example of a
related_object. In this example, every employee object is related to exactly one
department object.
Data file name is obtained by concatenating .dat extension to the data object name. For
example, if data object name is employee, then the data file name will be
employee.dat.
file_1.dat
This data file contains data corresponding to the first data object type. Each data object in this
file contains any number of data fields with the restriction that the first two fields should
always be key and the related key. The general structure is as shown below:
unsigned int key;
unsigned int related_key;
some_data_type field_3;
some_data_type field_4;
some_data_type field_5;
some_data_type field_6;
In this structure, key represents a unique identifier of the data object and related_key
refers to the unique identifier of another data object with which this data object is
related.
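As a minimal sketch of such a data object, assume an employee type whose name and salary fields are invented for illustration; only key and related_key come from the specification. The hypothetical helper append_employee writes one fixed-size record at the end of an open binary file and reports the byte offset at which it was written. That offset is the kind of value an index would store.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical base_object: only key and related_key are required by
   the specification; name and salary are made-up example fields. */
struct employee {
    unsigned int key;          /* unique identifier of this employee */
    unsigned int related_key;  /* key of the related department */
    char name[32];
    double salary;
};

/* Appends one fixed-size record to an open binary file and stores the
   byte offset of the new record in *offset.
   Returns 0 on success, 1 on failure. */
int append_employee(FILE *fp, const struct employee *e, long *offset)
{
    assert(fp != NULL && e != NULL && offset != NULL);
    if (fseek(fp, 0L, SEEK_END) != 0)
        return 1;
    long pos = ftell(fp);
    if (pos < 0)
        return 1;
    if (fwrite(e, sizeof *e, 1, fp) != 1)
        return 1;
    *offset = pos;
    return 0;
}
```

Because every record has the same size, the n-th record (counting from zero) starts at byte n * sizeof(struct employee).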
This structure of the file with the data is pictorially shown below:
[Figure: layout of file_1.dat as a sequence of fixed-size records, each holding key (unsigned int), related_key (unsigned int), field_3, field_4, field_5, field_6]
file_2.dat
This data file contains data corresponding to the second data object type with which
every object of the first data object type is related. Each data object in this file contains
any number of data fields with the restriction that the first field should always be the key.
The general structure is as shown below:
unsigned int key;
some_data_type field_2;
some_data_type field_3;
some_data_type field_4;
some_data_type field_5;
some_data_type field_6;
In this structure, key represents a unique identifier of the data object. It is the value
that the related_key field of a data object in file_1.dat refers to.
This structure of the file with the data is pictorially shown below:
[Figure: layout of file_2.dat as a sequence of fixed-size records, each holding key (unsigned int), field_2, field_3, field_4, field_5, field_6]
Index files
Index files are binary files that contain a set of data objects where each data object
contains information regarding the position of data objects inside the data file based on
the key value. The index file name is obtained by concatenating .ndx extension to the
data_object_name. For example, if data object name is employee, then the data file
name will be employee.dat and the index file name will be employee.ndx.
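The naming rule can be sketched with a small helper. The function name make_file_name and its parameters are hypothetical and not part of the PDS interface; it only concatenates the data object name and an extension into a caller-supplied buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not defined by the specification): writes
   "<object_name><ext>" into out.
   Returns 0 on success, 1 if the buffer is too small. */
int make_file_name(const char *object_name, const char *ext,
                   char *out, size_t out_size)
{
    assert(object_name != NULL && ext != NULL && out != NULL);
    int n = snprintf(out, out_size, "%s%s", object_name, ext);
    return (n < 0 || (size_t)n >= out_size) ? 1 : 0;
}
```

For the data object name employee, calling the helper with ".dat" yields employee.dat and with ".ndx" yields employee.ndx.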
The structure of the index file is fixed and contains the following information:
/* Array of positions indicating free positions, implemented as a queue */
int free_list[100];
/* Index information */
unsigned int key;      /* primary key of table */
unsigned int offset;   /* offset of data object in a file */
unsigned char flag;    /* to mark deleted data object */
The above structure is pictorially shown below:
The PDS project contains two index files corresponding to the two data files.
CREATE OPERATIONS
int loadDataStore ( char *base_object_name, char *related_object_name )
Return 0 for success, 1 for failure
Load the index files corresponding to the two data object names into memory. For each
index file, the free list array is loaded into a queue data structure and the actual index
information is loaded into a binary search tree or a hash map.
Algorithm
Repeat the following steps for each data object index file:
Step 0: Use a global data structure to hold the index structure (BST or Hash Table)
for the two data object index files. Example:
Step 1: Load the free list array data (first 400 bytes of the corresponding index
file) into an integer array of size 100 in the global data structure. (Hint: use fread
to read the entire array in one call.)
Step 2: Load the index entries stored after the free list array one by one
and add each entry to the Binary Search Tree or Hash Table.
Step 3: If the operation is not successful, return 1; else return 0.
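The steps above can be sketched for a single index file as follows. This is an illustration under stated assumptions, not the required implementation: the names index_entry, index_store, bst_insert and load_index are hypothetical, index entries are assumed to be stored on disk as whole structs, and int is assumed to be 4 bytes so that the free list occupies the first 400 bytes. A binary search tree is used here; a hash table would serve equally well.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define FREE_LIST_SIZE 100

/* Assumed on-disk index entry, following the fields listed in the spec. */
struct index_entry {
    unsigned int key;
    unsigned int offset;
    unsigned char flag;
};

struct bst_node {
    struct index_entry entry;
    struct bst_node *left, *right;
};

/* In-memory image of one index file (hypothetical layout). */
struct index_store {
    int free_list[FREE_LIST_SIZE];
    struct bst_node *root;
};

/* Inserts an entry into the BST ordered by key.
   Returns 0 on success, 1 on duplicate key or allocation failure. */
static int bst_insert(struct bst_node **root, const struct index_entry *e)
{
    while (*root != NULL) {
        if (e->key == (*root)->entry.key)
            return 1;
        root = (e->key < (*root)->entry.key) ? &(*root)->left
                                             : &(*root)->right;
    }
    *root = malloc(sizeof **root);
    if (*root == NULL)
        return 1;
    (*root)->entry = *e;
    (*root)->left = (*root)->right = NULL;
    return 0;
}

/* Step 1: read the free list in one fread call.
   Step 2: read the entries that follow and add each to the BST.
   Step 3: return 0 on success, 1 on failure. */
int load_index(FILE *fp, struct index_store *store)
{
    assert(fp != NULL && store != NULL);
    store->root = NULL;
    if (fread(store->free_list, sizeof(int), FREE_LIST_SIZE, fp)
            != FREE_LIST_SIZE)
        return 1;
    struct index_entry e;
    while (fread(&e, sizeof e, 1, fp) == 1) {
        if (bst_insert(&store->root, &e) != 0)
            return 1;
    }
    return ferror(fp) ? 1 : 0;
}
```

A loadDataStore implementation along these lines would open the two .ndx files and call load_index once per file, keeping one index_store for each data object type.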
Algorithm
/* To do */
Step 1: If data_object_name = child file name
Then
go to Step 2
Else
go to Step 5.
UPDATE OPERATIONS
Modifies an existing data object accessed using the index file based on the given key.
This method essentially seeks to that position in the file using the offset defined in the
index file, and overwrites the whole record with the passed values.
Prototype
Constraints
Key value cannot be changed.
Validate that records do not exceed the fixed record length.
Algorithm
/* To do */
Step 1: Find the offset of the key in the BST data structure or Hash Table.
Step 2: Update the record with updateDataObject, using the corresponding key
value, in the data file for the given data_object_name. (The user must take care of
the fixed length of the record.)
Step 3: If the operation is not successful, return 1; else return 0.
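The overwrite in Step 2 can be sketched as below, assuming the offset has already been looked up in the index (Step 1). The function name overwrite_record and its parameters are hypothetical. It enforces the two constraints: the record must have exactly the fixed length, and the key stored at that offset must match the key in the new record, which relies on the key being the first field.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Overwrites the fixed-size record stored at `offset` in an open data
   file. Hypothetical helper; returns 0 on success, 1 on failure. */
int overwrite_record(FILE *fp, long offset, const void *record,
                     size_t record_size, size_t fixed_size)
{
    assert(fp != NULL && record != NULL);

    /* Constraint: the record must match the fixed record length. */
    if (record_size != fixed_size || fixed_size < sizeof(unsigned int))
        return 1;

    /* Constraint: the key (first field of every record) cannot change. */
    unsigned int old_key, new_key;
    memcpy(&new_key, record, sizeof new_key);
    if (fseek(fp, offset, SEEK_SET) != 0)
        return 1;
    if (fread(&old_key, sizeof old_key, 1, fp) != 1)
        return 1;
    if (old_key != new_key)
        return 1;

    /* Seek back and overwrite the whole record in place. */
    if (fseek(fp, offset, SEEK_SET) != 0)
        return 1;
    if (fwrite(record, fixed_size, 1, fp) != 1)
        return 1;
    return fflush(fp) == 0 ? 0 : 1;
}
```

The data file must be opened in a read/write binary mode such as "rb+" for this to work, since the function both reads the old key and writes the new record.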
RETRIEVE
Fetches a set of data objects based on the search criteria given.
Care should be taken if a data object retrieved is marked for deletion in the index table.
If so, the data object should not be accessed from the file.
Prototype
Step 4: Seek to the offset value in the data file and read the record from the child
table data file.
Step 5: Take the value of the foreign key of this record as a key for the child index file
and fetch the offset from the index file.
Step 6: Seek to the offset value in the child table data file and read all the matching
records.
Step 7: Append these records to the record fetched in Step 4 and store them in the
pointer **result.
Step 8: Return NULL if the query is invalid or if no records are found.
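The lookup chain in these steps (index entry, seek, read, then follow the foreign key to the second file) can be sketched as follows. Everything here is an illustrative assumption: the record layouts base_rec and related_rec, the flag value FLAG_DELETED, and the helper names are hypothetical, a linear search over an array stands in for the BST or hash table lookup, and the function returns 0 or 1 instead of the linked list pointer the constraints call for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define FLAG_DELETED 1  /* assumption: the spec does not fix flag values */

struct index_entry { unsigned int key; unsigned int offset; unsigned char flag; };
struct base_rec    { unsigned int key; unsigned int related_key; char name[16]; };
struct related_rec { unsigned int key; char title[16]; };

/* Linear search standing in for the BST or hash table lookup. */
static const struct index_entry *find_entry(const struct index_entry *idx,
                                            size_t n, unsigned int key)
{
    for (size_t i = 0; i < n; i++)
        if (idx[i].key == key)
            return &idx[i];
    return NULL;
}

/* Reads one record unless its index entry is missing or marked deleted. */
static int read_live(FILE *fp, const struct index_entry *e,
                     void *out, size_t size)
{
    if (e == NULL || e->flag == FLAG_DELETED)
        return 1;
    if (fseek(fp, (long)e->offset, SEEK_SET) != 0)
        return 1;
    return fread(out, size, 1, fp) == 1 ? 0 : 1;
}

/* Fetches the base record for `key`, then the related record that its
   related_key points to. Returns 0 on success, 1 otherwise. */
int fetch_with_related(FILE *base_fp, const struct index_entry *base_idx, size_t nb,
                       FILE *rel_fp, const struct index_entry *rel_idx, size_t nr,
                       unsigned int key,
                       struct base_rec *b, struct related_rec *r)
{
    assert(base_fp != NULL && rel_fp != NULL && b != NULL && r != NULL);
    if (read_live(base_fp, find_entry(base_idx, nb, key), b, sizeof *b) != 0)
        return 1;
    return read_live(rel_fp, find_entry(rel_idx, nr, b->related_key),
                     r, sizeof *r);
}
```

Checking the flag before seeking means a data object marked for deletion is never read from the data file.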
Constraints
Retrieve all keys
Conditional retrieve is available only on primary key or foreign key.
If a data object is found, then the function should return a pointer to a linked list structure
(struct *).
If an invalid query is passed or no match is found, return NULL.
Common naming conventions should be followed for the field names and table
names based on the domain specified.
/* How to differentiate between a system-related error (e.g. file cannot be
opened) and a normal operation failure (e.g. key not found)? */
DELETE
Marks a data object for deletion in the index file. The data object need not be removed
from the data file immediately.
Prototype
BST or Hash Table and also enqueue that particular key in the free list
array.
Else
Go to Step 3
Else
Call deleteCascadeByKey (char *baseObjectName, char
*relatedObjectName, int baseKey)
Step 3: If the operation is not successful, return 1; else return 0.
Step 4: Only keys of valid (not deleted) records must be fetched while creating the hash
map.
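Marking a data object as deleted and recording its slot in the free list can be sketched as below. The names free_queue, enqueue_free, dequeue_free and mark_deleted are hypothetical, as is the flag value FLAG_DELETED. The sketch assumes the queue stores the freed file offset so that the position can be reused later, and it keeps head and count in memory alongside the 100-element array.

```c
#include <assert.h>
#include <stddef.h>

#define FREE_LIST_SIZE 100
#define FLAG_DELETED 1  /* assumption: the spec does not fix flag values */

struct index_entry { unsigned int key; unsigned int offset; unsigned char flag; };

/* Circular queue over the 100-element free list (hypothetical layout). */
struct free_queue {
    int slots[FREE_LIST_SIZE];
    int head;   /* index of the oldest element */
    int count;  /* number of elements currently queued */
};

/* Adds a freed position at the tail. Returns 1 if the queue is full. */
int enqueue_free(struct free_queue *q, int position)
{
    if (q->count == FREE_LIST_SIZE)
        return 1;
    q->slots[(q->head + q->count) % FREE_LIST_SIZE] = position;
    q->count++;
    return 0;
}

/* Removes the oldest freed position. Returns 1 if the queue is empty. */
int dequeue_free(struct free_queue *q, int *position)
{
    if (q->count == 0)
        return 1;
    *position = q->slots[q->head];
    q->head = (q->head + 1) % FREE_LIST_SIZE;
    q->count--;
    return 0;
}

/* Marks an index entry as deleted and queues its offset for reuse.
   The data file itself is left untouched.
   Returns 0 on success, 1 if already deleted or the queue is full. */
int mark_deleted(struct index_entry *e, struct free_queue *q)
{
    assert(e != NULL && q != NULL);
    if (e->flag == FLAG_DELETED)
        return 1;
    if (enqueue_free(q, (int)e->offset) != 0)
        return 1;
    e->flag = FLAG_DELETED;
    return 0;
}
```

An insert operation could call dequeue_free first and reuse the returned position before appending to the end of the data file.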