
OLAP and Data Warehousing

Report 1
An Integration Services project was created, consisting of 7 sequence containers:
1. Clean the tables - removes all tables that were created in the previous
cycle, to make sure the required tables can be created again. The tables are removed only
if they exist.

Image 1. Clean tables container.
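
The idea behind the first container can be sketched in a few lines. This is only an illustration, not the package itself: it uses an in-memory SQLite database as a stand-in for the project database, and the table names are hypothetical because the report does not list them.

```python
import sqlite3

# Hypothetical table names; the report does not list the actual ones.
TABLES = ["grades", "students", "teachers", "courses"]


def clean_tables(conn, tables=TABLES):
    """Drop each table only if it exists, so the step is safe on a first run."""
    for name in tables:
        # Table names cannot be bound as parameters, so they come from a fixed list.
        conn.execute(f'DROP TABLE IF EXISTS "{name}"')
    conn.commit()


def list_tables(conn):
    """Return the names of all tables currently in the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grades (student_id INTEGER, grade REAL)")
clean_tables(conn)  # drops grades, skips the tables that are missing
clean_tables(conn)  # a second run changes nothing and raises no error
print(list_tables(conn))
```

Because the drop is conditional, the same step works both on an empty database and after a previous cycle.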

2. Load the tables - loads the tables from the CSV files. This container creates each
required table and inserts all data from the CSV. The data is not cleaned in this step;
after this process the tables contain all the original data from the files.

Image 2. Load data container.
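
A minimal sketch of such a load step is shown below, assuming an in-memory SQLite database instead of the project database. The helper `load_csv`, the sample columns and the sample rows are hypothetical; every column is stored as text so that the raw values stay untouched until the cleaning step.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, csv_file):
    """Create `table` from the CSV header and insert every row unchanged."""
    reader = csv.reader(csv_file)
    header = next(reader)
    # All columns are TEXT on purpose: no cleaning or type conversion happens here.
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    conn.commit()


# Hypothetical sample data standing in for one of the CSV files.
sample = io.StringIO("student_id,grade,exam\n1,4.5,E\n2,,\n")

conn = sqlite3.connect(":memory:")
load_csv(conn, "grades", sample)
rows = conn.execute("SELECT * FROM grades").fetchall()
print(rows)
```

The second sample row keeps its empty fields, which matches the description: the table holds the original data, including values that still need fixing.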


3. Clean data - to make sure that the data makes sense. The incorrect grades were removed
(those that are more than 2). Empty values in the exam field were fixed to contain the
value E or CW, depending on whether the exam was taken. The teacher genders were
standardized to match the student genders, and null fields in the teacher table were
replaced with the “?” character.

Image 3. Clean data container.
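
Two of these fixes, the gender standardization and the replacement of nulls, can be sketched as plain UPDATE statements. The sketch assumes an in-memory SQLite database, a hypothetical `teachers` table, and a hypothetical mapping of teacher gender codes to the student codes K and M; the original encoding of the teacher genders is not given in the report.

```python
import sqlite3

# Hypothetical mapping from teacher gender codes to the student codes (K, M).
GENDER_MAP = {"F": "K", "female": "K", "M": "M", "male": "M"}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teachers (teacher_id INTEGER, gender TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO teachers VALUES (?, ?, ?)",
    [(1, "F", "dr"), (2, "male", None), (3, None, None)],
)

# Standardize the gender codes so they match the ones used for students.
for source, target in GENDER_MAP.items():
    conn.execute("UPDATE teachers SET gender = ? WHERE gender = ?", (target, source))

# Replace the remaining nulls with the "?" placeholder.
for column in ("gender", "title"):
    conn.execute(
        f'UPDATE teachers SET "{column}" = ? WHERE "{column}" IS NULL', ("?",)
    )
conn.commit()

teachers = conn.execute("SELECT * FROM teachers ORDER BY teacher_id").fetchall()
print(teachers)
```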

4. Data consistency - the consistency between PK and FK was fixed. Teachers without
a title were joined to a new column with null values.
The teacher table was also fixed to make sure it can be joined with grades, and the same
was done for the course table.

Image 4. Data consistency container.


5. Create a new attribute container - the new attributes were created based on
attributes from the grades table.

Image 5. Create new attributes container.

6. Create workload tables - to add a new dimension that describes the per-semester
workload of teachers and students. The tables contain semester, grades, and
student_id or teacher_id columns.

Image 6. Create workload tables container.
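
One possible reading of the workload tables is a per-semester count of grades for each student and each teacher. The sketch below follows that reading; it is an assumption, as are the SQLite stand-in database, the table names `student_workload` and `teacher_workload`, and the sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE grades (student_id INTEGER, teacher_id INTEGER, "
    "semester INTEGER, grade REAL)"
)
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO grades VALUES (?, ?, ?, ?)",
    [(1, 10, 1, 4.0), (1, 10, 1, 5.0), (1, 11, 2, 3.0), (2, 10, 1, 4.5)],
)

# Assumption: the "grades" column holds the number of grades in a semester.
for key in ("student_id", "teacher_id"):
    table = key.replace("_id", "_workload")
    conn.execute(
        f"CREATE TABLE {table} AS "
        f"SELECT {key}, semester, COUNT(*) AS grades "
        f"FROM grades GROUP BY {key}, semester"
    )
conn.commit()

student_workload = conn.execute(
    "SELECT * FROM student_workload ORDER BY student_id, semester"
).fetchall()
teacher_workload = conn.execute(
    "SELECT * FROM teacher_workload ORDER BY teacher_id, semester"
).fetchall()
print(student_workload)
print(teacher_workload)
```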


7. Restructure the grade table - new tables were created based on the grades table,
and the grades table and the temporary tables were removed from the DB. To make this work,
a temporary table was first created with an additional id column, and then the data was
split into grades description and grades facts.

Image 7. Restructure the grade table container.
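
The restructuring can be sketched as follows, again on an in-memory SQLite database. Which columns belong to the description and which to the facts is not stated in the report, so the split used here (exam type as description, keys and grade value as facts) and all table and column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE grades (student_id INTEGER, teacher_id INTEGER, "
    "exam TEXT, grade REAL)"
)
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO grades VALUES (?, ?, ?, ?)",
    [(1, 10, "E", 4.0), (2, 10, "CW", 3.5)],
)

# Temporary copy with an additional id column that links both new tables.
conn.execute("CREATE TABLE grades_tmp AS SELECT rowid AS grade_id, * FROM grades")

# Assumed split: descriptive attributes on one side, keys and measures on the other.
conn.execute(
    "CREATE TABLE grade_description AS SELECT grade_id, exam FROM grades_tmp"
)
conn.execute(
    "CREATE TABLE grade_facts AS "
    "SELECT grade_id, student_id, teacher_id, grade FROM grades_tmp"
)

# The original and the temporary table are no longer needed.
conn.execute("DROP TABLE grades")
conn.execute("DROP TABLE grades_tmp")
conn.commit()

remaining = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
facts = conn.execute("SELECT * FROM grade_facts ORDER BY grade_id").fetchall()
print(remaining)
print(facts)
```

The shared id column is what allows the description and the facts to be joined back together after the original table is gone.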

After that, the whole package ran successfully. The result is presented below in
Image 8.

The next task was to deploy a multidimensional cube. The following steps were taken:
1. Deployment
2. A new field (Grades Avg) was added and the average sum was hidden.
3. Attributes were added:
a. simple attributes
b. hierarchy attributes (specializations)

The cube was then deployed again. The deployed cube is presented in Image 9.
Image 8. The whole package ran successfully.

Image 9. Deployed cube.


After that it was possible to analyze the deployed multidimensional data.
1. The average grade is around 4.1 and there are 65194 grades in total.

2. Female students (K) received 2056 grades and male students (M) 63138 grades. This is
around 3% and 97% of all grades respectively.

3. Female students received better grades than male students. The average is around 4.16
for women and 4.10 for men.

4. Only in AIR do male students get better average scores.


5. The grades are much better when they are given for exercises.

6. The average grade is growing each semester.

7. Students received the best grades in ISK (INF) and the worst in ARR (AIR).
8. The later the year, the better the grades.
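
The shares quoted in point 2 can be checked with simple arithmetic. The short sketch below uses only the counts given in the report; the variable names are illustrative.

```python
# Counts of grades taken from the report.
female_grades = 2056   # students with gender K
male_grades = 63138    # students with gender M

total = female_grades + male_grades
female_share = 100 * female_grades / total
male_share = 100 * male_grades / total

print(total)
print(round(female_share, 1), round(male_share, 1))
```

The two counts add up to the total of 65194 grades from point 1, and the shares round to about 3% and 97%.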
