Report Final
2019-05-11
Holly Stephens
Frank Piva
Introduction
In this project we explored database creation and design by modeling a drug store chain. Our goals
were to emulate the process of constructing a database from a list of business requirements, populate
the database, and interact with it. Along the way, we also hoped to gain insight into the potential
pitfalls of the database design process.
ER Model
While constructing our initial ER diagram, we quickly discovered that the list of business
requirements could be interpreted and codified in many ways. This demonstrates how important it is to
establish a continuous line of communication with the client. Because there was no client for us to
interact with, we had to make many assumptions about what the solution would look like.
Several integrity constraints described in the business requirements could not be illustrated in
our ER model. For example, the requirement that records in the sells table be deleted when their
associated parent record is deleted is enforced by an ON DELETE CASCADE constraint in the table
definition rather than in the diagram.
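The cascading delete described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the pharmacy table is assumed to be the parent of sells, the column types are guesses, and the sample row is hypothetical. Note that SQLite enforces foreign keys only when the foreign_keys pragma is switched on for the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite ignores foreign key constraints unless this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Assumed parent table; column names follow the report's dependencies.
CREATE TABLE pharmacy (
    name         TEXT PRIMARY KEY,
    address      TEXT,
    phone_number TEXT
);
-- Child rows disappear when the referenced pharmacy row is deleted.
CREATE TABLE sells (
    pharmacy   TEXT REFERENCES pharmacy(name) ON DELETE CASCADE,
    trade_name TEXT,
    price      REAL,
    PRIMARY KEY (pharmacy, trade_name)
);
-- Hypothetical sample rows.
INSERT INTO pharmacy VALUES ('Main St Pharmacy', '1 Main St', '555-0100');
INSERT INTO sells VALUES ('Main St Pharmacy', 'Lipitor', 12.50);
""")

sells_before = conn.execute("SELECT COUNT(*) FROM sells").fetchone()[0]
conn.execute("DELETE FROM pharmacy WHERE name = ?", ("Main St Pharmacy",))
sells_after = conn.execute("SELECT COUNT(*) FROM sells").fetchone()[0]
print(sells_before, sells_after)
```

Deleting the parent row removes the dependent sells row without a second DELETE statement, which is the behavior the requirement asks for.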
The aforementioned deletion constraint led us to consider other relationships that might behave
similarly but were not specified in our list of requirements. For instance, if a patient died or a
doctor left the company, should we update the has and prescribes tables? Another consideration was
whether a doctor should be allowed to be their own patient.
One aspect of the model that received much of our attention was the relationship between doctor,
patient, and drug. We grappled with whether to create a separate entity, prescription, or to store
the necessary attributes in a ternary relationship table, prescribes. We ended up deciding on the
latter.
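A minimal sketch of what a three-way relationship table like prescribes might look like, using Python's built-in sqlite3 module. The column names come from the functional dependency the report lists for this relationship; the column types and the sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE prescribes (
    doctor_ssn  TEXT,
    patient_ssn TEXT,
    trade_name  TEXT,
    issue_date  TEXT,
    quantity    INTEGER,
    -- One row per (doctor, patient, drug) combination.
    PRIMARY KEY (doctor_ssn, patient_ssn, trade_name)
)
""")

# Hypothetical values, not the report's sample data.
conn.execute("INSERT INTO prescribes VALUES (?, ?, ?, ?, ?)",
             ("111-11-1111", "222-22-2222", "Lipitor", "2019-05-01", 30))

# A second row with the same doctor, patient, and drug violates the key.
try:
    conn.execute("INSERT INTO prescribes VALUES (?, ?, ?, ?, ?)",
                 ("111-11-1111", "222-22-2222", "Lipitor", "2019-05-10", 60))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM prescribes").fetchone()[0]
print(duplicate_rejected, row_count)
```

One consequence of this design is that a doctor cannot have two simultaneous prescriptions of the same drug for the same patient; a separate prescription entity with its own identifier would lift that restriction.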
Relational Schema
We decided to use the TEXT type instead of VARCHAR so we wouldn’t have to decide what an adequate
number of characters would be. During our research phase, we also discovered that SQLite doesn’t
actually enforce the length restrictions placed on VARCHAR attributes. A programmer can declare an
attribute to be a VARCHAR(20) and SQLite will happily store whatever the user enters as a value. We
specified constraints on certain values that we knew we could regulate. For example, we decided that
date values would be 10 characters long: four digits for the year, one character for the first dash,
two digits for the month, one character for the second dash, and two digits for the day. In a similar
fashion, we went with 11 characters for social security numbers.
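Both points above can be checked with a short sketch using Python's built-in sqlite3 module. The table names are hypothetical, and expressing the length rules as CHECK constraints is an assumption about how such rules could be written, not a copy of the project's table definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The declared length of 5 is accepted by SQLite but not enforced.
CREATE TABLE varchar_demo (name VARCHAR(5));
-- Hypothetical table: TEXT columns with explicit length checks.
CREATE TABLE length_demo (
    ssn        TEXT CHECK (length(ssn) = 11),
    issue_date TEXT CHECK (length(issue_date) = 10)
);
""")

conn.execute("INSERT INTO varchar_demo VALUES (?)", ("much longer than five",))
stored_name = conn.execute("SELECT name FROM varchar_demo").fetchone()[0]

# A well-formed row passes both checks.
conn.execute("INSERT INTO length_demo VALUES (?, ?)",
             ("123-45-6789", "2019-05-11"))

# A 9-character SSN without dashes fails the length check.
try:
    conn.execute("INSERT INTO length_demo VALUES (?, ?)",
                 ("123456789", "2019-05-11"))
    short_ssn_rejected = False
except sqlite3.IntegrityError:
    short_ssn_rejected = True

print(stored_name, short_ssn_rejected)
```

A length check only constrains the number of characters; it does not verify that the dashes or digits are in the right positions.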
In all of our tables, every functional dependency has the table's primary key as its determinant;
this was by design. We took our time when we were creating the ER diagram so that we wouldn’t have
too much redundant information. Our functional dependencies are listed as follows:
● (company_name, pharmacy_name) → (end_date, legalese, start_date, supervisor)
● (doctor_ssn, patient_ssn, trade_name) → (issue_date, quantity)
● (name) → (address, phone_number)
● (name) → (phone_number)
● (pharmacy, trade_name) → (price)
● (ssn) → (address, age, name)
● (ssn) → (name, specialty, years_of_experience)
● (trade_name) → (formula)
During this database creation process, some thought-provoking questions arose: Are there any
patients who doctor-shop? What is the most commonly prescribed drug? Which doctor specialty
prescribes the most drugs? What specialty is most common among primary care physicians? Which drug
has the highest variation in price by pharmacy? These are all questions that would be of use to the
hypothetical company this database was created for, both for gauging business metrics and for
catching potential oversights in the initial requirements.
Below is our sample data, along with the aforementioned business questions translated to SQL,
and the output generated by each query.
● Are there any ethical anomalies in prescription writing, such as a patient obtaining the same
prescription from different doctors? We should note that because the patient_ssn, doctor_ssn,
and trade_name together uniquely identify a prescription, and duplicates are deleted, this query
will show us if a patient is obtaining the same prescription from two different doctors.
SELECT name FROM patient WHERE ssn IN (SELECT patient_ssn FROM prescribes
GROUP BY patient_ssn, trade_name HAVING count(trade_name) > 1);
name
-------------
Chris Cringle
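To show how the doctor-shopping query behaves, here is a small sketch that runs it through Python's built-in sqlite3 module. The rows are hypothetical and are not the report's sample data; the column types are assumptions, while the column names follow the report's functional dependencies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient (ssn TEXT PRIMARY KEY, name TEXT, address TEXT, age INTEGER);
CREATE TABLE prescribes (
    doctor_ssn TEXT, patient_ssn TEXT, trade_name TEXT,
    issue_date TEXT, quantity INTEGER,
    PRIMARY KEY (doctor_ssn, patient_ssn, trade_name)
);
-- Hypothetical rows: Pat One gets DrugA from two different doctors.
INSERT INTO patient VALUES ('100-00-0001', 'Pat One', '1 Elm St', 40);
INSERT INTO patient VALUES ('100-00-0002', 'Pat Two', '2 Elm St', 55);
INSERT INTO prescribes VALUES ('900-00-0001', '100-00-0001', 'DrugA', '2019-01-05', 30);
INSERT INTO prescribes VALUES ('900-00-0002', '100-00-0001', 'DrugA', '2019-01-09', 30);
INSERT INTO prescribes VALUES ('900-00-0001', '100-00-0002', 'DrugA', '2019-02-01', 30);
INSERT INTO prescribes VALUES ('900-00-0001', '100-00-0002', 'DrugB', '2019-02-01', 10);
""")

# Same query as in the report: group by patient and drug, keep groups
# that contain more than one prescription row.
query = """
SELECT name FROM patient WHERE ssn IN (
    SELECT patient_ssn FROM prescribes
    GROUP BY patient_ssn, trade_name
    HAVING count(trade_name) > 1
)
"""
flagged = [row[0] for row in conn.execute(query)]
print(flagged)
```

Pat Two holds two prescriptions, but for different drugs, so only Pat One is flagged. Because the primary key already prevents two rows with the same doctor, patient, and drug, any group with more than one row must involve different doctors.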
● What is the most commonly prescribed drug?
trade_name pcount
------------- ----------
Levothyroxine 2
● Which doctor specialty prescribes the most drugs?
specialty
---------------
Family Medicine
● What specialty is most common among primary care physicians?
SELECT specialty FROM doctor WHERE ssn IN (SELECT doctor_ssn FROM has
GROUP BY doctor_ssn ORDER BY count(*) DESC LIMIT 1);
specialty
---------------
Family Medicine
● Which drug has the highest variation in price by pharmacy?
trade_name
----------
Lipitor
Conclusion
In this project we explored the process of applying database systems design to a real-world
problem. This is a multistep process, with each step needing multiple revisions. Even with the
oversimplified business requirements we started with, we found potential pitfalls and oversights at
every turn, underscoring the many decisions a database architect is presented with and emphasizing
the importance of maintaining a continuous dialogue with the client during this process.
In writing our own queries after creating and populating our database, it became clear that there
are additional ways to capitalize on a project like this, such as setting a client up with tools (such
as views) for reporting on their data and tracking metrics. Despite the many steps and revisions in
this process, we concluded that, without any communication from the requesting client, what we have
created would still likely be considered a draft in practice.
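As one illustration of the reporting tools mentioned above, the sketch below defines a view over the prescribes table using Python's built-in sqlite3 module. The view name drug_prescription_counts is hypothetical, the rows are made-up sample data, and the column types are assumptions; it is not part of the project's deliverables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE prescribes (
    doctor_ssn TEXT, patient_ssn TEXT, trade_name TEXT,
    issue_date TEXT, quantity INTEGER,
    PRIMARY KEY (doctor_ssn, patient_ssn, trade_name)
);
-- Hypothetical rows.
INSERT INTO prescribes VALUES ('900-00-0001', '100-00-0001', 'DrugA', '2019-01-05', 30);
INSERT INTO prescribes VALUES ('900-00-0002', '100-00-0002', 'DrugA', '2019-01-09', 30);
INSERT INTO prescribes VALUES ('900-00-0001', '100-00-0002', 'DrugB', '2019-02-01', 10);

-- Hypothetical reporting view: number of prescriptions per drug.
CREATE VIEW drug_prescription_counts AS
    SELECT trade_name, count(*) AS pcount
    FROM prescribes
    GROUP BY trade_name;
""")

# The client can query the view like a table without knowing the grouping logic.
top_drug = conn.execute(
    "SELECT trade_name, pcount FROM drug_prescription_counts "
    "ORDER BY pcount DESC, trade_name LIMIT 1"
).fetchone()
print(top_drug)
```

Because a view is evaluated each time it is queried, the counts stay current as new prescriptions are recorded.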