
CST-338: Final Project 1

2019-05-11
Holly Stephens
Frank Piva

Introduction
During this project we explored the process of data creation and design by modeling a drug
store chain. Our goals were to emulate the process of constructing a database from a list of business
requirements, populate the database, and interact with it. Along the way, we also hoped to gain
insight into the potential pitfalls of the database design process.

ER Model
Upon initial construction of our ER diagram, we quickly discovered that there were many possible
ways to interpret and codify the list of business requirements. This demonstrates how important it is
to establish a continuous line of communication with the client. Because there was no client for us to
interact with, we had to make many assumptions about what the solution would look
like.

There were several integrity constraints described in the business requirements that could not be
illustrated in our ER model. One example is the requirement that records in the sells table be
deleted when their associated parent record is deleted (enforced by an ON DELETE CASCADE constraint in the
table definition).
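
The cascade requirement can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the project's actual setup: the two tables are trimmed copies of the schema in this report, and the rows are a small subset of the sample data. One assumption worth stating is that SQLite leaves foreign key enforcement off by default, so the sketch turns it on with PRAGMA foreign_keys before relying on the cascade.

```python
import sqlite3

# In-memory database; tables are trimmed copies of the report's schema.
conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this pragma is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE pharmacy(
    name TEXT NOT NULL,
    PRIMARY KEY (name)
);
CREATE TABLE sells(
    pharmacy_name TEXT NOT NULL,
    price REAL NOT NULL,
    trade_name TEXT NOT NULL,
    FOREIGN KEY (pharmacy_name) REFERENCES pharmacy(name) ON DELETE CASCADE,
    PRIMARY KEY (pharmacy_name, trade_name)
);
INSERT INTO pharmacy VALUES('Walmart Pharmacy');
INSERT INTO sells VALUES('Walmart Pharmacy', 4.00, 'Lisinopril');
INSERT INTO sells VALUES('Walmart Pharmacy', 9.00, 'Lipitor');
""")

rows_before = conn.execute("SELECT count(*) FROM sells").fetchone()[0]
# Deleting the parent row should remove its child rows in sells.
conn.execute("DELETE FROM pharmacy WHERE name = 'Walmart Pharmacy'")
rows_after = conn.execute("SELECT count(*) FROM sells").fetchone()[0]

print(rows_before, rows_after)
```

Without the pragma, the same DELETE would leave the two sells rows in place, which is an easy thing to miss when testing cascade behavior in SQLite.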

From the aforementioned deletion constraint, we began to consider other relationships that might
behave similarly but were not specified in our list of requirements. For instance, if a patient died or a
doctor left the company, should we update the has and prescribes tables? Another consideration was
whether a doctor should be allowed to be their own patient.
One aspect of constructing our model that received much of our attention was the
relationship between doctor, patient, and drug. We grappled with whether we should create a
separate entity, prescription, or store the necessary attributes in a ternary relationship
table, prescribes. We ended up deciding on the latter of these two options.

Relational Schema
We decided to use the TEXT type instead of VARCHAR so we wouldn’t have to worry about specifying
what we thought an adequate number of characters would be. During our research phase, we also
discovered that SQLite doesn’t actually enforce the restrictions placed on VARCHAR attributes. A
programmer can declare an attribute to be a VARCHAR(20) and SQLite will happily store whatever the
user enters as a value. We still declared lengths on values with a known format, although in SQLite
CHAR(n) lengths are likewise not enforced, so these declarations document intent rather than regulate input. For
example, we decided that date values would be 10 characters long: four digits for the year, one character for the
first dash, two digits for the month, one character for the second dash, and two digits for the day. In a similar
fashion we went with 11 characters for social security numbers.
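
The behavior described above can be checked with a short sketch using Python's sqlite3 module. The table names loose_text and checked_date are hypothetical and exist only for this illustration. The second table shows one possible way to actually enforce a length in SQLite, a CHECK constraint, which is not something the project schema uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Declared length is accepted by SQLite but not enforced.
CREATE TABLE loose_text(label VARCHAR(5));
-- Hypothetical alternative: a CHECK constraint that enforces the length.
CREATE TABLE checked_date(
    issue_date TEXT NOT NULL CHECK (length(issue_date) = 10)
);
""")

# A 31-character value goes into a VARCHAR(5) column unchanged.
conn.execute("INSERT INTO loose_text VALUES('far longer than five characters')")
stored = conn.execute("SELECT label FROM loose_text").fetchone()[0]

# A 9-character date violates the CHECK constraint and is rejected.
try:
    conn.execute("INSERT INTO checked_date VALUES('2019-5-11')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# A 10-character date is accepted.
conn.execute("INSERT INTO checked_date VALUES('2019-05-11')")

print(stored, rejected)
```

A length check alone does not validate that the value is a real date; it only guards against the most obvious formatting slips.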
In all of our tables, every non-key attribute is functionally dependent on the primary key; this was by
design. We took our time when we were creating the ER diagram so that we wouldn’t have too much
redundant information. Our functional dependencies are listed as follows:
● (company_name, pharmacy_name) → (end_date, legalese, start_date, supervisor)
● (doctor_ssn, patient_ssn, trade_name) → (issue_date, quantity)
● (name) → (address, phone_number)
● (name) → (phone_number)
● (pharmacy_name, trade_name) → (price)
● (ssn) → (address, age, name)
● (ssn) → (name, specialty, years_of_experience)
● (trade_name) → (formula)

CREATE TABLE contract(
    company_name TEXT NOT NULL,
    pharmacy_name TEXT NOT NULL,
    end_date CHAR(10) NOT NULL,
    start_date CHAR(10) NOT NULL,
    supervisor TEXT NOT NULL,
    legalese TEXT NOT NULL,
    FOREIGN KEY (company_name) REFERENCES company(name) ON DELETE CASCADE,
    FOREIGN KEY (pharmacy_name) REFERENCES pharmacy(name) ON DELETE CASCADE,
    PRIMARY KEY (company_name, pharmacy_name)
);

CREATE TABLE company(
    name TEXT NOT NULL,
    phone_number CHAR(12) NOT NULL,
    PRIMARY KEY (name)
);

CREATE TABLE doctor(
    ssn CHAR(11) NOT NULL,
    name TEXT NOT NULL,
    specialty TEXT NOT NULL,
    years_of_experience INTEGER NOT NULL,
    PRIMARY KEY (ssn)
);

CREATE TABLE drug(
    trade_name TEXT NOT NULL,
    formula TEXT NOT NULL,
    PRIMARY KEY (trade_name)
);

CREATE TABLE has(
    doctor_ssn CHAR(11) NOT NULL,
    patient_ssn CHAR(11) NOT NULL,
    FOREIGN KEY (doctor_ssn) REFERENCES doctor(ssn) ON DELETE CASCADE,
    FOREIGN KEY (patient_ssn) REFERENCES patient(ssn) ON DELETE CASCADE,
    PRIMARY KEY (doctor_ssn, patient_ssn)
);

CREATE TABLE patient(
    ssn CHAR(11) NOT NULL,
    name TEXT NOT NULL,
    address TEXT NOT NULL,
    age INTEGER NOT NULL,
    PRIMARY KEY (ssn)
);

CREATE TABLE pharmacy(
    name TEXT NOT NULL,
    address TEXT NOT NULL,
    phone_number CHAR(12) NOT NULL,
    PRIMARY KEY (name)
);

CREATE TABLE prescribes(
    doctor_ssn CHAR(11) NOT NULL,
    issue_date CHAR(10) NOT NULL,
    patient_ssn CHAR(11) NOT NULL,
    quantity INTEGER NOT NULL,
    trade_name TEXT NOT NULL,
    FOREIGN KEY (doctor_ssn) REFERENCES doctor(ssn) ON DELETE CASCADE,
    FOREIGN KEY (patient_ssn) REFERENCES patient(ssn) ON DELETE CASCADE,
    FOREIGN KEY (trade_name) REFERENCES drug(trade_name) ON DELETE CASCADE,
    PRIMARY KEY (doctor_ssn, patient_ssn, trade_name)
);

CREATE TABLE sells(
    pharmacy_name TEXT NOT NULL,
    price REAL NOT NULL,
    trade_name TEXT NOT NULL,
    FOREIGN KEY (pharmacy_name) REFERENCES pharmacy(name) ON DELETE CASCADE,
    FOREIGN KEY (trade_name) REFERENCES drug(trade_name) ON DELETE CASCADE,
    PRIMARY KEY (pharmacy_name, trade_name)
);

CREATE TRIGGER update_prescription BEFORE INSERT ON prescribes BEGIN
    DELETE FROM prescribes
    WHERE
        doctor_ssn = NEW.doctor_ssn AND
        patient_ssn = NEW.patient_ssn AND
        trade_name = NEW.trade_name;
END;

Normalized Relational Schema


We mostly avoided having to normalize our relational schema. There was some concern about
the supervisor attribute on a contract, but this didn’t present any problems since no other information
about the supervisor needed to be stored.
In the prescribes table, the primary key is a composite of foreign keys referencing the
doctor, patient, and drug tables. After some discussion we determined that no partial
dependencies were present, because issue_date and quantity depend on the whole key rather than on
any single part of it. Using composite keys let us avoid further normalization.
A place for potential redundancy in this table, however, was in the stipulation that a doctor may
prescribe a patient the same drug more than once. Since only the latest prescription needed to be stored
in the database, we worked around this by creating a trigger. The trigger checks, before an insert on the
prescribes table, if there is already a record that matches the primary key. If a match is found, it is deleted
from the table, allowing the new entry with the same primary key to be written.
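
The trigger's effect can be reproduced in a few lines with Python's sqlite3 module. This is a sketch under simplifying assumptions: the prescribes table is copied from the schema above with its foreign key clauses omitted so that the example runs on its own, and the three rows are the Sertraline prescriptions from the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- prescribes as defined in the report, minus the foreign keys (assumption
-- made only so this sketch needs no parent tables).
CREATE TABLE prescribes(
    doctor_ssn CHAR(11) NOT NULL,
    issue_date CHAR(10) NOT NULL,
    patient_ssn CHAR(11) NOT NULL,
    quantity INTEGER NOT NULL,
    trade_name TEXT NOT NULL,
    PRIMARY KEY (doctor_ssn, patient_ssn, trade_name)
);
CREATE TRIGGER update_prescription BEFORE INSERT ON prescribes BEGIN
    DELETE FROM prescribes
    WHERE
        doctor_ssn = NEW.doctor_ssn AND
        patient_ssn = NEW.patient_ssn AND
        trade_name = NEW.trade_name;
END;
-- Same doctor, patient, and drug three times; only the dates differ.
INSERT INTO prescribes VALUES('000-04-3333','03-01-2019','111-10-3333', 30, 'Sertraline');
INSERT INTO prescribes VALUES('000-04-3333','04-01-2019','111-10-3333', 30, 'Sertraline');
INSERT INTO prescribes VALUES('000-04-3333','05-01-2019','111-10-3333', 30, 'Sertraline');
""")

# Each insert first removes the earlier row with the same key, so one row remains.
rows = conn.execute("SELECT issue_date FROM prescribes").fetchall()
print(rows)
```

Note that the trigger keeps the most recently inserted row, not necessarily the row with the most recent issue_date, so the two only coincide when prescriptions are entered in date order.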

Sample Data and SQL Queries


Once we initialized our schema, we began to populate our database. We researched local
pharmacies and drugs online to create realistic data that would generate interesting results. Wanting to
keep our model reasonably sized, we ended up deciding to use four drugstore companies that are well
known both nationwide and in the Bay Area. We quickly realized that all of these
companies are chains, meaning they have multiple pharmacies, often in the same city. We did our best to
reflect this in our small-scale rendition of this system. Additionally, we replicated factual information about
the most frequently prescribed drugs in our database. Individual doctor and patient information was
fabricated.

During this data creation process, some thought-provoking questions arose: Are there any
patients who doctor-shop? What is the most commonly prescribed drug? Which doctor specialty
prescribes the most drugs? What specialty is most common among primary care physicians? Which drug
has the highest variation in price by pharmacy? These are all questions that would be of use to the
hypothetical company this database was created for, both for gauging business metrics and for catching
potential oversights in the initial requirements.
Below is our sample data, along with the aforementioned business questions translated to SQL,
and the output generated by each query.

INSERT INTO company VALUES('CVS', '800-746-7287');
INSERT INTO company VALUES('Walgreens', '800-925-4733');
INSERT INTO company VALUES('Walmart', '800-925-6278');
INSERT INTO company VALUES('Rite Aid', '800-748-3243');
INSERT INTO pharmacy VALUES('CVS Pharmacy Marina', '268 Reservation Rd, Marina, CA 93933', '831-384-1605');
INSERT INTO pharmacy VALUES('CVS Pharmacy Watsonville', '1966 Main St, Watsonville, CA 95076', '831-722-1782');
INSERT INTO pharmacy VALUES('Walgreens Pharmacy', '1810 Freedom Blvd, Freedom, CA 95019', '831-768-0183');
INSERT INTO pharmacy VALUES('Walmart Pharmacy', '150 Beach Rd, Marina, CA 93933', '831-883-9920');
INSERT INTO pharmacy VALUES('Rite Aid Pharmacy', '901 Soquel Ave, Santa Cruz, CA 95062', '831-426-4303');
INSERT INTO drug VALUES('Lisinopril', 'C21H31N3O5');
INSERT INTO drug VALUES('Levothyroxine', 'C15H11O4I4N');
INSERT INTO drug VALUES('Azithromycin', 'C38H72N2O2H2O');
INSERT INTO drug VALUES('Metformin', 'C4H11N5');
INSERT INTO drug VALUES('Lipitor', 'C33H34FN2O52Ca');
INSERT INTO drug VALUES('Amlodipine', 'C20H25CIN2O');
INSERT INTO drug VALUES('Sertraline', 'C17H17Cl2N');
INSERT INTO sells VALUES('CVS Pharmacy Marina', 14.00, 'Lisinopril');
INSERT INTO sells VALUES('CVS Pharmacy Watsonville', 14.00, 'Lisinopril');
INSERT INTO sells VALUES('Rite Aid Pharmacy', 35.00, 'Lisinopril');
INSERT INTO sells VALUES('Walmart Pharmacy', 4.00, 'Lisinopril');
INSERT INTO sells VALUES('Walgreens Pharmacy', 19.00, 'Lisinopril');
INSERT INTO sells VALUES('CVS Pharmacy Marina', 33.00, 'Levothyroxine');
INSERT INTO sells VALUES('CVS Pharmacy Watsonville', 33.00, 'Levothyroxine');
INSERT INTO sells VALUES('Rite Aid Pharmacy', 58.00, 'Levothyroxine');
INSERT INTO sells VALUES('Walmart Pharmacy', 10.00, 'Levothyroxine');
INSERT INTO sells VALUES('Walgreens Pharmacy', 18.84, 'Levothyroxine');
INSERT INTO sells VALUES('CVS Pharmacy Marina', 40.00, 'Azithromycin');
INSERT INTO sells VALUES('Rite Aid Pharmacy', 45.00, 'Azithromycin');
INSERT INTO sells VALUES('Walmart Pharmacy', 33.00, 'Azithromycin');
INSERT INTO sells VALUES('Walgreens Pharmacy', 40.00, 'Azithromycin');
INSERT INTO sells VALUES('CVS Pharmacy Watsonville', 20.00, 'Metformin');
INSERT INTO sells VALUES('Rite Aid Pharmacy', 43.00, 'Metformin');
INSERT INTO sells VALUES('Walmart Pharmacy', 4.00, 'Metformin');
INSERT INTO sells VALUES('Walgreens Pharmacy', 23.00, 'Metformin');
INSERT INTO sells VALUES('CVS Pharmacy Watsonville', 114.00, 'Lipitor');
INSERT INTO sells VALUES('Rite Aid Pharmacy', 140.00, 'Lipitor');
INSERT INTO sells VALUES('Walmart Pharmacy', 9.00, 'Lipitor');
INSERT INTO sells VALUES('Walgreens Pharmacy', 226.00, 'Lipitor');
INSERT INTO sells VALUES('CVS Pharmacy Marina', 52.00, 'Amlodipine');
INSERT INTO sells VALUES('CVS Pharmacy Watsonville', 52.00, 'Amlodipine');
INSERT INTO sells VALUES('Rite Aid Pharmacy', 71.00, 'Amlodipine');
INSERT INTO sells VALUES('Walmart Pharmacy', 4.00, 'Amlodipine');
INSERT INTO sells VALUES('Walgreens Pharmacy', 58.00, 'Amlodipine');
INSERT INTO sells VALUES('CVS Pharmacy Marina', 46.00, 'Sertraline');
INSERT INTO sells VALUES('CVS Pharmacy Watsonville', 46.00, 'Sertraline');
INSERT INTO sells VALUES('Rite Aid Pharmacy', 85.00, 'Sertraline');
INSERT INTO sells VALUES('Walmart Pharmacy', 9.00, 'Sertraline');
INSERT INTO sells VALUES('Walgreens Pharmacy', 52.00, 'Sertraline');
INSERT INTO contract VALUES('CVS', 'CVS Pharmacy Marina', '01-01-2024',
'01-01-2001', 'Super Visor', 'this is a legalese');
INSERT INTO contract VALUES('CVS', 'CVS Pharmacy Watsonville', '01-01-2027',
'01-01-2001', 'Su Vi', 'this also is a legalese');
INSERT INTO contract VALUES('Walgreens', 'Walgreens Pharmacy', '04-01-2027',
'01-01-2000', 'Su Vor', 'so is this');
INSERT INTO contract VALUES('Walmart', 'Walmart Pharmacy', '01-01-2027',
'01-01-2001', 'Susu Sudio', 'this one too');
INSERT INTO contract VALUES('Rite Aid', 'Rite Aid Pharmacy', '01-01-2029',
'01-01-2011', 'Phil Collins', 'this one might not be..');
INSERT INTO doctor VALUES('000-01-3333', 'Dr. Dolittle', 'Family Medicine', 5);
INSERT INTO doctor VALUES('000-02-3333', 'Dr. Doesalot', 'Family Medicine',
15);
INSERT INTO doctor VALUES('000-03-3333', 'Dr. Doesokay', 'Dermatology', 9);
INSERT INTO doctor VALUES('000-04-3333', 'Dr. Doesnothing', 'Psychiatry', 7);
INSERT INTO doctor VALUES('000-05-3333', 'Dr. Doseoff', 'Anesthesiology', 10);
INSERT INTO doctor VALUES('000-06-3333', 'Dr. Dontpanic', 'Emergency medicine',
22);
INSERT INTO patient VALUES('111-08-3333','Chris Cringle', '111 A street',33);
INSERT INTO patient VALUES('111-09-3333', 'Hubert Cumberdale', '111 B street',
55);
INSERT INTO patient VALUES('111-10-3333', 'Mickey Mouse','111 C street', 72);
INSERT INTO patient VALUES('111-11-3333', 'Ol Man Jenkins', '111 D street',
83);
INSERT INTO patient VALUES('111-12-3333', 'Billy Thekid', '111 D street', 10);
INSERT INTO has VALUES('000-01-3333','111-08-3333');
INSERT INTO has VALUES('000-02-3333','111-09-3333');
INSERT INTO has VALUES('000-04-3333','111-10-3333');
INSERT INTO has VALUES('000-02-3333','111-11-3333');
INSERT INTO has VALUES('000-02-3333','111-12-3333');
INSERT INTO prescribes VALUES('000-01-3333','04-18-2019','111-08-3333', 30,
'Levothyroxine');
INSERT INTO prescribes VALUES('000-02-3333','04-18-2019','111-08-3333', 30,
'Levothyroxine');
INSERT INTO prescribes VALUES('000-01-3333','05-01-2019','111-08-3333', 30,
'Azithromycin');
INSERT INTO prescribes VALUES('000-02-3333','05-05-2019','111-08-3333', 30,
'Metformin');
INSERT INTO prescribes VALUES('000-02-3333','05-05-2019','111-09-3333', 30,
'Amlodipine');
INSERT INTO prescribes VALUES('000-02-3333','05-05-2019','111-09-3333', 30,
'Lisinopril');
INSERT INTO prescribes VALUES('000-02-3333','05-05-2019','111-11-3333', 30,
'Lisinopril');
INSERT INTO prescribes VALUES('000-02-3333','05-05-2019','111-12-3333', 30,
'Lipitor');
INSERT INTO prescribes VALUES('000-04-3333','03-01-2019','111-10-3333', 30,
'Sertraline');
INSERT INTO prescribes VALUES('000-04-3333','04-01-2019','111-10-3333', 30,
'Sertraline');
INSERT INTO prescribes VALUES('000-04-3333','05-01-2019','111-10-3333', 30,
'Sertraline');

● Are there any ethical anomalies in prescription writing, such as a patient obtaining the same
prescription from different doctors? We should note that because the patient_ssn, doctor_ssn,
and trade_name together uniquely identify a prescription, and duplicates are deleted, this query
will show us if a patient is obtaining the same prescription from two different doctors.

SELECT name from patient WHERE ssn IN (SELECT patient_ssn FROM prescribes
group by patient_ssn,trade_name having count(trade_name) > 1);

name
-------------
Chris Cringle

● What is the most commonly prescribed drug?

SELECT trade_name, count(*) AS pcount FROM prescribes GROUP BY trade_name
ORDER BY pcount DESC LIMIT 1;

trade_name pcount
------------- ----------
Levothyroxine 2

● What specialty prescribes the most drugs?

SELECT specialty FROM doctor WHERE ssn IN (SELECT doctor_ssn FROM
prescribes GROUP BY doctor_ssn ORDER BY count(*) DESC LIMIT 1);

specialty
---------------
Family Medicine
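
One caveat about the query above: it returns the specialty of the single doctor with the most prescriptions, which is not always the specialty with the most prescriptions overall. The sketch below, using Python's sqlite3 module, contrasts that query with a join that groups by specialty. The tables are cut down to the columns involved and the rows are hypothetical, chosen so that the two approaches disagree; on the report's sample data both give Family Medicine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE doctor(ssn TEXT PRIMARY KEY, specialty TEXT NOT NULL);
CREATE TABLE prescribes(
    doctor_ssn TEXT NOT NULL,
    patient_ssn TEXT NOT NULL,
    trade_name TEXT NOT NULL,
    PRIMARY KEY (doctor_ssn, patient_ssn, trade_name)
);
-- Hypothetical rows, not the report's sample data.
INSERT INTO doctor VALUES
    ('d1', 'Family Medicine'), ('d2', 'Family Medicine'), ('d3', 'Psychiatry');
INSERT INTO prescribes VALUES
    ('d1', 'p1', 'A'), ('d1', 'p2', 'A'),
    ('d2', 'p3', 'A'), ('d2', 'p4', 'A'),
    ('d3', 'p5', 'B'), ('d3', 'p6', 'B'), ('d3', 'p7', 'B');
""")

# The report's approach: specialty of the busiest individual doctor.
by_busiest_doctor = conn.execute("""
    SELECT specialty FROM doctor WHERE ssn IN (
        SELECT doctor_ssn FROM prescribes
        GROUP BY doctor_ssn ORDER BY count(*) DESC LIMIT 1)
""").fetchone()[0]

# Alternative: count prescriptions per specialty across all doctors.
by_specialty = conn.execute("""
    SELECT d.specialty
    FROM prescribes p JOIN doctor d ON d.ssn = p.doctor_ssn
    GROUP BY d.specialty ORDER BY count(*) DESC LIMIT 1
""").fetchone()[0]

print(by_busiest_doctor, by_specialty)
```

Here the psychiatrist writes three prescriptions while the two family medicine doctors write two each, so the first query answers Psychiatry and the second answers Family Medicine.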

● What specialty is most common among primary care physicians?

SELECT specialty FROM doctor WHERE ssn IN (SELECT doctor_ssn FROM has
GROUP BY doctor_ssn ORDER BY count(*) DESC LIMIT 1);

specialty
---------------
Family Medicine

● Which drug has the highest variation in price by pharmacy?

SELECT trade_name FROM sells GROUP BY trade_name ORDER BY max(price) -
min(price) DESC LIMIT 1;

trade_name
----------
Lipitor

Conclusion
In this project we explored the process of applying database systems design to a real-world
problem. This is a multistep process, with each step needing multiple revisions. Even with the
oversimplified business requirements we started with, we found potential pitfalls and oversights at every
turn, underscoring the many decisions a database architect is presented with and emphasizing the
importance of maintaining a continuous dialogue with a client during this process.
In writing our own queries after creating and populating our database, it became clear that
there are additional ways to capitalize on a project like this, such as setting a client up with tools (such
as views) for reporting on their data and tracking metrics. Despite the many steps and revisions in this
process, we concluded that, without any communication from the requesting client, what we have created would
still likely be considered a draft in practice.
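
As one illustration of the reporting tools mentioned above, the price variation query could be packaged as a view. The sketch below uses Python's sqlite3 module; the view name price_spread and its column names are hypothetical, and the rows are the Lipitor and Lisinopril prices from the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sells(
    pharmacy_name TEXT NOT NULL,
    price REAL NOT NULL,
    trade_name TEXT NOT NULL,
    PRIMARY KEY (pharmacy_name, trade_name)
);
INSERT INTO sells VALUES('CVS Pharmacy Marina', 14.00, 'Lisinopril');
INSERT INTO sells VALUES('CVS Pharmacy Watsonville', 14.00, 'Lisinopril');
INSERT INTO sells VALUES('Rite Aid Pharmacy', 35.00, 'Lisinopril');
INSERT INTO sells VALUES('Walmart Pharmacy', 4.00, 'Lisinopril');
INSERT INTO sells VALUES('Walgreens Pharmacy', 19.00, 'Lisinopril');
INSERT INTO sells VALUES('CVS Pharmacy Watsonville', 114.00, 'Lipitor');
INSERT INTO sells VALUES('Rite Aid Pharmacy', 140.00, 'Lipitor');
INSERT INTO sells VALUES('Walmart Pharmacy', 9.00, 'Lipitor');
INSERT INTO sells VALUES('Walgreens Pharmacy', 226.00, 'Lipitor');

-- Hypothetical reporting view: per-drug price range across pharmacies.
CREATE VIEW price_spread AS
SELECT trade_name,
       min(price) AS lowest,
       max(price) AS highest,
       max(price) - min(price) AS spread
FROM sells
GROUP BY trade_name;
""")

# A client can query the view like a table without knowing the aggregation.
report = conn.execute(
    "SELECT trade_name, spread FROM price_spread ORDER BY spread DESC"
).fetchall()
print(report)
```

Because a view is recomputed on each query, the report stays current as rows in sells change.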
