Open Mart Improvement Plan
Introduction
Retail analytics is the in-depth process of improving a retail operation through smarter and more effective business decisions. These decisions are driven by the analysis of data, which supports the possible choices and options for retail companies. This data is retrieved through studies that include analyzing past retail transactions and the details of each. Trends observed in the retail data lead to predictions about the future and ultimately to a more efficiently run business. Retail analytics can help implement an entirely new system in a retail operation, completely transforming the way a business runs.
Problem Description
Open*Mart is a retail company specializing in providing customers with the products they need, whether home appliances, groceries, clothing, computers, or more. The company provides its customers with specialized sections, each focusing on a different product type. With multiple locations around the U.S., Open*Mart aims to run each location at top efficiency.
Dr. Yoo, the manager of the Monroeville, PA Open*Mart location, is interested in conducting a retail analysis of his store. He is looking to improve sales and increase profits, all while making things run more smoothly. As a retail manager, Dr. Yoo is not confident enough to improve his store on his own. After receiving a call from his CEO, Dr. Liying, Dr. Yoo is hoping to provide Dr. Liying with a detailed analysis of an efficiently run store.
Dr. Yoo has hired Mr. Reddy, an employee of Customer Relationship Management (CRM), to provide him with detailed transaction and demographic data. Mr. Reddy collected transaction data covering the previous two years of sales. This data was loaded into a data warehouse. The transaction data contains the following information: Customer ID, Item Type, Item Number, Vendor ID, Week, Day, and Units Bought. Each attribute of each transaction is stored as a number whose meaning is defined in a data dictionary.
In addition to the transaction data, Mr. Reddy contacted his manager, Mr. David, to assist him in the data analysis. Mr. David gathered demographic data on the customers of the Monroeville, PA Open*Mart location. It includes each customer's family size, income, ethnicity, pets, TVs, ages, children, work hours, occupation, education, and magazine subscriptions. Like the transaction data, the demographic data is stored as numbers defined in a data dictionary.
The problem at hand for Dr. Yoo is a store below the quality Dr. Liying would approve of. He is relying on Mr. Reddy to provide him with the report needed to impress Dr. Liying with the status of his store.
Project Objectives
Dr. Yoo's overall aim is to improve his store. This can be achieved by improving sales and improving efficiency.
Improving sales will be accomplished by pulling in more customers. Through advertising and couponing, more customers will learn about more deals and more products. It is important to know how to advertise to customers based on their specific needs. Customers can be grouped based on family and transaction characteristics. Family characteristics include family size, ages, ethnicity, income, and more. Transaction characteristics include items bought, quantity of items bought, frequency of purchases, items bought together, and more.
Improving efficiency will be achieved by analyzing trends in purchases. By knowing what has been purchased together and when it was purchased, Dr. Yoo will know what items to keep in stock for the future. Efficiency can also be improved through store layout. When items that are frequently purchased together are placed near each other, customers can find their desired products more quickly. Simple ways to locate wanted items are important; they keep customers happy so that they are likely to return to Dr. Yoo's store.
Methodology
This project requires turning a large set of raw data into usable information, using explicit data mining techniques to give a store specific advice and recommendations.
The first step was to understand the database schema. The schema needs to cover all the information included in this project, and it needs primary and foreign keys in order to build relationships between tables. Once the schema is complete, an ER diagram can be drawn from it. The ER diagram for this project needed enough tables and relationships to completely cover the data being used. It was determined that 7 tables should be used: coupon usage, customer information, female information, male information, items in the store, transaction, and subscription. Six relationships were needed to link the tables: items bought, coupons used, subscriptions, transactions, males in household, and females in household.
The next step was to set up these tables in Microsoft Access and import the data from Microsoft Excel. The data types and relationships were then defined so that the Access file holds all the information and associates it logically.
The Microsoft Access database allows queries to be written to find target data, and this was the main focus of the next step. The queries written were: what items are bought together, what items people with children buy, where people get the most coupons from, what the major subscriptions are and what their subscribers buy, who the top customers are and what the top products are, and what the most popular brand of snacks is. These queries make it easy to identify what items to advertise, how to lay out the store, and what types of products to sell more of.
Beyond queries, another method of gathering information from a database is k-means clustering. K-means clustering is an advanced algorithm that groups customers with similar buying habits. It was implemented in Microsoft Excel with VBA code to take the purchasing information for 2 products and group their buyers by how much of each they buy. This was done for 8 pairs of items to better understand customers' buying habits.
Along with k-means clustering, another tool for understanding customers' buying habits is similarity analysis. Similarity analysis gives a good understanding of what items are purchased by a certain demographic of people. This information can be used to send coupons and advertisements to the demographics that buy a product the most.
The last step in the project was to take all the information gathered in the previous steps and make detailed recommendations that could benefit the company. These recommendations include product placement within the store, whom to advertise certain products to, what products to buy more of, and what deals to give on items bought together. They could save the company a lot of money on advertising by targeting only a selected demographic, and they can lead to higher customer loyalty by sending deals to frequent customers.
Database Design
The information given about the store's transactions and customers was examined and split into seven tables in order to make it easier to analyze. Each table's name and attributes can be seen below in the database schema.
Transaction(TransactionID, CustomerID, Week, Day, UnitsBought)
Item(TransactionID, ItemType, ItemNumber, VendorID)
Coupon(TransactionID, CouponValue, CouponOrigin)
Customer(CustomerID, FamilySize, Income, Ethnicity, Dogs, Cats, NumberTVs, Children)
Subscriptions(CustomerID, Cable, Newspaper, BetterH&G, GoodHouse, LadiesHJ, McCalls, Redbook, ReadersDigest, Cosmopolitan, TVGuide, People, Glamour, Time, Newsweek)
MaleInformation(CustomerID, Age, WorkHours, Occupation, Education)
FemaleInformation(CustomerID, Age, WorkHours, Occupation, Education)
In order to set up the relationships between the tables given above, an ER diagram was constructed. The ER diagram can be found in the Appendix and shows how the whole database is related, as well as the primary keys for each table and all of the remaining attributes.
Below are all of the queries that were used to analyze the information in the database. For each query there is a short summary of what it is meant to return, the code that was written, and a sample of the results.
Types of Items Bought Organized by Number of Children: This query lists each type of item bought alongside the number of children the customer has, and counts the purchases for each combination. It is used to determine how to advertise to people with children and also to better organize the store.
SELECT customer.children, item.itemtype, count(item.itemtype) AS numberofitems
FROM customer, item, [transaction]
WHERE customer.customerid = transaction.customerid
AND transaction.transactionid = item.transactionid
GROUP BY customer.children, item.itemtype;
Image 1
Where the Coupons Came From: This query shows where each of the coupons used came from. It was used to decide where to place coupons so that they will be seen and used the most.
SELECT couponorigin, count(couponorigin) AS number_used
FROM coupon
WHERE couponorigin > 18
GROUP BY couponorigin
ORDER BY count(couponorigin) DESC;
Image 2
Number of Each Subscription that the Customers Have: This example counts the customers who subscribe to Better Homes & Gardens. The same code was repeated for each type of subscription. The query was used to determine which magazines are the most popular so that coupons and advertisements can be placed more efficiently.
SELECT count(betterhg) AS Better_home_garden
FROM subsciption
WHERE betterhg = "yes";
Image 3
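The per-magazine query above was repeated once for each subscription column. As a minimal sketch of making the same tally in one pass, the Python below assumes the subscription rows have been exported as dictionaries holding "yes"/"no" values. The function name, column names, and sample rows are illustrative assumptions and are not taken from the project data.

```python
from collections import Counter


def count_subscriptions(rows, id_field="CustomerID"):
    """Count how many customers answered "yes" for each subscription column."""
    counts = Counter()
    for row in rows:
        for column, value in row.items():
            # Skip the customer identifier; every other column is a subscription flag.
            if column != id_field and str(value).strip().lower() == "yes":
                counts[column] += 1
    return counts


# Hypothetical sample rows, for illustration only.
sample_rows = [
    {"CustomerID": 1, "Newspaper": "yes", "ReadersDigest": "yes", "BetterHG": "no"},
    {"CustomerID": 2, "Newspaper": "yes", "ReadersDigest": "no", "BetterHG": "no"},
    {"CustomerID": 3, "Newspaper": "yes", "ReadersDigest": "yes", "BetterHG": "yes"},
]

# Subscriptions ranked from most to least popular.
ranking = count_subscriptions(sample_rows).most_common()
print(ranking)
```

Ranking all subscriptions at once makes it easy to pick the top three without running a separate query for each magazine.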
Number of Each of the Items Bought: This query gives the total number of units sold for each item type, which shows the top and bottom sellers. The results can be analyzed to determine placement in the store along with how to advertise the items.
SELECT Item.ItemType, SUM(Transaction.UnitsBought) AS TotalUnits
FROM Item, [Transaction]
WHERE Item.TransactionID = Transaction.TransactionID
GROUP BY Item.ItemType;
Image 4
Top Customers: This query finds the top 10 customers by the number of units they bought. This information is useful for sending special promotions to these people in order to keep them loyal to the company.
SELECT TOP 10 Sum(Transaction.UnitsBought) AS TotalUnits, Customer.CustomerID
FROM Customer, [Transaction]
WHERE Customer.CustomerID = Transaction.CustomerID
GROUP BY Customer.CustomerID
ORDER BY SUM(Transaction.UnitsBought) DESC;
Image 5
Items Bought by TV Owners: This query counts the purchases of each item type made by customers who have a cable subscription. This information is useful in determining what items to advertise on television.
SELECT item.itemtype, count(transaction.unitsbought) AS number_units_bought
FROM item, subsciption, [transaction]
WHERE subsciption.customerid = transaction.customerid
AND transaction.transactionid = item.transactionid
AND subsciption.cable = "yes"
GROUP BY item.itemtype;
Image 6
Items Bought by People Who Have the Top 3 Subscriptions: The first query gives the types of items bought by people who subscribe to Better Homes & Gardens. Better Homes & Gardens was determined to be the 3rd most popular subscription, so knowing which items its subscribers bought can help determine what items to advertise.
SELECT item.itemtype, count(item.itemtype) AS Number_of_units
FROM item, subsciption, [transaction]
WHERE subsciption.customerid = transaction.customerid
AND transaction.transactionid = item.transactionid
AND subsciption.betterhg = "yes"
GROUP BY item.itemtype
ORDER BY count(item.itemtype);
Image 7
Items Bought by People Who Have a Subscription to Reader's Digest: The next query gives the types of items bought by people who subscribe to Reader's Digest. Reader's Digest was the 2nd most popular subscription, so knowing which items its subscribers bought can help determine what items to advertise.
SELECT item.itemtype, count(item.itemtype) AS Number_of_units
FROM item, subsciption, [transaction]
WHERE subsciption.customerid = transaction.customerid
AND transaction.transactionid = item.transactionid
AND subsciption.readersdigest = "yes"
GROUP BY item.itemtype
ORDER BY count(item.itemtype);
Image 8
Items Bought by People Who Have a Subscription to the Newspaper: This query is for people who subscribe to the newspaper. The newspaper had more subscribers than any magazine, so knowing which items its subscribers bought can help determine what items to advertise. Also, because the newspaper is circulated in local areas, it is the most effective way to advertise to subscription holders.
SELECT item.itemtype, count(item.itemtype) AS Number_of_units
FROM item, subsciption, [transaction]
WHERE subsciption.customerid = transaction.customerid
AND transaction.transactionid = item.transactionid
AND subsciption.newspaper = "yes"
GROUP BY item.itemtype
ORDER BY count(item.itemtype);
Image 9
Items Bought Together: This query shows which items are bought in the same weeks by totaling the units of each item type sold per week. This information is useful for deciding which special deals to place on items and how to place the items in the store.
SELECT Transaction.Week, Item.ItemType, SUM(Transaction.UnitsBought) AS Item_Bought
FROM [Transaction], Item
WHERE Transaction.TransactionID = Item.TransactionID
GROUP BY Transaction.Week, Item.ItemType
ORDER BY transaction.week;
Image 10
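The query above totals units per item type per week. A more direct view of which items are bought together is to count how often two item types share the same basket. The Python below is a minimal sketch of that idea and is not part of the original analysis. It assumes a basket key is available for each item row: the transaction ID if one transaction can hold several item types, or otherwise a customer-and-week pair. The function name and the sample rows are hypothetical.

```python
from collections import Counter
from itertools import combinations


def count_item_pairs(item_rows):
    """Count how often two item types appear in the same basket.

    item_rows: iterable of (basket_key, item_type) tuples.
    Returns a Counter keyed by (smaller_item_type, larger_item_type).
    """
    # Collect the distinct item types found in each basket.
    baskets = {}
    for basket_key, item_type in item_rows:
        baskets.setdefault(basket_key, set()).add(item_type)

    # Count every unordered pair of item types once per basket.
    pair_counts = Counter()
    for items in baskets.values():
        for pair in combinations(sorted(items), 2):
            pair_counts[pair] += 1
    return pair_counts


# Hypothetical rows: (basket key, item type).
sample_items = [(1, 17), (1, 12), (1, 5), (2, 17), (2, 12), (3, 12), (3, 5)]
pairs = count_item_pairs(sample_items)
print(pairs.most_common())
```

Pairs with the highest counts are the natural candidates for bundle deals and adjacent shelf placement.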
Number of Units Sold of Item 17 by Each Vendor: This query breaks down how much of item type 17 each vendor sells. This is useful for seeing which vendor has the most popular product, in order to buy more from that vendor and less from vendors of unpopular brands.
SELECT item.vendorid, count(transaction.unitsbought) AS units_bought
FROM item, [transaction]
WHERE item.itemtype = 17
AND item.transactionid = transaction.transactionid
GROUP BY item.vendorid
ORDER BY count(transaction.unitsbought) DESC;
Image 11
Analytics
K-means clustering was used in order to group customers together based on the products that they buy. The top four items that customers buy and the bottom two items that customers buy were compared using k-means clustering. First, the number of units bought by each customer was used to construct a list of each customer and how many of each of the two items they bought. Next, the three columns of information (customer ID and the number of each item bought by that individual) were put into an Excel file that already contained the Visual Basic code for k-means clustering. The VBA code was altered for each individual situation: the number of clusters was either 3 or 4, and the number of data points differed from case to case. After the code was properly altered, it was run, and the results gave the cluster each customer belonged to and the centroid of each cluster. From this information a plot could be constructed, making it easy to see where the clusters fell on the graph. All of the k-means clustering plots used in the analysis can be seen in the Appendix.
The plots were then studied in order to determine which groups of customers would be best to use for similarity analysis. For instance, if a cluster of customers is buying a lot of one item and only a little of another, these people could be offered promotions that would get them to buy more of the less-bought item. K-means clustering can also reveal clusters of people who may already be interested in a certain item, and coupons could get them to buy more of it.
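The clustering in this project was run with VBA code in Excel, which is not reproduced in this report. As a rough illustration of the procedure described above, the Python below sketches standard k-means (Lloyd's algorithm) on two-column data, where each point is one customer's units bought of two items. The function names, the option to pass starting centroids, and the sample purchases are assumptions made for this example only.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans(points, k, initial_centroids=None, max_iterations=100, seed=0):
    """Cluster 2-D points; returns (labels, centroids)."""
    if initial_centroids is None:
        # Assumption: start from k randomly chosen data points.
        centroids = random.Random(seed).sample(points, k)
    else:
        centroids = list(initial_centroids)

    labels = []
    for _ in range(max_iterations):
        # Assignment step: each customer joins the nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                mean_x = sum(m[0] for m in members) / len(members)
                mean_y = sum(m[1] for m in members) / len(members)
                new_centroids.append((mean_x, mean_y))
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster in place
        if new_centroids == centroids:
            break  # converged: centroids stopped moving
        centroids = new_centroids
    return labels, centroids


# Hypothetical data: (units of item A, units of item B) for nine customers.
purchases = [(1, 1), (2, 1), (1, 2), (20, 20), (21, 20), (20, 21),
             (1, 20), (2, 21), (1, 21)]
labels, centroids = kmeans(purchases, k=3,
                           initial_centroids=[(1, 1), (20, 20), (1, 20)])
print(labels)
print(centroids)
```

In this made-up data the third cluster buys little of item A and a lot of item B, which is the kind of group the paragraph above suggests targeting with a promotion for the less-bought item.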
Results
Item Sale Analysis
Looking at the total sales for each item overall and per day helps visualize and understand overall sales.
Graph 1 shows the total sales of each item over the two-year period. This reiterates the data given in the queries.
Graph 1 (Units Sold)
Graph 2 (Units Sold)
Graph 2 shows the total sales of each item per week. This will help Dr. Yoo to know when the store will be busiest during the week.
Similarity Analysis
Similarity Coefficients Matrix
      1     2     3     4     5     6     7     8     9     10
1     -
2     0.33  -
3     0.33  0.67  -
4     0.67  0.67  0.33  -
5     0.67  0.33  0.33  0.67  -
6     0.67  0.67  0.67  0.67  0.33  -
7     0.33  0.67  0.67  0.33  0.67  0.33  -
8     0.33  1     0.67  0.67  0.33  0.67  ...   -
9     0.17  0.5   0.5   0.5   0.5   0.5   ...   0.5   -
10    0.5   0.17  0.17  0.5   0.83  0.17  ...   0.17  0.67  -
Table 1
This similarity coefficient matrix compares the top ten customers, those who bought the most items within the analyzed time frame. The matching coefficient was used to obtain the values shown. They are based on 6 attributes that were used to determine the similarities between the 10 families. Analyzing the matrix, the attributes shared by the two most similar families suggest that targeting other families with the same attributes would sell more items in the store. The people that should be targeted when creating advertisements are families of at least three people with average incomes, under $35,000. Both families also did not subscribe to the newspaper, so newspaper ads would not be as effective as other types of advertisement. Neither family had pets, so specials on animal supplies would also not affect these shoppers.
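The report does not spell out the formula, so the sketch below assumes the matching coefficient is the simple matching coefficient: the number of attributes on which two customers have the same value, divided by the number of attributes compared. With 6 attributes this gives multiples of 1/6 (0.17, 0.33, 0.5, 0.67, 0.83, 1), which agrees with the values in Table 1. The attribute choices, the customer profiles, and the function names in this Python example are hypothetical.

```python
def matching_coefficient(a, b):
    """Share of attributes on which two customers have the same value."""
    if len(a) != len(b) or not a:
        raise ValueError("profiles must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def similarity_matrix(profiles):
    """Lower-triangular matrix as {(row_id, column_id): coefficient}."""
    ids = list(profiles)
    matrix = {}
    for position, row_id in enumerate(ids):
        for column_id in ids[:position]:
            coefficient = matching_coefficient(profiles[row_id], profiles[column_id])
            matrix[(row_id, column_id)] = round(coefficient, 2)
    return matrix


# Hypothetical profiles with 6 attributes each:
# (family size, income, newspaper, cable, pets, children)
profiles = {
    1: ("3+", "under35k", "no", "yes", "no", "yes"),
    2: ("3+", "under35k", "no", "no", "no", "no"),
    3: ("1-2", "over35k", "yes", "no", "no", "no"),
}

matrix = similarity_matrix(profiles)
# The pair with the largest coefficient is the most similar pair of customers.
most_similar = max(matrix, key=matrix.get)
print(matrix, most_similar)
```

The attributes shared by the most similar pair are then read off as the profile to target, as done in the paragraph above.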
Similarity Analysis on Different Items
These five similarity coefficient matrices were taken from specific clusters in the k-means clustering data. Four of them compare the most-bought items in order to learn the attributes of the people who buy the most from the store. The attributes considered included family size, income, children, and certain subscriptions. The largest number in each matrix identifies the two customers that were most similar. Since they are buying the top items, their attributes were analyzed to find whom to target with advertisements.
Similarity Coefficients Matrix for items 8 and 12
     1     2     3     4     5
1    -
2    0.57  -
3    0.71  0.57  -
4    0.57  0.71  0.57  -
5    0.57  0.71  0.57  1.00  -
Table 2
Here customers 4 and 5 are the most similar, so they were examined more closely to generalize what type of person is most likely to buy the two items. The people most likely to buy eggs and cook are one-person households that make no more than $35,000 and have no subscriptions to cable or the newspaper. This helps in advertising because it shows that, for these two items, the newspaper and TV are not effective places to advertise.
Similarity Coefficients Matrix for items 8 and 3
     1     2     3     4     5
1    -
2    0.43  -
3    0.43  0.71  -
4    0.43  0.43  0.43  -
5    0.57  0.86  0.57  0.51  -
Table 3
In this matrix items 3 and 8 were compared. Customers 2 and 5 were the most similar, so they were analyzed further. It was observed that one-person households that make no more than $35,000 and do not subscribe to the newspaper should be targeted for these two items. It would be a good idea to group all three items (items 3, 8, and 12) together when advertising, because similar people buy all three.
Similarity Coefficients Matrix for items 17 and 3
     1     2     3     4     5
1    -
2    0.43  -
3    0.57  0.57  -
4    0.43  0.43  0.57  -
5    0.43  0.43  0.86  0.71  -
Table 4
In this matrix, families 5 and 4 were most similar. Their attributes were not all the same; however, they did share the attribute of children under the age of 11. When trying to sell more snacks and butter, it is recommended to target families with kids. Both families have cable, so cable advertisements would be an efficient way to reach them.
Similarity Coefficients Matrix for items 17 and 12
     1     2     3     4     5
1    -
2    0.57  -
3    0.57  0.43  -
4    0.57  0.71  0.43  -
5    0.57  0.71  0.71  0.43  -
Table 5
This similarity matrix did not do a good job of showing whom to target. Families 2, 4, and 5 were examined to see what they had in common. It was found that when advertising snacks and eggs, newspaper ads would not be very effective, because none of the customers who bought these items subscribes to the newspaper.
Similarity Coefficients Matrix for items 17 and 15
     1     2     3     4     5
1    -
2    0.57  -
3    0.57  0.14  -
4    0.29  0.43  0.43  -
5    0.57  0.43  0.43  0.71  -
Table 6
This analysis gave the similarities between a top-selling item, snacks, and a lower-selling item, pizza. The two families that had the same income, under $35,000, both had a newspaper subscription; this suggests a newspaper ad would be an effective choice.
Time Series
Time series graphs give a visual representation of the amount of each product bought per week over the two-year period.
Graph 3 (units sold per week, all items, weeks 1 to 101)
Graph 3 shows the time series for all items over the two-year period. It is extremely difficult to read; however, from this data in Excel, the data for any item can be pulled and placed into an individual graph. Graph 4 shows the sales of the top 5 items, while graph 5 shows the sales of the bottom 5 items.
Graph 4 (units sold per week, weeks 1 to 101: Butter, Cereal, Cook, Eggs, Snack)
Graph 5 (units sold per week, weeks 1 to 101: Cleansers, Nuts, Pill, Pizza, Soq)
From these graphs it is easy to see the difference in sales while also noting high-selling and low-selling weeks. This data is taken from queries on the weeks, the items, and the number of each item sold per week. It matters to many areas of the analysis. Time series allow Dr. Yoo to estimate sales over the year, which informs ordering and stocking numbers. Cutting down on extra products ordered can save money; likewise, not ordering enough products can cause disappointed customers and declines in sales. Trends in purchases allow Dr. Yoo to be fully prepared each year.
Other helpful graphs would relate high-selling items to low-selling items. These show the weeks in which high-selling items peak and low-selling items drop.
Graph 6 (units sold per week, weeks 1 to 101)
As his data supply increases, he can look for weeks during the year when one item's sales always spike, year after year. This will allow him to pair this item with another item that has particularly low sales during that week. With coupons that discount lower-selling items when bought with regularly priced higher-selling items, the sales of those lower-selling items will increase.
An example can be taken from the data shown in graph 6, which shows the time series for butter, eggs, nuts, bacon, and pizza. Butter and eggs spike above 150 units a few times over this two-year period. With such high sales, it is likely that this trend appears every year. Looking at week 22 in graph 6, butter spikes to 250 sales in one week. With so many purchases, it would be wise to create a coupon that offers nuts at a discounted price when a customer purchases butter. Nuts are the lowest-selling item during week 22. The sales of nuts should increase, thereby increasing Dr. Yoo's profits.
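The pairing rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: weekly sales are assumed to be available as a mapping from week number to units sold per item, and the function name is hypothetical. Apart from butter's 250 units in week 22, every number in the sample is made up.

```python
def weekly_coupon_pairs(weekly_sales):
    """For each week, pair the best-selling item with the worst-selling item.

    weekly_sales: {week: {item_name: units_sold}}
    Returns {week: (top_item, bottom_item)}.
    """
    pairs = {}
    for week, sales in weekly_sales.items():
        if len(sales) < 2:
            continue  # a pair needs at least two items
        top_item = max(sales, key=sales.get)
        bottom_item = min(sales, key=sales.get)
        pairs[week] = (top_item, bottom_item)
    return pairs


# Hypothetical weekly totals; only butter's 250 units in week 22 comes from the report.
sample_sales = {
    22: {"butter": 250, "eggs": 140, "nuts": 12, "bacon": 60, "pizza": 30},
    23: {"butter": 90, "eggs": 160, "nuts": 25, "bacon": 55, "pizza": 18},
}

coupon_pairs = weekly_coupon_pairs(sample_sales)
print(coupon_pairs)
```

Run over several years of data, the same function would show which weeks produce the same pairing year after year, which is when a recurring coupon is most likely to pay off.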
Recommendations
After thoroughly analyzing the data that was supplied, recommendations were developed to help improve Open*Mart's business. One of the queries gave the number of each product bought by customers that have TVs. From the results of this query it was determined that the three top items from this list should be advertised on TV. These items are item 17 (snacks), item 5 (cereal), and item 12 (eggs).
A query comparing which products subscribers to Better Homes & Gardens bought was run in order to determine which items would be best to advertise in this magazine. It was found that item 17 (snacks), item 12 (eggs), and item 8 (cook) would all benefit from being advertised in Better Homes & Gardens. This would further entice the people that buy these items to come to Open*Mart to buy them. Similar queries were written for Reader's Digest and the newspaper. Both of these resulted in the recommendation to advertise item 17 (snacks), item 5 (cereal), and item 12 (eggs) in those publications.
By analyzing the vendors that supply the store's items, it was found that Open*Mart should continue to buy item 17 (snacks) from vendors 28400, 41200, and 17423. These three vendors provide the brands that sell best.
It was determined that the top 3 customers have the ID numbers 15538702, 15514612, and 15104398. Since these three customers are the most loyal to Open*Mart and buy the most items, coupons should be sent to them for a certain percentage off their next purchase. This would be a good way to promote customer loyalty and reward the store's best supporters.
By looking into where most of the coupons used originate, the best way of providing coupons was found. Open*Mart should put more coupons in the Sunday Supplement Vendor, the Newspaper Ad Store, and in-pack with other purchases.
It was also found that households with children over 18 buy the most from Open*Mart. Due to this finding, it would be beneficial to send these families coupon booklets so that they keep coming back to Open*Mart to spend their money.
The setup of the store can be very helpful in promoting the items that people normally don't buy.
Open*Mart should strategically place its lowest-selling items in the front of the store, where people constantly walk in and out. Similarly, the top-selling items should be placed in the back of the store so that customers have to walk by all the other items and advertisements in order to get to what they came for. This will influence patrons to buy extra items when they come into Open*Mart, which in turn will sell more products.
The recommendations that resulted from the similarity analysis can be summed up with a few generalizations. Advertise to people with lower incomes and small families. Also place more advertisements on billboards, because not everyone subscribes to cable, the newspaper, or magazines. Families with kids are more inclined to buy snacks; this could be used to advertise other products that might not be selling as well. Putting a coupon on certain snack items could help boost sales.
The time series graphs in the results section can assist with couponing and increasing item sales. Dr. Yoo could implement a system that creates coupons for the highest and lowest selling items each week. As seen in graph 6, the highest and lowest selling items can be paired together, marketed as a group, and offered through a weekly "item of the week" coupon. This will keep customers interested and encourage them to continue shopping at his store.
Group Members and Roles
The beginning of the project took a lot of brainstorming. This stage of the project was mainly a group discussion about how we would tackle this assignment. As the assignment went on, the tasks were split. Below is a list of team members and their contributions to the team.
Patrick Clifford: Queries, Methodology
Joe Gigliotti: Similarity Analysis, Data Input
Brittany Murphy: Planning, Time Series Analysis, Intro/Problem Description/Project Objective of Report
Michael Tomashefski: K-Means Clustering, ER Diagram, Store Layout
Insights in Furthering into Future
Industrial engineering and consulting are continuous-improvement types of work. Dr. Yoo's Monroeville location has been given a complete update. The way that he incorporates both the customer and transaction data into how he runs his store has been transformed in ways that are more efficient and more beneficial for both customers and profit. However, there is always room to improve, and with unlimited time, Dr. Yoo's Open*Mart could be improved even further.
In the future Dr. Yoo could implement further and even more in-depth data analysis. This could include examining more than just the top 10 customers. In order to increase sales it is important to figure out how Dr. Yoo could turn his bottom customers into top customers. The top customers have been analyzed, and Dr. Yoo now understands what they are interested in and why they shop at Open*Mart. It could be helpful to someday understand every customer and what they look for when they shop at Open*Mart.
Through the time series charts, it is easy to see the item with the highest peak each week. Dr. Yoo could keep records of the top items each week along with the lowest items each week during the year. He could offer coupons that pair the lowest-selling item with the highest-selling item of each category (food, clothing, etc.) to increase the sales of the lowest-selling item for that week. Analyzing all items would take more time than the few that have been analyzed by Dr. Yoo in his first data analysis.
Increasing inventory could be another way for Dr. Yoo to reach out to more customers and increase sales. Some of his bottom customers may not be interested in the current inventory. Location expansion could be an idea for the CEO, Dr. Liying, to look into for the future of not only the Monroeville location, but all Open*Mart locations across the nation.
In addition to location improvements, Dr. Liying could look into company-wide Open*Mart improvements. In order for individual locations to run efficiently, it is important for the Open*Mart distribution centers to also run efficiently. Possible improvements in the distribution centers include automation, warehouse storage, truck packing during shipments, operation hours, and more. Retail works under the domino effect; keeping the top of the company efficient keeps all branches under the Open*Mart name running efficiently as well.
Acknowledgements
This project was greatly enhanced by the helpfulness and advice of Manini Madireddy and Akshay Ghurye. It would not have been possible without their help. They were a great resource of information throughout this project.
References
Elmasri, Ramez, and Shamkant Navathe. Fundamentals of Database Systems. 4th ed. Pearson Addison Wesley, 2003. Print.