

Chapter 7: Zillow Case Study

1. What is the source data for Zillow?

A: Zillow uses data mining to collect raw data from sources such as county
records, other public data, and user-submitted data. Through data mining, it
spots trends in property values. It uses machine learning to study these
trends and calculate Zestimate values. Data mining also lets the company
track how accurate Zestimate values are over time.
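
To make "how accurate Zestimate values are over time" concrete, here is a
minimal sketch of one common way to measure it: the median absolute
percentage error between an estimate and the later sale price, grouped by
year. The numbers and the function name are made up for illustration; this
is not Zillow's actual method or data.

```python
from collections import defaultdict
from statistics import median

# (year, estimated value, actual sale price) -- hypothetical numbers
sales = [
    (2019, 300_000, 330_000),
    (2019, 450_000, 430_000),
    (2020, 510_000, 500_000),
    (2020, 205_000, 200_000),
]

def median_error_by_year(rows):
    """Median absolute percentage error of the estimates, per year."""
    errors = defaultdict(list)
    for year, estimate, sale_price in rows:
        errors[year].append(abs(estimate - sale_price) / sale_price)
    return {year: median(vals) for year, vals in errors.items()}

error_by_year = median_error_by_year(sales)
```

A falling value from one year to the next would suggest the estimates are
getting closer to real sale prices.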

2. Describe how Zillow uses business intelligence to create a unique
product for its customers.

A: Zillow's database covers more than 90 million homes, which represents
95 percent of the homes in the United States. Zillow uses business
intelligence to recalculate the valuation of each property every day, so it
can provide historical graphs of home valuations over time. Moreover, Zillow
lets its users search for home sales and rentals by criteria such as
monthly rent or mortgage payment, number of bedrooms, and number of
bathrooms.

3. Why would a person searching Zillow want to use a data mart?

A: “A data mart contains a subset of information for a given business unit.”
A person using Zillow would want to use a data mart so that they can
personalize their home search according to their needs, such as their
preferred location, house type, and budget.
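
The idea of a data mart as a subset can be sketched in a few lines. This is
only an illustration under assumed names: the Listing fields, the sample
listings, and the build_mart function are hypothetical and do not come from
Zillow.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Listing:
    city: str
    home_type: str
    price: int
    bedrooms: int

# Hypothetical listings standing in for the full database.
ALL_LISTINGS = [
    Listing("Seattle", "house", 750_000, 3),
    Listing("Seattle", "condo", 420_000, 2),
    Listing("Austin", "house", 510_000, 4),
    Listing("Seattle", "house", 980_000, 5),
]

def build_mart(listings, city, home_type, max_price):
    """Return only the listings that match one searcher's criteria."""
    return [
        item for item in listings
        if item.city == city
        and item.home_type == home_type
        and item.price <= max_price
    ]

# The "mart" for someone who wants a Seattle house under 800,000.
seattle_houses = build_mart(ALL_LISTINGS, "Seattle", "house", 800_000)
```

The searcher then works only with the small subset that matters to them
instead of the whole database.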

4. Why would Zillow use a data lake?

A: “Data lake is a storage repository that holds a vast amount of raw data in
its original format until the business needs it.” The data can later be used
for machine learning, predictive analytics, and similar work. Zillow would
want to use a data lake so that it can store all the raw data it collects and
transform it into a data mart or data warehouse when necessary. Since Zillow
gained a massive amount of traffic in a short span of time, it is very
important for the company to be able to process and manage massive amounts
of data quickly in order to retain its customers.
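
As a rough sketch of "raw data in its original format until the business
needs it", the toy example below keeps JSON and CSV payloads untouched and
parses them into uniform rows only on demand. The sample payloads, the lake
list, and the to_rows function are assumptions made for illustration.

```python
import csv
import io
import json

# A toy "lake": raw payloads kept in their original formats (made-up samples).
lake = [
    {"format": "json", "payload": '{"address": "12 Oak St", "price": "350000"}'},
    {"format": "csv", "payload": "address,price\n9 Elm Ave,410000\n"},
]

def to_rows(entry):
    """Parse one raw payload into dicts with an integer price."""
    if entry["format"] == "json":
        records = [json.loads(entry["payload"])]
    elif entry["format"] == "csv":
        records = list(csv.DictReader(io.StringIO(entry["payload"])))
    else:
        raise ValueError(f"unsupported format: {entry['format']}")
    return [{"address": r["address"], "price": int(r["price"])} for r in records]

# Transformation happens only when structured rows are actually needed.
warehouse_rows = [row for entry in lake for row in to_rows(entry)]
```

Nothing is lost by storing the raw payloads first, and the same lake can
later feed a different transformation if the business asks a new question.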

5. Explain dirty data and its impact on a business.

A: According to Wikipedia, “Dirty data, also known as rogue data, are
inaccurate, incomplete or inconsistent data, especially in a computer system
or database.” Dirty data can damage a business’s reputation by misleading
its clients with inaccurate and faulty data. As a result, the business can lose
its current and potential customers and possibly end up with business


6. What would happen to Zillow if it experienced dirty data?

A: As a real estate database company, Zillow depends on providing accurate
data and information to its target audience. If it experienced dirty data,
it could not provide its service accurately, and customers would no longer
rely on it. The result would be a bad reputation for the company, which
could lead to a loss of revenue and possibly the collapse of the business.
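
To show what catching dirty data might look like in practice, here is a
minimal sketch that flags the three kinds named in the definition above:
incomplete, inaccurate, and inconsistent records. The sample records, field
names, and the find_dirty function are hypothetical, and a real system would
need far more checks.

```python
# Hypothetical property records; three of the four are dirty on purpose.
records = [
    {"id": 1, "address": "12 Oak St", "bedrooms": 3, "price": 350000},
    {"id": 2, "address": "", "bedrooms": 2, "price": 275000},
    {"id": 3, "address": "9 Elm Ave", "bedrooms": -1, "price": 410000},
    {"id": 4, "address": "12 Oak St", "bedrooms": 3, "price": 999000},
]

def find_dirty(rows):
    """Return {record id: [reasons]} for rows that fail basic checks."""
    problems = {}
    first_price = {}  # first price seen for each address
    for r in rows:
        reasons = []
        addr = r["address"]
        if not addr:
            reasons.append("incomplete: missing address")
        if r["bedrooms"] < 0 or r["price"] <= 0:
            reasons.append("inaccurate: impossible value")
        if addr and addr in first_price and first_price[addr] != r["price"]:
            reasons.append("inconsistent: conflicting price for same address")
        if addr:
            first_price.setdefault(addr, r["price"])
        if reasons:
            problems[r["id"]] = reasons
    return problems

dirty = find_dirty(records)
```

Records flagged this way could be held back from valuations until someone
corrects them, so faulty values never reach customers.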
