Why Address Standardization and Validation Matters & What You Can Do About It - Data Science Central
Why Address Standardization and Validation Matters & What You Can Do About It
Posted by Farah Kim on June 5, 2020 at 1:54am
Poor address data is a complex data quality challenge that affects customers, businesses, and mailing services. Each year, millions of
dollars are wasted resolving the consequences of poor address data. Mailers spend over $20 billion on undeliverable-as-addressed (UAA)
mail, while direct costs to the USPS are over $1.5 billion per year. All this unnecessary cost is the result of poor, mismanaged,
unvalidated address data.
Over the years, working with Fortune 500 clients, we have seen the consequences of poor address data: disgruntled customers,
ballooning costs, inefficient operations, marketing blunders, embarrassing mistakes. The list goes on.
Take a moment and imagine this.
You have over a million customer records and nearly 23% of those records are either incomplete or inaccurate. That does not take into
account records that are duplicated or unstructured.
That's nearly 230,000 records that may be rendered useless or may cost you thousands of dollars in managing returned mail. This is a
situation most companies face today, regardless of the controls they've put in place. When data is entered by humans, it will always
be significantly flawed.
Sure, it’s human nature to make mistakes. Most of the time, consumers are lax when it comes to providing their address information on
physical or web forms. They may misspell a state name, write abbreviations, miss out a street number or forget their ZIP Code. It’s
inevitable that some mistakes will be made and incorrect data will be entered.
Does that mean companies are helpless? Should poor address data be resolved manually, by calling up customers or
using other records like bank statements and bills to verify? You could do that, but it's going to cost you time and effort. More
importantly, you're not addressing the core problem: the address data has not been standardized and validated according to the USPS
or the equivalent authority in your country.
Unless you place strict data entry controls on your web form or physical form, there is very little chance your data will arrive in a
standardized state. So the first hurdle is address standardization, and there is no way you can do this manually for hundreds of
thousands of rows of data.
Here are the USPS guidelines:
Always put the address and the postage on the same side of your mailpiece.
On a letter, the address should be parallel to the longest side.
All capital letters.
No punctuation.
At least 10-point type.
One space between city and state.
Two spaces between state and ZIP Code.
Simple type fonts.
Left justified.
Black ink on white or light paper.
No reverse type (white printing on a black background).
If your address appears inside a window, make sure there is at least 1/8-inch clearance around the address. Sometimes parts of the
address slip out of view behind the window and mail processing machines can’t read the address.
If you are using address labels, make sure you don’t cut off any important information. Also make sure your labels are on straight.
Mail processing machines have trouble reading crooked or slanted information.
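A few of the text-level rules above (all capitals, no punctuation, one space between city and state, two spaces between state and ZIP Code) are easy to automate. The snippet below is a minimal Python sketch, not a full USPS standardizer. The function names `standardize_line` and `format_last_line` are hypothetical, and keeping the hyphen so that ZIP+4 codes survive is an assumption on our part.

```python
import re

def standardize_line(text: str) -> str:
    """Apply the casing, punctuation, and spacing rules to one piece of address text."""
    text = text.upper()                        # all capital letters
    # No punctuation. The hyphen is kept so ZIP+4 codes survive (an assumption).
    text = re.sub(r"[^\w\s-]|_", "", text)
    return re.sub(r"\s+", " ", text).strip()   # collapse runs of whitespace

def format_last_line(city: str, state: str, zip_code: str) -> str:
    """Build the city/state/ZIP line: one space after the city, two before the ZIP Code."""
    city = standardize_line(city)
    state = standardize_line(state)
    zip_code = standardize_line(zip_code)
    return f"{city} {state}  {zip_code}"

if __name__ == "__main__":
    print(standardize_line("123 Main St., Apt. 4"))
    print(format_last_line("St. Louis", "mo", "63101-1234"))
```

The layout rules (type size, ink color, label placement) concern the printed mailpiece and cannot be handled in the data itself.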
Next, let's talk about validation.
The USPS maintains the official database of addresses in the United States. If you want to check the validity of your address data,
you're going to have to match it against the USPS database. To do that, you will need access to a CASS Certified vendor who will
validate your addresses by matching them against the USPS database. These vendors have updated CASS files, which means any new
address or change in location recorded by the USPS will be available to the vendor.
Here's the tricky part.
To validate this data, you have to standardize it.
Once you've cleansed, standardized, and validated the data, the number of invalid or non-existent records goes down
significantly. You can filter those records, verify the legitimacy of the entity and, if necessary, call them up to ask for accurate information.
It's important though to choose a tool that lets you tackle all three aspects of this problem:
1. Cleaning: The ability to clean up data by identifying common data quality errors (typos, format, non-printable characters, negative
spacing, etc.)
2. Standardization: Turning this data into a format the USPS accepts.
3. Validation or Verification: Verifying this data by matching it with the USPS database.
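As an illustration of the cleaning step, here is a minimal Python sketch that strips non-printable characters and repairs irregular spacing in a single field. It covers only those two error types, not typo correction, and the name `clean_field` is hypothetical rather than taken from any particular tool.

```python
import unicodedata

def clean_field(value: str) -> str:
    """Remove non-printable characters and normalize spacing in one address field."""
    # Fold compatibility forms such as non-breaking spaces and full-width digits.
    value = unicodedata.normalize("NFKC", value)
    kept = []
    for ch in value:
        if ch.isspace():
            kept.append(" ")    # tabs, newlines, and odd spaces become a plain space
        elif unicodedata.category(ch).startswith("C"):
            continue            # drop control and format characters (e.g. NUL, zero-width space)
        else:
            kept.append(ch)
    # Collapse repeated spaces and trim both ends.
    return " ".join("".join(kept).split())

if __name__ == "__main__":
    print(repr(clean_field("  123\tMain\x00 St\u200b \u00a0")))
```

Running the cleaning step before standardization keeps hidden characters from interfering with the matching done during validation.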
Most address verification software does not have strong data matching capabilities, even though matching is at the heart of this
function. Your choice of software should be able to match your address data and give a 100% accuracy rate. If it misses matches because
an entry is not character-for-character identical to the reference record, it is not the right solution for you.
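To make the idea of inexact matching concrete, here is a small Python sketch using the standard library's `difflib.SequenceMatcher`. It is a stand-in for illustration only: commercial address tools use their own matching methods, the two-entry reference list is made up and is not the USPS database, and `best_match` and the 0.85 threshold are hypothetical choices.

```python
from difflib import SequenceMatcher
from typing import Optional

def best_match(query: str, candidates: list[str],
               threshold: float = 0.85) -> tuple[Optional[str], float]:
    """Return the most similar candidate and its score, or (None, score) below the threshold."""
    best, best_score = None, 0.0
    for candidate in candidates:
        # ratio() is a similarity score between 0.0 and 1.0
        score = SequenceMatcher(None, query, candidate).ratio()
        if score > best_score:
            best, best_score = candidate, score
    if best_score >= threshold:
        return best, best_score
    return None, best_score

if __name__ == "__main__":
    # Hypothetical reference records, already standardized.
    reference = [
        "123 MAIN ST SPRINGFIELD IL  62701",
        "456 OAK AVE SPRINGFIELD IL  62702",
    ]
    # Transposed letters would defeat an exact string comparison.
    print(best_match("123 MIAN ST SPRINGFEILD IL  62701", reference))
    print(best_match("999 ELM RD DENVER CO  80202", reference))
```

An exact comparison would reject the first query because of two transposed letters, while a similarity score still pairs it with the intended record. The threshold controls the trade-off between missed matches and false matches and would need tuning on real data.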
At the end of the day though, tools and gadgets can only do so much. You will need to implement certain business strategies that can help
you take care of this problem. These could be:
Training:
The first step towards quality is training. Make sure people who are handling, interacting with, using, and entering data know the
impact they have on the process and on downstream applications. They need to understand the consequences of bad data for the entire
organization and not just for one member or customer. Employees who follow data quality rules should be rewarded and appreciated.
Involve Business Users in the Quality Process:
Data is not just an IT problem. Business users are equally responsible for managing data. In fact, they are the sole owners of customer
data that is often used for marketing and sales purposes. This is why they need to be involved in the process and also need to be trained
to use data management tools.
Data Governance:
Set up a data governance team to create a data management plan and ensure that the organization follows it, so that each employee
understands the plan, their role within the plan, and the expectations that come along with the role.
Lock Down Data & User Roles:
If anyone on your team can open up the CRM or the data source, muddle around with data, and leave no footprints, you are in for serious
trouble. It's necessary to create master data holders who have the right to access, enter, or process critical data. This should be part
of the data management plan.
Remember though, you don't have to do a blanket address quality upgrade. Start small. Identify departments or activities that require
address data to carry out tasks such as mail or package delivery, newsletters, or billing, and start optimizing the data for each process.
You're not a victim of bad data. With plenty of tools and solutions now available, you can sort your data and prevent negative outcomes.
What are you doing in your organization to manage bad address data?