
DATA MIGRATION

Data migration is necessary when a company upgrades its database or system software, either
from one version to another or from one program to an entirely different program. Software can
be specifically written, using entire programs or just scripts, to facilitate data migration. Such
upgrades or program switches can take place as a result of regular company practices or as a
result of directives mandated in the wake of a company takeover.

Data migration is a key element to consider when adopting any new system, either through
purchase or new development. One would think that any two systems maintaining the same sort of data perform similar tasks, and that information from one should therefore map to the other with ease. In practice, this is rarely the case.

Although migrating data can be a fairly time-consuming process, the benefits can be worth the cost for those that "live and die" by trends in data. Additionally, old applications need not be maintained. This section discusses three topics: what data migration is, deciding whether to migrate legacy data, and how to migrate data.

What is data migration?

Some key terms in understanding data migration are:

• Legacy data is the recorded information that exists in your current storage system, and can include database records, spreadsheets, text files, scanned images, and paper documents. All of these data formats can be migrated to a new system.
• Data migration is the process of importing legacy data into a new system. This can involve entering the data manually, moving disk files from one folder (or computer) to another, database insert queries, developing custom software, or other methods. The specific method used for any particular system depends entirely on the systems involved and the nature and state of the data being migrated.
• Data cleansing is the process of preparing legacy data for migration to a new system. Because the architecture and storage method of a new or updated system are usually quite different, legacy data often do not meet the criteria set by the new system and must be modified prior to migration. For example, the legacy system may have allowed data to be entered in a way that is incompatible with the new system. Architecture differences, design flaws in the legacy system, or other factors can also render the data unfit for migration in its present state. The data cleansing process manipulates, or cleans, the legacy data so it conforms to the new system's requirements.
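
To make cleansing concrete, here is a minimal sketch in Python, assuming a legacy system that recorded dates in several inconsistent formats and a new system that requires ISO 8601 dates; the format list and the fallback behavior are hypothetical:

    # Normalize hypothetical legacy date strings to ISO 8601 (YYYY-MM-DD).
    from datetime import datetime

    LEGACY_DATE_FORMATS = ["%m/%d/%Y", "%m-%d-%y", "%Y%m%d"]

    def clean_date(raw):
        """Return an ISO 8601 date string, or None if the value is unusable."""
        raw = raw.strip()
        for fmt in LEGACY_DATE_FORMATS:
            try:
                return datetime.strptime(raw, fmt).date().isoformat()
            except ValueError:
                continue
        return None  # flag for manual review rather than guessing

    print(clean_date("04/17/1998"))  # 1998-04-17
    print(clean_date("19980417"))    # 1998-04-17
    print(clean_date("April 1998"))  # None -> needs manual review

Values that cannot be parsed come back as None, so they can be routed to manual review instead of being silently guessed at.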


Do we migrate legacy data?

The "Do we migrate legacy data?" question has been asked ever since the first companies
put data in one repository and decided to change systems. Here are some commonly asked
questions:

• Should we bring the data over to the new system?
• If so, should we bring all or part of the data?
• If just part, which parts? Based on creation date? Based on case status (open or closed)? Or some combination of these?
• If we choose to bring over data on or after a certain date, which should it be: the last 3 months, the last 6 months, the last year? Or should it be all the data?
• Should we filter specific data in or out?
• Are our desired criteria extractable from the existing database?
• Are the desired fields of data importable into the new system?
• Is data migration from our legacy system included in the purchase of the new system?
• If not, do we have the expertise in-house to script an automated process to migrate the data?
• If not, will we hire someone to do this?
• Will we perform this task manually?

When deciding on data migration, examine all factors before assuming that either the whole dataset or none of it should be moved to the new system. The real test is whether these data records will be used and acted upon once the new system and process are in place. Two key variables to consider in deciding on data migration are data volume and data value.

Data volume is the easier variable in the decision process. How many data records are we talking about: 1,000, 10,000, 100,000, 250,000? How many are expected to come into the new system on a weekly or monthly basis to replenish this supply? Check whether there are any technical barriers to bringing over a certain amount of data, and whether large databases will affect the performance of system functions such as searching. If there are no such barriers, then 10 records or 100,000 records should not make any difference.
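
As a rough sketch of how volume might be checked, assuming the legacy data are reachable through a SQL interface (Python's built-in sqlite3 module is used here as a stand-in, and the database and table names are hypothetical):

    # Count records in a few hypothetical legacy tables to size the migration.
    import sqlite3

    conn = sqlite3.connect("legacy.db")
    for table in ["client", "diagnosis", "prescription"]:
        count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        print(f"{table}: {count} records")
    conn.close()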

If volume is low, it may well be worth doing a migration so that users have some base of data to work with and to use for trend analysis. If volume is high, it may make sense to examine the age and value of the data and start filtering on certain criteria.

Data value is a much harder variable in the decision process. Many times there are
different perceptions concerning what value the existing data provide. If users are not
working with older data in the current system, chances are they may not work with older
data in the new system even with improved search functionality. If migrating, you may
want to look at shorter-term date parameters - why bog down a system's performance with
data that are never used?

Criteria, as discussed in the questions above, can be date parameters but can also include other factors. Extracting exactly the data you want based on these factors will depend on the abilities of your current system and database, as well as on the ability to write the detailed extraction script. Keeping it simple when possible is the best approach; however, there are circumstances where filtering the data makes sense.
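
As an illustration, here is a sketch in Python of filtering on a date parameter combined with case status, one of the criteria combinations raised in the questions above; the record structure, field names, and 12-month cutoff are all hypothetical:

    # Keep open cases regardless of age; keep closed cases only if recent.
    from datetime import date, timedelta

    CUTOFF = date.today() - timedelta(days=365)  # hypothetical 12-month window

    def keep(record):
        if record["status"] == "open":
            return True
        return record["created"] >= CUTOFF

    records = [
        {"status": "open",   "created": date(2015, 1, 5)},
        {"status": "closed", "created": date.today() - timedelta(days=30)},
        {"status": "closed", "created": date(2010, 6, 1)},
    ]
    migrate = [r for r in records if keep(r)]
    print(len(migrate))  # 2: the open case and the recent closed case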

Once you have determined which data you want to migrate, you must also determine which parts of each data record to bring over.


How do we migrate data?

Once the decision is made to perform data migration, and before migration can begin, the following analyses must be performed:

• Analyze and define the source structure (the structure of the data in the legacy system)
• Analyze and define the target structure (the structure of the data in the new system)
• Perform field mapping (mapping between the source and target structures, with data cleansing if necessary; a sketch follows this list)
• Define the migration process (automated vs. manual)
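
One compact way to capture a field mapping in executable form is sketched below in Python; the source and target field names, the code translations, and the transforms are all hypothetical:

    # Map each hypothetical target field to its source field and an
    # optional cleansing transform.
    FIELD_MAP = {
        "last_name":  ("LNAME",  str.title),
        "birth_date": ("DOB",    None),  # cleansed in a separate step
        "sex":        ("SEX_CD", lambda v: {"1": "M", "2": "F"}.get(v, "U")),
    }

    def map_record(source_row):
        target = {}
        for tgt, (src, transform) in FIELD_MAP.items():
            value = source_row.get(src)
            target[tgt] = transform(value) if transform and value is not None else value
        return target

    print(map_record({"LNAME": "SMITH", "DOB": "1970-01-01", "SEX_CD": "2"}))
    # {'last_name': 'Smith', 'birth_date': '1970-01-01', 'sex': 'F'}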

To analyze and define the source and target structures, you must study the existing system as well as the new system to understand how each works, who uses it, and what they use it for. A good starting point for gathering this information is the existing documentation for each system. This documentation could take the form of the original specifications for the application, as well as the systems design and documentation produced once the application was completed. With legacy applications this information is often missing or incomplete, because considerable time may have passed since the application was first developed.

You may also find crucial information in other forms of documentation, including guides,
manuals, tutorials, and training materials that end-users may have used. Most often this
type of material will provide background information on the functionality exposed to end-
users but may not provide details of how the underlying processes work.

For this part of the analysis, you may actually need to review and document the existing code. This may be difficult, depending on the legacy platform. For example, if data are being migrated from an AS/400 application written in RPG, and those skills are not available in-house, assistance from an experienced RPG programmer will be required. This can be an expensive part of the analysis, because you may need to bring in an outside resource to perform it, but it is a vital part of the process, especially if the application has been running and maintained over a long period of time. The code may contain fixes that are critical to the application and that are documented nowhere else.

Another key area to examine is how the data in the system are stored (e.g., in flat files or database tables). What fields are included in those files or tables, and what indexes are in use? You must also perform a detailed analysis of any running server processes that may be related to the data (e.g., a nightly process that runs across a file and updates it from another system).

Now that the source and target structures are defined, the mapping from the legacy to the
target should fall into place fairly easily. Mapping should include documentation that
specifically identifies fields from the legacy system mapped to fields in the target system
and any necessary conversion or cleansing.

Once the analysis and mapping steps are completed, the process of importing the data into the new system must be defined. This process may be completely automated, completely manual, or a combination of the two. For example, a process may:

• Create data extractions from the legacy system
• Cleanse the data extractions according to the mapping guidelines
• Import the data extractions into the new system using that system's import feature
• Verify and test samplings of the data in the new system by comparing data reports to the old system
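
Tying those steps together, here is a minimal end-to-end sketch in Python, assuming both the legacy extraction and the new system's import feature work with CSV files; the file names, field names, and cleansing rule are hypothetical:

    # Extract -> cleanse -> import -> verify, using CSV as the interchange format.
    import csv

    def cleanse(row):
        row["last_name"] = row["last_name"].title()  # hypothetical mapping rule
        return row

    # Extract: read the legacy export.
    with open("legacy_extract.csv", newline="") as src:
        rows = [cleanse(r) for r in csv.DictReader(src)]

    # Import: write a file shaped for the new system's import feature.
    with open("new_system_import.csv", "w", newline="") as dst:
        writer = csv.DictWriter(dst, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(rows)

    # Verify: spot-check row counts and sample records against a legacy report.
    print(f"Migrated {len(rows)} rows; first row: {rows[0]}")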

Bottom Line. Data migration is a key element to consider when adopting any new system, whether through purchase or new development. Data migration is not a simple task; if it is not given due consideration early in the process of developing or purchasing a new system, it can become a very expensive one.

Migrating PM Data from the TIMS Legacy System. In TIMS, information is linked by the patient (known as the client) to PM and Surveillance data; TIMS is a client-centered (or patient-centered) application.

[Diagram: TIMS PM data associated with a patient]

The following types of medical data can be entered and maintained for a patient:

• Diagnoses - TB information about a patient's condition, disease, date, site of disease, etc., used to classify the client's TB status
• Previous Medical History - Information about previous medical conditions
• Test Data - Various tests performed for a patient, including skin tests, chest x-rays, blood tests, HIV tests, bacteriology testing, and physical examinations
• Medications - Drugs prescribed for a patient, including start time, stop time, stop reason, dosages, etc.
• Contacts - Information about people with whom the client has been associated and who may have been exposed to TB through that association
• Hospitalizations - Information about previous or current hospital care the patient has received
• Referrals - Information about the person or organization that sent the patient to be examined

Basic demographic information is stored in the following tables in TIMS:

• Client - Patient's name, address, date of birth, sex, race, ethnicity, etc.
• Address - Multiple address locations for a client
• Alert - A note on the client that may be worthwhile or even necessary for clinic staff to know, such as hard to treat, drug resistant, etc.
• Alias - Multiple names used by a client
• Phone - Multiple telephone numbers used/owned by a client
• Disposition - Status of a client, such as open or closed, and if closed, the reason for closure (Moved, Lost, Unknown, etc.)
• Appointments - Records of appointments established for the client

TIMS provides two options for extracting data: an export facility and an ad-hoc query
system built into the system module of the application.

The export facility allows you to export entire database tables contained in the TIMS database into the following formats:

• CSV
• DBase 2
• DBase 3
• Excel
• HTML
• PowerSoft Report
• Text
• Lotus 123
• SQL Syntax

The ad-hoc query system allows you to extract TIMS information by selecting specific tables and data items within those tables, as well as by imposing criteria filters on the information that is extracted. The ad-hoc query system allows export to the following formats:

• CSV
• CSV with Headers
• DBase 2
• DBase 3
• DIF
• Excel
• Excel with Headers
• HTML Table
• PowerSoft Report
• SQL
• SYLK
• SYLK with Headers
• Text
• Text with Headers
• WKS
• WKS with Headers
• WK1
• WK1 with Headers
• Windows Metafile

Comma-separated values (CSV) is a text format in which commas separate field values. It is a commonly used file format recognized by the greatest number of software products, and is therefore the recommended choice.
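
As a minimal sketch of working with such an export in Python, assuming the "CSV with Headers" option was used so that the first row carries field names ("client.csv" is a hypothetical file name; actual column names come from the TIMS data dictionary):

    # Read a hypothetical TIMS CSV export and show the first record.
    import csv

    with open("client.csv", newline="") as f:
        reader = csv.DictReader(f)  # first row supplies the field names
        for row in reader:
            print(row)  # each row is a dict mapping field name -> value
            break       # show just the first record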

Patient information to be imported into another PM system may contain both personal patient information and medical data. Data tables to review when selecting data to migrate include:

• Address
• Alias
• Appointment
• Bacttest
• Bloodtest
• Client
• Contact
• Diagnosis
• Disposition
• Drugsusceptibility
• Exposuresite
• History
• Hivtest
• Hospital
• Interview
• Phone
• Physical
• Prescription
• Race
• Referral
• Refsource
• Request
• Skintest
• Supervisor
• Susceptibility
• TherapyEvent
• Worker
• Workerassignment
• Workerlanguage
• XRay

A dictionary of data tables is built into the TIMS application. It is a useful resource for learning more about the data tables: it describes each table and displays its data fields, field types, field sizes, etc. Information from the data dictionary includes:

• Table name
• TIMS field name
• Common field name
• Field type
• Field size
• Description
• Usage

The data dictionary can be printed from TIMS one table at a time or as the entire listing of files and their descriptions.

These tools will help you identify the tables and the types of information stored within them. Ultimately, your site knows what types of data you are storing for PM information. Visit the TIMS website for current information on TIMS data migration, or contact the TIMS Helpdesk at (770) 234-6500 or email TIMS Help for more information.
