Clinical Codes - Technical Document


Clinical Codeset Mapping: ACHI to CPT and ASDSG to CDT

Scope
The objective of the project is to match the Australian Classification of Health Interventions
(ACHI) code descriptions to the nearest Current Procedural Terminology (CPT) code
descriptions, thereby obtaining a code-to-code matching table that can be used to convert
pre-existing data. A similar mapping table needs to be created by matching the Australian
Schedule of Dental Services and Glossary (ASDSG) code descriptions to the Current Dental
Terminology (CDT) code descriptions. One-to-many mappings need to be preserved and a
matching score provided.

Background
ACHI codes
ACHI codes, which stand for the Australian Classification of Health Interventions, are a
system used to classify procedures and interventions performed in Australian healthcare
settings.

Structure

● 7-character numeric codes: Each code represents a specific intervention.
● Based on MBS & ASDSG schedules: ACHI codes draw heavily from the Medicare
Benefits Schedule (MBS) and the Australian Schedule of Dental Services and
Glossary (ASDSG).
● Organized thematically: Codes are grouped into chapters based on anatomical site
and procedure type.

Limitations

● Specificity: Some procedures may require additional codes or modifiers for precise
description.
● Updates: Regular updates may require adaptations in coding practices.
● International comparability: Not directly compatible with international coding
systems like CPT, necessitating mapping for cross-border comparisons.

CPT Codes
CPT Codes (Current Procedural Terminology) are numeric codes used to report medical,
surgical, and other healthcare procedures performed in the United States.

Structure
● 5-digit numeric codes: Each code represents a specific procedure.
● Hierarchical categorization: Codes are grouped into sections (e.g., Surgery,
Medicine), subsections (e.g., Cardiovascular Surgery), and categories (e.g., Thoracic
Surgery).
● Modifiers: Additional codes can be added to specify laterality, approach, complexity,
etc.

Limitations

● Complexity: The coding system can be complex, requiring trained professionals for
accurate application.
● Granularity: Certain procedures might require multiple codes with modifiers for
precise representation.
● Non-intuitive structure: The hierarchical organization may not be intuitive for all
users.
● International compatibility: CPT codes are specific to the US and differ from
other coding systems such as ICD-10-PCS or ACHI, requiring mapping for
cross-system comparisons.

Additionally, CPT codes are maintained by the American Medical Association (AMA). They
are copyrighted, and usage requires licenses and adherence to coding guidelines. Using
incorrect CPT codes can have significant consequences, including billing errors, legal issues,
and compromised patient care.

Mapping

Unfortunately, directly mapping ACHI codes to CPT codes based solely on descriptions can
lead to inaccurate translations. This is because:
● Descriptions can be subjective and vary: Different coders interpret descriptions
differently, leading to inconsistencies in mappings.
● One ACHI code can map to multiple CPT codes: Depending on the specific
procedure details, a single ACHI code might translate to several CPT codes with
varying levels of complexity.
● CPT codes have specific coding rules: Accurately assigning CPT codes requires
considering factors like laterality, approach, and specific devices used, which
descriptions often lack.

Research is exploring machine learning models, trained on large datasets of coded
procedures, that learn the relationships between code descriptions and the codes
themselves. This might offer future possibilities for automated mapping, but such
models are still under development and not yet widely available or recommended for
clinical use.

Existing approaches

● Supervised learning models: Researchers have explored training models on large
datasets of coded procedures, where both ACHI and CPT codes are present. The
model learns the relationships between descriptions and codes, aiming to predict
CPT codes for new ACHI descriptions.
● Natural Language Processing (NLP): Techniques that help machines understand
text. NLP can be used to analyze procedure descriptions and extract relevant
features to improve mapping accuracy.

Challenges and limitations

● Data availability: Training accurate ML models requires massive datasets of coded
procedures, which might be limited due to privacy concerns and data ownership
restrictions.
● Accuracy and reliability: Current models still struggle with achieving human-level
accuracy, especially for complex or ambiguous procedures.
● Explainability and transparency: Understanding how an ML model arrives at its
predictions can be challenging, raising concerns about potential biases and
inaccuracies.

Data
The input datasets for this project include the ACHI, CPT, ASDSG, and CDT codesets,
each with the codes and their descriptions. The algorithm will be developed and
tested on provisional datasets, while the final results will be delivered on the
complete, current datasets.

Methodology
In Machine Learning approaches for mapping clinical codes, several NLP methodologies
play a key role. Here are some of the most prominent:

Text Representation

● Word Embeddings: Techniques like Word2Vec and GloVe convert words into
numerical vectors, capturing their semantic meaning and relationships. This helps
the model understand the nuances of language even with synonyms or
paraphrases.
● Sentence Embeddings: Techniques like BERT and Universal Sentence Encoder
encode entire sentences into vectors, considering context and sentiment along with
individual words. This enables the model to grasp the overall meaning of the
procedure description.
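
As a minimal illustration of word embeddings, the sketch below trains Word2Vec
vectors with the gensim library on a toy corpus. The corpus, parameters, and
queries are illustrative assumptions only; real use would require a large,
domain-specific corpus.

```python
# Minimal Word2Vec sketch using gensim (assumes `pip install gensim`).
# The toy corpus below is illustrative; real use needs a large domain corpus.
from gensim.models import Word2Vec

corpus = [
    ["excision", "of", "lesion", "of", "skin"],
    ["removal", "of", "skin", "lesion"],
    ["repair", "of", "wound", "of", "skin"],
]

# Train small vectors on the toy corpus.
model = Word2Vec(sentences=corpus, vector_size=50, window=3, min_count=1, epochs=50)

# Words used in similar contexts end up with similar vectors.
print(model.wv.similarity("excision", "removal"))
print(model.wv["lesion"][:5])  # first 5 dimensions of one word vector
```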

Feature Extraction

● Named Entity Recognition (NER): Identifies and classifies specific entities in the text,
such as anatomical locations, medical devices, and procedures themselves. This
helps pinpoint relevant information for code mapping.
● Part-of-Speech Tagging (POS): Labels each word with its grammatical function (verb,
noun, etc.). This helps understand the structure and relationships within the
sentence, aiding in accurate interpretation.
● Dependency Parsing: Creates a tree-like structure representing the grammatical
relationships between words. This captures how different parts of the sentence
contribute to the overall meaning, crucial for disentangling complex descriptions.
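
As a hedged illustration of these feature-extraction steps, the sketch below uses
spaCy's general-purpose English model; clinical text would normally call for a
biomedical model such as scispaCy, and the sample description is an assumption.

```python
# Feature-extraction sketch with spaCy (assumes `pip install spacy` and
# `python -m spacy download en_core_web_sm`). A biomedical model would be
# preferable for clinical text; this general model is illustrative only.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Laparoscopic removal of the gallbladder with cholangiography")

# Part-of-speech tags and dependency relations for each token.
for token in doc:
    print(token.text, token.pos_, token.dep_, token.head.text)

# Named entities recognized by the model (clinical entities need a domain model).
for ent in doc.ents:
    print(ent.text, ent.label_)
```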

Explainability and Interpretability

Methods like Attention Mechanisms and LIME help understand which parts of the text the
model relies on for its predictions. This is crucial for identifying potential biases and
improving the overall trust and transparency of the system.

It's important to note that no single NLP technique is a silver bullet, and often combinations
of these methods are used for best results. The effectiveness of these techniques depends
heavily on the quality and size of the training data. Ongoing research and development are
continuously improving the accuracy and explainability of these NLP methods for medical
coding tasks. While these NLP methodologies show promise for future automated
mapping, ethical considerations and accuracy limitations demand careful evaluation before
widespread adoption in clinical settings.

Algorithms
Natural Language Processing Stage

Tokenization breaks down text into its basic units, like words, punctuation, or numbers,
and identifies individual words and phrases within procedure descriptions for further
analysis.

Part-of-Speech (POS) Tagging assigns grammatical labels (noun, verb, adjective, etc.) to
each token and helps understand the structure and relationships within sentences,
identifying relevant terms (e.g., verbs indicating procedures) and differentiating them from
less important ones (e.g., articles, prepositions).

Lemmatization reduces words to their base form (lemma), considering the grammatical
context (e.g., "running" becomes "run"). This stage groups words with different verb
conjugations or noun plurals under their base form, improving consistency and reducing
vocabulary size for better code matching.

Named Entity Recognition (NER) identifies and classifies named entities like people,
locations, organizations, and in this case, potentially medical codes or anatomical terms.
This stage directly recognizes ACHI codes within descriptions if present, saving processing
time and effort. Additionally, identifying anatomical locations or medical devices mentioned
can aid in selecting appropriate CPT codes.
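
A minimal NLTK sketch of the first three stages follows (the spaCy sketch earlier
illustrates NER); the sample description and resource downloads are assumptions.

```python
# Tokenization, POS tagging, and lemmatization sketch with NLTK (assumes
# `pip install nltk` plus nltk.download("punkt"),
# nltk.download("averaged_perceptron_tagger"), and nltk.download("wordnet")).
import nltk
from nltk.stem import WordNetLemmatizer

description = "Removing polyps from the colon using an endoscope"

tokens = nltk.word_tokenize(description)  # tokenization
tagged = nltk.pos_tag(tokens)             # POS tagging
lemmatizer = WordNetLemmatizer()

# Lemmatize verbs as verbs and everything else as nouns
# ("Removing" -> "remove", "polyps" -> "polyp").
lemmas = [
    lemmatizer.lemmatize(tok.lower(), pos="v" if tag.startswith("VB") else "n")
    for tok, tag in tagged
]
print(lemmas)
```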

While these NLP stages can be helpful tools, relying solely on them for ACHI-CPT mapping is
not sufficient due to the complexity and nuances of medical coding. The accuracy and
effectiveness of these techniques depend heavily on the quality and domain-specificity of
the training data used.

Transformer Stage

Transformers are a powerful AI architecture using attention mechanisms to understand
relationships between words in a sentence. They process entire sentences simultaneously,
capturing context better than traditional sequential models.

BERT (Bidirectional Encoder Representations from Transformers) is a Transformer model
pre-trained on a massive general text corpus, giving it general language understanding.
It can be fine-tuned for specific tasks like question answering or sentiment analysis,
but it is not directly applicable to code mapping out of the box.

BioBERT is a BERT variant pre-trained on a biomedical text corpus, including PubMed
abstracts and PMC full-text articles. It understands medical concepts and terminology
better than general BERT, making it more suitable for tasks like named entity
recognition (NER) in medical texts.

MIMIC-III is a large, publicly available dataset of electronic health records (EHRs)
from intensive care units (ICUs). It is used to train and evaluate clinical language
models such as Clinical-BioBERT.

Clinical-BioBERT is a BioBERT model fine-tuned on clinical text, specifically discharge
summaries from MIMIC-III. It can potentially improve tasks like NER or text
classification on clinical texts compared to BioBERT.
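
As a sketch, the snippet below loads a publicly available Clinical-BioBERT checkpoint
from Hugging Face and mean-pools its hidden states into a sentence embedding. The
model ID is one commonly used release and may differ from the checkpoint ultimately
used in this project.

```python
# Embedding sketch with a Clinical-BioBERT checkpoint (assumes
# `pip install torch transformers`; the model ID is an assumption).
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "emilyalsentzer/Bio_ClinicalBERT"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)

def embed(text: str) -> torch.Tensor:
    """Mean-pool the final hidden states into one sentence vector."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # shape (1, tokens, 768)
    return hidden.mean(dim=1).squeeze(0)            # shape (768,)

vec = embed("Excision of lesion of skin of face")
print(vec.shape)  # torch.Size([768])
```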

Embeddings and Similarity

In simple terms, vector embeddings are numerical representations of words or sentences
that capture their meaning and relationships within a specific context. Imagine a map
where words are like locations, and their distances represent how similar they are in
meaning.

Large amounts of text data (like books, articles, or code) are fed into a
machine-learning model. Each word is assigned a unique vector, initially random. As the
model processes the text, it analyzes how words are used together and adjusts their
vectors to reflect their semantic relationships. For sentences, the individual word
vectors are combined using various techniques (e.g., averaging, attention mechanisms)
to create a single vector representing the entire sentence's meaning.

These vectors hold valuable information: (a) Similarity: words with closer vectors are
considered more semantically similar (e.g., "king" and "queen" would have closer
vectors than "king" and "car"); (b) Context: the meaning of a word can shift depending
on its context, and vectors can capture these nuances by adjusting based on
surrounding words.

This allows us to gauge conceptual similarity in several ways: (a) Distance: the closer
the vectors, the more similar the concepts; (b) Analogy: we can find words with similar
relationships to a given word pair; (c) Classification: models can be trained to
categorize words or sentences based on their vector representations (e.g., sentiment
analysis, topic modeling).
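
A small numpy sketch of the distance idea, using toy three-dimensional vectors rather
than real embeddings:

```python
# Cosine similarity between two embedding vectors (numpy only).
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # cos(theta) = (a . b) / (|a| * |b|); 1.0 means identical direction.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 3-dimensional vectors for illustration only.
king = np.array([0.9, 0.8, 0.1])
queen = np.array([0.85, 0.75, 0.2])
car = np.array([0.1, 0.2, 0.9])

print(cosine_similarity(king, queen))  # high: related concepts
print(cosine_similarity(king, car))    # low: unrelated concepts
```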

In the context of medical coding, evaluating embeddings and computing their semantic
similarity allows us to identify similar procedures by matching their descriptions even
if they use different wording. Similar methods can be used to identify the appropriate
code in the context of additional information, such as clinical notes and discharge
summaries.

Fuzzy Matching

In the context of information retrieval and data analysis, fuzzy matching comes into play
when you need to find matches between items even when they aren't identical. This is
particularly useful when dealing with:
● Typos and misspellings: "patient" vs. "pasiant"
● Abbreviations and variations: "CT scan" vs. "CAT scan"
● Synonyms and paraphrases: "remove appendix" vs. "appendectomy"
● Unsorted or incomplete data: "knee replacement surgery" vs. "replacement of
knee"
Traditional string-matching methods would miss these nuances, but fuzzy matching
techniques offer more flexibility.

Partial Matching focuses on identifying common subsequences within strings, even if
they are not in the same order or have additional characters. For example, Levenshtein
distance measures the number of edits (insertions, deletions, substitutions) needed to
transform one string into another, and Jaccard similarity compares the intersection of
unique characters or tokens between two strings.

Unsorted Matching considers the overall meaning or semantic similarity of strings,
regardless of their exact word order or structure. For example, TF-IDF weighting
assigns higher weights to important words, helping identify documents with similar
topics even if the wording differs.
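
A minimal sketch of these techniques using the rapidfuzz library (one option among
several edit-distance libraries):

```python
# Fuzzy-matching sketch using rapidfuzz (assumes `pip install rapidfuzz`).
from rapidfuzz import fuzz
from rapidfuzz.distance import Levenshtein

# Edit distance: number of insertions, deletions, and substitutions.
print(Levenshtein.distance("patient", "pasiant"))         # 2

# Partial matching: scores the best-matching substring.
print(fuzz.partial_ratio("CT scan", "CAT scan of head"))

# Unordered (token-sort) matching: word order is ignored.
print(fuzz.token_sort_ratio("knee replacement surgery",
                            "surgery replacement of knee"))
```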

Overall, fuzzy matching is a valuable tool for various tasks, including medical coding, where
dealing with data inconsistencies and variations is common. Choosing the appropriate
technique and carefully considering its limitations are crucial for ensuring accurate and
reliable results.

Procedure
We employ a promising approach for matching medical codes with descriptions using a
combination of Natural Language Processing (NLP), Transformers, and fuzzy matching
techniques; a code sketch of the full pipeline follows the numbered steps below.

1. Input:
a. Code descriptions: Textual descriptions of medical codes (e.g., ACHI codes).
b. Target codes: The reference set of medical codes for matching (e.g., CPT
codes).
2. Clinical-BioBERT Embeddings:
a. Pass each code description and each target code description through a
pre-trained Clinical-BioBERT model, generating a vector embedding that
captures its semantic meaning.
b. These embeddings represent the codes in a high-dimensional space where
similar codes will have closer vectors.
3. Cosine Similarity:
a. Calculate the cosine similarity between the embedding of a code description
and each target code embedding.
b. This score reflects the cosine of the angle between the vectors, with
higher values indicating greater semantic similarity.
4. Partial Unordered Fuzzy Matching:
a. Employ a fuzzy matching technique like Levenshtein distance or Jaccard
similarity to compare the textual descriptions of the source and target codes.
b. This accounts for potential variations in wording, typos, or abbreviations.
5. Composite Score:
a. Combine the cosine similarity score and the fuzzy matching score using a
weighted average or other suitable method.
b. This composite score incorporates both semantic and textual similarity,
potentially leading to more robust matching.
6. Matching and Ranking:
a. Rank the target codes for each code description based on their composite
scores.
b. The top-ranked codes represent the most likely matches for the given
description.
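
A compact sketch of steps 2-6 under stated assumptions: embed is the Clinical-BioBERT
pooling function sketched earlier, rapidfuzz supplies the fuzzy score, and the 0.7/0.3
weighting is a placeholder to be tuned.

```python
# Composite-score matching sketch. Assumes `embed` (Clinical-BioBERT mean
# pooling, sketched earlier) and `pip install numpy rapidfuzz`.
import numpy as np
from rapidfuzz import fuzz

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_matches(achi_desc, cpt_codes, embed, weight=0.7, top_k=5):
    """Rank target (code, description) pairs for one ACHI description.

    In practice the target embeddings would be precomputed once, not
    recomputed per query as in this sketch.
    """
    src_vec = embed(achi_desc).numpy()
    scored = []
    for code, desc in cpt_codes:
        semantic = cosine(src_vec, embed(desc).numpy())            # in [-1, 1]
        textual = fuzz.token_sort_ratio(achi_desc, desc) / 100.0   # in [0, 1]
        composite = weight * semantic + (1 - weight) * textual
        scored.append((code, composite))
    return sorted(scored, key=lambda x: x[1], reverse=True)[:top_k]
```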

Overall, this approach leverages the strengths of both NLP and fuzzy matching to achieve
more accurate and flexible medical code matching.

Technical Considerations
Hyperparameter tuning

Fine-tune the weights in the composite score and any parameters within the fuzzy
matching technique for optimal performance on your specific data.
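
One simple approach is a grid search over the composite weight on a labeled validation
set. In the sketch below, rank_matches is the ranking function sketched earlier and
validation_pairs is a hypothetical list of (ACHI description, correct CPT code) tuples.

```python
# Grid-search sketch for the composite-score weight (names hypothetical:
# `rank_matches` is sketched earlier; `validation_pairs` is labeled data).
import numpy as np

def tune_weight(validation_pairs, cpt_codes, embed):
    def top1_accuracy(w):
        hits = sum(
            rank_matches(desc, cpt_codes, embed, weight=w)[0][0] == gold
            for desc, gold in validation_pairs
        )
        return hits / len(validation_pairs)
    # Try weights 0.0, 0.1, ..., 1.0 and keep the best-performing one.
    return max(np.arange(0.0, 1.01, 0.1), key=top1_accuracy)
```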

Evaluation metrics

Use appropriate metrics like precision, recall, and F1-score to evaluate the effectiveness of
your matching system.
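
A small scikit-learn sketch, comparing each description's top-ranked predicted code
against its reference code; the codes shown are toy labels for illustration only.

```python
# Evaluation sketch with scikit-learn (assumes `pip install scikit-learn`).
from sklearn.metrics import precision_recall_fscore_support

gold = ["27447", "47562", "43239", "27447"]       # reference codes (toy data)
predicted = ["27447", "47563", "43239", "27446"]  # top-1 predictions (toy data)

precision, recall, f1, _ = precision_recall_fscore_support(
    gold, predicted, average="micro", zero_division=0
)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```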

Interpretability

Consider techniques to explain the rationale behind the matching decisions, especially for
critical applications.
