Coding Clinical Information: Analysis of Clinicians Using Computerized Coding
Introduction
Unstructured free text is still the most frequently used form of documentation in medicine. From an information-processing point of view, however, its flexibility and freedom of expression are offset by major difficulties in processing free text with computers. Therefore, numerous ever-more complex coding and classification systems have been developed [1], some using multi-axial approaches and vocabularies of more than 100,000 codable terms [2, 3]. Natural-language processing systems, albeit rapidly improving, are still difficult to use in routine clinical settings, particularly in languages other than English, where a more complex syntax and delayed availability of standard coding systems may complicate matters even more. Many clinical departments do not use numeric codes for clinical documentation as a primary means of entering information into computer systems. Thus, for the time being, the extraction of codable information from free text by medical staff remains the only viable approach for many medical institutions, although data quality remains a concern [4-7].
In our hospital, coding of free-text data by physicians is a critical element of documenting patient-care-related processes. A computerized patient record system (PADS, Patient Archiving and Documentation System) was developed at the University of Munich in 1989 [8]. Amongst the system's features is database management of coded diagnoses using a database browsing and encoding tool for codes. This article describes a controlled experiment with the PADS ICD-encoding tool, comparing it to traditional coding techniques.
text was not structured or formatted in any way; the terms in question were not highlighted. Clinicians were tested twice (with one exception). Thus, a total of 560 medical concepts were analysed. Before each test they were given a sheet with both exercise cases. They spent the whole test time in a room with only the test supervisor present, who encouraged the physicians to work as fast as possible. They had to read the text, identify codable medical concepts, use the correct search term and then code the term identified. Timing was started after the text had been read once to familiarise physicians with the scenario. During one session they had to code the identified terms using a paperback-type coding manual, flipping through pages (manual coding, mc). During the other session a computerized coding tool was used (computer-assisted coding, cc), which allowed the entry of terms not included in the ICD-9 code by linking the user vocabulary to the target terms through a thesaurus. The interval between mc and cc was at least one month. The sequence - manual or computerized session first
Meth Inform Med 1996; 35: 104-7
Fig. 1 Overview of the coding quality of the main ICD diagnosis code in both protocols (computerized, n = 288, vs. manual [control], n = 272), demonstrating a positive effect of the computerized approach. Analyzed were correct codes (increase by 69.8%), close codes (no difference; for definition, see text), incorrect codes (reduction by 38.2%) and missing codes (reduction by 61.5%).
Fig. 2 Overview of coding quality of the (optional) V-modifier code in both protocols (computerized vs. manual [control]), demonstrating a positive effect of the computerized approach. Analyzed were correct codes (increase by 46.6%), incorrect (no change) and missing codes (reduction by 33.7%).
- was chosen at random. Owing to the better documentation and transparency of computerized coding, some parameters were available for the cc-group only. For example, a stopwatch integrated into the electronic patient record recorded each entry to and exit from the main browsing and coding dialogue in a database. Repeated frustrating attempts to code one diagnosis by browsing the database of codes with the computerized search tool were summed, resulting in a total coding time for that particular diagnosis and giving a realistic estimate
Table 1 Data showing the advantage of the computerized approach over the traditional manual one. They also indicate that the time to process the entire narrative case document is 59% longer than expected from the sum of coding times for individual medical concepts.
of the real time spent in coding. In the mc-group, however, the line between the end of coding one diagnosis and the start of coding the next could not be drawn precisely without interfering with the coding process. Therefore, data on coding time for individual cases are not available for the mc-group. For both groups (cc vs. mc), coded terms were classified as correct, incorrect, close or missing. As "close" we defined those ICD codes assigned by the user which were identical with the target diagnoses in at least three ICD digits.
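The four outcome categories can be illustrated with a minimal sketch. It assumes that "identical in at least three ICD digits" means agreement on the first three digits of the code (the three-digit category); the article does not spell out which digits are compared, and the function names `classify` and `_digits` are hypothetical.

```python
from typing import Optional


def _digits(code: str) -> str:
    """Keep only the digits of an ICD code, e.g. '241.1' -> '2411'."""
    return "".join(ch for ch in code if ch.isdigit())


def classify(assigned: Optional[str], target: str) -> str:
    """Classify an assigned ICD code against the target code.

    Assumption (not stated in the article): "close" means that the
    first three digits agree while the full codes differ.
    """
    if assigned is None or not assigned.strip():
        return "missing"
    a, t = _digits(assigned), _digits(target)
    if a == t:
        return "correct"
    if len(a) >= 3 and a[:3] == t[:3]:
        return "close"
    return "incorrect"


if __name__ == "__main__":
    for assigned in ("241.1", "241.0", "242.1", None):
        print(assigned, "->", classify(assigned, "241.1"))
```

Under this reading, a code that lands in the right three-digit category but has the wrong fourth digit counts as close rather than incorrect.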
Coding Database
In our hospital a German clinical modification and subset of the WHO's ICD-9 is used [9] and is available as a paperback-type manual. This coding database was used for both experimental groups. One feature is the use of modifier codes in addition to the main code (usually a three-to-five-digit numerical code). Three modifier codes were used. The Y-modifier code modifies the main code with terms such as "status after operation for ...". It is a one-digit numerical code placed in front of the main code and separated from it with a hyphen. It is optional for both groups (cc and mc). In certain diagnoses, f- or z-modifier codes exist, coding for functional or morphological features of a selected syndrome and thus enabling a fairly precise assessment of disease status. These additional codes (mandatory for the cc-group, optional for the mc-group) can be regarded as a multi-axial coding approach to certain disease entities. For example, in thyroid disease the z-modifier code describes morphological status such as multinodular goitre; the f-modifier code describes functional status such as hyperthyroidism. Both the f- and the z-code are separated by hyphens and appended to the last digit of the main code. The resulting final code is:
s. p. operation for      4      Y-modifier code
goitre                   241.1  ICD main code
multinodular             3      z-modifier code
with hyperthyroidism     3      f-modifier code

Table 1 (values given as mean ± SE, %, and n)
Type of code    Parameter                         Computerized                 Manual (control)             Significance
                                                  mean ± SE      %      n      mean ± SE      %      n
Main ICD code   Correct code                      8.83 ± 0.26    55.21  159    5.2 ± 0.44     32.47  88     p ≤ 0.001
                Close code                        2.83 ± 0.47    17.71  53     3.01 ± 0.32    18.78  51     n.s.
                Incorrect code                    3.16 ± 0.26    19.79  57     5.11 ± 0.63    31.91  87     p ≤ 0.001
                Missing code                      1.0 ± 0.52     6.25   19     2.7 ± 0.21     16.85  46     p ≤ 0.01
Y-modifier code Correct code                      -              -      58     -              -      30     -
                Incorrect code                    0.33 ± 0.21    6.3    10     0.33 ± 0.22    5.6    8      n.s.
                Missing code                      2.83 ± 0.54    53.1   76     4.27 ± 0.41    72.5   98     p ≤ 0.01
f+z-modifier    Correct code                      2.67 ± 0.38    92.1   264    1.33 ± 0.47    41.2   111    p ≤ 0.001
code            Incorrect code                    0.26 ± 0.28    8.2    24     0.17 ± 0.35    5.3    14     n.s.
                Missing code                      0 ± 0          0      0      1.77 ± 0.35    54.5   147    p ≤ 0.001
Time to code    Time to code/diagnosis (sec)      54.58 ± 12.4                 -
                Total time to code (min)          14.46 ± 2.47                 -
                Total time to process text (min)  23 ± 3.61                    -                            p ≤ 0.01
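The modifier scheme of the Coding Database section (Y-modifier in front of the main code, z- and f-modifiers appended, all separated by hyphens) can be sketched as a small helper. This is an illustration under assumptions: the article does not print the assembled string, so the serialized form `4-241.1-3-3`, the z-before-f order (taken from the order of the listing), the handling of omitted modifiers, and the function name `compose_code` are all hypothetical.

```python
from typing import Optional


def compose_code(main: str,
                 y: Optional[str] = None,
                 z: Optional[str] = None,
                 f: Optional[str] = None) -> str:
    """Join an ICD main code with its optional modifier codes.

    Assumed layout (not confirmed by the article): Y-main-z-f,
    with a hyphen between every part.
    """
    parts = []
    if y is not None:
        parts.append(y)        # Y-modifier goes in front of the main code
    parts.append(main)
    for modifier in (z, f):    # z (morphology) then f (function) are appended
        if modifier is not None:
            parts.append(modifier)
    return "-".join(parts)


if __name__ == "__main__":
    # Status post operation for multinodular goitre with hyperthyroidism
    print(compose_code("241.1", y="4", z="3", f="3"))
```

Because omitted modifiers are simply skipped, a code with only one appended modifier would not show whether it is the z- or the f-modifier; a real encoder would need a convention for that case.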
Fig. 3 Overview of coding quality of the combined f- and z-modifier code (mandatory for the computerized group [cc-group]) in both protocols (computerized, n = 288, vs. manual [control], n = 272), demonstrating a positive effect of the computerized approach. Analyzed were correct codes (increase by 50.9%; p ≤ 0.001), incorrect codes (no change; n.s.) and missing codes (reduction by 54.5%; p ≤ 0.001).
Results
Quality of Coding, Main Code
When analysing the quality of coded terms in the cc-group as opposed to the mc-group, 55.21% (8.83 ± 0.26) vs. 32.47% (5.2 ± 0.44, p ≤ 0.001) of the codes were encoded correctly (a 69.8% increase), 17.71% (2.83 ± 0.47) vs. 18.78% (3.01 ± 0.32, n.s.) were close (for the definition of "close" see the section on Subjects and Methods) and 19.79% (3.16 ± 0.26) vs. 31.91% (5.11 ± 0.63, p ≤ 0.001) were coded incorrectly (a 38.2% reduction). In 6.25% (1.0 ± 0.52) vs. 16.85% (2.7 ± 0.21, n.s.) the code was missing, a 61.5% reduction (Fig. 1). Acceptable coding (correct plus close) was 73% vs. 51% in the cc-group and the mc-group, respectively. The rate of higher quality codes (i.e., correct, not only close), however, was 69.8% higher in the cc-group (Table 1).
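As a worked check of two of the figures quoted for the main code, the short sketch below recomputes the relative changes from the reported means and the "acceptable coding" rates from the reported percentages. It assumes the quoted increases and reductions are relative changes of the cc-group mean against the mc-group mean; the helper name `relative_change` is hypothetical.

```python
def relative_change(new: float, reference: float) -> float:
    """Percentage change of `new` relative to `reference`."""
    return (new - reference) / reference * 100.0


# Means reported for the main ICD code (cc-group vs. mc-group).
correct = relative_change(8.83, 5.2)      # correct codes
incorrect = relative_change(3.16, 5.11)   # incorrect codes

# Acceptable coding = correct + close, from the percentage figures.
acceptable_cc = 55.21 + 17.71
acceptable_mc = 32.47 + 18.78

if __name__ == "__main__":
    print(f"correct codes:   {correct:+.1f}%")
    print(f"incorrect codes: {incorrect:+.1f}%")
    print(f"acceptable: {acceptable_cc:.0f}% vs. {acceptable_mc:.0f}%")
```

With the means as the basis, the calculation gives the 69.8% increase in correct codes and the 38.2% reduction in incorrect codes, and the summed percentages round to 73% vs. 51%.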
For the optional Y-modifier code, the number of incorrectly coded terms is 6.3% (0.33 ± 0.21) vs. 5.6% (0.33 ± 0.22, n.s.), and the number of missing terms 53.1% (2.83 ± 0.54) vs. 72.5% (4.27 ± 0.41, p ≤ 0.01), respectively (Fig. 2). For the mandatory f- and z-codes the combined number of correctly coded terms is 92.1% (2.67 ± 0.38) vs. 41.2% (1.33 ± 0.47, p ≤ 0.001), the number of incorrectly coded terms 8.2% (0.26 ± 0.28) vs. 5.3% (0.17 ± 0.35, n.s.), and the number of missing terms 0% vs. 54.5% (1.77 ± 0.35, p ≤ 0.001) (see Fig. 3; for data overview see Table 1). When comparing only computer-processed modifier codes, a comparison between optional (Y-) and mandatory (f/z-) codes revealed an advantage for mandatory coding, with a 51.5% increase of correct codes, no change in the rate of incorrect codes (6.3% vs. 8.2%) and a 53.1% reduction in the rate of missing codes (Fig. 4).
Fig. 4 Chart demonstrating the advantage of mandatory as opposed to optional modifier coding. Both groups used the computerized approach. Correct codes are increased by 51.5%, incorrect codes are unchanged, and the number of missing codes is reduced by 53.1% to zero.
Fig. 5 Chart demonstrating advantage in speed when processing two narrative exercise cases with a computerized coding tool as opposed to the traditional approach (b vs. c). Furthermore, a 59% increase of mean time is noted when considering the overall case processing time as opposed to the simple sum of coding times for each medical concept (a vs. b).
Discussion
We have attempted to analyse a typical clinical coding scenario under standardized conditions, applying a controlled experimental setting. Clinicians had to perform one of their daily tasks, extracting codable medical concepts from narrative free text. The following observations seem noteworthy:
- Using the computerized coding tool (a generic browsing and encoding utility that is part of our electronic patient record system PADS), the time required to code distinct medical concepts or diagnoses could be reduced by about 50%.
- Coding quality was improved substantially, as indicated by higher rates of correctly encoded terms, lower rates of incorrect terms, and lower rates of missing terms.
- When applying mandatory as opposed to optional (modifier) codes, comparable to multi-axial coding schemes such as SNOMED, users responded favorably with lower rates of incomplete coding, higher rates of correct coding, and no higher rates of incorrect coding.
- Adding the coding time intervals for individual terms resulted in a significantly shorter time than the real total time required to process the free-text document, even though the latter excluded the first reading of the document. Summing coding times for individual diagnoses thus significantly understated the real coding time burden for clinicians. Apparently, significant extra time (close to 60%) for either mental concept change or interaction with the coding instrument was spent by clinicians even under the maximum time pressure present in the experiment. In the case of manual coding this interaction was "intelligent" searching and flipping through the pages of the coding manual. In the case of the computerized browsing/coding tool, identifying and entering the right search terms into a specific dialogue were relevant, a process we facilitated through an extensive thesaurus.
We conclude that with the help of a computerized coding tool time can be saved and completeness of coding can be increased. However, there is evidence to suggest that clinicians need substantially more time to extract codable information from free text than is suggested by the speed of coding individual diagnoses.
REFERENCES
1. International Classification of Diseases. Basic tabulation list with Alphabetical Index (9th Rev. Ed., 2 vols). Geneva: World Health Organisation 1978.
2. Rothwell DJ, Hause LL. SNOMED and microcomputers in anatomic pathology. Med Inf 1983; 8: 23-31.
3. Unified Medical Language System. Fact Sheet. Bethesda Md: National Library of Medicine 1989.
4. Hohnloser JH, König A, Fischer MR, Emmerich B. Data quality in computerized patient records: Analysis of a hematology biopsy report database. Int J Clin Monit Comp 1994; 11: 233-40.
5. Klar R, Kaufmehl K. Die Qualität der Diagnosenstatistik nach der neuen Bundespflegesatzverordnung. In: Überla K, Rienhoff O, Victor N, eds. Medizinische Informatik und Statistik. Heidelberg: Springer Verlag 1988; 23-6.
6. Lloyd SS, Rissing JP. Physician and coding errors in patient records. JAMA 1985; 10: 1330-6.
7. Nietzschke E, Wiegand M. Fehleranalyse bei der Diagnoseverschlüsselung nach ICD-9 gemäß der Bundespflegesatzverordnung. Z Orthop 1992; 130: 371-7.
8. Hohnloser JH, Puerner F. PADS - A Patient Archiving and Documentation System. Int J Clin Monit Comp 1992; 9: 71-84.
9. Scriba PC, Mansky T, Fassl H, Friedrich HJ. Diagnoseschlüssel des Zentrums für Innere Medizin und des Medizinischen Zentrums. 1. Auflage 1986.
Address of the authors: Dr. J. Hohnloser, Electronic Patient Record Group, Medizinische Klinik, Klinikum Innenstadt, Ludwig-Maximilians-Universität München, Ziemssenstr. 1, D-80336 München, Germany
Phone: +49 89 5160 4575
Fax: +49 89 5160 2341
E-Mail Compuserve: 100015.3015@compuserve.com
E-Mail: 100015.3015@CompuServe.COM