CH 20

Chapter 20: Method of Capturing the Control

Successfully inserting its body into a file is only half the job. To carry out its activities,
every virus must capture control in one way or another. There are at least three methods
of achieving this goal.

Correction of the Entry Point


The classical method, actively used since the days of MS-DOS, consists of correcting
the entry point, which is one of the fields of the ELF/COFF/a.out headers of the files to be
infected. In ELF headers this role is played by the e_entry field, and in a.out the
a_entry field carries out the same task. Both fields contain the virtual address (not the offset
counted from the start of the file) of the machine instruction to which control should be
passed.
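
To make the role of e_entry concrete, here is a minimal sketch that reads this field from the start of an ELF file. It relies only on the standard header layout (the 16-byte e_ident block followed by e_type, e_machine, and e_version); the function name read_elf_entry is hypothetical and chosen for illustration.

```python
import struct

def read_elf_entry(header: bytes) -> int:
    """Return the e_entry virtual address from the first bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class = header[4]  # 1 = 32-bit, 2 = 64-bit
    ei_data = header[5]   # 1 = little-endian, 2 = big-endian
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        raise ValueError("unsupported ELF class or byte order")
    endian = "<" if ei_data == 1 else ">"
    # e_entry follows e_ident (16 bytes), e_type (2), e_machine (2)
    # and e_version (4), so it starts at offset 24 in both classes.
    width = "I" if ei_class == 1 else "Q"
    (entry,) = struct.unpack_from(endian + width, header, 24)
    return entry
```

Passing the first 64 bytes of a file, for example read_elf_entry(open(path, "rb").read(64)), is enough for both 32-bit and 64-bit headers. The value returned is a virtual address, so it cannot be used directly as a file offset.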
When inserting its body into the file, the virus saves the address of the original entry
point and redirects the entry point to its own body. Having accomplished all planned actions,
the virus then returns control to the carrier program using the saved address of the
original entry point. Although this technique seems flawless at first glance, it is not free
from drawbacks, which lead to quick detection of the virus.
First, the entry point of most normal files points to the start of the file's code section. It is
very hard to insert anything there, and all existing methods of insertion carry the risk of
irreversibly damaging the executable file, which would make it unusable. An entry point
that lies beyond the limits of the .text section is a clear indication of virus infection.
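
The check described above can be sketched in a few lines. The sketch assumes that the section table has already been obtained elsewhere (for example, copied from the output of readelf -S) and is supplied as (name, address, size) tuples; the helper names and the sample addresses are hypothetical.

```python
from typing import Iterable, Optional, Tuple

Section = Tuple[str, int, int]  # (name, virtual address, size in bytes)

def section_containing(address: int, sections: Iterable[Section]) -> Optional[str]:
    """Return the name of the section that covers address, or None."""
    for name, start, size in sections:
        if start <= address < start + size:
            return name
    return None

def entry_looks_suspicious(entry: int, sections: Iterable[Section]) -> bool:
    """True when the entry point does not fall inside .text."""
    return section_containing(entry, sections) != ".text"

# Hypothetical layout used only to illustrate the call.
SAMPLE_SECTIONS = [
    (".plt", 0x08000A00, 0x200),
    (".text", 0x08000C00, 0x3000),
    (".data", 0x08006000, 0x400),
]
print(entry_looks_suspicious(0x08000C00, SAMPLE_SECTIONS))  # False
print(entry_looks_suspicious(0x08006100, SAMPLE_SECTIONS))  # True
```

A positive result is only a reason to look closer: files processed by packers or protectors may also place the entry point outside .text.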
Second, analysis of any suspicious file starts in the vicinity of the entry point (and
often ends there as well). Therefore, no matter which method was used for
insertion, the virus code attracts the investigator's attention immediately.
Third, the entry point is an object to which legions of disk scanners, detectors, managers,
and other antivirus tools pay vigilant attention.
Thus, using the entry point for capturing control is too primitive an approach, which,
according to the opinion prevailing among virus writers, is even disgraceful.
Contemporary viruses have mastered other techniques of infection, and only a naïve beginner would
rely on analysis of the entry point alone (by the way, that's how rumors about elusive and
imperceptible viruses are born…).

Insertion of the Virus Code in the Neighborhood of the Entry Point

Most viruses do not change the entry point in any way. Instead, they place at the
specified address an instruction that jumps to the virus body, having
previously saved the original contents. Despite its seeming elegance, this algorithm is very
capricious and difficult to implement. To begin with, in order to save
the original machine instruction located at the entry point, the virus must determine its
length, which is impossible without a built-in disassembler.
Most viruses limit themselves to saving the first 16 bytes (enough to hold the longest possible
machine instruction on the Intel platform, where the architectural limit is 15 bytes), and then
restore them, thus bypassing the prohibition on code segment modification. Some viruses assign the Write
attribute to the code segment, making it writable (if section attributes are left
untouched, it is still possible to modify the code segment; however, IDA Pro won't tell
you about it, because it is unable to work with segment attributes). There are also
viruses that use the mprotect function to change page attributes on the fly. However,
both methods are too noticeable, let alone the jmp instruction that passes control to
the virus body, which immediately attracts attention!
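
How such a jmp can be spotted is easy to sketch. The fragment below assumes 32-bit x86 code and looks only at the two relative forms of jmp (opcodes E9 and EB); the function name jump_target_at is hypothetical, and the bytes are expected to have been read from the file offset that corresponds to the entry point.

```python
import struct
from typing import Optional

def jump_target_at(code: bytes, address: int) -> Optional[int]:
    """If code begins with a relative x86 jmp, return its destination address.

    address is the virtual address of the first byte of code.
    Returns None when the first instruction is not a relative jmp.
    """
    if len(code) >= 5 and code[0] == 0xE9:   # jmp rel32
        (rel,) = struct.unpack_from("<i", code, 1)
        return (address + 5 + rel) & 0xFFFFFFFF
    if len(code) >= 2 and code[0] == 0xEB:   # jmp rel8
        (rel,) = struct.unpack_from("<b", code, 1)
        return (address + 2 + rel) & 0xFFFFFFFF
    return None
```

The displacement is counted from the end of the instruction, which is why 5 or 2 is added to the address. If the returned destination lies outside the code section, the file deserves a closer look; indirect jumps and the push/ret idiom are not covered by this sketch.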
More advanced viruses scan the start-up procedure of the file to be infected, searching for
call or jmp instructions. Having found such an instruction, the virus replaces the
target address with the address of its own body. Despite its seeming elusiveness, this
method of capturing control is not difficult to detect. First and foremost, the virus, in contrast to
legitimately called functions, makes no use of the arguments passed to it through the
stack. It has no idea how many there are, or even whether there are any (automatic analysis
of the number of passed arguments is unimaginable without building a full-featured
disassembler into the virus, which would additionally have to be equipped with a powerful
analyzer). The virus carefully saves all the registers it modifies, being aware that functions
might use an unknown calling convention that passes arguments in registers. The main
issue is that, when passing control to the original function, the virus must either
remove the return address from the top of the stack (otherwise there would be two return
addresses there) or reach the original function with a jmp instruction instead of a call. For
normal programs written in high-level programming languages both methods are very
untypical, and because of this the virus presence is immediately disclosed.
Viruses that capture control at an arbitrary point of the program (often very distant from the
entry point) are much more difficult to detect, because in this case the investigator must
analyze large fragments of code without knowing beforehand which ones. At the
same time, the risk that the chosen branch of the program never receives control grows
rapidly with the distance from the entry point. Therefore, as far as I know,
most viruses never go beyond the first ret instruction that they encounter.

Modification of the Import Table

The classical mechanism for importing external functions into ELF files, in its most
general form, works as follows. A call to an imported function from
the .text section first goes to a "stub" located in the .plt (Procedure Linkage
Table) section. The stub, in turn, refers to the pointer to the printf function located in the
.got (Global Offset Table) section, which is associated with the strings table containing the names of
the called functions (or their hashes).
Listing 20.1 provides the scheme of calling the printf function by the ls utility from the
Red Hat 5.0 distribution set:

Listing 20.1. Scheme used by the ls utility for calling the printf function

.text:08000E2D call _printf



.plt:08000A58 _printf proc near
.plt:08000A58
.plt:08000A58 jmp ds:off_800628C
.plt:08000A58 _printf endp

.got:0800628C off_800628C dd offset printf

extern:8006580 extrn printf:near ; weak

0000065B: FF 00 6C 69-62 63 2E 73-6F 2E 35 00-73 74 70 63 y libc.so.5 stpc
0000066B: 70 79 00 73-74 72 63 70-79 00 69 6F-63 74 6C 00 py strcpy ioctl
0000067B: 70 72 69 6E-74 66 00 73-74 72 65 72-72 6F 72 00 printf strerror

Which point of this chain can be chosen for insertion? First, the virus can create a fictitious
strings table, trapping the calls to all functions that are of any interest to it. The most
popular targets for infection are the printf/fprintf/sprintf functions (because
practically no program can do without them) and the file input/output functions. Trapping
the latter automatically provides a transparent mechanism for finding new targets for infection.
Satellite viruses can create a special trapping library for all infected files. Because IDA Pro
doesn't display the name of the imported library when disassembling ELF files, it is very
difficult to suspect anything wrong in this situation. Fortunately, there are HEX editors.
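
What a HEX editor shows can also be extracted programmatically. The sketch below takes the bytes of the dump from Listing 20.1 (with the hyphens replaced by spaces), splits them at the terminating zeros, and picks out the names that look like shared libraries. The helper name and the ".so" test are assumptions made for illustration; a real tool would walk the dynamic section instead of guessing by name.

```python
from typing import List

# Bytes of the dump from Listing 20.1 (file offsets 0x65B through 0x68A).
DUMP = (
    "FF 00 6C 69 62 63 2E 73 6F 2E 35 00 73 74 70 63 "
    "70 79 00 73 74 72 63 70 79 00 69 6F 63 74 6C 00 "
    "70 72 69 6E 74 66 00 73 74 72 65 72 72 6F 72 00"
)

def nul_terminated_names(raw: bytes) -> List[str]:
    """Return the printable ASCII strings found between zero bytes."""
    names = []
    for chunk in raw.split(b"\x00"):
        if chunk and all(0x20 <= b < 0x7F for b in chunk):
            names.append(chunk.decode("ascii"))
    return names

names = nul_terminated_names(bytes.fromhex(DUMP))
libraries = [n for n in names if ".so" in n]
print(names)      # ['libc.so.5', 'stpcpy', 'strcpy', 'ioctl', 'printf', 'strerror']
print(libraries)  # ['libc.so.5']
```

In this fragment libc.so.5 is the only library name. An unfamiliar library name appearing next to it in the strings table of a utility such as ls would be a reason to examine the file more closely.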
Some other viruses manipulate the entries of the global offset table, resetting
them to point at the virus body.
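
A rough check for this kind of tampering can be sketched as follows. It rests on an assumption that does not come from the text above: in a lazily bound x86 ELF file, the function slots of the global offset table, as stored on disk, point back into the .plt section. The slot values and the .plt range are expected to have been extracted elsewhere, the function name is hypothetical, and slots holding zero are skipped as not yet filled in.

```python
from typing import List, Sequence

def got_slots_outside_plt(slots: Sequence[int], plt_start: int, plt_size: int) -> List[int]:
    """Return indexes of function slots whose on-disk value lies outside .plt.

    slots must contain only the function slots, not the reserved
    entries at the start of the table. Zero values are ignored.
    """
    plt_end = plt_start + plt_size
    return [
        index
        for index, value in enumerate(slots)
        if value != 0 and not (plt_start <= value < plt_end)
    ]
```

For example, with a hypothetical .plt at 0x08000A00 of size 0x100, the call got_slots_outside_plt([0x08000A5E, 0x08000A6E, 0x08007000], 0x08000A00, 0x100) reports index 2. Files linked with immediate binding or processed by other tools may not follow this pattern, so a hit is a prompt for manual analysis, not a verdict.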
