The Undelete Technology Research For UNIX-like
Keywords- undelete; bitmap; the physical structure of files; data recovery
[...] system does not know the addresses of the file's data blocks. On this file system, deleted files are difficult to restore. A deleted file's inode may sometimes be recorded in the log (journal), but the probability is very low, so this approach can be abandoned.

The block size of an EXT3 file system must be $2^x$ KB, where x is an integer; in general x = 2 and the block size is 4 KB. A block bitmap generally occupies one block (on a small file system it is not fully used), so a group can hold at most 128 MB. This figure also includes the space occupied by the file system itself, so the actual data in each group is limited. The inode contains 12 direct pointers, an indirect pointer, a double indirect pointer and a triple indirect pointer. A direct pointer points directly to a data block. The indirect pointer points to a block whose pointers point to data blocks (in the following, a block that records pointers is called a pointer block). The double indirect pointer points to a pointer block in which each pointer points to another pointer block, and the triple indirect pointer continues in the same way. The EXT3 file system can therefore store files at the TB level. Pointer blocks and ordinary file blocks are all stored in the data area, and nothing distinguishes them. Because of these two points, a file's data blocks are prone to being discontinuous. We can conclude that directly applying the method based on the physical storage structure to recover files gives poor results.

Based on the above analysis of the EXT3 file system, the file system itself offers no usable recovery route, and the method based on the physical storage structure of files cannot be applied to it directly. Therefore, we need to propose an improved method based on the physical storage structure of files.

III. UNIX-FILE SYSTEM UNDELETE METHOD

To improve the method based on the physical structure of files, we must first make the target file's data blocks contiguous again, which means removing the interference of other blocks. The main sources of interference are blocks occupied by the file system, data blocks of files that have not been deleted, and pointer blocks. We want to recover the blocks of deleted files, which have been released, so we only need to scan the free blocks. This can improve the contiguity rate of the data blocks.

The most difficult task is removing the pointer blocks. When a file is deleted, its data blocks and pointer blocks are released together. Nothing distinguishes a pointer block from a data block, and pointer blocks are stored in the data area, mixed in with data blocks. Generally speaking, they are very difficult to remove. To remove them, we must analyze the UNIX-like file system further. Taking the EXT3 file system as an example, the distribution of pointer blocks has a certain regularity. We first ignore the pointer blocks (treating them as data blocks), restore the files, and then remove the pointer blocks from the restored files.

As described here, the undelete method can only recover data blocks that are contiguous, or contiguous within the free space. Pointer blocks are the special data blocks that the EXT3 file system uses to manage file blocks. If a file has been recovered correctly, its pointer blocks appear at fixed locations, and their locations and number depend on the file's size. Let the file's size be S and the file system's block size be BS (for example, 4 KB). Blocks are numbered from 1 within the recovered file, so block No. N starts at byte offset (N − 1)·BS. The positions of the pointer blocks are then calculated as follows:

• 0 KB ≤ S ≤ 48 KB. The EXT3 inode has 12 direct pointers, so the file does not use the indirect pointer and no pointer block needs to be removed;
• 48 KB < S ≤ 4144 KB ≈ 4 MB. The file uses the indirect pointer and one pointer block. Each pointer occupies 4 B, so with a 4 KB block a pointer block holds 1024 pointers that point to 1024 blocks. Together with the 12 direct pointers this gives 1036 blocks in total, so a file in this size range has only one pointer block. It is block No. 13, and its starting offset is 48 KB;
• 4144 KB < S < 4 GB. The file uses the indirect pointer and the double indirect pointer. The indirect pointer block is located as in the previous case. The double indirect pointer accounts for at most 1025 pointer blocks: one double indirect pointer block and 1024 indirect pointer blocks (not all 1024 need be in use). The double indirect pointer block is block No. 1038 of the file, which is obtained as follows: the indirect pointer block points to 1024 data blocks, which are laid out directly behind it, so the double indirect pointer block is the 1025th block behind the indirect pointer block; adding the 13 blocks up to and including the indirect pointer block gives 1025 + 13 = 1038. The first indirect pointer block under the double indirect pointer is the block immediately after it, and each subsequent one lies 1025 blocks further on. The double indirect pointer block therefore starts at offset 1037·BS, and the indirect pointer blocks subordinate to the double indirect pointer are the blocks numbered

$$
N_k = 1039 + 1025k, \qquad k = 0, 1, 2, \ldots, n, \qquad n = \left\lceil \frac{S - 4144\,\text{KB}}{1024 \cdot BS} \right\rceil - 1
$$

• 4 GB ≤ S ≤ 4 TB. The file has used up the indirect pointer and the double indirect pointer and starts using the triple indirect pointer. The positions can be derived in the same way as in the previous case.

In addition, when the target area has been scanned completely, the files found by the scan are restored and their data blocks are removed from the free space. The remaining free blocks are then scanned again, which excludes interference from the blocks of files that have already been restored.
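The pointer-block positions for the size ranges above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names `pointer_block_numbers` and `strip_pointer_blocks` are hypothetical, blocks are numbered from 1 within the recovered run, and it is assumed that each indirect pointer block is immediately followed by the data blocks it addresses (so successive indirect blocks under the double indirect pointer are 1025 blocks apart when BS is 4 KB). The triple indirect range is not covered.

```python
def pointer_block_numbers(size_bytes, block_size=4096, ptr_size=4):
    """Return the 1-based numbers of the pointer blocks inside a
    contiguously recovered file of size_bytes (hypothetical helper)."""
    ptrs = block_size // ptr_size               # pointers per block, 1024 for 4 KB
    direct = 12                                 # direct pointers in the inode
    data_blocks = -(-size_bytes // block_size)  # ceiling division
    positions = []
    if data_blocks <= direct:
        return positions                        # no pointer block at all
    positions.append(direct + 1)                # indirect pointer block: No. 13
    remaining = data_blocks - direct - ptrs     # data blocks beyond the first 1036
    if remaining <= 0:
        return positions
    if remaining > ptrs * ptrs:
        raise ValueError("triple indirect range is not covered by this sketch")
    double_block = direct + 1 + ptrs + 1        # double indirect block: No. 1038
    positions.append(double_block)
    indirect_count = -(-remaining // ptrs)      # indirect blocks actually in use
    for k in range(indirect_count):
        # Assumed stride: one pointer block plus the ptrs data blocks behind it.
        positions.append(double_block + 1 + k * (ptrs + 1))
    return positions


def strip_pointer_blocks(run, size_bytes, block_size=4096, ptr_size=4):
    """Drop the pointer blocks from a recovered run and trim to the file size."""
    skip = set(pointer_block_numbers(size_bytes, block_size, ptr_size))
    blocks = [run[i:i + block_size] for i in range(0, len(run), block_size)]
    data = b"".join(b for n, b in enumerate(blocks, start=1) if n not in skip)
    return data[:size_bytes]
```

For a 4 KB block size, a file one byte larger than 4144 KB yields the positions 13, 1038 and 1039, matching the block numbers derived in the text.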
2010 3rd International Conference on Advanced Computer Theory and Engineering(ICACTE)
Excluding the blocks of already-restored files in this way improves both the recovery rate and the efficiency of recovery.

Take image files (JPG files) as an example. A JPG file has unique header and footer signatures: it begins with a fixed value, 0xFFD8FFE0 in the first 4 B and 0x4A46494600 in bytes 7-11, and ends with the fixed value 0xFFD9 [2]. We search for the header and footer of the file; if the data blocks between them are contiguous, they form a JPG file. The header is the easier one to find: files are stored from the start of a block, so we only need to check the first 11 B of each block against the header signature. Searching for the footer is more complicated, because the header appears at a fixed position in a block but the footer's position is not fixed. Here we draw on the KMP string matching algorithm [3]. The start of text A is aligned with pattern B, and the end of text A is aligned with the end of pattern B'. The pointer into pattern B starts at the first position of B, and the pointer into pattern B' starts at the last position of B'. The two pointers into text A start at the head and the tail of the text. During matching, patterns B and B' each move toward the middle of the text. When a mismatch occurs, the text pointers do not need to backtrack: the partial matches already obtained for B and B' are fully used to slide B and B' toward the middle as far as possible before comparison continues. In general n ≥ 2m, and in practice n ≫ 2m.

The function Q of the pattern string is defined as:

$$
Q(j) =
\begin{cases}
0, & j = 1 \\
\max\{\, k \mid 1 < k < j,\ b_1 \cdots b_{k-1} = b_{j-k+1} \cdots b_{j-1} \,\}, & \text{when this set is not empty} \\
1, & \text{otherwise}
\end{cases}
\tag{1}
$$

Literature [6] gives a similar method for evaluating k.

With the above algorithm the end of the file can be found efficiently. The processing of the main program is shown in Fig. 2.

Finally, the pointer blocks are removed from the recovered files, and the files are restored.
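The header check and the footer search can be sketched as below. This is an illustrative sketch under stated assumptions, not the paper's program: it implements the standard one-directional KMP search with the 1-based Q table of equation (1), not the two-ended variant described in the text, and the names `q_table`, `kmp_find`, `is_jpeg_header` and `footer_end` are hypothetical. The signature bytes are the ones quoted above.

```python
JPEG_HEADER = bytes.fromhex("FFD8FFE0")    # first 4 bytes of the file
JFIF_ID = bytes.fromhex("4A46494600")      # bytes 7-11 (1-based)
JPEG_FOOTER = bytes.fromhex("FFD9")        # last 2 bytes of the file


def q_table(pattern):
    """Build Q(1..m) as in equation (1); index 0 of the list is unused."""
    m = len(pattern)
    q = [0] * (m + 1)
    j, k = 1, 0
    while j < m:
        if k == 0 or pattern[j - 1] == pattern[k - 1]:
            j += 1
            k += 1
            q[j] = k
        else:
            k = q[k]            # fall back without moving j
    return q


def kmp_find(text, pattern, start=0):
    """Return the 0-based index of the first match at or after start, or -1."""
    q = q_table(pattern)
    n, m = len(text), len(pattern)
    i, j = start + 1, 1         # 1-based pointers into text and pattern
    while i <= n and j <= m:
        if j == 0 or text[i - 1] == pattern[j - 1]:
            i += 1
            j += 1
        else:
            j = q[j]            # the text pointer i never moves backwards
    return i - m - 1 if j > m else -1


def is_jpeg_header(block):
    """Check the first 11 bytes of a block against the header signature."""
    return block[:4] == JPEG_HEADER and block[6:11] == JFIF_ID


def footer_end(data, start=0):
    """Return the offset just past the first footer signature, or -1."""
    pos = kmp_find(data, JPEG_FOOTER, start)
    return -1 if pos < 0 else pos + len(JPEG_FOOTER)
```

A caller would test `is_jpeg_header` on the first 11 bytes of each free block and, on a hit, call `footer_end` on the bytes that follow to delimit the candidate file.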
[Fig. 2. Processing of the main program (flowchart). Recoverable steps: read all block bitmaps to build a simulated map and set the current position to the first free block; locate a file header; continue scanning.]
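The first step of the main program, reading the block bitmap and positioning at the first free block, might look like the sketch below. It assumes the usual ext2/ext3 bitmap convention (one bit per block, least significant bit of each byte first, a set bit meaning "in use"); the function names are hypothetical and the bitmap is taken as an in-memory `bytes` object rather than read from a device.

```python
def blocks_per_group(block_size=4096):
    """A one-block bitmap has block_size * 8 bits, one bit per block."""
    return block_size * 8


def free_blocks(bitmap, first_block=0):
    """Yield the numbers of the free blocks described by one block bitmap."""
    for byte_index, byte in enumerate(bitmap):
        for bit in range(8):
            if not (byte >> bit) & 1:           # assumed: clear bit = free block
                yield first_block + byte_index * 8 + bit


def free_runs(bitmap, first_block=0):
    """Group free blocks into (start, length) runs of contiguous blocks."""
    start = prev = None
    for block in free_blocks(bitmap, first_block):
        if start is None:
            start = prev = block
        elif block == prev + 1:
            prev = block
        else:
            yield (start, prev - start + 1)
            start = prev = block
    if start is not None:
        yield (start, prev - start + 1)
```

With a 4 KB block, `blocks_per_group() * 4096` gives the 128 MB group size mentioned earlier, and the runs returned by `free_runs` are the candidate regions to scan for file headers.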
[...] do not have any signatures, and more complex files whose data blocks are interleaved (for instance, file A and file B are of the same type, file B's header lies in the middle of file A, and file B's footer comes after the footer of file A), cannot be restored. Understanding the logical structure of files can solve these problems, and we will continue to study this as a direction for future research.

[6] Weimin Yan, Weimin Wu. Data Structure [M]. Beijing: Qinghua University Press, 1996: 80-84.
[7] Zhongxia Wang, Jingsheng Zhang. SQL Server Database Reverse Reconstruction [J]. Beijing: School Paper of Beijing Information Science and Technology University, 2009, 12(4): 39-42.