Total Pageviews

Wednesday, May 19, 2010

Dissecting FAT 16 File System

In this article I will discuss about the internals of FAT16 file system, its layout and how the files and directories are stored in it. We will dissect the FAT file system by reading the important blocks and analyze their contents to get an understanding of how the fat16 file system works.


Sectors Vs Clusters
Data is written on disks in terms of sectors which are typically 512 bytes in size. You cannot read/write data less than sector size.
Cluster is a fixed number of contiguous sectors on disk. FAT file system allocate space to a file in terms of clusters and not sectors. If the cluster size is 8, it means that even if you write a single byte to a file, 8 contiguous sectors (ie 8 * 512 = 4096 bytes) will be allocated to the file.


Why FAT uses clusters instead of sectors ?
Like any file system FAT also needs to keep information about which sectors on disk are free and which are allocated to a file. FAT keeps a table, where it uses two bytes for each sector.
For efficiently reading/writing to a file, this table should be kept in memory. Smaller the size of table, more part of the table can be kept in memory. If instead of sector, cluster is used as a minimum allocation unit, the size of table will be reduced. If we use 8 sectors / cluster, the size of table will be reduces 8 times.


Larger File System
Also, larger disks can be handled using a larger cluster size. As FAT uses 2 bytes per cluster, total number of clusters that can be addressed are 0xFFFF (65535). Bigger the size of cluster, bigger the size of FAT file system that can be created.


Cluster size (Bytes)       FAT16 File System Size
     1024 (1K)             67MB = (65536 * 1024) Bytes
     4096 (4K)             268 MB
     8192 (8K)             536 MB            
     32768 (32K            2.14 GB


Better Performance
Another advantage of using a cluster is that if we have more contiguous blocks of sectors allocated to a file. This reduces the disk seek time and therefore improves file system performance.


Drawbacks (Internal Fragmentation)
The drawback of using a cluster is the internal fragmentation. If we need to write a single byte, the file system will allocate a cluster. If size of cluster is 8 sectors, we will end up allocating (8 * 512) = 4096 bytes, which means a lot of wastage of disk space. Since, most of the files are larger than sector size, the cluster concept works great.


FAT File System Layout on disk
A filesystem is created on a partition on the disk. A partition can be created using “fdisk” command.




As you see in the above diagram, essentially a FAT File System comprise of three sections.
  • · Boot Sector
  • · Primary And Secondary FAT Tables
  • · Data Clusters that contain the root directory and other files and directories.
Lets discuss FAT Table first
FAT table keeps information about which clusters in the data area are allocated to a file/directory and which are free.
It uses 2 bytes for each cluster.


FAT  TABLE VALUE           MEANING
0x0000                     Cluster is free and not allocated to any
                           file/directory
0x0002 – 0xFFEF            Used, Next cluster in file
0xFFF7                     Bad Cluster
0xFFF8 – 0xFFFF            Used, Last Cluster in file  


The first two entries in the FAT Table are the special entries that are used to indicate the type of media and the dirty status of the file system.
The first entry stores the type of storage medium (0xF0 means removable and 0xF8 means non removable)
The second entry in the table indicates if the file system is dirty or not.
Suppose the file (Hello.txt) is allocated cluster number 3, then to find the next cluster allocated to this file go to third entry in the table which will have the next cluster number (which is 7) allocated. Then go to entry number 7 which has 0xFFFF as the next cluster number. 0xFFFF indicates end of file.
So, find all the clusters allocated to a file, we need to traverse the FAT table. That is why it will be more efficient to keep the entire FAT Table in the memory.


FAT 16 BOOT SECTOR
It is the first sector in the FAT file system
Contains information about the file system like Sectors per cluster, Number of FAT Tables, file system type (FAT 16/12/32) etc.


Bytes
Purpose
0-2
Assembly code instructions to jump to boot code (present in bootable partition)
3-10
OEM name in ASCII
11-12
Bytes per sector (512, 1024, 2048, or 4096)
13
Sectors per cluster (Must be a power of 2 and cluster size must be <=32 KB)
14-15
Size of reserved area, in sectors
16
Number of FAT Tables (usually 2, one primary and other backup)
17-18
Maximum number of files in the root directory
19-20
Number of sectors in the file system; if 2 Bytes are not large enough, set to 0 and use 4 Byte value in bytes 32-35 below
21
Media type (0xF0 = removable disk, 0xF8 = fixed disk)
22-23
Size of each FAT in sectors
24-25
Sectors per track in storage device
26-27
Number of heads in storage device
28-31
Number of sectors before the start partition
32-35
Number of sectors in the file system; this field will be 0 if the 2Byte field above (bytes 19-20) is non-zero
36
BIOS INT 13h (low level disk services) drive number
37
Not used
38
Extended boot signature to validate next three fields (0x29)
39-42
Volume serial number
43-53
Volume label, in ASCII
54-61
File system type level, in ASCII. (Generally "FAT", "FAT12", or "FAT16")
62-509
Not used
510-511
Signature value (0xaa55)


Lets print the BOOT SECTOR of FAT16 File System.
I created this FAT16 File System on 1GB disk. I used “dd” command to read the sectors from the partition.

Bytes      Value                     Meaning 
00-02:     eb 3c 90                  Instructions to jump to boot code
03-0a:     4d 53 44 4f 53 35 2e 30   Name string (MSDOS5.0)
0b-0c:     00 02                     Bytes/sector (0x0200 = 512)
0d   :     20                        Sectors/cluster (0x20 = 32)
0e-0f:     02 00                     Size of reserved area (2 sectors)
10   :     02                        Number of FAT Tables (2) primary and secondary 
11-12:     00 02                     Max. number of root directory entries
                                     (0x0200 = 512)  
13-14:     00 00                     Total number of sectors (0x0000)
                                     2 Bytes are not large enough to
                                     store the size of this fileSystem
15   :     f8                        Media type (non removable)
16-17:     ff 00                     FAT size (0x00ff = 255 sectors)
18-19:     20 00                     Sectors/track (0x0020 = 32)
1a-1b:     80 00                     Number of heads (0x0080 = 128)
1c-1f:     20 00 00 00               Number of sector before partition (0x20 = 32)
20-23:     c3 dd 1f 00               Total number of sectors (0x001fddc3 ~ 1MB) 
24   :     80                        Drive number (80)
25   :     00                        Unused
26   :     29                        Extended boot signature
27-2a:     fa 1e e7 34               Volume serial number
2b-35:     4e 4f 20 4e 41 4d 45 20 20 20 20  Volume label ("NO NAME ")
36-3d:     46 41 54 31 36 20 20 20   File system type label ("FAT16 ")
3e-1fd :   [snip]                    Not used
1fe-1ff:   55 aa                     Signature value (0xaa55)



DATA AREA
Data area contains the root directoy and all other files and sub directories.
Root Directory store the directory entries for each file in the root directory. The most important data in each directory entry is the name of the file and the first cluster number of the file. If you the first cluster number then all other clusters that belong to this file can be found.
After the root directory, data belonging to all the files and the subdirectories is stored.


First lets have a look at the various fields of DIRECTORY ENTRY

VARIOUS FIELDS OF DIRECTORY ENTRY
Bytes        Purpose

0            First character of file name Or allocation status
             (0x00 = unallocated, 0xe5 = deleted)
1-10         Characters 2-11 of the file name
11           File Attributes
             0x01   =>  Read Only
             0x02   =>  Hidden File
             0x04   =>  System File             
             0x08   =>  Volume Label
             0x0f   =>  Long File Name
             0x10   =>  Directory
             0x20   =>  Archive
12           Reserved
13           File Creation Time (in tenths of seconds)
14-15        Creation Time (Hours, minutes, seconds)
16-17        Creation date
18-19        Access Date
20-21        High order 2 Bytes of address of first cluster
             (0 for FAT16)
22-23        Modified Time (hours, minutes, seconds)
24-25        Modified Date
26-27        Low order two bytes of address of first cluster
28-31        File size (0 for directories)


Lets do the dissection of a real FAT16 File System

I printed the boot sector of the FAT file system above that has the overall information about the entire file system. We will use this information to create the layout of the file system


No of sectors per cluster              = 32
No of reserved sectors                 = 2
Number of FAT tables                   = 2 (one primary and one secondary)
Size of each FAT Table                 = 255 sectors
Size of single directory entry         = 32 bytes
Max number of Root Directory Entries   = 512
Size of root directory = 512 * 32 = 32 sectors    = 1 cluster




So the LAYOUT of the file system is
layout
Now lets analyze the contents of Root Directory and other files.
 In my filesystem, I created following folders and files.

/folder1
/folder1/test3.txt
/folder2
/folder2/test4.txt
/test1.txt
/test2.txt


Lets print the root directory contents of my file system

Byte                
Offset               Directory Entry (32 Bytes)
-------------------------------------------------------                                        
1000000  T  E  S  T     F  A  T  1  6 20 08 00 00 00 00 
1000020 00 00 00 00 00 00 fd 00 b1 3c 00 00 00 00 00 00 

First directory entry (32 bytes)     
        Volume Name = TEST FAT16
        File Attribute = 08 = Volume Label
-------------------------------------------------------
                                                        
1000040 e5 4e 00 65 00 77 00 20 00 46 00 0f 00 dd 6f 00
1000060 6c 00 64 00 65 00 72 00 00 00 00 00 ff ff ff ff
Second directory entry (32 bytes)
        e5 = Deleted
-------------------------------------------------------

1000140  F  O  L  D  E  R  1             10 00 4c 02 01
1000160 b1 3c b1 3c 00 00 03 01 b1 3c 02 00 00 00 00 00
Fourth Directory Entry
         Directory Name = FOLDER1
         Attribute = 10 = Directory
         First Cluster Address = 0002
-------------------------------------------------------
(removed deleted/unallocated directory entries>

1000300  F  O  L  D  E  R  2             10 00 91 07 01
1000320 b1 3c b1 3c 00 00 08 01 b1 3c 03 00 00 00 00 00
         Directory Name = FOLDER2
         Attribute = 10 = Directory
         First cluster address = 0003
-------------------------------------------------------- 
1000500  T  E  S  T  1           T  X  T 20 00 7e 0b 01
1000520 b1 3c b1 3c 00 00 22 01 b1 3c 04 00 b8 01 00 00
         File Name = TEST1.TXT
         Attributes = 20 = Archive
         First Cluster address = 0004
--------------------------------------------------------
1000700  T  E  S  T  2           T  X  T 20 00 7e 0b 01
1000720 b1 3c b1 3c 00 00 3a 01 b1 3c 05 00 2b 01 00 00
         File Name = TEST2.TXT
         Attribute = 20 = Archive
         First cluster address = 0005


So, we saw that the root directory contains the directory entries of all folders and files that are created inside it. The directory entry contains the first sector containing the directory or file, so we can find all subdirectories and files inside it.
Lest look at FOLDER1. I created the file test3.txt inside FOLDER1
As per the directory entry of FOLDER1 (printed above) the first cluster number that contains the contents of FOLDER1 is 0x02.
We will print the contents of cluster no 2. Note that cluster numbering starts from 2. Cluster 2 lies just after the end of root directory.
Sector number corresponding to cluster 2 = 2 + 255 *2 + 32 = 544


CONTENTS OF FOLDER1  (cluster 2) Sector #544 
1040000 2e 20 20 20 20 20 20 20 20 20 20 10 00 4c 02 01
1040020 b1 3c b1 3c 00 00 03 01 b1 3c 02 00 00 00 00 00
<snip>
1040240  T  E  S  T  3           T  X  T 20 00 59 45 01
1040260 b1 3c b1 3c 00 00 54 01 b1 3c 06 00 f2 00 00 00

         File Name = TEST3.TXT

         Attributes = 0x20 (Archive)
         First Cluster address = 0006



The first cluster that contains the file TEST3.TXT is 0x06.
I stored following contents in TEST3.TXT

# cat /folder1/test3.txt
CONTENTS oF TEST3.TXT CONTENTS oF TEST3.TXT CONTENTS oF TEST3.TXT CONTENTS oF TEST3.TXT CONTENTS oF TEST3.TXT CONTENTS oF TEST3.TXT CONTENTS oF TEST3.TXT CONTENTS oF TEST3.TXT CONTENTS oF TEST3.TXT CONTENTS oF TEST3.TXT CONTENTS oF TEST3.TX


Lets print the contents of cluster 6 to see if the contents match with the contents of file test3.txt
Cluster 6 correspond to sector number 2 + 255 * 2 + 32 + 32 * 4 = 672

1240000  C   O   N   T   E   N   T   S       o   F       T   E   S   T
1240020  3   .   T   X   T       C   O   N   T   E   N   T   S       o
1240040  F       T   E   S   T   3   .   T   X   T       C   O   N   T
1240060  E   N   T   S       o   F       T   E   S   T   3   .   T   X
1240100  T       C   O   N   T   E   N   T   S       o   F       T   E
1240120  S   T   3   .   T   X   T       C   O   N   T   E   N   T   S
1240140      o   F       T   E   S   T   3   .   T   X   T       C   O
1240160  N   T   E   N   T   S       o   F       T   E   S   T   3   .
1240200  T   X   T       C   O   N   T   E   N   T   S       o   F   
1240220  T   E   S   T   3   .   T   X   T       C   O   N   T   E   N
1240240  T   S       o   F       T   E   S   T   3   .   T   X   T   
1240260  C   O   N   T   E   N   T   S       o   F       T   E   S   T
1240300  3   .   T   X   T       C   O   N   T   E   N   T   S       o
1240320  F       T   E   S   T   3   .   T   X   T       C   O   N   T
1240340  E   N   T   S       o   F       T   E   S   T   3   .   T   X
1240360  T      \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
1240400 \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0


So, we saw that the contents matches with what we stored in test3.txt.
You can follow the same approach to traverse the directory hierarchy to print the contents of other files.
Lets print the primary FAT Table as well

0002000 f8 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0002020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Entry 0 = 0xf8   (media type non removable)
Entry 1 = 0xff
Entry 2 = 0xff
..
Entry 6 = 0xff   which indicates the end of file for test3.txt
                 (ie test3.txt has only one cluster allocated)

Thats all for today. I hope you find this article useful. If you have any questions or find something missing or incorrect please post your comments.
Have Fun !!!

4 comments:

  1. Excellent !!! I Learnt FAT filesystem in just 3 hrs. Thanks...

    ReplyDelete
  2. The best fat16 tutorial I ever seen on the internet, just what I needed it.

    ReplyDelete
  3. Coll mate, cool for newbie hobbyist developers also. Some suggestion (hey not meant to be harsh its a great tut)
    1. Allocating New File
    2. Deleting File
    3. Allocating Directory (Folder)
    4. Delete Folder
    5. Finally FAT32 dissection who wants him to write about it ? raise ur hands ladies and gentlemen

    ReplyDelete
  4. All is OK but why on the last table there are only one byte per entry? They should to be 2 bytes per FAT16.
    Or I do not see well?

    ReplyDelete

Followers