Understanding Linux: May 2010

You must have heard that the fundamental data types must be aligned to a specific byte boundary. Most of you must have seen your programs being killed due to alignment exception generated by the processor. For any C programmer it is very important to understand the reasons behind the alignment restrictions.
In this article we will see why the compiler align certain data types to a certain boundary.

Why alignment restrictions ?
Typically a 32 bit processor reads or writes from the memory in chunks of 4 bytes (32 bits). The reads and writes can be performed at addresses that are divisible by 4.
Even if you want to read a single byte, 4 bytes are read from the memory and then the processor hardware circuitry will copy the requested byte to the specific position in the register.
For example, on m68K processor, a multiplexer sits in between an internal register and the external bus. The multiplexer selects the specified byte and routes it to the appropriate byte position in the register.

Lets try to see why processor restricts the data to be aligned to a specific byte boundary. Suppose the processor needs to read an integer (4 bytes in size).

Case 1: Integer is stored at address that is 4 byte aligned (say 0x100)

As the processor can read/write only at addresses that are divisible by 4, all four bytes can be read in a single bus cycle. Note that the address 0x100 is divisible by 4.

Address	Byte 0	Byte 1	Byte 2	Byte 3
0x100	X0	X1	X2	X3

Case 2: Integer is stored at an unaligned address (say 0x101)

This read cannot be performed in a single 32 bit bus cycle. The processor will have to issue two different reads at addresses 0x100 and 0x104.

Thus, it takes twice the time to read a misaligned data.

Address	Byte 0	Byte 1	Byte 2	Byte 3
0x100		X0	X1	X2
0x104	X3

It is important to note that

· Some processors allow unaligned access but at a performance penalty as we saw above. Total time to read a misaligned data is more/double as compared to the aligned data Intel processor allow reading/writing data which is unaligned but at the cost of reduced performance.

· Some processors generate an alignment exception on accessing a misaligned data. Then it is up to the exception handler either to report an error (kill the user process) or perform two read cycles to read the misaligned data (user process will run normally in this case but at reduced performance).

C compiler allocates addresses such that the data types are properly aligned. This helps in speeding up the read/write operation without causing any exception.
Typically on a 32 bit processor (like 32 bit x86, m68k, PowerPC, ARM etc)

· A char (one byte) will be 1-byte aligned.

· A short (two bytes) will be 2-byte aligned.

· An int (four bytes) will be 4-byte aligned.

· Any pointer (four bytes) will be 4-byte aligned (e.g.: char *, int *)

Structure alignment and padding

Compiler add pad bytes in the user defined structures so that the various fields of a structure are properly aligned.

There are certain rules that can be kept in mind to understand padding in a structure.

1) No padding before the first element

The first element of a struct must come first, and must be preceded by no padding. This allows a struct pointer to be converted to a pointer to the struct's first element and vice versa, which is a useful property.

2) Pad bytes at the end of structure

It is obvious to add bytes in between the elements of a structure to keep the elements aligned. But why the compiler will put pad bytes at the end of a structure ?

This is because if an array of structure is declared, start of each structure should be properly aligned.

For an array of structure, struct A array[10],

ð array[1] should be equivalent to *(array + 1).

ð difference between array + 1 and array + 0, should exactly equal to the size of struct A

To satisfy above constraints, compiler may have to put padding bytes at the end of structure. The rule is that the last member inside a structure should be padded with the number of bytes to make the structure aligned to the size of largest member of structure.

Lets see few examples for structure alignment and padding

struct exampleStruct {

char data1;

short data2;

int data 3;

char data 4;

};

On a typical 32 bit processor (say m68k or PowerPC), the compiler will insert padding in between various members so that they are properly aligned.

struct exampleStruct {

char data1;

char padding[1]; // pad of 1 byte to keep short to be 2

// byte aligned

short data2;

int data3; // already aligned to 4 byte boundary,

// so no padding before this

char data4;

char padding[3]; // padding at the end of structure so

// that if an array of such structures is

// declared, the start of each structure

// in the array is properly aligned.

// Padding should be done such that

// start of structure address is aligned

// to the size of largest data member of

// structure. In this case, the size of

// largest data type (int data3) is 4.

// So, 3 pad bytes are added to the end.

};

How to arrange members so as to minimize padding?

By changing the ordering of members in a structure, it is possible to minimize the amount of padding. If members are sorted by ascending or descending aligment requirements, minimal amount of padding is required.

So, if you have a structure with multiple integer, short and char members, then keep all integer members at the beginning, keep all short members after interger members and all chars at the end.

How to prevent compiler from putting pad bytes in between or at the end of structure?

Compiler provides #pragma directives to specify the alignment of members inside a structure.

#pragma pack(push) // save original alignment on stack

#pragma pack(1) // set alignment to 1 byte boundary

struct packedStruct {

char data1;

int data2; // no padding is inserted,

// as alignment specified is 1 byte.

char data3;

};

#pragma pack(pop) // restore original alignment

Here, we tell the compiler that each element should be aligned to 1 byte boundary. So, no padding is inserted in between elements. The size of this structure aftre compilation will be 6 bytes.

Thats all for today.
Have Fun !!!

In this article I will discuss about the internals of FAT16 file system, its layout and how the files and directories are stored in it. We will dissect the FAT file system by reading the important blocks and analyze their contents to get an understanding of how the fat16 file system works.

Sectors Vs Clusters
Data is written on disks in terms of sectors which are typically 512 bytes in size. You cannot read/write data less than sector size.
Cluster is a fixed number of contiguous sectors on disk. FAT file system allocate space to a file in terms of clusters and not sectors. If the cluster size is 8, it means that even if you write a single byte to a file, 8 contiguous sectors (ie 8 * 512 = 4096 bytes) will be allocated to the file.

Why FAT uses clusters instead of sectors ?
Like any file system FAT also needs to keep information about which sectors on disk are free and which are allocated to a file. FAT keeps a table, where it uses two bytes for each sector.
For efficiently reading/writing to a file, this table should be kept in memory. Smaller the size of table, more part of the table can be kept in memory. If instead of sector, cluster is used as a minimum allocation unit, the size of table will be reduced. If we use 8 sectors / cluster, the size of table will be reduces 8 times.

Larger File System
Also, larger disks can be handled using a larger cluster size. As FAT uses 2 bytes per cluster, total number of clusters that can be addressed are 0xFFFF (65535). Bigger the size of cluster, bigger the size of FAT file system that can be created.

Cluster size (Bytes) FAT16 File System Size

1024 (1K) 67MB = (65536 * 1024) Bytes

4096 (4K) 268 MB

8192 (8K) 536 MB

32768 (32K 2.14 GB

Better Performance
Another advantage of using a cluster is that if we have more contiguous blocks of sectors allocated to a file. This reduces the disk seek time and therefore improves file system performance.

Drawbacks (Internal Fragmentation)
The drawback of using a cluster is the internal fragmentation. If we need to write a single byte, the file system will allocate a cluster. If size of cluster is 8 sectors, we will end up allocating (8 * 512) = 4096 bytes, which means a lot of wastage of disk space. Since, most of the files are larger than sector size, the cluster concept works great.

FAT File System Layout on disk
A filesystem is created on a partition on the disk. A partition can be created using “fdisk” command.

As you see in the above diagram, essentially a FAT File System comprise of three sections.

· Boot Sector
· Primary And Secondary FAT Tables
· Data Clusters that contain the root directory and other files and directories.

Lets discuss FAT Table first
FAT table keeps information about which clusters in the data area are allocated to a file/directory and which are free.
It uses 2 bytes for each cluster.

FAT TABLE VALUE MEANING

0x0000 Cluster is free and not allocated to any

file/directory
0x0002 – 0xFFEF Used, Next cluster in file

0xFFF7 Bad Cluster

0xFFF8 – 0xFFFF Used, Last Cluster in file

The first two entries in the FAT Table are the special entries that are used to indicate the type of media and the dirty status of the file system.
The first entry stores the type of storage medium (0xF0 means removable and 0xF8 means non removable)
The second entry in the table indicates if the file system is dirty or not.
Suppose the file (Hello.txt) is allocated cluster number 3, then to find the next cluster allocated to this file go to third entry in the table which will have the next cluster number (which is 7) allocated. Then go to entry number 7 which has 0xFFFF as the next cluster number. 0xFFFF indicates end of file.
So, find all the clusters allocated to a file, we need to traverse the FAT table. That is why it will be more efficient to keep the entire FAT Table in the memory.

FAT 16 BOOT SECTOR
It is the first sector in the FAT file system
Contains information about the file system like Sectors per cluster, Number of FAT Tables, file system type (FAT 16/12/32) etc.

Bytes	Purpose
0-2	Assembly code instructions to jump to boot code (present in bootable partition)
3-10	OEM name in ASCII
11-12	Bytes per sector (512, 1024, 2048, or 4096)
13	Sectors per cluster (Must be a power of 2 and cluster size must be <=32 KB)
14-15	Size of reserved area, in sectors
16	Number of FAT Tables (usually 2, one primary and other backup)
17-18	Maximum number of files in the root directory
19-20	Number of sectors in the file system; if 2 Bytes are not large enough, set to 0 and use 4 Byte value in bytes 32-35 below
21	Media type (0xF0 = removable disk, 0xF8 = fixed disk)
22-23	Size of each FAT in sectors
24-25	Sectors per track in storage device
26-27	Number of heads in storage device
28-31	Number of sectors before the start partition
32-35	Number of sectors in the file system; this field will be 0 if the 2Byte field above (bytes 19-20) is non-zero
36	BIOS INT 13h (low level disk services) drive number
37	Not used
38	Extended boot signature to validate next three fields (0x29)
39-42	Volume serial number
43-53	Volume label, in ASCII
54-61	File system type level, in ASCII. (Generally "FAT", "FAT12", or "FAT16")
62-509	Not used
510-511	Signature value (0xaa55)

Lets print the BOOT SECTOR of FAT16 File System.
I created this FAT16 File System on 1GB disk. I used “dd” command to read the sectors from the partition.

Bytes Value Meaning

00-02: eb 3c 90 Instructions to jump to boot code

03-0a: 4d 53 44 4f 53 35 2e 30 Name string (MSDOS5.0)

0b-0c: 00 02 Bytes/sector (0x0200 = 512)

0d : 20 Sectors/cluster (0x20 = 32)

0e-0f: 02 00 Size of reserved area (2 sectors)

10 : 02 Number of FAT Tables (2) primary and secondary

11-12: 00 02 Max. number of root directory entries

(0x0200 = 512)

13-14: 00 00 Total number of sectors (0x0000)

2 Bytes are not large enough to

store the size of this fileSystem

15 : f8 Media type (non removable)

16-17: ff 00 FAT size (0x00ff = 255 sectors)

18-19: 20 00 Sectors/track (0x0020 = 32)

1a-1b: 80 00 Number of heads (0x0080 = 128)

1c-1f: 20 00 00 00 Number of sector before partition (0x20 = 32)

20-23: c3 dd 1f 00 Total number of sectors (0x001fddc3 ~ 1MB)

24 : 80 Drive number (80)

25 : 00 Unused

26 : 29 Extended boot signature

27-2a: fa 1e e7 34 Volume serial number

2b-35: 4e 4f 20 4e 41 4d 45 20 20 20 20 Volume label ("NO NAME ")

36-3d: 46 41 54 31 36 20 20 20 File system type label ("FAT16 ")

3e-1fd : [snip] Not used

1fe-1ff: 55 aa Signature value (0xaa55)

DATA AREA
Data area contains the root directoy and all other files and sub directories.
Root Directory store the directory entries for each file in the root directory. The most important data in each directory entry is the name of the file and the first cluster number of the file. If you the first cluster number then all other clusters that belong to this file can be found.
After the root directory, data belonging to all the files and the subdirectories is stored.

First lets have a look at the various fields of DIRECTORY ENTRY

VARIOUS FIELDS OF DIRECTORY ENTRY

Bytes Purpose

0 First character of file name Or allocation status

(0x00 = unallocated, 0xe5 = deleted)

1-10 Characters 2-11 of the file name

11 File Attributes

0x01 => Read Only

0x02 => Hidden File

0x04 => System File

0x08 => Volume Label

0x0f => Long File Name

0x10 => Directory

0x20 => Archive

12 Reserved

13 File Creation Time (in tenths of seconds)

14-15 Creation Time (Hours, minutes, seconds)

16-17 Creation date

18-19 Access Date

20-21 High order 2 Bytes of address of first cluster

(0 for FAT16)

22-23 Modified Time (hours, minutes, seconds)

24-25 Modified Date

26-27 Low order two bytes of address of first cluster

28-31 File size (0 for directories)

Lets do the dissection of a real FAT16 File System

I printed the boot sector of the FAT file system above that has the overall information about the entire file system. We will use this information to create the layout of the file system

No of sectors per cluster = 32
No of reserved sectors = 2
Number of FAT tables = 2 (one primary and one secondary)
Size of each FAT Table = 255 sectors
Size of single directory entry = 32 bytes
Max number of Root Directory Entries = 512
Size of root directory = 512 * 32 = 32 sectors = 1 cluster

So the LAYOUT of the file system is

Now lets analyze the contents of Root Directory and other files.
In my filesystem, I created following folders and files.

/folder1

/folder1/test3.txt

/folder2

/folder2/test4.txt

/test1.txt

/test2.txt

Lets print the root directory contents of my file system

Byte

Offset Directory Entry (32 Bytes)

-------------------------------------------------------

1000000 T E S T F A T 1 6 20 08 00 00 00 00

1000020 00 00 00 00 00 00 fd 00 b1 3c 00 00 00 00 00 00

First directory entry (32 bytes)

Volume Name = TEST FAT16

File Attribute = 08 = Volume Label
-------------------------------------------------------

1000040 e5 4e 00 65 00 77 00 20 00 46 00 0f 00 dd 6f 00

1000060 6c 00 64 00 65 00 72 00 00 00 00 00 ff ff ff ff

Second directory entry (32 bytes)

e5 = Deleted
-------------------------------------------------------

1000140 F O L D E R 1 10 00 4c 02 01

1000160 b1 3c b1 3c 00 00 03 01 b1 3c 02 00 00 00 00 00

Fourth Directory Entry

Directory Name = FOLDER1

Attribute = 10 = Directory

First Cluster Address = 0002
-------------------------------------------------------

(removed deleted/unallocated directory entries>

1000300 F O L D E R 2 10 00 91 07 01

1000320 b1 3c b1 3c 00 00 08 01 b1 3c 03 00 00 00 00 00

Directory Name = FOLDER2

Attribute = 10 = Directory

First cluster address = 0003
--------------------------------------------------------

1000500 T E S T 1 T X T 20 00 7e 0b 01

1000520 b1 3c b1 3c 00 00 22 01 b1 3c 04 00 b8 01 00 00

File Name = TEST1.TXT

Attributes = 20 = Archive

First Cluster address = 0004

--------------------------------------------------------

1000700 T E S T 2 T X T 20 00 7e 0b 01

1000720 b1 3c b1 3c 00 00 3a 01 b1 3c 05 00 2b 01 00 00

File Name = TEST2.TXT

Attribute = 20 = Archive

First cluster address = 0005

So, we saw that the root directory contains the directory entries of all folders and files that are created inside it. The directory entry contains the first sector containing the directory or file, so we can find all subdirectories and files inside it.
Lest look at FOLDER1. I created the file test3.txt inside FOLDER1
As per the directory entry of FOLDER1 (printed above) the first cluster number that contains the contents of FOLDER1 is 0x02.
We will print the contents of cluster no 2. Note that cluster numbering starts from 2. Cluster 2 lies just after the end of root directory.
Sector number corresponding to cluster 2 = 2 + 255 *2 + 32 = 544

CONTENTS OF FOLDER1 (cluster 2) Sector #544

1040000 2e 20 20 20 20 20 20 20 20 20 20 10 00 4c 02 01

1040020 b1 3c b1 3c 00 00 03 01 b1 3c 02 00 00 00 00 00

<snip>

1040240 T E S T 3 T X T 20 00 59 45 01

1040260 b1 3c b1 3c 00 00 54 01 b1 3c 06 00 f2 00 00 00

File Name = TEST3.TXT

Attributes = 0x20 (Archive)

First Cluster address = 0006

The first cluster that contains the file TEST3.TXT is 0x06.
I stored following contents in TEST3.TXT

# cat /folder1/test3.txt

CONTENTS oF TEST3.TXT CONTENTS oF TEST3.TXT CONTENTS oF TEST3.TXT CONTENTS oF TEST3.TXT CONTENTS oF TEST3.TXT CONTENTS oF TEST3.TXT CONTENTS oF TEST3.TXT CONTENTS oF TEST3.TXT CONTENTS oF TEST3.TXT CONTENTS oF TEST3.TXT CONTENTS oF TEST3.TX

Lets print the contents of cluster 6 to see if the contents match with the contents of file test3.txt

Cluster 6 correspond to sector number 2 + 255 * 2 + 32 + 32 * 4 = 672

1240000 C O N T E N T S o F T E S T

1240020 3 . T X T C O N T E N T S o

1240040 F T E S T 3 . T X T C O N T

1240060 E N T S o F T E S T 3 . T X

1240100 T C O N T E N T S o F T E

1240120 S T 3 . T X T C O N T E N T S

1240140 o F T E S T 3 . T X T C O

1240160 N T E N T S o F T E S T 3 .

1240200 T X T C O N T E N T S o F

1240220 T E S T 3 . T X T C O N T E N

1240240 T S o F T E S T 3 . T X T

1240260 C O N T E N T S o F T E S T

1240300 3 . T X T C O N T E N T S o

1240320 F T E S T 3 . T X T C O N T

1240340 E N T S o F T E S T 3 . T X

1240360 T \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0

1240400 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0

So, we saw that the contents matches with what we stored in test3.txt.
You can follow the same approach to traverse the directory hierarchy to print the contents of other files.
Lets print the primary FAT Table as well

0002000 f8 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

0002020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Entry 0 = 0xf8 (media type non removable)

Entry 1 = 0xff

Entry 2 = 0xff

Entry 6 = 0xff which indicates the end of file for test3.txt

(ie test3.txt has only one cluster allocated)

Thats all for today. I hope you find this article useful. If you have any questions or find something missing or incorrect please post your comments.
Have Fun !!!

Total Pageviews

Understanding Linux

Blog Archive

Monday, May 31, 2010

Data Structure Alignment and Padding

Wednesday, May 19, 2010

Dissecting FAT 16 File System

Followers