LINUX vs WINDOWS SERVER
LINUX SERVER
Linux is a freely distributed open source system, which makes it highly cost-effective for hosts to provide, maintain, and operate. It also has an extremely strong reputation for both speed and stability. It is so widely accepted that the majority of websites are hosted on a Linux operating system.
SECURITY: Although Linux and Windows can both face hacking attempts, Linux is open source, so patches that close security holes tend to be implemented very quickly, since so many people contribute to improving it every day.
COST: Again, because it is an open source OS, there is no licensing charge for using a Linux operating system. It therefore costs the host less to offer the service.
1. Linux is free software and an open-source OS. People can change the code.
2. You can add programs that help you use your computer better.
3. Linux encourages programmers to extend and redesign the OS over time, and because it is open source, you can see how it works and edit the OS yourself.
4. The various distributions of Linux come from different companies and projects (e.g. Lindows, Lycoris, Red Hat, SuSE, Mandrake, Knoppix, Slackware). Linux is capable of networking, file sharing, and acting as a web server.
5. Linux is very cheap or free.
6. Linux is customizable in a way that Windows is not.
7. Microsoft Windows, by contrast, is a closed-source operating system developed by Microsoft, the company co-founded by Bill Gates. Its critics argue that it is losing its grip on the market because it is less secure, slower, and more wasteful of resources.
8. Windows and Linux are two different operating systems. The purpose of an operating system is to: (1) control all the hardware components that are part of your computer; (2) manage a computer's ability to do several things at once; (3) provide a base set of services to programs, so that software manufacturers do not have to reinvent the wheel a million times for the same thing. The Linux operating system was modeled on Unix (another operating system) after the Unix systems stopped being free. The Linux developers believe in free and open software, so they "reinvented" Unix as a new, independently written system and improved on it to make Linux.
9. Most hard drive installations of Linux use a "swap partition": the disk space allocated for paging is kept separate from general data and is used strictly for paging operations. This reduces slowdown due to disk fragmentation from general use.
10. Linux kernel 2.6 once used a scheduling algorithm that favored interactive processes. Here an "interactive" process is one that has short bursts of CPU usage rather than long ones. It is said that a process without root privilege could take advantage of this to monopolize the CPU when the CPU time accounting precision was low. However, the Completely Fair Scheduler, which became the standard scheduler in kernel 2.6.23, addresses this problem.
WINDOWS NT SERVER
Windows, like the version on your personal computer, is a commercial operating system owned by Microsoft. Its major benefit is that it can run Microsoft software such as Access and MS SQL databases.
SECURITY: Because it is a commercial operating system, security fixes must usually come from Microsoft itself (frequently as security updates or service packs), so some security issues can take a little longer to fix.
COST: Just as you pay for Windows on your private computer, hosts must pay Microsoft a licensing fee to use the operating system on their servers. That is why hosts generally charge extra for Windows hosting.
1. Windows NT is developed by Microsoft.
2. Windows NT is programmed in C and C++.
3. You cannot change anything in Windows itself. Because the source code is closed, you cannot inspect how its internals work or build your own extensions to the OS.
4. All the flavors of Windows come from Microsoft.
5. Windows is expensive.
6. Windows is not customizable.
7. Linux, by contrast, is an open source operating system that, until fairly recently, was used mainly on servers. Now more people are starting to use it on computers that aren't servers, and it can also be installed on Apple Macintosh hardware. It is regarded as very secure, efficient, and flexible.
8. Windows and Linux are two different operating systems. The purpose of an operating system is to: (1) control all the hardware components that are part of your computer; (2) manage a computer's ability to do several things at once; (3) provide a base set of services to programs, so that software manufacturers do not have to reinvent the wheel a million times for the same thing.
Windows is a proprietary operating system owned by Microsoft. It was developed independently from Unix, and its internal details are very different. Both systems perform the same tasks, but at the deepest levels the details differ, so a program written to run on Windows will not run on Linux, and vice versa.
Windows comes in several "flavors", like Windows NT, Windows 2000, and Windows XP, all of which are slightly different but share enough in common that programs written for one flavor will run on the others in the vast majority of cases.
9. The Windows NT family (including 2000, XP, Vista, and Windows 7) most commonly employs a dynamically allocated pagefile for memory management. A pagefile is allocated on disk for less frequently accessed objects in memory, leaving more RAM available to actively used objects. This scheme can suffer from slowdowns due to disk fragmentation.
10. NT-based versions of Windows use a CPU scheduler based on a multilevel feedback queue, with 32 priority levels defined. The kernel may change the priority level of a thread depending on its I/O and CPU usage and on whether it is interactive, raising the priority of interactive and I/O-bound processes and lowering that of CPU-bound processes, to increase the responsiveness of interactive applications.
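The priority adjustment described in point 10 can be illustrated with a small toy model. This is only a sketch and not the actual Windows scheduler: the function names, the event names ("io_wait", "quantum_expired"), and the step sizes are all hypothetical. The only detail taken from the text above is the range of 32 priority levels.

```python
# Toy model of dynamic priority adjustment; NOT the real Windows scheduler.
MIN_PRIORITY = 0
MAX_PRIORITY = 31  # 32 priority levels, as described above


def adjust_priority(priority, event):
    """Return the new priority after one scheduling event.

    "io_wait": the thread blocked on I/O or user input before its time
    slice ended (interactive or I/O-bound behaviour).
    "quantum_expired": the thread used its whole time slice (CPU-bound).
    The step sizes (+2 / -1) are hypothetical.
    """
    if event == "io_wait":
        priority += 2
    elif event == "quantum_expired":
        priority -= 1
    else:
        raise ValueError(f"unknown event: {event}")
    # Clamp to the valid range of priority levels.
    return max(MIN_PRIORITY, min(MAX_PRIORITY, priority))


def simulate(start, events):
    """Apply a sequence of events and return the final priority."""
    priority = start
    for event in events:
        priority = adjust_priority(priority, event)
    return priority


if __name__ == "__main__":
    print(simulate(8, ["io_wait"] * 3))          # interactive thread drifts up
    print(simulate(8, ["quantum_expired"] * 3))  # CPU-bound thread drifts down
```

In this model a thread that keeps waiting on I/O climbs toward the top level, while one that always exhausts its time slice sinks, which is the general effect the text describes.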
Thursday, April 22, 2010
Friday, March 26, 2010
RAID Technology
RAID stands for Redundant Array of Inexpensive (or sometimes "Independent") Disks.
RAID is a method of combining several hard disk drives into one logical unit (two or more disks grouped together to appear as a single device to the host system). RAID technology was developed to address the fault-tolerance and performance limitations of conventional disk storage. It can offer fault tolerance and higher throughput levels than a single hard drive or group of independent hard drives. While arrays were once considered complex and relatively specialized storage solutions, today they are easy to use and essential for a broad spectrum of client/server applications.
HISTORY
RAID technology was first defined by a group of computer scientists at the University of California at Berkeley in 1987. The scientists studied the possibility of using two or more disks to appear as a single device to the host system.
Although the array's performance was better than that of large, single-disk storage systems, reliability was unacceptably low. To address this, the scientists proposed redundant architectures to provide ways of achieving storage fault tolerance. In addition to defining RAID levels 1 through 5, the scientists also studied data striping -- a non-redundant array configuration that distributes files across multiple disks in an array. Often known as RAID 0, this configuration actually provides no data protection. However, it does offer maximum throughput for some data-intensive applications such as desktop digital video production.
THE DRIVING FACTORS BEHIND RAID
A number of factors are responsible for the growing adoption of arrays for critical network storage.
More and more organizations have created enterprise-wide networks to improve productivity and streamline information flow. While the distributed data stored on network servers provides substantial cost benefits, these savings can be quickly offset if information is frequently lost or becomes inaccessible. As today's applications create larger files, network storage needs have increased proportionately. In addition, accelerating CPU speeds have outstripped data transfer rates to storage media, creating bottlenecks in today's systems.
RAID storage solutions overcome these challenges by providing a combination of outstanding data availability, extraordinary and highly scalable performance, high capacity, and recovery with no loss of data or interruption of user access.
By integrating multiple drives into a single array -- which is viewed by the network operating system as a single disk drive -- organizations can create cost-effective, minicomputer-sized solutions of up to a terabyte or more of storage.
RAID LEVELS
There are several different RAID "levels" or redundancy schemes, each with inherent cost, performance, and availability (fault-tolerance) characteristics designed to meet different storage needs. No individual RAID level is inherently superior to any other. Each of the five array architectures is well-suited for certain types of applications and computing environments. For client/server applications, storage systems based on RAID levels 1, 0/1, and 5 have been the most widely used. This is because popular network operating systems (NOSs) such as Windows NT® Server and NetWare manage data in ways similar to how these RAID architectures perform.
RAID 0
Data striping without redundancy (no protection).
• Minimum number of drives: 2
• Strengths: Highest performance.
• Weaknesses: No data protection; if one drive fails, all data is lost.
DRIVE 1        DRIVE 2
Data A         Data A
Data B         Data B
Data C         Data C
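The striping shown in the diagram can be sketched in a few lines of Python. This is a minimal illustration, not how any particular controller works: the function names and the chunk size are assumptions, and real arrays stripe fixed-size disk blocks rather than short byte strings.

```python
def stripe(data: bytes, num_drives: int, chunk_size: int) -> list:
    """Split data into chunks and deal them round-robin across the drives."""
    if num_drives < 2:
        raise ValueError("RAID 0 needs at least 2 drives")
    drives = [[] for _ in range(num_drives)]
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        # Chunk number modulo drive count picks the target drive.
        drives[(offset // chunk_size) % num_drives].append(chunk)
    return drives


def unstripe(drives: list) -> bytes:
    """Reassemble the data by reading the chunks back in round-robin order."""
    out = []
    longest = max(len(d) for d in drives)
    for row in range(longest):
        for drive in drives:
            if row < len(drive):
                out.append(drive[row])
    return b"".join(out)


if __name__ == "__main__":
    drives = stripe(b"AAAABBBBCCCC", num_drives=2, chunk_size=2)
    print(drives)            # each drive holds part of A, part of B, part of C
    print(unstripe(drives))  # the original data
```

Because every drive holds only part of each file, losing any one drive makes the whole data set unreadable, which is the weakness listed above.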
RAID 1
Disk mirroring.
• Minimum number of drives: 2
• Strengths: Very high performance; Very high data protection; Very minimal penalty on write performance.
• Weaknesses: High redundancy cost overhead; Because all data is duplicated, twice the storage capacity is required.
Mirroring (one standard host adapter)
DRIVE 1          DRIVE 2
Data A           Data A
Data B           Data B
Data C           Data C
Original Data    Mirrored Data
Duplexing (standard host adapter 1 for drive 1, standard host adapter 2 for drive 2)
DRIVE 1          DRIVE 2
Data A           Data A
Data B           Data B
Data C           Data C
Original Data    Mirrored Data
RAID 2
No practical use.
• Minimum number of drives: Not used in LAN
• Strengths: Previously used for error correction in RAM environments (known as Hamming code) and in disk drives before the use of embedded error correction.
• Weaknesses: No practical use; the same performance can be achieved by RAID 3 at lower cost.
RAID 3
Byte-level data striping with dedicated parity drive.
• Minimum number of drives: 3
• Strengths: Excellent performance for large, sequential data requests.
• Weaknesses: Not well-suited for transaction-oriented network applications; the single parity drive does not support multiple, simultaneous read and write requests.
RAID 4
Block-level data striping with dedicated parity drive.
• Minimum number of drives: 3 (Not widely used)
• Strengths: Data striping supports multiple simultaneous read requests.
• Weaknesses: Write requests suffer from the same single parity-drive bottleneck as RAID 3; RAID 5 offers equal data protection and better performance at the same cost.
RAID 5
Block-level data striping with distributed parity.
• Minimum number of drives: 3
• Strengths: Best cost/performance for transaction-oriented networks; Very high performance, very high data protection; Supports multiple simultaneous reads and writes; Can also be optimized for large, sequential requests.
• Weaknesses: Write performance is slower than RAID 0 or RAID 1.
DRIVE 1        DRIVE 2        DRIVE 3
Parity A       Data A         Data A
Data B         Parity B       Data B
Data C         Data C         Parity C
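The way parity moves from drive to drive in the diagram can be generated programmatically. The sketch below assumes that the parity for stripe number s sits on drive s modulo the number of drives, which reproduces the diagram; actual controllers may rotate parity in a different order. The function name is hypothetical.

```python
def raid5_layout(num_drives, num_stripes):
    """Return one row of labels per stripe, one column per drive.

    Assumption: parity for stripe s is placed on drive (s % num_drives).
    Stripes are named A, B, C, ... as in the diagram.
    """
    if num_drives < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    rows = []
    for s in range(num_stripes):
        name = chr(ord("A") + s)
        parity_drive = s % num_drives
        rows.append([
            f"Parity {name}" if d == parity_drive else f"Data {name}"
            for d in range(num_drives)
        ])
    return rows


if __name__ == "__main__":
    for row in raid5_layout(num_drives=3, num_stripes=3):
        print("   ".join(row))
```

Spreading the parity blocks this way means no single drive has to absorb every parity write, which is the bottleneck noted for RAID 3 and RAID 4.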
RAID 01 (0+1) AND RAID 10 (1+0)
Combination of RAID 0 (data striping) and RAID 1 (mirroring). RAID 01 (0+1) is a mirrored configuration of two striped sets (mirror of stripes); RAID 10 (1+0) is a stripe across a number of mirrored sets (stripe of mirrors). RAID 10 provides better fault tolerance and rebuild performance than RAID 01. Both array types provide very good to excellent overall performance by combining the speed of RAID 0 with the redundancy of RAID 1 without requiring parity calculations.
• Minimum number of drives: 4
• Strengths: Highest performance, highest data protection (can tolerate multiple drive failures, depending on which drives fail).
• Weaknesses: High redundancy cost overhead; because all data is duplicated, twice the storage capacity is required; requires a minimum of four drives.
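The redundancy overhead listed for the levels above can be compared with a short calculation. This is a simplified sketch that assumes all drives are the same size; the function name and the level labels are hypothetical, and the minimum drive counts are the ones given in this article.

```python
def usable_capacity_gb(level, num_drives, drive_size_gb):
    """Usable capacity of an array of equal-sized drives (simplified model)."""
    minimum = {"0": 2, "1": 2, "5": 3, "01": 4, "10": 4}
    if level not in minimum:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_drives < minimum[level]:
        raise ValueError(f"RAID {level} needs at least {minimum[level]} drives")
    if level == "0":
        return num_drives * drive_size_gb        # no redundancy at all
    if level == "1":
        return drive_size_gb                     # every drive holds a full copy
    if level == "5":
        return (num_drives - 1) * drive_size_gb  # one drive's worth of parity
    # RAID 01 and RAID 10: half of the drives hold mirrored copies.
    if num_drives % 2:
        raise ValueError("RAID 01/10 needs an even number of drives")
    return (num_drives // 2) * drive_size_gb


if __name__ == "__main__":
    for level, n in [("0", 2), ("1", 2), ("5", 3), ("10", 4)]:
        print(f"RAID {level} with {n} x 500 GB:", usable_capacity_gb(level, n, 500), "GB")
```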
RAID 01 (0+1 mirror of stripes)
DRIVE 1          DRIVE 2          DRIVE 3          DRIVE 4
Data A           Data A           mA               mA
Data B           Data B           mB               mB
Data C           Data C           mC               mC
Original Data    Original Data    Mirrored Data    Mirrored Data
RAID 10 (1+0 stripe of mirrors)
DRIVE 1          DRIVE 2          DRIVE 3          DRIVE 4
Data A           mA               Data B           mB
Data C           mC               Data D           mD
Data E           mE               Data F           mF
Original Data    Mirrored Data    Original Data    Mirrored Data
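The claim that RAID 10 tolerates failures better than RAID 01 can be checked by enumerating every possible combination of failed drives in the four-drive layouts shown above. This is a simplified model with hypothetical function names: it assumes a RAID 01 stripe set becomes unusable as soon as any one of its drives fails.

```python
from itertools import combinations

DRIVES = (1, 2, 3, 4)


def raid10_survives(failed):
    """RAID 10: mirrored pairs (1,2) and (3,4), striped together.

    Data survives as long as no mirrored pair has lost both drives.
    """
    pairs = [{1, 2}, {3, 4}]
    return all(not (pair <= failed) for pair in pairs)


def raid01_survives(failed):
    """RAID 01: stripe sets (1,2) and (3,4), mirrored.

    Data survives only while at least one stripe set has no failed drive.
    """
    stripe_sets = [{1, 2}, {3, 4}]
    return any(not (s & failed) for s in stripe_sets)


def count_survivable(survives, num_failed=2):
    """Count the failure combinations of the given size that lose no data."""
    return sum(
        1 for combo in combinations(DRIVES, num_failed) if survives(set(combo))
    )


if __name__ == "__main__":
    print("RAID 10 survivable two-drive failures:", count_survivable(raid10_survives))
    print("RAID 01 survivable two-drive failures:", count_survivable(raid01_survives))
```

Under these assumptions, RAID 10 survives four of the six possible two-drive failures, while RAID 01 survives only two of them. Both survive any single-drive failure.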
TYPES OF RAID
There are three primary array implementations: software-based arrays, bus-based array adapters/controllers, and subsystem-based external array controllers. As with the various RAID levels, no one implementation is clearly better than another -- although software-based arrays are rapidly losing favor as high-performance, low-cost array adapters become increasingly available. Each array solution meets different server and network requirements, depending on the number of users, applications, and storage requirements.
It is important to note that all RAID code is based on software. The difference among the solutions is where that software code is executed -- on the host CPU (software-based arrays) or offloaded to an on-board processor (bus-based and external array controllers).
Software-based RAID
Description: Primarily used with entry-level servers, software-based arrays rely on a standard host adapter and execute all I/O commands and mathematically intensive RAID algorithms in the host server CPU. This can slow system performance by increasing host PCI bus traffic, CPU utilization, and CPU interrupts. Some NOSs such as NetWare and Windows NT include embedded RAID software. The chief advantage of this embedded RAID software has been its lower cost compared to higher-priced RAID alternatives. However, this advantage is disappearing with the advent of lower-cost, bus-based array adapters.
Advantages:
• Low price
• Only requires a standard controller.
Hardware-based RAID
Description: Unlike software-based arrays, bus-based array adapters/controllers plug into a host bus slot [typically a 133 MByte (MB)/sec PCI bus] and offload some or all of the I/O commands and RAID operations to one or more secondary processors. Originally used only with mid- to high-end servers due to cost, lower-cost bus-based array adapters are now available specifically for entry-level server network applications.
In addition to offering the fault-tolerant benefits of RAID, bus-based array adapters/controllers perform connectivity functions that are similar to standard host adapters. By residing directly on a host PCI bus, they provide the highest performance of all array types. Bus-based arrays also deliver more robust fault-tolerant features than embedded NOS RAID software.
As newer, high-end technologies such as Fibre Channel become readily available, the performance advantage of bus-based arrays compared to external array controller solutions may diminish.
Advantages:
• Data protection and performance benefits of RAID
• More robust fault-tolerant features and increased performance versus software-based RAID.
External Hardware RAID Card
Description: Intelligent external array controllers "bridge" between one or more server I/O interfaces and single- or multiple-device channels. These controllers feature an on-board microprocessor, which provides high performance and handles functions such as executing RAID software code and supporting data caching.
External array controllers offer complete operating system independence, the highest availability, and the ability to scale storage to extraordinarily large capacities (up to a terabyte and beyond). These controllers are usually installed in networks of stand-alone Intel-based and UNIX-based servers as well as clustered server environments.
Advantages:
• OS independent
• Build super high-capacity storage systems for high-end servers.
SERVER TECHNOLOGY COMPARISON
UDMA
Best suited for: Low-cost entry-level server with limited expandability
Advantages:
• Uses low-cost ATA drives
SCSI
Best suited for: Low to high-end server when scalability is desired
Advantages:
• Performance: up to 160 MB/s
• Reliability
• Connectivity to the largest variety of peripherals
• Expandability
Fibre Channel
Best suited for: Server-to-server campus networks
Advantages:
• Performance: up to 100 MB/s
• Dual active loop data path capability
• Infinitely scalable
PARITY
The concept behind RAID is relatively simple. The fundamental premise is to be able to recover data on-line in the event of a disk failure by using a form of redundancy called parity. In its simplest form, parity can be thought of as the sum of the data on all the drives used in an array. Recovery from a drive failure is achieved by reading the remaining good data and checking it against the parity data stored by the array. Parity is used by RAID levels 2, 3, 4, and 5. RAID 1 does not use parity because all data is completely duplicated (mirrored). RAID 0, used only to increase performance, offers no data redundancy at all.
A + B + C + D = PARITY
1 + 2 + 3 + 4 = 10
If the third drive fails, its data (X) is missing:
1 + 2 + X + 4 = 10
7 + X = 10
X = 10 - 7 = 3
MISSING DATA: X     RECOVERED DATA: 3
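The arithmetic above uses ordinary addition to keep the idea simple. In practice, RAID parity is typically computed with the bitwise XOR operation, which has the same "solve for the missing term" property. The sketch below shows this with the same four values; the function names are hypothetical and each "drive" is reduced to a single byte for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(
        reduce(lambda a, b: a ^ b, column) for column in zip(*blocks)
    )


def recover(surviving_blocks, parity):
    """Rebuild the one missing block from the survivors plus the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


if __name__ == "__main__":
    data = [b"\x01", b"\x02", b"\x03", b"\x04"]   # drives A, B, C, D
    parity = xor_blocks(data)                     # stored on the parity drive
    # Drive C fails; rebuild it from A, B, D and the parity block.
    rebuilt = recover([data[0], data[1], data[3]], parity)
    print(rebuilt == data[2])
```

XOR works here because XOR-ing a value with itself cancels it out, so combining the parity with every surviving block leaves only the missing block.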
FAULT TOLERANCE
RAID technology does not prevent drive failures. However, RAID does provide insurance against disk drive failures by enabling real-time data recovery without data loss.
The fault tolerance of arrays can also be significantly enhanced by choosing the right storage enclosure. Enclosures that feature redundant, hot-swappable drives, power supplies, and fans can greatly increase storage subsystem uptime based on a number of widely accepted measures:
• MTDL:
Mean Time to Data Loss. The average time before the failure of an array component causes data to be lost or corrupted.
• MTDA:
Mean Time between Data Access (or availability). The average time before non-redundant components fail, causing data inaccessibility without loss or corruption.
• MTTR:
Mean Time To Repair. The average time required to bring an array storage subsystem back to full fault tolerance.
• MTBF:
Mean Time Between Failure. Used to measure computer component average reliability/life expectancy. MTBF is not as well-suited for measuring the reliability of array storage systems as MTDL, MTTR or MTDA (see above) because it does not account for an array's ability to recover from a drive failure. In addition, enhanced enclosure environments used with arrays to increase uptime can further limit the applicability of MTBF ratings for array solutions.
RAID is a method of combining several hard disk drives into one logical unit (two or more disks grouped together to appear as a single device to the host system). RAID technology was developed to address the fault-tolerance and performance limitations of conventional disk storage. It can offer fault tolerance and higher throughput levels than a single hard drive or group of independent hard drives. While arrays were once considered complex and relatively specialized storage solutions, today they are easy to use and essential for a broad spectrum of client/server applications
HISTORY
RAID technology was first defined by a group of computer scientists at the University of California at Berkeley in 1987. The scientists studied the possibility of using two or more disks to appear as a single device to the host system.
Although the array's performance was better than that of large, single-disk storage systems, reliability was unacceptably low. To address this, the scientists proposed redundant architectures to provide ways of achieving storage fault tolerance. In addition to defining RAID levels 1 through 5, the scientists also studied data striping -- a non-redundant array configuration that distributes files across multiple disks in an array. Often known as RAID 0, this configuration actually provides no data protection. However, it does offer maximum throughput for some data-intensive applications such as desktop digital video production.
THE DRIVING FACTORS BEHIND RAID
A number of factors are responsible for the growing adoption of arrays for critical network storage.
More and more organizations have created enterprise-wide networks to improve productivity and streamline information flow. While the distributed data stored on network servers provides substantial cost benefits, these savings can be quickly offset if information is frequently lost or becomes inaccessible. As today's applications create larger files, network storage needs have increased proportionately. In addition, accelerating CPU speeds have outstripped data transfer rates to storage media, creating bottlenecks in today's systems.
RAID storage solutions overcome these challenges by providing a combination of outstanding data availability, extraordinary and highly scalable performance, high capacity, and recovery with no loss of data or interruption of user access.
By integrating multiple drives into a single array -- which is viewed by the network operating system as a single disk drive -- organizations can create cost-effective, minicomputer-sized solutions of up to a terabyte or more of storage.
RAID LEVELS
There are several different RAID "levels" or redundancy schemes, each with inherent cost, performance, and availability (fault-tolerance) characteristics designed to meet different storage needs. No individual RAID level is inherently superior to any other. Each of the five array architectures is well-suited for certain types of applications and computing environments. For client/server applications, storage systems based on RAID levels 1, 0/1, and 5 have been the most widely used. This is because popular NOSs such as Windows NT® Server and NetWare manage data in ways similar to how these RAID architectures perform.
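To make the cost side of this trade-off concrete, here is a rough sketch of how much usable space each level leaves from a set of identical drives. The function name, the level labels and the minimum drive counts table are illustrative assumptions (the minimums are taken from the per-level lists that follow); real controllers may reserve extra space.

```python
# Minimum drive counts as listed for each level in this article.
MIN_DRIVES = {"0": 2, "1": 2, "3": 3, "4": 3, "5": 3, "01": 4, "10": 4}


def usable_capacity(level, drives, drive_size_gb):
    """Return the usable capacity in GB for an array of identical drives."""
    if level not in MIN_DRIVES:
        raise ValueError(f"unsupported RAID level: {level}")
    if drives < MIN_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DRIVES[level]} drives")
    if level == "0":
        # Striping only: every byte of every drive holds user data.
        return drives * drive_size_gb
    if level in ("1", "01", "10"):
        # Mirroring: all data is duplicated, so half the raw space is usable.
        if drives % 2:
            raise ValueError("mirrored levels need an even number of drives")
        return drives * drive_size_gb // 2
    # Levels 3, 4 and 5: one drive's worth of space holds parity.
    return (drives - 1) * drive_size_gb


print(usable_capacity("5", 4, 500))   # four 500 GB drives in RAID 5
print(usable_capacity("10", 4, 500))  # the same drives in RAID 10
```

With the same four drives, RAID 5 gives up one drive to parity while RAID 10 gives up two to mirroring, which is the "redundancy cost overhead" mentioned in the lists below.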
RAID 0
Data striping without redundancy (no protection).
• Minimum number of drives: 2
• Strengths: Highest performance.
• Weaknesses: No data protection; One drive fails, all data is lost.
DRIVE 1 DRIVE 2
Data A Data A
Data B Data B
Data C Data C
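Striping can be sketched in a few lines. The example below is only an illustration, not how any particular controller works: the block size and the function names `stripe` and `unstripe` are assumptions. It deals fixed-size blocks out to the drives in round-robin order and then reads them back.

```python
def stripe(data, drives, block_size):
    """Split data into blocks and deal them out to the drives in turn."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    layout = [[] for _ in range(drives)]
    for index, block in enumerate(blocks):
        layout[index % drives].append(block)
    return layout


def unstripe(layout):
    """Reassemble the original data by reading the drives in the same order."""
    drives = len(layout)
    total_blocks = sum(len(drive) for drive in layout)
    return b"".join(layout[i % drives][i // drives] for i in range(total_blocks))


layout = stripe(b"AAAABBBBCCCCDDDD", drives=2, block_size=4)
print(layout)            # blocks alternate between the two drives
print(unstripe(layout))  # the original byte string
```

Because consecutive blocks sit on different drives, a large read can be served by all drives at once. The same layout also shows the weakness: lose either list and half of every file is gone.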
RAID 1
Disk mirroring.
• Minimum number of drives: 2
• Strengths: Very high performance; Very high data protection; Very minimal penalty on write performance.
• Weaknesses: High redundancy cost overhead; Because all data is duplicated, twice the storage capacity is required.
Mirroring (both drives attached to one standard host adapter)
DRIVE 1 DRIVE 2
Data A Data A
Data B Data B
Data C Data C
Original Data Mirrored Data
Duplexing (each drive attached to its own standard host adapter: Adapter 1 and Adapter 2)
DRIVE 1 DRIVE 2
Data A Data A
Data B Data B
Data C Data C
Original Data Mirrored Data
RAID 2
No practical use.
• Minimum number of drives: Not used in LAN
• Strengths: Previously used for error correction in RAM environments (known as Hamming code) and in disk drives before the use of embedded error correction.
• Weaknesses: No practical use; Same performance can be achieved by RAID 3 at lower cost
RAID 3
Byte-level data striping with dedicated parity drive.
• Minimum number of drives: 3
• Strengths: Excellent performance for large, sequential data requests.
• Weaknesses: Not well-suited for transaction-oriented network applications; Single parity drive does not support multiple, simultaneous read and write requests
RAID 4
Block-level data striping with dedicated parity drive.
• Minimum number of drives: 3 (Not widely used)
• Strengths: Data striping supports multiple simultaneous read requests.
• Weaknesses: Write requests suffer from the same single parity-drive bottleneck as RAID 3; RAID 5 offers equal data protection and better performance at the same cost.
RAID 5
Block-level data striping with distributed parity.
• Minimum number of drives: 3
• Strengths: Best cost/performance for transaction-oriented networks; Very high performance, very high data protection; Supports multiple simultaneous reads and writes; Can also be optimized for large, sequential requests.
• Weaknesses: Write performance is slower than RAID 0 or RAID 1.
DRIVE 1 DRIVE 2 DRIVE 3
Parity A Data A Data A
Data B Parity B Data B
Data C Data C Parity C
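The rotating parity position in the layout above can be sketched as follows. This is a simplified model under stated assumptions: parity is computed with XOR (the operation real arrays use), the parity block of row r is placed on drive r modulo the number of drives to match the diagram, and the names `xor_blocks`, `raid5_rows` and `rebuild` are made up for the example.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def raid5_rows(data_blocks, drives):
    """Lay data blocks out in rows, one parity block per row, rotating drives."""
    per_row = drives - 1
    if len(data_blocks) % per_row:
        raise ValueError("this sketch needs a whole number of rows")
    rows = []
    for start in range(0, len(data_blocks), per_row):
        chunk = data_blocks[start:start + per_row]
        parity_drive = (start // per_row) % drives
        row = list(chunk)
        row.insert(parity_drive, xor_blocks(chunk))
        rows.append(row)
    return rows


def rebuild(row, failed_drive):
    """Recover the block on the failed drive from the surviving blocks."""
    survivors = [block for i, block in enumerate(row) if i != failed_drive]
    return xor_blocks(survivors)


rows = raid5_rows([b"A1", b"A2", b"B1", b"B2", b"C1", b"C2"], drives=3)
for row in rows:
    print(row)
print(rebuild(rows[1], failed_drive=0))  # recovers the block lost on drive 1
```

Since no single drive holds all the parity, writes to different rows can update parity on different drives, which is what removes the dedicated parity-drive bottleneck of RAID 3 and RAID 4.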
RAID 01 (0+1) AND RAID 10 (1+0)
Combination of RAID 0 (data striping) and RAID 1 (mirroring). RAID 01 (0+1) is a mirrored configuration of two striped sets (mirror of stripes); RAID 10 (1+0) is a stripe across a number of mirrored sets (stripe of mirrors). RAID 10 provides better fault tolerance and rebuild performance than RAID 01. Both array types provide very good to excellent overall performance by combining the speed of RAID 0 with the redundancy of RAID 1 without requiring parity calculations.
• Minimum number of drives: 4
• Strengths: Highest performance, highest data protection (can tolerate multiple drive failures).
• Weaknesses: High redundancy cost overhead; Because all data is duplicated, twice the storage capacity is required; Requires minimum of four drives
RAID 01 (0+1 mirror of stripes)
DRIVE 1 DRIVE 2 DRIVE 3 DRIVE 4
Data A Data A mA mA
Data B Data B mB mB
Data C Data C mC mC
Original Data Original Data Mirrored Data Mirrored Data
RAID 10 (1+0 stripe of mirrors)
DRIVE 1 DRIVE 2 DRIVE 3 DRIVE 4
Data A mA Data B mB
Data C mC Data D mD
Data E mE Data F mF
Original Data Mirrored Data Original Data Mirrored Data
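The claim that RAID 10 tolerates failures better than RAID 01 can be checked by brute force on the two four-drive layouts above. The sketch below counts how many two-drive failures each layout survives. It assumes, as is typical, that in RAID 01 one failed drive takes its whole striped set offline; the function names are illustrative.

```python
from itertools import combinations


def raid10_survives(failed, mirror_pairs):
    """RAID 10 survives unless some mirror pair has lost both of its drives."""
    return all(not set(pair) <= failed for pair in mirror_pairs)


def raid01_survives(failed, stripe_sets):
    """RAID 01 survives only while at least one striped set is fully intact."""
    return any(not (set(stripe) & failed) for stripe in stripe_sets)


def count_survivable(survives, groups, drives=(1, 2, 3, 4), failures=2):
    """Count the failure combinations that the given layout survives."""
    return sum(
        1
        for combo in combinations(drives, failures)
        if survives(set(combo), groups)
    )


# Drives 1+2 and 3+4 are the mirror pairs (RAID 10) or the striped sets (RAID 01).
groups = [(1, 2), (3, 4)]
print("RAID 10:", count_survivable(raid10_survives, groups), "of 6")
print("RAID 01:", count_survivable(raid01_survives, groups), "of 6")
```

Both layouts always survive a single failure, but of the six possible double failures RAID 10 survives four and RAID 01 only two under this assumption.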
TYPES OF RAID
There are three primary array implementations: software-based arrays, bus-based array adapters/controllers, and subsystem-based external array controllers. As with the various RAID levels, no one implementation is clearly better than another -- although software-based arrays are rapidly losing favor as high-performance, low-cost array adapters become increasingly available. Each array solution meets different server and network requirements, depending on the number of users, applications, and storage requirements.
It is important to note that all RAID code is based on software. The difference among the solutions is where that software code is executed -- on the host CPU (software-based arrays) or offloaded to an on-board processor (bus-based and external array controllers).
Software-based RAID
Description: Primarily used with entry-level servers, software-based arrays rely on a standard host adapter and execute all I/O commands and mathematically intensive RAID algorithms in the host server CPU. This can slow system performance by increasing host PCI bus traffic, CPU utilization, and CPU interrupts. Some NOSs such as NetWare and Windows NT include embedded RAID software. The chief advantage of this embedded RAID software has been its lower cost compared to higher-priced RAID alternatives. However, this advantage is disappearing with the advent of lower-cost, bus-based array adapters.
Advantages:
• Low price
• Only requires a standard controller.
Hardware-based RAID
Description: Unlike software-based arrays, bus-based array adapters/controllers plug into a host bus slot [typically a 133 MByte (MB)/sec PCI bus] and offload some or all of the I/O commands and RAID operations to one or more secondary processors. Originally used only with mid- to high-end servers due to cost, lower-cost bus-based array adapters are now available specifically for entry-level server network applications.
In addition to offering the fault-tolerant benefits of RAID, bus-based array adapters/controllers perform connectivity functions that are similar to standard host adapters. By residing directly on a host PCI bus, they provide the highest performance of all array types. Bus-based arrays also deliver more robust fault-tolerant features than embedded NOS RAID software.
As newer, high-end technologies such as Fibre Channel become readily available, the performance advantage of bus-based arrays compared to external array controller solutions may diminish.
Advantages:
• Data protection and performance benefits of RAID
• More robust fault-tolerant features and increased performance versus software-based RAID.
External Hardware RAID Card
Description: Intelligent external array controllers "bridge" between one or more server I/O interfaces and single- or multiple-device channels. These controllers feature an on-board microprocessor, which provides high performance and handles functions such as executing RAID software code and supporting data caching.
External array controllers offer complete operating system independence, the highest availability, and the ability to scale storage to extraordinarily large capacities (up to a terabyte and beyond). These controllers are usually installed in networks of stand-alone Intel-based and UNIX-based servers as well as clustered server environments.
Advantages:
• OS independent
• Build super high-capacity storage systems for high-end servers.
SERVER TECHNOLOGY COMPARISON
UDMA
• Best suited for: Low-cost entry-level server with limited expandability
• Advantages: Uses low-cost ATA drives
SCSI
• Best suited for: Low to high-end server when scalability is desired
• Advantages: Performance up to 160 MB/s; Reliability; Connectivity to the largest variety of peripherals; Expandability
Fibre Channel
• Best suited for: Server-to-server campus networks
• Advantages: Performance up to 100 MB/s; Dual active loop data path capability; Infinitely scalable
PARITY
The concept behind RAID is relatively simple. The fundamental premise is to be able to recover data on-line in the event of a disk failure by using a form of redundancy called parity. In its simplest form, parity can be thought of as the sum of the data held on all the drives in an array. Recovery from a drive failure is achieved by reading the remaining good data and combining it with the parity data stored by the array to work out what is missing. Parity is used by RAID levels 2, 3, 4, and 5. RAID 1 does not use parity because all data is completely duplicated (mirrored). RAID 0, used only to increase performance, offers no data redundancy at all.
A + B + C + D = PARITY
1 + 2 + 3 + 4 = 10
1 + 2 + X + 4 = 10 (X = missing data)
7 + X = 10
X = 10 - 7 = 3 (recovered data)
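The arithmetic above translates directly into code. The sketch below first reproduces the addition example, then does the same recovery with XOR, which is the operation arrays actually use for parity because it works bit by bit and never overflows. The function names are chosen for this example only.

```python
from functools import reduce
from operator import xor


def sum_parity(values):
    """Parity as in the worked example: the plain sum of all values."""
    return sum(values)


def recover_from_sum(survivors, parity):
    """Missing value = parity minus everything that can still be read."""
    return parity - sum(survivors)


def xor_parity(values):
    """Parity computed with XOR instead of addition."""
    return reduce(xor, values, 0)


def recover_from_xor(survivors, parity):
    """XOR the parity with the surviving values to get the missing one back."""
    return reduce(xor, survivors, parity)


data = [1, 2, 3, 4]
survivors = [1, 2, 4]  # the drive that held 3 has failed

print(recover_from_sum(survivors, sum_parity(data)))  # 3
print(recover_from_xor(survivors, xor_parity(data)))  # 3
```

Either way, one parity value can stand in for exactly one missing drive; if two drives fail at once there are two unknowns and a single equation, so the data cannot be recovered.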
FAULT TOLERANCE
RAID technology does not prevent drive failures. However, RAID does provide insurance against disk drive failures by enabling real-time data recovery without data loss.
The fault tolerance of arrays can also be significantly enhanced by choosing the right storage enclosure. Enclosures that feature redundant, hot-swappable drives, power supplies, and fans can greatly increase storage subsystem uptime based on a number of widely accepted measures:
• MTDL:
Mean Time to Data Loss. The average time before the failure of an array component causes data to be lost or corrupted.
• MTDA:
Mean Time between Data Access (or availability). The average time before non-redundant components fail, causing data inaccessibility without loss or corruption.
• MTTR:
Mean Time To Repair. The average time required to bring an array storage subsystem back to full fault tolerance.
• MTBF:
Mean Time Between Failure. Used to measure computer component average reliability/life expectancy. MTBF is not as well-suited for measuring the reliability of array storage systems as MTDL, MTTR or MTDA (see above) because it does not account for an array's ability to recover from a drive failure. In addition, enhanced enclosure environments used with arrays to increase uptime can further limit the applicability of MTBF ratings for array solutions.
Friday, March 12, 2010
Scenarios
Scenario-1:
You have installed Windows with Service Pack 2 and, after updating to Service Pack 3, you are able to log in to the system but receive a continuous message that it is not a genuine copy of Windows. What solutions are available for this problem, both legal and illegal?
Illegal
The Windows Genuine Notification popped up because Windows Update installed it again via Automatic Updates. It pops up when a user logs in to Windows, displays a message near the system tray, and keeps reminding you during work that the copy of Windows is not genuine. Since its first release, even genuine users have reported getting this prompt, so Microsoft itself has released instructions for its removal. When I searched Google about this issue, I landed on pages offering many removal methods, including ones that patch existing files with cracked versions. I would highly recommend avoiding those, as they might contain malicious code and get you into more trouble.
I found this method for removing the Windows Genuine Notification:
1. Launch Windows Task Manager.
2. End wgatray.exe process in Task Manager.
3. Restart Windows XP in Safe Mode.
4. Delete WgaTray.exe from C:\Windows\System32.
5. Delete WgaTray.exe from C:\Windows\System32\dllcache.
6. Launch RegEdit.
7. Browse to the following location: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon\Notify
8. Delete the folder ‘WgaLogon’ and all its contents
9. Reboot Windows XP.
But the latest version of the WGN tool is a little tricky to handle. It pops up again as soon as you end it from Task Manager, and while it is running in memory you can’t delete it either.
Illegal
Download a patch from the Internet and run it on your Windows installation.
Legal
Register your copy of Windows through the official Microsoft website.
Scenario-2:
You downloaded Windows 7 from the official Microsoft website in December 2009, and at present your system reboots every 2 hours. What solutions are available to overcome this problem, legal or illegal?
Ans:
If you have a warm fuzzy feeling inside when thinking about Microsoft and their decision to let you play with their new OS for free until August next year, get ready for the kicker. From March, the release candidate you are running is going to start reminding you, in the most intrusive way possible, that a commercial copy of the OS needs to be purchased to continue enjoying the benefits of Windows 7.
You can understand Microsoft wanting to remind users that they need to buy Windows 7, but it’s the method they have decided to employ that is going to annoy and frustrate users. From March 2010 Windows 7 RC will start automatically rebooting your PC every two hours. So, if you happen to be doing something important you’ll have to stop as the friendly “buy me!” shutdown reminder is invoked.
For the RC, bi-hourly shutdowns will begin on March 1st, 2010. You will be alerted to install a released version of Windows and your PC will shut down automatically every 2 hours. On June 1st, 2010 if you are still on the Windows 7 RC your license for the Windows 7 RC will expire and the non-genuine experience is triggered where your wallpaper is removed and “This copy of Windows is not genuine” will be displayed in the lower right corner above the taskbar.
This isn’t a new tactic: Microsoft did the same thing with the Vista previews to remind users they need to upgrade. Windows 7 is expected to release in October this year, but at the very latest will be out by January next year, giving you plenty of time to buy a copy before the automatic shutdowns begin.
Thursday, March 11, 2010
file systems
FAT and NTFS file systems ...
NTFS (New Technology File System) is the standard file system of Windows NT, including its later versions Windows 2000, Windows XP, Windows Server 2003, Windows Server 2008, Windows Vista, and Windows 7.
NTFS supersedes the FAT file system as the preferred file system for Microsoft’s Windows operating systems. NTFS has several improvements over FAT and HPFS (High Performance File System) such as improved support for metadata and the use of advanced data structures to improve performance, reliability, and disk space utilization, plus additional extensions such as security access control lists (ACL) and file system journaling.
History
In the mid 1980s, Microsoft and IBM formed a joint project to create the next generation graphical operating system. The result of the project was OS/2, but eventually Microsoft and IBM disagreed on many important issues and separated. OS/2 remained an IBM project. Microsoft started to work on Windows NT. The OS/2 filesystem HPFS contained several important new features. When Microsoft created their new operating system, they borrowed many of these concepts for NTFS. Probably as a result of this common ancestry, HPFS and NTFS share the same disk partition identification type code (07). Sharing an ID is unusual since there were dozens of available codes, and other major filesystems have their own code. FAT has more than nine (one each for FAT12, FAT16, FAT32, etc.). Algorithms which identify the filesystem in a partition type 07 must perform additional checks. It is also clear that NTFS owes some of its architectural design to Files-11 used by VMS. This is hardly surprising since Dave Cutler was the main lead for both VMS and Windows NT.
Versions
NTFS has five released versions:
• v1.0 with NT 3.1 released mid-1993
• v1.1 with NT 3.5 released fall 1994
• v1.2 with NT 3.51 (mid-1995) and NT 4 (mid-1996) (occasionally referred to as "NTFS 4.0", because OS version is 4.0)
• v3.0 from Windows 2000 ("NTFS V5.0")
• v3.1 from Windows XP (autumn 2001; "NTFS V5.1"), Windows Server 2003 (spring 2003; occasionally "NTFS V5.2"), Windows Vista (early 2007; occasionally "NTFS V6.0"), Windows Server 2008, Windows 7.
V1.0 and V1.1 (and newer) are incompatible: that is, volumes written by NT 3.5x cannot be read by NT 3.1 until an update on the NT 3.5x CD is applied to NT 3.1, which also adds FAT long file name support. V1.2 supports compressed files, named streams, ACL-based security, etc. V3.0 added disk quotas, encryption, sparse files, reparse points, update sequence number (USN) journaling, the $Extend folder and its files, and reorganized security descriptors so that multiple files which use the same security setting can share the same descriptor. V3.1 expanded the Master File Table (MFT) entries with redundant MFT record number (useful for recovering damaged MFT files).
Windows Vista introduced Transactional NTFS, NTFS symbolic links, partition shrinking and self-healing functionality, though these features owe more to additional functionality of the operating system than to the file system itself.
Features
NTFS v3.0 includes several new features over its predecessors: sparse file support, disk usage quotas, reparse points, distributed link tracking, and file-level encryption, also known as the Encrypting File System (EFS).
USN Journal
The USN Journal (Update Sequence Number Journal) is a system management feature that records changes to all files, streams and directories on the volume, as well as their various attributes and security settings.
It is a critical function of NTFS (one that FAT/FAT32 does not provide) for ensuring that the file system's complex internal data structures and its indices (for directories and security descriptors) remain consistent after a system crash, and for allowing easy rollback of uncommitted changes to these critical structures when the volume is remounted. The structures in question notably include the volume allocation bitmap, data moves performed by the defragmentation API, modifications to MFT records (such as moves of some variable-length attributes stored in MFT records and attribute lists), updates to the shared security descriptors, and the boot sector and its local mirrors, where the last USN transaction committed on the volume is stored.
In later versions of Windows, the USN journal has been extended to trace the state of other transactional operations on other parts of the NTFS filesystem, such as the VSS shadow copies of system files with copy-on-write semantics, or the implementation of Transactional NTFS and of distributed filesystems (see below).
Hard links and short filenames
Originally included to support the POSIX subsystem in Windows NT, hard links are similar to directory junctions, but used for files instead of directories. Hard links can only be applied to files on the same volume since an additional filename record is added to the file's MFT record. Short (8.3) filenames are also implemented as additional filename records that don't have separate directory entries. Hard links also have the behavior that changing the size or attributes of a file may not update the directory entries of other links until they are opened.
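A hard link is easy to try out from a script. The sketch below uses Python's standard `os.link`, which creates hard links on NTFS under Windows as well as on POSIX file systems; the file names and the helper `demo_hard_link` are made up for the example.

```python
import os
import tempfile


def demo_hard_link(directory):
    """Create a file, hard-link it, edit through the link, read the original."""
    original = os.path.join(directory, "report.txt")
    alias = os.path.join(directory, "report-link.txt")

    with open(original, "w", encoding="utf-8") as handle:
        handle.write("first version")

    # Adds a second name for the same file; both must be on the same volume.
    os.link(original, alias)

    with open(alias, "a", encoding="utf-8") as handle:
        handle.write(" + edit via link")

    with open(original, encoding="utf-8") as handle:
        content = handle.read()

    link_count = os.stat(original).st_nlink
    same_file = os.path.samefile(original, alias)
    return content, link_count, same_file


with tempfile.TemporaryDirectory() as workdir:
    print(demo_hard_link(workdir))
```

The edit made through the second name shows up under the first because there is only one file with two names. Deleting either name leaves the data reachable through the other.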
Alternate data streams (ADS)
Alternate data streams allow more than one data stream to be associated with a filename, using the filename format "filename:streamname" (e.g., "text.txt:extrastream"). Alternate streams are not listed in Windows Explorer, and their size is not included in the file's size. Only the main stream of a file is preserved when it is copied to a FAT-formatted USB drive, attached to an e-mail, or uploaded to a website. As a result, using alternate streams for critical data may cause problems. NTFS Streams were introduced in Windows NT 3.1, to enable Services for Macintosh (SFM) to store Macintosh resource forks. Although current versions of Windows Server no longer include SFM, third-party Apple Filing Protocol (AFP) products (such as Group Logic's ExtremeZ-IP) still use this feature of the file system.
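The "filename:streamname" form can be used with ordinary file calls. The following is a minimal sketch that only behaves as described on an NTFS volume under Windows; on other file systems the colon is simply part of a separate, ordinary file name. The helper names `write_stream` and `read_stream` are assumptions for the example.

```python
import os
import tempfile


def write_stream(path, stream_name, text):
    """Write text to a named stream using the filename:streamname form."""
    with open(f"{path}:{stream_name}", "w", encoding="utf-8") as handle:
        handle.write(text)


def read_stream(path, stream_name):
    """Read text back from a named stream of the file."""
    with open(f"{path}:{stream_name}", encoding="utf-8") as handle:
        return handle.read()


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "text.txt")
    with open(target, "w", encoding="utf-8") as handle:
        handle.write("main stream")

    write_stream(target, "extrastream", "hidden note")

    print(read_stream(target, "extrastream"))  # contents of the extra stream
    print(os.path.getsize(target))             # size of the main stream only
```

The reported file size stays at the size of the main stream, which matches the point above that stream sizes are not included in the file's size.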
Malware has used alternate data streams to hide its code; some malware scanners and other special tools now check for data in alternate streams. Microsoft provides a tool called Streams to allow users to view streams on a selected volume.
Very small ADS are also added within Internet Explorer (and now also other browsers) to mark files that have been downloaded from external sites: they may be unsafe to run locally and the local shell will require confirmation from the user before opening them. When the user indicates that he no longer wants this confirmation dialog, this ADS is simply dropped from the MFT entry for downloaded files.
Some media players have also tried to use ADS to store custom metadata for media files, in order to organize collections without modifying the effective data content of the media files themselves (as happens with embedded tags, where the media file format supports them, such as MPEG and OGG containers). This metadata may be displayed in Windows Explorer as extra information columns, with the help of a registered Windows Shell extension that can parse it. However, most media players prefer to use their own separate database instead of ADS for storing this information, notably because ADS are visible to all users of these files, instead of being managed with distinct per-user security settings and having their values defined according to user preferences.
Sparse files
Sparse files are files which contain sparse data sets, data mostly filled with zeros. Database applications, for instance, sometimes use sparse files. Because of this, Microsoft has implemented support for efficient storage of sparse files by allowing an application to specify regions of empty (zero) data. An application that reads a sparse file reads it in the normal manner with the file system calculating what data should be returned based upon the file offset. As with compressed files, the actual sizes of sparse files are not taken into account when determining quota limits.
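The idea that the file system computes what to return from the file offset can be sketched with a toy in-memory model. This is an illustration only, not how NTFS stores sparse runs on disk; the class name SparseFileModel and the 4096-byte block size are assumptions.

```python
class SparseFileModel:
    """Toy model: only blocks holding non-zero data are actually stored."""

    def __init__(self, block_size: int = 4096):
        self.block_size = block_size
        self.blocks = {}   # block index -> bytes of length block_size
        self.length = 0    # logical file size in bytes

    def write(self, offset: int, data: bytes) -> None:
        pos, remaining = offset, bytes(data)
        while remaining:
            index, start = divmod(pos, self.block_size)
            chunk = remaining[: self.block_size - start]
            block = bytearray(self.blocks.get(index, bytes(self.block_size)))
            block[start:start + len(chunk)] = chunk
            if any(block):
                self.blocks[index] = bytes(block)
            else:
                self.blocks.pop(index, None)  # all zeros: leave a hole
            pos += len(chunk)
            remaining = remaining[len(chunk):]
        self.length = max(self.length, offset + len(data))

    def read(self, offset: int, size: int) -> bytes:
        end = min(offset + size, self.length)
        out = bytearray()
        pos = offset
        while pos < end:
            index, start = divmod(pos, self.block_size)
            take = min(self.block_size - start, end - pos)
            block = self.blocks.get(index)
            # A missing block is a hole, so zeros are synthesized on the fly.
            out += block[start:start + take] if block else bytes(take)
            pos += take
        return bytes(out)

    def allocated_bytes(self) -> int:
        return len(self.blocks) * self.block_size
```

Writing three bytes at offset 1,000,000 gives a logical size of 1,000,003 bytes while only one block is allocated; reads from the untouched region return zeros.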
File compression
NTFS compresses files using a variant of the LZ77 algorithm. Although read–write access to compressed files is transparent, Microsoft recommends avoiding compression on server systems and/or network shares holding roaming profiles because it puts a considerable load on the processor. Single-user systems with limited hard disk space can benefit from NTFS compression. The slowest link in a computer is not the CPU but the speed of the hard drive, so NTFS compression allows the limited, slow storage space to be better used, in terms of both space and (often) speed. NTFS compression can also serve as a replacement for sparse files when a program (e. g. a download manager) is not able to create files without content as sparse files.
Volume Shadow Copy
The Volume Shadow Copy Service (VSS) keeps historical versions of files and folders on NTFS volumes by copying old data that is about to be overwritten to a shadow copy (copy-on-write). The old file data is overlaid on the new when the user requests a revert to an earlier version. This also allows data backup programs to archive files currently in use by the file system. On heavily loaded systems, Microsoft recommends setting up a shadow copy volume on a separate disk. To ensure consistent recovery in case of system crashes, the VSS also uses the USN journal to mark local transactions, so that committed changes to system files are effectively recovered when the NTFS volume is remounted after a system restart, or are safely rolled back to an older version if the new version was not fully recorded before the modified file was closed. However, these VSS shadows are not coordinated globally across multiple files or volumes, except when using a transaction coordinator (see below). They can only be used to ensure that older versions remain accessible during backup operations, so that those backups contain consistent system images.
Transactional NTFS
As of Windows Vista, applications can use Transactional NTFS to group changes to files together into a transaction. The transaction guarantees that either all changes happen or none of them do, and that applications outside the transaction will not see the changes until the precise instant they are committed. It uses techniques similar to those used for Volume Shadow Copies (i.e. copy-on-write) to ensure that overwritten data can be safely rolled back, and the UFS journaling log to mark the transactions that have not yet been committed, or those that have been committed but not yet fully applied (in case of a system crash during a commit by one of the participants).
However, in a transaction-enabled file system, this technique can be used temporarily for any other file on any kind of partition, as long as the transaction is not committed, rather than only for system files that are permanently marked with copy-on-write semantics and that are implicitly modified within their own local transactions.
The copy-on-write technique is, however, modified in order to allow efficient rollbacks and to avoid fragmenting a file system used by possibly many participants: the old data may not be overwritten immediately but kept where it is (notably when it is currently locked by someone else for consistent reads in its own transactions). In that case, only the new uncommitted data is kept in a temporary shadow (rather than the copy-on-write old data), which is finally applied using normal VSS copy-on-write when the writer commits the transaction. In addition, these temporary shadows for new data, seen only by the participating processes that have their own uncommitted data, are not necessarily written to disk immediately, but may just be maintained in memory or swapped out for later commits. Transactional NTFS does not restrict transactions to the local NTFS volume, but also includes other transactional data or operations in other locations, such as data stored in separate volumes, the local registry, or SQL databases, or the current states of system services or remote services.
These transactions are coordinated network-wide with all participants using a specific service, the Distributed Transactions Coordinator (DTC), to ensure that all participants receive the same commit state, and to transport the changes that have been validated by any participant (so that the others can invalidate their local caches for old data or roll back their ongoing uncommitted changes). Transactional NTFS allows, for example, the creation of network-wide consistent distributed filesystems, including their local live or offline caches.
Encrypting File System (EFS)
EFS provides strong and user-transparent encryption of any file or folder on an NTFS volume. EFS works in conjunction with the EFS service, Microsoft's CryptoAPI and the EFS File System Run-Time Library (FSRTL). EFS works by encrypting a file with a bulk symmetric key (also known as the File Encryption Key, or FEK), which is used because it takes considerably less time to encrypt and decrypt large amounts of data than an asymmetric key cipher would. The symmetric key that is used to encrypt the file is then encrypted with a public key that is associated with the user who encrypted the file, and this encrypted data is stored in an alternate data stream of the encrypted file. To decrypt the file, the file system uses the private key of the user to decrypt the symmetric key that is stored in the file header. It then uses the symmetric key to decrypt the file. Because this is done at the file system level, it is transparent to the user. Also, in case a user loses access to their key, support for additional decryption keys has been built into the EFS system, so that a recovery agent can still access the files if needed. NTFS-provided encryption and compression are mutually exclusive: NTFS can be used for one and a third-party tool for the other.
The support of EFS is not available in Basic, Home and MediaCenter versions of Windows, and must be activated after installation of Professional, Ultimate and Server versions of Windows or by using enterprise deployment tools within Windows domains.
Quotas
Disk quotas were introduced in NTFS v3. They allow the administrator of a computer that runs a version of Windows that supports NTFS to set a threshold of disk space that users may use. It also allows administrators to keep track of how much disk space each user is using. An administrator may specify a certain level of disk space that a user may use before they receive a warning, and then deny access to the user once they hit their upper limit of space. Disk quotas do not take into account NTFS's transparent file-compression, should this be enabled. Applications that query the amount of free space will also see the amount of free space left to the user who has a quota applied to them.
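The warning level and upper limit described above can be sketched as a small decision function. This is a toy model and not the Windows quota API; the function names and the byte-based units are assumptions.

```python
def check_quota(used: int, requested: int, warning_level: int, limit: int) -> str:
    """Classify a request to allocate `requested` more bytes for one user."""
    projected = used + requested
    if projected > limit:
        return "deny"   # the upper limit would be exceeded
    if projected > warning_level:
        return "warn"   # allowed, but the user gets a warning
    return "ok"


def visible_free_space(volume_free: int, used: int, limit: int) -> int:
    """Free space as reported to a user who has a quota applied."""
    return max(0, min(volume_free, limit - used))
```

For example, with a 900 MB warning level and a 1,000 MB limit, a user at 850 MB who requests 100 MB more is warned, while a request for 200 MB is denied.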
The support of disk quotas is not available in Basic, Home and MediaCenter versions of Windows, and must be activated after installation of Professional, Ultimate and Server versions of Windows or by using enterprise deployment tools within Windows domains.
Reparse points
This feature was introduced in NTFS v3. Reparse points are used by associating a reparse tag in the user space attribute of a file or directory. When the object manager (see Windows NT line executive) parses a file system name lookup and encounters a reparse attribute, it knows to reparse the name lookup, passing the user-controlled reparse data to every file system filter driver that is loaded into Windows 2000. Each filter driver examines the reparse data to see whether it is associated with that reparse point, and if that filter driver determines a match, then it intercepts the file system call and executes its special functionality. Reparse points are used to implement Volume Mount Points, Directory Junctions, Hierarchical Storage Management, Native Structured Storage, Single Instance Storage, and Symbolic Links.
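The dispatch described above, where each loaded filter inspects the reparse data and the one that recognizes its tag takes over, can be sketched as follows. This is a conceptual model only: the class name, the string tags and the handler signatures are hypothetical, and real reparse tags are numeric values handled inside kernel drivers.

```python
from typing import Callable, List, Optional, Tuple


class ReparseDispatcher:
    """Toy model of filter drivers claiming reparse points by tag."""

    def __init__(self) -> None:
        self._filters: List[Tuple[str, Callable[[str], str]]] = []

    def register(self, tag: str, handler: Callable[[str], str]) -> None:
        """Load a filter that owns the given reparse tag."""
        self._filters.append((tag, handler))

    def resolve(self, tag: str, reparse_data: str) -> Optional[str]:
        """Offer the reparse data to every filter; the first match handles it."""
        for owned_tag, handler in self._filters:
            if owned_tag == tag:
                return handler(reparse_data)
        return None  # no loaded filter recognizes this tag


dispatcher = ReparseDispatcher()
# Hypothetical tags and behaviors, chosen only for illustration.
dispatcher.register("JUNCTION", lambda target: f"redirect to {target}")
dispatcher.register("HSM", lambda location: f"recall from {location}")
```

A lookup that meets an unknown tag returns None here, standing in for the case where no filter driver claims the reparse point.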
Volume mount points
Volume mount points are similar to Unix mount points, where the root of another file system is attached to a directory. In NTFS, this allows additional file systems to be mounted without requiring a separate drive letter (such as C: or D:) for each.
Once a volume has been mounted on top of an existing directory of another volume, the contents previously listed in that directory become invisible and are replaced by the content of the root directory of the mounted volume. The mounted volume could still have its own drive letter assigned separately. The file system does not allow volumes to be mutually mounted on each other. Volume mount points can be made to be either persistent (remounted automatically after system reboot) or not persistent (must be manually remounted after reboot).
Mounted volumes may use other file systems than just NTFS; notably they may be remote shared directories, possibly with their own security settings and remapping of access rights according to the remote file system policy.
Directory junctions
Directory junctions are similar to volume mount points, but they reference other directories in the file system instead of other volumes. For instance, the directory C:\exampledir with a directory junction attribute that contains a link to D:\linkeddir will automatically refer to the directory D:\linkeddir when it is accessed by a user-mode application. This function is conceptually similar to symbolic links to directories in Unix, except that the target in NTFS must always be another directory (typical Unix file systems allow the target of a symbolic link to be any type of file) and that junctions have the semantics of a hardlink (i.e., they must be immediately resolvable when they are created).
Directory junctions (which can be created with the command MKLINK /J junctionName targetDirectory and removed with RMDIR junctionName from a console prompt) are persistent, and are resolved on the server side, as they share the same security realm of the local system or domain on which the parent volume is mounted and the same security settings for their contents as the content of the target directory; however, the junction itself may have distinct security settings. Unlinking a directory junction does not delete files in the target directory.
Note that some directory junctions are installed by default on Windows Vista, for compatibility with previous versions of Windows, such as Documents and Settings in the root directory of the system drive, which links to the Users physical directory in the root directory of the same volume. However, they are hidden by default, and their security settings are set up so that Windows Explorer will refuse to open them from within the Shell or in most applications, except for the local built-in SYSTEM user or the local Administrators group (both user accounts are used by system software installers).
This additional security restriction has probably been made to prevent users from finding apparent duplicate files in the joined directories and deleting them by mistake, because the semantics of directory junctions are not the same as those of hardlinks: reference counting is not used on the target contents, nor even on the referenced container itself.
Directory junctions are soft links (they will persist even if the target directory is removed), working as a limited form of symbolic links (with an additional restriction on the location of the target). They are, however, an optimized form that allows faster processing of the reparse point with which they are implemented, with less overhead than the newer NTFS symbolic links, and they can be resolved on the server side (when they are found in remote shared directories).
Symbolic links
Symbolic links (or soft links) were introduced in Windows Vista. Symbolic links are resolved on the client side, so when a symbolic link is shared, the target is subject to the access restrictions on the client, and not the server.
Symbolic links can be created either to files (with MKLINK symLink targetFilename) or to directories (with MKLINK /D symLinkD targetDirectory), but the type of the link (file or directory) must be specified when the link is created. The target, however, need not exist or be available when the symbolic link is created: when the symbolic link is accessed and the target is checked for availability, NTFS also checks whether it has the correct type (file or directory), and returns a not-found error if the existing target has the wrong type.
They can also reference shared directories on remote hosts, or files and subdirectories within shared directories: their target is not mounted immediately at boot, but only temporarily on demand while opening them with the OpenFile() or CreateFile() API. Their definition is persistent on the NTFS volume where they are created (all types of symbolic links can be removed as if they were files, using DEL symLink from a command line prompt or batch file).
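The point that a symbolic link's target need not exist when the link is created can be tried from Python with os.symlink. This is a sketch under assumptions: the file names are hypothetical, and on Windows creating symbolic links may require elevated privileges, so the call can fail there with an OSError.

```python
import os
import tempfile


def make_dangling_link(directory: str) -> str:
    """Create a symbolic link whose target does not exist yet."""
    target = os.path.join(directory, "not-yet-created.txt")  # hypothetical name
    link = os.path.join(directory, "symLink")
    # For a directory target on Windows, pass target_is_directory=True,
    # which mirrors the file/directory distinction made by MKLINK and MKLINK /D.
    os.symlink(target, link)
    return link


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        link = make_dangling_link(folder)
        print(os.path.islink(link))   # the link itself was created
        print(os.path.exists(link))   # following it fails: no target yet
        os.remove(link)               # removed like an ordinary file
```

os.path.islink inspects the link itself, while os.path.exists follows it, which is why the two checks disagree for a dangling link.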
Single Instance Storage (SIS)
When there are several directories that have different, but similar, files, some of these files may have identical content. Single instance storage allows identical files to be merged to one file and create references to that merged file. SIS consists of a file system filter that manages copies, modification and merges to files; and a user space service (or groveler) that searches for files that are identical and need merging. SIS was mainly designed for remote installation servers as these may have multiple installation images that contain many identical files; SIS allows these to be consolidated but, unlike for example hard links, each file remains distinct; changes to one copy of a file will leave others unaltered. This is similar to copy-on-write, which is a technique by which memory copying is not really done until one copy is modified.
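The search step performed by the groveler, finding files with identical content, can be sketched by grouping files on a content hash. This is only an illustration of the idea: using SHA-256 and reading whole files into memory are assumptions made here, not a description of how SIS itself compares files.

```python
import hashlib
import os
import tempfile
from collections import defaultdict
from typing import List


def find_identical_files(root: str) -> List[List[str]]:
    """Return groups of paths under `root` whose contents are byte-identical."""
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            by_digest[digest].append(path)
    # Only groups with more than one member are candidates for merging.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for folder, name, data in [("image1", "setup.bin", b"same"),
                                   ("image2", "setup.bin", b"same"),
                                   ("image2", "readme.txt", b"different")]:
            os.makedirs(os.path.join(root, folder), exist_ok=True)
            with open(os.path.join(root, folder, name), "wb") as handle:
                handle.write(data)
        print(find_identical_files(root))
```

A real merge step would then replace each group with references to one stored copy while keeping every file logically distinct, as the text describes.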
Hierarchical Storage Management (HSM)
Hierarchical Storage Management is a means of transferring files that are not used for some period of time to less expensive storage media. When the file is next accessed, the reparse point on that file determines that it is needed and retrieves it from storage.
Native Structured Storage (NSS)
NSS was an ActiveX document storage technology that has since been discontinued by Microsoft. It allowed ActiveX Documents to be stored in the same multi-stream format that ActiveX uses internally. An NSS file system filter was loaded and used to process the multiple streams transparently to the application, and when the file was transferred to a non-NTFS formatted disk volume it would also transfer the multiple streams into a single stream.
Interoperability
Details on the implementation's internals are not released, which makes it difficult for third-party vendors to provide tools to handle NTFS.
Linux
The ability to read and write to NTFS is provided by the NTFS-3G driver. It is included in most Linux distributions. Other outdated and mostly read-only solutions exist as well:
• Linux kernel 2.2: Kernel versions 2.2.0 and later include the ability to read NTFS partitions
• Linux kernel 2.6: Kernel versions 2.6.0 and later contain a driver written by Anton Altaparmakov (University of Cambridge) and Richard Russon. It supports file read, overwrite and resize.
• NTFSMount: A read/write userspace NTFS driver. It provides read-write access to NTFS, excluding writing compressed and encrypted files, changing file ownership, and access rights.
• Tuxera NTFS: High-performance read/write commercial kernel driver, mainly targeted for embedded devices from Tuxera Ltd which also develops the open source NTFS-3G driver.
• NTFS for Linux: A commercial driver with full read/write support available as free and non-free download(s) from Paragon Software Group.
• Captive NTFS: A 'wrapping' driver which uses Windows' own driver, ntfs.sys.
Note that all three userspace drivers, namely NTFSMount, NTFS-3G and Captive NTFS, are built on the Filesystem in Userspace (FUSE), a Linux kernel module tasked with bridging userspace and kernel code to save and retrieve data. All drivers listed above (except Tuxera NTFS and Paragon NTFS for Linux) are open source (GPL). Due to the complexity of internal NTFS structures, both the built-in 2.6.14 kernel driver and the FUSE drivers disallow changes to the volume that are considered unsafe, to avoid corruption.
Mac OS X
Mac OS X v10.3 and later include read-only support for NTFS-formatted partitions. The GPL-licensed NTFS-3G also works on Mac OS X through FUSE and allows reading and writing to NTFS partitions. A performance enhanced commercial version, called Tuxera NTFS for Mac, is also available from the NTFS-3G developers. NTFS write support has been discovered in Mac OS X 10.6, but has not been activated as of version 10.6.1, although hacks do exist to enable the functionality.
Microsoft Windows
While the different NTFS versions are for the most part fully forward- and backward-compatible, there are technical considerations for mounting newer NTFS volumes in older versions of Microsoft Windows. This affects dual-booting, and external portable hard drives.
For example, attempting to use an NTFS partition with "Previous Versions" (a.k.a. Volume Shadow Copy) on an operating system that doesn't support it, will result in the contents of those previous versions being lost.
Others
eComStation and FreeBSD offer read-only NTFS support (there is a beta NTFS driver that allows write/delete for eComStation, but it is generally considered unsafe). A free third-party tool for BeOS, which was based on NTFS-3G, allows full NTFS read and write. NTFS-3G also works on Mac OS X, FreeBSD, NetBSD, Solaris, QNX and Haiku, in addition to Linux, through FUSE. A free-for-personal-use read/write driver for MS-DOS called "NTFS4DOS" also exists.
Compatibility with FAT
Microsoft currently provides a tool (convert.exe) to convert HPFS (only on Windows NT 3), FAT16 and, on Windows 2000 and higher, FAT32 to NTFS, but not the other way around.
Resizing
Various third-party tools are all capable of safely resizing NTFS partitions. Microsoft added the ability to shrink or expand a partition with Windows Vista, but this capability is limited because it will not relocate page file fragments or files that have been marked as unmovable. So shrinking requires relocating or disabling any page file, the index of Windows Search, and any Shadow Copy used by System Restore. Using a 3rd-party tool is an easier option.
Universal time
For historical reasons, the versions of Windows that do not support NTFS all keep time internally as local zone time, and therefore so do all file systems other than NTFS that are supported by current versions of Windows. However, Windows NT and its descendants keep internal timestamps as UTC and make the appropriate conversions for display purposes. Therefore, NTFS timestamps are in UTC. This means that when files are copied or moved between NTFS and non-NTFS partitions, the OS needs to convert timestamps on the fly. But if some files are moved when daylight saving time (DST) is in effect, and other files are moved when standard time is in effect, there can be some ambiguities in the conversions. As a result, especially shortly after one of the days on which local zone time changes, users may observe that some files have timestamps that are incorrect by one hour. Due to the differences in implementation of DST between the northern and southern hemispheres, this can result in a potential timestamp error of up to 4 hours in any given 12 months.
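The one-hour discrepancy can be reproduced with a simplified model in which each conversion uses whatever offset is in force at the moment of the copy. The zone below (UTC+0 in winter, UTC+1 in summer) and the helper names are hypothetical, and the actual conversion rules used by Windows may differ in detail.

```python
from datetime import datetime, timedelta, timezone

STANDARD = timezone(timedelta(hours=0))  # hypothetical zone, standard time
DAYLIGHT = timezone(timedelta(hours=1))  # same zone while DST is in effect


def utc_to_local_field(ts_utc: datetime, offset_in_force: timezone) -> datetime:
    """Store a UTC timestamp as bare local time, as a non-NTFS volume would."""
    return ts_utc.astimezone(offset_in_force).replace(tzinfo=None)


def local_field_to_utc(ts_local: datetime, offset_in_force: timezone) -> datetime:
    """Interpret a bare local time using the offset currently in force."""
    return ts_local.replace(tzinfo=offset_in_force).astimezone(timezone.utc)


original = datetime(2009, 7, 1, 12, 0, tzinfo=timezone.utc)   # NTFS timestamp
stored = utc_to_local_field(original, DAYLIGHT)               # copied out in summer
restored = local_field_to_utc(stored, STANDARD)               # copied back in winter
drift = restored - original                                   # one hour off
```

Converting back with the same offset that was used on the way out recovers the original timestamp; the error appears only when the two copies straddle a DST change.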
Internals
In NTFS, all file data—file name, creation date, access permissions, and contents—are stored as metadata in the Master File Table. This abstract approach allowed easy addition of file system features during Windows NT's development — an interesting example is the addition of fields for indexing used by the Active Directory software.
NTFS allows any sequence of 16-bit values for name encoding (file names, stream names, index names, etc.). This means UTF-16 codepoints are supported, but the file system does not check whether a sequence is valid UTF-16 (it allows any sequence of short values, not restricted to those in the Unicode standard).
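A short sketch of what "any sequence of 16-bit values" means in practice: a name containing an unpaired surrogate is acceptable as raw 16-bit units but is not valid UTF-16. The little-endian byte order and the helper names are assumptions made for the example.

```python
import struct


def is_valid_utf16(raw: bytes) -> bool:
    """True if the 16-bit units form well-formed UTF-16 (little-endian)."""
    try:
        raw.decode("utf-16-le")
        return True
    except UnicodeDecodeError:
        return False


def decode_name_leniently(raw: bytes) -> str:
    """Decode a name while letting unpaired surrogates pass through."""
    return raw.decode("utf-16-le", errors="surrogatepass")


well_formed = "report.txt".encode("utf-16-le")
# Three 16-bit units: 'a', an unpaired high surrogate, 'b'.
lone_surrogate = struct.pack("<3H", 0x0061, 0xD800, 0x0062)
```

Tools that assume every name is valid UTF-16 can fail on the second name, which is why lenient decoding (here Python's surrogatepass error handler) is sometimes needed when listing such volumes.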
Internally, NTFS uses B+ trees to index file system data. Although complex to implement, this allows faster file look up times in most cases. A file system journal is used to guarantee the integrity of the file system metadata but not individual files' content. Systems using NTFS are known to have improved reliability compared to FAT file systems. The Master File Table (MFT) contains metadata about every file, directory, and metafile on an NTFS volume. It includes filenames, locations, size, and permissions. Its structure supports algorithms which minimize disk fragmentation. A directory entry consists of a filename and a "file ID" which is the record number representing the file in the Master File Table. The file ID also contains a reuse count to detect stale references. While this strongly resembles the W_FID of Files-11, other NTFS structures radically differ.
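The pairing of a record number with a reuse count can be sketched as bit packing into a single 64-bit value. The split used below, 48 bits for the MFT record number and 16 bits for the reuse count, is stated here as an assumption, and the function names are hypothetical.

```python
RECORD_BITS = 48                       # assumed width of the MFT record number
RECORD_MASK = (1 << RECORD_BITS) - 1
REUSE_MAX = 0xFFFF                     # assumed 16-bit reuse count


def pack_file_id(record_number: int, reuse_count: int) -> int:
    """Combine an MFT record number and its reuse count into one file ID."""
    if not 0 <= record_number <= RECORD_MASK:
        raise ValueError("record number out of range")
    if not 0 <= reuse_count <= REUSE_MAX:
        raise ValueError("reuse count out of range")
    return (reuse_count << RECORD_BITS) | record_number


def unpack_file_id(file_id: int) -> tuple:
    """Split a file ID into (record_number, reuse_count)."""
    return file_id & RECORD_MASK, file_id >> RECORD_BITS


def is_stale(file_id: int, current_reuse_count: int) -> bool:
    """A reference is stale if the record has been reused since it was made."""
    return unpack_file_id(file_id)[1] != current_reuse_count
```

If record 27 was referenced while its reuse count was 3 and the record has since been freed and reallocated (count 4), the old reference no longer matches and can be rejected.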
Metafiles
NTFS contains several files which define and organize the file system. In all respects, most of these files are structured like any other user file ($Volume being the most peculiar), but are not of direct interest to file system clients. These metafiles define files, back up critical file system data, buffer file system changes, manage free space allocation, satisfy BIOS expectations, track bad allocation units, and store security and disk space usage information. All content is in an unnamed data stream, unless otherwise indicated.
Segment Number | File Name | Purpose
0 | $MFT | Describes all files on the volume, including file names, timestamps, stream names, and lists of cluster numbers where data streams reside, indexes, security identifiers, and file attributes like "read only", "compressed", "encrypted", etc.
1 | $MFTMirr | Duplicate of the first vital entries of $MFT, usually 4 entries (4 KB).
2 | $LogFile | Contains transaction log of file system metadata changes.
3 | $Volume | Contains information about the volume, namely the volume object identifier, volume label, file system version, and volume flags (mounted, chkdsk requested, requested $LogFile resize, mounted on NT 4, volume serial number updating, structure upgrade request). This data is not stored in a data stream, but in special MFT attributes: if present, a volume object ID is stored in an $OBJECT_ID record; the volume label is stored in a $VOLUME_NAME record, and the remaining volume data is in a $VOLUME_INFORMATION record. Note: the volume serial number is stored in file $Boot (below).
4 | $AttrDef | A table of MFT attributes which associates numeric identifiers with names.
5 | . | Root directory. Directory data is stored in $INDEX_ROOT and $INDEX_ALLOCATION attributes, both named $I30.
6 | $Bitmap | An array of bit entries: each bit indicates whether its corresponding cluster is used (allocated) or free (available for allocation).
7 | $Boot | Volume boot record. This file is always located at the first clusters on the volume. It contains bootstrap code (see NTLDR/BOOTMGR) and a BIOS parameter block including a volume serial number and cluster numbers of $MFT and $MFTMirr. $Boot is usually 8192 bytes long.
8 | $BadClus | A file which contains all the clusters marked as having bad sectors. This file simplifies cluster management by the chkdsk utility, both as a place to put newly discovered bad sectors, and for identifying unreferenced clusters. This file contains two data streams, even on volumes with no bad sectors: an unnamed stream contains bad sectors (it is zero length for perfect volumes); the second stream is named $Bad and contains all clusters on the volume not in the first stream.
9 | $Secure | Access control list database which reduces the overhead of having many identical ACLs stored with each file, by storing each unique ACL only once in this database (contains two indices, $SII: perhaps Security ID Index, and $SDH: Security Descriptor Hash, which index the stream named $SDS containing the actual ACL table).
10 | $UpCase | A table of Unicode uppercase characters for ensuring case insensitivity in Win32 and DOS namespaces.
11 | $Extend | A filesystem directory containing various optional extensions, such as $Quota, $ObjId, $Reparse or $UsnJrnl.
12 ... 23 | | Reserved for $MFT extension entries.
usually 24 | $Extend\$Quota | Holds disk quota information. Contains two index roots, named $O and $Q.
usually 25 | $Extend\$ObjId | Holds distributed link tracking information. Contains an index root and allocation named $O.
usually 26 | $Extend\$Reparse | Holds reparse point data (such as symbolic links). Contains an index root and allocation named $R.
27 ... | file.ext | Beginning of regular file entries.
These metafiles are treated specially by Windows and are difficult to directly view: special purpose-built tools are needed.
From MFT records to attribute lists, attributes, and streams
For each file (or directory) described in the MFT record, there's a linear repository of stream descriptors (also named attributes), packed together in a variable-length record (also named an attributes list), with extra padding to fill the fixed 1KB size of every MFT record, and that fully describes the effective streams associated with that file.
Each stream (or attribute) itself has a single type (internally just a fixed-size integer in the stored descriptor, but most often handled in applications using an equivalent symbolic name in the FileOpen() or FileCreate() API call), a single optional stream name (completely unrelated to the effective filenames), plus optional associated data for that stream. For NTFS, the standard data of files, or the index data for directories are handled the same way as other data for alternate data streams, or for standard attributes. They are just one of the attributes stored in one or several attribute lists.
• For each file described in the MFT record (or in the non-resident repository of stream descriptors, see below), the stream descriptors identified by their (stream type value, stream name) pair must be unique. Additionally, NTFS has some ordering constraints for these descriptors.
• There's a predefined null stream type, used to indicate the end of the list of stream descriptors in the streams repository for that file. It must be present as the last stream descriptor in each stream repository (all other storage space available after it is ignored and consists only of padding bytes to match the record size in the MFT or a cluster size in a non-resident streams repository).
• Some stream types are required and must be present in each MFT record, except unused records that are just indicated by a stream with null stream type.
o This is the case for the standard attributes that are stored as a fixed-size record and containing the timestamps and other basic single-bit attributes (compatible with those managed by FAT/FAT32 in DOS or Windows 95/98 applications).
• Some stream types cannot have a name and must remain anonymous.
o This is the case for the standard attributes, or for the preferred NTFS "filename" stream type, or the "short filename" stream type, when it is also present (for compatibility with DOS-like applications, see below). It is also possible for a file to only contain a short filename, in which case it will be the preferred one, as listed in the Windows Explorer.
o The filename streams stored in the streams repository do not make the file immediately accessible through the hierarchical filesystem. In fact, all the filenames must be indexed separately in at least one separate directory on the same volume, with its own MFT entry and its own security descriptors and attributes, that will reference the MFT entry number for that file. This allows the same file or directory to be "hardlinked" several times from several containers on the same volume, possibly with distinct filenames.
• The default data stream of a regular file is a stream of type $DATA but with an anonymous name, and the ADS's are similar but must be named.
• Conversely, the default data stream of a directory has a distinct type but is not anonymous: it has a stream name ("$I30" in NTFS 3+) that reflects its indexing format.
Resident vs. non-resident data streams
To optimize storage and reduce I/O overhead for the very common case of streams with very small associated data, NTFS prefers to place this data within the stream descriptor itself, provided that the stream descriptor does not then exceed the maximum size of the MFT record or the maximum size of a single entry within a non-resident stream repository (see below). Otherwise, the MFT entry space is used to list the clusters containing the data; in that case, the stream descriptor does not store the data directly but just stores an allocation map pointing to the actual data stored elsewhere on the volume. When the stream data can be accessed directly from within the stream descriptor, it is called "resident data" by computer forensics workers. The amount of data which fits is highly dependent on the file's characteristics, but 700 to 800 bytes is common in single-stream files with non-lengthy filenames and no ACLs.
• Some stream descriptors (such as the preferred filename, the basic file attributes, or the main allocation map for each non-resident stream) cannot be made non-resident.
• Encrypted-by-NTFS, sparse data streams, or compressed data streams cannot be made resident.
• The format of the allocation map for non-resident streams depends on its capability of supporting sparse data storage. In the current implementation of NTFS, once a non-resident stream data has been marked and converted as sparse, it cannot be reverted to non-sparse data, so it cannot become resident again, unless this data is fully truncated, discarding the sparse allocation map completely.
• When a non-resident data stream is too fragmented, so that its effective allocation map cannot fit entirely within the MFT record, the allocation map may also be stored as a non-resident stream, with just a small resident stream containing the indirect allocation map to the effective non-resident allocation map of the non-resident data stream.
• When there are too many streams for a file (including ADS's, extended attributes, or security descriptors), so that their descriptors cannot fit all within the MFT record, a non-resident stream may also be used to store an additional repository for the other stream descriptors (except those few small streams that cannot be non-resident), using the same format as the one used in the MFT record, but without the space constraints of the MFT record.
The NTFS filesystem driver will sometimes attempt to relocate the data of some of these non-resident streams into the streams repository, and will also attempt to relocate the stream descriptors stored in a non-resident repository back to the stream repository of the MFT record, based on priority and preferred ordering rules, and size constraints.
Since resident files do not directly occupy clusters ("allocation units"), it is possible for an NTFS volume to contain more files than there are clusters. For example, an 80 GB (74.5 GiB) partition formats as NTFS with 19,543,064 clusters of 4 KB. Subtracting system files (64 MB log file, a 2,442,888-byte $Bitmap file, and about 25 clusters of fixed overhead) leaves 19,526,158 clusters free for files and indices. Since there are four MFT records per cluster, this volume theoretically could hold almost 4 × 19,526,158 = 78,104,632 resident files.
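The estimate above can be walked through step by step. The sketch below uses only the sizes stated in the text; rounding each system file up to whole clusters and rounding the bitmap to a multiple of 8 bytes are assumptions. Because the fixed overhead is only given as "about 25 clusters", the result lands close to, rather than exactly on, the quoted figure (within roughly a hundred clusters).

```python
import math

CLUSTER_BYTES = 4096
TOTAL_CLUSTERS = 19_543_064
LOG_FILE_BYTES = 64 * 1024 * 1024
FIXED_OVERHEAD_CLUSTERS = 25          # "about 25 clusters" in the text
MFT_RECORD_BYTES = 1024


def clusters_for(size_bytes: int) -> int:
    """Whole clusters needed to hold a file of the given size."""
    return math.ceil(size_bytes / CLUSTER_BYTES)


# One bit per cluster, rounded up to a multiple of 8 bytes (assumption).
bitmap_bytes = math.ceil(TOTAL_CLUSTERS / 64) * 8

overhead_clusters = (clusters_for(LOG_FILE_BYTES)
                     + clusters_for(bitmap_bytes)
                     + FIXED_OVERHEAD_CLUSTERS)
free_clusters = TOTAL_CLUSTERS - overhead_clusters
records_per_cluster = CLUSTER_BYTES // MFT_RECORD_BYTES
max_resident_files = free_clusters * records_per_cluster

if __name__ == "__main__":
    print(bitmap_bytes, overhead_clusters, free_clusters, max_resident_files)
```

The bitmap size computed this way matches the 2,442,888 bytes quoted in the text, and the four-records-per-cluster factor follows directly from 4 KB clusters and 1 KB MFT records.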
Limitations
The following are a few limitations of NTFS:
File Names
File names are limited to 255 UTF-16 code words. Certain names are reserved in the volume root directory and cannot be used for files. These are: $MFT, $MFTMirr, $LogFile, $Volume, $AttrDef, . (dot), $Bitmap, $Boot, $BadClus, $Secure, $Upcase, and $Extend; . (dot) and $Extend are both directories; the others are files. The NT kernel limits full paths to 32,767 UTF-16 code words.
Maximum Volume Size
In theory, the maximum NTFS volume size is 2^64 − 1 clusters. However, the maximum NTFS volume size as implemented in Windows XP Professional is 2^32 − 1 clusters. For example, using 64 KB clusters, the maximum NTFS volume size is 256 TB minus 64 KB. Using the default cluster size of 4 KB, the maximum NTFS volume size is 16 TB minus 4 KB. (Both of these are vastly higher than the 128 GB limit lifted in Windows XP SP1.) Because partition tables on master boot record (MBR) disks only support partition sizes up to 2 TB, dynamic or GPT volumes must be used to create NTFS volumes over 2 TB. Booting from a GPT volume to a Windows environment requires a system with EFI and 64-bit support.
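The two implemented limits quoted above follow directly from the cluster count. The short Python sketch below reproduces the arithmetic; the function name is illustrative only, and KB and TB are taken as binary units (1024-based), as the figures in the text imply.

```python
KB = 1024
TB = 1024 ** 4

def max_volume_bytes(cluster_size, cluster_count_bits=32):
    """Size in bytes of a volume holding 2**bits - 1 clusters."""
    return (2 ** cluster_count_bits - 1) * cluster_size

for cluster_size in (4 * KB, 64 * KB):
    size = max_volume_bytes(cluster_size)
    # Adding one cluster back gives a whole number of terabytes.
    whole_tb = (size + cluster_size) // TB
    print(f"{cluster_size // KB} KB clusters: "
          f"{whole_tb} TB minus {cluster_size // KB} KB")
```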
Maximum File Size
Theoretical: 16 EB minus 1 KB (2^64 − 2^10 or 18,446,744,073,709,550,592 bytes). Implementation: 16 TB minus 64 KB (2^44 − 2^16 or 17,592,185,978,880 bytes).
Alternate Data Streams
Windows system calls may handle alternate data streams. Depending on the operating system, utility and remote file system, a file transfer might silently strip data streams. A safe way of copying or moving files is to use the BackupRead and BackupWrite system calls, which allow programs to enumerate streams, to verify whether each stream should be written to the destination volume and to knowingly skip offending streams.
File Allocation Table (FAT) is a computer file system architecture now widely used on many computer systems and most memory cards, such as those used with digital cameras. FAT file systems are commonly found on floppy disks, flash memory cards, digital cameras, and many other portable devices because of their relative simplicity. Performance of FAT compares poorly to most other file systems because it uses overly simplistic data structures, which makes file operations time-consuming, and it makes poor use of disk space in situations where many small files are present.
For floppy disks, FAT has been standardized as ECMA-107 and ISO/IEC 9293. Those standards include only FAT12 and FAT16 without long filename support; long filename support on FAT is partially patented.
The FAT file system is relatively straightforward technically and is supported by virtually all existing operating systems for personal computers. This makes it a useful format for solid-state memory cards and a convenient way to share data between operating systems.
History
The FAT file system was developed by Bill Gates and Marc McDonald during 1976–1977. It was the primary file system for various operating systems including DR-DOS, FreeDOS, MS-DOS, OS/2 (v1.1) and Microsoft Windows (up until Windows Me).
The FAT file system was created for managing disks in Microsoft Standalone Disk BASIC. In August 1980 Tim Paterson incorporated FAT into his 86-DOS operating system for the S-100 8086 CPU boards; the file system was the main difference between 86-DOS and its predecessor, CP/M.
The name originates from the usage of a table which centralizes the information about which areas belong to files, are free or possibly unusable, and where each file is stored on the disk. To limit the size of the table, disk space is allocated to files in contiguous groups of hardware sectors called clusters. As disk drives have evolved, the maximum number of clusters has dramatically increased, and so the number of bits used to identify each cluster has grown. The successive major versions of the FAT format are named after the number of table element bits: 12, 16, and 32. The FAT standard has also been expanded in other ways while preserving backward compatibility with existing software.
FAT12
The initial version of FAT is now referred to as FAT12. Designed as a file system for floppy disks, it limited cluster addresses to 12-bit values, which not only limited the cluster count to 4078, but made FAT manipulation tricky with the PC's 8-bit and 16-bit registers. (Under Linux, FAT12 is limited to 4084 clusters.) The disk's size is stored as a 16-bit count of sectors, which limited the size to 32 MB. FAT12 was used by several manufacturers with different physical formats, but a typical floppy disk at the time was 5.25-inch, single-sided, 40 tracks, with 8 sectors per track, resulting in a capacity of 160 KB for both the system areas and files. The FAT12 limitations exceeded this capacity by a factor of ten or more.
By convention, all the control structures were organized to fit inside the first track, thus avoiding head movement during read and write operations, although this varied depending on the manufacturer and physical format of the disk. At the time FAT12 was introduced, DOS did not support hierarchical directories, and the maximum number of files was typically limited to a few dozen. Hierarchical directories were introduced in MS-DOS version 2.0. A limitation which was not addressed until much later was that any bad sector in the control structures area, track 0, could prevent the disk from being usable. The DOS formatting tool rejected such disks completely. Bad sectors were allowed only in the file area, where they made the entire holding cluster unusable as well. FAT12 remains in use on all common floppy disks, including 1.44 MB ones.
Initial FAT16
In 1984, IBM released the PC AT, which featured a 20 MB hard disk. Microsoft introduced MS-DOS 3.0 in parallel. (The earlier PC XT was the first PC with a hard drive from IBM, and MS-DOS 2.0 supported that hard drive with FAT12.) Cluster addresses were increased to 16-bit, allowing for up to 65,517 clusters per volume, and consequently much greater file system sizes, at least in theory. However, the maximum possible number of sectors and the maximum (partition, rather than disk) size of 32 MB did not change. Therefore, although technically already "FAT16", this format was not what today is commonly understood as FAT16. With the initial implementation of FAT16 not actually providing for larger partition sizes than FAT12, the early benefit of FAT16 was to enable the use of smaller clusters, making disk usage more efficient, particularly for files several hundred bytes in size, which were far more common at the time. Also, the introduction of FAT16 actually did bring an increase in the maximum partition size under MS-DOS, since the implementation of FAT12 for hard disks in MS-DOS 2.0 was limited to 15 MB. (That is, the initial FAT16 did not support larger drives than FAT12, but MS-DOS 3.0 using FAT16 did support larger drives than MS-DOS 2.0 using FAT12, by a factor of two.)
A 20 MB hard disk formatted under MS-DOS 3.0 was not accessible by the older MS-DOS 2.0. (This was because MS-DOS 2.0 did not support version 3.0's FAT16 and because it did not support hard disk partitions over 15 MB in size.) Of course, MS-DOS 3.0 could still access MS-DOS 2.0 style 8 KB-cluster partitions.
MS-DOS 3.0 also introduced support for high-density 1.2 MB 5.25" diskettes, which notably had 15 sectors per track, hence more space for the FATs. This probably prompted a dubious optimization of the cluster size, which went down from 2 sectors to just 1. The net effect was that high-density diskettes were significantly slower than older double-density ones.
Extended partition and logical drives
Apart from improving the structure of the FAT file system itself, a parallel development allowing an increase in the maximum possible FAT size was the introduction of multiple FAT partitions. Originally DOS was only prepared to handle one FAT partition, although it came with documentation and programming tools for the creation of installable device drivers to handle multiple partitions, and third-party suppliers quickly provided the missing software. Aside from that, partitions were used for sharing the disk between operating systems, typically DOS and Xenix at the time. Extra DOS partitions could not be used as boot partitions, because the installable device drivers were loaded (in config.sys) only after the first part of the DOS boot process. Later, third party tools became available that replaced the DOS master boot record (MBR) and directly loaded non-DOS drivers before DOS: such systems generally came with careful warnings that without the 3rd party software, the disk would not be compatible with DOS. Simply allowing several identical-looking DOS partitions could lead to naming problems: behaviour if more than one partition was marked active was undocumented (although well defined), as was the behaviour if there was more than one hard disk in the computer (which was machine dependent), or if the system was booted from a diskette.
To allow the use of more FAT partitions in a compatible way, a new partition type was introduced (in MS-DOS 3.2, January 1986), the extended partition, which is a container for additional partitions called logical drives. Originally only one logical drive was possible, permitting hard disks up to 64 MB. In MS-DOS 3.3 (August 1987) this limit was increased to 24 drives, equal to the maximum number of available letters for drive names (A and B being reserved for the first two floppy drives, at least one of which many, if not most, systems of the era were equipped with; where only one was installed, B always simulated a second drive using A). Logical drives are described by on-disk structures which closely resemble the Master Boot Record (MBR) of the disk (which describes the primary partitions), likely to simplify the implementation. Though some believe these partitions were nested in a way analogous to Russian matryoshka dolls, that isn't the case. They are stored as a row of separate blocks within a single box; these blocks are often referred to as being chained together, by the links in their extended boot record (EBR) sectors. Only one extended partition is allowed. Under MS-DOS, logical drives are not bootable, and the extended partition can only be created after the primary FAT partition, which removes all ambiguity but also eliminates the possibility of booting several DOS versions from the same hard disk. (A few systems other than MS-DOS can boot logical drives, and partitions can be created in any order using third party formatting tools.)
A useful side-effect of the extended partition scheme was to significantly increase the maximum number of partitions possible on a PC hard disk beyond the four which could be described by the MBR alone.
Prior to the introduction of extended partitions, some hard disk controllers (which at that time were usually separate option boards) could make large hard disks appear at the hardware interface level as two separate disks. Otherwise, DOS "Block Device Drivers" were used to access the other 3 possible partitions on a disk.
Final FAT16
Finally in November 1987, Compaq DOS 3.31 (an OEM version of MS-DOS 3.3 released by Compaq with their machines) introduced what is today called the FAT16 format, with the expansion of the 16-bit disk sector count to 32 bits. The result was initially called the DOS 3.31 Large File System. Although the on-disk changes were minor, the entire DOS disk driver had to be converted to use 32-bit sector numbers, a task complicated by the fact that it was written in 16-bit assembly language.
In 1988 this improvement became more generally available through MS-DOS 4.0 and OS/2 1.1. The limit on partition size was dictated by the 8-bit signed count of sectors per cluster, which had a maximum power-of-two value of 64. With the standard hard disk sector size of 512 bytes, this gives a maximum of 32 KB clusters, thereby fixing the "definitive" limit for the FAT16 partition size at 2 gigabytes. On magneto-optical media, which can have 1 or 2 KB sectors instead of 1/2 KB, this size limit is proportionally larger.
Much later, Windows NT increased the maximum cluster size to 64 KB by considering the sectors-per-cluster count as unsigned. However, the resulting format was not compatible with any other FAT implementation of the time, and it generated greater internal fragmentation. Windows 98 also supported reading and writing this variant, but its disk utilities did not work with it.
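The cluster-size limits described above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function names are illustrative, and 2^16 is used as a round upper bound on the number of clusters, which is an assumption (the usable count is slightly lower).

```python
def largest_power_of_two(limit):
    """Largest power of two that does not exceed limit."""
    value = 1
    while value * 2 <= limit:
        value *= 2
    return value

def fat16_limits(bytes_per_sector=512, signed=True, max_clusters=2 ** 16):
    """Return (sectors per cluster, cluster bytes, partition bytes)."""
    # An 8-bit field holds at most 127 when signed, 255 when unsigned.
    max_count = 127 if signed else 255
    sectors_per_cluster = largest_power_of_two(max_count)
    cluster_bytes = sectors_per_cluster * bytes_per_sector
    return sectors_per_cluster, cluster_bytes, cluster_bytes * max_clusters

spc, cluster, partition = fat16_limits()
print(spc, cluster // 1024, partition // 1024 ** 3)   # sectors, KB, GB

# Windows NT reading the same field as unsigned doubles the cluster size.
print(fat16_limits(signed=False)[1] // 1024)          # KB per cluster
```

Passing bytes_per_sector=1024 or 2048 shows how the limit scales proportionally on media with larger sectors.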
The number of root directory entries available is determined when the volume is formatted, and is stored in a 16-bit signed field, defining an absolute limit of 32767 entries (32736, a multiple of 32, in practice). For historical reasons, FAT12 and FAT16 media generally use 512 root directory entries on non-floppy media. Other sizes may be incompatible with some software or devices (entries being file and/or folder names in the original 8.3 format). Some third-party tools like mkdosfs allow the user to set this parameter.
Long file names
One of the user experience goals for the designers of Windows 95 was the ability to use long filenames (LFNs—up to 255 UTF-16 code points long), in addition to classic 8.3 filenames. LFNs were implemented using a workaround in the way directory entries are laid out (see below).
The version of the file system with this extension is usually known as VFAT after the Windows 95 virtual device driver, also known as "Virtual FAT" in Microsoft's documentation. Interestingly, the VFAT driver actually appeared before Windows 95, in Windows for Workgroups 3.11, but was only used for implementing 32-bit file access and did not support long file names.
In Windows NT, support for long filenames on FAT started from version 3.5. OS/2 added long filename support to FAT using extended attributes (EA) before the introduction of VFAT; thus, VFAT long filenames are invisible to OS/2, and EA long filenames are invisible to Windows.
FAT32
In order to overcome the size limit of FAT16, while at the same time allowing DOS real mode code to handle the format, and without reducing available conventional memory unnecessarily, Microsoft implemented a next generation, known as FAT32. Cluster values are represented by 32-bit numbers, of which 28 bits are used to hold the cluster number, for a maximum of approximately 268 million (2^28) clusters. This allows for drive sizes of up to 8 terabytes with 32 KB clusters, but the boot sector uses a 32-bit field for the sector count, limiting volume size to 2 TB on a hard disk with 512-byte sectors.
On Windows 95/98, because the version of Microsoft's SCANDISK utility included with these operating systems is a 16-bit application, the FAT structure is not allowed to grow beyond around 4.2 million (< 2^22) clusters, placing the volume limit at 127.53 GiB. A limitation in the original versions of the Fdisk utility shipped with Windows 98/98SE causes it to incorrectly report disk sizes over 64 GB. A corrected version is available from Microsoft, but it cannot partition drives larger than 512 GB. The Windows 2000/XP installation program and filesystem creation tool impose a limit of 32 GB. However, both systems can read and write FAT32 file systems of any size. This limitation is by design and, according to Microsoft, was imposed because many tasks on a very large FAT32 file system become slow and inefficient. It can be bypassed by using third-party formatting utilities. Windows Me supports the FAT32 file system without any limits. However, as with Windows 95/98/98SE, there is no native support for 48-bit LBA in Windows Me, meaning that the maximum disk size for ATA disks is 127.6 GiB, the maximum size of an ATA disk using the previous long-standard 28-bit LBA.
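As a quick check of the two FAT32 limits just mentioned (28-bit cluster numbers versus a 32-bit sector count), the following Python sketch computes both and takes the smaller one. The function name is illustrative, and TB is taken as a binary unit.

```python
KB = 1024
TB = 1024 ** 4

def fat32_volume_limits(cluster_bytes=32 * KB, bytes_per_sector=512):
    """Return (limit from cluster numbers, limit from sector count, effective)."""
    by_cluster_number = 2 ** 28 * cluster_bytes    # 28 usable bits per FAT entry
    by_sector_count = 2 ** 32 * bytes_per_sector   # 32-bit sector count field
    return (by_cluster_number, by_sector_count,
            min(by_cluster_number, by_sector_count))

cluster_limit, sector_limit, effective = fat32_volume_limits()
print(cluster_limit // TB, sector_limit // TB, effective // TB)
```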
FAT32 was introduced with Windows 95 OSR2, although reformatting was needed to use it, and DriveSpace 3 (the version that came with Windows 95 OSR2 and Windows 98) never supported it. Windows 98 introduced a utility to convert existing hard disks from FAT16 to FAT32 without loss of data. In the NT line, native support for FAT32 arrived in Windows 2000. A free FAT32 driver for Windows NT 4.0 was available from Winternals, a company later acquired by Microsoft. Since the acquisition the driver is no longer officially available.
The maximum possible size for a file on a FAT32 volume is 4 GB minus 1 byte (2^32 − 1 bytes). Video applications, large databases, and some other software easily exceed this limit. Larger files require another formatting type such as NTFS.
Fragmentation
The FAT file system does not contain mechanisms which prevent newly written files from becoming scattered across the partition. Other file systems, like HPFS, use free space bitmaps that indicate used and available clusters, which could then be quickly looked up in order to find free contiguous areas (improved in exFAT). Another solution is the linkage of all free clusters into one or more lists (as is done in Unix file systems). Instead, the FAT has to be scanned as an array to find free clusters, which can lead to performance penalties with large disks.
In fact, computing free disk space on FAT is one of the most resource intensive operations, as it requires reading the entire FAT linearly. A possible justification suggested by Microsoft's Raymond Chen for limiting the maximum size of FAT32 partitions created on Windows was the time required to perform a "DIR" operation, which always displays the free disk space as the last line. Displaying this line took longer and longer as the number of clusters increased.
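To illustrate why computing free space is expensive, here is a minimal Python sketch of the linear scan over a FAT16 table held in memory. It assumes 16-bit little-endian entries, that a zero entry marks a free cluster, and that data clusters are numbered from 2; the function names are hypothetical and not taken from any real driver.

```python
import struct

def count_free_clusters_fat16(fat_bytes, cluster_count):
    """Count free data clusters by reading every FAT entry in turn."""
    free = 0
    for cluster in range(2, cluster_count + 2):
        (entry,) = struct.unpack_from("<H", fat_bytes, cluster * 2)
        if entry == 0x0000:          # zero marks an unused cluster
            free += 1
    return free

def first_free_cluster_fat16(fat_bytes, cluster_count):
    """Return the lowest free cluster number, or None if the volume is full."""
    for cluster in range(2, cluster_count + 2):
        (entry,) = struct.unpack_from("<H", fat_bytes, cluster * 2)
        if entry == 0x0000:
            return cluster
    return None

# Tiny synthetic FAT: two reserved entries, then six data clusters.
fat = struct.pack("<8H", 0xFFF8, 0xFFFF, 3, 0xFFFF, 0, 0, 0xFFF7, 0)
print(count_free_clusters_fat16(fat, 6), first_free_cluster_fat16(fat, 6))
```

Both functions visit entries one by one, so the cost grows with the number of clusters, which is the effect described above.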
The High Performance File System (HPFS) divides disk space into bands, which have their own free space bitmap, where multiple files opened for simultaneous write could be expanded separately.
Some of the perceived problems with fragmentation resulted from operating system and hardware limitations.
The single-tasking DOS and the traditionally single-tasking PC hard disk architecture (only 1 outstanding input/output request at a time, no DMA transfers) did not contain mechanisms which could alleviate fragmentation by asynchronously prefetching next data while the application was processing the previous chunks.
Similarly, write-behind caching was often not enabled by default with Microsoft software (if present) given the problem of data loss in case of a crash, made easier by the lack of hardware protection between applications and the system.
MS-DOS also did not offer a system call which would allow applications to make sure a particular file has been completely written to disk in the presence of deferred writes (cf. fsync in Unix or DosBufReset in OS/2). Disk caches on MS-DOS were operating on disk block level and were not aware of higher-level structures of the file system. In this situation, cheating with regard to the real progress of a disk operation was most dangerous.
Modern operating systems have introduced these optimizations to FAT partitions, but optimizations can still produce unwanted artifacts in case of a system crash. A Windows NT system will allocate space to files on FAT in advance, selecting large contiguous areas, but in case of a crash, files which were being appended will appear larger than they were ever written into, with dozens of random kilobytes at the end.
With the large cluster sizes, 16 or 32K, forced by larger FAT32 partitions, the external fragmentation becomes somewhat less significant, and internal fragmentation, i.e. disk space waste (since files are rarely exact multiples of cluster size), starts to be a problem as well, especially when there are a great many small files.
Design
Overview
The following is an overview of the order of structures in a FAT partition or disk:
Contents: Size in sectors
Boot Sector, FS Information Sector (FAT32 only), more reserved sectors (optional): (number of reserved sectors)
File Allocation Table #1, File Allocation Table #2: (number of FATs) * (sectors per FAT)
Root Directory (FAT12/16 only): (number of root entries * 32) / (bytes per sector)
Data Region (for files and directories), extending to the end of the partition or disk: NumberOfClusters * SectorsPerCluster
A FAT file system is composed of four different sections.
1. The Reserved sectors, located at the very beginning. The first reserved sector (sector 0) is the Boot Sector (aka Partition Boot Record). It includes an area called the BIOS Parameter Block (with some basic file system information, in particular its type, and pointers to the location of the other sections) and usually contains the operating system's boot loader code. The total count of reserved sectors is indicated by a field inside the Boot Sector. Important information from the Boot Sector is accessible through an operating system structure called the Drive Parameter Block in DOS and OS/2. For FAT32 file systems, the reserved sectors include a File System Information Sector at sector 1 and a Backup Boot Sector at Sector 6.
2. The FAT Region. This typically contains two copies (may vary) of the File Allocation Table for the sake of redundancy checking, although the extra copy is rarely used, even by disk repair utilities. These are maps of the Data Region, indicating which clusters are used by files and directories.
3. The Root Directory Region. This is a Directory Table that stores information about the files and directories located in the root directory. It is only used with FAT12 and FAT16, and imposes on the root directory a fixed maximum size which is pre-allocated at creation of this volume. FAT32 stores the root directory in the Data Region, along with files and other directories, allowing it to grow without such a constraint. Thus, for FAT32, the Data Region starts here.
4. The Data Region. This is where the actual file and directory data is stored and takes up most of the partition. The size of files and subdirectories can be increased arbitrarily (as long as there are free clusters) by simply adding more links to the file's chain in the FAT. Note however, that files are allocated in units of clusters, so if a 1 KB file resides in a 32 KB cluster, 31 KB are wasted. FAT32 typically commences the Root Directory Table in cluster number 2: the first cluster of the Data Region.
FAT uses little endian format for entries in the header and the FAT(s).
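The region sizes above are enough to locate each section on disk. The Python sketch below derives the starting sector of each region and maps a cluster number to its first sector. The function names are illustrative, and the sample values (1 reserved sector, 2 FATs of 9 sectors each, 224 root entries, 512-byte sectors) are assumed to be typical of a 1.44 MB floppy rather than taken from a specific disk.

```python
DIR_ENTRY_SIZE = 32  # bytes per directory entry

def fat_layout(reserved_sectors, num_fats, sectors_per_fat,
               root_entries, bytes_per_sector):
    """Return the first sector of each region, relative to the boot sector."""
    fat_start = reserved_sectors
    root_start = fat_start + num_fats * sectors_per_fat
    # Round the root directory up to whole sectors (0 entries on FAT32).
    root_sectors = (root_entries * DIR_ENTRY_SIZE
                    + bytes_per_sector - 1) // bytes_per_sector
    data_start = root_start + root_sectors
    return {"fat_start": fat_start, "root_start": root_start,
            "root_sectors": root_sectors, "data_start": data_start}

def first_sector_of_cluster(cluster, data_start, sectors_per_cluster):
    """Cluster 2 is the first cluster of the Data Region."""
    if cluster < 2:
        raise ValueError("data clusters are numbered from 2")
    return data_start + (cluster - 2) * sectors_per_cluster

layout = fat_layout(1, 2, 9, 224, 512)
print(layout)
print(first_sector_of_cluster(3, layout["data_start"], 1))
```

With root_entries set to 0, as on FAT32, the root directory occupies no sectors and the Data Region begins immediately after the FATs.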
Boot Sector
It is important to note that the first sector on a device isn't necessarily the boot sector. For partitioned devices (such as hard drives), the first sector is the Master Boot Record. On unpartitioned devices (e.g. floppy disks), the first sector is the Volume Boot Record.
Common structure of the first 36 bytes used by all FAT versions:
Byte Offset Length (bytes) Description
0x00 3 Jump instruction. This instruction will be executed and will skip past the rest of the (non-executable) header if the partition is booted from. See Volume Boot Record. If the jump is a two-byte near jmp, it is followed by a NOP instruction.
0x03 8 OEM Name (padded with spaces). This value identifies the system on which the disk was formatted. MS-DOS checks this field to determine which other parts of the boot record can be relied on. Common values are IBM 3.3 (with two spaces between the "IBM" and the "3.3"), MSDOS5.0, MSWIN4.1 and mkdosfs.
0x0b 2 Bytes per sector. A common value is 512, especially for file systems on IDE (or compatible) disks. The BIOS Parameter Block starts here.
0x0d 1 Sectors per cluster. Allowed values are powers of two from 1 to 128. However, the value must not be such that the number of bytes per cluster becomes greater than 32 KB.
0x0e 2 Reserved sector count. The number of sectors before the first FAT in the file system image. Should be 1 for FAT12/FAT16. Usually 32 for FAT32.
0x10 1 Number of file allocation tables. Almost always 2.
0x11 2 Maximum number of root directory entries. Only used on FAT12 and FAT16, where the root directory is handled specially. Should be 0 for FAT32. This value should always be such that the root directory ends on a sector boundary (i.e. such that its size becomes a multiple of the sector size). 224 is typical for floppy disks.
0x13 2 Total sectors (if zero, use 4 byte value at offset 0x20)
0x15 1 Media descriptor
0xF0 3.5" Double Sided, 80 tracks per side, 18 or 36 sectors per track (1.44MB or 2.88MB). 5.25" Double Sided, 80 tracks per side, 15 sectors per track (1.2MB). Used also for other media types.
0xF8 Fixed disk (i.e. Hard disk).
0xF9 3.5" Double sided, 80 tracks per side, 9 sectors per track (720K). 5.25" Double sided, 80 tracks per side, 15 sectors per track (1.2MB)
0xFA 5.25" Single sided, 80 tracks per side, 8 sectors per track (320K)
0xFB 3.5" Double sided, 80 tracks per side, 8 sectors per track (640K)
0xFC 5.25" Single sided, 40 tracks per side, 9 sectors per track (180K)
0xFD 5.25" Double sided, 40 tracks per side, 9 sectors per track (360K). Also used for 8".
0xFE 5.25" Single sided, 40 tracks per side, 8 sectors per track (160K). Also used for 8".
0xFF 5.25" Double sided, 40 tracks per side, 8 sectors per track (320K)
The same media descriptor value should be repeated as the first byte of each copy of the FAT. Certain operating systems (MSX-DOS version 1.0) ignore the boot sector parameters altogether and use the media descriptor value from the first byte of the FAT to determine file system parameters.
0x16 2 Sectors per File Allocation Table for FAT12/FAT16
0x18 2 Sectors per track
0x1a 2 Number of heads
0x1c 4 Count of hidden sectors preceding the partition that contains this FAT volume. This field should always be zero on media that are not partitioned.
0x20 4 Total sectors (if greater than 65535; otherwise, see offset 0x13)
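The table above maps directly onto a fixed little-endian record of 36 bytes. As an illustration, the following Python sketch unpacks those fields with the standard struct module. The field names and the synthetic sample sector are assumptions made for the example, not values read from a real disk.

```python
import struct
from collections import namedtuple

# One format character per table row, offsets 0x00 through 0x20.
BPB_FORMAT = "<3s8sHBHBHHBHHHII"

Bpb = namedtuple("Bpb", [
    "jump", "oem_name", "bytes_per_sector", "sectors_per_cluster",
    "reserved_sectors", "num_fats", "root_entries", "total_sectors_16",
    "media_descriptor", "sectors_per_fat_16", "sectors_per_track",
    "num_heads", "hidden_sectors", "total_sectors_32",
])

def parse_bpb(sector):
    """Decode the first 36 bytes of a FAT boot sector."""
    return Bpb(*struct.unpack_from(BPB_FORMAT, sector, 0))

def total_sectors(bpb):
    """Use the 16-bit count unless it is zero, then the 32-bit one."""
    return bpb.total_sectors_16 or bpb.total_sectors_32

# Synthetic header resembling a 1.44 MB floppy (values chosen for illustration).
sample = struct.pack(BPB_FORMAT, b"\xeb\x3c\x90", b"MSDOS5.0",
                     512, 1, 1, 2, 224, 2880, 0xF0, 9, 18, 2, 0, 0)
bpb = parse_bpb(sample)
print(bpb.oem_name, bpb.bytes_per_sector, hex(bpb.media_descriptor),
      total_sectors(bpb))
```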
Extended BIOS Parameter Block
Further structure used by FAT12 and FAT16, also known as Extended BIOS Parameter Block:
Byte Offset Length (bytes) Description
0x24 1 Physical drive number (0x00 for removable media, 0x80 for hard disks)
0x25 1 Reserved ("current head")
In Windows NT, bit 0 is a dirty flag that requests chkdsk at boot time; bit 1 requests a surface scan as well.
0x26 1 Extended boot signature. (Should be 0x29. Indicates that the following 3 entries exist.)
0x27 4 ID (serial number)
0x2b 11 Volume Label, padded with blanks (0x20).
0x36 8 FAT file system type, padded with blanks (0x20), e.g.: "FAT12 ", "FAT16 ". This is not meant to be used to determine drive type; however, some utilities use it in this way.
0x3e 448 Operating system boot code
0x1FE 2 Boot sector signature (0x55 0xAA)
The boot sector is portrayed here as found on e.g. an OS/2 1.3 boot diskette. Earlier versions used a shorter BIOS Parameter Block and their boot code would start earlier (for example at offset 0x2b in OS/2 1.1).
Further structure used by FAT32:
Byte Offset Length (bytes) Description
0x24 4 Sectors per file allocation table
0x28 2 FAT Flags (Only used during a conversion from a FAT12/16 volume.)
0x2a 2 Version (Defined as 0)
0x2c 4 Cluster number of root directory start
0x30 2 Sector number of FS Information Sector
0x32 2 Sector number of a copy of this boot sector (0 if no backup copy exists)
0x34 12 Reserved
0x40 1 Physical Drive Number (see FAT12/16 BPB at offset 0x24)
0x41 1 Reserved (see FAT12/16 BPB at offset 0x25)
0x42 1 Extended boot signature. (see FAT12/16 BPB at offset 0x26)
0x43 4 ID (serial number)
0x47 11 Volume Label
0x52 8 FAT file system type: "FAT32 "
0x5a 420 Operating system boot code
0x1FE 2 Boot sector signature (0x55 0xAA)
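The FAT32-specific fields listed above can be decoded the same way as the common header. The Python sketch below unpacks offsets 0x24 through 0x59 and checks the signature at 0x1FE. The field names are illustrative, and the sector is built synthetically for the example.

```python
import struct
from collections import namedtuple

FAT32_EXT_OFFSET = 0x24
FAT32_EXT_FORMAT = "<IHHIHH12sBBBI11s8s"   # covers offsets 0x24..0x59

Fat32Ext = namedtuple("Fat32Ext", [
    "sectors_per_fat", "fat_flags", "version", "root_cluster",
    "fs_info_sector", "backup_boot_sector", "reserved",
    "drive_number", "reserved_nt", "boot_signature",
    "volume_id", "volume_label", "fs_type",
])

def parse_fat32_ext(sector):
    """Decode the FAT32 part of a 512-byte boot sector."""
    if bytes(sector[0x1FE:0x200]) != b"\x55\xaa":
        raise ValueError("missing boot sector signature")
    fields = struct.unpack_from(FAT32_EXT_FORMAT, sector, FAT32_EXT_OFFSET)
    return Fat32Ext(*fields)

# Synthetic sector with made-up but plausible values.
sector = bytearray(512)
struct.pack_into(FAT32_EXT_FORMAT, sector, FAT32_EXT_OFFSET,
                 8192, 0, 0, 2, 1, 6, bytes(12),
                 0x80, 0, 0x29, 0x1234ABCD, b"NO NAME    ", b"FAT32   ")
sector[0x1FE:0x200] = b"\x55\xaa"

ext = parse_fat32_ext(sector)
print(ext.root_cluster, ext.fs_info_sector, ext.backup_boot_sector, ext.fs_type)
```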
Exceptions
The implementation of FAT used in MS-DOS for the Apricot PC had a different boot sector layout, to accommodate that computer's non-IBM compatible BIOS. The jump instruction and OEM name were omitted, and the MS-DOS file system parameters (offsets 0x0B - 0x17 in the standard sector) were located at offset 0x50. Later versions of Apricot MS-DOS gained the ability to read and write disks with the standard boot sector in addition to those with the Apricot one.
DOS Plus on the BBC Master 512 did not use conventional boot sectors at all. Data disks omitted the boot sector and began with a single copy of the FAT (the first byte of the FAT was used to determine disk capacity) while boot disks began with a miniature ADFS file system containing the boot loader, followed by a single FAT. It could also access standard PC disks formatted to 180 KB or 360 KB, again using the first byte of the FAT to determine the capacity.
FS Information Sector
The "FS Information Sector" was introduced in FAT32[30] for speeding up access times of certain operations (in particular, getting the amount of free space). It is located at a sector number specified in the boot record at position 0x30 (usually sector 1, immediately after the boot record).
Byte Offset Length (bytes) Description
0x00 4 FS information sector signature (0x52 0x52 0x61 0x41 / "RRaA")
0x04 480 Reserved (byte values are 0x00)
0x1e4 4 FS information sector signature (0x72 0x72 0x41 0x61 / "rrAa")
0x1e8 4 Number of free clusters on the drive, or -1 if unknown
0x1ec 4 Number of the most recently allocated cluster
0x1f0 14 Reserved (byte values are 0x00)
0x1fe 2 FS information sector signature (0x55 0xAA)
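As an illustration of how the FS Information Sector might be read, the Python sketch below validates the three signatures and extracts the two counters. The function name is hypothetical, and the "unknown" value of -1 is assumed to be stored as the unsigned 32-bit value 0xFFFFFFFF.

```python
import struct

LEAD_SIGNATURE = b"RRaA"       # offset 0x000
STRUCT_SIGNATURE = b"rrAa"     # offset 0x1e4
TRAIL_SIGNATURE = b"\x55\xaa"  # offset 0x1fe
UNKNOWN = 0xFFFFFFFF           # -1 as an unsigned 32-bit value

def parse_fs_info(sector):
    """Return (free_clusters or None, last_allocated_cluster)."""
    sector = bytes(sector)
    if len(sector) < 512:
        raise ValueError("sector too short")
    if (sector[0x000:0x004] != LEAD_SIGNATURE
            or sector[0x1e4:0x1e8] != STRUCT_SIGNATURE
            or sector[0x1fe:0x200] != TRAIL_SIGNATURE):
        raise ValueError("not an FS Information Sector")
    free_clusters, last_allocated = struct.unpack_from("<II", sector, 0x1e8)
    if free_clusters == UNKNOWN:
        free_clusters = None
    return free_clusters, last_allocated

def build_fs_info(free_clusters, last_allocated):
    """Build a synthetic sector, used here only to exercise the parser."""
    sector = bytearray(512)
    sector[0x000:0x004] = LEAD_SIGNATURE
    sector[0x1e4:0x1e8] = STRUCT_SIGNATURE
    struct.pack_into("<II", sector, 0x1e8, free_clusters, last_allocated)
    sector[0x1fe:0x200] = TRAIL_SIGNATURE
    return bytes(sector)

print(parse_fs_info(build_fs_info(12345, 678)))
```

Because the stored count is only a hint kept for speed, a careful implementation would presumably fall back to scanning the FAT when the value is unknown or implausible.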
File Allocation Table
A partition is divided up into identically sized clusters, small blocks of contiguous space. Cluster sizes vary depending on the type of FAT file system being used and the size of the partition; typical cluster sizes lie somewhere between 2 KB and 32 KB. Each file may occupy one or more of these clusters depending on its size; thus, a file is represented by a chain of these clusters (referred to as a singly linked list). However, these clusters are not necessarily stored adjacent to one another on the disk's surface but are often instead fragmented throughout the Data Region.
The File Allocation Table (FAT) is a list of entries that map to each cluster on the partition. Each entry records one of five things:
• the cluster number of the next cluster in a chain
• a special end of clusterchain (EOC) entry that indicates the end of a chain
• a special entry to mark a bad cluster
• a special entry to mark a reserved cluster
• a zero to note that the cluster is unused
Each version of the FAT file system uses a different size for FAT entries. Smaller numbers result in a smaller FAT table, but waste space in large partitions by needing to allocate in large clusters. The FAT12 file system uses 12 bits per FAT entry, thus two entries span 3 bytes. It is consistently little-endian: if you consider the 3 bytes as one little-endian 24-bit number, the 12 least significant bits are the first entry and the 12 most significant bits are the second.
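The FAT12 packing rule is easier to follow in code. The Python sketch below reads each 3-byte group as one little-endian 24-bit number and splits it into two 12-bit entries, exactly as described above. The function name and the sample bytes are made up for illustration.

```python
def fat12_entry(fat, n):
    """Return the 12-bit FAT entry number n from a FAT12 table."""
    group = (n // 2) * 3                       # two entries per 3 bytes
    triple = int.from_bytes(fat[group:group + 3], "little")
    if n % 2 == 0:
        return triple & 0xFFF                  # 12 least significant bits
    return triple >> 12                        # 12 most significant bits

# Six bytes hold four entries: 0xFF0, 0xFFF, 0x003, 0x004.
fat = bytes([0xF0, 0xFF, 0xFF, 0x03, 0x40, 0x00])
print([hex(fat12_entry(fat, n)) for n in range(4)])
```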
In the FAT32 file system, FAT entries are 32 bits, but only 28 of these are actually used; the 4 most significant bits are reserved.
FAT entry values:
FAT12 FAT16 FAT32 Description
0x000 0x0000 0x00000000 Free Cluster
0x001 0x0001 0x00000001 Reserved value; do not use
0x002–0xFEF 0x0002–0xFFEF 0x00000002–0x0FFFFFEF Used cluster; value points to next cluster
0xFF0–0xFF6 0xFFF0–0xFFF6 0x0FFFFFF0–0x0FFFFFF6 Reserved values; do not use[28].
0xFF7 0xFFF7 0x0FFFFFF7 Bad sector in cluster or reserved cluster
0xFF8–0xFFF 0xFFF8–0xFFFF 0x0FFFFFF8–0x0FFFFFFF Last cluster in file
Note that FAT32 uses only 28 bits of the 32 possible bits. The upper 4 bits are usually zero (as indicated in the table above) but are reserved and should be left untouched.
The first cluster of the Data Region is cluster #2. That leaves the first two entries of the FAT unused. In the first byte of the first entry a copy of the media descriptor is stored. The remaining 8 bits (if FAT16), or 20 bits (if FAT32) of this entry are 1. In the second entry the end-of-cluster-chain marker is stored. The high order two bits of the second entry are sometimes, in the case of FAT16 and FAT32, used for dirty volume management: high order bit 1: last shutdown was clean; next highest bit 1: during the previous mount no disk I/O errors were detected.[31]
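Putting the entry values to work, the following Python sketch classifies FAT16 entries according to the table above and follows a file's cluster chain to its end. The FAT is represented as a plain list of integers for simplicity; the function names and the loop check are assumptions of this example, not part of the on-disk format.

```python
def classify_fat16(value):
    """Map a FAT16 entry value to its meaning."""
    if value == 0x0000:
        return "free"
    if value == 0x0001 or 0xFFF0 <= value <= 0xFFF6:
        return "reserved"
    if 0x0002 <= value <= 0xFFEF:
        return "next"
    if value == 0xFFF7:
        return "bad"
    return "end"                      # 0xFFF8-0xFFFF

def cluster_chain(fat_entries, first_cluster):
    """Follow a chain from first_cluster until an end-of-chain marker."""
    chain, seen = [], set()
    cluster = first_cluster
    while True:
        if cluster in seen:
            raise ValueError("loop in cluster chain")
        seen.add(cluster)
        chain.append(cluster)
        value = fat_entries[cluster]
        kind = classify_fat16(value)
        if kind == "end":
            return chain
        if kind != "next":
            raise ValueError(f"chain hits a {kind} entry at cluster {cluster}")
        cluster = value

# Entries 0 and 1 are not data clusters; the file starting at cluster 2
# continues in cluster 5 and ends in cluster 3.
fat = [0xFFF8, 0xFFFF, 5, 0xFFFF, 0x0000, 3, 0xFFF7]
print(cluster_chain(fat, 2))
```

The non-adjacent cluster numbers in the result show the kind of fragmentation that the linked-list representation permits.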
Directory table
A directory table is a special type of file that represents a directory (also known as a folder). Each file or directory stored within it is represented by a 32-byte entry in the table. Each entry records the name, extension, attributes (archive, directory, hidden, read-only, system and volume), the date and time of creation, the address of the first cluster of the file/directory's data and finally the size of the file/directory. Aside from the Root Directory Table in FAT12 and FAT16 file systems, which occupies the special Root Directory Region location, all Directory Tables are stored in the Data Region. The actual number of entries in a directory stored in the Data Region can grow by adding another cluster to the chain in the FAT.
Note that before each entry there can be "fake entries" to support the Long File Name. (See further down the article).
Legal characters for DOS file names include the following:
• Upper case letters A–Z
• Numbers 0–9
• Space (though trailing spaces in either the base name or the extension are considered to be padding and not a part of the file name, also filenames with space in them could not be used on the DOS command line prior to Windows 95 because of the lack of a suitable escaping system)
• ! # $ % & ' ( ) - @ ^ _ ` { } ~
• Values 128–255
This excludes the following ASCII characters:
• " * / : < > ? \ |
Windows/MSDOS has no shell escape character
• + , . ; = [ ]
They are allowed in long file names only.
• Lower case letters a–z
Stored as A–Z. Allowed in long file names.
• Control characters 0–31
• Value 127 (DEL)
The DOS file names are in the OEM character set.
Directory entries, both in the Root Directory Region and in subdirectories, are of the following format (see also 8.3 filename):
Byte Offset Length Description
0x00 8 DOS file name (padded with spaces)
The first byte can have the following special values:
0x00 Entry is available and no subsequent entry is in use
0x05 Initial character is actually 0xE5.
0x2E 'Dot' entry; either '.' or '..'
0xE5 Entry has been previously erased and is available. File undelete utilities must replace this character with a regular character as part of the undeletion process.
0x08 3 DOS file extension (padded with spaces)
0x0b 1 File Attributes
Bit Mask Description
0 0x01 Read Only
1 0x02 Hidden
2 0x04 System
3 0x08 Volume Label
4 0x10 Subdirectory
5 0x20 Archive
6 0x40 Device (internal use only, never found on disk)
7 0x80 Unused
An attribute value of 0x0F is used to designate a long file name entry.
0x0c 1 Reserved; two bits are used by NT and later versions to encode case information (see below); otherwise 0[32]
0x0d 1 Create time, fine resolution: 10ms units, values from 0 to 199.
0x0e 2 Create time. The hour, minute and second are encoded according to the following bitmap:
Bits Description
15-11 Hours (0-23)
10-5 Minutes (0-59)
4-0 Seconds/2 (0-29)
Note that the seconds is recorded only to a 2 second resolution. Finer resolution for file creation is found at offset 0x0d.
0x10 2 Create date. The year, month and day are encoded according to the following bitmap:
Bits Description
15-9 Year (0 = 1980, 127 = 2107)
8-5 Month (1 = January, 12 = December)
4-0 Day (1 - 31)
0x12 2 Last access date; see offset 0x10 for description.
0x14 2 EA-Index (used by OS/2 and NT) in FAT12 and FAT16, High 2 bytes of first cluster number in FAT32
0x16 2 Last modified time; see offset 0x0e for description.
0x18 2 Last modified date; see offset 0x10 for description.
0x1a 2 First cluster in FAT12 and FAT16. Low 2 bytes of first cluster in FAT32. Entries with the Volume Label flag, subdirectory ".." pointing to root, and empty files with size 0 should have first cluster 0.
0x1c 4 File size in bytes. Entries with the Volume Label or Subdirectory flag set should have a size of 0.
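As an illustration of the packed fields in this table, the sketch below decodes a 16-bit date and time pair (plus the optional 10 ms counter at offset 0x0d) and joins the two halves of a FAT32 first-cluster number. The struct and function names are hypothetical, and the caller is assumed to have already read the little-endian fields into integers.

```c
#include <stdint.h>

struct fat_datetime {
    int year, month, day;       /* calendar date */
    int hour, minute, second;   /* time of day */
    int millisecond;            /* from the 10 ms fine-resolution field */
};

/* Decode the packed date and time words. `fine` is the value at offset
   0x0d (0..199, in 10 ms units); pass 0 for fields that lack it. */
struct fat_datetime fat_decode_datetime(uint16_t date, uint16_t time,
                                        uint8_t fine)
{
    struct fat_datetime dt;

    dt.year   = 1980 + ((date >> 9) & 0x7F);   /* bits 15-9 */
    dt.month  = (date >> 5) & 0x0F;            /* bits 8-5  */
    dt.day    = date & 0x1F;                   /* bits 4-0  */
    dt.hour   = (time >> 11) & 0x1F;           /* bits 15-11 */
    dt.minute = (time >> 5) & 0x3F;            /* bits 10-5  */
    dt.second = (time & 0x1F) * 2 + fine / 100; /* 2-second units + carry */
    dt.millisecond = (fine % 100) * 10;
    return dt;
}

/* FAT32 keeps the first cluster number in two halves: the high word at
   offset 0x14 and the low word at offset 0x1a. */
uint32_t fat32_first_cluster(uint16_t high, uint16_t low)
{
    return ((uint32_t)high << 16) | low;
}
```

A fine value of 100 or more adds the odd second that the 2-second time field cannot represent.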
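Clusters are numbered from 2, as noted above, and a file's first cluster number is stored at offset 0x1a of its directory entry (with the high 2 bytes at offset 0x14 in FAT32). The first sector of cluster X can therefore be calculated from the Boot Sector fields (offsets in parentheses):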
For FAT32
FileStartSector = ReservedSectors(0x0e) + (NumofFAT(0x10) * Sectors2FAT(0x24)) + ((X − 2) * SectorsPerCluster(0x0d))
For FAT16/12
FileStartSector = ReservedSectors(0x0e) + (NumofFAT(0x10) * Sectors2FAT(0x16)) + (MaxRootEntry(0x11) * 32 / BytesPerSector(0x0b)) + ((X − 2) * SectorsPerCluster(0x0d))
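Both formulas can be folded into one routine, since the root-directory term is zero on FAT32. This is a sketch under stated assumptions: the struct and function names are made up for illustration, the caller has already parsed the Boot Sector, and the root directory size is rounded up to whole sectors (which gives the same result as the formula above whenever the root directory fills its sectors exactly).

```c
#include <stdint.h>

/* Boot Sector fields used by the formulas (offsets in comments). */
struct fat_geometry {
    uint16_t bytes_per_sector;     /* 0x0b */
    uint8_t  sectors_per_cluster;  /* 0x0d */
    uint16_t reserved_sectors;     /* 0x0e */
    uint8_t  num_fats;             /* 0x10 */
    uint16_t max_root_entries;     /* 0x11, zero on FAT32 */
    uint32_t sectors_per_fat;      /* 0x16 (FAT12/16) or 0x24 (FAT32) */
};

/* Return the first sector of data cluster `cluster` (which must be >= 2). */
uint32_t fat_cluster_to_sector(const struct fat_geometry *g, uint32_t cluster)
{
    /* Size of the Root Directory Region in sectors: 32 bytes per entry,
       rounded up. Evaluates to 0 when max_root_entries is 0 (FAT32). */
    uint32_t root_sectors =
        ((uint32_t)g->max_root_entries * 32u + g->bytes_per_sector - 1u)
        / g->bytes_per_sector;

    return g->reserved_sectors
         + (uint32_t)g->num_fats * g->sectors_per_fat
         + root_sectors
         + (cluster - 2u) * g->sectors_per_cluster;
}
```

With 512-byte sectors, 1 reserved sector, 2 FATs of 32 sectors each and 512 root entries, cluster 2 starts at sector 1 + 64 + 32 = 97.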
Long file names
Long File Names (LFN) are stored on a FAT file system using a trick—adding (possibly multiple) additional entries into the directory before the normal file entry. The additional entries are marked with the Volume Label, System, Hidden, and Read Only attributes (yielding 0x0F), which is a combination that is not expected in the MS-DOS environment, and therefore ignored by MS-DOS programs and third-party utilities. Notably, a directory containing only volume labels is considered as empty and is allowed to be deleted; such a situation appears if files created with long names are deleted from plain DOS.
Older versions of PC-DOS mistake LFN names in the root directory for the volume label, and are likely to display an incorrect label.
Each phony entry can contain up to 13 UTF-16 characters (26 bytes), stored in the fields that would otherwise hold the file size and time stamps. The starting cluster field is not reused: for compatibility with disk utilities it is set to 0 (see 8.3 filename for additional explanation). Up to 20 of these 13-character entries may be chained, supporting a maximum length of 255 UTF-16 characters.[32]
After the last UTF-16 character, a terminating 0x00 0x00 is added. Any remaining unused character positions are filled with 0xFF 0xFF.
LFN entries use the following format:
Byte Offset Length Description
0x00 1 Sequence Number
0x01 10 Name characters (five UTF-16 characters)
0x0b 1 Attributes (always 0x0F)
0x0c 1 Reserved (always 0x00)
0x0d 1 Checksum of DOS file name
0x0e 12 Name characters (six UTF-16 characters)
0x1a 2 First cluster (always 0x0000)
0x1c 4 Name characters (two UTF-16 characters)
If multiple LFN entries are required to represent a file name, the entry holding the last part of the name comes first in the directory. Its sequence number has bit 6 (0x40) set to mark it as the last LFN entry, even though it is the first one encountered when reading the directory. This entry has the highest sequence number, which decreases in the entries that follow; the entry holding the start of the name has sequence number 1. Bit 7 (0x80) of the sequence number indicates that the entry is deleted.
For example if we have filename "File with very long filename.ext" it would be formatted like this:
Sequence number Entry data
0x43 "me.ext"
0x02 "y long filena"
0x01 "File with ver"
??? Normal 8.3 entry
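The layout in this example follows from simple arithmetic, sketched below. The function names are hypothetical; the code only computes how many LFN entries a name needs, which sequence byte each on-disk position carries, and which part of the name it holds. It does not build the 32-byte entries themselves.

```c
#include <stddef.h>
#include <stdint.h>

#define LFN_CHARS_PER_ENTRY 13
#define LFN_LAST_ENTRY_FLAG 0x40

/* Number of LFN entries needed for a name of `len` UTF-16 characters. */
unsigned lfn_entry_count(size_t len)
{
    return (unsigned)((len + LFN_CHARS_PER_ENTRY - 1) / LFN_CHARS_PER_ENTRY);
}

/* Sequence byte of the entry at on-disk position `pos`, where position 0
   is the first entry read from the directory, out of `count` entries. */
uint8_t lfn_sequence_byte(unsigned pos, unsigned count)
{
    uint8_t seq = (uint8_t)(count - pos);    /* counts down to 1 */

    if (pos == 0)
        seq |= LFN_LAST_ENTRY_FLAG;          /* this entry holds the tail */
    return seq;
}

/* Index within the name of the first character held by that entry. */
size_t lfn_name_offset(unsigned pos, unsigned count)
{
    return (size_t)(count - pos - 1) * LFN_CHARS_PER_ENTRY;
}
```

For the 32-character name in the example, this gives three entries with sequence bytes 0x43, 0x02 and 0x01, holding the characters starting at offsets 26, 13 and 0.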
A checksum also allows verification of whether a long file name matches the 8.3 name; such a mismatch could occur if a file was deleted and re-created using DOS in the same directory position. The checksum is calculated using the algorithm below. (Note that pFcbName is a pointer to the name as it appears in a regular directory entry, i.e. the first eight characters are the filename and the last three are the extension. The dot is implicit. Any unused space in the filename is padded with space characters (ASCII 0x20). For example, "Readme.txt" would be "README  TXT".)
unsigned char lfn_checksum(const unsigned char *pFcbName)
{
    int i;
    unsigned char sum = 0;

    /* For each of the 11 name bytes: rotate sum right by one bit,
       then add the byte (overflow beyond 8 bits is discarded). */
    for (i = 11; i; i--)
        sum = ((sum & 1) << 7) + (sum >> 1) + *pFcbName++;
    return sum;
}
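To show how the 11-byte input to the checksum is formed, here is a minimal sketch that converts a simple name into the padded directory-entry form and compares a stored checksum against it. The helper names are hypothetical, the input is assumed to be plain ASCII that already fits the 8.3 limits, and no validation of illegal characters is attempted. The checksum routine is repeated under a different name only so that the block compiles on its own.

```c
#include <ctype.h>
#include <string.h>

/* Convert "base.ext" into the 11-byte, space-padded, upper-case form
   stored in a directory entry. Assumes an ASCII name that fits 8.3. */
void to_fcb_name(const char *name, unsigned char out[11])
{
    const char *dot = strrchr(name, '.');
    size_t base_len = dot ? (size_t)(dot - name) : strlen(name);
    size_t ext_len  = dot ? strlen(dot + 1) : 0;
    size_t i;

    memset(out, ' ', 11);                        /* pad with 0x20 */
    for (i = 0; i < base_len && i < 8; i++)
        out[i] = (unsigned char)toupper((unsigned char)name[i]);
    for (i = 0; i < ext_len && i < 3; i++)
        out[8 + i] = (unsigned char)toupper((unsigned char)dot[1 + i]);
}

/* Rotate-right-and-add checksum over the 11 name bytes. */
unsigned char short_name_checksum(const unsigned char fcb[11])
{
    unsigned char sum = 0;
    int i;

    for (i = 0; i < 11; i++)
        sum = (unsigned char)(((sum & 1) << 7) + (sum >> 1) + fcb[i]);
    return sum;
}

/* Nonzero if the checksum stored at offset 0x0d of an LFN entry matches
   the short name of the 8.3 entry that follows it. */
int lfn_matches(const unsigned char fcb[11], unsigned char stored)
{
    return short_name_checksum(fcb) == stored;
}
```

Calling to_fcb_name("Readme.txt", buf) fills buf with the bytes of "README  TXT" (six letters, two spaces, three letters), which is what the checksum is computed over.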
If a filename contains only lowercase letters, or is a combination of a lowercase basename with an uppercase extension, or vice-versa; and has no special characters, and fits within the 8.3 limits, a VFAT entry is not created on Windows NT and later versions such as XP. Instead, two bits in byte 0x0c of the directory entry are used to indicate that the filename should be considered as entirely or partially lowercase. Specifically, bit 4 means lowercase extension and bit 3 lowercase basename, which allows for combinations such as "example.TXT" or "HELLO.txt" but not "Mixed.txt". Few other operating systems support this. This creates a backwards-compatibility problem with older Windows versions (95, 98, ME) that see all-uppercase filenames if this extension has been used, and therefore can change the name of a file when it is transported, such as on a USB flash drive. Current 2.6.x versions of Linux will recognize this extension when reading (source: kernel 2.6.18 /fs/fat/dir.c and fs/vfat/namei.c); the mount option shortname determines whether this feature is used when writing.
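A reader of such entries might reconstruct the display name roughly as follows. This is a sketch, not the Windows or Linux implementation: the function and macro names are invented for illustration, only ASCII letters are handled, and special first-byte values such as 0x05 or 0xE5 are ignored.

```c
#include <ctype.h>
#include <stddef.h>

#define CASE_LOWER_BASE 0x08   /* bit 3 of byte 0x0c: lowercase basename  */
#define CASE_LOWER_EXT  0x10   /* bit 4 of byte 0x0c: lowercase extension */

/* Build "base.ext" from the 11-byte short name and the case byte at
   offset 0x0c. `out` must have room for at least 13 bytes. */
void short_name_display(const unsigned char fcb[11], unsigned char case_bits,
                        char out[13])
{
    size_t n = 0, i;
    size_t base_len = 8, ext_len = 3;

    /* Trailing spaces are padding, not part of the name. */
    while (base_len > 0 && fcb[base_len - 1] == ' ') base_len--;
    while (ext_len > 0 && fcb[8 + ext_len - 1] == ' ') ext_len--;

    for (i = 0; i < base_len; i++)
        out[n++] = (case_bits & CASE_LOWER_BASE)
                 ? (char)tolower(fcb[i]) : (char)fcb[i];
    if (ext_len > 0) {
        out[n++] = '.';                      /* the dot is implicit on disk */
        for (i = 0; i < ext_len; i++)
            out[n++] = (case_bits & CASE_LOWER_EXT)
                     ? (char)tolower(fcb[8 + i]) : (char)fcb[8 + i];
    }
    out[n] = '\0';
}
```

With the stored name "EXAMPLE TXT" and a case byte of 0x08 this yields "example.TXT"; a system that ignores the case byte would show "EXAMPLE.TXT" instead.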
Third-party extensions
Before Microsoft added support for long filenames and creation/access time stamps, bytes 0x0C–0x15 of the directory entry were used by alternative operating systems to store additional metadata. These included:
Byte Offset Length System Description
0x0C 2 RISC OS File type, 0x000 - 0xFFF
0x0C 1 DOS Plus User-defined file attributes F1-F4:
Bit Mask Description
7 0x80 F1
6 0x40 F2
5 0x20 F3
4 0x10 F4
0x0C 1 MSX-DOS 2 For a deleted file, the original first character of the filename.
0x0D 1 DR-DOS For a deleted file, the original first character of the filename.
0x0E 2 DR-DOS and FlexOS Encrypted file password
0x0E 2 ANDOS File address in the memory
0x10 4 DR-DOS 7 For a deleted file, its original file time and date; deleted files have their normal time and date fields set to the time of deletion
0x12 2 DR-DOS 6 and FlexOS File owner ID
0x14 2 DR-DOS and FlexOS File permissions bitmap (execute permissions are only used by FlexOS):
Bit Mask Description
0 0x0001 Owner delete requires password
1 0x0002 Owner execute requires password
2 0x0004 Owner write requires password
3 0x0008 Owner read requires password
4 0x0010 Group delete requires password
5 0x0020 Group execute requires password
6 0x0040 Group write requires password
7 0x0080 Group read requires password
8 0x0100 World delete requires password
9 0x0200 World execute requires password
10 0x0400 World write requires password
11 0x0800 World read requires password
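The permissions bitmap is regular: each of the three classes occupies four consecutive bits, in the order delete, execute, write, read. A lookup can therefore be sketched as below; the enum and function names are illustrative and are not taken from DR-DOS or FlexOS.

```c
#include <stdint.h>

enum perm_class { PERM_OWNER = 0, PERM_GROUP = 1, PERM_WORLD = 2 };
enum perm_op    { PERM_DELETE = 0, PERM_EXECUTE = 1,
                  PERM_WRITE = 2, PERM_READ = 3 };

/* Return 1 if the given operation by the given class requires a
   password according to the 16-bit bitmap at offset 0x14, else 0. */
int perm_requires_password(uint16_t bitmap, enum perm_class who,
                           enum perm_op op)
{
    unsigned bit = (unsigned)who * 4u + (unsigned)op;

    return (int)((bitmap >> bit) & 1u);
}
```

For instance, a bitmap of 0x0400 means that only world write access requires a password.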
FAT licensing
Microsoft applied for, and was granted, a series of patents for key parts of the FAT file system in the mid-1990s. Being almost universally compatible and well-understood, FAT is frequently chosen as an interchange format for flash media used in digital cameras and PDAs.
On December 3, 2003 Microsoft announced[34] it would be offering licenses for use of its FAT specification and "associated intellectual property", at the cost of a US$0.25 royalty per unit sold, with a $250,000 maximum royalty per license agreement.[35]
To this end, Microsoft cited four patents on the FAT file system as the basis of its intellectual property claims. All four pertain to long-filename extensions to FAT first seen in Windows 95:
• U.S. Patent 5,745,902 - Method and system for accessing a file using file names having different file name formats. Filed July 6, 1992. This covered a means of generating and associating a short, 8.3 filename with a long one (for example, "Microsoft.txt" with "MICROS~1.TXT") and a means of enumerating conflicting short filenames (for example, "MICROS~2.TXT" and "MICROS~3.TXT"). It is unclear whether this patent would cover an implementation of FAT without explicit long filename capabilities. Hard links in Unix file systems do not appear to be prior art: deleting a FAT file via its long name will also remove its short name. Renaming a file to a "short" name also updates the long file name for coherency; similarly, renaming a file to a "long" name will allocate a new "short" name. In NTFS, hard links and dual names are separate concepts and each hard link has two names. Finally, at the API level, both names are always provided together when a directory lookup is requested from the system; they do not appear as two separate files and do not have to be "matched" to determine unique files.
• U.S. Patent 5,579,517 - Common name space for long and short filenames. Filed on 1995-04-24. This covers the method of chaining together multiple consecutive 8.3 named directory entries to hold long filenames, with some of the entries specially marked to prevent them from confusing older, long filename-unaware FAT implementations.
o The Public Patent Foundation successfully challenged this patent; the claims were rejected[36] on 2004-09-14, due to prior disclosure[37] of the claimed techniques in patents U.S. Patent 5,307,494 and U.S. Patent 5,367,671. This decision was later overturned by the Patent Office on 2006-01-10.
• U.S. Patent 5,758,352 - Common name space for long and short filenames. Filed on 1996-09-05. This is very similar to 5,579,517.
o The Public Patent Foundation successfully challenged this patent; the USPTO rejected it on 2005-10-05, on the grounds that "the six assignees names were incorrect".[38][39] This decision was also later overturned by the Patent Office on 2006-01-10.
• U.S. Patent 6,286,013 - Method and system for providing a common name space for long and short file names in an operating system. Filed on 1997-01-28. This makes claims on the methods used when Windows 95, Windows 98 and Windows Me expose long filenames to their MS-DOS compatibility layer. It does not appear to affect any non-Microsoft FAT implementations.
Many technical commentators[who?] have concluded that these patents only cover FAT implementations that include support for long filenames, and that removable solid state media and consumer devices only using short names would be unaffected.
Additionally, in the document "Microsoft Extensible Firmware Initiative FAT 32 File System Specification, FAT: General Overview of On-Disk Format" published by Microsoft (version 1.03, 2000-12-06), Microsoft specifically grants a number of rights, which many readers have interpreted as permitting operating system vendors to implement FAT.
Microsoft is not the only company to have applied for patents for parts of the FAT file system. Other patents affecting FAT include:
• U.S. Patent 5,367,671 - System for accessing extended object attribute (EA) data through file name or EA handle linkages in path tables. Filed on 1990-09-25 by Barry A. Feigenbaum and Felix Miro of IBM, this makes claims on the methods used by OS/2, Windows NT, and Linux for storing extended attribute data in the "EA DATA. SF" file.
Appeal
As there were widespread calls for these patents to be re-examined, the Public Patent Foundation (PUBPAT) submitted evidence to the US Patent and Trademark Office (USPTO) disputing the validity of these patents, including prior art references from Xerox and IBM. The USPTO acknowledged that the evidence raised "substantial new question[s] of patentability," and opened an investigation into the validity of Microsoft's FAT patents.[40]
On 2004-09-30 the USPTO rejected all claims of U.S. Patent 5,579,517, based primarily on evidence provided by PUBPAT. Dan Ravicher, the foundation's executive director, said, "The Patent Office has simply confirmed what we already knew for some time now, Microsoft's FAT patent is bogus."
According to the PUBPAT press release, "Microsoft still has the opportunity to respond to the Patent Office's rejection. Typically, third party requests for re-examination, like the one filed by PUBPAT, are successful in having the subject patent either narrowed or completely revoked roughly 70% of the time."
On 2005-10-05 the Patent Office announced that, following the re-examination process, it had again rejected all claims of patent 5,579,517, and it additionally found U.S. Patent 5,758,352 invalid on the grounds that the patent had incorrect assignees.
Finally, on 2006-01-10 the Patent Office ruled that features of Microsoft's implementation of the FAT system were "novel and non-obvious", reversing both earlier non-final decisions.
Patent infringement lawsuit
In February 2009, Microsoft filed a patent infringement lawsuit against TomTom, alleging that the device maker's products infringed on patents related to the FAT32 filesystem. As some TomTom products are based on Linux, this marked the first time that Microsoft tried to enforce its patents against the Linux platform. The lawsuit was settled out of court the following month with an agreement that Microsoft be given access to four of TomTom's patents, that TomTom drop support for the FAT32 filesystem from its products, and that in return Microsoft not seek legal action against TomTom for the five-year duration of the settlement agreement.
Posted by kittu at 12:11 AM
FAT and NTFS file systems ...
NTFS (New Technology File System) is the standard file system of Windows NT, including its later versions Windows 2000, Windows XP, Windows Server 2003, Windows Server 2008, Windows Vista, and Windows 7.
NTFS supersedes the FAT file system as the preferred file system for Microsoft’s Windows operating systems. NTFS has several improvements over FAT and HPFS (High Performance File System) such as improved support for metadata and the use of advanced data structures to improve performance, reliability, and disk space utilization, plus additional extensions such as security access control lists (ACL) and file system journaling.
History
In the mid 1980s, Microsoft and IBM formed a joint project to create the next generation graphical operating system. The result of the project was OS/2, but eventually Microsoft and IBM disagreed on many important issues and separated. OS/2 remained an IBM project. Microsoft started to work on Windows NT. The OS/2 filesystem HPFS contained several important new features. When Microsoft created their new operating system, they borrowed many of these concepts for NTFS. Probably as a result of this common ancestry, HPFS and NTFS share the same disk partition identification type code (07). Sharing an ID is unusual since there were dozens of available codes, and other major filesystems have their own code. FAT has more than nine (one each for FAT12, FAT16, FAT32, etc.). Algorithms which identify the filesystem in a partition type 07 must perform additional checks. It is also clear that NTFS owes some of its architectural design to Files-11 used by VMS. This is hardly surprising since Dave Cutler was the main lead for both VMS and Windows NT.
Versions
NTFS has five released versions:
• v1.0 with NT 3.1 released mid-1993
• v1.1 with NT 3.5 released fall 1994
• v1.2 with NT 3.51 (mid-1995) and NT 4 (mid-1996) (occasionally referred to as "NTFS 4.0", because OS version is 4.0)
• v3.0 from Windows 2000 ("NTFS V5.0")
• v3.1 from Windows XP (autumn 2001; "NTFS V5.1"), Windows Server 2003 (spring 2003; occasionally "NTFS V5.2"), Windows Vista (mid-2005; occasionally "NTFS V6.0"), Windows Server 2008, Windows 7.
V1.0 and V1.1 (and newer) are incompatible: that is, volumes written by NT 3.5x cannot be read by NT 3.1 until an update on the NT 3.5x CD is applied to NT 3.1, which also adds FAT long file name support. V1.2 supports compressed files, named streams, ACL-based security, etc. V3.0 added disk quotas, encryption, sparse files, reparse points, update sequence number (USN) journaling, the $Extend folder and its files, and reorganized security descriptors so that multiple files which use the same security setting can share the same descriptor. V3.1 expanded the Master File Table (MFT) entries with redundant MFT record number (useful for recovering damaged MFT files).
Windows Vista introduced Transactional NTFS, NTFS symbolic links, partition shrinking and self-healing functionality though these features owe more to additional functionality of the operating system than the file system itself.
Features
NTFS v3.0 includes several new features over its predecessors: sparse file support, disk usage quotas, reparse points, distributed link tracking, and file-level encryption, also known as the Encrypting File System (EFS).
USN Journal
The USN Journal (Update Sequence Number Journal) is a system management feature that records changes to all files, streams and directories on the volume, as well as their various attributes and security settings.
It is a critical feature of NTFS (one that FAT/FAT32 does not provide) for ensuring that its complex internal data structures and indices remain consistent in case of system crashes, and for allowing easy rollback of uncommitted changes to these critical data structures when the volume is remounted. The data structures concerned include the volume allocation bitmap, data moves performed by the defragmentation API, modifications to MFT records (such as moves of some variable-length attributes stored in MFT records and attribute lists), updates to the shared security descriptors, and updates to the boot sector and its local mirrors, where the last USN transaction committed on the volume is stored. The indices concerned are those for directories and security descriptors.
In later versions of Windows, the USN journal has been extended to track the state of other transactional operations on other parts of the NTFS filesystem, such as the VSS shadow copies of system files with copy-on-write semantics, or the implementation of Transactional NTFS and of distributed filesystems (see below).
Hard links and short filenames
Originally included to support the POSIX subsystem in Windows NT, hard links are similar to directory junctions, but are used for files instead of directories. Hard links can only be applied to files on the same volume, since an additional filename record is added to the file's MFT record. Short (8.3) filenames are also implemented as additional filename records that don't have separate directory entries. Hard links also have the behavior that changing the size or attributes of a file may not update the directory entries of other links until they are opened.
Alternate data streams (ADS)
Alternate data streams allow more than one data stream to be associated with a filename, using the filename format "filename:streamname" (e.g., "text.txt:extrastream"). Alternate streams are not listed in Windows Explorer, and their size is not included in the file's size. Only the main stream of a file is preserved when it is copied to a FAT-formatted USB drive, attached to an e-mail, or uploaded to a website. As a result, using alternate streams for critical data may cause problems. NTFS Streams were introduced in Windows NT 3.1, to enable Services for Macintosh (SFM) to store Macintosh resource forks. Although current versions of Windows Server no longer include SFM, third-party Apple Filing Protocol (AFP) products (such as Group Logic's ExtremeZ-IP) still use this feature of the file system.
Malware has used alternate data streams to hide its code; some malware scanners and other special tools now check for data in alternate streams. Microsoft provides a tool called Streams to allow users to view streams on a selected volume.
Very small ADS are also added by Internet Explorer (and now also other browsers) to mark files that have been downloaded from external sites: they may be unsafe to run locally, and the local shell will require confirmation from the user before opening them. When the user indicates that they no longer want this confirmation dialog, this ADS is simply dropped from the MFT entry for the downloaded file.
Some media players have also tried to use ADS to store custom metadata for media files, in order to organize collections without modifying the effective data content of the media files themselves (using embedded tags when they are supported by the media file formats such as MPEG and OGG containers). This metadata may be displayed in Windows Explorer as extra information columns, with the help of a registered Windows Shell extension that can parse it. Most media players, however, prefer to use their own separate database instead of ADS for storing this information, notably because ADS are visible to all users of these files, instead of being managed with distinct per-user security settings and having their values defined according to user preferences.
Sparse files
Sparse files are files which contain sparse data sets, data mostly filled with zeros. Database applications, for instance, sometimes use sparse files. Because of this, Microsoft has implemented support for efficient storage of sparse files by allowing an application to specify regions of empty (zero) data. An application that reads a sparse file reads it in the normal manner with the file system calculating what data should be returned based upon the file offset. As with compressed files, the actual sizes of sparse files are not taken into account when determining quota limits.
File compression
NTFS compresses files using a variant of the LZ77 algorithm. Although read–write access to compressed files is transparent, Microsoft recommends avoiding compression on server systems and on network shares holding roaming profiles, because it puts a considerable load on the processor. Single-user systems with limited hard disk space can benefit from NTFS compression. The slowest link in a computer is not the CPU but the hard drive, so NTFS compression allows the limited, slow storage space to be better used, in terms of both space and (often) speed. NTFS compression can also serve as a replacement for sparse files when a program (e.g. a download manager) is not able to create files without content as sparse files.
Volume Shadow Copy
The Volume Shadow Copy Service (VSS) keeps historical versions of files and folders on NTFS volumes by copying old, newly overwritten data to a shadow copy (copy-on-write). The old file data is overlaid on the new when the user requests a revert to an earlier version. This also allows data backup programs to archive files currently in use by the file system. On heavily loaded systems, Microsoft recommends setting up a shadow copy volume on a separate disk. To ensure consistent recovery in case of system crashes, the VSS also uses the USN journal to mark local transactions, so that after a system restart, when the NTFS volume is remounted, committed changes to system files are effectively recovered, and changes that were not fully recorded before the commit are safely rolled back to the older version. However, these VSS shadows are not coordinated globally across multiple files or volumes, except when using a transaction coordinator (see below). They can only be used to ensure that older versions remain accessible during backup operations, so that those backups contain consistent system images.
Transactional NTFS
As of Windows Vista, applications can use Transactional NTFS to group changes to files together into a transaction. The transaction guarantees that either all changes happen or none of them do, and that applications outside the transaction will not see the changes until the precise instant they are committed. It uses techniques similar to those used for Volume Shadow Copies (i.e. copy-on-write) to ensure that overwritten data can be safely rolled back, and the USN journal to mark the transactions that have not yet been committed, or those that have been committed but not yet fully applied (in case of a system crash during a commit by one of the participants).
However, in a transaction-enabled filesystem, this technique can be used temporarily for any other file on any kind of partition for as long as the transaction is not committed, rather than just for system files that are permanently marked with copy-on-write semantics and that are implicitly modified within their own local transactions.
The copy-on-write technique is, however, modified in order to allow efficient rollbacks and to avoid fragmenting a filesystem used by possibly many participants: the old data may not be overwritten immediately but kept where it is (notably when it is currently locked by someone else for consistent reads in its own transaction). In that case, only the new uncommitted data is kept in a temporary shadow (rather than the copy-on-write old data), which is finally applied using normal VSS copy-on-write when the transaction is committed by the writer. In addition, these temporary shadows for new data, seen only by the participating processes that have their own uncommitted data, are not necessarily written to disk immediately, but may just be maintained in memory or swapped out for later commits. Transactional NTFS does not restrict transactions to just the local NTFS volume, but also includes other transactional data or operations in other locations, such as data stored in separate volumes, the local registry, or SQL databases, or the current states of system services or remote services.
These transactions are coordinated network-wide with all participants using a specific service, the Distributed Transactions Coordinator (DTC), to ensure that all participants receive the same commit state, and to transport the changes that have been validated by any participant (so that the others can invalidate their local caches for old data or roll back their ongoing uncommitted changes). Transactional NTFS allows, for example, the creation of network-wide consistent distributed filesystems, including their local live or offline caches.
Encrypting File System (EFS)
EFS provides strong and user-transparent encryption of any file or folder on an NTFS volume. EFS works in conjunction with the EFS service, Microsoft's CryptoAPI and the EFS File System Run-Time Library (FSRTL). EFS works by encrypting a file with a bulk symmetric key (also known as the File Encryption Key, or FEK), which is used because it takes much less time to encrypt and decrypt large amounts of data with a symmetric key than with an asymmetric key cipher. The symmetric key that is used to encrypt the file is then encrypted with a public key that is associated with the user who encrypted the file, and this encrypted data is stored in an alternate data stream of the encrypted file. To decrypt the file, the file system uses the private key of the user to decrypt the symmetric key that is stored in the file header. It then uses the symmetric key to decrypt the file. Because this is done at the file system level, it is transparent to the user. Also, in case a user loses access to their key, support for additional decryption keys has been built into the EFS system, so that a recovery agent can still access the files if needed. NTFS-provided encryption and compression are mutually exclusive: NTFS can be used for one and a third-party tool for the other.
The support of EFS is not available in Basic, Home and MediaCenter versions of Windows, and must be activated after installation of Professional, Ultimate and Server versions of Windows or by using enterprise deployment tools within Windows domains.
Quotas
Disk quotas were introduced in NTFS v3. They allow the administrator of a computer that runs a version of Windows that supports NTFS to set a threshold of disk space that users may use. They also allow administrators to keep track of how much disk space each user is using. An administrator may specify a certain level of disk space that a user may use before they receive a warning, and then deny access to the user once they hit their upper limit of space. Disk quotas do not take into account NTFS's transparent file compression, should this be enabled. Applications that query the amount of free space will also see the amount of free space left to the user who has a quota applied to them.
The support of disk quotas is not available in Basic, Home and MediaCenter versions of Windows, and must be activated after installation of Professional, Ultimate and Server versions of Windows or by using enterprise deployment tools within Windows domains.
Reparse points
This feature was introduced in NTFS v3. Reparse points are used by associating a reparse tag in the user space attribute of a file or directory. When the object manager (see Windows NT line executive) parses a file system name lookup and encounters a reparse attribute, it knows to reparse the name lookup, passing the user-controlled reparse data to every file system filter driver that is loaded into Windows 2000. Each filter driver examines the reparse data to see whether it is associated with that reparse point, and if that filter driver determines a match, then it intercepts the file system call and executes its special functionality. Reparse points are used to implement Volume Mount Points, Directory Junctions, Hierarchical Storage Management, Native Structured Storage, Single Instance Storage, and Symbolic Links.
Volume mount points
Volume mount points are similar to Unix mount points, where the root of another file system is attached to a directory. In NTFS, this allows additional file systems to be mounted without requiring a separate drive letter (such as C: or D:) for each.
Once a volume has been mounted on top of an existing directory of another volume, the contents previously listed in that directory become invisible and are replaced by the content of the root directory of the mounted volume. The mounted volume could still have its own drive letter assigned separately. The file system does not allow volumes to be mutually mounted on each other. Volume mount points can be made to be either persistent (remounted automatically after system reboot) or not persistent (must be manually remounted after reboot).
Mounted volumes may use other file systems than just NTFS; notably they may be remote shared directories, possibly with their own security settings and remapping of access rights according to the remote file system policy.
Directory junctions
Directory junctions are similar to volume mount points, but they reference other directories in the file system instead of other volumes. For instance, the directory C:\exampledir with a directory junction attribute that contains a link to D:\linkeddir will automatically refer to the directory D:\linkeddir when it is accessed by a user-mode application. This function is conceptually similar to symbolic links to directories in Unix, except that the target in NTFS must always be another directory (typical Unix file systems allow the target of a symbolic link to be any type of file) and that junctions have the semantics of a hardlink (i.e., they must be immediately resolvable when they are created).
Directory junctions (which can be created with the command MKLINK /J junctionName targetDirectory and removed with RMDIR junctionName from a console prompt) are persistent, and are resolved on the server side, as they share the same security realm of the local system or domain on which the parent volume is mounted and the same security settings for their contents as the content of the target directory; however, the junction itself may have distinct security settings. Removing a directory junction does not delete files in the target directory.
Note that some directory junctions are installed by default on Windows Vista, for compatibility with previous versions of Windows, such as Documents and Settings in the root directory of the system drive, which links to the Users physical directory in the root directory of the same volume. However, they are hidden by default, and their security settings are set up so that Windows Explorer will refuse to open them from within the Shell or in most applications, except for the local built-in SYSTEM user or the local Administrators group (both user accounts are used by system software installers).
This additional security restriction has probably been made to keep users from finding apparent duplicate files in the joined directories and deleting them by mistake, because the semantics of directory junctions are not the same as those of hardlinks: reference counting is not used on the target contents, and not even on the referenced container itself.
Directory junctions are soft links (they will persist even if the target directory is removed), working as a limited form of symbolic links (with an additional restriction on the location of the target). They are an optimized form that allows faster processing of the reparse point with which they are implemented, with less overhead than the newer NTFS symbolic links, and they can be resolved on the server side (when they are found in remote shared directories).
Symbolic links
Symbolic links (or soft links) were introduced in Windows Vista. Symbolic links are resolved on the client side, so when a symbolic link is shared, the target is subject to the access restrictions on the client, and not the server.
Symbolic links can be created either to files (created with MKLINK symLink targetFilename) or to directories (created with MKLINK /D symLinkD targetDirectory), but the semantics of the link (file or directory) must be specified when the link is created. The target, however, need not exist or be available when the symbolic link is created: when the symbolic link is accessed and the target is checked for availability, NTFS also checks whether it has the correct type (file or directory), and returns a not-found error if the existing target has the wrong type.
They can also reference shared directories on remote hosts or files and subdirectories within shared directories: their target is not mounted immediately at boot, but only temporarily on demand while opening them with the OpenFile() or CreateFile() API. Their definition is persistent on the NTFS volume where they are created (all types of symbolic links can be removed as if they were files, using DEL symLink from a command line prompt or batch file).
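The point that a symbolic link may be created before its target exists can be tried out with Python's standard os.symlink call. This is only a sketch: the file names are hypothetical, and on Windows creating symbolic links may require elevated privileges or Developer Mode, so the snippet is easiest to run on a Unix-like system.

```python
import os
import tempfile


def make_dangling_symlink(directory: str) -> str:
    """Create a file symlink whose target does not exist yet; return its path."""
    target = os.path.join(directory, "not_created_yet.txt")  # hypothetical name
    link = os.path.join(directory, "symLink")
    # target_is_directory only matters on Windows, where the link type
    # (file or directory) is fixed at the time the link is created.
    os.symlink(target, link, target_is_directory=False)
    return link


with tempfile.TemporaryDirectory() as tmp:
    link_path = make_dangling_symlink(tmp)
    is_link = os.path.islink(link_path)   # the link itself is present
    resolves = os.path.exists(link_path)  # following it fails: no target yet
    os.remove(link_path)                  # the link is removed like a file
```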
Single Instance Storage (SIS)
When there are several directories that have different, but similar, files, some of these files may have identical content. Single instance storage allows identical files to be merged to one file and create references to that merged file. SIS consists of a file system filter that manages copies, modification and merges to files; and a user space service (or groveler) that searches for files that are identical and need merging. SIS was mainly designed for remote installation servers as these may have multiple installation images that contain many identical files; SIS allows these to be consolidated but, unlike for example hard links, each file remains distinct; changes to one copy of a file will leave others unaltered. This is similar to copy-on-write, which is a technique by which memory copying is not really done until one copy is modified.
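The search step performed by the groveler can be sketched as grouping files by a content hash. This is an illustrative approximation only: the real SIS service is not documented here, and the function name and the choice of SHA-256 are assumptions.

```python
import hashlib
import os
from collections import defaultdict


def find_identical_files(root: str) -> list[list[str]]:
    """Group files under root whose contents share the same SHA-256 digest."""
    by_digest: dict[str, list[str]] = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            by_digest[digest].append(path)
    # Only groups with two or more members are candidates for merging.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

A production tool would compare sizes first and confirm matches byte by byte before merging; the sketch omits both steps for brevity.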
Hierarchical Storage Management (HSM)
Hierarchical Storage Management is a means of transferring files that are not used for some period of time to less expensive storage media. When the file is next accessed, the reparse point on that file determines that it is needed and retrieves it from storage.
Native Structured Storage (NSS)
NSS was an ActiveX document storage technology that has since been discontinued by Microsoft. It allowed ActiveX Documents to be stored in the same multi-stream format that ActiveX uses internally. An NSS file system filter was loaded and used to process the multiple streams transparently to the application, and when the file was transferred to a non-NTFS formatted disk volume it would also transfer the multiple streams into a single stream.
Interoperability
Details on the implementation's internals are not released, which makes it difficult for third-party vendors to provide tools to handle NTFS.
Linux
The ability to read and write to NTFS is provided by the NTFS-3G driver. It is included in most Linux distributions. Other outdated and mostly read-only solutions exist as well:
• Linux kernel 2.2: Kernel versions 2.2.0 and later include the ability to read NTFS partitions
• Linux kernel 2.6: Kernel versions 2.6.0 and later contain a driver written by Anton Altaparmakov (University of Cambridge) and Richard Russon. It supports file read, overwrite and resize.
• NTFSMount: A read/write userspace NTFS driver. It provides read-write access to NTFS, excluding writing compressed and encrypted files, changing file ownership, and access rights.
• Tuxera NTFS: High-performance read/write commercial kernel driver, mainly targeted for embedded devices from Tuxera Ltd which also develops the open source NTFS-3G driver.
• NTFS for Linux: A commercial driver with full read/write support available as free and non-free download(s) from Paragon Software Group.
• Captive NTFS: A 'wrapping' driver which uses Windows' own driver, ntfs.sys.
Note that all three userspace drivers, namely NTFSMount, NTFS-3G and Captive NTFS, are built on the Filesystem in Userspace (FUSE), a Linux kernel module tasked with bridging userspace and kernel code to save and retrieve data. All drivers listed above (except Tuxera NTFS and Paragon NTFS for Linux) are open source (GPL). Due to the complexity of internal NTFS structures, both the built-in 2.6.14 kernel driver and the FUSE drivers disallow changes to the volume that are considered unsafe, to avoid corruption.
Mac OS X
Mac OS X v10.3 and later include read-only support for NTFS-formatted partitions. The GPL-licensed NTFS-3G also works on Mac OS X through FUSE and allows reading and writing to NTFS partitions. A performance enhanced commercial version, called Tuxera NTFS for Mac, is also available from the NTFS-3G developers. NTFS write support has been discovered in Mac OS X 10.6, but has not been activated as of version 10.6.1, although hacks do exist to enable the functionality.
Microsoft Windows
While the different NTFS versions are for the most part fully forward- and backward-compatible, there are technical considerations for mounting newer NTFS volumes in older versions of Microsoft Windows. This affects dual-booting, and external portable hard drives.
For example, attempting to use an NTFS partition with "Previous Versions" (a.k.a. Volume Shadow Copy) on an operating system that doesn't support it will result in the contents of those previous versions being lost.
Others
eComStation and FreeBSD offer read-only NTFS support (there is a beta NTFS driver that allows write/delete for eComStation, but it is generally considered unsafe). A free third-party tool for BeOS, which was based on NTFS-3G, allows full NTFS read and write. NTFS-3G also works on Mac OS X, FreeBSD, NetBSD, Solaris, QNX and Haiku, in addition to Linux, through FUSE. A free-for-personal-use read/write driver for MS-DOS called "NTFS4DOS" also exists.
Compatibility with FAT
Microsoft currently provides a tool (convert.exe) to convert HPFS (only on Windows NT 3), FAT16 and, on Windows 2000 and higher, FAT32 to NTFS, but not the other way around.
Resizing
Various third-party tools are all capable of safely resizing NTFS partitions. Microsoft added the ability to shrink or expand a partition with Windows Vista, but this capability is limited because it will not relocate page file fragments or files that have been marked as unmovable. So shrinking requires relocating or disabling any page file, the index of Windows Search, and any Shadow Copy used by System Restore. Using a 3rd-party tool is an easier option.
Universal time
For historical reasons, the versions of Windows that do not support NTFS all keep time internally as local zone time, and therefore so do all file systems other than NTFS that are supported by current versions of Windows. However, Windows NT and its descendants keep internal timestamps as UTC and make the appropriate conversions for display purposes. Therefore, NTFS timestamps are in UTC. This means that when files are copied or moved between NTFS and non-NTFS partitions, the OS needs to convert timestamps on the fly. But if some files are moved when daylight saving time (DST) is in effect, and other files are moved when standard time is in effect, there can be some ambiguities in the conversions. As a result, especially shortly after one of the days on which local zone time changes, users may observe that some files have timestamps that are incorrect by one hour. Due to the differences in implementation of DST between the northern and southern hemispheres, this can result in a potential timestamp error of up to 4 hours in any given 12 months.
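The one-hour discrepancy can be reproduced with a small calculation. The sketch below assumes a zone that is UTC+1 in winter and UTC+2 in summer, and assumes a conversion that always applies the offset in force at the moment of the copy; both are illustrative assumptions, not a description of the actual Windows conversion code.

```python
from datetime import datetime, timedelta, timezone

STANDARD_OFFSET = timedelta(hours=1)  # assumed zone: UTC+1 during standard time
DST_OFFSET = timedelta(hours=2)       # assumed zone: UTC+2 during DST


def utc_to_local_naive(utc_time: datetime, offset: timedelta) -> datetime:
    """Produce the zone-less local timestamp a non-NTFS volume would store."""
    return (utc_time + offset).replace(tzinfo=None)


def local_naive_to_utc(local_time: datetime, offset: timedelta) -> datetime:
    """Convert a stored local timestamp back to UTC using the given offset."""
    return (local_time - offset).replace(tzinfo=timezone.utc)


original_utc = datetime(2009, 1, 15, 12, 0, tzinfo=timezone.utc)
# File copied from NTFS to a local-time file system while standard time applies.
stored_local = utc_to_local_naive(original_utc, STANDARD_OFFSET)
# Same file copied back to NTFS while DST applies.
restored_utc = local_naive_to_utc(stored_local, DST_OFFSET)
error = original_utc - restored_utc
```

Here error comes out as exactly one hour, because the stored local time carries no record of which offset was used to produce it.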
Internals
In NTFS, all file data—file name, creation date, access permissions, and contents—are stored as metadata in the Master File Table. This abstract approach allowed easy addition of file system features during Windows NT's development — an interesting example is the addition of fields for indexing used by the Active Directory software.
NTFS allows any sequence of 16-bit values for name encoding (file names, stream names, index names, etc.). This means UTF-16 codepoints are supported, but the file system does not check whether a sequence is valid UTF-16 (it allows any sequence of short values, not restricted to those in the Unicode standard).
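The difference between "any sequence of 16-bit values" and valid UTF-16 can be shown with Python's strict UTF-16 decoder. The helper name and the sample names are illustrative; nothing here touches an actual NTFS volume.

```python
import struct


def is_valid_utf16(units: list[int]) -> bool:
    """Return True if the 16-bit code units form well-formed UTF-16."""
    raw = struct.pack(f"<{len(units)}H", *units)  # little-endian 16-bit values
    try:
        raw.decode("utf-16-le")
    except UnicodeDecodeError:
        return False
    return True


ordinary_name = [ord(c) for c in "report.txt"]
# 'A', an unpaired high surrogate, 'B': a legal sequence of 16-bit values,
# but not valid UTF-16.
lone_surrogate_name = [0x0041, 0xD800, 0x0042]
```

A tool that reads such names would need a lenient decoder (for example Python's "surrogatepass" error handler) rather than a strict one.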
Internally, NTFS uses B+ trees to index file system data. Although complex to implement, this allows faster file look up times in most cases. A file system journal is used to guarantee the integrity of the file system metadata but not individual files' content. Systems using NTFS are known to have improved reliability compared to FAT file systems. The Master File Table (MFT) contains metadata about every file, directory, and metafile on an NTFS volume. It includes filenames, locations, size, and permissions. Its structure supports algorithms which minimize disk fragmentation. A directory entry consists of a filename and a "file ID" which is the record number representing the file in the Master File Table. The file ID also contains a reuse count to detect stale references. While this strongly resembles the W_FID of Files-11, other NTFS structures radically differ.
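The way a file ID combines a record number with a reuse count can be sketched as bit packing. The 48-bit/16-bit split used below is the commonly documented layout of a 64-bit NTFS file reference, but it is not stated in the text above and should be treated as an assumption; the function names are hypothetical.

```python
RECORD_BITS = 48                      # assumed width of the MFT record number
RECORD_MASK = (1 << RECORD_BITS) - 1


def pack_file_reference(record_number: int, reuse_count: int) -> int:
    """Combine an MFT record number and a 16-bit reuse count into one ID."""
    if not 0 <= record_number <= RECORD_MASK:
        raise ValueError("record number out of range")
    if not 0 <= reuse_count <= 0xFFFF:
        raise ValueError("reuse count out of range")
    return (reuse_count << RECORD_BITS) | record_number


def unpack_file_reference(file_id: int) -> tuple[int, int]:
    """Split a file ID into (record_number, reuse_count)."""
    return file_id & RECORD_MASK, file_id >> RECORD_BITS


def is_stale(file_id: int, current_reuse_count: int) -> bool:
    """A reference is stale if the record has been reused since it was made."""
    return unpack_file_reference(file_id)[1] != current_reuse_count
```

When a record is freed and reallocated to a new file, its reuse count changes, so a directory entry still holding the old ID no longer matches.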
Metafiles
NTFS contains several files which define and organize the file system. In all respects, most of these files are structured like any other user file ($Volume being the most peculiar), but are not of direct interest to file system clients. These metafiles define files, back up critical file system data, buffer file system changes, manage free space allocation, satisfy BIOS expectations, track bad allocation units, and store security and disk space usage information. All content is in an unnamed data stream, unless otherwise indicated.
Segment Number | File Name | Purpose
0 | $MFT | Describes all files on the volume, including file names, timestamps, stream names, and lists of cluster numbers where data streams reside, indexes, security identifiers, and file attributes like "read only", "compressed", "encrypted", etc.
1 | $MFTMirr | Duplicate of the first vital entries of $MFT, usually 4 entries (4 KB).
2 | $LogFile | Contains transaction log of file system metadata changes.
3 | $Volume | Contains information about the volume, namely the volume object identifier, volume label, file system version, and volume flags (mounted, chkdsk requested, requested $LogFile resize, mounted on NT 4, volume serial number updating, structure upgrade request). This data is not stored in a data stream, but in special MFT attributes: If present, a volume object ID is stored in an $OBJECT_ID record; the volume label is stored in a $VOLUME_NAME record, and the remaining volume data is in a $VOLUME_INFORMATION record. Note: volume serial number is stored in file $Boot (below).
4 | $AttrDef | A table of MFT attributes which associates numeric identifiers with names.
5 | . | Root directory. Directory data is stored in $INDEX_ROOT and $INDEX_ALLOCATION attributes both named $I30.
6 | $Bitmap | An array of bit entries: each bit indicates whether its corresponding cluster is used (allocated) or free (available for allocation).
7 | $Boot | Volume boot record. This file is always located at the first clusters on the volume. It contains bootstrap code (see NTLDR/BOOTMGR) and a BIOS parameter block including a volume serial number and cluster numbers of $MFT and $MFTMirr. $Boot is usually 8192 bytes long.
8 | $BadClus | A file which contains all the clusters marked as having bad sectors. This file simplifies cluster management by the chkdsk utility, both as a place to put newly discovered bad sectors, and for identifying unreferenced clusters. This file contains two data streams, even on volumes with no bad sectors: an unnamed stream contains bad sectors (it is zero length for perfect volumes); the second stream is named $Bad and contains all clusters on the volume not in the first stream.
9 | $Secure | Access control list database which reduces overhead having many identical ACLs stored with each file, by uniquely storing these ACLs in this database only (contains two indices $SII: perhaps Security ID Index and $SDH: Security Descriptor Hash which index the stream named $SDS containing actual ACL table).
10 | $UpCase | A table of Unicode uppercase characters for ensuring case insensitivity in Win32 and DOS namespaces.
11 | $Extend | A filesystem directory containing various optional extensions, such as $Quota, $ObjId, $Reparse or $UsnJrnl.
12 ... 23 | (none) | Reserved for $MFT extension entries.
usually 24 | $Extend\$Quota | Holds disk quota information. Contains two index roots, named $O and $Q.
usually 25 | $Extend\$ObjId | Holds distributed link tracking information. Contains an index root and allocation named $O.
usually 26 | $Extend\$Reparse | Holds reparse point data (such as symbolic links). Contains an index root and allocation named $R.
27 ... | file.ext | Beginning of regular file entries.
These metafiles are treated specially by Windows and are difficult to directly view: special purpose-built tools are needed.
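As an illustration of the one-bit-per-cluster layout described for $Bitmap, the sketch below tests and counts bits in a byte string. It assumes that bit 0 of each byte (the least significant bit) corresponds to the lowest-numbered cluster in that byte; that ordering is an assumption of this example, and the function names are hypothetical.

```python
def cluster_in_use(bitmap: bytes, cluster: int) -> bool:
    """Return True if the bit for the given cluster number is set."""
    byte_index, bit_index = divmod(cluster, 8)
    # Assumed ordering: least significant bit first within each byte.
    return bool((bitmap[byte_index] >> bit_index) & 1)


def count_free_clusters(bitmap: bytes, total_clusters: int) -> int:
    """Count clusters whose bit is clear (available for allocation)."""
    return sum(
        1 for cluster in range(total_clusters)
        if not cluster_in_use(bitmap, cluster)
    )
```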
From MFT records to attribute lists, attributes, and streams
For each file (or directory) described in the MFT record, there's a linear repository of stream descriptors (also named attributes), packed together in a variable-length record (also named an attributes list), with extra padding to fill the fixed 1KB size of every MFT record, and that fully describes the effective streams associated with that file.
Each stream (or attribute) itself has a single type (internally just a fixed-size integer in the stored descriptor, but most often handled in applications using an equivalent symbolic name in the OpenFile() or CreateFile() API call), a single optional stream name (completely unrelated to the effective filenames), plus optional associated data for that stream. For NTFS, the standard data of files, or the index data for directories, are handled the same way as other data for alternate data streams, or for standard attributes. They are just one of the attributes stored in one or several attribute lists.
• For each file described in the MFT record (or in the non-resident repository of stream descriptors, see below), the stream descriptors identified by their (stream type value, stream name) must be unique. Additionally, NTFS has some ordering constraints for these descriptors.
• There's a predefined null stream type, used to indicate the end of the list of stream descriptors in the streams repository for that file. It must be present as the last stream descriptor in each stream repository (all other storage space available after it will be ignored and just consists of padding bytes to match the record size in the MFT or a cluster size in a non-resident streams repository).
• Some stream types are required and must be present in each MFT record, except unused records that are just indicated by a stream with null stream type.
o This is the case for the standard attributes that are stored as a fixed-size record and containing the timestamps and other basic single-bit attributes (compatible with those managed by FAT/FAT32 in DOS or Windows 95/98 applications).
• Some stream types cannot have a name and must remain anonymous.
o This is the case for the standard attributes, or for the preferred NTFS "filename" stream type, or the "short filename" stream type, when it is also present (for compatibility with DOS-like applications, see below). It is also possible for a file to only contain a short filename, in which case it will be the preferred one, as listed in the Windows Explorer.
o The filename streams stored in the streams repository do not make the file immediately accessible through the hierarchical filesystem. In fact, all the filenames must be indexed separately in at least one separate directory on the same volume, with its own MFT entry and its own security descriptors and attributes, that will reference the MFT entry number for that file. This allows the same file or directory to be "hardlinked" several times from several containers on the same volume, possibly with distinct filenames.
• The default data stream of a regular file is a stream of type $DATA but with an anonymous name, and the ADS's are similar but must be named.
• By contrast, the default data stream of a directory has a distinct type and is not anonymous: it has a stream name ("$I30" in NTFS 3+) that reflects its indexing format.
Resident vs. non-resident data streams
To optimize the storage and reduce the I/O overhead for the very common case of streams with very small associated data, NTFS prefers to place this data within the stream descriptor (if the size of the stream descriptor does not then exceed the maximum size of the MFT record or the maximum size of a single entry within a non-resident stream repository, see below), instead of using the MFT entry space to list clusters containing the data. In the latter case, the stream descriptor does not store the data directly but just stores an allocation map pointing to the actual data stored elsewhere on the volume. When the stream data can be accessed directly from within the stream descriptor, it is called "resident data" by computer forensics workers. The amount of data which fits is highly dependent on the file's characteristics, but 700 to 800 bytes is common in single-stream files with non-lengthy filenames and no ACLs.
• Some stream descriptors (such as the preferred filename, the basic file attributes, or the main allocation map for each non-resident stream) cannot be made non-resident.
• Encrypted-by-NTFS, sparse data streams, or compressed data streams cannot be made resident.
• The format of the allocation map for non-resident streams depends on its capability of supporting sparse data storage. In the current implementation of NTFS, once a non-resident stream data has been marked and converted as sparse, it cannot be reverted to non-sparse data, so it cannot become resident again, unless this data is fully truncated, discarding the sparse allocation map completely.
• When a non-resident data stream is too fragmented for its effective allocation map to fit entirely within the MFT record, the allocation map may also be stored as a non-resident stream, with just a small resident stream containing the indirect allocation map to the effective non-resident allocation map of the non-resident data stream.
• When there are too many streams for a file (including ADS's, extended attributes, or security descriptors), so that their descriptors cannot fit all within the MFT record, a non-resident stream may also be used to store an additional repository for the other stream descriptors (except those few small streams that cannot be non-resident), using the same format as the one used in the MFT record, but without the space constraints of the MFT record.
The NTFS filesystem driver will sometimes attempt to relocate the data of some of these non-resident streams into the streams repository, and will also attempt to relocate the stream descriptors stored in a non-resident repository back to the stream repository of the MFT record, based on priority and preferred ordering rules, and size constraints.
Since resident files do not directly occupy clusters ("allocation units"), it is possible for an NTFS volume to contain more files than there are clusters. For example, an 80 GB (74.5 GiB) partition formatted with NTFS has 19,543,064 clusters of 4 KB. Subtracting system files (64 MB log file, a 2,442,888-byte $Bitmap file, and about 25 clusters of fixed overhead) leaves 19,526,158 clusters free for files and indices. Since there are four MFT records per cluster, this volume theoretically could hold almost 4 × 19,526,158 = 78,104,632 resident files.
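The arithmetic of this example can be checked with a few lines. The free-cluster count is taken from the text as given rather than recomputed, and rounding the $Bitmap size up to a multiple of 8 bytes is an assumption that happens to reproduce the quoted figure.

```python
CLUSTER_BYTES = 4 * 1024     # 4 KB clusters, as in the example
MFT_RECORD_BYTES = 1024      # fixed 1 KB MFT record size
TOTAL_CLUSTERS = 19_543_064
FREE_CLUSTERS = 19_526_158   # figure quoted in the text, not recomputed here


def bitmap_bytes(total_clusters: int) -> int:
    """One bit per cluster, rounded up to a multiple of 8 bytes (assumed)."""
    return (total_clusters + 63) // 64 * 8


records_per_cluster = CLUSTER_BYTES // MFT_RECORD_BYTES
max_resident_files = records_per_cluster * FREE_CLUSTERS
partition_gib = TOTAL_CLUSTERS * CLUSTER_BYTES / 2**30
```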
Limitations
The following are a few limitations of NTFS:
File Names
File names are limited to 255 UTF-16 code words. Certain names are reserved in the volume root directory and cannot be used for files. These are: $MFT, $MFTMirr, $LogFile, $Volume, $AttrDef, . (dot), $Bitmap, $Boot, $BadClus, $Secure, $UpCase, and $Extend; . (dot) and $Extend are both directories; the others are files. The NT kernel limits full paths to 32,767 UTF-16 code words.
Maximum Volume Size
In theory, the maximum NTFS volume size is 2^64−1 clusters. However, the maximum NTFS volume size as implemented in Windows XP Professional is 2^32−1 clusters. For example, using 64 KB clusters, the maximum NTFS volume size is 256 TB minus 64 KB. Using the default cluster size of 4 KB, the maximum NTFS volume size is 16 TB minus 4 KB. (Both of these are vastly higher than the 128 GB limit lifted in Windows XP SP1.) Because partition tables on master boot record (MBR) disks only support partition sizes up to 2 TB, dynamic or GPT volumes must be used to create NTFS volumes over 2 TB. Booting from a GPT volume to a Windows environment requires a system with EFI and 64-bit support.
Maximum File Size
Theoretical: 16 EB minus 1 KB (2^64 − 2^10 or 18,446,744,073,709,550,592 bytes). Implementation: 16 TB minus 64 KB (2^44 − 2^16 or 17,592,185,978,880 bytes).
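The volume and file size limits quoted in this section follow from simple powers of two, as the sketch below shows. The units are binary (1 KB = 2^10 bytes), and the 2 TB MBR figure is derived here on the assumption of a 32-bit sector count with 512-byte sectors.

```python
KB = 2**10
TB = 2**40
EB = 2**60


def max_volume_bytes(cluster_bytes: int, cluster_index_bits: int = 32) -> int:
    """Largest volume when the cluster count is limited to 2**bits - 1."""
    return (2**cluster_index_bits - 1) * cluster_bytes


max_volume_64k = max_volume_bytes(64 * KB)  # 256 TB minus 64 KB
max_volume_4k = max_volume_bytes(4 * KB)    # 16 TB minus 4 KB
mbr_partition_limit = 2**32 * 512           # assumed: 32-bit count of 512-byte sectors
theoretical_max_file = 2**64 - 2**10        # 16 EB minus 1 KB
implemented_max_file = 2**44 - 2**16        # 16 TB minus 64 KB
```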
Alternate Data Streams
Windows system calls may handle alternate data streams. Depending on the operating system, utility and remote file system, a file transfer might silently strip data streams. A safe way of copying or moving files is to use the BackupRead and BackupWrite system calls, which allow programs to enumerate streams, to verify whether each stream should be written to the destination volume and to knowingly skip offending streams.
File Allocation Table or FAT is a computer file system architecture now widely used on many computer systems and most memory cards, such as those used with digital cameras. FAT file systems are commonly found on floppy disks, flash memory cards, digital cameras, and many other portable devices because of their relative simplicity. Performance of FAT compares poorly to most other file systems as it uses overly simplistic data structures, making file operations time-consuming, and makes poor use of disk space in situations where many small files are present.
For floppy disks, the FAT has been standardized as ECMA-107 and ISO/IEC 9293. Those standards include only FAT12 and FAT16 without long filename support; long filenames with FAT are partially patented.
The FAT file system is relatively straightforward technically and is supported by virtually all existing operating systems for personal computers. This makes it a useful format for solid-state memory cards and a convenient way to share data between operating systems.
History
The FAT file system was developed by Bill Gates and Marc McDonald during 1976–1977. It was the primary file system for various operating systems including DR-DOS, FreeDOS, MS-DOS, OS/2 (v1.1) and Microsoft Windows (up until Windows Me).
The FAT file system was created for managing disks in Microsoft Standalone Disk BASIC. In August 1980 Tim Paterson incorporated FAT into his 86-DOS operating system for the S-100 8086 CPU boards; the file system was the main difference between 86-DOS and its predecessor, CP/M.
The name originates from the usage of a table which centralizes the information about which areas belong to files, are free or possibly unusable, and where each file is stored on the disk. To limit the size of the table, disk space is allocated to files in contiguous groups of hardware sectors called clusters. As disk drives have evolved, the maximum number of clusters has dramatically increased, and so the number of bits used to identify each cluster has grown. The successive major versions of the FAT format are named after the number of table element bits: 12, 16, and 32. The FAT standard has also been expanded in other ways while preserving backward compatibility with existing software.
FAT12
The initial version of FAT is now referred to as FAT12. Designed as a file system for floppy disks, it limited cluster addresses to 12-bit values, which not only limited the cluster count to 4078, but made FAT manipulation tricky with the PC's 8-bit and 16-bit registers. (Under Linux, FAT12 is limited to 4084 clusters.) The disk's size is stored as a 16-bit count of sectors, which limited the size to 32 MB. FAT12 was used by several manufacturers with different physical formats, but a typical floppy disk at the time was 5.25-inch, single-sided, 40 tracks, with 8 sectors per track, resulting in a capacity of 160 KB for both the system areas and files. The FAT12 limitations exceeded this capacity by a factor of ten or more.
By convention, all the control structures were organized to fit inside the first track, thus avoiding head movement during read and write operations, although this varied depending on the manufacturer and physical format of the disk. At the time FAT12 was introduced, DOS did not support hierarchical directories, and the maximum number of files was typically limited to a few dozen. Hierarchical directories were introduced in MS-DOS version 2.0. A limitation which was not addressed until much later was that any bad sector in the control structures area, track 0, could prevent the disk from being usable. The DOS formatting tool rejected such disks completely. Bad sectors were allowed only in the file area, where they made the entire holding cluster unusable as well. FAT12 remains in use on all common floppy disks, including 1.44 MB ones.
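The awkwardness of 12-bit entries comes from packing two entries into three bytes, so that every other entry straddles a byte boundary. The sketch below shows the usual way of reading one entry, together with the capacity arithmetic from the paragraph above; the function name is hypothetical and 512-byte sectors are assumed.

```python
def fat12_entry(fat: bytes, cluster: int) -> int:
    """Read the 12-bit table entry for the given cluster number."""
    offset = cluster + cluster // 2              # 1.5 bytes per entry
    pair = fat[offset] | (fat[offset + 1] << 8)  # little-endian 16-bit read
    # Odd entries occupy the upper 12 bits of the pair, even entries the lower.
    return pair >> 4 if cluster & 1 else pair & 0x0FFF


SECTOR_BYTES = 512                               # assumed sector size
floppy_bytes = 1 * 40 * 8 * SECTOR_BYTES         # 1 side, 40 tracks, 8 sectors
max_fat12_disk_bytes = 2**16 * SECTOR_BYTES      # 16-bit sector count
```

With these figures, floppy_bytes is 160 KB and max_fat12_disk_bytes is 32 MB, matching the values quoted in the text.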
Initial FAT16
In 1984, IBM released the PC AT, which featured a 20 MB hard disk. Microsoft introduced MS-DOS 3.0 in parallel. (The earlier PC XT was the first PC with a hard drive from IBM, and MS-DOS 2.0 supported that hard drive with FAT12.) Cluster addresses were increased to 16-bit, allowing for up to 65,517 clusters per volume, and consequently much greater file system sizes, at least in theory. However, the maximum possible number of sectors and the maximum (partition, rather than disk) size of 32 MB did not change. Therefore, although technically already "FAT16", this format was not what today is commonly understood as FAT16. With the initial implementation of FAT16 not actually providing for larger partition sizes than FAT12, the early benefit of FAT16 was to enable the use of smaller clusters, making disk usage more efficient, particularly for files several hundred bytes in size, which were far more common at the time. Also, the introduction of FAT16 actually did bring an increase in the maximum partition size under MS-DOS, since the implementation of FAT12 for hard disks in MS-DOS 2.0 was limited to 15 MB. (That is, the initial FAT16 did not support larger drives than FAT12, but MS-DOS 3.0 using FAT16 did support larger drives than MS-DOS 2.0 using FAT12, by a factor of two.)
A 20 MB hard disk formatted under MS-DOS 3.0 was not accessible by the older MS-DOS 2.0. (This was because MS-DOS 2.0 did not support version 3.0's FAT16 and because it did not support hard disk partitions over 15 MB in size.) Of course, MS-DOS 3.0 could still access MS-DOS 2.0 style 8 KB-cluster partitions.
MS-DOS 3.0 also introduced support for high-density 1.2 MB 5.25" diskettes, which notably had 15 sectors per track, hence more space for the FATs. This probably prompted a dubious optimization of the cluster size, which went down from 2 sectors to just 1. The net effect was that high density diskettes were significantly slower than older double density ones.
Extended partition and logical drives
Apart from improving the structure of the FAT file system itself, a parallel development allowing an increase in the maximum possible FAT size was the introduction of multiple FAT partitions. Originally DOS was only prepared to handle one FAT partition, although it came with documentation and programming tools for the creation of installable device drivers to handle multiple partitions, and third-party suppliers quickly provided the missing software. Aside from that, partitions were used for sharing the disk between operating systems, typically DOS and Xenix at the time. Extra DOS partitions could not be used as boot partitions, because the installable device drivers were loaded (in config.sys) only after the first part of the DOS boot process. Later, third party tools became available that replaced the DOS master boot record (MBR) and directly loaded non-DOS drivers before DOS: such systems generally came with careful warnings that without the 3rd party software, the disk would not be compatible with DOS. Simply allowing several identical-looking DOS partitions could lead to naming problems: behaviour if more than one partition was marked active was undocumented (although well defined), as was the behaviour if there was more than one hard disk in the computer (which was machine dependent), or if the system was booted from a diskette.
To allow the use of more FAT partitions in a compatible way, a new partition type was introduced (in MS-DOS 3.2, January 1986), the extended partition, which is a container for additional partitions called logical drives. Originally only one logical drive was possible, permitting hard disks up to 64 MB. In MS-DOS 3.3 (August 1987) this limit was increased to 24 drives, equal to the maximum number of available letters for drive names (A and B being reserved for the first two floppy drives, at least one of which many, if not most, systems of the era were equipped with; where only one was installed, B always simulated a second drive using A). Logical drives are described by on-disk structures which closely resemble the Master Boot Record (MBR) of the disk (which describes the primary partitions), likely to simplify the implementation. Though some believe these partitions were nested in a way analogous to Russian matryoshka dolls, that isn't the case. They are stored as a row of separate blocks within a single box; these blocks are often referred to as being chained together, by the links in their extended boot record (EBR) sectors. Only one extended partition is allowed. Under MS-DOS, logical drives are not bootable, and the extended partition can only be created after the primary FAT partition, which removes all ambiguity but also eliminates the possibility of booting several DOS versions from the same hard disk. (A few systems other than MS-DOS can boot logical drives, and partitions can be created in any order using third party formatting tools.)
A useful side-effect of the extended partition scheme was to significantly increase the maximum number of partitions possible on a PC hard disk beyond the four which could be described by the MBR alone.
Prior to the introduction of extended partitions, some hard disk controllers (which at that time were usually separate option boards) could make large hard disks appear at the hardware interface level as two separate disks. Otherwise, DOS "Block Device Drivers" were used to access the other 3 possible partitions on a disk.
Final FAT16
Finally in November 1987, Compaq DOS 3.31 (an OEM version of MS-DOS 3.3 released by Compaq with their machines) introduced what is today called the FAT16 format, with the expansion of the 16-bit disk sector count to 32 bits. The result was initially called the DOS 3.31 Large File System. Although the on-disk changes were minor, the entire DOS disk driver had to be converted to use 32-bit sector numbers, a task complicated by the fact that it was written in 16-bit assembly language.
In 1988 this improvement became more generally available through MS-DOS 4.0 and OS/2 1.1. The limit on partition size was dictated by the 8-bit signed count of sectors per cluster, which had a maximum power-of-two value of 64. With the standard hard disk sector size of 512 bytes, this gives a maximum of 32 KB clusters, thereby fixing the "definitive" limit for the FAT16 partition size at 2 gigabytes. On magneto-optical media, which can have 1 or 2 KB sectors instead of 1/2 KB, this size limit is proportionally larger.
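The arithmetic behind the 2 gigabyte figure can be checked with a short sketch. The function name is hypothetical, and the count of 65,536 clusters is a simplifying assumption: a few FAT entry values are reserved, so real volumes are slightly smaller.

```c
#include <stdint.h>

/* Approximate upper bound of a FAT16 volume in bytes.
 * Assumption: all 2^16 cluster numbers are usable; in practice a few
 * values are reserved, so real volumes are slightly smaller. */
uint64_t fat16_max_volume_bytes(uint32_t bytes_per_sector,
                                uint32_t sectors_per_cluster)
{
    uint64_t cluster_bytes = (uint64_t)bytes_per_sector * sectors_per_cluster;
    return cluster_bytes * 65536u;
}
```

With 512-byte sectors and 64 sectors per cluster the result is 2 GiB; with 2 KB sectors, as on some magneto-optical media, the same cluster count gives 8 GiB.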
Much later, Windows NT increased the maximum cluster size to 64 KB by considering the sectors-per-cluster count as unsigned. However, the resulting format was not compatible with any other FAT implementation of the time, and it generated greater internal fragmentation. Windows 98 also supported reading and writing this variant, but its disk utilities did not work with it.
The number of root directory entries available is determined when the volume is formatted, and is stored in a 16-bit signed field, defining an absolute limit of 32767 entries (32736, a multiple of 32, in practice). For historical reasons, FAT12 and FAT16 media generally use 512 root directory entries on non-floppy media. Other sizes may be incompatible with some software or devices (entries being file and/or folder names in the original 8.3 format). Some third-party tools like mkdosfs allow the user to set this parameter.
Long file names
One of the user experience goals for the designers of Windows 95 was the ability to use long filenames (LFNs—up to 255 UTF-16 code points long), in addition to classic 8.3 filenames. LFNs were implemented using a workaround in the way directory entries are laid out (see below).
The version of the file system with this extension is usually known as VFAT after the Windows 95 virtual device driver, also known as "Virtual FAT" in Microsoft's documentation. Interestingly, the VFAT driver actually appeared before Windows 95, in Windows for Workgroups 3.11, but was only used for implementing 32-bit file access and did not support long file names.
In Windows NT, support for long filenames on FAT started from version 3.5. OS/2 added long filename support to FAT using extended attributes (EA) before the introduction of VFAT; thus, VFAT long filenames are invisible to OS/2, and EA long filenames are invisible to Windows.
FAT32
In order to overcome the size limit of FAT16, while at the same time allowing DOS real mode code to handle the format, and without reducing available conventional memory unnecessarily, Microsoft implemented a next generation, known as FAT32. Cluster values are represented by 32-bit numbers, of which 28 bits are used to hold the cluster number, for a maximum of approximately 268 million (2^28) clusters. This allows for drive sizes of up to 8 terabytes with 32 KB clusters, but the boot sector uses a 32-bit field for the sector count, limiting volume size to 2 TB on a hard disk with 512-byte sectors.
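The two FAT32 limits described above can be compared with a small calculation. This is only a sketch: the function names are hypothetical, and both limits are rounded to exact powers of two (2^28 clusters, 2^32 sectors) rather than accounting for reserved values.

```c
#include <stdint.h>

/* Limit from 28-bit cluster numbers: 2^28 clusters of the given size. */
uint64_t fat32_cluster_limit_bytes(uint32_t cluster_bytes)
{
    return ((uint64_t)1 << 28) * cluster_bytes;
}

/* Limit from the 32-bit sector count in the boot sector,
 * approximated here as 2^32 sectors. */
uint64_t fat32_sector_limit_bytes(uint32_t bytes_per_sector)
{
    return ((uint64_t)1 << 32) * bytes_per_sector;
}

/* The effective limit is the smaller of the two. */
uint64_t fat32_max_volume_bytes(uint32_t bytes_per_sector,
                                uint32_t sectors_per_cluster)
{
    uint64_t by_cluster =
        fat32_cluster_limit_bytes(bytes_per_sector * sectors_per_cluster);
    uint64_t by_sector = fat32_sector_limit_bytes(bytes_per_sector);
    return by_cluster < by_sector ? by_cluster : by_sector;
}
```

For 512-byte sectors and 32 KB clusters, the cluster limit is 2^43 bytes (8 TiB) and the sector limit is 2^41 bytes (2 TiB), so the sector count field is the binding constraint.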
On Windows 95/98, due to the version of Microsoft's SCANDISK utility included with these operating systems being a 16-bit application, the FAT structure is not allowed to grow beyond around 4.2 million (< 2^22) clusters, placing the volume limit at 127.53 GiB. A limitation in original versions of Windows 98/98SE's Fdisk utility causes it to incorrectly report disk sizes over 64 GB. A corrected version is available from Microsoft, but it cannot partition drives larger than 512 GB. The Windows 2000/XP installation program and filesystem creation tool impose a limitation of 32 GB. However, both systems can read and write to FAT32 file systems of any size. This limitation is by design and according to Microsoft was imposed because many tasks on a very large FAT32 file system become slow and inefficient. This limitation can be bypassed by using third-party formatting utilities. Windows Me supports the FAT32 file system without any limits. However, similarly to Windows 95/98/98SE there is no native support for 48-bit LBA in Windows Me, meaning that the maximum disk size for ATA disks is 127.6 GiB, the maximum size of an ATA disk using the previous long-standard 28-bit LBA.
FAT32 was introduced with Windows 95 OSR2, although reformatting was needed to use it, and DriveSpace 3 (the version that came with Windows 95 OSR2 and Windows 98) never supported it. Windows 98 introduced a utility to convert existing hard disks from FAT16 to FAT32 without loss of data. In the NT line, native support for FAT32 arrived in Windows 2000. A free FAT32 driver for Windows NT 4.0 was available from Winternals, a company later acquired by Microsoft. Since the acquisition the driver is no longer officially available.
The maximum possible size for a file on a FAT32 volume is 4 GB minus 1 byte (2^32 − 1 bytes). Video applications, large databases, and some other software easily exceed this limit. Larger files require another formatting type such as NTFS.
Fragmentation
The FAT file system does not contain mechanisms which prevent newly written files from becoming scattered across the partition. Other file systems, like HPFS, use free space bitmaps that indicate used and available clusters, which could then be quickly looked up in order to find free contiguous areas (improved in exFAT). Another solution is the linkage of all free clusters into one or more lists (as is done in Unix file systems). Instead, the FAT has to be scanned as an array to find free clusters, which can lead to performance penalties with large disks.
In fact, computing free disk space on FAT is one of the most resource intensive operations, as it requires reading the entire FAT linearly. A possible justification suggested by Microsoft's Raymond Chen for limiting the maximum size of FAT32 partitions created on Windows was the time required to perform a "DIR" operation, which always displays the free disk space as the last line. Displaying this line took longer and longer as the number of clusters increased.
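The linear scan described above might look like the following for FAT16. It is an illustrative sketch, not the code of any particular driver: the function name is hypothetical, and the whole table is assumed to be already loaded into memory in its on-disk little-endian form.

```c
#include <stddef.h>
#include <stdint.h>

/* Count free clusters by scanning an in-memory copy of a FAT16 table.
 * `fat` points to the raw little-endian table; `total_entries` includes
 * the two reserved entries 0 and 1, which never describe data clusters. */
size_t fat16_count_free(const uint8_t *fat, size_t total_entries)
{
    size_t free_count = 0;
    for (size_t i = 2; i < total_entries; i++) {
        uint16_t entry = (uint16_t)(fat[2 * i] | (fat[2 * i + 1] << 8));
        if (entry == 0x0000)      /* 0 marks an unused cluster */
            free_count++;
    }
    return free_count;
}
```

Every entry must be visited, so the cost grows in direct proportion to the number of clusters on the volume.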
The High Performance File System (HPFS) divides disk space into bands, which have their own free space bitmap, where multiple files opened for simultaneous write could be expanded separately.
Some of the perceived problems with fragmentation resulted from operating system and hardware limitations.
The single-tasking DOS and the traditionally single-tasking PC hard disk architecture (only 1 outstanding input/output request at a time, no DMA transfers) did not contain mechanisms which could alleviate fragmentation by asynchronously prefetching next data while the application was processing the previous chunks.
Similarly, write-behind caching was often not enabled by default with Microsoft software (if present) given the problem of data loss in case of a crash, made easier by the lack of hardware protection between applications and the system.
MS-DOS also did not offer a system call which would allow applications to make sure a particular file has been completely written to disk in the presence of deferred writes (cf. fsync in Unix or DosBufReset in OS/2). Disk caches on MS-DOS were operating on disk block level and were not aware of higher-level structures of the file system. In this situation, cheating with regard to the real progress of a disk operation was most dangerous.
Modern operating systems have introduced these optimizations to FAT partitions, but optimizations can still produce unwanted artifacts in case of a system crash. A Windows NT system will allocate space to files on FAT in advance, selecting large contiguous areas, but in case of a crash, files which were being appended will appear larger than they were ever written into, with dozens of random kilobytes at the end.
With the large cluster sizes, 16 or 32K, forced by larger FAT32 partitions, the external fragmentation becomes somewhat less significant, and internal fragmentation, i.e. disk space waste (since files are rarely exact multiples of cluster size), starts to be a problem as well, especially when there are a great many small files.
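Internal fragmentation can be quantified per file as the unused tail of its last cluster. The helper below is a minimal sketch with a hypothetical name; it assumes that an empty file occupies no cluster at all.

```c
#include <stdint.h>

/* Bytes allocated but unused at the end of a file's last cluster.
 * Assumption: a zero-length file occupies no clusters, so it wastes none. */
uint64_t fat_slack_bytes(uint64_t file_size, uint32_t cluster_bytes)
{
    uint64_t remainder = file_size % cluster_bytes;
    return remainder == 0 ? 0 : cluster_bytes - remainder;
}
```

A 1 KB file on a volume with 32 KB clusters wastes 31 KB, which is why volumes holding many small files suffer most.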
Design
Overview
The following is an overview of the order of structures in a FAT partition or disk:
Contents: Size in sectors
Boot Sector, FS Information Sector (FAT32 only), more reserved sectors (optional): (number of reserved sectors)
File Allocation Table #1, File Allocation Table #2: (number of FATs) * (sectors per FAT)
Root Directory (FAT12/16 only): (number of root entries * 32) / (bytes per sector)
Data Region (for files and directories), to end of partition or disk: NumberOfClusters * SectorsPerCluster
A FAT file system is composed of four different sections.
1. The Reserved sectors, located at the very beginning. The first reserved sector (sector 0) is the Boot Sector (aka Partition Boot Record). It includes an area called the BIOS Parameter Block (with some basic file system information, in particular its type, and pointers to the location of the other sections) and usually contains the operating system's boot loader code. The total count of reserved sectors is indicated by a field inside the Boot Sector. Important information from the Boot Sector is accessible through an operating system structure called the Drive Parameter Block in DOS and OS/2. For FAT32 file systems, the reserved sectors include a File System Information Sector at sector 1 and a Backup Boot Sector at Sector 6.
2. The FAT Region. This typically contains two copies (may vary) of the File Allocation Table for the sake of redundancy checking, although the extra copy is rarely used, even by disk repair utilities. These are maps of the Data Region, indicating which clusters are used by files and directories.
3. The Root Directory Region. This is a Directory Table that stores information about the files and directories located in the root directory. It is only used with FAT12 and FAT16, and imposes on the root directory a fixed maximum size which is pre-allocated at creation of this volume. FAT32 stores the root directory in the Data Region, along with files and other directories, allowing it to grow without such a constraint. Thus, for FAT32, the Data Region starts here.
4. The Data Region. This is where the actual file and directory data is stored and takes up most of the partition. The size of files and subdirectories can be increased arbitrarily (as long as there are free clusters) by simply adding more links to the file's chain in the FAT. Note however, that files are allocated in units of clusters, so if a 1 KB file resides in a 32 KB cluster, 31 KB are wasted. FAT32 typically commences the Root Directory Table in cluster number 2: the first cluster of the Data Region.
FAT uses little endian format for entries in the header and the FAT(s).
Boot Sector
It is important to note that the first sector on a device isn't necessarily the boot sector. For partitioned devices (such as hard drives), the first sector is the Master Boot Record. On unpartitioned devices (e.g. floppy disks), the first sector is the Volume Boot Record.
Common structure of the first 36 bytes used by all FAT versions:
Byte Offset Length (bytes) Description
0x00 3 Jump instruction. If the partition is booted from, this instruction is executed and skips past the rest of the (non-executable) header. See Volume Boot Record. If the jump is a two-byte short jmp, it is followed by a NOP instruction.
0x03 8 OEM Name (padded with spaces). This value identifies the system under which the disk was formatted. MS-DOS checks this field to determine which other parts of the boot record can be relied on. Common values are IBM  3.3 (with two spaces between the "IBM" and the "3.3"), MSDOS5.0, MSWIN4.1 and mkdosfs.
0x0b 2 Bytes per sector. A common value is 512, especially for file systems on IDE (or compatible) disks. The BIOS Parameter Block starts here.
0x0d 1 Sectors per cluster. Allowed values are powers of two from 1 to 128. However, the value must not be such that the number of bytes per cluster becomes greater than 32 KB.
0x0e 2 Reserved sector count. The number of sectors before the first FAT in the file system image. Should be 1 for FAT12/FAT16. Usually 32 for FAT32.
0x10 1 Number of file allocation tables. Almost always 2.
0x11 2 Maximum number of root directory entries. Only used on FAT12 and FAT16, where the root directory is handled specially. Should be 0 for FAT32. This value should always be such that the root directory ends on a sector boundary (i.e. such that its size becomes a multiple of the sector size). 224 is typical for floppy disks.
0x13 2 Total sectors (if zero, use 4 byte value at offset 0x20)
0x15 1 Media descriptor
0xF0 3.5" Double Sided, 80 tracks per side, 18 or 36 sectors per track (1.44MB or 2.88MB). 5.25" Double Sided, 80 tracks per side, 15 sectors per track (1.2MB). Used also for other media types.
0xF8 Fixed disk (i.e. Hard disk).
0xF9 3.5" Double sided, 80 tracks per side, 9 sectors per track (720K). 5.25" Double sided, 80 tracks per side, 15 sectors per track (1.2MB)
0xFA 5.25" Single sided, 80 tracks per side, 8 sectors per track (320K)
0xFB 3.5" Double sided, 80 tracks per side, 8 sectors per track (640K)
0xFC 5.25" Single sided, 40 tracks per side, 9 sectors per track (180K)
0xFD 5.25" Double sided, 40 tracks per side, 9 sectors per track (360K). Also used for 8".
0xFE 5.25" Single sided, 40 tracks per side, 8 sectors per track (160K). Also used for 8".
0xFF 5.25" Double sided, 40 tracks per side, 8 sectors per track (320K)
The same media descriptor value should be repeated as the first byte of each copy of the FAT. Certain operating systems (MSX-DOS version 1.0) ignore the boot sector parameters altogether and use the media descriptor value from the first byte of the FAT to determine file system parameters.
0x16 2 Sectors per File Allocation Table for FAT12/FAT16
0x18 2 Sectors per track
0x1a 2 Number of heads
0x1c 4 Count of hidden sectors preceding the partition that contains this FAT volume. This field should always be zero on media that are not partitioned.
0x20 4 Total sectors (if greater than 65535; otherwise, see offset 0x13)
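Reading the common fields listed above from a raw 512-byte boot sector could be sketched as follows. The struct and function names are hypothetical; the offsets are the ones given in the table, and all multi-byte values are assembled as little-endian so that the code does not depend on the host byte order.

```c
#include <stdint.h>

/* Fields shared by all FAT versions (boot sector offsets 0x0b to 0x23). */
struct fat_bpb {
    uint16_t bytes_per_sector;
    uint8_t  sectors_per_cluster;
    uint16_t reserved_sectors;
    uint8_t  num_fats;
    uint16_t root_entries;
    uint32_t total_sectors;      /* taken from 0x13, or 0x20 if 0x13 is zero */
    uint8_t  media_descriptor;
    uint16_t sectors_per_fat16;  /* FAT12/FAT16 only */
    uint16_t sectors_per_track;
    uint16_t heads;
    uint32_t hidden_sectors;
};

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

void fat_parse_bpb(const uint8_t *sector, struct fat_bpb *out)
{
    uint16_t small_total = rd16(sector + 0x13);

    out->bytes_per_sector    = rd16(sector + 0x0b);
    out->sectors_per_cluster = sector[0x0d];
    out->reserved_sectors    = rd16(sector + 0x0e);
    out->num_fats            = sector[0x10];
    out->root_entries        = rd16(sector + 0x11);
    out->total_sectors       = small_total != 0 ? small_total
                                                : rd32(sector + 0x20);
    out->media_descriptor    = sector[0x15];
    out->sectors_per_fat16   = rd16(sector + 0x16);
    out->sectors_per_track   = rd16(sector + 0x18);
    out->heads               = rd16(sector + 0x1a);
    out->hidden_sectors      = rd32(sector + 0x1c);
}
```

The sketch performs no validation; a real reader would at least check that the bytes-per-sector and sectors-per-cluster values are plausible before trusting the rest.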
Extended BIOS Parameter Block
Further structure used by FAT12 and FAT16, also known as Extended BIOS Parameter Block:
Byte Offset Length (bytes) Description
0x24 1 Physical drive number (0x00 for removable media, 0x80 for hard disks)
0x25 1 Reserved ("current head")
In Windows NT, bit 0 is a dirty flag to request chkdsk at boot time. Bit 1 requests a surface scan too.
0x26 1 Extended boot signature. (Should be 0x29. Indicates that the following 3 entries exist.)
0x27 4 ID (serial number)
0x2b 11 Volume Label, padded with blanks (0x20).
0x36 8 FAT file system type, padded with blanks (0x20), e.g.: "FAT12   ", "FAT16   ". This is not meant to be used to determine drive type; however, some utilities use it in this way.
0x3e 448 Operating system boot code
0x1FE 2 Boot sector signature (0x55 0xAA)
The boot sector is portrayed here as found on e.g. an OS/2 1.3 boot diskette. Earlier versions used a shorter BIOS Parameter Block and their boot code would start earlier (for example at offset 0x2b in OS/2 1.1).
Further structure used by FAT32:
Byte Offset Length (bytes) Description
0x24 4 Sectors per file allocation table
0x28 2 FAT Flags (Only used during a conversion from a FAT12/16 volume.)
0x2a 2 Version (Defined as 0)
0x2c 4 Cluster number of root directory start
0x30 2 Sector number of FS Information Sector
0x32 2 Sector number of a copy of this boot sector (0 if no backup copy exists)
0x34 12 Reserved
0x40 1 Physical Drive Number (see FAT12/16 BPB at offset 0x24)
0x41 1 Reserved (see FAT12/16 BPB at offset 0x25)
0x42 1 Extended boot signature. (see FAT12/16 BPB at offset 0x26)
0x43 4 ID (serial number)
0x47 11 Volume Label
0x52 8 FAT file system type: "FAT32 "
0x5a 420 Operating system boot code
0x1FE 2 Boot sector signature (0x55 0xAA)
Exceptions
The implementation of FAT used in MS-DOS for the Apricot PC had a different boot sector layout, to accommodate that computer's non-IBM compatible BIOS. The jump instruction and OEM name were omitted, and the MS-DOS file system parameters (offsets 0x0B - 0x17 in the standard sector) were located at offset 0x50. Later versions of Apricot MS-DOS gained the ability to read and write disks with the standard boot sector in addition to those with the Apricot one.
DOS Plus on the BBC Master 512 did not use conventional boot sectors at all. Data disks omitted the boot sector and began with a single copy of the FAT (the first byte of the FAT was used to determine disk capacity) while boot disks began with a miniature ADFS file system containing the boot loader, followed by a single FAT. It could also access standard PC disks formatted to 180 KB or 360 KB, again using the first byte of the FAT to determine the capacity.
FS Information Sector
The "FS Information Sector" was introduced in FAT32[30] for speeding up access times of certain operations (in particular, getting the amount of free space). It is located at a sector number specified in the boot record at position 0x30 (usually sector 1, immediately after the boot record).
Byte Offset Length (bytes) Description
0x00 4 FS information sector signature (0x52 0x52 0x61 0x41 / "RRaA")
0x04 480 Reserved (byte values are 0x00)
0x1e4 4 FS information sector signature (0x72 0x72 0x41 0x61 / "rrAa")
0x1e8 4 Number of free clusters on the drive, or -1 if unknown
0x1ec 4 Number of the most recently allocated cluster
0x1f0 14 Reserved (byte values are 0x00)
0x1fe 2 FS information sector signature (0x55 0xAA)
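A reader of the FS Information Sector would normally verify all three signatures before trusting the free-cluster count. The sketch below follows the offsets in the table above; the function name and return convention are assumptions made for illustration.

```c
#include <stdint.h>
#include <string.h>

static uint32_t fsinfo_rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 and fills the outputs if all signatures match, otherwise 0.
 * A free_clusters value of 0xFFFFFFFF (-1) means "unknown". */
int fat32_read_fsinfo(const uint8_t *sector,
                      uint32_t *free_clusters,
                      uint32_t *last_allocated)
{
    if (memcmp(sector, "RRaA", 4) != 0)
        return 0;
    if (memcmp(sector + 0x1e4, "rrAa", 4) != 0)
        return 0;
    if (sector[0x1fe] != 0x55 || sector[0x1ff] != 0xAA)
        return 0;

    *free_clusters  = fsinfo_rd32(sector + 0x1e8);
    *last_allocated = fsinfo_rd32(sector + 0x1ec);
    return 1;
}
```

Because the stored count is only a hint and may be stale or unknown, a caller should be prepared to fall back to a full scan of the FAT.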
File Allocation Table
A partition is divided up into identically sized clusters, small blocks of contiguous space. Cluster sizes vary depending on the type of FAT file system being used and the size of the partition; typically cluster sizes lie somewhere between 2 KB and 32 KB. Each file may occupy one or more of these clusters depending on its size; thus, a file is represented by a chain of these clusters (referred to as a singly linked list). However, these clusters are not necessarily stored adjacent to one another on the disk's surface but are often instead fragmented throughout the Data Region.
The File Allocation Table (FAT) is a list of entries that map to each cluster on the partition. Each entry records one of five things:
• the cluster number of the next cluster in a chain
• a special end of clusterchain (EOC) entry that indicates the end of a chain
• a special entry to mark a bad cluster
• a special entry to mark a reserved cluster
• a zero to note that the cluster is unused
Each version of the FAT file system uses a different size for FAT entries. Smaller numbers result in a smaller FAT table, but waste space in large partitions by needing to allocate in large clusters. The FAT12 file system uses 12 bits per FAT entry, thus two entries span 3 bytes. It is consistently little-endian: if you consider the 3 bytes as one little-endian 24-bit number, the 12 least significant bits are the first entry and the 12 most significant bits are the second.
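The packing of two 12-bit entries into three bytes can be made concrete with a short decoding routine. This is a sketch under the layout just described; the function name is hypothetical, and the caller is assumed to guarantee that the table is long enough for the requested entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Return FAT12 entry number n from a raw in-memory copy of the FAT.
 * Entry n starts at byte offset floor(n * 1.5). Even entries use the low
 * 12 bits of the 16-bit little-endian word found there, odd entries the
 * high 12 bits. */
uint16_t fat12_get_entry(const uint8_t *fat, size_t n)
{
    size_t offset = n + n / 2;
    uint16_t pair = (uint16_t)(fat[offset] | (fat[offset + 1] << 8));
    return (n & 1) ? (uint16_t)(pair >> 4) : (uint16_t)(pair & 0x0FFF);
}
```

For the bytes F0 FF FF 03 40 00, entries 0 to 3 decode to 0xFF0, 0xFFF, 0x003 and 0x004.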
In the FAT32 file system, FAT entries are 32 bits, but only 28 of these are actually used; the 4 most significant bits are reserved.
FAT entry values:
FAT12 FAT16 FAT32 Description
0x000 0x0000 0x00000000 Free Cluster
0x001 0x0001 0x00000001 Reserved value; do not use
0x002–0xFEF 0x0002–0xFFEF 0x00000002–0x0FFFFFEF Used cluster; value points to next cluster
0xFF0–0xFF6 0xFFF0–0xFFF6 0x0FFFFFF0–0x0FFFFFF6 Reserved values; do not use[28].
0xFF7 0xFFF7 0x0FFFFFF7 Bad sector in cluster or reserved cluster
0xFF8–0xFFF 0xFFF8–0xFFFF 0x0FFFFFF8–0x0FFFFFFF Last cluster in file
Note that FAT32 uses only 28 bits of the 32 possible bits. The upper 4 bits are usually zero (as indicated in the table above) but are reserved and should be left untouched.
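The value ranges in the table translate directly into a classification routine, which in turn is enough to walk a file's cluster chain. The following is a sketch for FAT32 only; the enum, the function names and the convention of returning 0 for a damaged chain are assumptions, and the table is assumed to have been converted to host-order 32-bit values already.

```c
#include <stdint.h>

enum fat_entry_kind {
    FAT_FREE, FAT_RESERVED, FAT_NEXT, FAT_BAD, FAT_END_OF_CHAIN
};

/* Classify a FAT32 entry according to the value ranges in the table. */
enum fat_entry_kind fat32_classify(uint32_t raw)
{
    uint32_t v = raw & 0x0FFFFFFF;   /* the upper 4 bits are reserved */
    if (v == 0x00000000) return FAT_FREE;
    if (v == 0x00000001) return FAT_RESERVED;
    if (v <= 0x0FFFFFEF) return FAT_NEXT;
    if (v <= 0x0FFFFFF6) return FAT_RESERVED;
    if (v == 0x0FFFFFF7) return FAT_BAD;
    return FAT_END_OF_CHAIN;
}

/* Count the clusters in the chain starting at `first`. Returns 0 if the
 * chain is damaged: it leaves the table, reaches a free, bad or reserved
 * entry, or is longer than the table itself (which implies a loop). */
uint32_t fat32_chain_length(const uint32_t *fat, uint32_t entries,
                            uint32_t first)
{
    uint32_t cluster = first;
    uint32_t count = 0;

    while (cluster >= 2 && cluster < entries && count < entries) {
        uint32_t raw = fat[cluster];
        count++;
        switch (fat32_classify(raw)) {
        case FAT_END_OF_CHAIN:
            return count;
        case FAT_NEXT:
            cluster = raw & 0x0FFFFFFF;
            break;
        default:
            return 0;
        }
    }
    return 0;
}
```

Masking with 0x0FFFFFFF before every comparison is what keeps the reserved upper four bits from affecting the result.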
The first cluster of the Data Region is cluster #2. That leaves the first two entries of the FAT unused. In the first byte of the first entry a copy of the media descriptor is stored. The remaining 8 bits (if FAT16), or 20 bits (if FAT32) of this entry are 1. In the second entry the end-of-cluster-chain marker is stored. The high order two bits of the second entry are sometimes, in the case of FAT16 and FAT32, used for dirty volume management: high order bit 1: last shutdown was clean; next highest bit 1: during the previous mount no disk I/O errors were detected.[31]
Directory table
A directory table is a special type of file that represents a directory (also known as a folder). Each file or directory stored within it is represented by a 32-byte entry in the table. Each entry records the name, extension, attributes (archive, directory, hidden, read-only, system and volume), the date and time of creation, the address of the first cluster of the file/directory's data and finally the size of the file/directory. Aside from the Root Directory Table in FAT12 and FAT16 file systems, which occupies the special Root Directory Region location, all Directory Tables are stored in the Data Region. The actual number of entries in a directory stored in the Data Region can grow by adding another cluster to the chain in the FAT.
Note that before each entry there can be "fake entries" to support the Long File Name. (See further down the article).
Legal characters for DOS file names include the following:
• Upper case letters A–Z
• Numbers 0–9
• Space (though trailing spaces in either the base name or the extension are considered to be padding and not a part of the file name; also, filenames containing spaces could not be used on the DOS command line prior to Windows 95 because of the lack of a suitable escaping system)
• ! # $ % & ' ( ) - @ ^ _ ` { } ~
• Values 128–255
This excludes the following ASCII characters:
• " * / : < > ? \ | (Windows/MS-DOS has no shell escape character)
• + , . ; = [ ] (allowed in long file names only)
• Lower case letters a–z (stored as A–Z; allowed in long file names)
• Control characters 0–31
• Value 127 (DEL)
The DOS file names are in the OEM character set.
Directory entries, both in the Root Directory Region and in subdirectories, are of the following format (see also 8.3 filename):
Byte Offset Length Description
0x00 8 DOS file name (padded with spaces)
The first byte can have the following special values:
0x00 Entry is available and no subsequent entry is in use
0x05 Initial character is actually 0xE5.
0x2E 'Dot' entry; either '.' or '..'
0xE5 Entry has been previously erased and is available. File undelete utilities must replace this character with a regular character as part of the undeletion process.
0x08 3 DOS file extension (padded with spaces)
0x0b 1 File Attributes
Bit Mask Description
0 0x01 Read Only
1 0x02 Hidden
2 0x04 System
3 0x08 Volume Label
4 0x10 Subdirectory
5 0x20 Archive
6 0x40 Device (internal use only, never found on disk)
7 0x80 Unused
An attribute value of 0x0F is used to designate a long file name entry.
0x0c 1 Reserved; two bits are used by NT and later versions to encode case information (see below); otherwise 0[32]
0x0d 1 Create time, fine resolution: 10ms units, values from 0 to 199.
0x0e 2 Create time. The hour, minute and second are encoded according to the following bitmap:
Bits Description
15-11 Hours (0-23)
10-5 Minutes (0-59)
4-0 Seconds/2 (0-29)
Note that the seconds are recorded only to a 2-second resolution. Finer resolution for file creation is found at offset 0x0d.
0x10 2 Create date. The year, month and day are encoded according to the following bitmap:
Bits Description
15-9 Year (0 = 1980, 127 = 2107)
8-5 Month (1 = January, 12 = December)
4-0 Day (1 - 31)
0x12 2 Last access date; see offset 0x10 for description.
0x14 2 EA-Index (used by OS/2 and NT) in FAT12 and FAT16, High 2 bytes of first cluster number in FAT32
0x16 2 Last modified time; see offset 0x0e for description.
0x18 2 Last modified date; see offset 0x10 for description.
0x1a 2 First cluster in FAT12 and FAT16. Low 2 bytes of first cluster in FAT32. Entries with the Volume Label flag, subdirectory ".." pointing to root, and empty files with size 0 should have first cluster 0.
0x1c 4 File size in bytes. Entries with the Volume Label or Subdirectory flag set should have a size of 0.
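The date and time bitmaps used at offsets 0x0e, 0x10, 0x16 and 0x18 can be unpacked with a few shifts and masks. The struct and function names below are hypothetical; the inputs are the two 16-bit fields after little-endian decoding.

```c
#include <stdint.h>

struct fat_timestamp {
    int year, month, day;       /* calendar date */
    int hour, minute, second;   /* second is always even */
};

/* Unpack a FAT date word and time word as laid out in the bitmaps. */
struct fat_timestamp fat_decode_timestamp(uint16_t date, uint16_t time)
{
    struct fat_timestamp t;
    t.year   = 1980 + (date >> 9);       /* bits 15-9 */
    t.month  = (date >> 5) & 0x0F;       /* bits 8-5  */
    t.day    = date & 0x1F;              /* bits 4-0  */
    t.hour   = time >> 11;               /* bits 15-11 */
    t.minute = (time >> 5) & 0x3F;       /* bits 10-5  */
    t.second = (time & 0x1F) * 2;        /* bits 4-0, stored as seconds/2 */
    return t;
}
```

For example, the date word 0x2821 with the time word 0x645C decodes to 2000-01-01 12:34:56.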
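Clusters are numbered starting from 2, as described above, and the number of a file's first cluster, X, is stored at offset 0x1a of its directory entry (with the high 2 bytes at offset 0x14 on FAT32). The first sector of the file's data can then be calculated from the Boot Sector fields: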
For FAT32
FileStartSector = ReservedSectors(0x0e) + (NumofFAT(0x10) * Sectors2FAT(0x24)) + ((X − 2) * SectorsPerCluster(0x0d))
For FAT16/12
FileStartSector = ReservedSectors(0x0e) + (NumofFAT(0x10) * Sectors2FAT(0x16)) + (MaxRootEntry(0x11) * 32 / BytesPerSector(0x0b)) + ((X − 2) * SectorsPerCluster(0x0d))
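Both formulas can be folded into one routine by passing the per-variant values as parameters. This is a sketch with a hypothetical function name: for FAT32 the caller passes the sectors-per-FAT value from offset 0x24 and a root entry count of 0, and for FAT12/16 the value from offset 0x16 and the real root entry count. The root directory size is rounded up to whole sectors, which matches the plain division whenever the root directory ends on a sector boundary. The result is relative to the start of the volume.

```c
#include <stdint.h>

/* First sector of data cluster `cluster`, counted from the volume start. */
uint32_t fat_cluster_to_sector(uint32_t cluster,
                               uint32_t reserved_sectors,
                               uint32_t num_fats,
                               uint32_t sectors_per_fat,
                               uint32_t root_entries,
                               uint32_t bytes_per_sector,
                               uint32_t sectors_per_cluster)
{
    /* 32 bytes per root directory entry, rounded up to whole sectors. */
    uint32_t root_dir_sectors =
        (root_entries * 32 + bytes_per_sector - 1) / bytes_per_sector;
    uint32_t data_start =
        reserved_sectors + num_fats * sectors_per_fat + root_dir_sectors;

    /* Cluster numbering starts at 2. */
    return data_start + (cluster - 2) * sectors_per_cluster;
}
```

On a 1.44 MB floppy (1 reserved sector, 2 FATs of 9 sectors, 224 root entries, 512-byte sectors, 1 sector per cluster) cluster 2 starts at sector 33.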
Long file names
Long File Names (LFN) are stored on a FAT file system using a trick—adding (possibly multiple) additional entries into the directory before the normal file entry. The additional entries are marked with the Volume Label, System, Hidden, and Read Only attributes (yielding 0x0F), which is a combination that is not expected in the MS-DOS environment, and therefore ignored by MS-DOS programs and third-party utilities. Notably, a directory containing only volume labels is considered as empty and is allowed to be deleted; such a situation appears if files created with long names are deleted from plain DOS.
Older versions of PC-DOS mistake LFN names in the root directory for the volume label, and are likely to display an incorrect label.
Each phony entry can contain up to 13 UTF-16 characters (26 bytes) by using fields in the record which contain file size or time stamps (but not the starting cluster field: for compatibility with disk utilities, the starting cluster field is set to a value of 0; see 8.3 filename for additional explanations). Up to 20 of these 13-character entries may be chained, supporting a maximum length of 255 UTF-16 characters.[32]
After the last UTF-16 character, a 0x00 0x00 terminator is added. Any remaining unused character positions are filled with 0xFF 0xFF.
LFN entries use the following format:
Byte Offset Length Description
0x00 1 Sequence Number
0x01 10 Name characters (five UTF-16 characters)
0x0b 1 Attributes (always 0x0F)
0x0c 1 Reserved (always 0x00)
0x0d 1 Checksum of DOS file name
0x0e 12 Name characters (six UTF-16 characters)
0x1a 2 First cluster (always 0x0000)
0x1c 4 Name characters (two UTF-16 characters)
If multiple LFN entries are required to represent a file name, the entry holding the last part of the filename comes first. The sequence number of this entry also has bit 6 (0x40) set; this marks it as the last LFN entry, even though it is the first entry encountered when reading the directory file. The last LFN entry has the largest sequence number, which decreases in the following entries. The first LFN entry has sequence number 1. Bit 7 (0x80) of the sequence number is used to indicate that the entry is deleted.
For example if we have filename "File with very long filename.ext" it would be formatted like this:
Sequence number Entry data
0x43 "me.ext"
0x02 "y long filena"
0x01 "File with ver"
??? Normal 8.3 entry
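Extracting the name fragment and position from a single LFN entry, as in the example above, could be sketched like this. The function names are hypothetical, and masking the sequence number with 0x3F to obtain the position is an assumption; the three name fields are read at the offsets given in the LFN entry format.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy the UTF-16 code units of one 32-byte LFN entry into out[].
 * Returns the number of characters found before the 0x0000 terminator
 * (13 if the entry is completely filled). */
size_t lfn_entry_chars(const uint8_t *entry, uint16_t out[13])
{
    /* Five characters at 0x01, six at 0x0e, two at 0x1c. */
    static const uint8_t offsets[13] = {
        0x01, 0x03, 0x05, 0x07, 0x09,
        0x0e, 0x10, 0x12, 0x14, 0x16, 0x18,
        0x1c, 0x1e
    };
    size_t n = 0;

    for (size_t i = 0; i < 13; i++) {
        const uint8_t *p = entry + offsets[i];
        uint16_t ch = (uint16_t)(p[0] | (p[1] << 8));
        if (ch == 0x0000)
            break;
        out[n++] = ch;
    }
    return n;
}

/* Nonzero if this entry holds the final part of the name (flag 0x40). */
int lfn_is_last(uint8_t seq)
{
    return (seq & 0x40) != 0;
}

/* Position of this fragment within the name, starting at 1. */
unsigned lfn_index(uint8_t seq)
{
    return seq & 0x3Fu;
}
```

A complete reader would place each fragment at character position (index − 1) × 13 of the long name, so that entries read in descending sequence order fill the name from its end towards its start.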
A checksum also allows verification of whether a long file name matches the 8.3 name; such a mismatch could occur if a file was deleted and re-created using DOS in the same directory position. The checksum is calculated using the algorithm below. (Note that pFcbName is a pointer to the name as it appears in a regular directory entry, i.e. the first eight characters are the filename, and the last three are the extension. The dot is implicit. Any unused space in the filename is padded with space characters (ASCII 0x20). For example, "Readme.txt" would be "README  TXT".)
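unsigned char lfn_checksum(const unsigned char *pFcbName)
{
    int i;
    unsigned char sum = 0;

    /* Process all 11 bytes of the space-padded 8.3 name. */
    for (i = 11; i; i--) {
        /* Rotate sum right by one bit, then add the next name byte;
           the result wraps modulo 256 when stored back in sum. */
        sum = ((sum & 1) << 7) + (sum >> 1) + *pFcbName++;
    }
    return sum;
}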
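To feed a name to lfn_checksum it first has to be brought into the 11-byte padded form. The helper below is a minimal sketch with a hypothetical name; it assumes plain ASCII input that already fits the 8.3 limits, and it neither validates characters nor generates numeric tails such as ~1.

```c
#include <ctype.h>
#include <string.h>

/* Convert "name.ext" into the 11-byte, space-padded, uppercase form used
 * in directory entries. `out` must hold 11 bytes; no NUL is written.
 * Assumption: ASCII input that already fits 8.3; excess characters are
 * silently dropped. */
void to_fcb_name(const char *name, unsigned char out[11])
{
    const char *dot = strrchr(name, '.');
    size_t base_len = dot ? (size_t)(dot - name) : strlen(name);

    memset(out, ' ', 11);
    for (size_t i = 0; i < base_len && i < 8; i++)
        out[i] = (unsigned char)toupper((unsigned char)name[i]);
    if (dot) {
        for (size_t i = 0; i < 3 && dot[1 + i] != '\0'; i++)
            out[8 + i] = (unsigned char)toupper((unsigned char)dot[1 + i]);
    }
}
```

Calling to_fcb_name("Readme.txt", buf) yields the bytes "README  TXT", which can then be passed to lfn_checksum.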
If a filename contains only lowercase letters, or is a combination of a lowercase basename with an uppercase extension or vice versa, has no special characters, and fits within the 8.3 limits, a VFAT entry is not created on Windows NT and later versions such as XP. Instead, two bits in byte 0x0c of the directory entry are used to indicate that the filename should be considered as entirely or partially lowercase. Specifically, bit 4 means lowercase extension and bit 3 lowercase basename, which allows for combinations such as "example.TXT" or "HELLO.txt" but not "Mixed.txt". Few other operating systems support this. This creates a backwards-compatibility problem with older Windows versions (95, 98, ME) that see all-uppercase filenames if this extension has been used, and therefore can change the name of a file when it is transported, such as on a USB flash drive. Current 2.6.x versions of Linux will recognize this extension when reading (source: kernel 2.6.18 /fs/fat/dir.c and fs/vfat/namei.c); the mount option shortname determines whether this feature is used when writing.
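Applying the two case bits when turning a directory entry back into a displayable name might look like the following. The macro and function names are hypothetical, and the sketch assumes a name without embedded spaces, so that the first space in each field marks the start of the padding.

```c
#include <ctype.h>
#include <stdint.h>

#define FAT_CASE_BASE_LOWER 0x08   /* bit 3 of byte 0x0c */
#define FAT_CASE_EXT_LOWER  0x10   /* bit 4 of byte 0x0c */

/* Build "base.ext" from the 11-byte name field and the byte at 0x0c.
 * `out` must hold at least 13 bytes (8 + '.' + 3 + NUL).
 * Assumption: no embedded spaces in the name. */
void fat_display_name(const unsigned char fcb[11], uint8_t case_flags,
                      char out[13])
{
    int n = 0;

    for (int i = 0; i < 8 && fcb[i] != ' '; i++)
        out[n++] = (case_flags & FAT_CASE_BASE_LOWER)
                       ? (char)tolower(fcb[i]) : (char)fcb[i];
    if (fcb[8] != ' ') {
        out[n++] = '.';
        for (int i = 8; i < 11 && fcb[i] != ' '; i++)
            out[n++] = (case_flags & FAT_CASE_EXT_LOWER)
                           ? (char)tolower(fcb[i]) : (char)fcb[i];
    }
    out[n] = '\0';
}
```

With the stored name "EXAMPLE TXT", flag value 0x08 gives "example.TXT" and 0x10 gives "EXAMPLE.txt"; a system that ignores the byte sees "EXAMPLE.TXT".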
Third-party extensions
Before Microsoft added support for long filenames and creation/access time stamps, bytes 0x0C–0x15 of the directory entry were used by alternative operating systems to store additional metadata. These included:
Byte Offset Length System Description
0x0C 2 RISC OS File type, 0x000 - 0xFFF
0x0C 1 DOS Plus User-defined file attributes F1-F4:
Bit Mask Description
7 0x80 F1
6 0x40 F2
5 0x20 F3
4 0x10 F4
0x0C 1 MSX-DOS 2 For a deleted file, the original first character of the filename.
0x0D 1 DR-DOS For a deleted file, the original first character of the filename.
0x0E 2 DR-DOS and FlexOS Encrypted file password
0x0E 2 ANDOS File address in the memory
0x10 4 DR-DOS 7 For a deleted file, its original file time and date; deleted files have their normal time and date fields set to the time of deletion
0x12 2 DR-DOS 6 and FlexOS File owner ID
0x14 2 DR-DOS and FlexOS File permissions bitmap (execute permissions are only used by FlexOS):
Bit Mask Description
0 0x0001 Owner delete requires password
1 0x0002 Owner execute requires password
2 0x0004 Owner write requires password
3 0x0008 Owner read requires password
4 0x0010 Group delete requires password
5 0x0020 Group execute requires password
6 0x0040 Group write requires password
7 0x0080 Group read requires password
8 0x0100 World delete requires password
9 0x0200 World execute requires password
10 0x0400 World write requires password
11 0x0800 World read requires password
FAT licensing
Microsoft applied for, and was granted, a series of patents for key parts of the FAT file system in the mid-1990s. Being almost universally compatible and well-understood, FAT is frequently chosen as an interchange format for flash media used in digital cameras and PDAs.
On December 3, 2003 Microsoft announced[34] it would be offering licenses for use of its FAT specification and "associated intellectual property", at the cost of a US$0.25 royalty per unit sold, with a $250,000 maximum royalty per license agreement.[35]
To this end, Microsoft cited four patents on the FAT file system as the basis of its intellectual property claims. All four pertain to long-filename extensions to FAT first seen in Windows 95:
• U.S. Patent 5,745,902 - Method and system for accessing a file using file names having different file name formats. Filed July 6, 1992. This covered a means of generating and associating a short, 8.3 filename with a long one (for example, "Microsoft.txt" with "MICROS~1.TXT") and a means of enumerating conflicting short filenames (for example, "MICROS~2.TXT" and "MICROS~3.TXT"). It is unclear whether this patent would cover an implementation of FAT without explicit long filename capabilities. Hard links in Unix file systems do not appear to be prior art: deleting a FAT file via its long name will also remove its short name. Renaming a file to a "short" name also updates the long file name for coherency; similarly, renaming a file to a "long" name will allocate a new "short" name. In NTFS, hard links and dual names are separate concepts and each hard link has two names. Finally, at the API level, both names are always provided together when a directory lookup is requested from the system; they do not appear as two separate files and do not have to be "matched" to determine unique files.
• U.S. Patent 5,579,517 - Common name space for long and short filenames. Filed on 1995-04-24. This covers the method of chaining together multiple consecutive 8.3 named directory entries to hold long filenames, with some of the entries specially marked to prevent their confusing older, long filename-unaware FAT implementations.
o The Public Patent Foundation successfully challenged this patent; the claims were rejected[36] on 2004-09-14, due to prior disclosure[37] of the claimed techniques in patents U.S. Patent 5,307,494 and U.S. Patent 5,367,671. This decision was later overturned by the Patent Office on 2006-01-10.
• U.S. Patent 5,758,352 - Common name space for long and short filenames. Filed on 1996-09-05. This is very similar to 5,579,517.
o The Public Patent Foundation successfully challenged this patent; the USPTO rejected it on 2005-10-05, on the grounds that "the six assignees names were incorrect".[38][39] This decision was also later overturned by the Patent Office on 2006-01-10.
• U.S. Patent 6,286,013 - Method and system for providing a common name space for long and short file names in an operating system. Filed on 1997-01-28. This makes claims on the methods used when Windows 95, Windows 98 and Windows Me expose long filenames to their MS-DOS compatibility layer. It does not appear to affect any non-Microsoft FAT implementations.
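The short-name generation and numbering described for U.S. Patent 5,745,902 can be illustrated with a deliberately simplified sketch. It is not Microsoft's algorithm: the function name, the character filter, and the way the base is truncated are assumptions chosen only to reproduce the "MICROS~1.TXT" style of the examples above.

```python
def short_name(long_name, existing):
    """Derive an 8.3-style alias for long_name that is not in existing."""
    base, dot, ext = long_name.rpartition(".")
    if not dot:                          # no extension at all
        base, ext = long_name, ""

    def clean(text):
        # Simplifying assumption: keep only letters and digits, upper-cased.
        return "".join(c for c in text.upper() if c.isalnum())

    base, ext = clean(base), clean(ext)[:3]
    n = 1
    while True:
        tail = f"~{n}"
        # Base plus numeric tail must fit in eight characters.
        candidate = base[:8 - len(tail)] + tail + (f".{ext}" if ext else "")
        if candidate not in existing:
            return candidate
        n += 1                           # enumerate conflicting short names

taken = set()
for name in ["Microsoft.txt", "Microsoft Word.txt", "Microscope.txt"]:
    alias = short_name(name, taken)
    taken.add(alias)
    print(name, "->", alias)
```

Each long name that collapses to the same six-character base receives the next free number, which is the conflict enumeration the patent description refers to.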
Many technical commentators[who?] have concluded that these patents only cover FAT implementations that include support for long filenames, and that removable solid state media and consumer devices only using short names would be unaffected.
Additionally, in the document "Microsoft Extensible Firmware Initiative FAT 32 File System Specification, FAT: General Overview of On-Disk Format" published by Microsoft (version 1.03, 2000-12-06), Microsoft specifically grants a number of rights, which many readers have interpreted as permitting operating system vendors to implement FAT.
Microsoft is not the only company to have applied for patents for parts of the FAT file system. Other patents affecting FAT include:
• U.S. Patent 5,367,671 - System for accessing extended object attribute (EA) data through file name or EA handle linkages in path tables. Filed on 1990-09-25 by Barry A. Feigenbaum and Felix Miro of IBM, this makes claims on the methods used by OS/2, Windows NT, and Linux for storing extended attribute data in the "EA DATA. SF" file.
Appeal
As there was widespread call for these patents to be re-examined, the Public Patent Foundation (PUBPAT) submitted evidence to the US Patent and Trademark Office (USPTO) disputing the validity of these patents, including prior art references from Xerox and IBM. The USPTO acknowledged that the evidence raised "substantial new question[s] of patentability," and opened an investigation into the validity of Microsoft's FAT patents.[40]
On 2004-09-30 the USPTO rejected all claims of U.S. Patent 5,579,517, based primarily on evidence provided by PUBPAT. Dan Ravicher, the foundation's executive director, said, "The Patent Office has simply confirmed what we already knew for some time now, Microsoft's FAT patent is bogus."
According to the PUBPAT press release, "Microsoft still has the opportunity to respond to the Patent Office's rejection. Typically, third party requests for re-examination, like the one filed by PUBPAT, are successful in having the subject patent either narrowed or completely revoked roughly 70% of the time."
On 2005-10-05 the Patent Office announced that, following the re-examination process, it had again rejected all claims of patent 5,579,517, and it additionally found U.S. Patent 5,758,352 invalid on the grounds that the patent had incorrect assignees.
Finally, on 2006-01-10 the Patent Office ruled that features of Microsoft's implementation of the FAT system were "novel and non-obvious", reversing both earlier non-final decisions.
Patent infringement lawsuit
In February 2009, Microsoft filed a patent infringement lawsuit against TomTom alleging that the device maker's products infringe on patents related to the FAT32 filesystem. As some TomTom products are based on Linux, this marked the first time that Microsoft tried to enforce its patents against the Linux platform. The lawsuit was settled out of court the following month with an agreement that Microsoft be given access to four of TomTom's patents, that TomTom drop support for the FAT32 filesystem from its products, and that in return Microsoft not seek legal action against TomTom for the five-year duration of the settlement agreement.
NTFS (New Technology File System) is the standard file system of Windows NT, including its later versions Windows 2000, Windows XP, Windows Server 2003, Windows Server 2008, Windows Vista, and Windows 7.
NTFS supersedes the FAT file system as the preferred file system for Microsoft’s Windows operating systems. NTFS has several improvements over FAT and HPFS (High Performance File System) such as improved support for metadata and the use of advanced data structures to improve performance, reliability, and disk space utilization, plus additional extensions such as security access control lists (ACL) and file system journaling.
History
In the mid 1980s, Microsoft and IBM formed a joint project to create the next generation graphical operating system. The result of the project was OS/2, but eventually Microsoft and IBM disagreed on many important issues and separated. OS/2 remained an IBM project. Microsoft started to work on Windows NT. The OS/2 filesystem HPFS contained several important new features. When Microsoft created their new operating system, they borrowed many of these concepts for NTFS. Probably as a result of this common ancestry, HPFS and NTFS share the same disk partition identification type code (07). Sharing an ID is unusual since there were dozens of available codes, and other major filesystems have their own code. FAT has more than nine (one each for FAT12, FAT16, FAT32, etc.). Algorithms which identify the filesystem in a partition type 07 must perform additional checks. It is also clear that NTFS owes some of its architectural design to Files-11 used by VMS. This is hardly surprising since Dave Cutler was the main lead for both VMS and Windows NT.
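One such additional check can be sketched as follows. The text does not say which checks are performed; the sketch assumes the commonly documented convention that an NTFS boot sector carries the eight-byte OEM ID "NTFS" padded with spaces at byte offset 3. The function name and the synthetic sector are illustrative only.

```python
NTFS_OEM_ID = b"NTFS    "   # assumed: 8 bytes at offset 3 of the boot sector

def classify_type_07(boot_sector):
    """Classify the first sector of a partition whose type code is 07."""
    if boot_sector[3:11] == NTFS_OEM_ID:
        return "NTFS"
    return "not NTFS (possibly HPFS)"

# Synthetic 512-byte sector for demonstration, not a real volume image.
fake = bytearray(512)
fake[3:11] = NTFS_OEM_ID
print(classify_type_07(bytes(fake)))
print(classify_type_07(bytes(512)))
```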
Versions
NTFS has five released versions:
• v1.0 with NT 3.1 released mid-1993
• v1.1 with NT 3.5 released fall 1994
• v1.2 with NT 3.51 (mid-1995) and NT 4 (mid-1996) (occasionally referred to as "NTFS 4.0", because OS version is 4.0)
• v3.0 from Windows 2000 ("NTFS V5.0")
• v3.1 from Windows XP (autumn 2001; "NTFS V5.1"), Windows Server 2003 (spring 2003; occasionally "NTFS V5.2"), Windows Vista (mid-2005) (occasionally "NTFS V6.0"), Windows Server 2008, Windows 7.
V1.0 and V1.1 (and newer) are incompatible: that is, volumes written by NT 3.5x cannot be read by NT 3.1 until an update on the NT 3.5x CD is applied to NT 3.1, which also adds FAT long file name support. V1.2 supports compressed files, named streams, ACL-based security, etc. V3.0 added disk quotas, encryption, sparse files, reparse points, update sequence number (USN) journaling, the $Extend folder and its files, and reorganized security descriptors so that multiple files which use the same security setting can share the same descriptor. V3.1 expanded the Master File Table (MFT) entries with a redundant MFT record number (useful for recovering damaged MFT files).
Windows Vista introduced Transactional NTFS, NTFS symbolic links, partition shrinking and self-healing functionality though these features owe more to additional functionality of the operating system than the file system itself.
Features
NTFS v3.0 includes several new features over its predecessors: sparse file support, disk usage quotas, reparse points, distributed link tracking, and file-level encryption, also known as the Encrypting File System (EFS).
USN Journal
The USN Journal (Update Sequence Number Journal) is a system management feature that records changes to all files, streams and directories on the volume, as well as their various attributes and security settings.
It is a critical feature of NTFS (one that FAT/FAT32 does not provide) for ensuring that its complex internal data structures and indices remain consistent in case of system crashes, and for allowing easy rollback of uncommitted changes to these critical data structures when the volume is remounted. The data structures concerned notably include the volume allocation bitmap, data moves performed by the defragmentation API, modifications to MFT records (such as moves of some variable-length attributes stored in MFT records and attribute lists), updates to the shared security descriptors, and updates to the boot sector and its local mirrors, where the last USN transaction committed on the volume is stored. The indices concerned are those for directories and security descriptors.
In later versions of Windows, the USN journal has been extended to trace the state of other transactional operations on other parts of the NTFS filesystem, such as the VSS shadow copies of system files with copy-on-write semantics, or the implementation of Transactional NTFS and of distributed filesystems (see below).
Hard links and short filenames
Originally included to support the POSIX subsystem in Windows NT, hard links are similar to directory junctions, but are used for files instead of directories. Hard links can only be applied to files on the same volume since an additional filename record is added to the file's MFT record. Short (8.3) filenames are also implemented as additional filename records that don't have separate directory entries. Hard links also have the behavior that changing the size or attributes of a file may not update the directory entries of other links until they are opened.
Alternate data streams (ADS)
Alternate data streams allow more than one data stream to be associated with a filename, using the filename format "filename:streamname" (e.g., "text.txt:extrastream"). Alternate streams are not listed in Windows Explorer, and their size is not included in the file's size. Only the main stream of a file is preserved when it is copied to a FAT-formatted USB drive, attached to an e-mail, or uploaded to a website. As a result, using alternate streams for critical data may cause problems. NTFS Streams were introduced in Windows NT 3.1, to enable Services for Macintosh (SFM) to store Macintosh resource forks. Although current versions of Windows Server no longer include SFM, third-party Apple Filing Protocol (AFP) products (such as Group Logic's ExtremeZ-IP) still use this feature of the file system.
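The "filename:streamname" form can be taken apart with a few lines of code. This is a minimal sketch using Python's standard ntpath module; the helper name split_stream is illustrative, and a one-letter file name such as "a:stream" would be misread as a drive letter, which the sketch does not try to handle.

```python
import ntpath

def split_stream(path):
    """Split 'file:stream' into (file path, stream name or None)."""
    head, leaf = ntpath.split(path)          # separates any drive and folder
    name, sep, stream = leaf.partition(":")  # first colon in the leaf name
    file_path = ntpath.join(head, name) if head else name
    return file_path, (stream if sep else None)

print(split_stream("text.txt:extrastream"))
print(split_stream(r"C:\dir\text.txt"))
```

On a Windows system with an NTFS volume, passing such a combined name to the ordinary file-opening functions addresses the named stream rather than the main one; that behaviour is not exercised here.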
Malware has used alternate data streams to hide its code; some malware scanners and other special tools now check for data in alternate streams. Microsoft provides a tool called Streams to allow users to view streams on a selected volume.
Very small ADS are also added by Internet Explorer (and now also by other browsers) to mark files that have been downloaded from external sites: they may be unsafe to run locally, and the local shell will require confirmation from the user before opening them. When the user indicates that they no longer want this confirmation dialog, this ADS is simply dropped from the MFT entry for the downloaded file.
Some media players have also tried to use ADS to store custom metadata for media files, in order to organize collections without modifying the effective data content of the media files themselves (using embedded tags when they are supported by the media file formats, such as MPEG and OGG containers). This metadata may be displayed in Windows Explorer as extra information columns, with the help of a registered Windows Shell extension that can parse it. However, most media players prefer to use their own separate database instead of ADS for storing this information, notably because ADS are visible to all users of these files, instead of being managed with distinct per-user security settings and having their values defined according to user preferences.
Sparse files
Sparse files are files which contain sparse data sets, data mostly filled with zeros. Database applications, for instance, sometimes use sparse files. Because of this, Microsoft has implemented support for efficient storage of sparse files by allowing an application to specify regions of empty (zero) data. An application that reads a sparse file reads it in the normal manner with the file system calculating what data should be returned based upon the file offset. As with compressed files, the actual sizes of sparse files are not taken into account when determining quota limits.
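The idea that a read is answered from the file offset, with unallocated regions returned as zeros, can be modelled in memory. The class below is a toy model, not the NTFS on-disk mechanism; its name and methods are illustrative, and overlapping writes are not merged.

```python
class SparseFile:
    """Toy in-memory model: only explicitly written ranges are stored."""

    def __init__(self, size):
        self.size = size
        self.extents = {}            # start offset -> bytes actually stored

    def write(self, offset, data):
        self.extents[offset] = bytes(data)

    def read(self, offset, length):
        out = bytearray(length)      # unallocated regions read back as zeros
        for start, data in self.extents.items():
            lo = max(start, offset)
            hi = min(start + len(data), offset + length)
            if lo < hi:              # this extent overlaps the requested range
                out[lo - offset:hi - offset] = data[lo - start:hi - start]
        return bytes(out)

    def stored_bytes(self):
        return sum(len(d) for d in self.extents.values())

f = SparseFile(1_000_000)
f.write(500_000, b"data")
print(f.read(499_998, 8), f.stored_bytes())
```

The file has a logical size of one million bytes, but only the four written bytes occupy storage in the model.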
File compression
NTFS compresses files using a variant of the LZ77 algorithm. Although read–write access to compressed files is transparent, Microsoft recommends avoiding compression on server systems and on network shares holding roaming profiles because it puts a considerable load on the processor. Single-user systems with limited hard disk space can benefit from NTFS compression. The slowest link in a computer is usually not the CPU but the hard drive, so NTFS compression allows the limited, slow storage space to be better used, in terms of both space and (often) speed. NTFS compression can also serve as a replacement for sparse files when a program (e.g., a download manager) is not able to create files without content as sparse files.
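To show the family of technique involved, here is a minimal textbook LZ77 sketch. It is not the variant NTFS uses and does not produce the NTFS on-disk format; the triple encoding, the window size, and the function names are assumptions made for illustration.

```python
def lz77_compress(data, window=255):
    """Encode data as (offset, length, next_byte) triples."""
    i, out = 0, []
    while i < len(data):
        best_off = best_len = 0
        for j in range(max(0, i - window), i):
            k = 0
            # Stop one byte early so a literal always follows the match.
            while i + k < len(data) - 1 and data[j + k] == data[i + k]:
                k += 1
            if k > best_len:
                best_off, best_len = i - j, k
        out.append((best_off, best_len, data[i + best_len]))
        i += best_len + 1
    return out

def lz77_decompress(triples):
    """Rebuild the original bytes from (offset, length, next_byte) triples."""
    buf = bytearray()
    for off, length, nxt in triples:
        for _ in range(length):
            buf.append(buf[-off])    # byte-by-byte copy allows overlaps
        buf.append(nxt)
    return bytes(buf)

sample = b"abcabcabcabcabcabc"
packed = lz77_compress(sample)
print(len(sample), "bytes ->", len(packed), "triples")
```

Repetitive input collapses into a few back-references, which is why largely empty or repetitive files compress well.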
Volume Shadow Copy
The Volume Shadow Copy Service (VSS) keeps historical versions of files and folders on NTFS volumes by copying old data that is about to be overwritten to a shadow copy (copy-on-write). The old file data is overlaid on the new when the user requests a revert to an earlier version. This also allows data backup programs to archive files currently in use by the file system. On heavily loaded systems, Microsoft recommends setting up a shadow copy volume on a separate disk. To ensure consistent recovery in case of system crashes, VSS also uses the USN journal to mark local transactions, so that committed changes to system files are recovered after a system restart when the NTFS volume is remounted, or are safely rolled back to an older version if the new version was not fully recorded before the modified file was closed. However, these VSS shadows are not coordinated globally across multiple files or volumes, except when using a transaction coordinator (see below). They can only be used to ensure that older versions remain accessible during backup operations, so that those backups contain consistent system images.
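The copy-on-write step and the overlay on revert can be sketched with a toy block store. This models the idea only, with a single snapshot and in-memory blocks; the class and method names are illustrative and are not part of any VSS interface.

```python
class Volume:
    """Toy block store with one copy-on-write snapshot."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.shadow = None           # block index -> contents at snapshot time

    def snapshot(self):
        self.shadow = {}             # nothing is copied up front

    def write(self, index, data):
        if self.shadow is not None and index not in self.shadow:
            self.shadow[index] = self.blocks[index]   # save old data once
        self.blocks[index] = data

    def read_snapshot(self, index):
        # Unchanged blocks are read from the live volume.
        return self.shadow.get(index, self.blocks[index])

    def revert(self):
        for index, old in self.shadow.items():
            self.blocks[index] = old                  # overlay old on new
        self.shadow = {}

v = Volume(["a", "b", "c"])
v.snapshot()
v.write(1, "B")
v.write(1, "BB")
print(v.blocks, v.read_snapshot(1))
```

Only blocks that change after the snapshot consume extra space, and the second write to the same block does not disturb the saved original.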
Transactional NTFS
As of Windows Vista, applications can use Transactional NTFS to group changes to files together into a transaction. The transaction guarantees that either all changes happen or none of them do, and that applications outside the transaction will not see the changes until the precise instant they are committed. It uses techniques similar to those used for Volume Shadow Copies (i.e. copy-on-write) to ensure that overwritten data can be safely rolled back, and the USN journaling log to mark the transactions that have not yet been committed, or those that have been committed but not yet fully applied (in case of a system crash during a commit by one of the participants).
However, in a transaction-enabled filesystem, this technique can be applied temporarily to any other file on the partition for as long as the transaction is not committed, rather than only to system files that are permanently marked with copy-on-write semantics and implicitly modified within their own local transactions.
The copy-on-write technique is, however, modified in order to allow efficient rollbacks and to avoid fragmenting a filesystem used by possibly many participants: the old data may not be overwritten immediately but kept where it is (notably when it is currently locked by someone else for consistent reads in their own transactions). In that case, only the new uncommitted data is kept in a temporary shadow (rather than the old data, as in ordinary copy-on-write), and it is finally applied using normal VSS copy-on-write when the transaction is committed by the writer. In addition, these temporary shadows for new data, seen only by the participating processes that have their own uncommitted data, are not necessarily written to disk immediately, but may just be maintained in memory or swapped out for later commits. Transactional NTFS does not restrict transactions to the local NTFS volume, but also includes other transactional data or operations in other locations, such as data stored in separate volumes, the local registry, or SQL databases, or the current states of system services or remote services.
These transactions are coordinated network-wide with all participants using a specific service, the Distributed Transactions Coordinator (DTC), to ensure that all participants receive the same commit state, and to transport the changes that have been validated by any participant (so that the others can invalidate their local caches for old data or roll back their ongoing uncommitted changes). Transactional NTFS allows, for example, the creation of network-wide consistent distributed filesystems, including their local live or offline caches.
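The all-or-nothing and isolation guarantees described at the start of this section can be modelled in a few lines. This is a toy sketch over a dictionary standing in for a set of files; it is not the Transactional NTFS API, it is not crash-safe, and the class and method names are illustrative.

```python
class Transaction:
    """Toy transaction over a dict standing in for a set of files."""

    def __init__(self, store):
        self.store = store
        self.pending = {}            # uncommitted writes, private to us

    def write(self, name, data):
        self.pending[name] = data

    def read(self, name):
        # The writer sees its own uncommitted data first.
        return self.pending.get(name, self.store.get(name))

    def commit(self):
        self.store.update(self.pending)   # all changes become visible
        self.pending = {}

    def rollback(self):
        self.pending = {}                 # none of the changes happen

files = {"a.txt": "old"}
tx = Transaction(files)
tx.write("a.txt", "new")
tx.write("b.txt", "extra")
before_commit = dict(files)          # what outside readers still see
tx.commit()
print(before_commit, files)
```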
Encrypting File System (EFS)
EFS provides strong and user-transparent encryption of any file or folder on an NTFS volume. EFS works in conjunction with the EFS service, Microsoft's CryptoAPI and the EFS File System Run-Time Library (FSRTL). EFS works by encrypting a file with a bulk symmetric key (also known as the File Encryption Key, or FEK), which is used because encrypting and decrypting large amounts of data takes much less time with a symmetric cipher than with an asymmetric key cipher. The symmetric key that is used to encrypt the file is then encrypted with a public key that is associated with the user who encrypted the file, and this encrypted data is stored in an alternate data stream of the encrypted file. To decrypt the file, the file system uses the private key of the user to decrypt the symmetric key that is stored in the file header. It then uses the symmetric key to decrypt the file. Because this is done at the file system level, it is transparent to the user. Also, in case of a user losing access to their key, support for additional decryption keys has been built in to the EFS system, so that a recovery agent can still access the files if needed. NTFS-provided encryption and compression are mutually exclusive: NTFS can be used for one and a third-party tool for the other.
The support of EFS is not available in Basic, Home and MediaCenter versions of Windows, and must be activated after installation of Professional, Ultimate and Server versions of Windows or by using enterprise deployment tools within Windows domains.
Quotas
Disk quotas were introduced in NTFS v3. They allow the administrator of a computer that runs a version of Windows that supports NTFS to set a threshold of disk space that users may use. It also allows administrators to keep track of how much disk space each user is using. An administrator may specify a certain level of disk space that a user may use before they receive a warning, and then deny access to the user once they hit their upper limit of space. Disk quotas do not take into account NTFS's transparent file-compression, should this be enabled. Applications that query the amount of free space will also see the amount of free space left to the user who has a quota applied to them.
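The two thresholds, a warning level and a hard limit, amount to a simple comparison. The sketch below is illustrative only; the function name, the byte units, and the return values are assumptions and do not correspond to a Windows API.

```python
def check_quota(used, request, warn_at, limit):
    """Return 'ok', 'warn' or 'deny' for a write of request bytes."""
    total = used + request
    if total > limit:
        return "deny"        # the upper limit would be exceeded
    if total > warn_at:
        return "warn"        # allowed, but past the warning level
    return "ok"

print(check_quota(used=800, request=100, warn_at=850, limit=1000))
```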
The support of disk quotas is not available in Basic, Home and MediaCenter versions of Windows, and must be activated after installation of Professional, Ultimate and Server versions of Windows or by using enterprise deployment tools within Windows domains.
Reparse points
This feature was introduced in NTFS v3. Reparse points are used by associating a reparse tag in the user space attribute of a file or directory. When the object manager (see Windows NT line executive) parses a file system name lookup and encounters a reparse attribute, it knows to reparse the name lookup, passing the user controlled reparse data to every file system filter driver that is loaded into Windows 2000. Each filter driver examines the reparse data to see whether it is associated with that reparse point, and if that filter driver determines a match, then it intercepts the file system call and executes its special functionality. Reparse points are used to implement Volume Mount Points, Directory Junctions, Hierarchical Storage Management, Native Structured Storage, Single Instance Storage, and Symbolic Links.
Volume mount points
Volume mount points are similar to Unix mount points, where the root of another file system is attached to a directory. In NTFS, this allows additional file systems to be mounted without requiring a separate drive letter (such as C: or D:) for each.
Once a volume has been mounted on top of an existing directory of another volume, the contents previously listed in that directory become invisible and are replaced by the content of the root directory of the mounted volume. The mounted volume could still have its own drive letter assigned separately. The file system does not allow volumes to be mutually mounted on each other. Volume mount points can be made to be either persistent (remounted automatically after system reboot) or not persistent (must be manually remounted after reboot).
Mounted volumes may use other file systems than just NTFS; notably they may be remote shared directories, possibly with their own security settings and remapping of access rights according to the remote file system policy.
Directory junctions
Directory junctions are similar to volume mount points, but they reference other directories in the file system instead of other volumes. For instance, the directory C:\exampledir with a directory junction attribute that contains a link to D:\linkeddir will automatically refer to the directory D:\linkeddir when it is accessed by a user-mode application. This function is conceptually similar to symbolic links to directories in Unix, except that the target in NTFS must always be another directory (typical Unix file systems allow the target of a symbolic link to be any type of file) and that junctions have the semantics of a hard link (i.e., they must be immediately resolvable when they are created).
Directory junctions (which can be created with the command MKLINK /J junctionName targetDirectory and removed with RMDIR junctionName from a console prompt) are persistent, and are resolved on the server side, as they share the same security realm of the local system or domain on which the parent volume is mounted and the same security settings for their contents as the content of the target directory; however, the junction itself may have distinct security settings. Unlinking a directory junction does not delete files in the target directory.
Note that some directory junctions are installed by default on Windows Vista, for compatibility with previous versions of Windows, such as Documents and Settings in the root directory of the system drive, which links to the Users physical directory in the root directory of the same volume. However, they are hidden by default, and their security settings are set up so that Windows Explorer will refuse to open them from within the Shell or in most applications, except for the local built-in SYSTEM user or the local Administrators group (both user accounts are used by system software installers).
This additional security restriction was probably added to prevent users from finding apparent duplicate files in the joined directories and deleting them by mistake, because the semantics of directory junctions are not the same as those of hard links: reference counting is not used on the target contents, nor even on the referenced container itself.
Directory junctions are soft links (they will persist even if the target directory is removed), working as a limited form of symbolic links (with an additional restriction on the location of the target), but they are an optimized version which allows faster processing of the reparse point with which they are implemented, with less overhead than the newer NTFS symbolic links, and can be resolved on the server side (when they are found in remote shared directories).
Symbolic links
Symbolic links (or soft links) were introduced in Windows Vista. Symbolic links are resolved on the client side. So when a symbolic link is shared, the target is subject to the access restrictions on the client, and not the server.
Symbolic links can be created either to files (created with MKLINK symLink targetFilename) or to directories (created with MKLINK /D symLinkD targetDirectory), but the semantics of the link must be provided when the link is created. The target, however, need not exist or be available when the symbolic link is created: when the symbolic link is accessed and the target is checked for availability, NTFS will also check whether it has the correct type (file or directory); it will return a not-found error if the existing target has the wrong type.
They can also reference shared directories on remote hosts or files and subdirectories within shared directories: their target is not mounted immediately at boot, but only temporarily on demand while opening them with the OpenFile() or CreateFile() API. Their definition is persistent on the NTFS volume where they are created (all types of symbolic links can be removed as if they were files, using DEL symLink from a command line prompt or batch).
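The point that a symbolic link's target need not exist at creation time can be demonstrated with Python's portable os.symlink call. This is a sketch rather than a Windows-specific recipe: the file names are hypothetical, and on Windows creating symbolic links may require additional privileges, so the snippet is most easily run on a Unix-like system.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    link = os.path.join(tmp, "symLink")
    target = os.path.join(tmp, "missing.txt")   # hypothetical, never created
    os.symlink(target, link)          # creating a dangling link succeeds
    is_link = os.path.islink(link)    # the link itself exists
    resolves = os.path.exists(link)   # follows the link; target is absent
    print(is_link, resolves)
```

The availability check happens only when the link is followed, which matches the behaviour described above.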
Single Instance Storage (SIS)
When there are several directories that have different, but similar, files, some of these files may have identical content. Single instance storage allows identical files to be merged to one file and create references to that merged file. SIS consists of a file system filter that manages copies, modification and merges to files; and a user space service (or groveler) that searches for files that are identical and need merging. SIS was mainly designed for remote installation servers as these may have multiple installation images that contain many identical files; SIS allows these to be consolidated but, unlike for example hard links, each file remains distinct; changes to one copy of a file will leave others unaltered. This is similar to copy-on-write, which is a technique by which memory copying is not really done until one copy is modified.
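The search for identical files performed by the groveler can be approximated by grouping files on a content digest. This is a minimal sketch over in-memory data, not the SIS implementation; the function name, the use of SHA-256, and the sample file names are assumptions.

```python
import hashlib
from collections import defaultdict

def find_duplicates(files):
    """Group file names by content digest; files maps name -> bytes."""
    groups = defaultdict(list)
    for name, content in files.items():
        groups[hashlib.sha256(content).hexdigest()].append(name)
    # Only groups with more than one member are merge candidates.
    return [sorted(names) for names in groups.values() if len(names) > 1]

images = {
    "img1/setup.dll": b"AAAA",
    "img2/setup.dll": b"AAAA",
    "img1/readme.txt": b"v1",
}
print(find_duplicates(images))
```

A real merge step would then keep one stored copy per group while leaving every name independently modifiable, as the paragraph above describes.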
Hierarchical Storage Management (HSM)
Hierarchical Storage Management is a means of transferring files that are not used for some period of time to less expensive storage media. When the file is next accessed, the reparse point on that file determines that it is needed and retrieves it from storage.
Native Structured Storage (NSS)
NSS was an ActiveX document storage technology that has since been discontinued by Microsoft. It allowed ActiveX Documents to be stored in the same multi-stream format that ActiveX uses internally. An NSS file system filter was loaded and used to process the multiple streams transparently to the application, and when the file was transferred to a non-NTFS formatted disk volume it would also transfer the multiple streams into a single stream.
Interoperability
Details on the implementation's internals are not released, which makes it difficult for third-party vendors to provide tools to handle NTFS.
Linux
The ability to read and write to NTFS is provided by the NTFS-3G driver. It is included in most Linux distributions. Other outdated and mostly read-only solutions exist as well:
• Linux kernel 2.2: Kernel versions 2.2.0 and later include the ability to read NTFS partitions
• Linux kernel 2.6: Kernel versions 2.6.0 and later contain a driver written by Anton Altaparmakov (University of Cambridge) and Richard Russon. It supports file read, overwrite and resize.
• NTFSMount: A read/write userspace NTFS driver. It provides read-write access to NTFS, excluding writing compressed and encrypted files, changing file ownership, and access rights.
• Tuxera NTFS: High-performance read/write commercial kernel driver, mainly targeted for embedded devices from Tuxera Ltd which also develops the open source NTFS-3G driver.
• NTFS for Linux: A commercial driver with full read/write support available as free and non-free download(s) from Paragon Software Group.
• Captive NTFS: A 'wrapping' driver which uses Windows' own driver, ntfs.sys.
Note that all three userspace drivers, namely NTFSMount, NTFS-3G and Captive NTFS, are built on the Filesystem in Userspace (FUSE), a Linux kernel module tasked with bridging userspace and kernel code to save and retrieve data. All drivers listed above (except Tuxera NTFS and Paragon NTFS for Linux) are open source (GPL). Due to the complexity of internal NTFS structures, both the built-in 2.6.14 kernel driver and the FUSE drivers disallow changes to the volume that are considered unsafe, to avoid corruption.
Mac OS X
Mac OS X v10.3 and later include read-only support for NTFS-formatted partitions. The GPL-licensed NTFS-3G also works on Mac OS X through FUSE and allows reading and writing to NTFS partitions. A performance enhanced commercial version, called Tuxera NTFS for Mac, is also available from the NTFS-3G developers. NTFS write support has been discovered in Mac OS X 10.6, but has not been activated as of version 10.6.1, although hacks do exist to enable the functionality.
Microsoft Windows
While the different NTFS versions are for the most part fully forward- and backward-compatible, there are technical considerations for mounting newer NTFS volumes in older versions of Microsoft Windows. This affects dual-booting, and external portable hard drives.
For example, attempting to use an NTFS partition with "Previous Versions" (a.k.a. Volume Shadow Copy) on an operating system that doesn't support it, will result in the contents of those previous versions being lost.
Others
eComStation and FreeBSD offer read-only NTFS support (there is a beta NTFS driver that allows write/delete for eComStation, but it is generally considered unsafe). A free third-party tool for BeOS, which was based on NTFS-3G, allows full NTFS read and write. In addition to Linux, NTFS-3G also works on Mac OS X, FreeBSD, NetBSD, Solaris, QNX and Haiku through FUSE. A free-for-personal-use read/write driver for MS-DOS called "NTFS4DOS" also exists.
Compatibility with FAT
Microsoft currently provides a tool (convert.exe) to convert HPFS (only on Windows NT 3), FAT16 and, on Windows 2000 and higher, FAT32 to NTFS, but not the other way around.
Resizing
Various third-party tools are all capable of safely resizing NTFS partitions. Microsoft added the ability to shrink or expand a partition with Windows Vista, but this capability is limited because it will not relocate page file fragments or files that have been marked as unmovable. So shrinking requires relocating or disabling any page file, the index of Windows Search, and any Shadow Copy used by System Restore. Using a 3rd-party tool is an easier option.
Universal time
For historical reasons, the versions of Windows that do not support NTFS all keep time internally as local zone time, and therefore so do all file systems other than NTFS that are supported by current versions of Windows. However, Windows NT and its descendants keep internal timestamps as UTC and make the appropriate conversions for display purposes. Therefore, NTFS timestamps are in UTC. This means that when files are copied or moved between NTFS and non-NTFS partitions, the OS needs to convert timestamps on the fly. But if some files are moved when daylight saving time (DST) is in effect, and other files are moved when standard time is in effect, there can be some ambiguities in the conversions. As a result, especially shortly after one of the days on which local zone time changes, users may observe that some files have timestamps that are incorrect by one hour. Due to the differences in implementation of DST between the northern and southern hemispheres, this can result in a potential timestamp error of up to 4 hours in any given 12 months.
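The one-hour discrepancy can be reproduced with fixed offsets. The sketch assumes a hypothetical zone that is UTC+1 in winter and UTC+2 in summer, and a converter that applies whichever offset is in force at copy time instead of the offset that applied when the file was stamped; the dates and variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

STANDARD = timezone(timedelta(hours=1))   # assumed zone: UTC+1 in winter
DAYLIGHT = timezone(timedelta(hours=2))   # same zone: UTC+2 in summer

# A local-time stamp written in January, when standard time applied.
local_stamp = datetime(2009, 1, 15, 12, 0)

# Conversion with the offset that was in force when the stamp was written.
correct = local_stamp.replace(tzinfo=STANDARD).astimezone(timezone.utc)
# Conversion done later, in summer, with the offset in force at copy time.
naive = local_stamp.replace(tzinfo=DAYLIGHT).astimezone(timezone.utc)

print(correct - naive)
```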
Internals
In NTFS, all file data—file name, creation date, access permissions, and contents—are stored as metadata in the Master File Table. This abstract approach allowed easy addition of file system features during Windows NT's development — an interesting example is the addition of fields for indexing used by the Active Directory software.
NTFS allows any sequence of 16-bit values for name encoding (file names, stream names, index names, etc.). This means UTF-16 codepoints are supported, but the file system does not check whether a sequence is valid UTF-16 (it allows any sequence of short values, not restricted to those in the Unicode standard).
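A short sketch of what this means for software that reads names from an NTFS volume: a name is just a sequence of 16-bit values, so a consumer that insists on valid UTF-16 may reject a name the file system accepted. The helper names below are hypothetical; the decoding calls are standard Python.

```python
import struct

def is_valid_utf16(raw_name: bytes) -> bool:
    """Return True if the 16-bit units form well-formed UTF-16."""
    try:
        raw_name.decode("utf-16-le")
        return True
    except UnicodeDecodeError:
        return False

def decode_leniently(raw_name: bytes) -> str:
    """Decode while keeping unpaired surrogates instead of failing."""
    return raw_name.decode("utf-16-le", errors="surrogatepass")

# "a.b" as three 16-bit little-endian units: well-formed.
valid_name = struct.pack("<3H", 0x0061, 0x002E, 0x0062)

# 'a', an unpaired high surrogate, 'b': storable as 16-bit values,
# but not valid UTF-16.
odd_name = struct.pack("<3H", 0x0061, 0xD800, 0x0062)

print(is_valid_utf16(valid_name))
print(is_valid_utf16(odd_name))
print(len(decode_leniently(odd_name)))
```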
Internally, NTFS uses B+ trees to index file system data. Although complex to implement, this allows faster file look up times in most cases. A file system journal is used to guarantee the integrity of the file system metadata but not individual files' content. Systems using NTFS are known to have improved reliability compared to FAT file systems. The Master File Table (MFT) contains metadata about every file, directory, and metafile on an NTFS volume. It includes filenames, locations, size, and permissions. Its structure supports algorithms which minimize disk fragmentation. A directory entry consists of a filename and a "file ID" which is the record number representing the file in the Master File Table. The file ID also contains a reuse count to detect stale references. While this strongly resembles the W_FID of Files-11, other NTFS structures radically differ.
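The file ID described above can be pictured as one integer holding both the record number and the reuse count. The sketch below assumes the commonly documented 64-bit layout (record number in the low 48 bits, reuse count in the high 16 bits); that split is an assumption not stated in this text, and the function names are hypothetical.

```python
# Assumed layout: low 48 bits = MFT record number, high 16 bits = reuse count.
RECORD_BITS = 48
RECORD_MASK = (1 << RECORD_BITS) - 1

def make_file_id(record_number, reuse_count):
    """Pack a record number and reuse count into a single file ID."""
    return (reuse_count << RECORD_BITS) | (record_number & RECORD_MASK)

def split_file_id(file_id):
    """Return (record_number, reuse_count) for a file ID."""
    return file_id & RECORD_MASK, file_id >> RECORD_BITS

def is_stale(file_id, current_reuse_count):
    """A reference is stale if the record has been reused since it was made."""
    _, reuse_count = split_file_id(file_id)
    return reuse_count != current_reuse_count

reference = make_file_id(record_number=5, reuse_count=3)
print(split_file_id(reference))
# If record 5 is later freed and reused, its count changes and the
# old directory reference no longer matches.
print(is_stale(reference, current_reuse_count=4))
```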
Metafiles
NTFS contains several files which define and organize the file system. In all respects, most of these files are structured like any other user file ($Volume being the most peculiar), but are not of direct interest to file system clients. These metafiles define files, back up critical file system data, buffer file system changes, manage free space allocation, satisfy BIOS expectations, track bad allocation units, and store security and disk space usage information. All content is in an unnamed data stream, unless otherwise indicated.
Segment Number | File Name | Purpose
0 | $MFT | Describes all files on the volume, including file names, timestamps, stream names, and lists of cluster numbers where data streams reside, indexes, security identifiers, and file attributes like "read only", "compressed", "encrypted", etc.
1 | $MFTMirr | Duplicate of the first vital entries of $MFT, usually 4 entries (4 KB).
2 | $LogFile | Contains transaction log of file system metadata changes.
3 | $Volume | Contains information about the volume, namely the volume object identifier, volume label, file system version, and volume flags (mounted, chkdsk requested, requested $LogFile resize, mounted on NT 4, volume serial number updating, structure upgrade request). This data is not stored in a data stream, but in special MFT attributes: If present, a volume object ID is stored in an $OBJECT_ID record; the volume label is stored in a $VOLUME_NAME record, and the remaining volume data is in a $VOLUME_INFORMATION record. Note: volume serial number is stored in file $Boot (below).
4 | $AttrDef | A table of MFT attributes which associates numeric identifiers with names.
5 | . | Root directory. Directory data is stored in $INDEX_ROOT and $INDEX_ALLOCATION attributes both named $I30.
6 | $Bitmap | An array of bit entries: each bit indicates whether its corresponding cluster is used (allocated) or free (available for allocation).
7 | $Boot | Volume boot record. This file is always located at the first clusters on the volume. It contains bootstrap code (see NTLDR/BOOTMGR) and a BIOS parameter block including a volume serial number and cluster numbers of $MFT and $MFTMirr. $Boot is usually 8192 bytes long.
8 | $BadClus | A file which contains all the clusters marked as having bad sectors. This file simplifies cluster management by the chkdsk utility, both as a place to put newly discovered bad sectors, and for identifying unreferenced clusters. This file contains two data streams, even on volumes with no bad sectors: an unnamed stream contains bad sectors (it is zero length for perfect volumes); the second stream is named $Bad and contains all clusters on the volume not in the first stream.
9 | $Secure | Access control list database which reduces overhead having many identical ACLs stored with each file, by uniquely storing these ACLs in this database only (contains two indices $SII: perhaps Security ID Index and $SDH: Security Descriptor Hash which index the stream named $SDS containing actual ACL table).
10 | $UpCase | A table of Unicode uppercase characters for ensuring case insensitivity in Win32 and DOS namespaces.
11 | $Extend | A filesystem directory containing various optional extensions, such as $Quota, $ObjId, $Reparse or $UsnJrnl.
12 ... 23 | | Reserved for $MFT extension entries.
usually 24 | $Extend\$Quota | Holds disk quota information. Contains two index roots, named $O and $Q.
usually 25 | $Extend\$ObjId | Holds distributed link tracking information. Contains an index root and allocation named $O.
usually 26 | $Extend\$Reparse | Holds reparse point data (such as symbolic links). Contains an index root and allocation named $R.
27 ... | file.ext | Beginning of regular file entries.
These metafiles are treated specially by Windows and are difficult to directly view: special purpose-built tools are needed.
From MFT records to attribute lists, attributes, and streams
For each file (or directory) described in the MFT record, there's a linear repository of stream descriptors (also named attributes), packed together in a variable-length record (also named an attributes list), with extra padding to fill the fixed 1KB size of every MFT record, and that fully describes the effective streams associated with that file.
Each stream (or attribute) itself has a single type (internally just a fixed-size integer in the stored descriptor, but most often handled in applications using an equivalent symbolic name in the FileOpen() or FileCreate() API call), a single optional stream name (completely unrelated to the effective filenames), plus optional associated data for that stream. For NTFS, the standard data of files, or the index data for directories are handled the same way as other data for alternate data streams, or for standard attributes. They are just one of the attributes stored in one or several attribute lists.
• For each file described in the MFT record (or in the non-resident repository of stream descriptors, see below), the stream descriptors, identified by their (stream type value, stream name) pair, must be unique. Additionally, NTFS has some ordering constraints for these descriptors.
• There's a predefined null stream type, used to indicate the end of the list of stream descriptors in the streams repository for that file. It must be present as the last stream descriptor in each stream repository (all other storage space available after it is ignored and consists only of padding bytes to match the record size in the MFT or a cluster size in a non-resident streams repository).
• Some stream types are required and must be present in each MFT record, except unused records that are just indicated by a stream with null stream type.
o This is the case for the standard attributes that are stored as a fixed-size record and containing the timestamps and other basic single-bit attributes (compatible with those managed by FAT/FAT32 in DOS or Windows 95/98 applications).
• Some stream types cannot have a name and must remain anonymous.
o This is the case for the standard attributes, or for the preferred NTFS "filename" stream type, or the "short filename" stream type, when it is also present (for compatibility with DOS-like applications, see below). It is also possible for a file to only contain a short filename, in which case it will be the preferred one, as listed in the Windows Explorer.
o The filename streams stored in the streams repository do not make the file immediately accessible through the hierarchical filesystem. In fact, all the filenames must be indexed separately in at least one separate directory on the same volume, with its own MFT entry and its own security descriptors and attributes, that will reference the MFT entry number for that file. This allows the same file or directory to be "hardlinked" several times from several containers on the same volume, possibly with distinct filenames.
• The default data stream of a regular file is a stream of type $DATA but with an anonymous name, and the ADS's are similar but must be named.
• By contrast, the default data stream of a directory has a distinct type and is not anonymous: it has a stream name ("$I30" in NTFS 3+) that reflects its indexing format.
Resident vs. non-resident data streams
To optimize storage and reduce I/O overhead for the very common case of streams with very small associated data, NTFS prefers to place this data within the stream descriptor itself, provided the descriptor does not then exceed the maximum size of the MFT record or the maximum size of a single entry within a non-resident stream repository (see below). Otherwise, the MFT entry space is used to list the clusters containing the data; in that case, the stream descriptor does not store the data directly but just stores an allocation map pointing to the actual data stored elsewhere on the volume. When the stream data can be accessed directly from within the stream descriptor, it is called "resident data" by computer forensics workers. The amount of data that fits is highly dependent on the file's characteristics, but 700 to 800 bytes is common in single-stream files with short filenames and no ACLs.
• Some stream descriptors (such as the preferred filename, the basic file attributes, or the main allocation map for each non-resident stream) cannot be made non-resident.
• Streams encrypted by NTFS, sparse data streams, and compressed data streams cannot be made resident.
• The format of the allocation map for non-resident streams depends on its capability of supporting sparse data storage. In the current implementation of NTFS, once a non-resident stream data has been marked and converted as sparse, it cannot be reverted to non-sparse data, so it cannot become resident again, unless this data is fully truncated, discarding the sparse allocation map completely.
• When a non-resident data stream is too fragmented for its effective allocation map to fit entirely within the MFT record, the allocation map may also be stored as a non-resident stream, with just a small resident stream containing the indirect allocation map to the effective non-resident allocation map of the non-resident data stream.
• When there are too many streams for a file (including ADS's, extended attributes, or security descriptors), so that their descriptors cannot all fit within the MFT record, a non-resident stream may also be used to store an additional repository for the other stream descriptors (except those few small streams that cannot be non-resident), using the same format as the one used in the MFT record, but without the space constraints of the MFT record.
The NTFS filesystem driver will sometimes attempt to relocate the data of some of these non-resident streams into the streams repository, and will also attempt to relocate the stream descriptors stored in a non-resident repository back to the stream repository of the MFT record, based on priority and preferred ordering rules, and size constraints.
Since resident files do not directly occupy clusters ("allocation units"), it is possible for an NTFS volume to contain more files than there are clusters. For example, NTFS formats an 80 GB (74.5 GiB) partition with 19,543,064 clusters of 4 KB. Subtracting system files (a 64 MB log file, a 2,442,888-byte $Bitmap file, and about 25 clusters of fixed overhead) leaves 19,526,158 clusters free for files and indices. Since there are four MFT records per cluster, this volume theoretically could hold almost 4 × 19,526,158 = 78,104,632 resident files.
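The last step of this estimate can be reproduced with a few lines. This is a rough sketch that takes the free-cluster count quoted above as given and assumes the 1 KB MFT record size mentioned earlier; the function name is illustrative.

```python
def max_resident_files(free_clusters, cluster_size=4096, mft_record_size=1024):
    """Upper bound on resident files if every free cluster held MFT records."""
    records_per_cluster = cluster_size // mft_record_size
    return free_clusters * records_per_cluster

# Free-cluster figure taken from the example in the text.
print(max_resident_files(19_526_158))

# For comparison, the number of clusters on the same volume.
print(19_543_064)
```

The bound is theoretical: it ignores directory indexes and any file too large to stay resident.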
Limitations
The following are a few limitations of NTFS:
File Names
File names are limited to 255 UTF-16 code words. Certain names are reserved in the volume root directory and cannot be used for files. These are: $MFT, $MFTMirr, $LogFile, $Volume, $AttrDef, . (dot), $Bitmap, $Boot, $BadClus, $Secure, $Upcase, and $Extend; . (dot) and $Extend are both directories; the others are files. The NT kernel limits full paths to 32,767 UTF-16 code words.
Maximum Volume Size
In theory, the maximum NTFS volume size is 2^64 − 1 clusters. However, the maximum NTFS volume size as implemented in Windows XP Professional is 2^32 − 1 clusters. For example, using 64 KB clusters, the maximum NTFS volume size is 256 TB minus 64 KB. Using the default cluster size of 4 KB, the maximum NTFS volume size is 16 TB minus 4 KB. (Both of these are vastly higher than the 128 GB limit lifted in Windows XP SP1.) Because partition tables on master boot record (MBR) disks only support partition sizes up to 2 TB, dynamic or GPT volumes must be used to create NTFS volumes over 2 TB. Booting from a GPT volume to a Windows environment requires a system with EFI and 64-bit support.
Maximum File Size
Theoretical: 16 EB minus 1 KB (2^64 − 2^10, or 18,446,744,073,709,550,592 bytes). Implementation: 16 TB minus 64 KB (2^44 − 2^16, or 17,592,185,978,880 bytes).
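The volume and file size figures above follow directly from powers of two, as this small check shows. It is only an arithmetic sketch: the units are binary (1 KB = 1024 bytes), and the function and constant names are illustrative.

```python
KB = 1024
TB = 1024 ** 4

def max_volume_bytes(cluster_size, max_clusters=2 ** 32 - 1):
    """Largest volume for a given cluster size and cluster-count limit."""
    return cluster_size * max_clusters

# Implemented volume limits with 64 KB and 4 KB clusters.
print(max_volume_bytes(64 * KB) == 256 * TB - 64 * KB)
print(max_volume_bytes(4 * KB) == 16 * TB - 4 * KB)

# Maximum file sizes quoted in the text.
theoretical_file_limit = 2 ** 64 - 2 ** 10
implemented_file_limit = 2 ** 44 - 2 ** 16
print(theoretical_file_limit)
print(implemented_file_limit)
```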
Alternate Data Streams
Windows system calls may handle alternate data streams. Depending on the operating system, utility and remote file system, a file transfer might silently strip data streams. A safe way of copying or moving files is to use the BackupRead and BackupWrite system calls, which allow programs to enumerate streams, to verify whether each stream should be written to the destination volume and to knowingly skip offending streams.
File Allocation Table or FAT is a computer file system architecture now widely used on many computer systems and most memory cards, such as those used with digital cameras. FAT file systems are commonly found on floppy disks, flash memory cards, digital cameras, and many other portable devices because of their relative simplicity. Performance of FAT compares poorly to most other file systems as it uses overly simplistic data structures, making file operations time-consuming, and makes poor use of disk space in situations where many small files are present.
For floppy disks, the FAT has been standardized as ECMA-107 and ISO/IEC 9293. Those standards include only FAT12 and FAT16 without long filename support; long filename support on FAT is partially patented.
The FAT file system is relatively straightforward technically and is supported by virtually all existing operating systems for personal computers. This makes it a useful format for solid-state memory cards and a convenient way to share data between operating systems.
History
The FAT file system was developed by Bill Gates and Marc McDonald during 1976–1977. It was the primary file system for various operating systems including DR-DOS, FreeDOS, MS-DOS, OS/2 (v1.1) and Microsoft Windows (up until Windows Me).
The FAT file system was created for managing disks in Microsoft Standalone Disk BASIC. In August 1980 Tim Paterson incorporated FAT into his 86-DOS operating system for the S-100 8086 CPU boards; the file system was the main difference between 86-DOS and its predecessor, CP/M.
The name originates from the usage of a table which centralizes the information about which areas belong to files, are free or possibly unusable, and where each file is stored on the disk. To limit the size of the table, disk space is allocated to files in contiguous groups of hardware sectors called clusters. As disk drives have evolved, the maximum number of clusters has dramatically increased, and so the number of bits used to identify each cluster has grown. The successive major versions of the FAT format are named after the number of table element bits: 12, 16, and 32. The FAT standard has also been expanded in other ways while preserving backward compatibility with existing software.
FAT12
The initial version of FAT is now referred to as FAT12. Designed as a file system for floppy disks, it limited cluster addresses to 12-bit values, which not only limited the cluster count to 4078, but made FAT manipulation tricky with the PC's 8-bit and 16-bit registers. (Under Linux, FAT12 is limited to 4084 clusters.) The disk's size is stored as a 16-bit count of sectors, which limited the size to 32 MB. FAT12 was used by several manufacturers with different physical formats, but a typical floppy disk at the time was 5.25-inch, single-sided, 40 tracks, with 8 sectors per track, resulting in a capacity of 160 KB for both the system areas and files. The FAT12 limitations exceeded this capacity by a factor of ten or more.
By convention, all the control structures were organized to fit inside the first track, thus avoiding head movement during read and write operations, although this varied depending on the manufacturer and physical format of the disk. At the time FAT12 was introduced, DOS did not support hierarchical directories, and the maximum number of files was typically limited to a few dozen. Hierarchical directories were introduced in MS-DOS version 2.0. A limitation which was not addressed until much later was that any bad sector in the control structures area, track 0, could prevent the disk from being usable. The DOS formatting tool rejected such disks completely. Bad sectors were allowed only in the file area, where they made the entire holding cluster unusable as well. FAT12 remains in use on all common floppy disks, including 1.44 MB ones.
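To show why 12-bit entries are awkward to handle with 8-bit and 16-bit registers, here is a minimal sketch of reading one entry from a packed FAT12 table, where two entries share three bytes. The function name and the sample bytes are made up for illustration; the packing rule is the conventional FAT12 one.

```python
def read_fat12_entry(fat: bytes, n: int) -> int:
    """Return the 12-bit value of entry n from a packed FAT12 table."""
    offset = n * 3 // 2                            # 1.5 bytes per entry
    word = fat[offset] | (fat[offset + 1] << 8)    # 16-bit little-endian read
    if n % 2:
        return word >> 4      # odd entries use the upper 12 bits
    return word & 0x0FFF      # even entries use the lower 12 bits

# Four entries (0xFF0, 0xFFF, 0x003, 0x004) packed into six bytes.
sample_fat = bytes([0xF0, 0xFF, 0xFF, 0x03, 0x40, 0x00])
print([hex(read_fat12_entry(sample_fat, n)) for n in range(4)])

# Capacity of the typical floppy described above:
# 40 tracks x 8 sectors x 512 bytes (512-byte sectors assumed).
print(40 * 8 * 512 // 1024, "KB")
```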
Initial FAT16
In 1984, IBM released the PC AT, which featured a 20 MB hard disk. Microsoft introduced MS-DOS 3.0 in parallel. (The earlier PC XT was the first PC with a hard drive from IBM, and MS-DOS 2.0 supported that hard drive with FAT12.) Cluster addresses were increased to 16-bit, allowing for up to 65,517 clusters per volume, and consequently much greater file system sizes, at least in theory. However, the maximum possible number of sectors and the maximum (partition, rather than disk) size of 32 MB did not change. Therefore, although technically already "FAT16", this format was not what today is commonly understood as FAT16. With the initial implementation of FAT16 not actually providing for larger partition sizes than FAT12, the early benefit of FAT16 was to enable the use of smaller clusters, making disk usage more efficient, particularly for files several hundred bytes in size, which were far more common at the time. Also, the introduction of FAT16 actually did bring an increase in the maximum partition size under MS-DOS, since the implementation of FAT12 for hard disks in MS-DOS 2.0 was limited to 15 MB. (That is, the initial FAT16 did not support larger drives than FAT12, but MS-DOS 3.0 using FAT16 did support larger drives than MS-DOS 2.0 using FAT12, by a factor of two)
A 20 MB hard disk formatted under MS-DOS 3.0 was not accessible by the older MS-DOS 2.0. (This was because MS-DOS 2.0 did not support version 3.0's FAT16 and because it did not support hard disk partitions over 15 MB in size.) Of course, MS-DOS 3.0 could still access MS-DOS 2.0 style 8 KB-cluster partitions.
MS-DOS 3.0 also introduced support for high-density 1.2 MB 5.25" diskettes, which notably had 15 sectors per track, hence more space for the FATs. This probably prompted a dubious optimization of the cluster size, which went down from 2 sectors to just 1. The net effect was that high-density diskettes were significantly slower than older double-density ones.
Extended partition and logical drives
Apart from improving the structure of the FAT file system itself, a parallel development allowing an increase in the maximum possible FAT size was the introduction of multiple FAT partitions. Originally DOS was only prepared to handle one FAT partition, although it came with documentation and programming tools for the creation of installable device drivers to handle multiple partitions, and third-party suppliers quickly provided the missing software. Aside from that, partitions were used for sharing the disk between operating systems, typically DOS and Xenix at the time. Extra DOS partitions could not be used as boot partitions, because the installable device drivers were loaded (in config.sys) only after the first part of the DOS boot process. Later, third party tools became available that replaced the DOS master boot record (MBR) and directly loaded non-DOS drivers before DOS: such systems generally came with careful warnings that without the 3rd party software, the disk would not be compatible with DOS. Simply allowing several identical-looking DOS partitions could lead to naming problems: behaviour if more than one partition was marked active was undocumented (although well defined), as was the behaviour if there was more than one hard disk in the computer (which was machine dependent), or if the system was booted from a diskette.
To allow the use of more FAT partitions in a compatible way, a new partition type was introduced (in MS-DOS 3.2, January 1986), the extended partition, which is a container for additional partitions called logical drives. Originally only one logical drive was possible, permitting hard disks up to 64 MB. In MS-DOS 3.3 (August 1987) this limit was increased to 24 drives, equal to the maximum number of available letters for drive names (A and B being reserved for the first two floppy drives, at least one of which many, if not most, systems of the era were equipped with; where only one was installed, B always simulated a second drive using A). Logical drives are described by on-disk structures which closely resemble the Master Boot Record (MBR) of the disk (which describes the primary partitions), likely to simplify the implementation. Though some believe these partitions were nested in a way analogous to Russian matryoshka dolls, that isn't the case. They are stored as a row of separate blocks within a single box; these blocks are often referred to as being chained together, by the links in their extended boot record (EBR) sectors. Only one extended partition is allowed. Under MS-DOS, logical drives are not bootable, and the extended partition can only be created after the primary FAT partition, which removes all ambiguity but also eliminates the possibility of booting several DOS versions from the same hard disk. (A few systems other than MS-DOS can boot logical drives, and partitions can be created in any order using third party formatting tools.)
A useful side-effect of the extended partition scheme was to significantly increase the maximum number of partitions possible on a PC hard disk beyond the four which could be described by the MBR alone.
Prior to the introduction of extended partitions, some hard disk controllers (which at that time were usually separate option boards) could make large hard disks appear at the hardware interface level as two separate disks. Otherwise, DOS "Block Device Drivers" were used to access the other 3 possible partitions on a disk.
Final FAT16
Finally in November 1987, Compaq DOS 3.31 (an OEM version of MS-DOS 3.3 released by Compaq with their machines) introduced what is today called the FAT16 format, with the expansion of the 16-bit disk sector count to 32 bits. The result was initially called the DOS 3.31 Large File System. Although the on-disk changes were minor, the entire DOS disk driver had to be converted to use 32-bit sector numbers, a task complicated by the fact that it was written in 16-bit assembly language.
In 1988 this improvement became more generally available through MS-DOS 4.0 and OS/2 1.1. The limit on partition size was dictated by the 8-bit signed count of sectors per cluster, which had a maximum power-of-two value of 64. With the standard hard disk sector size of 512 bytes, this gives a maximum of 32 KB clusters, thereby fixing the "definitive" limit for the FAT16 partition size at 2 gigabytes. On magneto-optical media, which can have 1 or 2 KB sectors instead of 1/2 KB, this size limit is proportionally larger.
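The 2 GB figure can be retraced with simple arithmetic. The sketch below assumes roughly 2^16 addressable clusters as a ceiling (the exact usable count is slightly lower); the function names are illustrative.

```python
GB = 1024 ** 3

def fat16_max_cluster_bytes(max_sectors_per_cluster=64, bytes_per_sector=512):
    """Largest cluster size for a given sectors-per-cluster limit."""
    return max_sectors_per_cluster * bytes_per_sector

def fat16_max_partition_bytes(max_sectors_per_cluster=64,
                              bytes_per_sector=512,
                              cluster_slots=2 ** 16):
    """Approximate partition ceiling: cluster size times addressable clusters."""
    cluster = fat16_max_cluster_bytes(max_sectors_per_cluster, bytes_per_sector)
    return cluster * cluster_slots

# Standard 512-byte sectors: 32 KB clusters and a 2 GB ceiling.
print(fat16_max_cluster_bytes() // 1024, "KB")
print(fat16_max_partition_bytes() // GB, "GB")

# Magneto-optical media with 2 KB sectors scale the limit proportionally.
print(fat16_max_partition_bytes(bytes_per_sector=2048) // GB, "GB")
```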
Much later, Windows NT increased the maximum cluster size to 64 KB by considering the sectors-per-cluster count as unsigned. However, the resulting format was not compatible with any other FAT implementation of the time, and it generated greater internal fragmentation. Windows 98 also supported reading and writing this variant, but its disk utilities did not work with it.
The number of root directory entries available is determined when the volume is formatted, and is stored in a 16-bit signed field, defining an absolute limit of 32767 entries (32736, a multiple of 32, in practice). For historical reasons, FAT12 and FAT16 media generally use 512 root directory entries on non-floppy media. Other sizes may be incompatible with some software or devices (entries being file and/or folder names in the original 8.3 format). Some third-party tools like mkdosfs allow the user to set this parameter.
Long file names
One of the user experience goals for the designers of Windows 95 was the ability to use long filenames (LFNs—up to 255 UTF-16 code points long), in addition to classic 8.3 filenames. LFNs were implemented using a workaround in the way directory entries are laid out (see below).
The version of the file system with this extension is usually known as VFAT after the Windows 95 virtual device driver, also known as "Virtual FAT" in Microsoft's documentation. Interestingly, the VFAT driver actually appeared before Windows 95, in Windows for Workgroups 3.11, but was only used for implementing 32-bit file access and did not support long file names.
In Windows NT, support for long filenames on FAT started from version 3.5. OS/2 added long filename support to FAT using extended attributes (EA) before the introduction of VFAT; thus, VFAT long filenames are invisible to OS/2, and EA long filenames are invisible to Windows.
FAT32
In order to overcome the size limit of FAT16, while at the same time allowing DOS real mode code to handle the format, and without reducing available conventional memory unnecessarily, Microsoft implemented a next generation, known as FAT32. Cluster values are represented by 32-bit numbers, of which 28 bits are used to hold the cluster number, for a maximum of approximately 268 million (2^28) clusters. This allows for drive sizes of up to 8 terabytes with 32 KB clusters, but the boot sector uses a 32-bit field for the sector count, limiting volume size to 2 TB on a hard disk with 512-byte sectors.
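The two FAT32 ceilings described above can be compared side by side. This is a small arithmetic sketch using binary units; the function name is illustrative.

```python
KB = 1024
TB = 1024 ** 4

def fat32_volume_limits(cluster_size=32 * KB, bytes_per_sector=512):
    """Return (cluster-count limit, sector-count limit, effective limit) in bytes."""
    cluster_limit = (2 ** 28) * cluster_size      # 28-bit cluster numbers
    sector_limit = (2 ** 32) * bytes_per_sector   # 32-bit sector count field
    return cluster_limit, sector_limit, min(cluster_limit, sector_limit)

cluster_limit, sector_limit, effective = fat32_volume_limits()
print(cluster_limit // TB, "TB from the cluster count")
print(sector_limit // TB, "TB from the sector count")
print(effective // TB, "TB effective")
```

The smaller of the two values is the one that binds, which is why the sector count field, not the cluster number width, sets the practical limit with 512-byte sectors.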
On Windows 95/98, due to the version of Microsoft's SCANDISK utility included with these operating systems being a 16-bit application, the FAT structure is not allowed to grow beyond around 4.2 million (< 2^22) clusters, placing the volume limit at 127.53 GiB. A limitation in original versions of Windows 98/98SE's Fdisk utility causes it to incorrectly report disk sizes over 64 GB. A corrected version is available from Microsoft, but it cannot partition drives larger than 512 GB. The Windows 2000/XP installation program and filesystem creation tool imposes a limitation of 32 GB. However, both systems can read and write to FAT32 file systems of any size. This limitation is by design and according to Microsoft was imposed because many tasks on a very large FAT32 file system become slow and inefficient. This limitation can be bypassed by using third-party formatting utilities. Windows Me supports the FAT32 file system without any limits. However, similarly to Windows 95/98/98SE there is no native support for 48-bit LBA in Windows Me, meaning that the maximum disk size for ATA disks is 127.6 GiB, the maximum size of an ATA disk using the previous long-standard 28-bit LBA.
FAT32 was introduced with Windows 95 OSR2, although reformatting was needed to use it, and DriveSpace 3 (the version that came with Windows 95 OSR2 and Windows 98) never supported it. Windows 98 introduced a utility to convert existing hard disks from FAT16 to FAT32 without loss of data. In the NT line, native support for FAT32 arrived in Windows 2000. A free FAT32 driver for Windows NT 4.0 was available from Winternals, a company later acquired by Microsoft. Since the acquisition the driver is no longer officially available.
The maximum possible size for a file on a FAT32 volume is 4 GB minus 1 byte (2^32 − 1 bytes). Video applications, large databases, and some other software easily exceed this limit. Larger files require another formatting type such as NTFS.
Fragmentation
The FAT file system does not contain mechanisms which prevent newly written files from becoming scattered across the partition. Other file systems, like HPFS, use free space bitmaps that indicate used and available clusters, which could then be quickly looked up in order to find free contiguous areas (improved in exFAT). Another solution is the linkage of all free clusters into one or more lists (as is done in Unix file systems). Instead, the FAT has to be scanned as an array to find free clusters, which can lead to performance penalties with large disks.
In fact, computing free disk space on FAT is one of the most resource intensive operations, as it requires reading the entire FAT linearly. A possible justification suggested by Microsoft's Raymond Chen for limiting the maximum size of FAT32 partitions created on Windows was the time required to perform a "DIR" operation, which always displays the free disk space as the last line. Displaying this line took longer and longer as the number of clusters increased.
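The cost of this linear scan is easy to see in a sketch. The example below assumes a FAT16 table held in memory, where a free cluster is marked by an entry value of 0 and the first two entries are reserved; the function name and the tiny sample table are illustrative.

```python
import struct

FREE_ENTRY = 0x0000  # FAT16 marks an unallocated cluster with 0

def count_free_clusters_fat16(fat: bytes, cluster_count: int) -> int:
    """Count free clusters by visiting every FAT entry once."""
    free = 0
    # Entries 0 and 1 are reserved, so data clusters start at index 2.
    for n in range(2, cluster_count + 2):
        (entry,) = struct.unpack_from("<H", fat, n * 2)
        if entry == FREE_ENTRY:
            free += 1
    return free

# Two reserved entries, a two-cluster file (2 -> 3 -> end), two free clusters.
sample_fat = struct.pack("<6H", 0xFFF8, 0xFFFF, 0x0003, 0xFFFF, 0x0000, 0x0000)
print(count_free_clusters_fat16(sample_fat, cluster_count=4))
```

There is no shortcut in this layout: the work grows in direct proportion to the number of clusters, which matches the slowdown described above.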
The High Performance File System (HPFS) divides disk space into bands, which have their own free space bitmap, where multiple files opened for simultaneous write could be expanded separately.
Some of the perceived problems with fragmentation resulted from operating system and hardware limitations.
The single-tasking DOS and the traditionally single-tasking PC hard disk architecture (only 1 outstanding input/output request at a time, no DMA transfers) did not contain mechanisms which could alleviate fragmentation by asynchronously prefetching next data while the application was processing the previous chunks.
Similarly, write-behind caching was often not enabled by default with Microsoft software (if present) given the problem of data loss in case of a crash, made easier by the lack of hardware protection between applications and the system.
MS-DOS also did not offer a system call which would allow applications to make sure a particular file has been completely written to disk in the presence of deferred writes (cf. fsync in Unix or DosBufReset in OS/2). Disk caches on MS-DOS were operating on disk block level and were not aware of higher-level structures of the file system. In this situation, cheating with regard to the real progress of a disk operation was most dangerous.
Modern operating systems have introduced these optimizations to FAT partitions, but optimizations can still produce unwanted artifacts in case of a system crash. A Windows NT system will allocate space to files on FAT in advance, selecting large contiguous areas, but in case of a crash, files which were being appended will appear larger than they were ever written into, with dozens of random kilobytes at the end.
With the large cluster sizes (16 or 32 KB) forced by larger FAT32 partitions, external fragmentation becomes somewhat less significant, and internal fragmentation, i.e. disk space waste (since files are rarely exact multiples of the cluster size), starts to be a problem as well, especially when there are a great many small files.
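Internal fragmentation can be quantified as the unused tail of a file's last cluster. The following sketch uses made-up file sizes and an illustrative function name.

```python
def slack_bytes(file_size, cluster_size):
    """Bytes wasted in the last cluster allocated to a file."""
    if file_size == 0:
        return 0  # an empty file occupies no clusters
    clusters = -(-file_size // cluster_size)  # ceiling division
    return clusters * cluster_size - file_size

KB = 1024
small_files = [300, 1 * KB, 5 * KB, 40 * KB]  # hypothetical file sizes

for cluster_size in (4 * KB, 32 * KB):
    wasted = sum(slack_bytes(size, cluster_size) for size in small_files)
    print(cluster_size // KB, "KB clusters waste", wasted, "bytes")
```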
Design
Overview
The following is an overview of the order of structures in a FAT partition or disk:
Contents | Size in sectors
Boot Sector, FS Information Sector (FAT32 only), more reserved sectors (optional) | (number of reserved sectors)
File Allocation Table #1, File Allocation Table #2 | (number of FATs) * (sectors per FAT)
Root Directory (FAT12/16 only) | (number of root entries * 32) / (bytes per sector)
Data Region (for files and directories), to end of partition or disk | NumberOfClusters * SectorsPerCluster
A FAT file system is composed of four different sections.
1. The Reserved sectors, located at the very beginning. The first reserved sector (sector 0) is the Boot Sector (aka Partition Boot Record). It includes an area called the BIOS Parameter Block (with some basic file system information, in particular its type, and pointers to the location of the other sections) and usually contains the operating system's boot loader code. The total count of reserved sectors is indicated by a field inside the Boot Sector. Important information from the Boot Sector is accessible through an operating system structure called the Drive Parameter Block in DOS and OS/2. For FAT32 file systems, the reserved sectors include a File System Information Sector at sector 1 and a Backup Boot Sector at Sector 6.
2. The FAT Region. This typically contains two copies (may vary) of the File Allocation Table for the sake of redundancy checking, although the extra copy is rarely used, even by disk repair utilities. These are maps of the Data Region, indicating which clusters are used by files and directories.
3. The Root Directory Region. This is a Directory Table that stores information about the files and directories located in the root directory. It is only used with FAT12 and FAT16, and imposes on the root directory a fixed maximum size which is pre-allocated at creation of this volume. FAT32 stores the root directory in the Data Region, along with files and other directories, allowing it to grow without such a constraint. Thus, for FAT32, the Data Region starts here.
4. The Data Region. This is where the actual file and directory data is stored and takes up most of the partition. The size of files and subdirectories can be increased arbitrarily (as long as there are free clusters) by simply adding more links to the file's chain in the FAT. Note however, that files are allocated in units of clusters, so if a 1 KB file resides in a 32 KB cluster, 31 KB are wasted. FAT32 typically commences the Root Directory Table in cluster number 2: the first cluster of the Data Region.
FAT uses little endian format for entries in the header and the FAT(s).
Boot Sector
It is important to note that the first sector on a device isn't necessarily the boot sector. For partitioned devices (such as hard drives), the first sector is the Master Boot Record. On un-partitioned devices (e.g. floppy disks), the first sector is the Volume Boot Record.
Common structure of the first 36 bytes used by all FAT versions:
Byte Offset Length (bytes) Description
0x00 3 Jump instruction. This instruction will be executed and will skip past the rest of the (non-executable) header if the partition is booted from. See Volume Boot Record. If the jump is two-byte near jmp it is followed by a NOP instruction.
0x03 8 OEM Name (padded with spaces). This value identifies the system on which the disk was formatted. MS-DOS checks this field to determine which other parts of the boot record can be relied on. Common values are IBM 3.3 (with two spaces between the "IBM" and the "3.3"), MSDOS5.0, MSWIN4.1 and mkdosfs.
0x0b 2 Bytes per sector. A common value is 512, especially for file systems on IDE (or compatible) disks. The BIOS Parameter Block starts here.
0x0d 1 Sectors per cluster. Allowed values are powers of two from 1 to 128. However, the value must not be such that the number of bytes per cluster becomes greater than 32 KB.
0x0e 2 Reserved sector count. The number of sectors before the first FAT in the file system image. Should be 1 for FAT12/FAT16. Usually 32 for FAT32.
0x10 1 Number of file allocation tables. Almost always 2.
0x11 2 Maximum number of root directory entries. Only used on FAT12 and FAT16, where the root directory is handled specially. Should be 0 for FAT32. This value should always be such that the root directory ends on a sector boundary (i.e. such that its size becomes a multiple of the sector size). 224 is typical for floppy disks.
0x13 2 Total sectors (if zero, use 4 byte value at offset 0x20)
0x15 1 Media descriptor
0xF0 3.5" Double Sided, 80 tracks per side, 18 or 36 sectors per track (1.44MB or 2.88MB). 5.25" Double Sided, 80 tracks per side, 15 sectors per track (1.2MB). Used also for other media types.
0xF8 Fixed disk (i.e. Hard disk).
0xF9 3.5" Double sided, 80 tracks per side, 9 sectors per track (720K). 5.25" Double sided, 80 tracks per side, 15 sectors per track (1.2MB)
0xFA 5.25" Single sided, 80 tracks per side, 8 sectors per track (320K)
0xFB 3.5" Double sided, 80 tracks per side, 8 sectors per track (640K)
0xFC 5.25" Single sided, 40 tracks per side, 9 sectors per track (180K)
0xFD 5.25" Double sided, 40 tracks per side, 9 sectors per track (360K). Also used for 8".
0xFE 5.25" Single sided, 40 tracks per side, 8 sectors per track (160K). Also used for 8".
0xFF 5.25" Double sided, 40 tracks per side, 8 sectors per track (320K)
The same media descriptor value should be repeated as the first byte of each copy of the FAT. Certain operating systems (MSX-DOS version 1.0) ignore the boot sector parameters altogether and use the media descriptor value from the first byte of the FAT to determine file system parameters.
0x16 2 Sectors per File Allocation Table for FAT12/FAT16
0x18 2 Sectors per track
0x1a 2 Number of heads
0x1c 4 Count of hidden sectors preceding the partition that contains this FAT volume. This field should always be zero on media that are not partitioned.
0x20 4 Total sectors (if greater than 65535; otherwise, see offset 0x13)
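As an illustration of the table above, the common fields can be read from a raw 512-byte boot sector with explicit little-endian loads. This is only a sketch: the struct fat_bpb, its member names, and the functions rd16, rd32 and fat_parse_bpb are hypothetical names chosen here, and no validation of the values is attempted.

```c
#include <stdint.h>

/* FAT stores multi-byte fields least significant byte first. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical in-memory form of the fields shared by all FAT versions. */
struct fat_bpb {
    uint16_t bytes_per_sector;    /* offset 0x0b */
    uint8_t  sectors_per_cluster; /* offset 0x0d */
    uint16_t reserved_sectors;    /* offset 0x0e */
    uint8_t  num_fats;            /* offset 0x10 */
    uint16_t root_entries;        /* offset 0x11 */
    uint32_t total_sectors;       /* offset 0x13, or 0x20 if 0x13 is zero */
    uint8_t  media;               /* offset 0x15 */
    uint16_t sectors_per_fat16;   /* offset 0x16 */
};

void fat_parse_bpb(const uint8_t *sec, struct fat_bpb *b)
{
    b->bytes_per_sector    = rd16(sec + 0x0b);
    b->sectors_per_cluster = sec[0x0d];
    b->reserved_sectors    = rd16(sec + 0x0e);
    b->num_fats            = sec[0x10];
    b->root_entries        = rd16(sec + 0x11);
    b->total_sectors       = rd16(sec + 0x13);
    if (b->total_sectors == 0)          /* fall back to the 4-byte field */
        b->total_sectors = rd32(sec + 0x20);
    b->media               = sec[0x15];
    b->sectors_per_fat16   = rd16(sec + 0x16);
}
```

Reading byte by byte instead of casting the buffer to a packed struct keeps the sketch independent of host endianness and alignment rules.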
Extended BIOS Parameter Block
Further structure used by FAT12 and FAT16, also known as Extended BIOS Parameter Block:
Byte Offset Length (bytes) Description
0x24 1 Physical drive number (0x00 for removable media, 0x80 for hard disks)
0x25 1 Reserved ("current head")
In Windows NT, bit 0 is a dirty flag that requests chkdsk at boot time; bit 1 requests a surface scan as well.
0x26 1 Extended boot signature. (Should be 0x29. Indicates that the following 3 entries exist.)
0x27 4 ID (serial number)
0x2b 11 Volume Label, padded with blanks (0x20).
0x36 8 FAT file system type, padded with blanks (0x20), e.g.: "FAT12 ", "FAT16 ". This is not meant to be used to determine drive type, however, some utilities use it in this way.
0x3e 448 Operating system boot code
0x1FE 2 Boot sector signature (0x55 0xAA)
The boot sector is portrayed here as found on e.g. an OS/2 1.3 boot diskette. Earlier versions used a shorter BIOS Parameter Block and their boot code would start earlier (for example at offset 0x2b in OS/2 1.1).
Further structure used by FAT32:
Byte Offset Length (bytes) Description
0x24 4 Sectors per file allocation table
0x28 2 FAT Flags (Only used during a conversion from a FAT12/16 volume.)
0x2a 2 Version (Defined as 0)
0x2c 4 Cluster number of root directory start
0x30 2 Sector number of FS Information Sector
0x32 2 Sector number of a copy of this boot sector (0 if no backup copy exists)
0x34 12 Reserved
0x40 1 Physical Drive Number (see FAT12/16 BPB at offset 0x24)
0x41 1 Reserved (see FAT12/16 BPB at offset 0x25)
0x42 1 Extended boot signature. (see FAT12/16 BPB at offset 0x26)
0x43 4 ID (serial number)
0x47 11 Volume Label
0x52 8 FAT file system type: "FAT32 "
0x5a 420 Operating system boot code
0x1FE 2 Boot sector signature (0x55 0xAA)
Exceptions
The implementation of FAT used in MS-DOS for the Apricot PC had a different boot sector layout, to accommodate that computer's non-IBM compatible BIOS. The jump instruction and OEM name were omitted, and the MS-DOS file system parameters (offsets 0x0B - 0x17 in the standard sector) were located at offset 0x50. Later versions of Apricot MS-DOS gained the ability to read and write disks with the standard boot sector in addition to those with the Apricot one.
DOS Plus on the BBC Master 512 did not use conventional boot sectors at all. Data disks omitted the boot sector and began with a single copy of the FAT (the first byte of the FAT was used to determine disk capacity) while boot disks began with a miniature ADFS file system containing the boot loader, followed by a single FAT. It could also access standard PC disks formatted to 180 KB or 360 KB, again using the first byte of the FAT to determine the capacity.
FS Information Sector
The "FS Information Sector" was introduced in FAT32[30] for speeding up access times of certain operations (in particular, getting the amount of free space). It is located at a sector number specified in the boot record at position 0x30 (usually sector 1, immediately after the boot record).
Byte Offset Length (bytes) Description
0x00 4 FS information sector signature (0x52 0x52 0x61 0x41 / "RRaA")
0x04 480 Reserved (byte values are 0x00)
0x1e4 4 FS information sector signature (0x72 0x72 0x41 0x61 / "rrAa")
0x1e8 4 Number of free clusters on the drive, or -1 if unknown
0x1ec 4 Number of the most recently allocated cluster
0x1f0 14 Reserved (byte values are 0x00)
0x1fe 2 FS information sector signature (0x55 0xAA)
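A reader of this sector would normally check all three signatures before trusting the free-cluster count. The sketch below assumes a 512-byte sector buffer; the names fsinfo_rd32, fsinfo_free_clusters and FSINFO_UNKNOWN are hypothetical.

```c
#include <stdint.h>

#define FSINFO_UNKNOWN 0xFFFFFFFFu   /* -1 in the table: count not known */

static uint32_t fsinfo_rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 and stores the free-cluster field if all signatures match,
 * 0 otherwise. The stored value may still be FSINFO_UNKNOWN. */
int fsinfo_free_clusters(const uint8_t *sec, uint32_t *free_out)
{
    if (fsinfo_rd32(sec + 0x000) != 0x41615252u)  /* bytes "RRaA" */
        return 0;
    if (fsinfo_rd32(sec + 0x1e4) != 0x61417272u)  /* bytes "rrAa" */
        return 0;
    if (sec[0x1fe] != 0x55 || sec[0x1ff] != 0xAA)
        return 0;
    *free_out = fsinfo_rd32(sec + 0x1e8);
    return 1;
}
```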
File Allocation Table
A partition is divided up into identically sized clusters, small blocks of contiguous space. Cluster sizes vary depending on the type of FAT file system being used and the size of the partition; typical cluster sizes lie somewhere between 2 KB and 32 KB. Each file may occupy one or more of these clusters depending on its size; thus, a file is represented by a chain of these clusters (referred to as a singly linked list). However, these clusters are not necessarily stored adjacent to one another on the disk's surface but are often instead fragmented throughout the Data Region.
The File Allocation Table (FAT) is a list of entries that map to each cluster on the partition. Each entry records one of five things:
• the cluster number of the next cluster in a chain
• a special end of clusterchain (EOC) entry that indicates the end of a chain
• a special entry to mark a bad cluster
• a special entry to mark a reserved cluster[citation needed]
• a zero to note that the cluster is unused
Each version of the FAT file system uses a different size for FAT entries. Smaller numbers result in a smaller FAT table, but waste space in large partitions by needing to allocate in large clusters. The FAT12 file system uses 12 bits per FAT entry, thus two entries span 3 bytes. It is consistently little-endian: if you consider the 3 bytes as one little-endian 24-bit number, the 12 least significant bits are the first entry and the 12 most significant bits are the second.
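The 12-bit packing can be unpacked as follows. This is a minimal sketch; the function name fat12_get is hypothetical, and the caller is assumed to pass a buffer that holds the whole FAT so that the two bytes read are in range.

```c
#include <stddef.h>
#include <stdint.h>

/* Return FAT12 entry number n from a raw copy of the FAT. */
uint16_t fat12_get(const uint8_t *fat, uint32_t n)
{
    size_t off = n + n / 2;   /* each entry is 1.5 bytes long */
    uint16_t pair = (uint16_t)(fat[off] | (fat[off + 1] << 8));

    /* Even entries use the low 12 bits, odd entries the high 12 bits. */
    return (n & 1) ? (uint16_t)(pair >> 4) : (uint16_t)(pair & 0x0FFF);
}
```

With the bytes 0x03 0x40 0x00 holding entries 2 and 3, entry 2 decodes to 0x003 and entry 3 to 0x004.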
In the FAT32 file system, FAT entries are 32 bits, but only 28 of these are actually used; the 4 most significant bits are reserved.
FAT entry values:
FAT12 FAT16 FAT32 Description
0x000 0x0000 0x00000000 Free Cluster
0x001 0x0001 0x00000001 Reserved value; do not use
0x002–0xFEF 0x0002–0xFFEF 0x00000002–0x0FFFFFEF Used cluster; value points to next cluster
0xFF0–0xFF6 0xFFF0–0xFFF6 0x0FFFFFF0–0x0FFFFFF6 Reserved values; do not use[28].
0xFF7 0xFFF7 0x0FFFFFF7 Bad sector in cluster or reserved cluster
0xFF8–0xFFF 0xFFF8–0xFFFF 0x0FFFFFF8–0x0FFFFFFF Last cluster in file
Note that FAT32 uses only 28 bits of the 32 possible bits. The upper 4 bits are usually zero (as indicated in the table above) but are reserved and should be left untouched.
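The FAT32 column of the table above translates into a small classifier. The enum and function names below are hypothetical; the sketch masks off the reserved upper 4 bits before comparing.

```c
#include <stdint.h>

enum fat_entry_kind { FAT_FREE, FAT_RESERVED, FAT_NEXT, FAT_BAD, FAT_EOC };

/* Classify a raw 32-bit FAT32 entry. For FAT_NEXT, *next receives the
 * number of the following cluster in the chain. */
enum fat_entry_kind fat32_classify(uint32_t raw, uint32_t *next)
{
    uint32_t v = raw & 0x0FFFFFFFu;     /* only 28 bits are significant */

    if (v == 0)
        return FAT_FREE;
    if (v == 1)
        return FAT_RESERVED;
    if (v <= 0x0FFFFFEFu) {
        *next = v;
        return FAT_NEXT;
    }
    if (v <= 0x0FFFFFF6u)
        return FAT_RESERVED;
    if (v == 0x0FFFFFF7u)
        return FAT_BAD;
    return FAT_EOC;                     /* 0x0FFFFFF8 to 0x0FFFFFFF */
}
```

Code that writes an entry back should preserve the upper 4 bits of the old value rather than clearing them.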
The first cluster of the Data Region is cluster #2. That leaves the first two entries of the FAT unused. In the first byte of the first entry a copy of the media descriptor is stored. The remaining 8 bits (if FAT16), or 20 bits (if FAT32) of this entry are 1. In the second entry the end-of-cluster-chain marker is stored. The high order two bits of the second entry are sometimes, in the case of FAT16 and FAT32, used for dirty volume management: high order bit 1: last shutdown was clean; next highest bit 1: during the previous mount no disk I/O errors were detected.[31]
Directory table
A directory table is a special type of file that represents a directory (also known as a folder). Each file or directory stored within it is represented by a 32-byte entry in the table. Each entry records the name, extension, attributes (archive, directory, hidden, read-only, system and volume), the date and time of creation, the address of the first cluster of the file/directory's data and finally the size of the file/directory. Aside from the Root Directory Table in FAT12 and FAT16 file systems, which occupies the special Root Directory Region location, all Directory Tables are stored in the Data Region. The actual number of entries in a directory stored in the Data Region can grow by adding another cluster to the chain in the FAT.
Note that before each entry there can be "fake entries" to support the Long File Name. (See further down the article).
Legal characters for DOS file names include the following:
• Upper case letters A–Z
• Numbers 0–9
• Space (though trailing spaces in either the base name or the extension are considered padding and not part of the file name; also, file names containing spaces could not be used on the DOS command line prior to Windows 95 because of the lack of a suitable escaping system)
• ! # $ % & ' ( ) - @ ^ _ ` { } ~
• Values 128–255
This excludes the following ASCII characters:
• " * / : < > ? \ | (Windows/MS-DOS has no shell escape character)
• + , . ; = [ ] (allowed in long file names only)
• Lower case letters a–z (stored as A–Z; allowed in long file names)
• Control characters 0–31
• Value 127 (DEL)
The DOS file names are in the OEM character set.
Directory entries, both in the Root Directory Region and in subdirectories, are of the following format (see also 8.3 filename):
Byte Offset Length Description
0x00 8 DOS file name (padded with spaces)
The first byte can have the following special values:
0x00 Entry is available and no subsequent entry is in use
0x05 Initial character is actually 0xE5.
0x2E 'Dot' entry; either '.' or '..'
0xE5 Entry has been previously erased and is available. File undelete utilities must replace this character with a regular character as part of the undeletion process.
0x08 3 DOS file extension (padded with spaces)
0x0b 1 File Attributes
Bit Mask Description
0 0x01 Read Only
1 0x02 Hidden
2 0x04 System
3 0x08 Volume Label
4 0x10 Subdirectory
5 0x20 Archive
6 0x40 Device (internal use only, never found on disk)
7 0x80 Unused
An attribute value of 0x0F is used to designate a long file name entry.
0x0c 1 Reserved; two bits are used by NT and later versions to encode case information (see below); otherwise 0[32]
0x0d 1 Create time, fine resolution: 10ms units, values from 0 to 199.
0x0e 2 Create time. The hour, minute and second are encoded according to the following bitmap:
Bits Description
15-11 Hours (0-23)
10-5 Minutes (0-59)
4-0 Seconds/2 (0-29)
Note that the seconds is recorded only to a 2 second resolution. Finer resolution for file creation is found at offset 0x0d.
0x10 2 Create date. The year, month and day are encoded according to the following bitmap:
Bits Description
15-9 Year (0 = 1980, 127 = 2107)
8-5 Month (1 = January, 12 = December)
4-0 Day (1 - 31)
0x12 2 Last access date; see offset 0x10 for description.
0x14 2 EA-Index (used by OS/2 and NT) in FAT12 and FAT16, High 2 bytes of first cluster number in FAT32
0x16 2 Last modified time; see offset 0x0e for description.
0x18 2 Last modified date; see offset 0x10 for description.
0x1a 2 First cluster in FAT12 and FAT16. Low 2 bytes of first cluster in FAT32. Entries with the Volume Label flag, subdirectory ".." pointing to root, and empty files with size 0 should have first cluster 0.
0x1c 4 File size in bytes. Entries with the Volume Label or Subdirectory flag set should have a size of 0.
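The packed date and time fields at offsets 0x0e to 0x18 can be unpacked with shifts and masks. The following is a sketch; struct fat_datetime and fat_decode_datetime are hypothetical names, and no range checking is done on the decoded values.

```c
#include <stdint.h>

struct fat_datetime { int year, month, day, hour, minute, second; };

/* Decode a 16-bit FAT date and a 16-bit FAT time (already converted
 * from little-endian) into calendar fields. */
struct fat_datetime fat_decode_datetime(uint16_t date, uint16_t time)
{
    struct fat_datetime t;

    t.year   = 1980 + (date >> 9);      /* bits 15-9 */
    t.month  = (date >> 5) & 0x0F;      /* bits 8-5  */
    t.day    = date & 0x1F;             /* bits 4-0  */
    t.hour   = time >> 11;              /* bits 15-11 */
    t.minute = (time >> 5) & 0x3F;      /* bits 10-5  */
    t.second = (time & 0x1F) * 2;       /* bits 4-0, 2-second units */
    return t;
}
```

For the creation time, the byte at offset 0x0d (10 ms units, 0 to 199) can be added on top of the decoded seconds to recover the finer resolution.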
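Clusters are numbered starting from 2, as described above, and a file's first cluster number is stored at offset 0x1a of its directory entry (with the high 2 bytes at offset 0x14 on FAT32). The first sector of the data for cluster X can therefore be calculated from the Boot Sector fields: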
For FAT32
FileStartSector = ReservedSectors(0x0e) + (NumofFAT(0x10) * Sectors2FAT(0x24)) + ((X − 2) * SectorsPerCluster(0x0d))
For FAT16/12
FileStartSector = ReservedSectors(0x0e) + (NumofFAT(0x10) * Sectors2FAT(0x16)) + (MaxRootEntry(0x11) * 32 / BytesPerSector(0x0b)) + ((X − 2) * SectorsPerCluster(0x0d))
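Both formulas can be expressed as one function by treating the root directory size as zero on FAT32. This is a sketch under that assumption; the function and parameter names are hypothetical, and the result is relative to the start of the volume. The plain integer division relies on the root directory ending on a sector boundary.

```c
#include <stdint.h>

/* First sector (relative to the start of the volume) of cluster X.
 * Pass root_entries = 0 and the 0x24 sectors-per-FAT value for FAT32;
 * pass the 0x16 sectors-per-FAT value for FAT12/FAT16. */
uint32_t fat_first_sector_of_cluster(uint32_t cluster,
                                     uint32_t reserved_sectors,
                                     uint32_t num_fats,
                                     uint32_t sectors_per_fat,
                                     uint32_t root_entries,
                                     uint32_t bytes_per_sector,
                                     uint32_t sectors_per_cluster)
{
    /* Each root directory entry is 32 bytes long. */
    uint32_t root_dir_sectors = root_entries * 32 / bytes_per_sector;
    uint32_t data_start = reserved_sectors
                        + num_fats * sectors_per_fat
                        + root_dir_sectors;

    return data_start + (cluster - 2) * sectors_per_cluster;
}
```

On a 1.44 MB floppy (1 reserved sector, 2 FATs of 9 sectors, 224 root entries, 512-byte sectors, 1 sector per cluster), cluster 2 starts at sector 33.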
Long file names
Long File Names (LFN) are stored on a FAT file system using a trick—adding (possibly multiple) additional entries into the directory before the normal file entry. The additional entries are marked with the Volume Label, System, Hidden, and Read Only attributes (yielding 0x0F), which is a combination that is not expected in the MS-DOS environment, and therefore ignored by MS-DOS programs and third-party utilities. Notably, a directory containing only volume labels is considered as empty and is allowed to be deleted; such a situation appears if files created with long names are deleted from plain DOS.
Older versions of PC-DOS mistake LFN names in the root directory for the volume label, and are likely to display an incorrect label.
Each phony entry can contain up to 13 UTF-16 characters (26 bytes) by using fields in the record which contain file size or time stamps. The starting cluster field is not used for name characters; for compatibility with disk utilities it is set to 0 (see 8.3 filename for additional explanations). Up to 20 of these 13-character entries may be chained, supporting a maximum length of 255 UTF-16 characters.[32]
After the last UTF-16 character, a 0x00 0x00 terminator is added. Any remaining unused character positions are filled with 0xFF 0xFF.
LFN entries use the following format:
Byte Offset Length Description
0x00 1 Sequence Number
0x01 10 Name characters (five UTF-16 characters)
0x0b 1 Attributes (always 0x0F)
0x0c 1 Reserved (always 0x00)
0x0d 1 Checksum of DOS file name
0x0e 12 Name characters (six UTF-16 characters)
0x1a 2 First cluster (always 0x0000)
0x1c 4 Name characters (two UTF-16 characters)
If multiple LFN entries are required to represent a file name, the entry holding the last part of the name comes first in the directory. Its sequence number also has bit 6 (0x40) set to mark it as the last LFN entry, even though it is the first entry encountered when reading the directory. The last LFN entry has the highest sequence number, which decreases in the following entries; the first LFN entry has sequence number 1. Bit 7 (0x80) of the sequence number is used to indicate that the entry is deleted.
For example if we have filename "File with very long filename.ext" it would be formatted like this:
Sequence number Entry data
0x43 "me.ext"
0x02 "y long filena"
0x01 "File with ver"
??? Normal 8.3 entry
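The sequence numbers in this example follow from the name length. A minimal sketch, with hypothetical function names, assuming 13 characters per entry and the 0x40 flag on the entry that holds the end of the name:

```c
#include <stddef.h>
#include <stdint.h>

/* Number of LFN entries needed for a name of name_len UTF-16 characters. */
size_t lfn_entry_count(size_t name_len)
{
    return (name_len + 12) / 13;        /* round up to whole entries */
}

/* Sequence byte for the entry holding part number index (1-based) of a
 * name that needs count entries; the final part carries the 0x40 flag. */
uint8_t lfn_sequence_byte(size_t index, size_t count)
{
    uint8_t seq = (uint8_t)index;

    if (index == count)
        seq |= 0x40;
    return seq;
}
```

"File with very long filename.ext" has 32 characters, so it needs three entries, and the one stored first on disk carries sequence byte 0x43.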
A checksum also allows verification of whether a long file name matches the 8.3 name; such a mismatch could occur if a file was deleted and re-created using DOS in the same directory position. The checksum is calculated using the algorithm below. (Note that pFcbName is a pointer to the name as it appears in a regular directory entry, i.e. the first eight characters are the filename, and the last three are the extension. The dot is implicit. Any unused space in the filename is padded with space characters (ASCII 0x20). For example, "Readme.txt" would be "README  TXT", with two padding spaces after "README".)
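unsigned char lfn_checksum(const unsigned char *pFcbName)
{
    int i;
    unsigned char sum = 0;

    /* Process all 11 bytes of the space-padded 8.3 name. */
    for (i = 11; i; i--)
        /* Rotate sum right by one bit, then add the next name byte. */
        sum = ((sum & 1) << 7) + (sum >> 1) + *pFcbName++;
    return sum;
}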
If a filename contains only lowercase letters, or is a combination of a lowercase basename with an uppercase extension, or vice-versa; and has no special characters, and fits within the 8.3 limits, a VFAT entry is not created on Windows NT and later versions such as XP. Instead, two bits in byte 0x0c of the directory entry are used to indicate that the filename should be considered as entirely or partially lowercase. Specifically, bit 4 means lowercase extension and bit 3 lowercase basename, which allows for combinations such as "example.TXT" or "HELLO.txt" but not "Mixed.txt". Few other operating systems support this. This creates a backwards-compatibility problem with older Windows versions (95, 98, ME) that see all-uppercase filenames if this extension has been used, and therefore can change the name of a file when it is transported, such as on a USB flash drive. Current 2.6.x versions of Linux will recognize this extension when reading (source: kernel 2.6.18 /fs/fat/dir.c and fs/vfat/namei.c); the mount option shortname determines whether this feature is used when writing.
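The two case bits can be applied when turning the stored 11-byte name into a display name. The sketch below uses a hypothetical function name and makes simplifying assumptions: the name contains no embedded spaces, and tolower in the default C locale is an acceptable stand-in for OEM character set case mapping.

```c
#include <ctype.h>
#include <stdint.h>

/* Build "base.ext" from an 11-byte space-padded 8.3 name, honouring the
 * case bits in byte 0x0c: 0x08 = lowercase basename, 0x10 = lowercase
 * extension. out must have room for 13 bytes. */
void fat_display_name(const unsigned char name83[11], uint8_t case_bits,
                      char out[13])
{
    int i, n = 0;

    for (i = 0; i < 8 && name83[i] != ' '; i++)
        out[n++] = (char)((case_bits & 0x08) ? tolower(name83[i]) : name83[i]);
    if (name83[8] != ' ') {
        out[n++] = '.';
        for (i = 8; i < 11 && name83[i] != ' '; i++)
            out[n++] = (char)((case_bits & 0x10) ? tolower(name83[i]) : name83[i]);
    }
    out[n] = '\0';
}
```

With case byte 0x08, the stored name "EXAMPLE TXT" is shown as "example.TXT"; with 0x10, "HELLO   TXT" is shown as "HELLO.txt".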
Third-party extensions
Before Microsoft added support for long filenames and creation/access time stamps, bytes 0x0C–0x15 of the directory entry were used by alternative operating systems to store additional metadata. These included:
Byte Offset Length System Description
0x0C 2 RISC OS
File type, 0x000 - 0xFFF
0x0C 1 DOS Plus
User-defined file attributes F1-F4
Bit Mask Description
7 0x80 F1
6 0x40 F2
5 0x20 F3
4 0x10 F4
0x0C 1 MSX-DOS 2
For a deleted file, the original first character of the filename.
0x0D 1 DR-DOS
For a deleted file, the original first character of the filename.
0x0E 2 DR-DOS and FlexOS
Encrypted file password
0x0E 2 ANDOS
File address in the memory
0x10 4 DR-DOS 7 For a deleted file, its original file time and date; deleted files have their normal time and date fields set to the time of deletion
0x12 2 DR-DOS 6 and FlexOS File owner ID
0x14 2 DR-DOS and FlexOS File permissions bitmap (execute permissions are only used by FlexOS):
Bit Mask Description
0 0x0001 Owner delete requires password
1 0x0002 Owner execute requires password
2 0x0004 Owner write requires password
3 0x0008 Owner read requires password
4 0x0010 Group delete requires password
5 0x0020 Group execute requires password
6 0x0040 Group write requires password
7 0x0080 Group read requires password
8 0x0100 World delete requires password
9 0x0200 World execute requires password
10 0x0400 World write requires password
11 0x0800 World read requires password
FAT licensing
Microsoft applied for, and was granted, a series of patents for key parts of the FAT file system in the mid-1990s. Being almost universally compatible and well-understood, FAT is frequently chosen as an interchange format for flash media used in digital cameras and PDAs.
On December 3, 2003 Microsoft announced[34] it would be offering licenses for use of its FAT specification and "associated intellectual property", at the cost of a US$0.25 royalty per unit sold, with a $250,000 maximum royalty per license agreement.[35]
To this end, Microsoft cited four patents on the FAT file system as the basis of its intellectual property claims. All four pertain to long-filename extensions to FAT first seen in Windows 95:
• U.S. Patent 5,745,902 - Method and system for accessing a file using file names having different file name formats. Filed July 6, 1992. This covered a means of generating and associating a short, 8.3 filename with long one (for example, "Microsoft.txt" with "MICROS~1.TXT") and a means of enumerating conflicting short filenames (for example, "MICROS~2.TXT" and "MICROS~3.TXT"). It is unclear whether this patent would cover an implementation of FAT without explicit long filename capabilities. Hard links in Unix file systems do not appear to be prior art: deleting a FAT file via its long name will also remove its short name. Renaming a file to a "short" name also updates the long file name for coherency; similarly, renaming a file to a "long" name will allocate a new "short" name. In NTFS, hard links and dual names are separate concepts and each hard link has two names. Finally, at the API level, both names are always provided together when a directory lookup is requested from the system; they do not appear as two separate files and do not have to be "matched" to determine unique files.
• U.S. Patent 5,579,517 - Common name space for long and short filenames. Filed on 1995-04-24. This covers the method of chaining together multiple consecutive 8.3 named directory entries to hold long filenames, with some of the entries specially marked to prevent their confusing older, long filename-unaware FAT implementations.
o The Public Patent Foundation successfully challenged this patent; the claims were rejected[36] on 2004-09-14, due to prior disclosure[37] of the claimed techniques in patents U.S. Patent 5,307,494 and U.S. Patent 5,367,671. This decision was later overturned by the Patent Office on 2006-01-10.
• U.S. Patent 5,758,352 - Common name space for long and short filenames. Filed on 1996-09-05. This is very similar to 5,579,517.
o The Public Patent Foundation successfully challenged this patent; the USPTO rejected it on 2005-10-05, on the grounds that "the six assignees names were incorrect".[38][39] This decision was also later overturned by the Patent Office on 2006-01-10.
• U.S. Patent 6,286,013 - Method and system for providing a common name space for long and short file names in an operating system. Filed on 1997-01-28. This makes claims on the methods used when Windows 95, Windows 98 and Windows Me expose long filenames to their MS-DOS compatibility layer. It does not appear to affect any non-Microsoft FAT implementations.
Many technical commentators[who?] have concluded that these patents only cover FAT implementations that include support for long filenames, and that removable solid state media and consumer devices only using short names would be unaffected.
Additionally, in the document "Microsoft Extensible Firmware Initiative FAT 32 File System Specification, FAT: General Overview of On-Disk Format" published by Microsoft (version 1.03, 2000-12-06), Microsoft specifically grants a number of rights, which many readers have interpreted as permitting operating system vendors to implement FAT.
Microsoft is not the only company to have applied for patents for parts of the FAT file system. Other patents affecting FAT include:
• U.S. Patent 5,367,671 - System for accessing extended object attribute (EA) data through file name or EA handle linkages in path tables. Filed on 1990-09-25 by Barry A. Feigenbaum and Felix Miro of IBM, this makes claims on the methods used by OS/2, Windows NT, and Linux for storing extended attribute data in the "EA DATA. SF" file.
Appeal
As there was widespread call for these patents to be re-examined, the Public Patent Foundation (PUBPAT) submitted evidence to the US Patent and Trademark Office (USPTO) disputing the validity of these patents, including prior art references from Xerox and IBM. The USPTO acknowledged that the evidence raised "substantial new question[s] of patentability," and opened an investigation into the validity of Microsoft's FAT patents.[40]
On 2004-09-30 the USPTO rejected all claims of U.S. Patent 5,579,517, based primarily on evidence provided by PUBPAT. Dan Ravicher, the foundation's executive director, said, "The Patent Office has simply confirmed what we already knew for some time now, Microsoft's FAT patent is bogus."
According to the PUBPAT press release, "Microsoft still has the opportunity to respond to the Patent Office's rejection. Typically, third party requests for re-examination, like the one filed by PUBPAT, are successful in having the subject patent either narrowed or completely revoked roughly 70% of the time."
On 2005-10-05 the Patent Office announced that, following the re-examination process, it had again rejected all claims of patent 5,579,517, and it additionally found U.S. Patent 5,758,352 invalid on the grounds that the patent had incorrect assignees.
Finally, on 2006-01-10 the Patent Office ruled that features of Microsoft's implementation of the FAT system were "novel and non-obvious", reversing both earlier non-final decisions.
Patent infringement lawsuit
In February 2009, Microsoft filed a patent infringement lawsuit against TomTom alleging that the device maker's products infringe on patents related to the FAT32 filesystem. As some TomTom products are based on Linux, this marked the first time that Microsoft tried to enforce its patents against the Linux platform. The lawsuit was settled out of court the following month with an agreement that Microsoft be given access to four of TomTom's patents, that TomTom will drop support for the FAT32 filesystem from its products, and that in return Microsoft not seek legal action against TomTom for the five year duration of the settlement agreement.
Posted by kittu at 12:11 AM
FAT and NTFS file systems ...
NTFS (New Technology File System) is the standard file system of Windows NT, including its later versions Windows 2000, Windows XP, Windows Server 2003, Windows Server 2008, Windows Vista, and Windows 7.
NTFS supersedes the FAT file system as the preferred file system for Microsoft’s Windows operating systems. NTFS has several improvements over FAT and HPFS (High Performance File System) such as improved support for metadata and the use of advanced data structures to improve performance, reliability, and disk space utilization, plus additional extensions such as security access control lists (ACL) and file system journaling.
History
In the mid 1980s, Microsoft and IBM formed a joint project to create the next generation graphical operating system. The result of the project was OS/2, but eventually Microsoft and IBM disagreed on many important issues and separated. OS/2 remained an IBM project. Microsoft started to work on Windows NT. The OS/2 filesystem HPFS contained several important new features. When Microsoft created their new operating system, they borrowed many of these concepts for NTFS. Probably as a result of this common ancestry, HPFS and NTFS share the same disk partition identification type code (07). Sharing an ID is unusual since there were dozens of available codes, and other major filesystems have their own code. FAT has more than nine (one each for FAT12, FAT16, FAT32, etc.). Algorithms which identify the filesystem in a partition type 07 must perform additional checks. It is also clear that NTFS owes some of its architectural design to Files-11 used by VMS. This is hardly surprising since Dave Cutler was the main lead for both VMS and Windows NT.
Versions
NTFS has five released versions:
• v1.0 with NT 3.1 released mid-1993
• v1.1 with NT 3.5 released fall 1994
• v1.2 with NT 3.51 (mid-1995) and NT 4 (mid-1996) (occasionally referred to as "NTFS 4.0", because OS version is 4.0)
• v3.0 from Windows 2000 ("NTFS V5.0")
• v3.1 from Windows XP (autumn 2001; "NTFS V5.1"), Windows Server 2003 (spring 2003; occasionally "NTFS V5.2"), Windows Vista (mid-2005) (occasionally "NTFS V6.0"), Windows Server 2008, Windows 7.
V1.0 and V1.1 (and newer) are incompatible: that is, volumes written by NT 3.5x cannot be read by NT 3.1 until an update on the NT 3.5x CD is applied to NT 3.1, which also adds FAT long file name support. V1.2 supports compressed files, named streams, ACL-based security, etc. V3.0 added disk quotas, encryption, sparse files, reparse points, update sequence number (USN) journaling, the $Extend folder and its files, and reorganized security descriptors so that multiple files which use the same security setting can share the same descriptor. V3.1 expanded the Master File Table (MFT) entries with redundant MFT record number (useful for recovering damaged MFT files).
Windows Vista introduced Transactional NTFS, NTFS symbolic links, partition shrinking and self-healing functionality though these features owe more to additional functionality of the operating system than the file system itself.
Features
NTFS v3.0 includes several new features over its predecessors: sparse file support, disk usage quotas, reparse points, distributed link tracking, and file-level encryption, also known as the Encrypting File System (EFS).
USN Journal
The USN Journal (Update Sequence Number Journal) is a system management feature that records changes to all files, streams and directories on the volume, as well as their various attributes and security settings.
It is a critical feature of NTFS (one that FAT/FAT32 does not provide) for ensuring that its complex internal data structures and its indices (for directories and security descriptors) remain consistent after a system crash, and for allowing easy rollback of uncommitted changes to these structures when the volume is remounted. The structures concerned include the volume allocation bitmap, data moves performed by the defragmentation API, modifications to MFT records (such as moves of some variable-length attributes stored in MFT records and attribute lists), updates to the shared security descriptors, and the boot sector and its local mirrors, where the last USN transaction committed on the volume is stored.
In later versions of Windows, the USN journal has been extended to trace the state of other transactional operations on other parts of the NTFS filesystem, such as the VSS shadow copies of system files with copy-on-write semantics, or the implementation of Transactional NTFS and of distributed filesystems (see below).
Hard links and short filenames
Originally included to support the POSIX subsystem in Windows NT, hard links are similar to directory junctions, but are used for files instead of directories. Hard links can only be applied to files on the same volume, since an additional filename record is added to the file's MFT record. Short (8.3) filenames are also implemented as additional filename records that don't have separate directory entries. Hard links also have the behavior that changing the size or attributes of a file may not update the directory entries of other links until they are opened.
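The way two names share one file can be observed from a script. The following is a minimal sketch, assuming Python's os.link on a file system that supports hard links; the file names are hypothetical, and both are created in one temporary directory so that they sit on the same volume.

```python
import os
import tempfile

def hard_link_demo():
    """Create a file and a hard link to it, then report what the two names share."""
    with tempfile.TemporaryDirectory() as folder:
        original = os.path.join(folder, "report.txt")      # hypothetical names
        alias = os.path.join(folder, "report-link.txt")
        with open(original, "w") as handle:
            handle.write("first version")
        os.link(original, alias)             # add a second name for the same file
        with open(alias, "a") as handle:     # write through the second name
            handle.write(" + update")
        with open(original) as handle:       # read through the first name
            content = handle.read()
        same_file = os.path.samefile(original, alias)
        link_count = os.stat(original).st_nlink
    return content, same_file, link_count
```

Data written through one name is read back through the other, and the link count reported for the file is the number of names that refer to it.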
Alternate data streams (ADS)
Alternate data streams allow more than one data stream to be associated with a filename, using the filename format "filename:streamname" (e.g., "text.txt:extrastream"). Alternate streams are not listed in Windows Explorer, and their size is not included in the file's size. Only the main stream of a file is preserved when it is copied to a FAT-formatted USB drive, attached to an e-mail, or uploaded to a website. As a result, using alternate streams for critical data may cause problems. NTFS Streams were introduced in Windows NT 3.1, to enable Services for Macintosh (SFM) to store Macintosh resource forks. Although current versions of Windows Server no longer include SFM, third-party Apple Filing Protocol (AFP) products (such as Group Logic's ExtremeZ-IP) still use this feature of the file system.
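The "filename:streamname" convention can be illustrated with a small helper. This is only a sketch: split_stream_path is a hypothetical function, not a Windows API, and it deliberately ignores drive-relative paths such as "C:file.txt", which it would misread.

```python
def split_stream_path(path):
    """Split 'name:stream' into (file path, stream name or None).

    Only the part after the last backslash is examined, so the colon of a
    drive letter such as 'C:\\' is not mistaken for a stream separator.
    """
    folder, separator, leaf = path.rpartition("\\")
    name, colon, stream = leaf.partition(":")
    return folder + separator + name, (stream if colon else None)
```

For example, split_stream_path("text.txt:extrastream") yields the file name "text.txt" and the stream name "extrastream", while a path without a colon in its last component yields None for the stream.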
Malware has used alternate data streams to hide its code; some malware scanners and other special tools now check for data in alternate streams. Microsoft provides a tool called Streams to allow users to view streams on a selected volume.
Very small ADS are also added within Internet Explorer (and now also other browsers) to mark files that have been downloaded from external sites: they may be unsafe to run locally and the local shell will require confirmation from the user before opening them. When the user indicates that he no longer wants this confirmation dialog, this ADS is simply dropped from the MFT entry for downloaded files.
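On current Windows versions this marker is a stream named Zone.Identifier that holds a few lines of INI-style text, where a ZoneId of 3 denotes the Internet zone. The sketch below parses such text with Python's configparser. The sample string is illustrative, and reading the stream from disk (by opening "file:Zone.Identifier" on an NTFS volume) is left out so that the example runs on any system.

```python
import configparser

def parse_zone_id(stream_text):
    """Return the numeric ZoneId found in the text of a Zone.Identifier stream."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(stream_text)
    return parser.getint("ZoneTransfer", "ZoneId")

# Illustrative content, as a browser might write it for a downloaded file.
sample = "[ZoneTransfer]\nZoneId=3\n"
```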
Some media players have also tried to use ADS to store custom metadata for media files, in order to organize collections without modifying the effective data content of the media files themselves (as happens with embedded tags, when they are supported by media file formats such as MPEG and OGG containers). This metadata may be displayed in Windows Explorer as extra information columns, with the help of a registered Windows Shell extension that can parse it. Most media players, however, prefer to store this information in their own separate database instead of in ADS, notably because ADS are visible to all users of these files, instead of being managed with distinct per-user security settings and having their values defined according to user preferences.
Sparse files
Sparse files are files which contain sparse data sets, data mostly filled with zeros. Database applications, for instance, sometimes use sparse files. Because of this, Microsoft has implemented support for efficient storage of sparse files by allowing an application to specify regions of empty (zero) data. An application that reads a sparse file reads it in the normal manner with the file system calculating what data should be returned based upon the file offset. As with compressed files, the actual sizes of sparse files are not taken into account when determining quota limits.
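The idea of storing only the regions that were written, and computing zeros for everything else on read, can be sketched with a toy in-memory model. SparseFile is a hypothetical class for illustration and says nothing about how NTFS lays out sparse runs on disk.

```python
class SparseFile:
    """Toy model of a sparse file: only blocks that were written are stored."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}   # block index -> bytearray of block_size bytes
        self.size = 0      # logical file size in bytes

    def write(self, offset, data):
        for i, byte in enumerate(data):
            index, within = divmod(offset + i, self.block_size)
            block = self.blocks.setdefault(index, bytearray(self.block_size))
            block[within] = byte
        self.size = max(self.size, offset + len(data))

    def read(self, offset, length):
        """Return stored bytes where they exist and zeros for the holes."""
        end = min(offset + length, self.size)
        out = bytearray()
        for position in range(offset, end):
            index, within = divmod(position, self.block_size)
            block = self.blocks.get(index)
            out.append(block[within] if block is not None else 0)
        return bytes(out)

    def allocated_bytes(self):
        return len(self.blocks) * self.block_size
```

Writing three bytes at offset 1,000,000 produces a file whose logical size is just over a million bytes while only a single 4 KB block is actually stored.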
File compression
NTFS compresses files using a variant of the LZ77 algorithm. Although read–write access to compressed files is transparent, Microsoft recommends avoiding compression on server systems and/or network shares holding roaming profiles because it puts a considerable load on the processor. Single-user systems with limited hard disk space can benefit from NTFS compression. The slowest link in a computer is usually not the CPU but the hard drive, so NTFS compression allows the limited, slow storage space to be better used, in terms of both space and (often) speed. NTFS compression can also serve as a replacement for sparse files when a program (e.g. a download manager) is not able to create files without content as sparse files.
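To give a feel for the LZ77 family, here is a textbook-style sketch that replaces repeated byte sequences with (distance, length) back-references. It is not the variant or on-disk format that NTFS uses; the token layout, window size and minimum match length are assumptions chosen for readability.

```python
def lz77_compress(data, window=255, max_len=255):
    """Return a list of tokens: ('lit', byte) or ('ref', distance, length)."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        for j in range(max(0, i - window), i):   # candidate match starts
            length = 0
            while (length < max_len and i + length < len(data)
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= 3:                        # short matches are not worth a reference
            tokens.append(("ref", best_dist, best_len))
            i += best_len
        else:
            tokens.append(("lit", data[i]))
            i += 1
    return tokens

def lz77_decompress(tokens):
    out = bytearray()
    for token in tokens:
        if token[0] == "lit":
            out.append(token[1])
        else:
            _, distance, length = token
            for _ in range(length):              # byte by byte, so overlapping copies work
                out.append(out[-distance])
    return bytes(out)
```

Highly repetitive input collapses into a handful of tokens, which is why data with long runs of identical bytes compresses so well.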
Volume Shadow Copy
The Volume Shadow Copy Service (VSS) keeps historical versions of files and folders on NTFS volumes by copying old data that is about to be overwritten to a shadow copy (copy-on-write). The old file data is overlaid on the new when the user requests a revert to an earlier version. This also allows data backup programs to archive files currently in use by the file system. On heavily loaded systems, Microsoft recommends setting up a shadow copy volume on a separate disk. To ensure consistent recovery in case of system crashes, VSS also uses the USN journal to mark local transactions, so that after a system restart, when the NTFS volume is remounted, committed changes to system files are effectively recovered, and a file is safely rolled back to an older version if its new version was not fully recorded before the commit. However, these VSS shadows are not coordinated globally across multiple files or volumes, except when using a transaction coordinator (see below). They can only be used to ensure that older versions remain accessible during backup operations, so that those backups contain consistent system images.
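The copy-on-write idea can be sketched with a toy model in which a block's old content is saved the first time it is overwritten after a snapshot. The Volume class below is hypothetical, supports a single snapshot, and does not reflect how VSS is actually implemented.

```python
class Volume:
    """Toy block store with one copy-on-write snapshot."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.shadow = None          # block index -> content at snapshot time

    def take_snapshot(self):
        self.shadow = {}

    def write(self, index, data):
        # Preserve the old content only on the first overwrite after the snapshot.
        if self.shadow is not None and index not in self.shadow:
            self.shadow[index] = self.blocks[index]
        self.blocks[index] = data

    def read_snapshot(self, index):
        """Read a block as it was when the snapshot was taken."""
        if self.shadow is None:
            raise ValueError("no snapshot has been taken")
        return self.shadow.get(index, self.blocks[index])

    def revert(self):
        """Overlay the saved old data on the current data."""
        for index, old in self.shadow.items():
            self.blocks[index] = old
        self.shadow = {}
```

Blocks that were never modified cost nothing extra: the snapshot view simply reads them from the live volume.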
Transactional NTFS
As of Windows Vista, applications can use Transactional NTFS to group changes to files together into a transaction. The transaction guarantees that either all changes happen or none of them do, and that applications outside the transaction will not see the changes until the precise instant they are committed. It uses techniques similar to those used for Volume Shadow Copies (i.e. copy-on-write) to ensure that overwritten data can be safely rolled back, and the USN journaling log to mark the transactions that have not yet been committed, or those that have been committed but not yet fully applied (in case of a system crash during a commit by one of the participants).
However, in a transaction-enabled file system, copy-on-write can be used temporarily for any other file on any kind of partition, as long as the transaction is not committed, rather than just for system files that are permanently marked with copy-on-write semantics and that are implicitly modified within their own local transactions.
The copy-on-write technique is however modified in order to allow efficient rollbacks and avoid fragmenting a file system used by possibly many participants: the old data may not be overwritten immediately but kept where it is (notably when it is currently locked by someone else for consistent reads in its own transactions); in that case, only the new uncommitted data is kept in a temporary shadow (rather than the copy-on-write old data), which is finally applied using normal VSS copy-on-write when the transaction is committed by the writer. In addition, these temporary shadows for new data, seen only by the participating processes that have their own uncommitted data, are not necessarily written to disk immediately, but may just be maintained in memory or swapped out for later commits. Transactional NTFS does not restrict transactions to just the local NTFS volume, but also includes other transactional data or operations in other locations, such as data stored in separate volumes, the local registry, or SQL databases, or the current states of system services or remote services.
These transactions are coordinated network-wide with all participants using a specific service, the Distributed Transaction Coordinator (DTC), to ensure that all participants receive the same commit state, and to transport the changes that have been validated by any participant (so that the others can invalidate their local caches for old data or roll back their ongoing uncommitted changes). Transactional NTFS allows, for example, the creation of network-wide consistent distributed file systems, including their local live or offline caches.
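The all-or-nothing behaviour, with uncommitted data held in a temporary shadow that only the writer can see, can be sketched as follows. Transaction is a hypothetical single-process model; it does not use the Windows transaction APIs and ignores locking, crashes and distributed coordination.

```python
class Transaction:
    """Toy transaction over a shared dict mapping file names to contents."""

    def __init__(self, files):
        self.files = files     # committed state, visible to everyone
        self.pending = {}      # uncommitted changes, visible only to this transaction

    def write(self, name, content):
        self.pending[name] = content

    def read(self, name):
        # The writer sees its own uncommitted data first.
        return self.pending.get(name, self.files.get(name))

    def commit(self):
        # Apply every pending change in one step.
        self.files.update(self.pending)
        self.pending = {}

    def rollback(self):
        # Nothing was written to the shared state, so discarding is enough.
        self.pending = {}
```

Because the old data is never touched before the commit, a rollback has nothing to undo in the committed state.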
Encrypting File System (EFS)
EFS provides strong and user-transparent encryption of any file or folder on an NTFS volume. EFS works in conjunction with the EFS service, Microsoft's CryptoAPI and the EFS File System Run-Time Library (FSRTL). EFS works by encrypting a file with a bulk symmetric key (also known as the File Encryption Key, or FEK), which is used because it takes less time to encrypt and decrypt large amounts of data than an asymmetric key cipher would. The symmetric key that is used to encrypt the file is then encrypted with a public key that is associated with the user who encrypted the file, and this encrypted data is stored in an alternate data stream of the encrypted file. To decrypt the file, the file system uses the private key of the user to decrypt the symmetric key that is stored in the file header. It then uses the symmetric key to decrypt the file. Because this is done at the file system level, it is transparent to the user. Also, in case a user loses access to their key, support for additional decryption keys has been built into the EFS system, so that a recovery agent can still access the files if needed. NTFS-provided encryption and compression are mutually exclusive: NTFS can be used for one and a third-party tool for the other.
The support of EFS is not available in Basic, Home and MediaCenter versions of Windows, and must be activated after installation of Professional, Ultimate and Server versions of Windows or by using enterprise deployment tools within Windows domains.
Quotas
Disk quotas were introduced in NTFS v3. They allow the administrator of a computer that runs a version of Windows that supports NTFS to set a threshold of disk space that users may use. It also allows administrators to keep track of how much disk space each user is using. An administrator may specify a certain level of disk space that a user may use before they receive a warning, and then deny access to the user once they hit their upper limit of space. Disk quotas do not take into account NTFS's transparent file-compression, should this be enabled. Applications that query the amount of free space will also see the amount of free space left to the user who has a quota applied to them.
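The two thresholds described above, a warning level and a hard limit, can be sketched as a simple check. check_quota is a hypothetical helper, and all sizes are in bytes.

```python
def check_quota(used, requested, warning_level, limit):
    """Return 'ok', 'warn' or 'deny' for a request to allocate more bytes."""
    total = used + requested
    if total > limit:
        return "deny"      # the request would exceed the hard limit
    if total > warning_level:
        return "warn"      # allowed, but the user passes the warning level
    return "ok"
```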
The support of disk quotas is not available in Basic, Home and MediaCenter versions of Windows, and must be activated after installation of Professional, Ultimate and Server versions of Windows or by using enterprise deployment tools within Windows domains.
Reparse points
This feature was introduced in NTFS v3. Reparse points are used by associating a reparse tag in the user space attribute of a file or directory. When the object manager (see Windows NT line executive) parses a file system name lookup and encounters a reparse attribute, it knows to reparse the name lookup, passing the user-controlled reparse data to every file system filter driver that is loaded into Windows 2000. Each filter driver examines the reparse data to see whether it is associated with that reparse point, and if that filter driver determines a match, then it intercepts the file system call and executes its special functionality. Reparse points are used to implement Volume Mount Points, Directory Junctions, Hierarchical Storage Management, Native Structured Storage, Single Instance Storage, and Symbolic Links.
Volume mount points
Volume mount points are similar to Unix mount points, where the root of another file system is attached to a directory. In NTFS, this allows additional file systems to be mounted without requiring a separate drive letter (such as C: or D:) for each.
Once a volume has been mounted on top of an existing directory of another volume, the contents previously listed in that directory become invisible and are replaced by the content of the root directory of the mounted volume. The mounted volume could still have its own drive letter assigned separately. The file system does not allow volumes to be mutually mounted on each other. Volume mount points can be made to be either persistent (remounted automatically after system reboot) or not persistent (must be manually remounted after reboot).
Mounted volumes may use other file systems than just NTFS; notably they may be remote shared directories, possibly with their own security settings and remapping of access rights according to the remote file system policy.
Directory junctions
Directory junctions are similar to volume mount points, but they reference other directories in the file system instead of other volumes. For instance, the directory C:\exampledir with a directory junction attribute that contains a link to D:\linkeddir will automatically refer to the directory D:\linkeddir when it is accessed by a user-mode application. This function is conceptually similar to symbolic links to directories in Unix, except that the target in NTFS must always be another directory (typical Unix file systems allow the target of a symbolic link to be any type of file) and that junctions have the semantics of a hard link (i.e., they must be immediately resolvable when they are created).
Directory junctions (which can be created with the command MKLINK /J junctionName targetDirectory and removed with RMDIR junctionName from a console prompt) are persistent, and are resolved on the server side, as they share the same security realm of the local system or domain on which the parent volume is mounted and the same security settings for their contents as the content of the target directory; however, the junction itself may have distinct security settings. Unlinking a directory junction does not delete files in the target directory.
Note that some directory junctions are installed by default on Windows Vista, for compatibility with previous versions of Windows, such as Documents and Settings in the root directory of the system drive, which links to the Users physical directory in the root directory of the same volume. However, they are hidden by default, and their security settings are set up so that Windows Explorer will refuse to open them from within the Shell or in most applications, except for the local built-in SYSTEM user or the local Administrators group (both user accounts are used by system software installers).
This additional security restriction has probably been made to prevent users from finding apparent duplicate files in the joined directories and deleting them by mistake, because the semantics of directory junctions are not the same as those of hard links: reference counting is not used on the target contents, nor even on the referenced container itself.
Directory junctions are soft links (they will persist even if the target directory is removed), working as a limited form of symbolic links (with an additional restriction on the location of the target), but they are an optimized version which allows faster processing of the reparse point with which they are implemented, with less overhead than the newer NTFS symbolic links, and can be resolved on the server side (when they are found in remote shared directories).
Symbolic links
Symbolic links (or soft links) were introduced in Windows Vista. Symbolic links are resolved on the client side, so when a symbolic link is shared, the target is subject to the access restrictions on the client, and not the server.
Symbolic links can be created either to files (created with MKLINK symLink targetFilename) or to directories (created with MKLINK /D symLinkD targetDirectory), but the type of the link (file or directory) must be specified when the link is created. The target, however, need not exist or be available when the symbolic link is created: when the symbolic link is accessed and the target is checked for availability, NTFS also checks whether it has the correct type (file or directory), and returns a not-found error if the existing target has the wrong type.
They can also reference shared directories on remote hosts, or files and subdirectories within shared directories: their target is not mounted immediately at boot, but only temporarily on demand while opening them with the OpenFile() or CreateFile() API. Their definition is persistent on the NTFS volume where they are created (all types of symbolic links can be removed as if they were files, using DEL symLink from a command line prompt or batch file).
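The fact that a symbolic link may be created before its target exists can be tried out with a short script. This is a minimal sketch using Python's os.symlink; the file names are hypothetical. On Windows, creating symbolic links usually requires an elevated prompt or Developer Mode, and links to directories need target_is_directory=True.

```python
import os
import tempfile

def dangling_link_demo():
    """Create a link to a missing target, then create the target and read through the link."""
    with tempfile.TemporaryDirectory() as folder:
        target = os.path.join(folder, "not-created-yet.txt")   # hypothetical names
        link = os.path.join(folder, "symLink")
        os.symlink(target, link)                   # the target does not exist yet
        before = (os.path.islink(link), os.path.exists(link))
        with open(target, "w") as handle:
            handle.write("now it exists")
        with open(link) as handle:                 # the link is resolved when opened
            after = handle.read()
        os.remove(link)                            # removes the link, not the target
        target_survives = os.path.exists(target)
    return before, after, target_survives
```

Before the target is created, the link itself is present but does not resolve; once the target appears, opening the link reaches it without the link having to be recreated.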
Single Instance Storage (SIS)
When there are several directories that have different, but similar, files, some of these files may have identical content. Single instance storage allows identical files to be merged to one file and create references to that merged file. SIS consists of a file system filter that manages copies, modification and merges to files; and a user space service (or groveler) that searches for files that are identical and need merging. SIS was mainly designed for remote installation servers as these may have multiple installation images that contain many identical files; SIS allows these to be consolidated but, unlike for example hard links, each file remains distinct; changes to one copy of a file will leave others unaltered. This is similar to copy-on-write, which is a technique by which memory copying is not really done until one copy is modified.
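The searching half of this scheme, finding files with identical content, can be sketched by grouping files by a content hash. find_identical is a hypothetical helper that works on an in-memory mapping of names to bytes; the real groveler and its matching strategy are not described here, and SHA-256 is an assumption made for the example.

```python
import hashlib
from collections import defaultdict

def find_identical(files):
    """Group file names whose contents are identical.

    files: dict mapping a file name to its content as bytes.
    Returns a sorted list of groups, each a sorted list of names.
    """
    groups = defaultdict(list)
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        groups[digest].append(name)
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)
```

Each group returned is a candidate for merging into a single stored instance, with the individual names kept as references to it.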
Hierarchical Storage Management (HSM)
Hierarchical Storage Management is a means of transferring files that are not used for some period of time to less expensive storage media. When the file is next accessed, the reparse point on that file determines that it is needed and retrieves it from storage.
Native Structured Storage (NSS)
NSS was an ActiveX document storage technology that has since been discontinued by Microsoft. It allowed ActiveX Documents to be stored in the same multi-stream format that ActiveX uses internally. An NSS file system filter was loaded and used to process the multiple streams transparently to the application, and when the file was transferred to a non-NTFS formatted disk volume it would also transfer the multiple streams into a single stream.
Interoperability
Details on the implementation's internals are not released, which makes it difficult for third-party vendors to provide tools to handle NTFS.
Linux
The ability to read and write to NTFS is provided by the NTFS-3G driver. It is included in most Linux distributions. Other outdated and mostly read-only solutions exist as well:
• Linux kernel 2.2: Kernel versions 2.2.0 and later include the ability to read NTFS partitions.
• Linux kernel 2.6: Kernel versions 2.6.0 and later contain a driver written by Anton Altaparmakov (University of Cambridge) and Richard Russon. It supports file read, overwrite and resize.
• NTFSMount: A read/write userspace NTFS driver. It provides read-write access to NTFS, excluding writing compressed and encrypted files, changing file ownership, and access rights.
• Tuxera NTFS: High-performance read/write commercial kernel driver, mainly targeted at embedded devices, from Tuxera Ltd, which also develops the open source NTFS-3G driver.
• NTFS for Linux: A commercial driver with full read/write support available as free and non-free download(s) from Paragon Software Group.
• Captive NTFS: A 'wrapping' driver which uses Windows' own driver, ntfs.sys.
Note that all three userspace drivers, namely NTFSMount, NTFS-3G and Captive NTFS, are built on the Filesystem in Userspace (FUSE), a Linux kernel module tasked with bridging userspace and kernel code to save and retrieve data. All drivers listed above (except Tuxera NTFS and Paragon NTFS for Linux) are open source (GPL). Due to the complexity of internal NTFS structures, both the built-in 2.6.14 kernel driver and the FUSE drivers disallow changes to the volume that are considered unsafe, to avoid corruption.
Mac OS X
Mac OS X v10.3 and later include read-only support for NTFS-formatted partitions. The GPL-licensed NTFS-3G also works on Mac OS X through FUSE and allows reading and writing to NTFS partitions. A performance enhanced commercial version, called Tuxera NTFS for Mac, is also available from the NTFS-3G developers. NTFS write support has been discovered in Mac OS X 10.6, but has not been activated as of version 10.6.1, although hacks do exist to enable the functionality.
Microsoft Windows
While the different NTFS versions are for the most part fully forward- and backward-compatible, there are technical considerations for mounting newer NTFS volumes in older versions of Microsoft Windows. This affects dual-booting, and external portable hard drives.
For example, attempting to use an NTFS partition with "Previous Versions" (a.k.a. Volume Shadow Copy) on an operating system that doesn't support it will result in the contents of those previous versions being lost.
Others
eComStation and FreeBSD offer read-only NTFS support (there is a beta NTFS driver that allows write/delete for eComStation, but it is generally considered unsafe). A free third-party tool for BeOS, which was based on NTFS-3G, allows full NTFS read and write. In addition to Linux, NTFS-3G also works on Mac OS X, FreeBSD, NetBSD, Solaris, QNX and Haiku, through FUSE. A free-for-personal-use read/write driver for MS-DOS called "NTFS4DOS" also exists.
Compatibility with FAT
Microsoft currently provides a tool (convert.exe) to convert HPFS (only on Windows NT 3), FAT16 and, on Windows 2000 and higher, FAT32 to NTFS, but not the other way around.
Resizing
Various third-party tools are capable of safely resizing NTFS partitions. Microsoft added the ability to shrink or expand a partition with Windows Vista, but this capability is limited because it will not relocate page file fragments or files that have been marked as unmovable. Shrinking therefore requires relocating or disabling any page file, the index of Windows Search, and any Shadow Copy used by System Restore. Using a third-party tool is an easier option.
Universal time
For historical reasons, the versions of Windows that do not support NTFS all keep time internally as local zone time, and therefore so do all file systems other than NTFS that are supported by current versions of Windows. However, Windows NT and its descendants keep internal timestamps as UTC and make the appropriate conversions for display purposes. Therefore, NTFS timestamps are in UTC. This means that when files are copied or moved between NTFS and non-NTFS partitions, the OS needs to convert timestamps on the fly. But if some files are moved when daylight saving time (DST) is in effect, and other files are moved when standard time is in effect, there can be some ambiguities in the conversions. As a result, especially shortly after one of the days on which local zone time changes, users may observe that some files have timestamps that are incorrect by one hour. Due to the differences in implementation of DST between the northern and southern hemispheres, this can result in a potential timestamp error of up to 4 hours in any given 12 months.
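The one-hour error can be reproduced with a small round trip. The sketch below assumes a hypothetical zone that is UTC-5 in standard time and UTC-4 during DST, and assumes that each conversion uses whatever offset is in effect at the moment of the copy, which is the behaviour the paragraph above describes.

```python
from datetime import datetime, timedelta, timezone

STANDARD = timezone(timedelta(hours=-5))   # hypothetical zone, standard time
DAYLIGHT = timezone(timedelta(hours=-4))   # the same zone while DST is in effect

def to_local_stamp(utc_time, offset_in_effect):
    """Convert a UTC timestamp to a zone-less local timestamp, as a non-NTFS volume would store it."""
    return utc_time.astimezone(offset_in_effect).replace(tzinfo=None)

def from_local_stamp(local_time, offset_in_effect):
    """Convert a zone-less local timestamp back to UTC using the offset currently in effect."""
    return local_time.replace(tzinfo=offset_in_effect).astimezone(timezone.utc)

def round_trip_drift():
    original = datetime(2009, 1, 15, 12, 0, tzinfo=timezone.utc)
    stored = to_local_stamp(original, STANDARD)     # copied off NTFS in winter
    restored = from_local_stamp(stored, DAYLIGHT)   # copied back in summer
    return restored - original
```

The stored local time carries no record of which offset produced it, so converting it back under a different offset shifts the timestamp by exactly the DST difference.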
Internals
In NTFS, all file data—file name, creation date, access permissions, and contents—are stored as metadata in the Master File Table. This abstract approach allowed easy addition of file system features during Windows NT's development — an interesting example is the addition of fields for indexing used by the Active Directory software.
NTFS allows any sequence of 16-bit values for name encoding (file names, stream names, index names, etc.). This means UTF-16 codepoints are supported, but the file system does not check whether a sequence is valid UTF-16 (it allows any sequence of short values, not restricted to those in the Unicode standard).
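The difference between "any sequence of 16-bit values" and "valid UTF-16" can be checked directly. The helper below is a sketch that packs the values as little-endian 16-bit integers and asks Python's strict UTF-16 decoder whether the result is well formed; is_valid_utf16 is a hypothetical name.

```python
import struct

def is_valid_utf16(units):
    """Return True if the 16-bit values form well-formed UTF-16."""
    raw = struct.pack(f"<{len(units)}H", *units)
    try:
        raw.decode("utf-16-le")
        return True
    except UnicodeDecodeError:
        return False
```

A name such as [0x0041] (the letter A) or the surrogate pair [0xD83D, 0xDE00] passes the check, whereas a lone surrogate such as [0xD800] does not, even though the file system would accept it as a name.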
Internally, NTFS uses B+ trees to index file system data. Although complex to implement, this allows faster file lookup times in most cases. A file system journal is used to guarantee the integrity of the file system metadata but not individual files' content. Systems using NTFS are known to have improved reliability compared to FAT file systems.
The Master File Table (MFT) contains metadata about every file, directory, and metafile on an NTFS volume. It includes filenames, locations, size, and permissions. Its structure supports algorithms which minimize disk fragmentation. A directory entry consists of a filename and a "file ID" which is the record number representing the file in the Master File Table. The file ID also contains a reuse count to detect stale references. While this strongly resembles the W_FID of Files-11, other NTFS structures radically differ.
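How a file ID can carry both a record number and a reuse count is sketched below. The split used here, a 64-bit value with the record number in the low 48 bits and the reuse count in the high 16 bits, is the commonly documented layout but is treated as an assumption in this example; the function names are hypothetical.

```python
def split_file_reference(reference):
    """Split a 64-bit file ID into (MFT record number, reuse count)."""
    record_number = reference & 0xFFFFFFFFFFFF    # low 48 bits (assumed layout)
    reuse_count = reference >> 48                  # high 16 bits (assumed layout)
    return record_number, reuse_count

def is_stale(reference, current_reuse_count):
    """A reference is stale if the record has been reused since it was taken."""
    return split_file_reference(reference)[1] != current_reuse_count
```

When an MFT record is freed and later reused for a different file, its reuse count changes, so an old directory entry pointing at the same record number can be recognized as stale.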
Metafiles
NTFS contains several files which define and organize the file system. In all respects, most of these files are structured like any other user file ($Volume being the most peculiar), but are not of direct interest to file system clients. These metafiles define files, back up critical file system data, buffer file system changes, manage free space allocation, satisfy BIOS expectations, track bad allocation units, and store security and disk space usage information. All content is in an unnamed data stream, unless otherwise indicated.
Segment Number | File Name | Purpose
0 | $MFT | Describes all files on the volume, including file names, timestamps, stream names, and lists of cluster numbers where data streams reside, indexes, security identifiers, and file attributes like "read only", "compressed", "encrypted", etc.
1 | $MFTMirr | Duplicate of the first vital entries of $MFT, usually 4 entries (4 KB).
2 | $LogFile | Contains transaction log of file system metadata changes.
3 | $Volume | Contains information about the volume, namely the volume object identifier, volume label, file system version, and volume flags (mounted, chkdsk requested, requested $LogFile resize, mounted on NT 4, volume serial number updating, structure upgrade request). This data is not stored in a data stream, but in special MFT attributes: If present, a volume object ID is stored in an $OBJECT_ID record; the volume label is stored in a $VOLUME_NAME record, and the remaining volume data is in a $VOLUME_INFORMATION record. Note: volume serial number is stored in file $Boot (below).
4 | $AttrDef | A table of MFT attributes which associates numeric identifiers with names.
5 | . | Root directory. Directory data is stored in $INDEX_ROOT and $INDEX_ALLOCATION attributes both named $I30.
6 | $Bitmap | An array of bit entries: each bit indicates whether its corresponding cluster is used (allocated) or free (available for allocation).
7 | $Boot | Volume boot record. This file is always located at the first clusters on the volume. It contains bootstrap code (see NTLDR/BOOTMGR) and a BIOS parameter block including a volume serial number and cluster numbers of $MFT and $MFTMirr. $Boot is usually 8192 bytes long.
8 | $BadClus | A file which contains all the clusters marked as having bad sectors. This file simplifies cluster management by the chkdsk utility, both as a place to put newly discovered bad sectors, and for identifying unreferenced clusters. This file contains two data streams, even on volumes with no bad sectors: an unnamed stream contains bad sectors (it is zero length for perfect volumes); the second stream is named $Bad and contains all clusters on the volume not in the first stream.
9 | $Secure | Access control list database which reduces overhead having many identical ACLs stored with each file, by uniquely storing these ACLs in this database only (contains two indices $SII: perhaps Security ID Index and $SDH: Security Descriptor Hash which index the stream named $SDS containing actual ACL table).
10 | $UpCase | A table of Unicode uppercase characters for ensuring case insensitivity in Win32 and DOS namespaces.
11 | $Extend | A filesystem directory containing various optional extensions, such as $Quota, $ObjId, $Reparse or $UsnJrnl.
12 ... 23 | | Reserved for $MFT extension entries.
usually 24 | $Extend\$Quota | Holds disk quota information. Contains two index roots, named $O and $Q.
usually 25 | $Extend\$ObjId | Holds distributed link tracking information. Contains an index root and allocation named $O.
usually 26 | $Extend\$Reparse | Holds reparse point data (such as symbolic links). Contains an index root and allocation named $R.
27 ... | file.ext | Beginning of regular file entries.
These metafiles are treated specially by Windows and are difficult to directly view: special purpose-built tools are needed.
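The $Bitmap entry in the table above, one bit per cluster, lends itself to a short sketch. The bit order within each byte (least significant bit first) and the rounding of the bitmap size to a multiple of 8 bytes are assumptions of this example; the function names are hypothetical.

```python
def is_cluster_allocated(bitmap, cluster):
    """Return True if the bit for the given cluster number is set."""
    byte_index, bit_index = divmod(cluster, 8)
    return bool((bitmap[byte_index] >> bit_index) & 1)   # assumed: LSB is the lowest cluster

def count_free_clusters(bitmap, total_clusters):
    return sum(not is_cluster_allocated(bitmap, c) for c in range(total_clusters))

def bitmap_size_bytes(total_clusters):
    """Size of a one-bit-per-cluster bitmap, rounded up to a multiple of 8 bytes (assumed)."""
    return -(-total_clusters // 64) * 8
```

Under that rounding assumption, a volume of 19,543,064 clusters needs a bitmap of 2,442,888 bytes.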
From MFT records to attribute lists, attributes, and streams
For each file (or directory) described in the MFT record, there's a linear repository of stream descriptors (also named attributes), packed together in a variable-length record (also named an attributes list), with extra padding to fill the fixed 1KB size of every MFT record, and that fully describes the effective streams associated with that file.
Each stream (or attribute) itself has a single type (internally just a fixed-size integer in the stored descriptor, but most often handled in applications using an equivalent symbolic name in the FileOpen() or FileCreate() API call), a single optional stream name (completely unrelated to the effective filenames), plus optional associated data for that stream. For NTFS, the standard data of files, or the index data for directories are handled the same way as other data for alternate data streams, or for standard attributes. They are just one of the attributes stored in one or several attribute lists.
• For each file described in the MFT record (or in the non-resident repository of stream descriptors, see below), the stream descriptors identified by their (stream type value, stream name) must be unique. Additionally, NTFS has some ordering constraints for these descriptors.
• There's a predefined null stream type, used to indicate the end of the list of stream descriptors in the streams repository for that file. It must be present as the last stream descriptor in each stream repository (all other storage space available after it is ignored and consists only of padding bytes to match the record size in the MFT or a cluster size in a non-resident streams repository).
• Some stream types are required and must be present in each MFT record, except unused records that are just indicated by a stream with null stream type.
o This is the case for the standard attributes that are stored as a fixed-size record and containing the timestamps and other basic single-bit attributes (compatible with those managed by FAT/FAT32 in DOS or Windows 95/98 applications).
• Some stream types cannot have a name and must remain anonymous.
o This is the case for the standard attributes, or for the preferred NTFS "filename" stream type, or the "short filename" stream type, when it is also present (for compatibility with DOS-like applications, see below). It is also possible for a file to only contain a short filename, in which case it will be the preferred one, as listed in the Windows Explorer.
o The filename streams stored in the streams repository do not make the file immediately accessible through the hierarchical filesystem. In fact, all the filenames must be indexed separately in at least one separate directory on the same volume, with its own MFT entry and its own security descriptors and attributes, that will reference the MFT entry number for that file. This allows the same file or directory to be "hardlinked" several times from several containers on the same volume, possibly with distinct filenames.
• The default data stream of a regular file is a stream of type $DATA but with an anonymous name, and the ADS's are similar but must be named.
• Conversely, the default data stream of a directory has a distinct type but is not anonymous: it has a stream name ("$I30" in NTFS 3+) that reflects its indexing format.
Resident vs. non-resident data streams
To optimize storage and reduce the I/O overhead for the very common case of streams with very small associated data, NTFS prefers to place this data within the stream descriptor (if the size of the stream descriptor does not then exceed the maximum size of the MFT record or the maximum size of a single entry within a non-resident stream repository, see below), instead of using the MFT entry space to list clusters containing the data; in the latter case, the stream descriptor does not store the data directly but just stores an allocation map pointing to the actual data stored elsewhere on the volume. When the stream data can be accessed directly from within the stream descriptor, it is called "resident data" by computer forensics workers. The amount of data which fits is highly dependent on the file's characteristics, but 700 to 800 bytes is common in single-stream files with non-lengthy filenames and no ACLs.
• Some stream descriptors (such as the preferred filename, the basic file attributes, or the main allocation map for each non-resident stream) cannot be made non-resident.
• Encrypted-by-NTFS, sparse data streams, or compressed data streams cannot be made resident.
• The format of the allocation map for non-resident streams depends on its capability of supporting sparse data storage. In the current implementation of NTFS, once a non-resident stream data has been marked and converted as sparse, it cannot be reverted to non-sparse data, so it cannot become resident again, unless this data is fully truncated, discarding the sparse allocation map completely.
• When a non-resident data stream is too fragmented for its effective allocation map to fit entirely within the MFT record, the allocation map may also be stored as a non-resident stream, with just a small resident stream containing the indirect allocation map to the effective non-resident allocation map of the non-resident data stream.
• When there are too many streams for a file (including ADS's, extended attributes, or security descriptors), so that their descriptors cannot fit all within the MFT record, a non-resident stream may also be used to store an additional repository for the other stream descriptors (except those few small streams that cannot be non-resident), using the same format as the one used in the MFT record, but without the space constraints of the MFT record.
The NTFS filesystem driver will sometimes attempt to relocate the data of some of these non-resident streams into the streams repository, and will also attempt to relocate the stream descriptors stored in a non-resident repository back to the stream repository of the MFT record, based on priority and preferred ordering rules, and size constraints.
Since resident files do not directly occupy clusters ("allocation units"), it is possible for an NTFS volume to contain more files than there are clusters. For example, an 80 GB (74.5 GiB) partition formatted as NTFS has 19,543,064 clusters of 4 KB. Subtracting system files (a 64 MB log file, a 2,442,888-byte $Bitmap file, and about 25 clusters of fixed overhead) leaves 19,526,158 clusters free for files and indices. Since there are four MFT records per cluster, this volume theoretically could hold almost 4 × 19,526,158 = 78,104,632 resident files.
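The final step of this example can be checked with a few lines of Python. This is only a sketch: the names are illustrative, and the figure of four MFT records per cluster is taken to mean 1 KB MFT records on 4 KB clusters, which is an assumption consistent with the example rather than something stated explicitly.

```python
# Figures taken from the example above; the names are illustrative only.
CLUSTER_SIZE = 4 * 1024        # 4 KB clusters
MFT_RECORD_SIZE = 1024         # assumed 1 KB MFT records (four per cluster)
FREE_CLUSTERS = 19_526_158     # clusters left for files and indices


def max_resident_files(free_clusters, cluster_size=CLUSTER_SIZE,
                       record_size=MFT_RECORD_SIZE):
    """Upper bound on resident files: one MFT record per file."""
    records_per_cluster = cluster_size // record_size
    return free_clusters * records_per_cluster


print(max_resident_files(FREE_CLUSTERS))  # 78104632
```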
Limitations
The following are a few limitations of NTFS:
File Names
File names are limited to 255 UTF-16 code words. Certain names are reserved in the volume root directory and cannot be used for files. These are: $MFT, $MFTMirr, $LogFile, $Volume, $AttrDef, . (dot), $Bitmap, $Boot, $BadClus, $Secure, $Upcase, and $Extend; . (dot) and $Extend are both directories; the others are files. The NT kernel limits full paths to 32,767 UTF-16 code words.
Maximum Volume Size
In theory, the maximum NTFS volume size is 2^64−1 clusters. However, the maximum NTFS volume size as implemented in Windows XP Professional is 2^32−1 clusters. For example, using 64 KB clusters, the maximum NTFS volume size is 256 TB minus 64 KB. Using the default cluster size of 4 KB, the maximum NTFS volume size is 16 TB minus 4 KB. (Both of these are vastly higher than the 128 GB limit lifted in Windows XP SP1.) Because partition tables on master boot record (MBR) disks only support partition sizes up to 2 TB, dynamic or GPT volumes must be used to create NTFS volumes over 2 TB. Booting from a GPT volume to a Windows environment requires a system with EFI and 64-bit support.
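The two volume-size figures follow directly from the implemented cluster-count limit. A minimal sketch of the arithmetic, assuming that KB and TB are meant as binary units (1024 bytes and 1024^4 bytes); the function name is illustrative:

```python
KB = 1024
TB = 1024 ** 4


def max_volume_bytes(cluster_size, max_clusters=2**32 - 1):
    """Largest volume for a given cluster size and cluster-count limit."""
    return max_clusters * cluster_size


for cluster_size in (64 * KB, 4 * KB):
    size = max_volume_bytes(cluster_size)
    # Adding one cluster back gives a whole number of TB.
    whole_tb = (size + cluster_size) // TB
    print(f"{cluster_size // KB} KB clusters: "
          f"{whole_tb} TB minus {cluster_size // KB} KB")
```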
Maximum File Size
Theoretical: 16 EB minus 1 KB (2^64 − 2^10 or 18,446,744,073,709,550,592 bytes). Implementation: 16 TB minus 64 KB (2^44 − 2^16 or 17,592,185,978,880 bytes).
Alternate Data Streams
Windows system calls may handle alternate data streams. Depending on the operating system, utility and remote file system, a file transfer might silently strip data streams. A safe way of copying or moving files is to use the BackupRead and BackupWrite system calls, which allow programs to enumerate streams, to verify whether each stream should be written to the destination volume and to knowingly skip offending streams.
File Allocation Table or FAT is a computer file system architecture now widely used on many computer systems and most memory cards, such as those used with digital cameras. FAT file systems are commonly found on floppy disks, flash memory cards, digital cameras, and many other portable devices because of their relative simplicity. The performance of FAT compares poorly to most other file systems: it uses overly simplistic data structures, which makes file operations time-consuming, and it makes poor use of disk space in situations where many small files are present.
For floppy disks, FAT has been standardized as ECMA-107 and ISO/IEC 9293. Those standards include only FAT12 and FAT16 without long filename support; long filename support in FAT is partially patented.
The FAT file system is relatively straightforward technically and is supported by virtually all existing operating systems for personal computers. This makes it a useful format for solid-state memory cards and a convenient way to share data between operating systems.
History
The FAT file system was developed by Bill Gates and Marc McDonald during 1976–1977. It was the primary file system for various operating systems including DR-DOS, FreeDOS, MS-DOS, OS/2 (v1.1) and Microsoft Windows (up until Windows Me).
The FAT file system was created for managing disks in Microsoft Standalone Disk BASIC. In August 1980 Tim Paterson incorporated FAT into his 86-DOS operating system for the S-100 8086 CPU boards; the file system was the main difference between 86-DOS and its predecessor, CP/M.
The name originates from the usage of a table which centralizes the information about which areas belong to files, are free or possibly unusable, and where each file is stored on the disk. To limit the size of the table, disk space is allocated to files in contiguous groups of hardware sectors called clusters. As disk drives have evolved, the maximum number of clusters has dramatically increased, and so the number of bits used to identify each cluster has grown. The successive major versions of the FAT format are named after the number of table element bits: 12, 16, and 32. The FAT standard has also been expanded in other ways while preserving backward compatibility with existing software.
FAT12
The initial version of FAT is now referred to as FAT12. Designed as a file system for floppy disks, it limited cluster addresses to 12-bit values, which not only limited the cluster count to 4078, but made FAT manipulation tricky with the PC's 8-bit and 16-bit registers. (Under Linux, FAT12 is limited to 4084 clusters.) The disk's size is stored as a 16-bit count of sectors, which limited the size to 32 MB. FAT12 was used by several manufacturers with different physical formats, but a typical floppy disk at the time was 5.25-inch, single-sided, 40 tracks, with 8 sectors per track, resulting in a capacity of 160 KB for both the system areas and files. The FAT12 limitations exceeded this capacity by a factor of ten or more.
By convention, all the control structures were organized to fit inside the first track, thus avoiding head movement during read and write operations, although this varied depending on the manufacturer and physical format of the disk. At the time FAT12 was introduced, DOS did not support hierarchical directories, and the maximum number of files was typically limited to a few dozen. Hierarchical directories were introduced in MS-DOS version 2.0. A limitation which was not addressed until much later was that any bad sector in the control structures area, track 0, could prevent the disk from being usable. The DOS formatting tool rejected such disks completely. Bad sectors were allowed only in the file area, where they made the entire holding cluster unusable as well. FAT12 remains in use on all common floppy disks, including 1.44 MB ones.
Initial FAT16
In 1984, IBM released the PC AT, which featured a 20 MB hard disk. Microsoft introduced MS-DOS 3.0 in parallel. (The earlier PC XT was the first PC with a hard drive from IBM, and MS-DOS 2.0 supported that hard drive with FAT12.) Cluster addresses were increased to 16-bit, allowing for up to 65,517 clusters per volume, and consequently much greater file system sizes, at least in theory. However, the maximum possible number of sectors and the maximum (partition, rather than disk) size of 32 MB did not change. Therefore, although technically already "FAT16", this format was not what today is commonly understood as FAT16. With the initial implementation of FAT16 not actually providing for larger partition sizes than FAT12, the early benefit of FAT16 was to enable the use of smaller clusters, making disk usage more efficient, particularly for files several hundred bytes in size, which were far more common at the time. Also, the introduction of FAT16 actually did bring an increase in the maximum partition size under MS-DOS, since the implementation of FAT12 for hard disks in MS-DOS 2.0 was limited to 15 MB. (That is, the initial FAT16 did not support larger drives than FAT12, but MS-DOS 3.0 using FAT16 did support larger drives than MS-DOS 2.0 using FAT12, by a factor of two.)
A 20 MB hard disk formatted under MS-DOS 3.0 was not accessible by the older MS-DOS 2.0. (This was because MS-DOS 2.0 did not support version 3.0's FAT16 and because it did not support hard disk partitions over 15 MB in size.) Of course, MS-DOS 3.0 could still access MS-DOS 2.0 style 8 KB-cluster partitions.
MS-DOS 3.0 also introduced support for high-density 1.2 MB 5.25" diskettes, which notably had 15 sectors per track, hence more space for the FATs. This probably prompted a dubious optimization of the cluster size, which went down from 2 sectors to just 1. The net effect was that high-density diskettes were significantly slower than older double-density ones.
Extended partition and logical drives
Apart from improving the structure of the FAT file system itself, a parallel development allowing an increase in the maximum possible FAT size was the introduction of multiple FAT partitions. Originally DOS was only prepared to handle one FAT partition, although it came with documentation and programming tools for the creation of installable device drivers to handle multiple partitions, and third-party suppliers quickly provided the missing software. Aside from that, partitions were used for sharing the disk between operating systems, typically DOS and Xenix at the time. Extra DOS partitions could not be used as boot partitions, because the installable device drivers were loaded (in config.sys) only after the first part of the DOS boot process. Later, third party tools became available that replaced the DOS master boot record (MBR) and directly loaded non-DOS drivers before DOS: such systems generally came with careful warnings that without the 3rd party software, the disk would not be compatible with DOS. Simply allowing several identical-looking DOS partitions could lead to naming problems: behaviour if more than one partition was marked active was undocumented (although well defined), as was the behaviour if there was more than one hard disk in the computer (which was machine dependent), or if the system was booted from a diskette.
To allow the use of more FAT partitions in a compatible way, a new partition type was introduced (in MS-DOS 3.2, January 1986), the extended partition, which is a container for additional partitions called logical drives. Originally only one logical drive was possible, permitting hard disks up to 64 MB. In MS-DOS 3.3 (August 1987) this limit was increased to 24 drives, equal to the maximum number of available letters for drive names (A and B being reserved for the first two floppy drives, at least one of which many, if not most, systems of the era were equipped with; where only one was installed, B always simulated a second drive using A). Logical drives are described by on-disk structures which closely resemble the Master Boot Record (MBR) of the disk (which describes the primary partitions), likely to simplify the implementation. Though some believe these partitions were nested in a way analogous to Russian matryoshka dolls, that isn't the case. They are stored as a row of separate blocks within a single box; these blocks are often referred to as being chained together, by the links in their extended boot record (EBR) sectors. Only one extended partition is allowed. Under MS-DOS, logical drives are not bootable, and the extended partition can only be created after the primary FAT partition, which removes all ambiguity but also eliminates the possibility of booting several DOS versions from the same hard disk. (A few systems other than MS-DOS can boot logical drives, and partitions can be created in any order using third party formatting tools.)
A useful side-effect of the extended partition scheme was to significantly increase the maximum number of partitions possible on a PC hard disk beyond the four which could be described by the MBR alone.
Prior to the introduction of extended partitions, some hard disk controllers (which at that time were usually separate option boards) could make large hard disks appear at the hardware interface level as two separate disks. Otherwise, DOS "Block Device Drivers" were used to access the other 3 possible partitions on a disk.
Final FAT16
Finally in November 1987, Compaq DOS 3.31 (an OEM version of MS-DOS 3.3 released by Compaq with their machines) introduced what is today called the FAT16 format, with the expansion of the 16-bit disk sector count to 32 bits. The result was initially called the DOS 3.31 Large File System. Although the on-disk changes were minor, the entire DOS disk driver had to be converted to use 32-bit sector numbers, a task complicated by the fact that it was written in 16-bit assembly language.
In 1988 this improvement became more generally available through MS-DOS 4.0 and OS/2 1.1. The limit on partition size was dictated by the 8-bit signed count of sectors per cluster, which had a maximum power-of-two value of 64. With the standard hard disk sector size of 512 bytes, this gives a maximum of 32 KB clusters, thereby fixing the "definitive" limit for the FAT16 partition size at 2 gigabytes. On magneto-optical media, which can have 1 or 2 KB sectors instead of 1/2 KB, this size limit is proportionally larger.
Much later, Windows NT increased the maximum cluster size to 64 KB by considering the sectors-per-cluster count as unsigned. However, the resulting format was not compatible with any other FAT implementation of the time, and it generated greater internal fragmentation. Windows 98 also supported reading and writing this variant, but its disk utilities did not work with it.
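The relationship between the sectors-per-cluster byte and the partition limit can be sketched as follows. The 2^16 cluster count is a rounded upper bound implied by 16-bit cluster addresses (the usable count quoted earlier is slightly lower), and the 4 GB result for the unsigned interpretation is derived here rather than stated in the text; the function names are illustrative.

```python
SECTOR_SIZE = 512
MAX_CLUSTERS = 2 ** 16   # rounded upper bound for 16-bit cluster addresses


def largest_power_of_two(limit):
    """Largest power of two that does not exceed limit."""
    value = 1
    while value * 2 <= limit:
        value *= 2
    return value


def fat16_limit(max_field_value, sector_size=SECTOR_SIZE):
    """Return (bytes per cluster, approximate maximum partition bytes)."""
    sectors_per_cluster = largest_power_of_two(max_field_value)
    cluster_bytes = sectors_per_cluster * sector_size
    return cluster_bytes, cluster_bytes * MAX_CLUSTERS


print(fat16_limit(127))  # signed byte:   (32768, 2147483648), i.e. 32 KB and 2 GB
print(fat16_limit(255))  # unsigned byte: (65536, 4294967296), i.e. 64 KB and 4 GB
```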
The number of root directory entries available is determined when the volume is formatted, and is stored in a 16-bit signed field, defining an absolute limit of 32767 entries (32736, a multiple of 32, in practice). For historical reasons, FAT12 and FAT16 media generally use 512 root directory entries on non-floppy media. Other sizes may be incompatible with some software or devices (entries being file and/or folder names in the original 8.3 format). Some third party tools like mkdosfs allow the user to set this parameter.
Long file names
One of the user experience goals for the designers of Windows 95 was the ability to use long filenames (LFNs—up to 255 UTF-16 code points long), in addition to classic 8.3 filenames. LFNs were implemented using a workaround in the way directory entries are laid out (see below).
The version of the file system with this extension is usually known as VFAT after the Windows 95 virtual device driver, also known as "Virtual FAT" in Microsoft's documentation. Interestingly, the VFAT driver actually appeared before Windows 95, in Windows for Workgroups 3.11, but was only used for implementing 32-bit file access and did not support long file names.
In Windows NT, support for long filenames on FAT started from version 3.5. OS/2 added long filename support to FAT using extended attributes (EA) before the introduction of VFAT; thus, VFAT long filenames are invisible to OS/2, and EA long filenames are invisible to Windows.
FAT32
In order to overcome the size limit of FAT16, while at the same time allowing DOS real mode code to handle the format, and without reducing available conventional memory unnecessarily, Microsoft implemented a next generation, known as FAT32. Cluster values are represented by 32-bit numbers, of which 28 bits are used to hold the cluster number, for a maximum of approximately 268 million (2^28) clusters. This allows for drive sizes of up to 8 terabytes with 32 KB clusters, but the boot sector uses a 32-bit field for the sector count, limiting volume size to 2 TB on a hard disk with 512-byte sectors.
On Windows 95/98, due to the version of Microsoft's SCANDISK utility included with these operating systems being a 16-bit application, the FAT structure is not allowed to grow beyond around 4.2 million (< 2^22) clusters, placing the volume limit at 127.53 GiB. A limitation in original versions of Windows 98/98SE's Fdisk utility causes it to incorrectly report disk sizes over 64 GB. A corrected version is available from Microsoft, but it cannot partition drives larger than 512 GB. The Windows 2000/XP installation program and filesystem creation tool impose a limitation of 32 GB. However, both systems can read and write to FAT32 file systems of any size. This limitation is by design and according to Microsoft was imposed because many tasks on a very large FAT32 file system become slow and inefficient. This limitation can be bypassed by using third-party formatting utilities. Windows Me supports the FAT32 file system without any limits. However, similarly to Windows 95/98/98SE there is no native support for 48-bit LBA in Windows Me, meaning that the maximum disk size for ATA disks is 127.6 GiB, the maximum size of an ATA disk using the previous long-standard 28-bit LBA.
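The two competing limits described here can be compared with a short calculation. This is a sketch under the assumption that "terabytes" is meant in binary units (1024^4 bytes); the function names are illustrative.

```python
KB = 1024
TB = 1024 ** 4


def cluster_address_limit(cluster_size=32 * KB, cluster_bits=28):
    """Volume size reachable with 28-bit cluster numbers."""
    return (2 ** cluster_bits) * cluster_size


def sector_count_limit(sector_size=512, count_bits=32):
    """Volume size reachable with a 32-bit sector count."""
    return (2 ** count_bits) * sector_size


def fat32_volume_limit(cluster_size=32 * KB, sector_size=512):
    """The effective limit is whichever bound is reached first."""
    return min(cluster_address_limit(cluster_size),
               sector_count_limit(sector_size))


print(cluster_address_limit() // TB)  # 8
print(sector_count_limit() // TB)     # 2
print(fat32_volume_limit() // TB)     # 2
```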
FAT32 was introduced with Windows 95 OSR2, although reformatting was needed to use it, and DriveSpace 3 (the version that came with Windows 95 OSR2 and Windows 98) never supported it. Windows 98 introduced a utility to convert existing hard disks from FAT16 to FAT32 without loss of data. In the NT line, native support for FAT32 arrived in Windows 2000. A free FAT32 driver for Windows NT 4.0 was available from Winternals, a company later acquired by Microsoft. Since the acquisition the driver is no longer officially available.
The maximum possible size for a file on a FAT32 volume is 4 GB minus 1 byte (2^32−1 bytes). Video applications, large databases, and some other software easily exceed this limit. Larger files require another formatting type such as NTFS.
Fragmentation
The FAT file system does not contain mechanisms which prevent newly written files from becoming scattered across the partition. Other file systems, like HPFS, use free space bitmaps that indicate used and available clusters, which could then be quickly looked up in order to find free contiguous areas (improved in exFAT). Another solution is the linkage of all free clusters into one or more lists (as is done in Unix file systems). Instead, the FAT has to be scanned as an array to find free clusters, which can lead to performance penalties with large disks.
In fact, computing free disk space on FAT is one of the most resource intensive operations, as it requires reading the entire FAT linearly. A possible justification suggested by Microsoft's Raymond Chen for limiting the maximum size of FAT32 partitions created on Windows was the time required to perform a "DIR" operation, which always displays the free disk space as the last line. Displaying this line took longer and longer as the number of clusters increased.
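To illustrate why computing free space is linear in the size of the table, the following sketch counts free clusters in an in-memory FAT16 table by visiting every entry. It assumes 16-bit little-endian entries, a value of zero for a free cluster, and data clusters numbered from 2; the function name and the sample table are hypothetical.

```python
import struct


def count_free_clusters_fat16(fat_bytes, cluster_count):
    """Count free clusters by reading every FAT16 entry in order."""
    free = 0
    # Entries 0 and 1 are not data clusters; numbering starts at 2.
    for cluster in range(2, cluster_count + 2):
        (entry,) = struct.unpack_from("<H", fat_bytes, cluster * 2)
        if entry == 0x0000:
            free += 1
    return free


# A tiny hand-made table: two leading entries plus six data clusters.
entries = [0xFFF8, 0xFFFF, 3, 0xFFFF, 0, 0, 0xFFF7, 0]
fat = struct.pack("<8H", *entries)
print(count_free_clusters_fat16(fat, 6))  # 3
```

There is no shortcut in this scheme: every entry has to be examined, so the cost grows with the number of clusters.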
The High Performance File System (HPFS) divides disk space into bands, which have their own free space bitmap, where multiple files opened for simultaneous write could be expanded separately.
Some of the perceived problems with fragmentation resulted from operating system and hardware limitations.
The single-tasking DOS and the traditionally single-tasking PC hard disk architecture (only 1 outstanding input/output request at a time, no DMA transfers) did not contain mechanisms which could alleviate fragmentation by asynchronously prefetching next data while the application was processing the previous chunks.
Similarly, write-behind caching was often not enabled by default with Microsoft software (if present) given the problem of data loss in case of a crash, made easier by the lack of hardware protection between applications and the system.
MS-DOS also did not offer a system call which would allow applications to make sure a particular file has been completely written to disk in the presence of deferred writes (cf. fsync in Unix or DosBufReset in OS/2). Disk caches on MS-DOS were operating on disk block level and were not aware of higher-level structures of the file system. In this situation, cheating with regard to the real progress of a disk operation was most dangerous.
Modern operating systems have introduced these optimizations to FAT partitions, but optimizations can still produce unwanted artifacts in case of a system crash. A Windows NT system will allocate space to files on FAT in advance, selecting large contiguous areas, but in case of a crash, files which were being appended will appear larger than they were ever written into, with dozens of random kilobytes at the end.
With the large cluster sizes, 16 or 32K, forced by larger FAT32 partitions, the external fragmentation becomes somewhat less significant, and internal fragmentation, i.e. disk space waste (since files are rarely exact multiples of cluster size), starts to be a problem as well, especially when there are a great many small files.
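Internal fragmentation can be quantified as the unused tail of the last cluster of each file. A minimal sketch, assuming that an empty file occupies no clusters (an assumption made for this example only); the function name is illustrative.

```python
def slack_bytes(file_size, cluster_size):
    """Bytes allocated to a file but not used by its data."""
    if file_size == 0:
        return 0  # assumption: an empty file occupies no clusters
    clusters = -(-file_size // cluster_size)  # ceiling division
    return clusters * cluster_size - file_size


KB = 1024
print(slack_bytes(1 * KB, 32 * KB) // KB)   # 31
print(slack_bytes(33 * KB, 32 * KB) // KB)  # 31
print(slack_bytes(64 * KB, 32 * KB))        # 0
```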
Design
Overview
The following is an overview of the order of structures in a FAT partition or disk:
Contents (in on-disk order), with the size of each region in sectors:
Boot Sector; FS Information Sector (FAT32 only); more reserved sectors (optional). Size: (number of reserved sectors)
File Allocation Table #1; File Allocation Table #2. Size: (number of FATs)*(sectors per FAT)
Root Directory (FAT12/16 only). Size: (number of root entries*32)/(bytes per sector)
Data Region (for files and directories), to end of partition or disk. Size: NumberOfClusters*SectorsPerCluster
A FAT file system is composed of four different sections.
1. The Reserved sectors, located at the very beginning. The first reserved sector (sector 0) is the Boot Sector (aka Partition Boot Record). It includes an area called the BIOS Parameter Block (with some basic file system information, in particular its type, and pointers to the location of the other sections) and usually contains the operating system's boot loader code. The total count of reserved sectors is indicated by a field inside the Boot Sector. Important information from the Boot Sector is accessible through an operating system structure called the Drive Parameter Block in DOS and OS/2. For FAT32 file systems, the reserved sectors include a File System Information Sector at sector 1 and a Backup Boot Sector at Sector 6.
2. The FAT Region. This typically contains two copies (may vary) of the File Allocation Table for the sake of redundancy checking, although the extra copy is rarely used, even by disk repair utilities. These are maps of the Data Region, indicating which clusters are used by files and directories.
3. The Root Directory Region. This is a Directory Table that stores information about the files and directories located in the root directory. It is only used with FAT12 and FAT16, and imposes on the root directory a fixed maximum size which is pre-allocated at creation of this volume. FAT32 stores the root directory in the Data Region, along with files and other directories, allowing it to grow without such a constraint. Thus, for FAT32, the Data Region starts here.
4. The Data Region. This is where the actual file and directory data is stored and takes up most of the partition. The size of files and subdirectories can be increased arbitrarily (as long as there are free clusters) by simply adding more links to the file's chain in the FAT. Note however, that files are allocated in units of clusters, so if a 1 KB file resides in a 32 KB cluster, 31 KB are wasted. FAT32 typically commences the Root Directory Table in cluster number 2: the first cluster of the Data Region.
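The order and sizes of the four sections can be turned into starting sector numbers. The sketch below is illustrative only: the parameter names are hypothetical, and the sample values describe a 1.44 MB floppy, where the figure of 9 sectors per FAT is an assumed typical value rather than one given in the text.

```python
def fat_layout(reserved_sectors, num_fats, sectors_per_fat,
               root_entries, bytes_per_sector):
    """Return region start sectors, relative to the start of the volume."""
    fat_start = reserved_sectors
    root_start = fat_start + num_fats * sectors_per_fat
    # Each root directory entry is 32 bytes; round up to whole sectors.
    root_sectors = (root_entries * 32 + bytes_per_sector - 1) // bytes_per_sector
    data_start = root_start + root_sectors
    return {
        "fat": fat_start,
        "root_dir": root_start,
        "root_sectors": root_sectors,
        "data": data_start,
    }


# Assumed values for a 1.44 MB FAT12 floppy.
print(fat_layout(1, 2, 9, 224, 512))
# {'fat': 1, 'root_dir': 19, 'root_sectors': 14, 'data': 33}
```

With zero root entries, as on FAT32, the root directory region has zero length and the Data Region begins immediately after the last FAT.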
FAT uses little endian format for entries in the header and the FAT(s).
Boot Sector
It is important to note that the first sector on a device isn't necessarily the boot sector. For partitioned devices (such as hard drives), the first sector is the Master Boot Record. On unpartitioned devices (e.g. floppy disks) the first sector is the Volume Boot Record.
Common structure of the first 36 bytes used by all FAT versions:
Byte Offset Length (bytes) Description
0x00 3 Jump instruction. This instruction will be executed and will skip past the rest of the (non-executable) header if the partition is booted from. See Volume Boot Record. If the jump is two-byte near jmp it is followed by a NOP instruction.
0x03 8 OEM Name (padded with spaces). This value indicates on which system the disk was formatted. MS-DOS checks this field to determine which other parts of the boot record can be relied on. Common values are IBM 3.3 (with two spaces between the "IBM" and the "3.3"), MSDOS5.0, MSWIN4.1 and mkdosfs.
0x0b 2 Bytes per sector. A common value is 512, especially for file systems on IDE (or compatible) disks. The BIOS Parameter Block starts here.
0x0d 1 Sectors per cluster. Allowed values are powers of two from 1 to 128. However, the value must not be such that the number of bytes per cluster becomes greater than 32 KB.
0x0e 2 Reserved sector count. The number of sectors before the first FAT in the file system image. Should be 1 for FAT12/FAT16. Usually 32 for FAT32.
0x10 1 Number of file allocation tables. Almost always 2.
0x11 2 Maximum number of root directory entries. Only used on FAT12 and FAT16, where the root directory is handled specially. Should be 0 for FAT32. This value should always be such that the root directory ends on a sector boundary (i.e. such that its size becomes a multiple of the sector size). 224 is typical for floppy disks.
0x13 2 Total sectors (if zero, use 4 byte value at offset 0x20)
0x15 1 Media descriptor
0xF0 3.5" Double Sided, 80 tracks per side, 18 or 36 sectors per track (1.44MB or 2.88MB). 5.25" Double Sided, 80 tracks per side, 15 sectors per track (1.2MB). Used also for other media types.
0xF8 Fixed disk (i.e. Hard disk).
0xF9 3.5" Double sided, 80 tracks per side, 9 sectors per track (720K). 5.25" Double sided, 80 tracks per side, 15 sectors per track (1.2MB)
0xFA 5.25" Single sided, 80 tracks per side, 8 sectors per track (320K)
0xFB 3.5" Double sided, 80 tracks per side, 8 sectors per track (640K)
0xFC 5.25" Single sided, 40 tracks per side, 9 sectors per track (180K)
0xFD 5.25" Double sided, 40 tracks per side, 9 sectors per track (360K). Also used for 8".
0xFE 5.25" Single sided, 40 tracks per side, 8 sectors per track (160K). Also used for 8".
0xFF 5.25" Double sided, 40 tracks per side, 8 sectors per track (320K)
The same media descriptor value should be repeated as the first byte of each copy of the FAT. Certain operating systems (MSX-DOS version 1.0) ignore the boot sector parameters altogether and use the media descriptor value from the first byte of the FAT to determine file system parameters.
0x16 2 Sectors per File Allocation Table for FAT12/FAT16
0x18 2 Sectors per track
0x1a 2 Number of heads
0x1c 4 Count of hidden sectors preceding the partition that contains this FAT volume. This field should always be zero on media that are not partitioned.
0x20 4 Total sectors (if greater than 65535; otherwise, see offset 0x13)
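The 36-byte common structure maps naturally onto a packed little-endian record. The following sketch decodes it with Python's struct module; the field names are hypothetical labels for the rows of the table above, and the sample sector is built by hand with assumed floppy-like values rather than read from a real disk.

```python
import struct
from collections import namedtuple

# Little-endian, no padding: matches offsets 0x00 through 0x23.
BPB_FORMAT = "<3s8sHBHBHHBHHHII"

CommonBPB = namedtuple("CommonBPB", [
    "jump", "oem_name", "bytes_per_sector", "sectors_per_cluster",
    "reserved_sectors", "num_fats", "root_entries", "total_sectors_16",
    "media_descriptor", "sectors_per_fat", "sectors_per_track", "heads",
    "hidden_sectors", "total_sectors_32",
])


def parse_common_bpb(sector):
    """Decode the first 36 bytes of a FAT boot sector."""
    return CommonBPB(*struct.unpack_from(BPB_FORMAT, sector, 0))


def total_sectors(bpb):
    """Use the 16-bit count unless it is zero, then the 32-bit one."""
    return bpb.total_sectors_16 or bpb.total_sectors_32


# Hand-built sample with assumed 1.44 MB floppy values.
boot = struct.pack(BPB_FORMAT, b"\xeb\x3c\x90", b"MSDOS5.0", 512, 1, 1, 2,
                   224, 2880, 0xF0, 9, 18, 2, 0, 0)
sector = boot.ljust(512, b"\x00")
bpb = parse_common_bpb(sector)
print(bpb.oem_name.decode("ascii"), bpb.bytes_per_sector, total_sectors(bpb))
```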
Extended BIOS Parameter Block
Further structure used by FAT12 and FAT16, also known as Extended BIOS Parameter Block:
Byte Offset Length (bytes) Description
0x24 1 Physical drive number (0x00 for removable media, 0x80 for hard disks)
0x25 1 Reserved ("current head")
In Windows NT, bit 0 is a dirty flag that requests chkdsk at boot time; bit 1 requests a surface scan as well.
0x26 1 Extended boot signature. (Should be 0x29. Indicates that the following 3 entries exist.)
0x27 4 ID (serial number)
0x2b 11 Volume Label, padded with blanks (0x20).
0x36 8 FAT file system type, padded with blanks (0x20), e.g.: "FAT12 ", "FAT16 ". This is not meant to be used to determine drive type; however, some utilities use it in this way.
0x3e 448 Operating system boot code
0x1FE 2 Boot sector signature (0x55 0xAA)
The boot sector is portrayed here as found on e.g. an OS/2 1.3 boot diskette. Earlier versions used a shorter BIOS Parameter Block and their boot code would start earlier (for example at offset 0x2b in OS/2 1.1).
Further structure used by FAT32:
Byte Offset Length (bytes) Description
0x24 4 Sectors per file allocation table
0x28 2 FAT Flags (Only used during a conversion from a FAT12/16 volume.)
0x2a 2 Version (Defined as 0)
0x2c 4 Cluster number of root directory start
0x30 2 Sector number of FS Information Sector
0x32 2 Sector number of a copy of this boot sector (0 if no backup copy exists)
0x34 12 Reserved
0x40 1 Physical Drive Number (see FAT12/16 BPB at offset 0x24)
0x41 1 Reserved (see FAT12/16 BPB at offset 0x25)
0x42 1 Extended boot signature. (see FAT12/16 BPB at offset 0x26)
0x43 4 ID (serial number)
0x47 11 Volume Label
0x52 8 FAT file system type: "FAT32 "
0x5a 420 Operating system boot code
0x1FE 2 Boot sector signature (0x55 0xAA)
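The FAT32-specific fields starting at offset 0x24 can be decoded in the same packed little-endian style. This is a sketch only: the field names are hypothetical, and the sample sector is filled in by hand with assumed values.

```python
import struct

# Little-endian, no padding: covers offsets 0x24 through 0x59.
FAT32_EBPB_FORMAT = "<IHHIHH12sBBBI11s8s"
FAT32_EBPB_OFFSET = 0x24
FIELDS = (
    "sectors_per_fat", "fat_flags", "version", "root_cluster",
    "fs_info_sector", "backup_boot_sector", "reserved", "drive_number",
    "reserved_nt", "ext_boot_signature", "serial", "volume_label", "fs_type",
)


def parse_fat32_ebpb(sector):
    """Decode the FAT32 extension of the boot sector into a dict."""
    if bytes(sector[0x1FE:0x200]) != b"\x55\xaa":
        raise ValueError("missing boot sector signature")
    values = struct.unpack_from(FAT32_EBPB_FORMAT, sector, FAT32_EBPB_OFFSET)
    return dict(zip(FIELDS, values))


# Hand-built sample sector with assumed values.
sector = bytearray(512)
struct.pack_into(FAT32_EBPB_FORMAT, sector, FAT32_EBPB_OFFSET,
                 1000, 0, 0, 2, 1, 6, bytes(12), 0x80, 0, 0x29,
                 0x12345678, b"NO NAME".ljust(11), b"FAT32".ljust(8))
sector[0x1FE:0x200] = b"\x55\xaa"
info = parse_fat32_ebpb(bytes(sector))
print(info["root_cluster"], info["fs_info_sector"], info["fs_type"])
```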
Exceptions
The implementation of FAT used in MS-DOS for the Apricot PC had a different boot sector layout, to accommodate that computer's non-IBM compatible BIOS. The jump instruction and OEM name were omitted, and the MS-DOS file system parameters (offsets 0x0B - 0x17 in the standard sector) were located at offset 0x50. Later versions of Apricot MS-DOS gained the ability to read and write disks with the standard boot sector in addition to those with the Apricot one.
DOS Plus on the BBC Master 512 did not use conventional boot sectors at all. Data disks omitted the boot sector and began with a single copy of the FAT (the first byte of the FAT was used to determine disk capacity) while boot disks began with a miniature ADFS file system containing the boot loader, followed by a single FAT. It could also access standard PC disks formatted to 180 KB or 360 KB, again using the first byte of the FAT to determine the capacity.
FS Information Sector
The "FS Information Sector" was introduced in FAT32[30] for speeding up access times of certain operations (in particular, getting the amount of free space). It is located at a sector number specified in the boot record at position 0x30 (usually sector 1, immediately after the boot record).
Byte Offset Length (bytes) Description
0x00 4 FS information sector signature (0x52 0x52 0x61 0x41 / "RRaA")
0x04 480 Reserved (byte values are 0x00)
0x1e4 4 FS information sector signature (0x72 0x72 0x41 0x61 / "rrAa")
0x1e8 4 Number of free clusters on the drive, or -1 if unknown
0x1ec 4 Number of the most recently allocated cluster
0x1f0 14 Reserved (byte values are 0x00)
0x1fe 2 FS information sector signature (0x55 0xAA)
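A reader of this sector would typically check the three signatures before trusting the counters. A minimal sketch, assuming the "-1 if unknown" value is stored as the 32-bit pattern 0xFFFFFFFF; the function and key names are illustrative.

```python
import struct


def parse_fs_info(sector):
    """Validate the signatures and return the two FS information counters."""
    if (sector[0x000:0x004] != b"RRaA"
            or sector[0x1e4:0x1e8] != b"rrAa"
            or sector[0x1fe:0x200] != b"\x55\xaa"):
        raise ValueError("not an FS Information Sector")
    free, last_allocated = struct.unpack_from("<II", sector, 0x1e8)
    return {
        # -1 as an unsigned 32-bit value means the count is unknown.
        "free_clusters": None if free == 0xFFFFFFFF else free,
        "last_allocated": last_allocated,
    }


# Hand-built sample sector.
raw = bytearray(512)
raw[0x000:0x004] = b"RRaA"
raw[0x1e4:0x1e8] = b"rrAa"
struct.pack_into("<II", raw, 0x1e8, 12345, 678)
raw[0x1fe:0x200] = b"\x55\xaa"
print(parse_fs_info(bytes(raw)))
# {'free_clusters': 12345, 'last_allocated': 678}
```

Because the stored count is only a hint, an implementation may still fall back to scanning the FAT when the value is unknown.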
File Allocation Table
A partition is divided up into identically sized clusters, small blocks of contiguous space. Cluster sizes vary depending on the type of FAT file system being used and the size of the partition; typical cluster sizes lie somewhere between 2 KB and 32 KB. Each file may occupy one or more of these clusters depending on its size; thus, a file is represented by a chain of these clusters (referred to as a singly linked list). However, these clusters are not necessarily stored adjacent to one another on the disk's surface but are often instead fragmented throughout the Data Region.
The File Allocation Table (FAT) is a list of entries that map to each cluster on the partition. Each entry records one of five things:
• the cluster number of the next cluster in a chain
• a special end of clusterchain (EOC) entry that indicates the end of a chain
• a special entry to mark a bad cluster
• a special entry to mark a reserved cluster
• a zero to note that the cluster is unused
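To make the linked-list idea concrete, here is a small sketch that follows a cluster chain through a FAT16 table held in memory, using the FAT16 value ranges listed in the "FAT entry values" table. The function name fat16_chain_length and the in-memory array representation are assumptions made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Walk a cluster chain in an in-memory FAT16 table.
 * Returns the number of clusters in the chain, or 0 if the chain is
 * corrupt (runs off the table, loops, or does not end on an EOC mark). */
size_t fat16_chain_length(const uint16_t *fat, size_t entries, uint16_t start)
{
    size_t count = 0;
    uint16_t c = start;

    assert(fat != NULL);
    while (c >= 0x0002 && c <= 0xFFEF) {       /* "used cluster" range */
        if (c >= entries || count >= entries)  /* out of range or looping */
            return 0;
        count++;
        c = fat[c];                            /* entry holds the next cluster */
    }
    return (c >= 0xFFF8) ? count : 0;          /* 0xFFF8-0xFFFF: end of chain */
}
```

The loop counter doubles as a simple guard against cyclic chains, which can appear on damaged volumes.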
Each version of the FAT file system uses a different size for FAT entries. Smaller entries result in a smaller FAT, but waste space in large partitions because space must then be allocated in large clusters. The FAT12 file system uses 12 bits per FAT entry, so two entries span 3 bytes. The packing is consistently little-endian: if you consider the 3 bytes as one little-endian 24-bit number, the 12 least significant bits are the first entry and the 12 most significant bits are the second.
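The FAT12 packing can be sketched as follows. This is only an illustration of the bit layout described above; the function name fat12_get and the raw-buffer interface are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the 12-bit FAT12 entry for cluster n from a raw FAT buffer. */
uint16_t fat12_get(const uint8_t *fat, size_t fat_len, uint32_t n)
{
    size_t off = (size_t)n + n / 2;   /* n * 1.5, rounded down */

    assert(fat != NULL && off + 1 < fat_len);
    /* Load the two bytes that contain the entry as a little-endian word. */
    uint16_t pair = (uint16_t)(fat[off] | (fat[off + 1] << 8));

    return (n & 1) ? (uint16_t)(pair >> 4)       /* odd entry: upper 12 bits */
                   : (uint16_t)(pair & 0x0FFF);  /* even entry: lower 12 bits */
}
```

For instance, if a FAT begins with the bytes F0 FF FF, entry 0 reads as 0xFF0 and entry 1 as 0xFFF.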
In the FAT32 file system, FAT entries are 32 bits, but only 28 of these are actually used; the 4 most significant bits are reserved.
FAT entry values:
FAT12 FAT16 FAT32 Description
0x000 0x0000 0x00000000 Free Cluster
0x001 0x0001 0x00000001 Reserved value; do not use
0x002–0xFEF 0x0002–0xFFEF 0x00000002–0x0FFFFFEF Used cluster; value points to next cluster
0xFF0–0xFF6 0xFFF0–0xFFF6 0x0FFFFFF0–0x0FFFFFF6 Reserved values; do not use[28].
0xFF7 0xFFF7 0x0FFFFFF7 Bad sector in cluster or reserved cluster
0xFF8–0xFFF 0xFFF8–0xFFFF 0x0FFFFFF8–0x0FFFFFFF Last cluster in file
Note that FAT32 uses only 28 bits of the 32 possible bits. The upper 4 bits are usually zero (as indicated in the table above) but are reserved and should be left untouched.
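A short sketch of how the FAT32 column of the table might be applied in code: the reserved upper 4 bits are masked off before the value is interpreted, and preserved when an entry is rewritten. The enum and function names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum fat_entry_kind { FAT_FREE, FAT_RESERVED, FAT_NEXT, FAT_BAD, FAT_EOC };

/* Classify a raw 32-bit FAT32 entry according to the value table. */
enum fat_entry_kind fat32_classify(uint32_t raw)
{
    uint32_t v = raw & 0x0FFFFFFFu;   /* only the low 28 bits are significant */

    if (v == 0x00000000u) return FAT_FREE;
    if (v == 0x00000001u) return FAT_RESERVED;
    if (v <= 0x0FFFFFEFu) return FAT_NEXT;      /* points to next cluster */
    if (v <= 0x0FFFFFF6u) return FAT_RESERVED;
    if (v == 0x0FFFFFF7u) return FAT_BAD;
    return FAT_EOC;                             /* 0x0FFFFFF8-0x0FFFFFFF */
}

/* Store a new 28-bit value while leaving the reserved upper 4 bits untouched. */
uint32_t fat32_set(uint32_t old_raw, uint32_t new_value)
{
    assert((new_value & 0xF0000000u) == 0);     /* caller passes 28 bits */
    return (old_raw & 0xF0000000u) | (new_value & 0x0FFFFFFFu);
}
```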
The first cluster of the Data Region is cluster #2. That leaves the first two entries of the FAT unused. In the first byte of the first entry a copy of the media descriptor is stored. The remaining 8 bits (if FAT16), or 20 bits (if FAT32) of this entry are 1. In the second entry the end-of-cluster-chain marker is stored. The high order two bits of the second entry are sometimes, in the case of FAT16 and FAT32, used for dirty volume management: high order bit 1: last shutdown was clean; next highest bit 1: during the previous mount no disk I/O errors were detected.[31]
Directory table
A directory table is a special type of file that represents a directory (also known as a folder). Each file or directory stored within it is represented by a 32-byte entry in the table. Each entry records the name, extension, attributes (archive, directory, hidden, read-only, system and volume), the date and time of creation, the address of the first cluster of the file/directory's data and finally the size of the file/directory. Aside from the Root Directory Table in FAT12 and FAT16 file systems, which occupies the special Root Directory Region location, all Directory Tables are stored in the Data Region. The actual number of entries in a directory stored in the Data Region can grow by adding another cluster to the chain in the FAT.
Note that each entry can be preceded by "fake entries" that hold its Long File Name. (See further down the article).
Legal characters for DOS file names include the following:
• Upper case letters A–Z
• Numbers 0–9
• Space (though trailing spaces in either the base name or the extension are considered padding and not part of the file name; also, file names containing spaces could not be used on the DOS command line prior to Windows 95 because there was no suitable escaping system)
• ! # $ % & ' ( ) - @ ^ _ ` { } ~
• Values 128–255
This excludes the following ASCII characters:
• " * / : < > ? \ | (Windows/MS-DOS has no shell escape character)
• + , . ; = [ ] (allowed in long file names only)
• Lower case letters a–z (stored as A–Z; allowed in long file names)
• Control characters 0–31
• Value 127 (DEL)
The DOS file names are in the OEM character set.
Directory entries, both in the Root Directory Region and in subdirectories, are of the following format (see also 8.3 filename):
Byte Offset Length Description
0x00 8 DOS file name (padded with spaces)
The first byte can have the following special values:
0x00 Entry is available and no subsequent entry is in use
0x05 Initial character is actually 0xE5.
0x2E 'Dot' entry; either '.' or '..'
0xE5 Entry has been previously erased and is available. File undelete utilities must replace this character with a regular character as part of the undeletion process.
0x08 3 DOS file extension (padded with spaces)
0x0b 1 File Attributes
Bit Mask Description
0 0x01 Read Only
1 0x02 Hidden
2 0x04 System
3 0x08 Volume Label
4 0x10 Subdirectory
5 0x20 Archive
6 0x40 Device (internal use only, never found on disk)
7 0x80 Unused
An attribute value of 0x0F is used to designate a long file name entry.
0x0c 1 Reserved; two bits are used by NT and later versions to encode case information (see below); otherwise 0[32]
0x0d 1 Create time, fine resolution: 10ms units, values from 0 to 199.
0x0e 2 Create time. The hour, minute and second are encoded according to the following bitmap:
Bits Description
15-11 Hours (0-23)
10-5 Minutes (0-59)
4-0 Seconds/2 (0-29)
Note that seconds are recorded only to a 2-second resolution. Finer resolution for file creation is found at offset 0x0d.
0x10 2 Create date. The year, month and day are encoded according to the following bitmap:
Bits Description
15-9 Year (0 = 1980, 127 = 2107)
8-5 Month (1 = January, 12 = December)
4-0 Day (1 - 31)
0x12 2 Last access date; see offset 0x10 for description.
0x14 2 EA-Index (used by OS/2 and NT) in FAT12 and FAT16, High 2 bytes of first cluster number in FAT32
0x16 2 Last modified time; see offset 0x0e for description.
0x18 2 Last modified date; see offset 0x10 for description.
0x1a 2 First cluster in FAT12 and FAT16. Low 2 bytes of first cluster in FAT32. Entries with the Volume Label flag, subdirectory ".." pointing to root, and empty files with size 0 should have first cluster 0.
0x1c 4 File size in bytes. Entries with the Volume Label or Subdirectory flag set should have a size of 0.
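The time and date bitmaps at offsets 0x0e and 0x10, together with the fine-resolution byte at 0x0d, can be unpacked as in the sketch below. The struct fat_datetime and the function name are hypothetical; the bit positions follow the tables above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct fat_datetime {
    unsigned year, month, day;       /* year is the full calendar year */
    unsigned hour, minute, second;
    unsigned millisecond;
};

/* Decode a FAT date word, time word and 10 ms fine-resolution byte. */
void fat_decode_datetime(uint16_t date, uint16_t time, uint8_t fine,
                         struct fat_datetime *out)
{
    assert(out != NULL);
    out->year   = 1980u + ((date >> 9) & 0x7Fu);  /* bits 15-9 */
    out->month  = (date >> 5) & 0x0Fu;            /* bits 8-5  */
    out->day    = date & 0x1Fu;                   /* bits 4-0  */
    out->hour   = (time >> 11) & 0x1Fu;           /* bits 15-11 */
    out->minute = (time >> 5) & 0x3Fu;            /* bits 10-5  */
    /* bits 4-0 hold seconds/2; fine (0-199) adds up to 1.99 s in 10 ms units */
    out->second = (time & 0x1Fu) * 2u + fine / 100u;
    out->millisecond = (fine % 100u) * 10u;
}
```

For the last-modified and last-access fields, which have no fine-resolution byte, the same function can be called with fine set to 0.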
Clusters are numbered starting from 2, as described above, and a file's starting cluster X is stored at offset 0x1a of its directory entry (with the high 2 bytes at offset 0x14 in FAT32). The first sector of the file's data can therefore be calculated from the Boot Sector fields:
For FAT32
FileStartSector = ReservedSectors(0x0e) + (NumofFAT(0x10) * Sectors2FAT(0x24)) + ((X − 2) * SectorsPerCluster(0x0d))
For FAT16/12
FileStartSector = ReservedSectors(0x0e) + (NumofFAT(0x10) * Sectors2FAT(0x16)) + (MaxRootEntry(0x11) * 32 / BytesPerSector(0x0b)) + ((X − 2) * SectorsPerCluster(0x0d))
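Both formulas can be folded into one helper, since the root directory term is simply zero on FAT32. The sketch below assumes the boot sector fields have already been read into a hypothetical fat_geometry struct; it also rounds the root directory size up to a whole sector, which is an assumption added here and not part of the formulas as written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Boot sector fields used by the formulas (offsets in comments). */
struct fat_geometry {
    uint16_t bytes_per_sector;     /* 0x0b */
    uint8_t  sectors_per_cluster;  /* 0x0d */
    uint16_t reserved_sectors;     /* 0x0e */
    uint8_t  num_fats;             /* 0x10 */
    uint16_t max_root_entries;     /* 0x11; 0 on FAT32 */
    uint32_t sectors_per_fat;      /* 0x16 (FAT12/16) or 0x24 (FAT32) */
};

/* First sector of the given cluster, relative to the start of the volume. */
uint32_t fat_cluster_to_sector(const struct fat_geometry *g, uint32_t cluster)
{
    assert(g != NULL && cluster >= 2 && g->bytes_per_sector != 0);

    /* Size of the Root Directory Region in sectors (32 bytes per entry). */
    uint32_t root_sectors =
        ((uint32_t)g->max_root_entries * 32u + g->bytes_per_sector - 1u)
        / g->bytes_per_sector;

    return g->reserved_sectors
         + (uint32_t)g->num_fats * g->sectors_per_fat
         + root_sectors
         + (cluster - 2u) * g->sectors_per_cluster;
}
```

With typical 1.44 MB floppy values (512-byte sectors, 1 reserved sector, 2 FATs of 9 sectors each, 224 root entries), cluster 2 maps to sector 33.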
Long file names
Long File Names (LFN) are stored on a FAT file system using a trick: adding (possibly multiple) additional entries into the directory before the normal file entry. The additional entries are marked with the Volume Label, System, Hidden, and Read Only attributes (yielding 0x0F), which is a combination that is not expected in the MS-DOS environment, and therefore ignored by MS-DOS programs and third-party utilities. Notably, a directory containing only volume labels is considered empty and may be deleted; such a situation arises if files created with long names are deleted from plain DOS.
Older versions of PC-DOS mistake LFN names in the root directory for the volume label, and are likely to display an incorrect label.
Each phony entry can contain up to 13 UTF-16 characters (26 bytes) by using fields in the record that normally contain the file size or time stamps. The starting cluster field is not used for name characters; for compatibility with disk utilities it is set to 0 (see 8.3 filename for additional explanations). Up to 20 of these 13-character entries may be chained, supporting a maximum length of 255 UTF-16 characters.[32]
After the last UTF-16 character, a 0x00 0x00 terminator is added. Any remaining unused character positions are filled with 0xFF 0xFF.
LFN entries use the following format:
Byte Offset Length Description
0x00 1 Sequence Number
0x01 10 Name characters (five UTF-16 characters)
0x0b 1 Attributes (always 0x0F)
0x0c 1 Reserved (always 0x00)
0x0d 1 Checksum of DOS file name
0x0e 12 Name characters (six UTF-16 characters)
0x1a 2 First cluster (always 0x0000)
0x1c 4 Name characters (two UTF-16 characters)
If multiple LFN entries are required to represent a file name, the entry holding the last part of the name comes first on disk. Its sequence number also has bit 6 (0x40) set, marking it as the last LFN entry, even though it is the first entry encountered when reading the directory. The last LFN entry has the largest sequence number, which decreases in the following entries; the first LFN entry has sequence number 1. Bit 7 (0x80) of the sequence number is used to indicate that the entry is deleted.
For example, the file name "File with very long filename.ext" would be stored like this:
Sequence number Entry data
0x43 "me.ext"
0x02 "y long filena"
0x01 "File with ver"
??? Normal 8.3 entry
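To show how the pieces of such a chain fit back together, here is a minimal sketch that pulls the 13 UTF-16 code units out of one LFN entry and works out where they belong in the full name. The function names are hypothetical, and masking the sequence number with 0x1F assumes at most 20 entries, as stated above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy the 13 UTF-16 code units of one 32-byte LFN entry into out[13]. */
void lfn_entry_chars(const uint8_t entry[32], uint16_t out[13])
{
    /* Byte offsets of the three name fields: 5 + 6 + 2 characters. */
    static const uint8_t offs[13] = {
        0x01, 0x03, 0x05, 0x07, 0x09,
        0x0e, 0x10, 0x12, 0x14, 0x16, 0x18,
        0x1c, 0x1e
    };

    assert(entry != NULL && out != NULL);
    assert(entry[0x0b] == 0x0F);          /* must be an LFN entry */
    for (size_t i = 0; i < 13; i++)
        out[i] = (uint16_t)(entry[offs[i]] | (entry[offs[i] + 1] << 8));
}

/* Nonzero if this is the last LFN entry (the first one met on disk). */
int lfn_is_last(uint8_t seq)
{
    return (seq & 0x40u) != 0;
}

/* 0-based position of this entry's first character in the full name. */
size_t lfn_entry_position(uint8_t seq)
{
    unsigned index = seq & 0x1Fu;         /* strip the flag bits */

    assert(index >= 1);
    return (size_t)(index - 1u) * 13u;
}
```

In the example above, the entry with sequence number 0x43 is the last one and its characters start at position 26 of the name; a reader would stop copying at the 0x0000 terminator.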
A checksum also allows verification of whether a long file name matches the 8.3 name; a mismatch could occur if a file was deleted and re-created using DOS in the same directory position. The checksum is calculated using the algorithm below. (Note that pFcbName is a pointer to the name as it appears in a regular directory entry, i.e. the first eight characters are the file name and the last three are the extension. The dot is implicit. Any unused space in the file name is padded with space characters (ASCII 0x20). For example, "Readme.txt" would be "README  TXT".)
/* Compute the checksum stored at offset 0x0d of each LFN entry.
   pFcbName points to the 11-byte, space-padded 8.3 name. */
unsigned char lfn_checksum(const unsigned char *pFcbName)
{
    int i;
    unsigned char sum = 0;

    for (i = 11; i; i--)
        /* rotate sum right by one bit, then add the next name byte */
        sum = ((sum & 1) << 7) + (sum >> 1) + *pFcbName++;
    return sum;
}
If a filename contains only lowercase letters, or is a combination of a lowercase basename with an uppercase extension, or vice-versa; and has no special characters, and fits within the 8.3 limits, a VFAT entry is not created on Windows NT and later versions such as XP. Instead, two bits in byte 0x0c of the directory entry are used to indicate that the filename should be considered as entirely or partially lowercase. Specifically, bit 4 means lowercase extension and bit 3 lowercase basename, which allows for combinations such as "example.TXT" or "HELLO.txt" but not "Mixed.txt". Few other operating systems support this. This creates a backwards-compatibility problem with older Windows versions (95, 98, ME) that see all-uppercase filenames if this extension has been used, and therefore can change the name of a file when it is transported, such as on a USB flash drive. Current 2.6.x versions of Linux will recognize this extension when reading (source: kernel 2.6.18 /fs/fat/dir.c and fs/vfat/namei.c); the mount option shortname determines whether this feature is used when writing.
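The two case bits in byte 0x0c can be applied when building a display name from the 11-byte 8.3 field, roughly as sketched below. The macro and function names are hypothetical, and the sketch deliberately ignores details such as embedded spaces and the 0x05 first-byte escape.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

#define CASE_LOWER_BASE 0x08u  /* bit 3: base name is lowercase */
#define CASE_LOWER_EXT  0x10u  /* bit 4: extension is lowercase */

/* Build "base.ext" from the space-padded 11-byte name and the case byte.
 * out must have room for 8 + 1 + 3 characters plus the terminator. */
void fat_short_name(const uint8_t name[11], uint8_t case_bits, char out[13])
{
    size_t n = 0;

    assert(name != NULL && out != NULL);
    for (size_t i = 0; i < 8 && name[i] != ' '; i++)
        out[n++] = (case_bits & CASE_LOWER_BASE) ? (char)tolower(name[i])
                                                 : (char)name[i];
    if (name[8] != ' ') {                 /* extension present */
        out[n++] = '.';
        for (size_t i = 8; i < 11 && name[i] != ' '; i++)
            out[n++] = (case_bits & CASE_LOWER_EXT) ? (char)tolower(name[i])
                                                    : (char)name[i];
    }
    out[n] = '\0';
}
```

An implementation that ignores byte 0x0c would show the same entries in all uppercase, which is the behaviour described for Windows 95, 98 and ME.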
Third-party extensions
Before Microsoft added support for long filenames and creation/access time stamps, bytes 0x0C–0x15 of the directory entry were used by alternative operating systems to store additional metadata. These included:
Byte Offset Length System Description
0x0C 2 RISC OS File type, 0x000 - 0xFFF
0x0C 1 DOS Plus User-defined file attributes F1-F4:
Bit Mask Description
7 0x80 F1
6 0x40 F2
5 0x20 F3
4 0x10 F4
0x0C 1 MSX-DOS 2 For a deleted file, the original first character of the filename.
0x0D 1 DR-DOS For a deleted file, the original first character of the filename.
0x0E 2 DR-DOS and FlexOS Encrypted file password
0x0E 2 ANDOS File address in the memory
0x10 4 DR-DOS 7 For a deleted file, its original file time and date; deleted files have their normal time and date fields set to the time of deletion
0x12 2 DR-DOS 6 and FlexOS File owner ID
0x14 2 DR-DOS and FlexOS File permissions bitmap (execute permissions are only used by FlexOS):
Bit Mask Description
0 0x0001 Owner delete requires password
1 0x0002 Owner execute requires password
2 0x0004 Owner write requires password
3 0x0008 Owner read requires password
4 0x0010 Group delete requires password
5 0x0020 Group execute requires password
6 0x0040 Group write requires password
7 0x0080 Group read requires password
8 0x0100 World delete requires password
9 0x0200 World execute requires password
10 0x0400 World write requires password
11 0x0800 World read requires password
FAT licensing
Microsoft applied for, and was granted, a series of patents for key parts of the FAT file system in the mid-1990s. Being almost universally compatible and well-understood, FAT is frequently chosen as an interchange format for flash media used in digital cameras and PDAs.
On December 3, 2003 Microsoft announced[34] it would be offering licenses for use of its FAT specification and "associated intellectual property", at the cost of a US$0.25 royalty per unit sold, with a $250,000 maximum royalty per license agreement.[35]
To this end, Microsoft cited four patents on the FAT file system as the basis of its intellectual property claims. All four pertain to long-filename extensions to FAT first seen in Windows 95:
• U.S. Patent 5,745,902 - Method and system for accessing a file using file names having different file name formats. Filed July 6, 1992. This covered a means of generating and associating a short, 8.3 filename with a long one (for example, "Microsoft.txt" with "MICROS~1.TXT") and a means of enumerating conflicting short filenames (for example, "MICROS~2.TXT" and "MICROS~3.TXT"). It is unclear whether this patent would cover an implementation of FAT without explicit long filename capabilities. Hard links in Unix file systems do not appear to be prior art: deleting a FAT file via its long name will also remove its short name. Renaming a file to a "short" name also updates the long file name for coherency; similarly, renaming a file to a "long" name will allocate a new "short" name. In NTFS, hard links and dual names are separate concepts and each hard link has two names. Finally, at the API level, both names are always provided together when a directory lookup is requested from the system; they do not appear as two separate files and do not have to be "matched" to determine unique files.
• U.S. Patent 5,579,517 - Common name space for long and short filenames. Filed on 1995-04-24. This covers the method of chaining together multiple consecutive 8.3 named directory entries to hold long filenames, with some of the entries specially marked to prevent their confusing older, long filename-unaware FAT implementations.
o The Public Patent Foundation successfully challenged this patent; the claims were rejected[36] on 2004-09-14, due to prior disclosure[37] of the claimed techniques in patents U.S. Patent 5,307,494 and U.S. Patent 5,367,671. This decision was later overturned by the Patent Office on 2006-01-10.
• U.S. Patent 5,758,352 - Common name space for long and short filenames. Filed on 1996-09-05. This is very similar to 5,579,517.
o The Public Patent Foundation successfully challenged this patent; the USPTO rejected it on 2005-10-05, on the grounds that "the six assignees names were incorrect".[38][39] This decision was also later overturned by the Patent Office on 2006-01-10.
• U.S. Patent 6,286,013 - Method and system for providing a common name space for long and short file names in an operating system. Filed on 1997-01-28. This makes claims on the methods used when Windows 95, Windows 98 and Windows Me expose long filenames to their MS-DOS compatibility layer. It does not appear to affect any non-Microsoft FAT implementations.
Many technical commentators have concluded that these patents only cover FAT implementations that include support for long filenames, and that removable solid state media and consumer devices only using short names would be unaffected.
Additionally, in the document "Microsoft Extensible Firmware Initiative FAT 32 File System Specification, FAT: General Overview of On-Disk Format" published by Microsoft (version 1.03, 2000-12-06), Microsoft specifically grants a number of rights, which many readers have interpreted as permitting operating system vendors to implement FAT.
Microsoft is not the only company to have applied for patents for parts of the FAT file system. Other patents affecting FAT include:
• U.S. Patent 5,367,671 - System for accessing extended object attribute (EA) data through file name or EA handle linkages in path tables. Filed on 1990-09-25 by Barry A. Feigenbaum and Felix Miro of IBM, this makes claims on the methods used by OS/2, Windows NT, and Linux for storing extended attribute data in the "EA DATA. SF" file.
Appeal
As there were widespread calls for these patents to be re-examined, the Public Patent Foundation (PUBPAT) submitted evidence to the US Patent and Trademark Office (USPTO) disputing the validity of these patents, including prior art references from Xerox and IBM. The USPTO acknowledged that the evidence raised "substantial new question[s] of patentability," and opened an investigation into the validity of Microsoft's FAT patents.[40]
On 2004-09-30 the USPTO rejected all claims of U.S. Patent 5,579,517, based primarily on evidence provided by PUBPAT. Dan Ravicher, the foundation's executive director, said, "The Patent Office has simply confirmed what we already knew for some time now, Microsoft's FAT patent is bogus."
According to the PUBPAT press release, "Microsoft still has the opportunity to respond to the Patent Office's rejection. Typically, third party requests for re-examination, like the one filed by PUBPAT, are successful in having the subject patent either narrowed or completely revoked roughly 70% of the time."
On 2005-10-05 the Patent Office announced that, following the re-examination process, it had again rejected all claims of patent 5,579,517, and it additionally found U.S. Patent 5,758,352 invalid on the grounds that the patent had incorrect assignees.
Finally, on 2006-01-10 the Patent Office ruled that features of Microsoft's implementation of the FAT system were "novel and non-obvious", reversing both earlier non-final decisions.
Patent infringement lawsuit
In February 2009, Microsoft filed a patent infringement lawsuit against TomTom, alleging that the device maker's products infringe on patents related to the FAT32 filesystem. As some TomTom products are based on Linux, this marked the first time that Microsoft tried to enforce its patents against the Linux platform. The lawsuit was settled out of court the following month with an agreement that Microsoft be given access to four of TomTom's patents, that TomTom drop support for the FAT32 filesystem from its products, and that in return Microsoft not seek legal action against TomTom for the five-year duration of the settlement agreement.