File systems differ, but their main goal is the same: to manage disk space while providing efficiency and safety. Distributed environments have corresponding distributed file systems. Linux uses the Ext family; Windows uses FAT and NTFS. The figure below shows the structure of the Linux file system stack.
VFS (Virtual File System) is a layer in the Linux kernel that acts as a simple adapter: it hides the differences between the file systems beneath it and gives the rest of the operating system a unified interface.
The middle part holds the implementations of the individual file systems.
Below that are the Buffer Cache and the device drivers.
File system structure
File system implementations differ in performance, manageability, reliability, and so on. The following is the general structure of a Linux Ext2 (Ext3) file system.
Boot Block: stores the boot code.
Super Block: stores global parameters of the whole file system, such as volume name, state, block size, and block count. It is read into memory when the file system is mounted and released at umount.
The figure shows three important data structures of the Ext2 file system and how they relate.
Inode: the inode is the most important structure in the file system. As the figure shows, it records all the information about a file, which is what we usually call metadata: file type, permissions, owner, size, atime, and so on. The inode also holds the index that points to the actual file content. This index has several kinds of entries:
- Direct index: points directly at the blocks holding the content; there are 12 such entries. So with a 1 KB block size, the direct index covers at most 12 KB of content
- Single indirect index
- Double indirect index
- Triple indirect index
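To make the reach of each index level concrete, here is a rough calculation. It is a sketch under one assumption that the text does not state: each block pointer takes 4 bytes, so an index block holds block_size / 4 pointers. The function name and parameters are made up for illustration.

```python
def max_file_bytes(block_size, direct=12, pointer_size=4):
    """Bytes reachable through each index level of an inode.

    pointer_size=4 is an assumption; it decides how many block
    pointers fit into one index block.
    """
    ptrs = block_size // pointer_size  # pointers per index block
    blocks = {
        "direct": direct,
        "single indirect": ptrs,
        "double indirect": ptrs ** 2,
        "triple indirect": ptrs ** 3,
    }
    return {name: count * block_size for name, count in blocks.items()}


sizes = max_file_bytes(1024)
for name, size in sizes.items():
    print(f"{name:16} {size:>15,} bytes")
print(f"{'total':16} {sum(sizes.values()):>15,} bytes")
```

With 1 KB blocks the direct entries cover 12 KB, as stated above, and all four levels together reach a little over 16 GiB under this assumption.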
Directory: represents a directory in the file system and lists every entry in that directory. Each row carries only two pieces of information: a file name and the corresponding inode. Note that a directory is not a special file system structure. It is itself a file with its own inode; its content is simply the table of file names and inodes described above. The figure illustrates this.
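The name-to-inode table can be observed from user space. The following is a small sketch using Python's standard os.scandir; the helper name list_directory and the file names are made up for the example.

```python
import os
import tempfile


def list_directory(path):
    """Return {file name: inode number} for the entries of a directory."""
    with os.scandir(path) as entries:
        return {entry.name: entry.inode() for entry in entries}


with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.txt", "b.txt"):
        with open(os.path.join(tmp, name), "w") as f:
            f.write("hello")
    table = list_directory(tmp)
    # The directory is itself a file, so it has an inode of its own.
    dir_inode = os.stat(tmp).st_ino

for name, inode in sorted(table.items()):
    print(name, inode)
print("directory inode:", dir_inode)
```

The inode numbers printed depend on the file system, so no particular values should be expected.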
Data Block: the blocks that store file contents. The data block size must be a multiple of the disk's block size; the disk block is generally 512 bytes, and the data block is generally 1 KB, 2 KB, or 4 KB.
Buffer & Cache
Although Buffer and Cache are usually mentioned together, in practice they are quite different. A buffer generally serves writes: it lets many small pieces of data be merged into one large block and written in one go. A cache generally serves reads: it avoids frequent disk reads. The Linux free command also lists buffers and cache separately, and both are counted as memory that is effectively free.
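The write-merging role of a buffer can be sketched in a few lines. This is an illustration only, not how the kernel implements it; the class WriteBuffer, its limit parameter, and the list standing in for the device are all hypothetical.

```python
class WriteBuffer:
    """Collects small writes and passes them on as one large write."""

    def __init__(self, sink, limit=4096):
        self.sink = sink          # callable that performs the real write
        self.limit = limit        # flush once this many bytes are pending
        self.pending = bytearray()

    def write(self, data):
        self.pending += data
        if len(self.pending) >= self.limit:
            self.flush()

    def flush(self):
        if self.pending:
            self.sink(bytes(self.pending))
            self.pending.clear()


device_writes = []                # stands in for the disk
buf = WriteBuffer(device_writes.append, limit=4096)
for _ in range(100):
    buf.write(b"x" * 100)         # 100 small writes, 10,000 bytes in total
buf.flush()
print(len(device_writes), "device writes:", [len(w) for w in device_writes])
```

One hundred small writes reach the stand-in device as only three large ones.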
The Buffer Cache is in essence the same as any other cache, and its data structure is similar too. The figure below shows the structure of the VxFS Buffer Cache.
This data structure closely resembles the buffers in memcached and in the Oracle SGA. The hash chains on the left locate a data block; the lists at the top record the state of each block.
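The combination of hash lookup and an ordered list can be approximated with Python's OrderedDict. This is a simplified sketch, not the VxFS structure: it assumes a plain least-recently-used eviction policy, and BufferCache and fake_disk are names made up for the example.

```python
from collections import OrderedDict


class BufferCache:
    """Hash lookup by block number, with blocks kept in LRU order."""

    def __init__(self, capacity, read_from_disk):
        self.capacity = capacity
        self.read_from_disk = read_from_disk
        self.blocks = OrderedDict()   # block number -> data, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, block_no):
        if block_no in self.blocks:            # found through the hash
            self.hits += 1
            self.blocks.move_to_end(block_no)  # mark as most recently used
            return self.blocks[block_no]
        self.misses += 1
        data = self.read_from_disk(block_no)
        self.blocks[block_no] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)    # evict least recently used
        return data


disk_reads = []


def fake_disk(block_no):
    """Stand-in for a device read; records which blocks were fetched."""
    disk_reads.append(block_no)
    return bytes([block_no % 256]) * 1024


cache = BufferCache(capacity=2, read_from_disk=fake_disk)
for n in (1, 2, 1, 3, 2):
    cache.read(n)
print("hits:", cache.hits, "misses:", cache.misses, "disk reads:", disk_reads)
```

The second read of block 1 is served from memory; block 2 has been evicted by the time it is requested again, so it goes back to the disk.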
Buffer vs Direct I/O
The file system Buffer Cache does improve speed in some cases, but it also has downsides. First, it adds an intermediate layer to the file system. Second, when the cache is misused or misconfigured, or when a workload cannot benefit from caching, the cache becomes a burden.
Workloads that suit the cache: sequential access to large data, such as NFS and FTP.
Workloads that do not suit the cache: random I/O, such as Oracle and reads of small files.
Block device, character device, raw device
These terms are confusing, and I looked for information but could not find a very precise description.
From the hardware point of view:
- A block device sends and receives data in units of blocks (such as disk sectors). It supports buffering and random access (blocks need not be read in order; any block can be accessed at any time). Block devices include hard drives, CD-ROMs, and RAM disks.
- A character device has no physically addressable medium. Character devices include tape devices and serial ports; data on them can only be read character by character.
From the operating system's point of view (corresponding to device file types b and c):
# ls -l /dev/*lv
brw------- 1 root system 22, 2 May 15 2007 lv
crw------- 2 root system 22, 2 May 15 2007 rlv
- A block device supports buffering and random reads and writes. Reads and writes can be of any length, down to 1 byte. On a block device the following command succeeds: dd if=/dev/zero of=/dev/vg01/lv bs=1 count=1, that is, it writes one byte to the device. The hardware does not support this operation (its minimum is 512 bytes), so the operating system first reads a whole unit (for example 1 KB; the operating system's smallest read/write unit is a multiple of the hardware block size), changes the data within that 1 KB, and then writes it back to the device.
- A character device supports only fixed-length reads and writes, where the length is the operating system's smallest read/write unit, such as 1 KB. The buffering that a block device provides is absent here, so users have to do it themselves. Because reads and writes go through no buffer, running dd if=/dev/zero of=/dev/vg01/lv bs=1 count=1 on a character device fails: the bs (block size) is too small for the system to support. Running dd if=/dev/zero of=/dev/vg01/lv bs=1024 count=1 succeeds. The OS block size here is a kernel parameter.
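The read-modify-write step that the operating system performs for a block device, and that a character device leaves to the user, can be simulated in memory. This is a sketch only: SectorDevice is a hypothetical stand-in that accepts whole 512-byte sectors, and no real device is touched.

```python
SECTOR = 512


class SectorDevice:
    """Hypothetical device that only accepts whole, aligned sectors."""

    def __init__(self, sectors):
        self.data = bytearray(sectors * SECTOR)

    def read_sector(self, index):
        return bytes(self.data[index * SECTOR:(index + 1) * SECTOR])

    def write_sector(self, index, payload):
        if len(payload) != SECTOR:
            raise ValueError("payload must be exactly one sector")
        self.data[index * SECTOR:(index + 1) * SECTOR] = payload


def write_bytes(device, offset, payload):
    """Write an arbitrary byte range using read-modify-write."""
    pos = 0
    while pos < len(payload):
        index, start = divmod(offset + pos, SECTOR)
        chunk = payload[pos:pos + SECTOR - start]    # part that fits this sector
        sector = bytearray(device.read_sector(index))   # read
        sector[start:start + len(chunk)] = chunk        # modify
        device.write_sector(index, bytes(sector))       # write back
        pos += len(chunk)


dev = SectorDevice(sectors=4)
write_bytes(dev, 510, b"ABCD")         # spans the boundary of sectors 0 and 1
print(bytes(dev.data[508:516]))

try:
    dev.write_sector(0, b"A")          # unbuffered 1-byte write is rejected
    raw_write_failed = False
except ValueError:
    raw_write_failed = True
print("raw 1-byte write rejected:", raw_write_failed)
```

The helper write_bytes plays the role of the block device buffer; calling write_sector directly with one byte fails, much like the bs=1 case on the character device.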
By comparison, character devices are more direct and block devices are more flexible. File systems are generally built on block devices, while in pursuit of high performance a character device is the better choice, as with Oracle's raw devices.
A raw device, also called a raw partition, is storage space that has not been formatted and carries no file system. Binary content can be written to it, but the format and organization of that content are left to whoever uses it. A file system is built on top of a raw device and takes over the management of its space.
CIO stands for Concurrent I/O. In a file system, when several processes access one file at the same time, they compete for the inode. In general, reads take a shared lock, so multiple reads can run concurrently, and writes take an exclusive lock. While a writer holds the lock, all other operations are blocked, so the performance of the whole application drops sharply. The figure illustrates this.
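The shared-read, exclusive-write behaviour can be modelled with a small readers-writer lock. This is a sketch of the general locking pattern, not of any file system's actual inode lock; the class InodeLock and its methods are invented for the example, and timeout=0 is used so the demonstration never blocks.

```python
import threading


class InodeLock:
    """Many readers may hold the lock at once; a writer holds it alone."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_read(self, timeout=None):
        with self._cond:
            ok = self._cond.wait_for(lambda: not self._writer, timeout)
            if ok:
                self._readers += 1
            return ok

    def release_read(self):
        with self._cond:
            self._readers -= 1
            self._cond.notify_all()

    def acquire_write(self, timeout=None):
        with self._cond:
            ok = self._cond.wait_for(
                lambda: not self._writer and self._readers == 0, timeout)
            if ok:
                self._writer = True
            return ok

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


lock = InodeLock()
two_readers = lock.acquire_read(timeout=0) and lock.acquire_read(timeout=0)
writer_while_reading = lock.acquire_write(timeout=0)   # readers still active
lock.release_read()
lock.release_read()
writer_alone = lock.acquire_write(timeout=0)
reader_while_writing = lock.acquire_read(timeout=0)    # writer still active
lock.release_write()
print(two_readers, writer_while_reading, writer_alone, reader_while_writing)
```

Two readers proceed together, but the writer cannot get in while they are active, and once the writer holds the lock even a reader is turned away.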
CIO is designed to solve this problem, and it brings performance close to that of a raw device. When the file system supports CIO and CIO is turned on, CIO enables the file system's Direct I/O by default: I/O bypasses the buffer and operates on the underlying data directly. Because the data does not pass through the buffer, the file system layer no longer has to deal with data consistency, so reads and writes can run in parallel.
When the data is finally stored, all operations still execute serially; CIO hands that job to the underlying driver.