Talk about the IO (d) - File System

2011-01-09  Source: original to this site  Category: Java  Views: 72

File systems differ, but their main goal is the same: to manage disk space while providing efficiency and safety. Distributed environments have corresponding distributed file systems. Linux has the Ext series; Windows has FAT and NTFS. The structure of the Linux file system stack is as follows.

VFS (Virtual File System) is a Linux kernel module that acts, put simply, as an adapter: it hides the differences between the underlying file systems and presents a unified interface to the rest of the operating system.

The middle layer consists of the individual file system implementations.

Below that are the Buffer Cache and the device drivers.
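The adapter role described above can be illustrated with a small user-space sketch. This is not kernel code: the class names FileSystem, Ext2Like, FatLike and Vfs, the sample paths, and the longest-prefix mount lookup are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class FileSystem(ABC):
    """Hypothetical uniform interface, standing in for what VFS exposes."""

    @abstractmethod
    def read(self, path: str) -> bytes:
        ...


class Ext2Like(FileSystem):
    """Toy file system that stores names exactly as given."""

    def __init__(self):
        self.files = {"/etc/hostname": b"box\n"}

    def read(self, path):
        return self.files[path]


class FatLike(FileSystem):
    """Toy file system that stores names upper-cased."""

    def __init__(self):
        self.files = {"/README.TXT": b"hello"}

    def read(self, path):
        return self.files[path.upper()]


class Vfs:
    """Routes each path to the file system mounted at the longest matching prefix."""

    def __init__(self):
        self._mounts = {}

    def mount(self, mount_point, fs):
        self._mounts[mount_point.rstrip("/")] = fs

    def read(self, path):
        best = max((m for m in self._mounts if path.startswith(m + "/")), key=len)
        return self._mounts[best].read(path[len(best):])


vfs = Vfs()
vfs.mount("/", Ext2Like())
vfs.mount("/mnt/usb", FatLike())
root_data = vfs.read("/etc/hostname")
usb_data = vfs.read("/mnt/usb/readme.txt")
```

The caller uses one read call for both paths and never needs to know which implementation served it.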

File system structure

File system implementations differ in performance, manageability, reliability, and so on. The general structure of the Linux Ext2 (Ext3) file system is as follows.

The Boot Block stores the boot program.

The Super Block stores global parameters of the whole file system, such as volume name, state, block size, and number of blocks. It is read into memory when the file system is mounted and released on umount.
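Some of these global parameters can be inspected from user space. The following is a minimal sketch, assuming a Unix-like system where Python's os.statvfs is available; the helper name superblock_summary is hypothetical, and the values come from the statvfs system call rather than from parsing the on-disk super block.

```python
import os


def superblock_summary(path="/"):
    """Return a few file-system-wide parameters for the file system holding `path`.

    Assumption: Unix-like OS (os.statvfs is not available on Windows).
    """
    st = os.statvfs(path)
    return {
        "block_size": st.f_frsize,    # fundamental block size in bytes
        "total_blocks": st.f_blocks,  # size of the file system in f_frsize units
        "free_blocks": st.f_bfree,    # number of free blocks
        "total_inodes": st.f_files,   # number of inodes
    }


info = superblock_summary("/")
```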

Ext2 has three important data structures, described below together with their relationships: the Inode, the Directory, and the Data Block.

Inode: the inode is the most important structure in the file system. It records all the information about a file, what we usually call the metadata: file type, permissions, owner, size, atime, and so on. The inode also holds the index that points to the actual file content. This index comes in several kinds:

  • Direct index: points directly at the blocks holding the actual content; there are 12 of them. So if the file system block size is 1K, the direct index can address at most 12K of content.
  • Indirect index
  • Double indirect index
  • Triple indirect index
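The 12K figure for the direct index extends naturally to the indirect levels. The sketch below computes how many blocks each level can address, assuming 4-byte block pointers as in Ext2; the function name is hypothetical, and a real file system may impose additional limits that lower the practical maximum file size.

```python
def addressable_blocks(block_size, pointer_size=4, direct=12):
    """Blocks reachable through each level of the inode index.

    Assumption: every indirect block is filled entirely with block pointers
    of `pointer_size` bytes.
    """
    per_block = block_size // pointer_size
    return {
        "direct": direct,
        "indirect": per_block,
        "double_indirect": per_block ** 2,
        "triple_indirect": per_block ** 3,
    }


levels = addressable_blocks(1024)
direct_bytes = levels["direct"] * 1024
total_blocks = sum(levels.values())
```

With a 1K block size this gives 12 direct blocks (12K), 256 blocks through the indirect index, 65,536 through the double indirect index, and 16,777,216 through the triple indirect index.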


Directory represents a file system directory and holds the inode information for everything in that directory. Each entry carries just two pieces of information: a file name and the corresponding inode. Note that a directory is not a special file system structure: it is itself a file with its own inode, and its content is the file-name-to-inode mapping just described.
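The name-to-inode mapping can be observed from user space. This is a small sketch, assuming a Unix-like system and a temporary directory on a file system that supports hard links; the helper name directory_entries and the file names are made up for the example.

```python
import os
import tempfile


def directory_entries(path):
    """Return {file name: inode number} for the entries of one directory."""
    return {
        name: os.lstat(os.path.join(path, name)).st_ino
        for name in os.listdir(path)
    }


with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "a.txt")
    with open(first, "w") as f:
        f.write("hello")
    # A hard link is a second directory entry that refers to the same inode.
    os.link(first, os.path.join(tmp, "b.txt"))
    entries = directory_entries(tmp)
```

Both names map to the same inode number, which shows that the name lives in the directory while the metadata lives in the inode.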

Data Block: the blocks that actually store file content. The data block size must be a multiple of the disk's block size; a disk block is generally 512 bytes, and a data block is generally 1K, 2K, or 4K.

Buffer Cache

Buffer & Cache

Although Buffer and Cache are usually mentioned together, in practice they are quite different. A buffer generally serves writes: it lets many small pieces of data be combined into one large block that is written in a single operation. A cache generally serves reads: it avoids frequent disk reads. The Linux free command, for example, reports the two separately, and both are counted as memory that can be freed.
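The write-combining effect of a buffer can be shown with Python's own buffered I/O layer. This is a user-space analogy rather than the kernel buffer: CountingRaw is a hypothetical in-memory stand-in for a device that counts how many write calls reach it.

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical raw device that records every write call it receives."""

    def __init__(self):
        super().__init__()
        self.writes = 0
        self.data = bytearray()

    def writable(self):
        return True

    def write(self, b):
        self.writes += 1
        self.data += b
        return len(b)


raw = CountingRaw()
buffered = io.BufferedWriter(raw, buffer_size=4096)
for _ in range(100):
    buffered.write(b"x" * 10)  # 100 small writes, 1000 bytes in total
buffered.flush()               # the buffer hands them over together
```

The hundred small writes fit in the 4096-byte buffer, so the raw device sees a single 1000-byte write when the buffer is flushed.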

Buffer Cache

The Buffer Cache is, in essence, the same as any other cache, and its data structures are similar too. The VxFS Buffer Cache is one example.

This data structure is quite similar to those of memcached and the Oracle SGA buffer. Hash chains handle data block addressing, and linked lists record the state of the data blocks.
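A cache built from a hash lookup plus an ordered list can be sketched in a few lines. The following is an illustration only, not the VxFS implementation: the class name, the least-recently-used eviction policy, and the read_block callback are assumptions.

```python
from collections import OrderedDict


class BufferCache:
    """Toy block cache: hash lookup by block number, plus a recency-ordered list.

    Assumption: least-recently-used eviction, which the original text does not specify.
    """

    def __init__(self, capacity, read_block):
        self.capacity = capacity
        self.read_block = read_block  # called on a miss to fetch the block
        self.blocks = OrderedDict()   # hash table and ordered list in one structure
        self.hits = 0
        self.misses = 0

    def get(self, block_no):
        if block_no in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block_no)  # mark as most recently used
            return self.blocks[block_no]
        self.misses += 1
        data = self.read_block(block_no)
        self.blocks[block_no] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)    # evict the least recently used block
        return data


cache = BufferCache(capacity=2, read_block=lambda n: b"block%d" % n)
for n in (1, 2, 1, 3, 2):
    cache.get(n)
```

In this access pattern only the second read of block 1 is served from the cache; block 2 has been evicted by the time it is requested again.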

Buffer vs Direct I/O

The file system Buffer Cache does improve speed in some cases, but it also has a downside. For one thing, it adds an intermediate layer to the file system; for another, when the cache is misused or misconfigured, or when the workload cannot benefit from caching, the cache becomes a burden.

Workloads suited to the cache: large sequential data services, such as NFS and FTP.

Workloads not suited to the cache: random I/O, such as Oracle or reads of small files.

Block device, character device, raw device

These terms are confusing, and I looked for information but could not find a very accurate description.

From the hardware point of view:

  • A block device sends and receives data in units of blocks (such as disk sectors). Block devices support buffering, random access (blocks need not be read in order; any block can be accessed at any time), and other features. Block devices include hard drives, CD-ROMs, and RAM disks.
  • A character device has no physically addressable medium. Character devices include tape devices and serial ports; data on these devices can only be read character by character.

From the operating system's point of view (corresponding to the device file types b and c):

# ls -l /dev/*lv

brw------- 1 root system 22, 2 May 15 2007 lv

crw------- 2 root system 22, 2 May 15 2007 rlv

  • A block device supports buffering and random reads and writes. That is, a read or write can be of any length, down to 1 byte. On a block device, the following command succeeds: dd if=/dev/zero of=/dev/vg01/lv bs=1 count=1, which writes one byte to the device. The hardware does not support such an operation (its minimum is 512 bytes), so the operating system first reads a unit (say 1K; the operating system's minimum read/write unit is a multiple of the hardware's block size), changes the data within that 1K, and then writes it back to the device.
  • A character device supports only reads and writes of a fixed length, where the length is the operating system's minimum read/write unit, such as 1K. The buffering that a block device provides is absent here, and users have to do it themselves. Because reads and writes are unbuffered, running the same dd command with bs=1 count=1 against the character device fails: the bs (block size) is too small for the system to support. With bs=1024 count=1 it succeeds. The OS block size here is a kernel parameter.
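The read-modify-write sequence that makes a one-byte write possible can be simulated in memory. This is a sketch under stated assumptions: BlockDevice is a hypothetical device that accepts only whole 1K blocks, and write_bytes plays the part of the operating system's buffering layer.

```python
BLOCK = 1024  # assumed minimum transfer unit of the simulated device


class BlockDevice:
    """Hypothetical in-memory device that only transfers whole blocks."""

    def __init__(self, n_blocks):
        self.storage = bytearray(n_blocks * BLOCK)
        self.reads = 0
        self.writes = 0

    def read_block(self, n):
        self.reads += 1
        return bytes(self.storage[n * BLOCK:(n + 1) * BLOCK])

    def write_block(self, n, data):
        if len(data) != BLOCK:
            raise ValueError("unbuffered access must use whole blocks")
        self.writes += 1
        self.storage[n * BLOCK:(n + 1) * BLOCK] = data


def write_bytes(dev, offset, payload):
    """Write a payload of any length by reading, patching and rewriting blocks."""
    remaining = bytes(payload)
    pos = offset
    while remaining:
        n, start = divmod(pos, BLOCK)
        chunk = remaining[:BLOCK - start]
        block = bytearray(dev.read_block(n))      # 1. read the whole block
        block[start:start + len(chunk)] = chunk   # 2. change the bytes in memory
        dev.write_block(n, bytes(block))          # 3. write the whole block back
        pos += len(chunk)
        remaining = remaining[len(chunk):]


dev = BlockDevice(n_blocks=4)
write_bytes(dev, 5, b"\xff")  # buffered path: a one-byte write succeeds

try:
    dev.write_block(0, b"\xff")  # unbuffered path: one byte is rejected
    raw_write_failed = False
except ValueError:
    raw_write_failed = True
```

The one-byte write costs a full block read and a full block write, which is the overhead the buffered path trades for flexibility.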

By comparison, character devices are more direct to use, while block devices are more flexible. File systems are generally built on block devices; when high performance is the goal, a character device is the better choice, as with the raw devices used by Oracle.
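The b and c types that ls prints can also be checked programmatically. This is a minimal sketch, assuming a Unix-like system where /dev/null exists; the helper name device_kind is hypothetical.

```python
import os
import stat


def device_kind(path):
    """Classify a path as a block device, a character device, or something else."""
    mode = os.stat(path).st_mode
    if stat.S_ISBLK(mode):
        return "block"
    if stat.S_ISCHR(mode):
        return "character"
    return "other"


null_kind = device_kind("/dev/null")  # /dev/null is a character device
dir_kind = device_kind(".")           # a directory is neither kind of device
```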

Raw device

A raw device, also known as a raw partition, is storage space that has not been formatted and carries no file system. Binary content can be written to it, but the content format, the organization of information, and similar matters are left to whoever uses it. A file system is built on top of a raw device and takes over managing the raw device's space.


CIO is concurrent I/O (Concurrent IO). In a file system, when a file is accessed by multiple processes at the same time, contention for the inode arises. In general, reads take a shared lock, so multiple reads can proceed concurrently, while writes take an exclusive lock. While a writer holds the lock, all other operations are blocked. When this happens, the performance of the whole application drops sharply.
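The shared-read, exclusive-write behavior can be sketched with a small reader-writer lock. This is an illustration of the locking rule only, not how any particular file system implements its inode lock; the class name InodeLock and its methods are made up for the example.

```python
import threading


class InodeLock:
    """Toy reader-writer lock: many concurrent readers or exactly one writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_read(self, blocking=True):
        with self._cond:
            while self._writer:
                if not blocking:
                    return False
                self._cond.wait()
            self._readers += 1
            return True

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self, blocking=True):
        with self._cond:
            while self._writer or self._readers:
                if not blocking:
                    return False
                self._cond.wait()
            self._writer = True
            return True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


lock = InodeLock()
two_readers = lock.acquire_read() and lock.acquire_read()  # reads share the lock
writer_blocked = not lock.acquire_write(blocking=False)    # a writer must wait for readers
lock.release_read()
lock.release_read()
got_write = lock.acquire_write(blocking=False)             # now the writer gets in
reader_blocked = not lock.acquire_read(blocking=False)     # and readers are shut out
lock.release_write()
```

While the writer holds the lock every other operation on the file waits, which is the serialization that CIO sets out to avoid.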

CIO exists to solve this problem, and with CIO a file system can approach the performance of a raw device. When the file system supports CIO and CIO is enabled, CIO turns on the file system's Direct IO by default: I/O operations bypass the buffer and act directly on the underlying data. Because the data does not pass through a buffer, the file system layer does not need to worry about data consistency, so reads and writes can execute in parallel.

When the data is finally stored, all operations are still executed serially; CIO hands that job to the underlying driver.
