author Waldemar Brodkorb <wbrodkorb@conet.de> 2015-06-24 18:19:31 +0200
committer Waldemar Brodkorb <wbrodkorb@conet.de> 2015-06-24 18:19:31 +0200
commit bd63480810c047da6109cfcf3287d058dc0d4190
tree 9723f9fb77029a6055ca79ebf8ea42a43d5afdc5 /target/linux/patches
parent 48e6b5573016036676d9df90f418ec8b083c1cff
parent d94181d688c220f6574587081a3cf84fb1b3fc42
merge
Diffstat (limited to 'target/linux/patches')
-rw-r--r-- target/linux/patches/4.0.6/aufs4.patch 4022
1 files changed, 1825 insertions, 2197 deletions
diff --git a/target/linux/patches/4.0.6/aufs4.patch b/target/linux/patches/4.0.6/aufs4.patch
index 73b035010..db38c850a 100644
--- a/target/linux/patches/4.0.6/aufs4.patch
+++ b/target/linux/patches/4.0.6/aufs4.patch
@@ -1,1553 +1,6 @@
-diff -Nur linux-4.0.4.orig/Documentation/ABI/testing/debugfs-aufs linux-4.0.4/Documentation/ABI/testing/debugfs-aufs
---- linux-4.0.4.orig/Documentation/ABI/testing/debugfs-aufs 1970-01-01 01:00:00.000000000 +0100
-+++ linux-4.0.4/Documentation/ABI/testing/debugfs-aufs 2015-05-30 22:11:30.000000000 +0200
-@@ -0,0 +1,50 @@
-+What: /debug/aufs/si_<id>/
-+Date: March 2009
-+Contact: J. R. Okajima <hooanon05g@gmail.com>
-+Description:
-+ Under /debug/aufs, a directory named si_<id> is created
-+ per aufs mount, where <id> is a unique id generated
-+ internally.
-+
-+What: /debug/aufs/si_<id>/plink
-+Date: Apr 2013
-+Contact: J. R. Okajima <hooanon05g@gmail.com>
-+Description:
-+ It has three lines and shows the information about the
-+ pseudo-link. The first line is a single number
-+ representing a number of buckets. The second line is a
-+ number of pseudo-links per buckets (separated by a
-+ blank). The last line is a single number representing a
-+ total number of psedo-links.
-+ When the aufs mount option 'noplink' is specified, it
-+ will show "1\n0\n0\n".
-+
-+What: /debug/aufs/si_<id>/xib
-+Date: March 2009
-+Contact: J. R. Okajima <hooanon05g@gmail.com>
-+Description:
-+ It shows the consumed blocks by xib (External Inode Number
-+ Bitmap), its block size and file size.
-+ When the aufs mount option 'noxino' is specified, it
-+ will be empty. About XINO files, see the aufs manual.
-+
-+What: /debug/aufs/si_<id>/xino0, xino1 ... xinoN
-+Date: March 2009
-+Contact: J. R. Okajima <hooanon05g@gmail.com>
-+Description:
-+ It shows the consumed blocks by xino (External Inode Number
-+ Translation Table), its link count, block size and file
-+ size.
-+ When the aufs mount option 'noxino' is specified, it
-+ will be empty. About XINO files, see the aufs manual.
-+
-+What: /debug/aufs/si_<id>/xigen
-+Date: March 2009
-+Contact: J. R. Okajima <hooanon05g@gmail.com>
-+Description:
-+ It shows the consumed blocks by xigen (External Inode
-+ Generation Table), its block size and file size.
-+ If CONFIG_AUFS_EXPORT is disabled, this entry will not
-+ be created.
-+ When the aufs mount option 'noxino' is specified, it
-+ will be empty. About XINO files, see the aufs manual.
-diff -Nur linux-4.0.4.orig/Documentation/ABI/testing/sysfs-aufs linux-4.0.4/Documentation/ABI/testing/sysfs-aufs
---- linux-4.0.4.orig/Documentation/ABI/testing/sysfs-aufs 1970-01-01 01:00:00.000000000 +0100
-+++ linux-4.0.4/Documentation/ABI/testing/sysfs-aufs 2015-05-30 22:11:29.000000000 +0200
-@@ -0,0 +1,31 @@
-+What: /sys/fs/aufs/si_<id>/
-+Date: March 2009
-+Contact: J. R. Okajima <hooanon05g@gmail.com>
-+Description:
-+ Under /sys/fs/aufs, a directory named si_<id> is created
-+ per aufs mount, where <id> is a unique id generated
-+ internally.
-+
-+What: /sys/fs/aufs/si_<id>/br0, br1 ... brN
-+Date: March 2009
-+Contact: J. R. Okajima <hooanon05g@gmail.com>
-+Description:
-+ It shows the abolute path of a member directory (which
-+ is called branch) in aufs, and its permission.
-+
-+What: /sys/fs/aufs/si_<id>/brid0, brid1 ... bridN
-+Date: July 2013
-+Contact: J. R. Okajima <hooanon05g@gmail.com>
-+Description:
-+ It shows the id of a member directory (which is called
-+ branch) in aufs.
-+
-+What: /sys/fs/aufs/si_<id>/xi_path
-+Date: March 2009
-+Contact: J. R. Okajima <hooanon05g@gmail.com>
-+Description:
-+ It shows the abolute path of XINO (External Inode Number
-+ Bitmap, Translation Table and Generation Table) file
-+ even if it is the default path.
-+ When the aufs mount option 'noxino' is specified, it
-+ will be empty. About XINO files, see the aufs manual.
-diff -Nur linux-4.0.4.orig/Documentation/filesystems/aufs/design/01intro.txt linux-4.0.4/Documentation/filesystems/aufs/design/01intro.txt
---- linux-4.0.4.orig/Documentation/filesystems/aufs/design/01intro.txt 1970-01-01 01:00:00.000000000 +0100
-+++ linux-4.0.4/Documentation/filesystems/aufs/design/01intro.txt 2015-05-30 22:11:30.000000000 +0200
-@@ -0,0 +1,157 @@
-+
-+# Copyright (C) 2005-2015 Junjiro R. Okajima
-+
-+Introduction
-+----------------------------------------
-+
-+aufs [ei ju: ef es] | [a u f s]
-+1. abbrev. for "advanced multi-layered unification filesystem".
-+2. abbrev. for "another unionfs".
-+3. abbrev. for "auf das" in German which means "on the" in English.
-+ Ex. "Butter aufs Brot"(G) means "butter onto bread"(E).
-+ But "Filesystem aufs Filesystem" is hard to understand.
-+
-+AUFS is a filesystem with features:
-+- multi layered stackable unification filesystem, the member directory
-+ is called as a branch.
-+- branch permission and attribute, 'readonly', 'real-readonly',
-+ 'readwrite', 'whiteout-able', 'link-able whiteout', etc. and their
-+ combination.
-+- internal "file copy-on-write".
-+- logical deletion, whiteout.
-+- dynamic branch manipulation, adding, deleting and changing permission.
-+- allow bypassing aufs, user's direct branch access.
-+- external inode number translation table and bitmap which maintains the
-+ persistent aufs inode number.
-+- seekable directory, including NFS readdir.
-+- file mapping, mmap and sharing pages.
-+- pseudo-link, hardlink over branches.
-+- loopback mounted filesystem as a branch.
-+- several policies to select one among multiple writable branches.
-+- revert a single systemcall when an error occurs in aufs.
-+- and more...
-+
-+
-+Multi Layered Stackable Unification Filesystem
-+----------------------------------------------------------------------
-+Most people already knows what it is.
-+It is a filesystem which unifies several directories and provides a
-+merged single directory. When users access a file, the access will be
-+passed/re-directed/converted (sorry, I am not sure which English word is
-+correct) to the real file on the member filesystem. The member
-+filesystem is called 'lower filesystem' or 'branch' and has a mode
-+'readonly' and 'readwrite.' And the deletion for a file on the lower
-+readonly branch is handled by creating 'whiteout' on the upper writable
-+branch.
-+
-+On LKML, there have been discussions about UnionMount (Jan Blunck,
-+Bharata B Rao and Valerie Aurora) and Unionfs (Erez Zadok). They took
-+different approaches to implement the merged-view.
-+The former tries putting it into VFS, and the latter implements as a
-+separate filesystem.
-+(If I misunderstand about these implementations, please let me know and
-+I shall correct it. Because it is a long time ago when I read their
-+source files last time).
-+
-+UnionMount's approach will be able to small, but may be hard to share
-+branches between several UnionMount since the whiteout in it is
-+implemented in the inode on branch filesystem and always
-+shared. According to Bharata's post, readdir does not seems to be
-+finished yet.
-+There are several missing features known in this implementations such as
-+- for users, the inode number may change silently. eg. copy-up.
-+- link(2) may break by copy-up.
-+- read(2) may get an obsoleted filedata (fstat(2) too).
-+- fcntl(F_SETLK) may be broken by copy-up.
-+- unnecessary copy-up may happen, for example mmap(MAP_PRIVATE) after
-+ open(O_RDWR).
-+
-+In linux-3.18, "overlay" filesystem (formerly known as "overlayfs") was
-+merged into mainline. This is another implementation of UnionMount as a
-+separated filesystem. All the limitations and known problems which
-+UnionMount are equally inherited to "overlay" filesystem.
-+
-+Unionfs has a longer history. When I started implementing a stackable
-+filesystem (Aug 2005), it already existed. It has virtual super_block,
-+inode, dentry and file objects and they have an array pointing lower
-+same kind objects. After contributing many patches for Unionfs, I
-+re-started my project AUFS (Jun 2006).
-+
-+In AUFS, the structure of filesystem resembles to Unionfs, but I
-+implemented my own ideas, approaches and enhancements and it became
-+totally different one.
-+
-+Comparing DM snapshot and fs based implementation
-+- the number of bytes to be copied between devices is much smaller.
-+- the type of filesystem must be one and only.
-+- the fs must be writable, no readonly fs, even for the lower original
-+ device. so the compression fs will not be usable. but if we use
-+ loopback mount, we may address this issue.
-+ for instance,
-+ mount /cdrom/squashfs.img /sq
-+ losetup /sq/ext2.img
-+ losetup /somewhere/cow
-+ dmsetup "snapshot /dev/loop0 /dev/loop1 ..."
-+- it will be difficult (or needs more operations) to extract the
-+ difference between the original device and COW.
-+- DM snapshot-merge may help a lot when users try merging. in the
-+ fs-layer union, users will use rsync(1).
-+
-+You may want to read my old paper "Filesystems in LiveCD"
-+(http://aufs.sourceforge.net/aufs2/report/sq/sq.pdf).
-+
-+
-+Several characters/aspects/persona of aufs
-+----------------------------------------------------------------------
-+
-+Aufs has several characters, aspects or persona.
-+1. a filesystem, callee of VFS helper
-+2. sub-VFS, caller of VFS helper for branches
-+3. a virtual filesystem which maintains persistent inode number
-+4. reader/writer of files on branches such like an application
-+
-+1. Callee of VFS Helper
-+As an ordinary linux filesystem, aufs is a callee of VFS. For instance,
-+unlink(2) from an application reaches sys_unlink() kernel function and
-+then vfs_unlink() is called. vfs_unlink() is one of VFS helper and it
-+calls filesystem specific unlink operation. Actually aufs implements the
-+unlink operation but it behaves like a redirector.
-+
-+2. Caller of VFS Helper for Branches
-+aufs_unlink() passes the unlink request to the branch filesystem as if
-+it were called from VFS. So the called unlink operation of the branch
-+filesystem acts as usual. As a caller of VFS helper, aufs should handle
-+every necessary pre/post operation for the branch filesystem.
-+- acquire the lock for the parent dir on a branch
-+- lookup in a branch
-+- revalidate dentry on a branch
-+- mnt_want_write() for a branch
-+- vfs_unlink() for a branch
-+- mnt_drop_write() for a branch
-+- release the lock on a branch
-+
-+3. Persistent Inode Number
-+One of the most important issue for a filesystem is to maintain inode
-+numbers. This is particularly important to support exporting a
-+filesystem via NFS. Aufs is a virtual filesystem which doesn't have a
-+backend block device for its own. But some storage is necessary to
-+keep and maintain the inode numbers. It may be a large space and may not
-+suit to keep in memory. Aufs rents some space from its first writable
-+branch filesystem (by default) and creates file(s) on it. These files
-+are created by aufs internally and removed soon (currently) keeping
-+opened.
-+Note: Because these files are removed, they are totally gone after
-+ unmounting aufs. It means the inode numbers are not persistent
-+ across unmount or reboot. I have a plan to make them really
-+ persistent which will be important for aufs on NFS server.
-+
-+4. Read/Write Files Internally (copy-on-write)
-+Because a branch can be readonly, when you write a file on it, aufs will
-+"copy-up" it to the upper writable branch internally. And then write the
-+originally requested thing to the file. Generally kernel doesn't
-+open/read/write file actively. In aufs, even a single write may cause a
-+internal "file copy". This behaviour is very similar to cp(1) command.
-+
-+Some people may think it is better to pass such work to user space
-+helper, instead of doing in kernel space. Actually I am still thinking
-+about it. But currently I have implemented it in kernel space.
-diff -Nur linux-4.0.4.orig/Documentation/filesystems/aufs/design/02struct.txt linux-4.0.4/Documentation/filesystems/aufs/design/02struct.txt
---- linux-4.0.4.orig/Documentation/filesystems/aufs/design/02struct.txt 1970-01-01 01:00:00.000000000 +0100
-+++ linux-4.0.4/Documentation/filesystems/aufs/design/02struct.txt 2015-05-30 22:11:30.000000000 +0200
-@@ -0,0 +1,245 @@
-+
-+# Copyright (C) 2005-2015 Junjiro R. Okajima
-+
-+Basic Aufs Internal Structure
-+
-+Superblock/Inode/Dentry/File Objects
-+----------------------------------------------------------------------
-+As like an ordinary filesystem, aufs has its own
-+superblock/inode/dentry/file objects. All these objects have a
-+dynamically allocated array and store the same kind of pointers to the
-+lower filesystem, branch.
-+For example, when you build a union with one readwrite branch and one
-+readonly, mounted /au, /rw and /ro respectively.
-+- /au = /rw + /ro
-+- /ro/fileA exists but /rw/fileA
-+
-+Aufs lookup operation finds /ro/fileA and gets dentry for that. These
-+pointers are stored in a aufs dentry. The array in aufs dentry will be,
-+- [0] = NULL (because /rw/fileA doesn't exist)
-+- [1] = /ro/fileA
-+
-+This style of an array is essentially same to the aufs
-+superblock/inode/dentry/file objects.
-+
-+Because aufs supports manipulating branches, ie. add/delete/change
-+branches dynamically, these objects has its own generation. When
-+branches are changed, the generation in aufs superblock is
-+incremented. And a generation in other object are compared when it is
-+accessed. When a generation in other objects are obsoleted, aufs
-+refreshes the internal array.
-+
-+
-+Superblock
-+----------------------------------------------------------------------
-+Additionally aufs superblock has some data for policies to select one
-+among multiple writable branches, XIB files, pseudo-links and kobject.
-+See below in detail.
-+About the policies which supports copy-down a directory, see
-+wbr_policy.txt too.
-+
-+
-+Branch and XINO(External Inode Number Translation Table)
-+----------------------------------------------------------------------
-+Every branch has its own xino (external inode number translation table)
-+file. The xino file is created and unlinked by aufs internally. When two
-+members of a union exist on the same filesystem, they share the single
-+xino file.
-+The struct of a xino file is simple, just a sequence of aufs inode
-+numbers which is indexed by the lower inode number.
-+In the above sample, assume the inode number of /ro/fileA is i111 and
-+aufs assigns the inode number i999 for fileA. Then aufs writes 999 as
-+4(8) bytes at 111 * 4(8) bytes offset in the xino file.
-+
-+When the inode numbers are not contiguous, the xino file will be sparse
-+which has a hole in it and doesn't consume as much disk space as it
-+might appear. If your branch filesystem consumes disk space for such
-+holes, then you should specify 'xino=' option at mounting aufs.
-+
-+Aufs has a mount option to free the disk blocks for such holes in XINO
-+files on tmpfs or ramdisk. But it is not so effective actually. If you
-+meet a problem of disk shortage due to XINO files, then you should try
-+"tmpfs-ino.patch" (and "vfs-ino.patch" too) in aufs4-standalone.git.
-+The patch localizes the assignment inumbers per tmpfs-mount and avoid
-+the holes in XINO files.
-+
-+Also a writable branch has three kinds of "whiteout bases". All these
-+are existed when the branch is joined to aufs, and their names are
-+whiteout-ed doubly, so that users will never see their names in aufs
-+hierarchy.
-+1. a regular file which will be hardlinked to all whiteouts.
-+2. a directory to store a pseudo-link.
-+3. a directory to store an "orphan"-ed file temporary.
-+
-+1. Whiteout Base
-+ When you remove a file on a readonly branch, aufs handles it as a
-+ logical deletion and creates a whiteout on the upper writable branch
-+ as a hardlink of this file in order not to consume inode on the
-+ writable branch.
-+2. Pseudo-link Dir
-+ See below, Pseudo-link.
-+3. Step-Parent Dir
-+ When "fileC" exists on the lower readonly branch only and it is
-+ opened and removed with its parent dir, and then user writes
-+ something into it, then aufs copies-up fileC to this
-+ directory. Because there is no other dir to store fileC. After
-+ creating a file under this dir, the file is unlinked.
-+
-+Because aufs supports manipulating branches, ie. add/delete/change
-+dynamically, a branch has its own id. When the branch order changes,
-+aufs finds the new index by searching the branch id.
-+
-+
-+Pseudo-link
-+----------------------------------------------------------------------
-+Assume "fileA" exists on the lower readonly branch only and it is
-+hardlinked to "fileB" on the branch. When you write something to fileA,
-+aufs copies-up it to the upper writable branch. Additionally aufs
-+creates a hardlink under the Pseudo-link Directory of the writable
-+branch. The inode of a pseudo-link is kept in aufs super_block as a
-+simple list. If fileB is read after unlinking fileA, aufs returns
-+filedata from the pseudo-link instead of the lower readonly
-+branch. Because the pseudo-link is based upon the inode, to keep the
-+inode number by xino (see above) is essentially necessary.
-+
-+All the hardlinks under the Pseudo-link Directory of the writable branch
-+should be restored in a proper location later. Aufs provides a utility
-+to do this. The userspace helpers executed at remounting and unmounting
-+aufs by default.
-+During this utility is running, it puts aufs into the pseudo-link
-+maintenance mode. In this mode, only the process which began the
-+maintenance mode (and its child processes) is allowed to operate in
-+aufs. Some other processes which are not related to the pseudo-link will
-+be allowed to run too, but the rest have to return an error or wait
-+until the maintenance mode ends. If a process already acquires an inode
-+mutex (in VFS), it has to return an error.
-+
-+
-+XIB(external inode number bitmap)
-+----------------------------------------------------------------------
-+Addition to the xino file per a branch, aufs has an external inode number
-+bitmap in a superblock object. It is also an internal file such like a
-+xino file.
-+It is a simple bitmap to mark whether the aufs inode number is in-use or
-+not.
-+To reduce the file I/O, aufs prepares a single memory page to cache xib.
-+
-+As well as XINO files, aufs has a feature to truncate/refresh XIB to
-+reduce the number of consumed disk blocks for these files.
-+
-+
-+Virtual or Vertical Dir, and Readdir in Userspace
-+----------------------------------------------------------------------
-+In order to support multiple layers (branches), aufs readdir operation
-+constructs a virtual dir block on memory. For readdir, aufs calls
-+vfs_readdir() internally for each dir on branches, merges their entries
-+with eliminating the whiteout-ed ones, and sets it to file (dir)
-+object. So the file object has its entry list until it is closed. The
-+entry list will be updated when the file position is zero and becomes
-+obsoleted. This decision is made in aufs automatically.
-+
-+The dynamically allocated memory block for the name of entries has a
-+unit of 512 bytes (by default) and stores the names contiguously (no
-+padding). Another block for each entry is handled by kmem_cache too.
-+During building dir blocks, aufs creates hash list and judging whether
-+the entry is whiteouted by its upper branch or already listed.
-+The merged result is cached in the corresponding inode object and
-+maintained by a customizable life-time option.
-+
-+Some people may call it can be a security hole or invite DoS attack
-+since the opened and once readdir-ed dir (file object) holds its entry
-+list and becomes a pressure for system memory. But I'd say it is similar
-+to files under /proc or /sys. The virtual files in them also holds a
-+memory page (generally) while they are opened. When an idea to reduce
-+memory for them is introduced, it will be applied to aufs too.
-+For those who really hate this situation, I've developed readdir(3)
-+library which operates this merging in userspace. You just need to set
-+LD_PRELOAD environment variable, and aufs will not consume no memory in
-+kernel space for readdir(3).
-+
-+
-+Workqueue
-+----------------------------------------------------------------------
-+Aufs sometimes requires privilege access to a branch. For instance,
-+in copy-up/down operation. When a user process is going to make changes
-+to a file which exists in the lower readonly branch only, and the mode
-+of one of ancestor directories may not be writable by a user
-+process. Here aufs copy-up the file with its ancestors and they may
-+require privilege to set its owner/group/mode/etc.
-+This is a typical case of a application character of aufs (see
-+Introduction).
-+
-+Aufs uses workqueue synchronously for this case. It creates its own
-+workqueue. The workqueue is a kernel thread and has privilege. Aufs
-+passes the request to call mkdir or write (for example), and wait for
-+its completion. This approach solves a problem of a signal handler
-+simply.
-+If aufs didn't adopt the workqueue and changed the privilege of the
-+process, then the process may receive the unexpected SIGXFSZ or other
-+signals.
-+
-+Also aufs uses the system global workqueue ("events" kernel thread) too
-+for asynchronous tasks, such like handling inotify/fsnotify, re-creating a
-+whiteout base and etc. This is unrelated to a privilege.
-+Most of aufs operation tries acquiring a rw_semaphore for aufs
-+superblock at the beginning, at the same time waits for the completion
-+of all queued asynchronous tasks.
-+
-+
-+Whiteout
-+----------------------------------------------------------------------
-+The whiteout in aufs is very similar to Unionfs's. That is represented
-+by its filename. UnionMount takes an approach of a file mode, but I am
-+afraid several utilities (find(1) or something) will have to support it.
-+
-+Basically the whiteout represents "logical deletion" which stops aufs to
-+lookup further, but also it represents "dir is opaque" which also stop
-+further lookup.
-+
-+In aufs, rmdir(2) and rename(2) for dir uses whiteout alternatively.
-+In order to make several functions in a single systemcall to be
-+revertible, aufs adopts an approach to rename a directory to a temporary
-+unique whiteouted name.
-+For example, in rename(2) dir where the target dir already existed, aufs
-+renames the target dir to a temporary unique whiteouted name before the
-+actual rename on a branch, and then handles other actions (make it opaque,
-+update the attributes, etc). If an error happens in these actions, aufs
-+simply renames the whiteouted name back and returns an error. If all are
-+succeeded, aufs registers a function to remove the whiteouted unique
-+temporary name completely and asynchronously to the system global
-+workqueue.
-+
-+
-+Copy-up
-+----------------------------------------------------------------------
-+It is a well-known feature or concept.
-+When user modifies a file on a readonly branch, aufs operate "copy-up"
-+internally and makes change to the new file on the upper writable branch.
-+When the trigger systemcall does not update the timestamps of the parent
-+dir, aufs reverts it after copy-up.
-+
-+
-+Move-down (aufs3.9 and later)
-+----------------------------------------------------------------------
-+"Copy-up" is one of the essential feature in aufs. It copies a file from
-+the lower readonly branch to the upper writable branch when a user
-+changes something about the file.
-+"Move-down" is an opposite action of copy-up. Basically this action is
-+ran manually instead of automatically and internally.
-+For desgin and implementation, aufs has to consider these issues.
-+- whiteout for the file may exist on the lower branch.
-+- ancestor directories may not exist on the lower branch.
-+- diropq for the ancestor directories may exist on the upper branch.
-+- free space on the lower branch will reduce.
-+- another access to the file may happen during moving-down, including
-+ UDBA (see "Revalidate Dentry and UDBA").
-+- the file should not be hard-linked nor pseudo-linked. they should be
-+ handled by auplink utility later.
-+
-+Sometimes users want to move-down a file from the upper writable branch
-+to the lower readonly or writable branch. For instance,
-+- the free space of the upper writable branch is going to run out.
-+- create a new intermediate branch between the upper and lower branch.
-+- etc.
-+
-+For this purpose, use "aumvdown" command in aufs-util.git.
-diff -Nur linux-4.0.4.orig/Documentation/filesystems/aufs/design/03atomic_open.txt linux-4.0.4/Documentation/filesystems/aufs/design/03atomic_open.txt
---- linux-4.0.4.orig/Documentation/filesystems/aufs/design/03atomic_open.txt 1970-01-01 01:00:00.000000000 +0100
-+++ linux-4.0.4/Documentation/filesystems/aufs/design/03atomic_open.txt 2015-05-30 22:11:31.000000000 +0200
-@@ -0,0 +1,72 @@
-+
-+# Copyright (C) 2015 Junjiro R. Okajima
-+
-+Support for a branch who has its ->atomic_open()
-+----------------------------------------------------------------------
-+The filesystems who implement its ->atomic_open() are not majority. For
-+example NFSv4 does, and aufs should call NFSv4 ->atomic_open,
-+particularly for open(O_CREAT|O_EXCL, 0400) case. Other than
-+->atomic_open(), NFSv4 returns an error for this open(2). While I am not
-+sure whether all filesystems who have ->atomic_open() behave like this,
-+but NFSv4 surely returns the error.
-+
-+In order to support ->atomic_open() for aufs, there are a few
-+approaches.
-+
-+A. Introduce aufs_atomic_open()
-+ - calls one of VFS:do_last(), lookup_open() or atomic_open() for
-+ branch fs.
-+B. Introduce aufs_atomic_open() calling create, open and chmod. this is
-+ an aufs user Pip Cet's approach
-+ - calls aufs_create(), VFS finish_open() and notify_change().
-+ - pass fake-mode to finish_open(), and then correct the mode by
-+ notify_change().
-+C. Extend aufs_open() to call branch fs's ->atomic_open()
-+ - no aufs_atomic_open().
-+ - aufs_lookup() registers the TID to an aufs internal object.
-+ - aufs_create() does nothing when the matching TID is registered, but
-+ registers the mode.
-+ - aufs_open() calls branch fs's ->atomic_open() when the matching
-+ TID is registered.
-+D. Extend aufs_open() to re-try branch fs's ->open() with superuser's
-+ credential
-+ - no aufs_atomic_open().
-+ - aufs_create() registers the TID to an internal object. this info
-+ represents "this process created this file just now."
-+ - when aufs gets EACCES from branch fs's ->open(), then confirm the
-+ registered TID and re-try open() with superuser's credential.
-+
-+Pros and cons for each approach.
-+
-+A.
-+ - straightforward but highly depends upon VFS internal.
-+ - the atomic behavaiour is kept.
-+ - some of parameters such as nameidata are hard to reproduce for
-+ branch fs.
-+ - large overhead.
-+B.
-+ - easy to implement.
-+ - the atomic behavaiour is lost.
-+C.
-+ - the atomic behavaiour is kept.
-+ - dirty and tricky.
-+ - VFS checks whether the file is created correctly after calling
-+ ->create(), which means this approach doesn't work.
-+D.
-+ - easy to implement.
-+ - the atomic behavaiour is lost.
-+ - to open a file with superuser's credential and give it to a user
-+ process is a bad idea, since the file object keeps the credential
-+ in it. It may affect LSM or something. This approach doesn't work
-+ either.
-+
-+The approach A is ideal, but it hard to implement. So here is a
-+variation of A, which is to be implemented.
-+
-+A-1. Introduce aufs_atomic_open()
-+ - calls branch fs ->atomic_open() if exists. otherwise calls
-+ vfs_create() and finish_open().
-+ - the demerit is that the several checks after branch fs
-+ ->atomic_open() are lost. in the ordinary case, the checks are
-+ done by VFS:do_last(), lookup_open() and atomic_open(). some can
-+ be implemented in aufs, but not all I am afraid.
-diff -Nur linux-4.0.4.orig/Documentation/filesystems/aufs/design/03lookup.txt linux-4.0.4/Documentation/filesystems/aufs/design/03lookup.txt
---- linux-4.0.4.orig/Documentation/filesystems/aufs/design/03lookup.txt 1970-01-01 01:00:00.000000000 +0100
-+++ linux-4.0.4/Documentation/filesystems/aufs/design/03lookup.txt 2015-05-30 22:11:30.000000000 +0200
-@@ -0,0 +1,100 @@
-+
-+# Copyright (C) 2005-2015 Junjiro R. Okajima
-+
-+Lookup in a Branch
-+----------------------------------------------------------------------
-+Since aufs has a character of sub-VFS (see Introduction), it operates
-+lookup for branches as VFS does. It may be a heavy work. But almost all
-+lookup operation in aufs is the simplest case, ie. lookup only an entry
-+directly connected to its parent. Digging down the directory hierarchy
-+is unnecessary. VFS has a function lookup_one_len() for that use, and
-+aufs calls it.
-+
-+When a branch is a remote filesystem, aufs basically relies upon its
-+->d_revalidate(), also aufs forces the hardest revalidate tests for
-+them.
-+For d_revalidate, aufs implements three levels of revalidate tests. See
-+"Revalidate Dentry and UDBA" in detail.
-+
-+
-+Test Only the Highest One for the Directory Permission (dirperm1 option)
-+----------------------------------------------------------------------
-+Let's try case study.
-+- aufs has two branches, upper readwrite and lower readonly.
-+ /au = /rw + /ro
-+- "dirA" exists under /ro, but /rw. and its mode is 0700.
-+- user invoked "chmod a+rx /au/dirA"
-+- the internal copy-up is activated and "/rw/dirA" is created and its
-+ permission bits are set to world readable.
-+- then "/au/dirA" becomes world readable?
-+
-+In this case, /ro/dirA is still 0700 since it exists in readonly branch,
-+or it may be a natively readonly filesystem. If aufs respects the lower
-+branch, it should not respond readdir request from other users. But user
-+allowed it by chmod. Should really aufs rejects showing the entries
-+under /ro/dirA?
-+
-+To be honest, I don't have a good solution for this case. So aufs
-+implements 'dirperm1' and 'nodirperm1' mount options, and leave it to
-+users.
-+When dirperm1 is specified, aufs checks only the highest one for the
-+directory permission, and shows the entries. Otherwise, as usual, checks
-+every dir existing on all branches and rejects the request.
-+
-+As a side effect, dirperm1 option improves the performance of aufs
-+because the number of permission check is reduced when the number of
-+branch is many.
-+
-+
-+Revalidate Dentry and UDBA (User's Direct Branch Access)
-+----------------------------------------------------------------------
-+Generally VFS helpers re-validate a dentry as a part of lookup.
-+0. digging down the directory hierarchy.
-+1. lock the parent dir by its i_mutex.
-+2. lookup the final (child) entry.
-+3. revalidate it.
-+4. call the actual operation (create, unlink, etc.)
-+5. unlock the parent dir
-+
-+If the filesystem implements its ->d_revalidate() (step 3), then it is
-+called. Aufs does implement it, and checks that the dentry on a branch is
-+still valid.
-+But that is not enough, because aufs has to release the lock for the
-+parent dir on a branch at the end of ->lookup() (step 2) and
-+->d_revalidate() (step 3) while the i_mutex of the aufs dir is still
-+held by VFS.
-+If the file on a branch is changed directly, e.g. bypassing aufs, after
-+aufs released the lock, then the subsequent operation may cause
-+an unpleasant result.
-+
-+This situation is a result of the VFS architecture, in which ->lookup() and
-+->d_revalidate() are separated. I am not saying that is wrong. It is a good
-+design from the VFS's point of view. It is just not suitable for the sub-VFS
-+character of aufs.
-+
-+Aufs supports such cases with three levels of revalidation, which are
-+selectable by the user.
-+1. Simple Revalidate
-+ In addition to the native flow in VFS, confirm the child-parent
-+ relationship on the branch just after locking the parent dir on the
-+ branch in the "actual operation" (step 4). When this validation
-+ fails, aufs returns EBUSY. ->d_revalidate() (step 3) in aufs still
-+ checks the validity of the dentry on branches.
-+2. Monitor Changes Internally by Inotify/Fsnotify
-+ In addition to the above, in the "actual operation" (step 4) aufs looks up
-+ the dentry on the branch again, and returns EBUSY if it finds a different
-+ dentry.
-+ Additionally, aufs sets the inotify/fsnotify watch for every dir on branches
-+ while it is in cache. When the event is notified, aufs registers a
-+ function to the kernel 'events' thread by schedule_work(). The
-+ function sets some special status on the cached aufs dentry and inode
-+ private data. If they are not cached, then aufs has nothing to
-+ do. When the same file is accessed through aufs (step 0-3) later,
-+ aufs will detect the status and refresh all necessary data.
-+ In this mode, aufs has to ignore the events which are fired by aufs
-+ itself.
-+3. No Extra Validation
-+ This is the simplest level. It doesn't add any revalidation
-+ test, and skips the revalidation in step 4. It is useful and improves
-+ aufs performance when the system reliably hides the aufs branches from users,
-+ by over-mounting something (or by another method).
-diff -Nur linux-4.0.4.orig/Documentation/filesystems/aufs/design/04branch.txt linux-4.0.4/Documentation/filesystems/aufs/design/04branch.txt
---- linux-4.0.4.orig/Documentation/filesystems/aufs/design/04branch.txt 1970-01-01 01:00:00.000000000 +0100
-+++ linux-4.0.4/Documentation/filesystems/aufs/design/04branch.txt 2015-05-30 22:11:30.000000000 +0200
-@@ -0,0 +1,61 @@
-+
-+# Copyright (C) 2005-2015 Junjiro R. Okajima
-+
-+Branch Manipulation
-+
-+Since aufs supports dynamic branch manipulation, i.e. adding/removing a branch
-+and changing its permission/attribute, there is a lot of work to do.
-+
-+
-+Add a Branch
-+----------------------------------------------------------------------
-+o Confirm the dir being added exists outside of aufs, including loopback
-+ mounts, and check its various attributes.
-+o Initialize the xino file and whiteout bases if necessary.
-+ See struct.txt.
-+
-+o Check the owner/group/mode of the directory
-+ When the owner/group/mode of the directory being added differs from the
-+ existing branch, aufs issues a warning because it may pose a
-+ security risk.
-+ For example, when an upper writable branch has a world writable empty
-+ top directory, a malicious user can create any files on the writable
-+ branch directly, like copy-up and modify manually. If something like
-+ /etc/{passwd,shadow} exists on the lower readonly branch but not on the upper
-+ writable branch, and the writable branch is world-writable, then a
-+ malicious user may create /etc/passwd on the writable branch directly
-+ and the infected file will be valid in aufs.
-+ I am afraid it can be a security issue, but aufs can do nothing except
-+ produce a warning.
-+
-+
-+Delete a Branch
-+----------------------------------------------------------------------
-+o Confirm the branch being deleted is not busy
-+ Generally, there is one merit in adopting the "remount" interface to
-+ manipulate branches: it discards caches. When deleting a branch,
-+ aufs checks the still cached (and connected) dentries and inodes. If
-+ there are any, then they are all in use. An inode without its
-+ corresponding dentry can be alive alone (for example, in the inotify/fsnotify case).
-+
-+ For the cached one, aufs checks whether the same named entry exists on
-+ other branches.
-+ If the cached one is a directory, because aufs provides a merged view
-+ to users, as long as one dir is left on any branch aufs can show the
-+ dir to users. In this case, the branch can be removed from aufs.
-+ Otherwise aufs rejects deleting the branch.
-+
-+ If any file on the branch being deleted is opened by aufs, then aufs
-+ rejects the deletion.
-+
-+
-+Modify the Permission of a Branch
-+----------------------------------------------------------------------
-+o Re-initialize or remove the xino file and whiteout bases if necessary.
-+ See struct.txt.
-+
-+o rw --> ro: Confirm the branch being modified is not busy
-+ Aufs rejects the request if any of these conditions is true.
-+ - a file on the branch is mmap-ed.
-+ - a regular file on the branch is opened for write and there is no
-+ same-named entry on the upper branch.
-diff -Nur linux-4.0.4.orig/Documentation/filesystems/aufs/design/05wbr_policy.txt linux-4.0.4/Documentation/filesystems/aufs/design/05wbr_policy.txt
---- linux-4.0.4.orig/Documentation/filesystems/aufs/design/05wbr_policy.txt 1970-01-01 01:00:00.000000000 +0100
-+++ linux-4.0.4/Documentation/filesystems/aufs/design/05wbr_policy.txt 2015-05-30 22:11:30.000000000 +0200
-@@ -0,0 +1,51 @@
-+
-+# Copyright (C) 2005-2015 Junjiro R. Okajima
-+
-+Policies to Select One among Multiple Writable Branches
-+----------------------------------------------------------------------
-+When there is more than one writable branch, aufs has to decide
-+the target branch for file creation or copy-up. By default, the highest
-+writable branch which has the parent (or ancestor) dir of the target
-+file is chosen (top-down-parent policy).
-+At users' request, aufs implements some other policies to select the
-+writable branch: for file creation, round-robin,
-+most-free-space, and other policies; for copy-up, top-down-parent,
-+bottom-up-parent, bottom-up and others.
-+
-+As expected, the round-robin policy selects the branches in circular order. When
-+you have two writable branches and create 10 new files, 5 files will be
-+created on each branch. The mkdir(2) system call is an exception. When you
-+create 10 new directories, all will be created on the same branch.
-+The most-free-space policy selects the one which has the most free
-+space among the writable branches. The amount of free space is
-+checked by aufs internally, and users can specify its time interval.
-+
-+The policies for copy-up are simpler:
-+top-down-parent is equivalent to the same-named one in the create policy,
-+bottom-up-parent selects the writable branch where the parent dir
-+exists and which is the nearest upper one from the copyup-source,
-+bottom-up selects the nearest upper writable branch from the
-+copyup-source, regardless of the existence of the parent dir.
-+
-+There are some rules or exceptions in applying these policies.
-+- If there is a readonly branch above the policy-selected branch and
-+ the parent dir is marked as opaque (a variation of whiteout), or the
-+ target (creating) file is whiteout-ed on the upper readonly branch,
-+ then the result of the policy is ignored and the target file will be
-+ created on the nearest writable branch above the readonly branch.
-+- If there is a writable branch above the policy-selected branch and
-+ the parent dir is marked as opaque or the target file is whiteout-ed
-+ on the branch, then the result of the policy is ignored and the target
-+ file will be created on the highest one among the upper writable
-+ branches which have diropq or whiteout. In case of whiteout, aufs removes
-+ it as usual.
-+- The link(2) and rename(2) system calls are exceptions in every policy.
-+ They try to select the branch where the source exists if possible,
-+ since copying up a large file takes a long time. If that is not possible,
-+ i.e. the branch where the source exists is readonly, then they
-+ follow the copyup policy.
-+- There is an exception for rename(2) when the target exists.
-+ If the rename target exists, aufs compares the indexes of the branches
-+ where the source and the target exist and selects the higher
-+ one. If the selected branch is readonly, then aufs follows the
-+ copyup policy.
-diff -Nur linux-4.0.4.orig/Documentation/filesystems/aufs/design/06fhsm.txt linux-4.0.4/Documentation/filesystems/aufs/design/06fhsm.txt
---- linux-4.0.4.orig/Documentation/filesystems/aufs/design/06fhsm.txt 1970-01-01 01:00:00.000000000 +0100
-+++ linux-4.0.4/Documentation/filesystems/aufs/design/06fhsm.txt 2015-05-30 22:11:30.000000000 +0200
-@@ -0,0 +1,105 @@
-+
-+# Copyright (C) 2011-2015 Junjiro R. Okajima
-+
-+File-based Hierarchical Storage Management (FHSM)