File systems

Organized into files and directories

Directory paths are conventionally written with a trailing slash /

Root directory: /

On Unix, file systems can be mounted anywhere under the root directory

In the Unix philosophy everything is a file

Special files: make I/O devices look like files

  • Block special files: found under the /dev/ directory, give direct access to the raw blocks of block devices such as SSDs or HDDs
  • Character special files: files that accept or produce a character stream; used to model keyboards, printers, and mice
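As a small illustration of the two kinds of special files, the sketch below classifies a file by the type bits in its mode, using Python's standard `stat` module. The helper name `special_kind` is made up for this example.

```python
import stat

def special_kind(mode: int) -> str:
    """Classify a file by the type bits of its st_mode value."""
    if stat.S_ISBLK(mode):
        return "block special"
    if stat.S_ISCHR(mode):
        return "character special"
    return "not a special file"
```

On a typical Unix system, `special_kind(os.stat("/dev/null").st_mode)` should report a character special file, assuming `/dev/null` exists and `os` has been imported.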

File Permissions #

rwx -> 3 bits each for owner, group, and other users

All files have a 9-bit protection code indicating the permissions for the owner, the group, and other users, denoted by a read, write, and execute pattern. On directories, x indicates permission to traverse the directory and access its contents (r alone only allows listing the file names). This is why files typically get rw permissions by default (r for group/other) and directories rwx (rx for group/other).
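The 9-bit protection code can be decoded with a few bit tests. The following is a minimal sketch; the function name `mode_to_rwx` is an assumption made for illustration (Python's standard library offers `stat.filemode` for the same job).

```python
def mode_to_rwx(mode: int) -> str:
    """Render the low 9 permission bits as a string such as 'rw-r--r--'."""
    letters = "rwxrwxrwx"  # owner, group, other
    out = []
    for i, letter in enumerate(letters):
        bit = 1 << (8 - i)  # bit 8 is owner-read, bit 0 is other-execute
        out.append(letter if mode & bit else "-")
    return "".join(out)
```

For example, the common default for files, octal 644, decodes to `rw-r--r--`, and the common default for directories, octal 755, decodes to `rwxr-xr-x`.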

SPECIAL PERMISSIONS

Booting #

Basic Input Output System (BIOS): firmware on motherboard, first thing that runs when computer is powered on, initializes hardware devices

Master Boot Record (MBR): first sector of the boot device, which contains a program to load the secondary bootloader

Unified Extensible Firmware Interface (UEFI): modern replacement for BIOS, allows faster booting, different architectures, and larger drives

GUID Partition Table (GPT): stores information about the location of the partitions on the drive

UEFI reads the GPT, whose header is stored in the second sector, and then loads the bootloader from the EFI System Partition (ESP). UEFI is almost a little operating system itself: it can read partition tables and file systems and run executables
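The two partitioning schemes can be told apart by looking at the first two sectors of a drive image. This is a rough sketch, assuming 512-byte sectors; the function names are hypothetical. It relies on two well-known markers: an MBR sector ends in the bytes 0x55 0xAA, and a GPT header starts with the ASCII signature `EFI PART`.

```python
SECTOR_SIZE = 512  # assumption: classic 512-byte logical sectors

def has_mbr_signature(sector0: bytes) -> bool:
    """An MBR sector ends with the boot signature bytes 0x55 0xAA."""
    return len(sector0) >= SECTOR_SIZE and sector0[510:512] == b"\x55\xaa"

def has_gpt_header(sector1: bytes) -> bool:
    """A GPT header starts with the 8-byte ASCII signature 'EFI PART'."""
    return sector1[:8] == b"EFI PART"

def describe_disk(image: bytes) -> str:
    """Guess the partitioning scheme from the first two sectors of an image."""
    sector0 = image[:SECTOR_SIZE]
    sector1 = image[SECTOR_SIZE:2 * SECTOR_SIZE]
    if has_gpt_header(sector1):
        return "GPT"  # checked first: GPT drives also carry a protective MBR
    if has_mbr_signature(sector0):
        return "MBR"
    return "unknown"
```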

File system implementation #

The partition table describes the partitions on the drive

The superblock of a file system contains all its key parameters

Files are divided into blocks when stored on a drive. How those blocks are laid out can be implemented in different ways:

  • Contiguous layout: simple & high performance, but the drive becomes fragmented as files are removed
  • Linked list: the first word of each block is a pointer to the next block.
    • Advantage: no space is lost to external fragmentation
    • Disadvantage: extremely slow random access, since every preceding block has to be read first. Keeping the pointers in a file allocation table (FAT) in main memory solves this problem
  • I(ndex)-nodes: each file has a fixed-size data structure that holds its metadata and pointers to the blocks containing the data
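To make the linked-list layout with a file allocation table concrete, here is a toy sketch. The table is modelled as a Python dict that maps a block number to the next block of the same file; the end-of-chain marker and the function names are assumptions for this example.

```python
END_OF_CHAIN = -1  # hypothetical marker for the last block of a file

def file_blocks(fat, first_block):
    """Follow the chain in the table and return all block numbers of a file."""
    blocks = []
    block = first_block
    while block != END_OF_CHAIN:
        blocks.append(block)
        block = fat[block]
    return blocks

def nth_block(fat, first_block, n):
    """Random access: hop through the in-memory table, no disk reads needed."""
    block = first_block
    for _ in range(n):
        block = fat[block]
        if block == END_OF_CHAIN:
            raise IndexError("file has fewer blocks than requested")
    return block

# A file stored in blocks 4 -> 7 -> 2 -> 10 -> 12.
fat = {4: 7, 7: 2, 2: 10, 10: 12, 12: END_OF_CHAIN}
```

Random access still walks the chain, but because the whole table is in main memory, each hop is a memory lookup rather than a disk read.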

Directories map an ASCII name to the information needed to locate the data

When the system is using i-nodes, this can be done by referencing the i-node of the file in the directory entry
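A path lookup with i-nodes can be sketched as repeated directory lookups. The model below is deliberately simplified: directory entries live directly in the i-node object rather than in data blocks, and all names and i-node numbers are invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Inode:
    is_dir: bool
    blocks: list = field(default_factory=list)   # data block numbers (files)
    entries: dict = field(default_factory=dict)  # name -> i-node number (directories)

def lookup(inodes, path, root):
    """Resolve an absolute path to an i-node number, one component at a time."""
    current = root
    for name in path.strip("/").split("/"):
        if not name:
            continue  # the path was just "/"
        node = inodes[current]
        if not node.is_dir:
            raise NotADirectoryError(name)
        current = node.entries[name]  # KeyError if the name does not exist
    return current

inodes = {
    2: Inode(is_dir=True, entries={"usr": 6}),
    6: Inode(is_dir=True, entries={"notes.txt": 26}),
    26: Inode(is_dir=False, blocks=[132, 406]),
}
```

Here `lookup(inodes, "/usr/notes.txt", root=2)` walks from the root i-node through `usr` and ends at i-node 26, whose block list then locates the data.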

The operating system can choose which blocks of the free list / bitmap to keep in main memory in order to minimize memory usage.

Files can be shared by creating links. This avoids having to copy the file, but the directory structure becomes a DAG (Directed Acyclic Graph) rather than a tree, which complicates maintenance.

Hard linking: faster; increases the link count in the i-node, so the file is only deleted once no links remain

Symbolic linking: adds the overhead of parsing the path, requires an extra i-node
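The difference between the two kinds of links can be observed from user space. The sketch below uses Python's `os.link` and `os.symlink` in a temporary directory; it assumes a Unix-like system where both calls are permitted, and the function name `link_demo` is made up for this example.

```python
import os
import tempfile

def link_demo():
    """Create a hard and a symbolic link, then delete the original name."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original.txt")
        hard = os.path.join(d, "hard.txt")
        soft = os.path.join(d, "soft.txt")
        with open(original, "w") as f:
            f.write("hello")
        os.link(original, hard)     # second name for the same i-node
        os.symlink(original, soft)  # separate file that stores a path
        info = {
            "link_count": os.stat(original).st_nlink,
            "hard_same_inode": os.stat(hard).st_ino == os.stat(original).st_ino,
            "soft_own_inode": os.lstat(soft).st_ino != os.stat(original).st_ino,
        }
        os.remove(original)         # only removes one name
        with open(hard) as f:
            info["hard_after_delete"] = f.read()
        # os.path.exists follows the symlink, which now points nowhere
        info["soft_dangling"] = not os.path.exists(soft)
        return info
```

After the original name is removed, the data remains reachable through the hard link, while the symbolic link dangles because the path it stores no longer exists.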

Block size: often 4 KB. With blocks that are too small, reading a file requires multiple seeks and rotational delays, while blocks that are too large waste space if there are many small files.
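The space side of this trade-off is simple arithmetic: every file occupies a whole number of blocks, so the unused tail of its last block is wasted. A small sketch, with made-up file sizes and function names:

```python
def blocks_needed(file_size: int, block_size: int) -> int:
    """Number of whole blocks a file of file_size bytes occupies."""
    return -(-file_size // block_size)  # ceiling division

def wasted_bytes(file_sizes, block_size: int) -> int:
    """Total unused space in the last block of every file."""
    return sum(blocks_needed(s, block_size) * block_size - s for s in file_sizes)

sizes = [100, 5000]  # two hypothetical files, sizes in bytes
waste_4k = wasted_bytes(sizes, 4096)
waste_1k = wasted_bytes(sizes, 1024)
```

With these two files, 4 KB blocks waste 7188 bytes while 1 KB blocks waste 1044 bytes, but the 5000-byte file is then spread over five blocks instead of two.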

The OS also has to keep track of the blocks that are free. There are two common methods for this:

  • Free list: a linked list of disk blocks stores the numbers of the free blocks; each list block holds as many free block numbers as fit, minus one slot for the pointer to the next list block. The list is stored in the free blocks themselves and shrinks as more blocks become filled, so its storage is practically free.
  • Bitmap: a disk of $n$ blocks requires a bitmap of $n$ bits
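The storage cost of both methods can be compared with a short calculation. The sketch below assumes 1 KB blocks and 32-bit (4-byte) block numbers purely for illustration; the function names are hypothetical.

```python
def free_list_blocks(free_blocks: int, block_size: int, number_size: int) -> int:
    """Blocks needed by a free list that tracks free_blocks free blocks."""
    per_block = block_size // number_size - 1  # one slot is the next-pointer
    return -(-free_blocks // per_block)        # ceiling division

def bitmap_blocks(total_blocks: int, block_size: int) -> int:
    """Blocks needed by a bitmap with one bit per disk block."""
    bits_per_block = block_size * 8
    return -(-total_blocks // bits_per_block)

n = 2**30  # a drive with 2^30 blocks of 1 KB each
list_cost = free_list_blocks(n, 1024, 4)  # worst case: every block is free
bitmap_cost = bitmap_blocks(n, 1024)
```

Under these assumptions each list block holds 255 free block numbers, so a completely empty drive needs about 4.2 million list blocks, whereas the bitmap always needs 131072 blocks. The free list only becomes the smaller structure when the drive is nearly full.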

File system types:

  • Journaling file systems: the actions to be completed are written to disk before the operations are started, to prevent corruption when the system crashes during an operation.
  • Flash-based file systems: designed around the properties of SSDs; a flash block must be erased before it can be written, which is expensive
  • Virtual File System (VFS): integrates multiple file systems into one structure. Linux allows you to mount different file systems anywhere, e.g. NFS allows for remote file systems
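The journaling idea from the list above can be sketched with a toy in-memory model. Everything here is an assumption made for illustration: the class name, the two operation kinds, and the `crash_after` parameter that simulates a crash. The key property is that logged operations are idempotent, so replaying them after a crash is safe.

```python
class ToyJournal:
    def __init__(self):
        self.log = []    # stands in for the on-disk journal
        self.state = {}  # stands in for the on-disk file system state

    def run(self, ops, crash_after=None):
        """Log the operations first, then apply them one by one."""
        self.log.append(list(ops))  # step 1: write the intent to the journal
        for i, op in enumerate(ops):
            if crash_after is not None and i >= crash_after:
                return              # simulated crash: entry stays in the log
            self._apply(op)
        self.log.pop()              # step 2: all done, retire the entry

    def recover(self):
        """Replay every unfinished journal entry from the start."""
        while self.log:
            for op in self.log[0]:
                self._apply(op)
            self.log.pop(0)

    def _apply(self, op):
        kind, key, *rest = op
        if kind == "set":
            self.state[key] = rest[0]
        elif kind == "delete":
            self.state.pop(key, None)  # deleting twice is harmless
```

If `run` is interrupted partway through, the entry is still in the log, and calling `recover` after the "reboot" brings the state to what a complete run would have produced.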

Executable and Linkable Format (ELF) #

The executable file format used on Linux and most BSDs

ELF Header

An ELF file consists of:

  • ELF header
  • Program header table: describes segments
  • Zero or more segments (used by executables / relevant at run time)
  • Zero or more sections (used by object files / relevant at link time)
  • Section header table: describes sections

ELF has a very flexible design and is thus used on many platforms.
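The first bytes of the ELF header are enough to identify a file. The sketch below reads only a few well-known fields: the magic number `\x7fELF`, the class byte (32-bit or 64-bit), the data-encoding byte (byte order), and the 16-bit file type at offset 16. The function name and the returned dict layout are assumptions for this example.

```python
import struct

ELF_MAGIC = b"\x7fELF"
ELF_CLASSES = {1: "32-bit", 2: "64-bit"}
ELF_DATA = {1: "little-endian", 2: "big-endian"}
ELF_TYPES = {1: "relocatable", 2: "executable", 3: "shared object", 4: "core"}

def parse_elf_ident(header: bytes) -> dict:
    """Decode class, byte order, and file type from the start of an ELF file."""
    if len(header) < 18 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    byte_order = "<" if header[5] == 1 else ">"
    (e_type,) = struct.unpack_from(byte_order + "H", header, 16)
    return {
        "class": ELF_CLASSES.get(header[4], "unknown"),
        "data": ELF_DATA.get(header[5], "unknown"),
        "type": ELF_TYPES.get(e_type, "unknown"),
    }
```

To try it on a real file, read the first 18 bytes in binary mode, for example `parse_elf_ident(open(path, "rb").read(18))`, where `path` points at any ELF binary on the system.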

File system #

  • Single directory file system: containing only files
  • Hierarchical file system: has a root directory which can contain files and subdirectories

File paths #

  • Root directory: /
  • Home directory: ~ or $HOME
  • Current working directory: . or $PWD
  • Parent directory: ..
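How these special names combine can be shown with a small path expander. This is a purely lexical sketch built on Python's `posixpath` module: it does not touch the file system and therefore ignores symbolic links. The function name and the example directories are hypothetical.

```python
import posixpath

def expand_path(path: str, cwd: str, home: str) -> str:
    """Turn a path using ~, . and .. into a normalized absolute path."""
    if path == "~" or path.startswith("~/"):
        path = home + path[1:]       # ~ stands for the home directory
    joined = posixpath.join(cwd, path)  # relative paths start at the cwd
    return posixpath.normpath(joined)   # collapses "." and ".." components
```

For instance, with the working directory `/home/alice/projects`, the relative path `../notes/./todo.txt` expands to `/home/alice/notes/todo.txt`.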

Backups #

  • Incremental dumps: only backs up files that have changed since the previous backup
  • Physical dump: copies all blocks over, also dumps unused blocks
  • Logical dump: recursively dumps all files and directories from some root directory
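An incremental dump needs a way to find the files that changed since the previous backup. One simple approach, sketched below, compares each file's modification time with the time of the last backup. Using the modification time as the change indicator and the function name `changed_since` are assumptions for this example; real backup tools may use other criteria.

```python
import os

def changed_since(root: str, last_backup_time: float) -> list:
    """Return paths (relative to root) of files modified after the last backup."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.stat(full).st_mtime > last_backup_time:
                changed.append(os.path.relpath(full, root))
    return sorted(changed)
```

The walk itself follows the logical-dump idea of recursively visiting everything below a root directory, while the timestamp filter makes the dump incremental.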

Phases of a logical dump:

  1. Marks all modified files and all directories
  2. Unmarks directories without modified files
  3. /