Understanding EXT4
A really interesting series of articles on SANS Digital Forensics Blog
On 2010, Hal Pomeranz has started on SANS Digital Forensics blog a series of technical articles about EXT4 filesystem.
What is EXT4?
EXT4 is a journaling file system for Linux, developed as the successor to ext3: it modifies important data structures of the previous filesystem such as the ones destined to store the file data. The result is a filesystem with an improved design, better performance, reliability and features.
It was accepted as “stable” in the Linux 2.6.28 kernel in October 2008.
The ‘episodes’
The publication of the ‘episodes’ continued since years, and recently Hal has published the part 6, focused on “Directories”.
Here the list of all parts:
1. Extents
20 Dec 2010
EXT4 has moved to 48-bit block addresses. I’ll refer you to the paper cited above for the whys and wherefores of this decision and what it means as far as maximum file system size, etc. What’s really a departure for EXT4 however, is the use of extents rather than the old, inefficient indirect block[3] mechanism used by earlier Unix file systems (e.g. EXT2/EXT3) for tracking file content. Extents are similar to cluster runs in the NTFS file system- essentially they specify an initial block address and the number of blocks that make up the extent. A file that is fragmented will have multiple extents, but EXT4 tries very hard to keep files contiguous.
[embed]https://digital-forensics.sans.org/blog/2010/12/20/digital-forensics-understanding-ext4-part-1-extents[/embed]
2. Timestamps
14 Mar 2011
The EXT4 developers tried very hard to maintain backwards compatibility with the EXT2/EXT3 inode layout. 64-bit timestamps and a completely new file creation timestamp obviously complicate this goal. The EXT4 developers solved this problem by putting the extra stuff in the upper 128 bits of the new, larger 256-bit EXT4 inode.
[embed]https://digital-forensics.sans.org/blog/2010/12/20/digital-forensics-understanding-ext4-part-1-extents[/embed]
3. Extent Trees
28 Mar 2011
[…] you can only have a maximum of 4 extent structures per inode. Furthermore, there are only 16 bits in each extent structure for representing the number of blocks in the extent, and in fact the upper bit is reserved (it’s used to mark the extent as “reserved but initialized”, part of EXT4’s pre-allocation feature). That means each extent can only contain a maximum of 2¹⁵ blocks- which is 128MB assuming 4K blocks.
Now 128MB is pretty big, but what happens when you have a file that’s bigger than half a gigabyte? Such a file would require more than 4 extents to fully index. Or what happens when you have a file that’s small but very fragmented? Again, you could need more than 4 extents to represent the collections of blocks that make up the file.
[embed]https://digital-forensics.sans.org/blog/2010/12/20/digital-forensics-understanding-ext4-part-1-extents[/embed]
4. Demolition Derby
08 Apr 2011
I got curious about what would happen when I deleted my /var/log/messages file. How does the inode change? What happens to block 131090, which holds my extent tree structure? Well, there’s really only one way to find out: I deleted the file… carefully so I didn’t lose any logging data. In fact, I didn’t just delete the file; I used “shred -u /var/log/messages” to overwrite the data blocks with nulls before unlinking the file. Once the file had been purged, I dumped out the both inode associated with the file as well as block 131090 and took a look at them in my hex editor.
[embed]https://digital-forensics.sans.org/blog/2010/12/20/digital-forensics-understanding-ext4-part-1-extents[/embed]
5. Large Extents
22 Aug 2011
[…] you can only have 32K blocks in an extent. Assuming a typical 4K block size, that means you can only have 128MB of data in a single extent. A 4GB file is therefore going to require at least 32 extents, and even that assumes you can find 32 runs of 32K contiguous blocks to use. More likely we’ll have more than 32 extents, some of which don’t use the full 128MB length.
[embed]https://digital-forensics.sans.org/blog/2010/12/20/digital-forensics-understanding-ext4-part-1-extents[/embed]
6. Directories
07 Jun 2017
One item I never got around to was documenting how directories were structured in EXT. Some recent research has caused me to dive back into this topic, and given me an excuse to add additional detail to this EXT4 series.
If you go back and read earlier posts in this series, you will note that the EXT inode does not store file names. Directories are the only place in traditional Unix file systems where file name information is kept. In EXT, and the classic Unix file systems it is evolved from, directories are simply special files that associate file names with inode numbers.
Furthermore, in the simplest case, EXT directories are just sequential lists of file entries. The entries aren’t even sorted. For the most part, directory entries in EXT are simply added to the directory file in the order files are created in the directory.
[embed]https://digital-forensics.sans.org/blog/2010/12/20/digital-forensics-understanding-ext4-part-1-extents[/embed]
I wish you a good reading!