
Flash Layout

Hard disks are regarded as block devices, and flash memory is broadly similar. Flash comes in many types: NOR, SLC NAND, MLC NAND, etc.

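If the flash chip is connected directly to the CPU (SoC) and can be accessed directly by the operating system (Linux), we call it “raw flash”. If the flash chip cannot be accessed directly by the operating system (because an additional controller chip sits between the flash and the SoC), we call it “FTL (Flash Translation Layer) flash”. Embedded systems almost exclusively use raw flash, while USB flash drives and SSDs almost exclusively use FTL flash!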

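The flash chips in embedded systems need no separate controller chip, so they are “raw flash” devices rather than “FTL” devices. The storage is accessed through MTD (Memory Technology Device) devices together with special filesystems. Conventional storage uses the MBR and PBRs to hold partition information; on embedded devices this is done by the Linux kernel (and sometimes independently by the bootloader as well!). You simply define that “the kernel partition starts at offset x and ends at offset y”. The advantage is that a partition can later be addressed by its name, without specifying the exact start of its data.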
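As a rough illustration of addressing a partition by name, the sketch below looks up the MTD device that carries a given name. The helper names sample_mtd and mtd_by_name are made up for this example, and the sample lines mirror the /proc/mtd output of the example device shown later on this page; on a real router you would feed /proc/mtd itself to the helper.

```shell
#!/bin/sh
# Sample text in /proc/mtd format (taken from the example device on this page).
sample_mtd() {
cat <<'EOF'
dev:    size   erasesize  name
mtd0: 00020000 00010000 "u-boot"
mtd1: 00140000 00010000 "kernel"
mtd4: 00010000 00010000 "art"
mtd5: 007d0000 00010000 "firmware"
EOF
}

# mtd_by_name NAME: hypothetical helper, reads /proc/mtd-style text on stdin
# and prints the device (e.g. mtd1) whose quoted name equals NAME.
mtd_by_name() {
    awk -v want="\"$1\"" '$4 == want { sub(":", "", $1); print $1 }'
}

sample_mtd | mtd_by_name kernel
sample_mtd | mtd_by_name firmware
```

On a device the call would look like `mtd_by_name kernel < /proc/mtd`, so a script never needs to hard-code an offset.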

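The table below shows the partition layout of the TP-Link TL-WR1043ND. Note that this is only an example! Another example, slightly different from this one, can be found on the wiki page for the DIR-300. Flash layouts differ between devices! Please consult the device-specific wiki page for the flash layout of a particular device.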

TP-Link WR1043ND Flash Layout

Layer0      | m25p80 spi0.0: m25p64 8192KiB
Layer1      | mtd0 u-boot 128KiB | mtd5 firmware 8000KiB                                      | mtd4 art 64KiB
Layer2      |                    | mtd1 kernel 1280KiB | mtd2 rootfs 6720KiB                  |
mountpoint  |                    |                     | /                                    |
filesystem  |                    |                     | mini_fo                              |
Layer3      |                    |                     |           | mtd3 rootfs_data 5184KiB |
Size in KiB | 128KiB             | 1280KiB             | 1536KiB   | 5184KiB                  | 64KiB
Name        | u-boot             | kernel              |           | rootfs_data              | art
mountpoint  | none               | none                | /rom      | /overlay                 | none
filesystem  | none               | none                | SquashFS  | JFFS2                    | none

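Because the partitions are nested, we look at them layer by layer:

  1. Layer0: the flash chip itself, 8MiB in size, soldered to the PCB and connected to the CPU (SoC) over SPI (Serial Peripheral Interface Bus).
  2. Layer1: we “partition” the storage into mtd0 for the bootloader, mtd5 for the firmware and, in this example, mtd4 for the ART (Atheros Radio Test). The ART contains the MAC addresses and the calibration data for the wireless system (EEPROM). If this data is lost or corrupted, ath9k (the wireless driver) stops working entirely.
  3. Layer2: we subdivide mtd5 (firmware) into mtd1 (kernel) and mtd2 (rootfs). In the usual firmware generation process (see imagebuilder), the kernel binary is first packed with LZMA and then compressed with gzip, and the resulting file is written directly onto the raw flash (mtd1) without being part of any mounted filesystem!
  4. Layer3: we subdivide rootfs further into mtd3 (rootfs_data) and the remaining unnamed part, which holds the SquashFS partition.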
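The nesting can be checked with simple arithmetic: the parts on each layer should add up to the size of the part they subdivide. The following is a minimal sketch using the KiB figures from the layout table; the check helper is hypothetical and only exists for this illustration.

```shell
#!/bin/sh
# Sizes in KiB, copied from the TL-WR1043ND layout table.
flash=8192
uboot=128; firmware=8000; art=64
kernel=1280; rootfs=6720
squashfs=1536; rootfs_data=5184

# check LABEL EXPECTED ACTUAL: report whether the parts of a layer add up.
check() {
    if [ "$2" -eq "$3" ]; then
        echo "$1: ok ($3 KiB)"
    else
        echo "$1: MISMATCH ($3 KiB, expected $2 KiB)"
        return 1
    fi
}

check "Layer1 fills the flash"    "$flash"    $((uboot + firmware + art))
check "Layer2 fills the firmware" "$firmware" $((kernel + rootfs))
check "Layer3 fills the rootfs"   "$rootfs"   $((squashfs + rootfs_data))
```

All three checks pass for the figures in the table: 128 + 8000 + 64 = 8192, 1280 + 6720 = 8000 and 1536 + 5184 = 6720.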

→ See the dedicated page on filesystems

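  • / is the root of the entire filesystem; it is made up of /rom and /overlay. For everyday use, please ignore /rom and /overlay!
  • /rom contains all the basic files, such as busybox, dropbear or iptables, along with the default configuration files. It does not contain the kernel. The files in this directory live on the SquashFS partition and therefore cannot be deleted. However, because we use the mini_fo filesystem, so-called overlay-whiteout-symlinks can be created on the JFFS2 partition.
  • /overlay is the writable part; together with /rom it forms the unified / root directory. It contains anything that was written to the router after installation, e.g. changed configuration files, additional packages installed with opkg, etc. It is formatted with JFFS2.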

Files on the read-only SquashFS partition cannot actually be deleted. Instead, the overlay inserts a whiteout, a special high-priority entry that marks the file as deleted. File system code that sees a whiteout entry for file F behaves as if F does not exist.

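#!/bin/sh
# Shows all overlay-whiteout symlinks.

# find lists every symlink under /overlay; keep those marked as whiteouts.
find /overlay -type l | while IFS= read -r FILE
  do
    [ -z "$FILE" ] && break
    # ls -la prints the link target, which contains (overlay-whiteout)
    if ls -la "$FILE" 2>/dev/null | grep -q '(overlay-whiteout)'; then
      echo "$FILE"
    fi
  done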

NOTE1: If the kernel were part of the SquashFS, we could not control where exactly on the flash it is written (which blocks contain its data). Thus we could not tell the bootloader to simply load and execute certain blocks of the flash storage, but would have to address the kernel by path and filename. This would not be bad in itself, but to do so the bootloader would have to understand the SquashFS filesystem, which it does not. The embedded bootloaders we use with OpenWrt generally have no concept of filesystems, so they cannot address files by path and filename. They pretty much assume that the start of the trx data section is executable code.
NOTE2: The term “firmware” is usually used for the entire data on the flash, comprising the boot loader and any other data necessary to operate the device, such as ART, NVRAM, FIS, etc., but we also use it to name only the parts that are being rewritten. Don't let this confuse you ;-)

TODO

cat /proc/mtd
dev:    size   erasesize  name
mtd0: 00020000 00010000 "u-boot"
mtd1: 00140000 00010000 "kernel"
mtd2: 00690000 00010000 "rootfs"
mtd3: 00530000 00010000 "rootfs_data"
mtd4: 00010000 00010000 "art"
mtd5: 007d0000 00010000 "firmware"

The erasesize is the block size of the flash, in this case 64KiB. The size is a hexadecimal value in bytes. To convert it, switch a calculator to hex mode, enter 20000 for example, and switch back to decimal mode. Then guess how the partitions are nested into each other. Or execute dmesg after a fresh boot and look for something like:

Creating 5 MTD partitions on "spi0.0":
0x000000000000-0x000000020000 : "u-boot"
0x000000020000-0x000000160000 : "kernel"
0x000000160000-0x0000007f0000 : "rootfs"
mtd: partition "rootfs" set to be root filesystem
mtd: partition "rootfs_data" created automatically, ofs=2C0000, len=530000
0x0000002c0000-0x0000007f0000 : "rootfs_data"
0x0000007f0000-0x000000800000 : "art"
0x000000020000-0x0000007f0000 : "firmware"

These are the start and end offsets of the partitions as hex values in bytes. Now you don't have to guess which is nested in which. E.g. 0x20000 = 131072 bytes = 128KiB.
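The conversion and the nesting can also be worked out mechanically. The sketch below takes the ranges from the dmesg output above, prints each partition's size in KiB and lists the partitions that enclose it. The helper names ranges, size_kib and inside are invented for this example, and the script assumes a shell whose arithmetic accepts 0x-prefixed hex values.

```shell
#!/bin/sh
# Partition ranges copied from the dmesg output above.
ranges() {
cat <<'EOF'
0x000000000000-0x000000020000 : "u-boot"
0x000000020000-0x000000160000 : "kernel"
0x000000160000-0x0000007f0000 : "rootfs"
0x0000002c0000-0x0000007f0000 : "rootfs_data"
0x0000007f0000-0x000000800000 : "art"
0x000000020000-0x0000007f0000 : "firmware"
EOF
}

# size_kib START END: size of the range from START up to END, in KiB.
size_kib() {
    echo $(( ($2 - $1) / 1024 ))
}

# inside S E PS PE: true if range S..E lies within PS..PE and is not identical to it.
inside() {
    [ $(( $1 >= $3 && $2 <= $4 && ($1 != $3 || $2 != $4) )) -eq 1 ]
}

# Turn each line into "start end name".
parts=$(ranges | sed 's/^\(0x[0-9a-f]*\)-\(0x[0-9a-f]*\) : "\(.*\)"$/\1 \2 \3/')

echo "$parts" | while read -r s e n; do
    echo "$n: $(size_kib "$s" "$e") KiB"
    # Compare against every other partition to find the enclosing ones.
    echo "$parts" | while read -r ps pe pn; do
        if inside "$s" "$e" "$ps" "$pe"; then
            echo "  inside $pn"
        fi
    done
done
```

With these ranges, kernel and rootfs are reported inside firmware, and rootfs_data inside both rootfs and firmware. The rootfs_data size computed from this log (0x530000 bytes) differs from the figure in the layout table; since the kernel creates that partition automatically, its size presumably depends on the firmware build the log came from.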

The flash chip can be represented as a large block of continuous space:

start of flash ................. end of flash

There is no ROM to boot from; at power up the CPU begins executing the code at the very start of the flash. Luckily this isn't the firmware or we'd be in real danger every time we reflashed. Boot is actually handled by a section of code we tend to refer to as the bootloader (the BIOS of your PC is a bootloader).

         | Boot Loader Partition | Firmware Partition | Special Configuration Data
Atheros  | U-Boot                | firmware           | ART
Broadcom | CFE                   | firmware           | NVRAM
Atheros  | RedBoot               | firmware           | FIS recovery | RedBoot config | boardconfig

The partition or partitions containing so-called Special Configuration Data differ greatly from each other. For example, the ART partition, which you will meet in conjunction with Atheros wireless and U-Boot, contains only data for the wireless driver, while the NVRAM partition of Broadcom devices is used for much more than that.

If you dig into the “firmware” section you'll find a trx. A trx is just an encapsulation, which looks something like this:

trx-header                                |
HDR0 | length | crc32 | flags | pointers | data

“HDR0” is a magic value that identifies a trx header; the rest are 4-byte unsigned values, followed by the actual contents. In short, it's a block of data with a length and a checksum. So, our flash usage actually looks something like this:
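To make the header layout concrete, here is a minimal sketch that checks the magic value and reads the length and crc32 fields. It runs on a made-up 12-byte header rather than a real image, the le32 helper is hypothetical, and it assumes the 4-byte values are stored little-endian, which the text above does not state.

```shell
#!/bin/sh
# le32 FILE OFFSET: read a 4-byte unsigned value, assuming little-endian order.
le32() {
    # od prints the four bytes as decimal numbers; combine them lowest byte first.
    set -- $(od -An -tu1 -j "$2" -N 4 "$1")
    echo $(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
}

# Build a made-up header: magic "HDR0", length 4096, crc32 0xdeadbeef.
img=$(mktemp)
printf 'HDR0\000\020\000\000\357\276\255\336' > "$img"

# The first four bytes must be the magic value.
magic=$(dd if="$img" bs=4 count=1 2>/dev/null)
if [ "$magic" = "HDR0" ]; then
    echo "length: $(le32 "$img" 4) bytes"
    printf 'crc32:  %08x\n' "$(le32 "$img" 8)"
else
    echo "not a trx image"
fi
rm -f "$img"
```

The sketch only decodes the fields; it does not verify the checksum, and it needs a shell with 64-bit arithmetic to hold the largest crc32 values.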

CFE | trx containing firmware | NVRAM

Except that the firmware is generally pretty small and doesn't use the entire space between CFE and NVRAM:

CFE | trx | firmware | unused | NVRAM

(NOTE: The <model>.bin files are nothing more than the generic *.trx file with an additional header prepended to identify the model. The model information gets verified by the vendor's upgrade utilities and only the remaining data -- the trx -- gets written to the flash. When upgrading from within OpenWrt remember to use the *.trx file.)

So what exactly is the firmware?

The boot loader really has no concept of filesystems; it pretty much assumes that the start of the trx data section is executable code. So, at the very start of our firmware is the kernel. But just putting a kernel directly onto flash is quite boring and consumes a lot of space, so we compress the kernel with a heavy compression known as LZMA. Now the start of the firmware is the code of an LZMA decompressor:

lzma decompress | lzma compressed kernel

Now, the boot loader boots into an LZMA program which decompresses the kernel into memory and executes it. It adds one second to the bootup time, but it saves a large chunk of flash space. (And if that wasn't amusing enough, it turns out the boot loader does know gzip compression, so we gzip compressed the LZMA decompression program)

Immediately following the kernel is the filesystem. We use SquashFS for this because it's a highly compressed readonly filesystem -- remember that altering the contents of the trx in any way would invalidate the crc, so we put our writable data in a JFFS2 partition, which is outside the trx. This means that our firmware looks like this:

trx | gzip'd lzma decompress | lzma'd kernel | (SquashFS filesystem)

And the entire flash usage looks like this -

CFE | trx | gz'd lzma | lzma'd kernel | SquashFS | JFFS2 filesystem | NVRAM

That's about as tight as we can possibly pack things into flash.


An image file is a byte-by-byte copy of the data contained in a file system. Suppose you installed Debian or Windows in the usual way onto one or two hard disk partitions and afterwards copied the whole content byte by byte from the hard disk into one file:

dd if=/dev/sda of=/media/sdb3/backup.dd

The resulting backup file, /media/sdb3/backup.dd, could then be used in exactly the same manner as an OpenWrt image file.

The difference is that OpenWrt image files are not created that way ;-) They are generated with the Image Generator (formerly called Image Builder). You can read about the:

  • Last modified: 2020/12/14 14:04
  • by tmomas