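OpenWrt Failsafe

OpenWrt SquashFS images have a failsafe mode built in. Booting into failsafe mode bypasses the configuration stored on the JFFS2 partition (the writable “overlay” file system) and instead uses a hard-coded basic set of defaults from the SquashFS partition (the read-only partition that holds the router's operating system).

Failsafe mode is used to recover a router that can no longer be accessed in the normal way because of configuration problems such as a user lockout, a network connectivity lockout, broken startup scripts, broken packages or settings, or a completely full JFFS2 partition (or other JFFS2 content). It cannot fix more fundamental problems such as a hard brick, or faults in the hardware, the kernel, or the SquashFS image that prevent proper booting or hardware-level connectivity.

Failsafe mode can be triggered during router boot in one of three ways: waiting for a flashing LED and pressing a button, waiting for a special broadcast packet (using a packet sniffer) and pressing a button, or watching the boot messages (over a serial port) and pressing the “f” key on the serial keyboard. Watching for the flashing LED is usually the easiest. Whichever trigger is used, the router enters failsafe mode and can be accessed via telnet (always available) or the serial keyboard. The steps are given below, together with useful information for when you are in failsafe mode.

Once in failsafe mode, the router comes up at 192.168.1.1/24 on the eth0 network interface with only essential services running. Note that it does not respond to network traffic from outside the 192.168.1.0/24 subnet, so you may need to set a suitable IP address on the device you connect from. If the device has multiple network interfaces (eth0, eth1, ...), eth0 is usually (almost without exception) the interface connected to the switch. Over telnet or a serial connection you can mount the JFFS2 partition with the mount_root command and then diagnose or fix problems on the JFFS2.

For further information, OpenWrt Flash Layout explains why failsafe is possible in OpenWrt and describes how the boot process works. (Essentially, OpenWrt has an additional boot stage called preinit.)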

generic.debrick

Prerequisites

How to trigger failsafe mode

You can trigger failsafe mode in three ways:

  1. Pressing any button at the appropriate time during the router's boot process. You can determine when to press by:
    1. Watching the router LEDs for flashing during boot, and pressing any hardware button when seen (standard and often easiest)
    2. Using a packet sniffer to watch for a special broadcast packet during boot, and pressing any hardware button when seen
  2. Using a serial connection, watching for a message during boot, then pressing the “f” key on your serial keyboard

Added by MrGenie: Although pressing the reset button the moment the first LED starts to blink works on most of my routers, I did encounter a few (all by Linksys) that made me pull my hair out before I figured it out. My Linksys routers boot with all lights on, then the LEDs of the connected LAN/WAN ports start blinking. DO NOT PRESS RESET HERE! Wait it out. All lights go off. Now, after they have gone off a 2nd time, the moment the first LED starts to blink: “PRESS ENTER RIGHT NOW!” Now you're in failsafe mode.

Triggering by pressing any hardware button during boot

Stage 1: Router and Computer preparation

Stage 2: Enter failsafe mode

As soon as the LED blink pattern or the network broadcast message is seen, press the device button. If your device has multiple buttons, any button should work: OpenWrt is configured so that pressing any button during preinit triggers booting into failsafe mode. If one button does not work, try another. It can also help to press the button repeatedly until the blink speeds up, the “success” broadcast packet is seen, or there is other evidence that failsafe mode has been triggered.

See the note by MrGenie above for several Linksys routers where this doesn't work.

Stage 2 option 1: Entering failsafe mode using a blinking LED on the router

On many routers, OpenWrt will start to blink a “SYS” LED (may be “Power”, may be other) on the front of the router when it is in its early boot cycle. Since r44056 there are three different LED blinking speeds for most of the routers (in trunk and CC15.05):

Steps:

Some routers have only one hardware button, the reset button, which is often on the back of the unit (often labeled “Reset” or “WPS/Reset”). It may be a visible (external) button, or it may be recessed behind a hole. If it is in a hole, you need a paper clip or similar tool to operate it. Please do not use a nail to press the button in the hole!

Stage 2 option 2: Entering failsafe using the broadcast packet

The exact steps will depend on the device you are using to watch for the broadcast packet. Details are given below for Linux and Windows. Most *nix/BSD/OS X/Android/Mac should be very similar to Linux (often identical). For many other devices and systems the same steps should be possible (but details not provided).

Make sure the router is connected to the device/PC, the cable is working, the device's firewall will not block the packet, and the network LEDs or any other diagnostics you have show that the connection is working. You may also need to temporarily disable the firewall on your device or open a port on it; take care to secure it again afterwards!

Steps:

Unverified Information!
Until today (Jan 11, 2013) this page did not specify which port to listen on. In the case of the TL-WR1043ND, it is the WAN port. If you find a contradictory example, this note will need to be removed or adapted.

Screenshots of typical packet sniffer using broadcast packet method

'Broadcast packet and success packet under Linux (broadcast packet is the first part only!):'

Run wireshark, cshark or tcpdump
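On Linux, one way to watch for the packet is with tcpdump. The sketch below only assembles the command line so it can be checked before being run as root. Note the assumptions: UDP port 4919 is the value usually found in OpenWrt's preinit defaults and is not stated on this page, and the interface name eth0 and the helper name failsafe_sniff_cmd are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print a tcpdump command that shows UDP packets
# as ASCII (-A) without name resolution (-n) on one interface (-i).
# ASSUMPTION: 4919 is the failsafe announcement port; adjust it if your
# capture shows the packet on a different port.
failsafe_sniff_cmd() {
    iface="${1:-eth0}"   # capture interface on the PC (assumed name)
    port="${2:-4919}"    # assumed UDP port
    printf 'tcpdump -A -n -i %s udp and port %s\n' "$iface" "$port"
}

# Print the command for the default interface.
failsafe_sniff_cmd eth0
```

Start the printed command before powering on the router, then press the button as soon as the packet appears in the capture.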

'Broadcast packet and success packet under Windows (broadcast packet is the first line only!):'

Monitor the special packet with the program recvudp.exe.

Stage 3: Log into the router once it has booted into failsafe mode

Important notes and troubleshooting for failsafe mode login:

  1. In failsafe mode, the router will not respond to network traffic from outside the 192.168.1.0/24 subnet, so you cannot telnet or ping it unless the source is also in this range of IP addresses. You may need to temporarily set a suitable IP on the device used and/or ensure that any routing and firewalls will allow packets if accessed across a network.
  2. If your device has multiple network interfaces (eth0, eth1, ...), eth0 is usually the interface connected to the switch (exceptions are very rare).
  3. If the router does not boot into failsafe mode despite clicking the button, it may be a timing problem: you may be missing the brief window in which OpenWrt looks for a button press. If so, immediately after turning the router on, rapidly click and keep clicking the button on the router for about 60 seconds so as not to miss the failsafe boot window.
  4. If your router has a very long boot time (such as the DIR-300 A), you may need to keep doing this for longer.
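The subnet requirement in note 1 is a common stumbling block, so it can help to check the PC's address before trying to connect. The following is a minimal sketch; the function name in_failsafe_subnet is hypothetical, and it only inspects the address string rather than the PC's actual interface configuration.

```shell
#!/bin/sh
# Return 0 if $1 is a host address in 192.168.1.0/24 that a PC could use,
# i.e. 192.168.1.2 to 192.168.1.254 (192.168.1.1 is the router itself).
in_failsafe_subnet() {
    case "$1" in
        192.168.1.*) last="${1#192.168.1.}" ;;   # keep what follows the prefix
        *) return 1 ;;
    esac
    # The remainder must be digits only (rejects "2.3", "abc", empty).
    case "$last" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$last" -ge 2 ] && [ "$last" -le 254 ]
}

for ip in 192.168.1.2 192.168.0.10; do
    if in_failsafe_subnet "$ip"; then
        echo "$ip: usable for failsafe access"
    else
        echo "$ip: router will not respond to this address"
    fi
done
```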

How to tell when failsafe mode is active:

If you are using a trunk snapshot, revision 46809 or newer, ssh to 192.168.1.1 from the computer and log in as root (no password required). The host key will be randomly generated. You can pass -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no to ssh if you want to allow a different host key temporarily.

If you are using a release image, telnet (not SSH) to 192.168.1.1 from the computer. There is no username or password required.
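The two login cases can be summarized in a small helper that prints the matching command. This is only a sketch: the function name failsafe_login_cmd and its “snapshot”/“release” arguments are hypothetical, and the ssh options are the standard OpenSSH ones for ignoring a changed host key.

```shell
#!/bin/sh
# Hypothetical helper: print the login command for a router in failsafe mode.
#   $1 = "snapshot" (trunk r46809 or newer, uses SSH)
#   $1 = "release"  (release image, uses telnet)
failsafe_login_cmd() {
    case "$1" in
        snapshot)
            # The host key is random in failsafe mode, so neither record
            # it nor compare it with a previously stored key.
            echo "ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no root@192.168.1.1"
            ;;
        release)
            echo "telnet 192.168.1.1"
            ;;
        *)
            echo "usage: failsafe_login_cmd snapshot|release" >&2
            return 1
            ;;
    esac
}

failsafe_login_cmd release
```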

Now go to section When you are in failsafe mode

Triggering by pressing a key in a serial console

  1. Unplug the router's power cord.
  2. Connect the router's WAN port directly to your PC.
  3. Configure your PC with a static IP address between 192.168.1.2 and 192.168.1.254, e.g. 192.168.1.2 (a gateway and DNS are not required).
  4. Plug in the power.
  5. Connect via serial.
  6. Wait until the following message appears: Press the [f] key and hit [enter] to enter failsafe mode
  7. Press “f” and the “enter” key
  8. You should be able to telnet (not SSH) to the router at 192.168.1.1 now (no username and password)

Login message

You get a message similar or identical to this (using OpenWrt 12.09):

 === IMPORTANT ============================
  Use 'passwd' to set your login password
  this will disable telnet and enable SSH
 ------------------------------------------


BusyBox v1.19.4 (2013-03-14 11:28:31 UTC) built-in shell (ash)
Enter 'help' for a list of built-in commands.

  _______                     ________        __
 |       |.-----.-----.-----.|  |  |  |.----.|  |_
 |   -   ||  _  |  -__|     ||  |  |  ||   _||   _|
 |_______||   __|_____|__|__||________||__|  |____|
          |__| W I R E L E S S   F R E E D O M
 -----------------------------------------------------
 ATTITUDE ADJUSTMENT (12.09, r36088)
 -----------------------------------------------------
  * 1/4 oz Vodka      Pour all ingredients into mixing
  * 1/4 oz Gin        tin with ice, strain into glass.
  * 1/4 oz Amaretto
  * 1/4 oz Triple sec
  * 1/4 oz Peach schnapps
  * 1/4 oz Sour mix
  * 1 splash Cranberry juice
 -----------------------------------------------------
root@(none):/#

Additional note (r42985):

================= FAILSAFE MODE active ================
special commands:

    firstboot     reset settings to factory defaults
    mount_root    mount root-partition with config files

after mount_root:

    passwd        change root's password
    /etc/config   directory with config files

for more help see:
http://wiki.openwrt.org/doc/howto/generic.failsafe
=======================================================

The file systems in failsafe mode

OpenWrt uses an overlay file system (JFFS2) which overlays the default router files on the SquashFS partition. JFFS2 contains all config, all packages, and any temp or other files which are not part of the default OpenWrt. Deleting a file from the JFFS2 effectively “resets” that file back to default, because the original file on the SquashFS will be seen instead (if it existed). Deleting the entire contents of the JFFS2 effectively resets the router to the OpenWrt default settings and packages.
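The lookup rule described above can be illustrated with a toy model. This sketch assumes nothing about the real mount code: “rom” and “overlay” are ordinary temporary directories, the function name resolve is hypothetical, and details of a real overlay such as whiteout entries are ignored.

```shell
#!/bin/sh
# Toy model of the overlay lookup: a file in the writable layer hides the
# read-only copy; if it is absent, the read-only copy shows through.
resolve() {
    rom="$1"; overlay="$2"; path="$3"
    if [ -e "$overlay/$path" ]; then
        echo "$overlay/$path"
    elif [ -e "$rom/$path" ]; then
        echo "$rom/$path"
    else
        return 1    # the file exists in neither layer
    fi
}

demo=$(mktemp -d)
mkdir -p "$demo/rom/etc/config" "$demo/overlay/etc/config"
echo "default" > "$demo/rom/etc/config/network"
echo "custom"  > "$demo/overlay/etc/config/network"

cat "$(resolve "$demo/rom" "$demo/overlay" etc/config/network)"   # prints: custom
rm "$demo/overlay/etc/config/network"                             # "reset" the file
cat "$(resolve "$demo/rom" "$demo/overlay" etc/config/network)"   # prints: default
rm -r "$demo"
```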

The root file system in failsafe mode is only the SquashFS partition; the JFFS2 is not present. To access the JFFS2 read/write in failsafe mode you must mount it manually. Enter the command mount_root to do this.

Once the JFFS2 file system is mounted read/write, you can view/edit/delete/fix the files which are changed from the default firmware. Any files that are changed will be accessible at /overlay/* (or /overlay/upper/* on some routers).

The core config files are usually at /overlay/etc/config/* (or /overlay/upper/etc/config/*) and have names such as “network”, “firewall” etc. Other copies may exist in the /rom subdirectory and the router's UI code may exist in subdirectories such as /lua

Useful commands and procedures

General:

Specific commands and procedures:

Changing or resetting some config by editing files

Run the command mount_root and then edit or delete such files as you need. To reset all of the JFFS2 (OpenWrt version of “factory reset”) see the next section.

The core config files are usually at /overlay/etc/config/* (or /overlay/upper/etc/config/*) and have names such as “network”, “firewall” etc which you can search using the find -name command (see below). If you know your error is (say) some network switch or VLAN issue, then you can edit/delete the network config file and reboot. The router will keep all settings except the settings of the file you changed/deleted which will go back to default.
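Because the overlay path differs between routers (/overlay or /overlay/upper), searching by file name avoids guessing. A minimal sketch follows; the helper name find_config is hypothetical, and on the router the first argument would be /overlay after running mount_root.

```shell
#!/bin/sh
# Hypothetical helper: list every regular file named $2 below directory $1.
# Errors (for example an unmounted or missing directory) are suppressed.
find_config() {
    find "$1" -type f -name "$2" 2>/dev/null
}

# Example invocation on the router (prints nothing if the file was never
# changed from the firmware default):
# find_config /overlay network
```

Any path it prints is a changed copy that can be edited or deleted.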

Wiping JFFS2 file system ('Factory reset' to default config)

  1. This procedure is safe (it will restore the default setup and not brick your router).
  2. It will clear the JFFS2 partition, resetting all custom settings and removing all installed packages, logs, dumps, and temp files, the OpenWrt equivalent of a factory reset. If you need any of these, take a backup to some other device in failsafe mode before doing this.
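For the backup mentioned in point 2, one option is to pack the overlay into a tarball before wiping it. The sketch below assumes the tar on the router supports gzip compression (-z); the helper name backup_overlay and the example output path are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: archive directory $1 into gzip-compressed tarball $2.
# On the router, after mount_root, this might be called as:
#   backup_overlay /overlay /tmp/overlay-backup.tar.gz
# /tmp lives in RAM, so copy the archive to another device before rebooting.
backup_overlay() {
    src="$1"; out="$2"
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    # -C makes the paths inside the archive relative to the source directory.
    tar -czf "$out" -C "$src" .
}
```

The archive can then be sent to the desktop with netcat, in the same way firmware files are transferred in the section on flashing.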

Run mount_root first (see above) to mount the JFFS2 partition. Once the JFFS2 partition is mounted for read/write, use any of these commands to erase the files on it, which resets the router:

NOTE: there is a bug report that firstboot or mtd -r erase rootfs_data sometimes does not work and “hangs”. If that happens, the files can be deleted using the “rm...” method. The overlay sits “on top” of the SquashFS, so deleting overlay files just leaves the original SquashFS files showing.

Flash new firmware in failsafe mode

Steps (overview):

  1. Prepare your local desktop with netcat to listen for TCP/IP connections on port 3333 and feed any incoming connection with the firmware file you want to flash.
  2. On the router, use netcat to connect to your local desktop's IP address on port 3333 and redirect the received data to a file.
  3. On the router, perform a sysupgrade with the received file.
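A plain netcat transfer carries no integrity check of its own, so it may be worth comparing checksums between steps 2 and 3. The sketch below assumes sha256sum is available on both the desktop and the router; the helper name verify_image is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if file $1 has the SHA-256 digest $2.
verify_image() {
    file="$1"; expected="$2"
    [ -f "$file" ] || return 1
    # sha256sum prints "<digest>  <name>"; keep only the digest.
    actual=$(sha256sum "$file" | cut -d' ' -f1)
    [ "$actual" = "$expected" ]
}

# Example use on the router, with the digest computed on the desktop first:
# verify_image /tmp/nxtfw.bin "<digest from desktop>" && sysupgrade /tmp/nxtfw.bin
```

If the digests differ, repeat the transfer instead of flashing the file.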

Let us assume the following:

Windows Desktop

 nc -l -p 3333 < flash.bin

Cygwin Desktop

 
$ cat nxtfw.bin | pv -b | nc -l 3333

Linux Desktop

cat nxtfw.bin | pv -b | nc -l -p 3333

Failsafed device

nc 192.168.1.123 3333 > /tmp/nxtfw.bin
root@(none):/# sysupgrade /tmp/nxtfw.bin
Saving config files...
killall: watchdog: no process killed
Failed to connect to ubus
Switching to ramdisk...
Performing system upgrade...
Unlocking firmware ...

Writing from <stdin> to firmware ...
Appending jffs2 data from /tmp/sysupgrade.tgz to firmware...
Writing from <stdin> to firmware ...
Upgrade completed
Rebooting system...
[217.460000] reboot: Restarting system

Notes