OpenWrt フェイルセーフ

OpenWrt SquashFS-Imagesにはフェイルセーフモードが組み込まれています。フェイルセーフモードの起動では、JFFS2 パーティション(書換可能な「overlay」ファイルシステム)上にあるコンフィグレーションをバイパスし、代わりに SquashFSパーティション(ルータOSに相当する読出専用区画)にあるハードコード化された基本セットを使います。

フェイルセーフモードは、ユーザロックアウト、ネットワーク接続特ロックアウト、スタートアップスクリプトの破損、パッケージや設定の破損や完全なJFFS2保存(他のJFFSコンテンツ)のような設定問題のために通常の方法ではアクセス出来なくなったルーターの問題解決に使われます。それは、ハードブリックやハードウェア、カーネル、適切な起動やハードウェアレベルでの接続を妨げるsquashFSイメージに関連するような、より基礎的な問題の解決はできません。

フェイルセーフモードは、ルーターの起動の間 -LED点滅を待ってボタンを押す、(パケット探知器のよる)特別な送信パケットを待ってボタンを押す、(シリアルポートから)起動メッセージを見てシリアルキーボードから“f”キーを押す-の3つの特別な手順によって開始することができます。通常はLED点滅の確認が簡単です。どのきっかけを使ってもルーターはフェールセーフモードに入り、Telnet(常に可)やシリアルキボードからアクセスすることができます。フェールセーフモードに入った時に役立つ情報と一緒に手順を以下に示します

フェールセーフモードに入るとルーターはeth0ネットワークインターフェースを通じて192.168.1.1/24ネットワークアドレスで起動し、基本的なサービスだけで動作します。 192.168.1.1/24の外側からのネットワークトラフィックには応答しない点に注意してください。このルーティングができる適当なデバイス上のIPをセットする必要があるかもしてません。デバイスが複数のネットワークインターフェース(eth0, eth1, ...)を持っているならば、通常(ほとんど例外なく)eth0がswitchへ接続されるインターフェースです。telnetまたはシリアル接続を使ってJFFS2パーティションをmount_rootコマンドを使ってマウントすることができ、JFFS2上で問題の診断または解決をします。

更なる情報は、OpenWrt Flash Layout がなぜ、OpenWrtはフェールセーフが可能かを説明しており、起動プロセスがどのように動くかを解説しています。(基本的にOpenWrtはpreinitと呼ばれる付加的な起動ステージがあります。)

generic.debrick

  • Your device must have at least one configurable hardware button to use flashing LED or broadcast packet triggering. Any configurable button will work (except the main power on/off!), even if it is labelled as a “reset” button or a “wifi on/off” or something else. If your router has any button that you can physically press and release, it's likely to be configurable. Check if there's specific info about failsafe mode for your device and make sure everything still works as expected everytime you update!
  • Your device must have an OpenWrt firmware with a SquashFS-Image partition flashed to it. Failsafe cannot be implemented on a system based on JFFS2-Images
  • The hardware ports you will use for commands when you are in failsafe mode must work (either a valid ethernet port or valid serial connection)
  • Of course you must have a networked or serial-connected device (PC, notepad etc) and for serial connection, you must be able to send key-strokes to the router, to use in failsafe mode!
  • The boot process must be able to succeed. Failsafe mode can be used to fix any problem in the JFFS2 partition (because it doesn't need this to enter failsafe mode), but it needs the kernel partition and the SquashFS partition to be able to support the boot process, so that...
    • ...the boot process is able to get as far as required to register the pressing of the button
    • ...the minimal required binaries and the firmware's default configuration files are available and the boot process can successfully enter failsafe mode (these are all on the SquashFS)

You can trigger failsafe mode in three ways:

  1. Pressing any button at the appropriate time during the router's boot process. You can determine when to press by:
    1. Watching the router LEDs for flashing during boot, and pressing any hardware button when seen (standard and often easiest)
    2. Using a packet sniffer to watch for a special broadcast packet during boot, and pressing any hardware button when seen
  2. Using a serial connection, watching for a message during boot, then pressing the “f” key on your serial keyboard

Add by MrGenie: Although pressing the reset button the moment the first LED starts to blink works on most of my routers, I did encounter a few (all by Linksys) that made me pull out my hair before I had figured it out. The Linksys of mine boot with all lights up, then the LAN/WAN which are connected start blinking. DO NOT PRESS RESET HERE! wait it out. All lights go off. Now after they went off a 2nd time, the moment the first LED starts to blink:“PRESS ENTER RIGHT NOW!” now you're in failsafe mode.

  • Power off the router
  • Unplug the WAN port, and in some cases if needed any other ports (if the WAN IP address and LAN IP address are the same or two LAN ports (an address collision), you would not be able to enter failsafe mode unless you unplug the other port(s) which are colliding)
  • Set your computer's IP to 192.168.1.2, subnet mask 255.255.255.0. The router will be reached at 192.168.1.1 when failsafe mode is running. (You may also use any other IP in the range 192.168.1.2-254.)
  • Connect the computer to the router. You may need to check which port to use, especially if you plan to watch for a broadcast packet to trigger failsafe (see below)
  • To detect when failsafe mode can be triggered, there are two options:
    • Look for a bootup LED blink pattern (easiest with many routers). Looking for a blink pattern is often much more convenient than the other options.
    • Use a packet sniffer on any computer to listen for a special broadcast packet on UDP port 4919 during boot, then press the front button on the router when this packet is seen.

Immediately when the LED blink pattern or the network broadcast message is seen, click the device button. If your device has multiple buttons, any button should work. OpenWrt is configured in a way, that pressing of any button during preinit will trigger booting into failsafe mode. But in case a button should not work, try another. It can also help to press the button repeatedly until the blink speeds up or the “success” broadcast packet or other evidence of triggering failsafe mode successfully, is seen.

See ADD by MrGenie for several Linksys routers where this doesn't work

Stage 2 option 1: Entering failsafe mode using a blinking LED on the router

On many routers, OpenWrt will start to blink a “SYS” LED (may be “Power”, may be other) on the front of the router when it is in its early boot cycle. Since r44056 there are three different LED blinking speeds for most of the routers (in trunk and CC15.05):

  • first a moderate 0.1 second blinking rhythm during those two seconds, when router waits for user to trigger the failsafe mode
  • then either
    • a slow 0.2 second blink if the failsafe was not triggered and the normal boot continues
    • a rapid 0.05 second blink if the user pressed a button and failsafe mode was triggered

Steps:

  • Power on the router.
  • As soon as this blink pattern is seen, press any hardware button of the router. At least one TP-Link router seems to respond better to repeatedly clicking the button before the SYS LED starts to blink, until the SYS LED lights with the rapid-flash pattern.
  • The LED will change to faster blink pattern, indicating the router is now in failsafe mode.

Some routers only have one hardware button, the reset button, which is often on the back of the unit (often labeled “Reset” or “WPS/Reset”). It may have a visible (external) button, or may be behind a hole (with button in the depth). If it is in a hole, you require a paper clip or similar tool to operate it. Please no not use a nail to press the button in the hole!

Stage 2 option 2: Entering failsafe using the broadcast packet

The exact steps will depend on the device you are using to watch for the broadcast packet. Details are given below for Linux and Windows. Most *nix/BSD/OS X/Android/Mac should be very similar to Linux (often identical). For many other devices and systems the same steps should be possible (but details not provided).

You will need to be sure the router is connected to the device/PC, the cable is working, the device's firewall will not block the packet, and that network LEDs or other diagnostics you may have, show the connection is working. You may also need to temporarily disable the firewall on your device or open a port on it - take care and secure it again after!

Steps:

  • You will need some packet sniffing/packet capture software:
    • Linux (also most *nix/BSD/OS X/Android/Mac): Software is often built in or very easy to download. GUI wireshark or console tcpdump or cshark or other. If you do not have any, then these are all very common open source ports and available + free on most platforms. Y you should be able to download one of these for your device easily in the usual way (or any other packet sniffer you like).
    • Windows: You can use the recvudp.exe utility software, or any other packet sniffer. There are also Windows versions of some of the above software as well.
  • Start watching for packets. The exact commands or menu options are different for each sniffer, so you need to find how to do this on your software. The packet will be sent to destination address 192.168.1.255 port UDP 4919. So for example, in a terminal and using tcpdump, with the router connected to port eth0, you would enter the command tcpdump -Ani eth0 port 4919 and udp
  • Turn the router power off, wait a few seconds, and then back on.
  • Look in the sniffer for the early boot network broadcast packet to be shown (Could take up to 30-40 seconds on some routers). The packet will also show the message that it is waiting for your click on a button. The screenshots below show what this can look like.
  • When you see the packet/message, press any configurable hardware key on the router. If needed press several times. Often you will know you have succeeded because the router will send a second broadcast “success” packet when failsafe mode is triggered, and you will see this in the packet sniffer as well (also shown in the screenshots below). But not all versions do this.

Unverified Information!
Up to today (Jan 11, 2013) this page didn't precise on which port to listen. In the case of TL-WR1043ND, it's the WAN port. If you find a contradictory example, it will be necessarry to remove or adapt this note.

Screenshots of typical packet sniffer using broadcast packet method

'Broadcast packet and success packet under Linux (broadcast packet is the first part only!):'

Run wireshark, cshark or tcpdump

'Broadcast packet and success packet under Windows (broadcast packet is the first line only!):'

Monitoring the special packet in a program recvudp.exe.

Important notes and troubleshooting for failsafe mode login:

  1. In failsafe mode, the router will not respond to network traffic from outside the 192.168.1.0/24 subnet, so you cannot telnet or ping it unless the source is also in this range of IP addresses. You may need to temporarily set a suitable IP on the device used and/or ensure that any routing and firewalls will allow packets if accessed across a network.
  2. If your device has multiple network interfaces (eth0, eth1, ...), usually eth0 is the interface connected to the switch (there may be very seldom exceptions).
  3. If the router does not boot in safe mode despite clicking the button, it may be a timing problem, missing the brief window when OpenWrt is looking for a button press. If so, immediately after turning the router on, rapidly click and keep clicking the button on the router for about 60 seconds to try to not miss the safe mode boot window.
  4. If your router has a ridiculously long boot time (such as DIR-300 A), then you may do this for a longer time.

How to tell when failsafe mode is active:

  • Once in failsafe mode, a network broadcast confirmation message appears (not always, for the TL-WR1043ND no message comes).
  • On some router models (e.g. TP-LINK models), the SYS led blinks very quickly

If you are using a trunk snapshot, revision 46809 or newer, ssh to 192.168.1.1 from the computer and log in as root (no password required). The host key will be randomly generated. You can pass -o “UserKnownHostsFile /dev/null” -o “StrictHostKeyChecking no” to ssh if you want to allow a different host key temporarily.

If you are using a release image, telnet (not SSH) to 192.168.1.1 from the computer. There is no username or password required.

Now go to section When you are in failsafe mode

  1. Unplug the router's power cord.
  2. Connect the router's WAN port directly to your PC.
  3. Configure your PC with a static IP address between 192.168.1.2 and 192.168.1.254. E. g. 192.168.1.2 (gateway and DNS is not required).
  4. Plugin the power.
  5. Connect via serial
  6. Wait until the following messages is passing: Press the [f] key and hit [enter] to enter failsafe mode
  7. Press “f” and the “enter” key
  8. You should be able to telnet (not SSH) to the router at 192.168.1.1 now (no username and password)

You get a message similar or same like this (using OpenWrt 12.09):

 === IMPORTANT ============================
  Use 'passwd' to set your login password
  this will disable telnet and enable SSH
 ------------------------------------------


BusyBox v1.19.4 (2013-03-14 11:28:31 UTC) built-in shell (ash)
Enter 'help' for a list of built-in commands.

  _______                     ________        __
 |       |.-----.-----.-----.|  |  |  |.----.|  |_
 |   -   ||  _  |  -__|     ||  |  |  ||   _||   _|
 |_______||   __|_____|__|__||________||__|  |____|
          |__| W I R E L E S S   F R E E D O M
 -----------------------------------------------------
 ATTITUDE ADJUSTMENT (12.09, r36088)
 -----------------------------------------------------
  * 1/4 oz Vodka      Pour all ingredients into mixing
  * 1/4 oz Gin        tin with ice, strain into glass.
  * 1/4 oz Amaretto
  * 1/4 oz Triple sec
  * 1/4 oz Peach schnapps
  * 1/4 oz Sour mix
  * 1 splash Cranberry juice
 -----------------------------------------------------
root@(none):/#

Additional note (r42985):

================= FAILSAFE MODE active ================
special commands:

    firstboot reset settings to factory defaults
    mount_root mount root-partition with config files 

after mount_root:

    passwd change root's password
    /etc/config directory with config files 

for more help see:
​http://wiki.openwrt.org/doc/howto/generic.failsafe
=======================================================

OpenWrt uses an overlay file system (JFFS2) which overlays the default router files on the SquashFS partition. JFFS2 contains all config, all packages, and any temp or other files which are not part of the default OpenWrt. Deleting a file from the JFFS2 effectively “resets” the JFFS2 file version back to default, because the original file will be seen on the SquashFS (if it existed). Deleting the entire contents of the JFFS2 will effective resets the router to OpenWrt default settings and packages.

The root file system in failsafe mode is the only the SquashFS partition and the JFFS2 is not present. To mount (access) the JFFS2 as read/write in failsafe mode you must manually mount it. Enter the command mount_root to do this.

Once the JFFS2 file system is mounted read/write, you can view/edit/delete/fix the files which are changed from the default firmware. Any files that are changed will be accessible at /overlay/* (or /overlay/upper/* on some routers).

The core config files are usually at /overlay/etc/config/* (or /overlay/upper/etc/config/*) and have names such as “network”, “firewall” etc. Other copies may exist in the /rom subdirectory and the router's UI code may exist in subdirectories such as /lua

General:

  • The UCI command reads and writes the router's main configuration files, and is also the main command line tool for modifying the configuration. So it has a lot of helpful commands for troubleshooting and fixing config-related problems. (You can also edit the config files directly using any text editor). See The UCI System.
  • If you are not very familiar with Linux, many commands have a --help option (for example: grep --help) which can suggest the options you need. Often you only need basic commands to get started, such as mv (move/rename), cp (copy), rm (remove/delete), find -name *XYZ* (find all files from the current dir with XYZ in the name), cd and ls (change and list current directory), cat (view file), less (view file with page up/down, use “q” to finish), grep (show matching lines/text only), and so on. If a command “hangs” or takes too long, ctrl-C will often kill it and return to a command line.

Specific commands and procedures:

  • Forgot password - In case you forgot your password, you need to set a new one. Type: passwd
  • Forgot IP address - In case you forgot the IP address for the router, get it with uci get network.lan.ipaddr
  • Bad switch/bridge/firewall settings/cannot reach webUI/cannot SSH - In case you have settings that stop you reaching the webUI/SSH, you can use uci (see above) or manually edit the relevant config files to fix the problem. If you are not confident of the correct settings, it's safe and easy to rename (mv) the relevant file to some temporary name, which will cause OpenWrt to use its defaults for that file. Then after reboot you can compare the default and your problem settings file, fix the problem, and restore the other settings which were OK. The most likely files for this are the 'network' and 'firewall' config files.
  • Storage (JFFS2) full - If the JFFS2 file system is full, maybe because you installed too big/too many packages, or it contains some big logs/dumps/cache/other files, or any other reason, either find & delete some files or packages which you don't need, or you can clear the entire JFFS2 partition (see below).
  • Finished/reboot - If you are done with failsafe mode use reboot to reboot.

Run the command mount_root and then edit or delete such files as you need. To reset all of the JFFS2 (OpenWrt version of “factory reset”) see the next section.

The core config files are usually at /overlay/etc/config/* (or /overlay/upper/etc/config/*) and have names such as “network”, “firewall” etc which you can search using the find -name command (see below). If you know your error is (say) some network switch or VLAN issue, then you can edit/delete the network config file and reboot. The router will keep all settings except the settings of the file you changed/deleted which will go back to default.

  1. This procedure is safe (it will restore the default setup and not brick your router).
  2. It will clear the JFFS2 partition, resetting all custom settings and removing all installed packages, logs, dumps, and temp files, the OpenWrt equivalent of a factory reset. If you need any of these, take a backup to some other device in failsafe mode before doing this.

Run mount_root first (see above) to mount the JFFS2 partition. Once the JFFS2 partition is mounted for read/write, use any of these commands to erase the files on it, which resets the router:

  • firstboot
  • rm -r /overlay/* (or /overlay/upper/* on some routers)
  • mtd -r erase rootfs_data (this will reboot the device as part of the process)

NOTE: there is a bug report that sometimes firstboot or mtd-r erase rootfs_data may not work and “hangs”. If that happens then the files can be deleted using the “rm...” method. The overlay is “on top” of the SquashFS so deleting overlay files just leaves the original SquashFS files showing.

Steps (overview):

  1. Prepare your local desktop with netcat to listen to tcp/ip connections on port 3333 and feed any incoming connection with the firmware file you want to flash.
  2. On the router with netcat connect to your local desktop ipaddress on port 3333 and pipe the recieved data to a file.
  3. On the router perform a sysupgrade with the receieved file.

Let us assume the following:

  • Your desktop is on ipaddress 192.168.1.123
  • the router is on default 192.168.1.1
  • the name of the firmware file to flash has been renamed (for simplicity reasons) to nxtfw.bin

Windows Desktop

  • Copy the firmware you intend to flash to the directory where you have unzipped/installed netcat
    • rename the file to something easy to type e.g. nxtfw.bin
  • Open an command prompt window, navigate to the directory where netcat is installed:
 nc -l -p 3333 < flash.bin

Cygwin Desktop

  • Install Cygwin if you have not already.
  • If you do not have pv (Pipe Viewer) or nc (netcat) installed in your Cygwin Environment (by default they are not installed):
    • Re-Run the setup executable and expressly search and mark for installation:
      • pv (found in Utils)
      • nc (found in Net)
  • Run Cygwin
 
$ cat nxtfw.bin | pv -b | nc -l 3333

Linux Desktop

cat nxtfw.bin | pv -b | nc -l -p 3333
  • pv (Pipe Viewer) shows progress of data transfer through the pipe.
  • nc (netcat) listen on port 3333 transferring the firmware

Failsafed device

  • On the Router via your Telnet connection
nc 192.168.1.123 3333 > /tmp/nxtfw.bin
  • install firmware with current settings.
root@(none):/# sysupgrade /tmp/nxtfw.bin
Saving config files...
killall: watchdog: no process killed
Failed to connect to ubus
Switching to ramdisk...
Performing system upgrade...
Unlocking firmware ...

Writing from <stdin> to firmware ...
Appending jffs2 data from /tmp/sysupgrade.tgz to firmware...
Writing from <stdin> to firmware ...
Upgrade completed
Rebooting system...
[217.460000] reboot: Restarting system
  • after reboot you should be able to access your device on 192.168.1.1
  • the article process.boot may help you better understand when failsafe “kicks in” once activated
This website uses cookies. By using the website, you agree with storing cookies on your computer. Also you acknowledge that you have read and understand our Privacy Policy. If you do not agree leave the website.More information about cookies
  • Last modified: 2021/04/18 21:12
  • by tmomas