B.A.T.M.A.N / batman-adv

B.A.T.M.A.N. stands for “Better Approach To Mobile Adhoc Networking”; despite the name, it works for stationary systems as well.

In addition to providing node-to-node and node-to-net connectivity, batman-adv can provide bridging of multiple VLANs over a mesh (or link), such as for “trusted” client, guest, IoT, and management networks. It provides an easy-to-configure alternative to other approaches to “backhaul”, such as WDS connections, GRE tunnels, and various “relay” and “pseudo-relay” approaches.

batman-adv can run on top of a variety of mesh implementations, including 802.11s, ad-hoc (IBSS), and multiple point-to-point links, wired or wireless.

batman-adv is reasonably robust to topology changes, typically adapting within a couple of seconds.

batman-adv does not provide encryption or authentication. If required, they should be implemented in the underlying transport (an encrypted, authenticated mesh, for example), in higher-layer protocols (IPsec, TLS, SSH, …), or both.

  • batman-adv is a mesh protocol for Layer 2 networking (it carries Ethernet frames) that runs in the kernel
  • batmand is a user-space daemon for an older B.A.T.M.A.N. protocol that operates at Layer 3 (it routes IP packets)

Unless you've got a strong reason to use the older, Layer 3 protocol (such as interoperation with an existing mesh), batman-adv is suggested. This page documents configuration of batman-adv for a local mesh.

For further information, see, for example

Special thanks to the authors of this OpenWrt walk-through

Installation and Configuration of batman-adv

Configuration developed and tested on openwrt-18.06 and master in September, 2018

The Archer C7 v2 units in use support 802.11s mesh. This may be checked on other devices/builds with iw phy, looking for “mesh point” in the various sections of the output.
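
As a minimal sketch of that check, the helper below scans iw phy output for the “mesh point” interface mode. The function name has_mesh_point is made up for this example; it only reads text on standard input, so it can be tried against saved output as well as on a live device.

```shell
# Hypothetical helper: reads `iw phy` output on stdin and succeeds if
# any section mentions the "mesh point" interface mode.
has_mesh_point() {
    grep -qi 'mesh point'
}

# On a device with iw installed:
#   iw phy | has_mesh_point && echo "802.11s mesh point is listed"
```

The match is deliberately loose (any line containing “mesh point”), so it says only that the mode is listed somewhere, not which radio offers it.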

Overview

In this walk-through the following steps are described

  • Install needed packages for batman-adv
  • Configure a mesh on which batman-adv will run
  • Configure batman-adv to use the mesh
  • Configure one or more VLANs to be routed by batman-adv

While an 802.11s mesh is described here, an ad-hoc (IBSS) mesh, or point-to-point links can also be utilized.

Note that Ethernet links, with their typical MTU of 1500, reduce the PMTU to below 1500 because of the batman-adv headers. batman-adv is typically configured to manage fragmentation, with the attendant slight reduction in throughput when packets are fragmented. For many use cases, the higher speeds, lower latency, and/or reliability of an Ethernet link will make up for this effect.

Installation

opkg update
opkg install kmod-batman-adv

Suggested for ease of monitoring and debugging

opkg install batctl

To enable use of 802.11s mesh

opkg install wpad-mesh

If building/assembling your own image, you will need to remove the default wpad-mini as it conflicts with wpad-mesh. wpad (full) is likely sufficient, but not tested at this time.

Configuration

Default /etc/config/batman-adv Used

No modifications were made to the default /etc/config/batman-adv. It is shown here for completeness.

The default file puts single quotes around option names, which differs a little from the usual convention. In the line config 'mesh' 'bat0', 'bat0' is the section name that other configuration refers to, and 'mesh' is the section type, not a value, even though it is in single quotes. The other options for “bat0” are all empty in the default configuration.

Options in /etc/config/batman-adv are described in the batctl man page, available at this time at https://downloads.open-mesh.org/batman/manpages/batctl.8.html

config 'mesh' 'bat0'
        option 'aggregated_ogms'
        option 'ap_isolation'
        option 'bonding'
        option 'fragmentation'
        option 'gw_bandwidth'
        option 'gw_mode'
        option 'gw_sel_class'
        option 'log_level'
        option 'orig_interval'
        option 'bridge_loop_avoidance'
        option 'distributed_arp_table'
        option 'multicast_mode'
        option 'network_coding'
        option 'hop_penalty'
        option 'isolation_mark'

# yet another batX instance
# config 'mesh' 'bat5'
#       option 'interfaces' 'second_mesh'

The UCI configuration is applied by /lib/netifd/proto/batadv.sh at this time.

VLAN-specific configuration is not discussed here, but can be seen by examining /lib/netifd/proto/batadv_vlan.sh and consulting the batman-adv documentation.

(1) 802.11s Encrypted, Authenticated Mesh

For purposes of this walk-through, an 802.11s mesh is used. Other mesh and point-to-point links can be used.

Configure all mesh nodes with the same /etc/config/wireless stanza:

config wifi-iface 'mesh0'
        option device 'radio5'
        option ifname 'mesh0'
        option network 'nwi_mesh0'
        option mode 'mesh'
        option mesh_fwding '0'
        option mesh_id '<your advertised mesh "name" goes here>'
        option encryption 'psk2+ccmp'
        option sae_password '<your secure pass phrase goes here>'

Important points:

  • radio5 needs to match the declaration in /etc/config/wireless of your device, such as config wifi-device 'radio5'
    • The radios all need to be on the same channel and be configured to interoperate with each other (basically configured the same as each other)
  • mesh0 is a fixed identifier for the name of the interface itself (otherwise it will be dynamically assigned)
  • nwi_mesh0 is a reference to the entry in /etc/config/network that will be used to set the MTU and associate it with a batman-adv interface. Its name is selected here for readability, not to match a construct such as OpenWrt's prefixing of interface names.
  • mesh_fwding '0' turns off 802.11s forwarding/routing; it will be handled by batman-adv at each node
  • psk2+ccmp is believed to be the most secure option for home users at this time
  • sae_password is used here instead of key to allow for the more advanced features of SAE available with hostapd and 802.11s.

You can consistently use key instead of sae_password if you prefer the older notation and don't need the advanced features of SAE. sae_password was added to OpenWrt configuration in mid-2018, prior to the 18.06 release. Details of its use in advanced configuration can be found by examining hostapd.conf

Once applied, mesh0 should be seen in the output of ip link. The operation of the mesh can be confirmed with iw dev mesh0 station dump. You should see the other peers in the listing with mesh plink: ESTAB indicating that the peering has been successful. This can be checked prior to the changes in /etc/config/network being made by commenting-out option network 'nwi_mesh0'.
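
The peering check can be reduced to a count, as in this sketch. The helper name count_estab_peers is an assumption for illustration; it counts the mesh plink lines that report ESTAB in whatever station dump text it is given.

```shell
# Hypothetical helper: reads `iw dev <ifname> station dump` output on
# stdin and prints how many peers report an established peer link.
count_estab_peers() {
    grep -c 'mesh plink:[[:space:]]*ESTAB'
}

# On a node:
#   iw dev mesh0 station dump | count_estab_peers
```

On a healthy mesh the count should match the number of other nodes within radio range, which may be fewer than the total number of nodes.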

If testing the mesh prior to association with an entry in /etc/config/network, you may need to use ip to bring the interfaces up and modify their parameters, such as MTU.
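
A sketch of that manual bring-up follows. The function name mesh_link_up and the RUN variable are conventions invented for this example; with RUN=echo the commands are only printed, so the block is safe to run anywhere. The interface name and MTU are the ones used in this walk-through.

```shell
# Hypothetical helper: set the MTU of a raw mesh interface and bring it up.
# If RUN is set to "echo", the ip commands are printed instead of executed.
mesh_link_up() {
    ifname=$1
    mtu=$2
    ${RUN:-} ip link set dev "$ifname" mtu "$mtu" &&
    ${RUN:-} ip link set dev "$ifname" up
}

# Preview only. To apply on a node, call mesh_link_up directly as root.
( RUN=echo; mesh_link_up mesh0 2304 )
```

Changes made this way are not persistent; they are replaced once the interface is managed through /etc/config/network.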

(2) Associate batman-adv With Mesh

Now that the mesh is (or could be) up and running, create a batman-adv interface for routing traffic over the mesh.

Configure all mesh nodes with the same /etc/config/network stanza:

config interface 'nwi_mesh0'
        option ifname 'mesh0'
        option proto 'batadv'
        option mesh 'bat0'
#       option routing_algo 'BATMAN_V'
        option mtu '2304'

Important points:

  • nwi_mesh0 needs to match the declaration in /etc/config/wireless of your device, such as option network 'nwi_mesh0'
  • mesh0 needs to match the declaration in /etc/config/wireless of your device, such as option ifname 'mesh0'
  • The MTU is set to 2304 here, which is what an 802.11s link will support. A minimum of 1532 is suggested to provide batman-adv routing of typical 1500-byte Ethernet packets. The MTU cannot exceed the link's native MTU.
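
The MTU arithmetic can be sketched as below. The 32-byte overhead is inferred from the 1532-byte suggestion above (1532 - 1500) and is an assumption for illustration, not a value read from the kernel or the batman-adv sources; the function name is likewise hypothetical.

```shell
# Assumed per-packet batman-adv overhead, inferred from 1532 - 1500.
BATADV_OVERHEAD=32

# Hypothetical helper: print the payload room left on a link of the
# given MTU once the assumed batman-adv overhead is subtracted.
batadv_payload_room() {
    echo $(( $1 - BATADV_OVERHEAD ))
}

batadv_payload_room 2304   # 802.11s link as configured above; prints 2272
batadv_payload_room 1500   # plain Ethernet link; prints 1468
```

The second result is below 1500, which is why full-size Ethernet packets carried over an Ethernet link rely on batman-adv fragmentation.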

At this time (Fall, 2018, batman-adv 2018.2), the default routing algorithm is BATMAN_IV (4). BATMAN_V (5), in the author's experience, is not robust in this on-premise application. See, for example, https://forum.openwrt.org/t/batman-v-routing-on-prem-connectivity-loss-seen/20432

With the mesh up and the network configuration done, you should see bat0 in the output of ip link. The output of batctl o and/or batctl n should indicate that the various batman-adv nodes are “seeing” each other over the mesh.

(3) Bridge VLANs Over batman-adv

Note: This does not discuss how to configure switches for VLANs, associate wireless interfaces with bridges, or firewall traffic. Please consult other documentation for details of those operations as they may apply to your situation.

With batman-adv now able to route packets among peers, the remaining step is to use that facility to route “useful” traffic.

As appropriate for each node, edit /etc/config/network based on these examples. Multiple VLANs can be bridged/routed over a single batman-adv interface.

The option delegate '0' “turns off” certain IPv6-related features on the interface. If you are using IPv6, you should examine whether this is the proper setting for your application.

By current convention, OpenWrt will name the interfaces for bridges by prefixing them with br- yielding br-vlan1111 and the like. There is the typical 15-character limit on interface names in the Linux kernel, which needs to include the br- prefix.
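
A quick way to check a candidate name against that limit is sketched here. The helper name bridge_name_ok is made up for this example; it simply prepends br- and compares the length with 15.

```shell
# Hypothetical helper: succeeds if "br-<name>" fits within the
# 15-character interface-name limit.
bridge_name_ok() {
    full="br-$1"
    [ "${#full}" -le 15 ]
}

bridge_name_ok vlan1111 && echo "br-vlan1111 fits"
bridge_name_ok guest_network_1 || echo "br-guest_network_1 is too long"
```

With the three-character prefix, this leaves 12 characters for the name given in config interface.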

Each of these bridges will have an MTU no larger than the smallest MTU of its bridged interfaces. As soon as a typical Ethernet-like interface is included, the MTU will be 1500 or less, even if one or more members of the bridge have a larger MTU. This is how bridges operate in Linux in general, not an OpenWrt-specific limitation.
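
The rule amounts to taking the minimum over the member MTUs, as in this sketch. The helper name and the MTU values passed to it are illustrative only; on a node, the real values can be read from /sys/class/net/<ifname>/mtu.

```shell
# Hypothetical helper: print the smallest of the MTU values given as
# arguments, which bounds the MTU of a bridge with those members.
bridge_mtu_bound() {
    min=$1
    shift
    for m in "$@"; do
        if [ "$m" -lt "$min" ]; then
            min=$m
        fi
    done
    echo "$min"
}

# Illustrative values: one larger-MTU member and one Ethernet-like member.
bridge_mtu_bound 2304 1500   # prints 1500
```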

Bridge With IPv4 Address

config interface 'vlan1111'
        option type 'bridge'
        option stp '1'
        option ifname 'eth1.1111 bat0.1111'
        option proto 'static'
        option ipaddr '192.168.11.11'
        option netmask '255.255.255.0'
        option delegate '0'

Bridge Without IPv4 Address

config interface 'vlan2222'
        option type 'bridge'
        option stp '1'
        option ifname 'eth1.2222 bat0.2222'
        option proto 'none'
        option auto '1'
        option delegate '0'

Bridge Without Ethernet Interface

(such as for “only” a bridged wireless interface)

config interface 'vlan3333'
        option type 'bridge'
        option stp '1'
        option ifname 'bat0.3333'
        option proto 'none'
        option auto '1'
        option delegate '0'

(Optional) /etc/bat-hosts

This step is optional; it only makes some diagnostic output easier to read.

By creating the file /etc/bat-hosts the output of many batctl commands will replace the MAC addresses with symbolic names. These names do not need to be the host name, nor be consistent with DNS.

The MAC address to use is that of the “raw” interfaces that are used by bat0 – in this example configuration, those of mesh0 on each of the nodes.

32:b5:c2:aa:aa:aa	office.5g
c6:6e:1f:bb:bb:bb	garage.5g
32:b5:c2:cc:cc:cc	front.5g
1a:d6:c7:dd:dd:dd	back.5g
c6:e9:84:ee:ee:ee	devel.5g
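
Lines in this format can be produced with a small helper such as the sketch below. The function name bat_hosts_line is hypothetical; it joins a MAC address and a name with a tab, as in the listing above.

```shell
# Hypothetical helper: print one /etc/bat-hosts line (MAC, tab, name).
bat_hosts_line() {
    printf '%s\t%s\n' "$1" "$2"
}

bat_hosts_line 32:b5:c2:aa:aa:aa office.5g

# On a node, the MAC of the raw interface can be read from sysfs:
#   bat_hosts_line "$(cat /sys/class/net/mesh0/address)" "$(uname -n).5g"
```

Each node's line still has to be collected into the /etc/bat-hosts file of every node where the symbolic names are wanted.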

In Operation

On- and Off-Mesh Access

When bridging VLANs as described above, with one node bridging to the wired networks, off-mesh clients have access to mesh clients and vice-versa without any additional configuration. Off-mesh clients can also access other off-mesh clients over the mesh (such as clients of different APs and/or those on the wired network). With a five-node, on-premise mesh using BATMAN_IV, initial ping requests are typically returned within one or two seconds.

Multiple Nodes Bridging to Same Network Segment

Use of multiple nodes bridged to the same wired networks has not been deeply examined at this time. STP might be sufficient as a “poor-man's” approach, though there have been cases with other networking protocols where bridge loops involving the on-device switches did not seem to be detected and resolved by STP alone.

In a quick test, two of the OpenWrt nodes (of the five deployed and participating in batman-adv) were connected to the wired network through Cisco SG300-series switches. After the “active” cable was unplugged, “fail-over” took ~90 seconds, with disturbances evident for another half minute. The output of batctl cl (“claim table”) appears to empty and update on about the same time scale. STP in the OpenWrt bridges has a hello_time of 2.00 s, max_age of 20.00 s, and forward_delay of 2.00 s, suggesting an STP cut-over time of ~26 seconds. Watching the claim table on one node while bringing down bat0 on the “preferred gateway” (without changing Ethernet connectivity) showed a one-minute delay before the entries associated with the downed node were removed. As a result, the delays are currently believed to be primarily due to batman-adv operation rather than STP.

There is a batman-adv feature around advertising gateways. It appears to be designed for larger-scale deployments and seems to work by moderating DHCP assignments, rather than by dynamically routing packets with the mesh-routing logic itself.

Other Systems' Logs Flooded With kernel: arp: 43:05:43:05:00:00 is multicast

The batman-adv Bridge Loop Avoidance Protocol appears to use gratuitous ARP for 0.0.0.0 with the multicast bit set, acknowledging that “this is a misuse of ARP packets”. Some RFC-compliant systems will log this as an error, as “Installing such entries is an RFC 1812 violation, but some proprietary load balancing techniques require routers to do so.”

Disabling Bridge Loop Avoidance Protocol in /etc/config/batman-adv with option bridge_loop_avoidance 0 is one way to resolve this, though STP or other loop-avoidance methods are strongly suggested if this is done.
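
The same change can be made from the command line with uci, as in this sketch. The function name disable_bla and the RUN variable are conventions invented for this example; with RUN=echo the commands are only printed. The section name bat0 is the one from the default /etc/config/batman-adv shown earlier.

```shell
# Hypothetical helper: turn off bridge loop avoidance for one batman-adv
# mesh section. If RUN is set to "echo", the commands are printed instead.
disable_bla() {
    mesh=${1:-bat0}
    ${RUN:-} uci set "batman-adv.${mesh}.bridge_loop_avoidance=0" &&
    ${RUN:-} uci commit batman-adv
}

# Preview only. To apply on a node, call disable_bla directly as root.
( RUN=echo; disable_bla bat0 )
```

The setting is read when the interface is configured, so a network restart or reboot is presumably needed before it takes effect.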

For FreeBSD and FreeBSD-based systems, setting net.link.ether.inet.allow_multicast=1 should remove the log messages, but will “pollute” the ARP table as described in arp(4).

docs/guide-user/network/wifi/mesh/batman.txt · Last modified: 2018/09/07 22:45 by jeff