This guide explains how to manually load balance OpenWrt by pinning the IRQs of specific ethernet ports to chosen CPU cores and assigning one or more cores to the networking queues.
From : https://www.kernel.org/doc/html/latest/core-api/irq/irq-affinity.html
/proc/irq/IRQ#/smp_affinity and /proc/irq/IRQ#/smp_affinity_list specify which target CPUs are permitted for a given IRQ source. It’s a bitmask (smp_affinity) or CPU list (smp_affinity_list) of allowed CPUs. It’s not allowed to turn off all CPUs, and if an IRQ controller does not support IRQ affinity then the value will not change from the default of all CPUs.
/proc/irq/default_smp_affinity specifies default affinity mask that applies to all non-active IRQs. Once IRQ is allocated/activated its affinity bitmask will be set to the default mask. It can then be changed as described above. Default mask is 0xffffffff.
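To see the default mask currently in use on your router:
cat /proc/irq/default_smp_affinity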
Setting an IRQ to a specific CPU or group of CPUs requires a bit mask: in binary we enable a bit for each CPU we want, then convert to hex to get the bitmask setting. This way we can restrict IRQs to specific CPUs to aid load balancing, or, on heterogeneous SoCs, choose between the high- and low-power cores. (A small shell helper for the conversion is sketched after the table below.)
Bitmasks for CPUs
| Binary | Hex | CPU |
|---|---|---|
| 00000001 | 1 | 0 |
| 00000010 | 2 | 1 |
| 00000011 | 3 | 0,1 |
| 00000100 | 4 | 2 |
| 00000101 | 5 | 0,2 |
| 00000110 | 6 | 1,2 |
| 00000111 | 7 | 0,1,2 |
| 00001000 | 8 | 3 |
| 00001001 | 9 | 0,3 |
| 00001010 | A | 1,3 |
| 00001011 | B | 0,1,3 |
| 00001100 | C | 2,3 |
| 00001101 | D | 0,2,3 |
| 00001110 | E | 1,2,3 |
| 00001111 | F | 0,1,2,3 |
| ... | ... | ... |
| 00110000 | 30 | 4,5 |
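If you prefer not to work the mask out by hand, a small shell helper can do the conversion. This is a minimal sketch for a BusyBox/POSIX shell; the function name cpus_to_mask is purely illustrative:

# Build a hex affinity mask from a list of CPU numbers
cpus_to_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$((mask | (1 << cpu)))
    done
    printf '%x\n' "$mask"
}
# Example: cpus_to_mask 4 5 prints 30, matching the last row of the table above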
OpenWrt routers ship with defaults already set for multi-CPU usage.
The following scripts are responsible for setting these up.
cat /etc/hotplug.d/net/20-smp-packet-steering
cat /etc/hotplug.d/net/40-net-smp-affinity
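To see which CPUs each IRQ is currently allowed on (for example after these scripts have run), you can loop over /proc/irq:

for irq in /proc/irq/[0-9]*; do
    # Print the IRQ number and its current affinity bitmask
    printf '%s: %s\n' "${irq##*/}" "$(cat "$irq/smp_affinity")"
done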
A more automated solution is to use irqbalance to help spread the load.
Irqbalance is a Linux daemon that distributes interrupts across multiple logical CPUs, with the aim of improving overall performance through a more even load and better power consumption.
However, it does not always produce a predictable load distribution, so manual tuning can give better control over where the load ends up.
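If you do want to try irqbalance first, it is available as an OpenWrt package. As a rough sketch (the service ships disabled, so it has to be enabled in /etc/config/irqbalance):

opkg update && opkg install irqbalance
# Set option enabled '1' in /etc/config/irqbalance, then:
/etc/init.d/irqbalance enable
/etc/init.d/irqbalance start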
First we need to find and identify the interrupts.
To view the current interrupt counters and their assignments (and to monitor any changes):
cat /proc/interrupts
The output below is from a NanoPi R4S, which has four A53 cores (CPUs 0-3) and two A72 cores (CPUs 4 and 5).
root@OpenWrt:~# cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3 CPU4 CPU5
23: 27142318 12185540 5391618 2352924 137831569 145154023 GICv3 30 Level arch_timer
25: 67873664 61308794 11619382 2662637 16876546 43550490 GICv3 113 Level rk_timer
26: 0 0 0 0 0 0 GICv3-23 0 Level arm-pmu
27: 0 0 0 0 0 0 GICv3-23 1 Level arm-pmu
28: 0 0 0 0 0 0 GICv3 37 Level ff6d0000.dma-controller
29: 0 0 0 0 0 0 GICv3 38 Level ff6d0000.dma-controller
30: 0 0 0 0 0 0 GICv3 39 Level ff6e0000.dma-controller
31: 0 0 0 0 0 0 GICv3 40 Level ff6e0000.dma-controller
32: 1 0 0 0 0 0 GICv3 81 Level pcie-sys
34: 0 0 0 0 0 0 GICv3 83 Level pcie-client
35: 0 0 0 0 165575364 0 GICv3 44 Level eth0
36: 20438175 0 0 0 0 0 GICv3 97 Level dw-mci
37: 0 0 0 0 0 0 GICv3 58 Level ehci_hcd:usb1
38: 0 0 0 0 0 0 GICv3 60 Level ohci_hcd:usb3
39: 0 0 0 0 0 0 GICv3 62 Level ehci_hcd:usb2
40: 0 0 0 0 0 0 GICv3 64 Level ohci_hcd:usb4
42: 0 0 0 0 0 0 GICv3 91 Level ff110000.i2c
43: 6 0 0 0 0 0 GICv3 67 Level ff120000.i2c
44: 0 0 0 0 0 0 GICv3 68 Level ff160000.i2c
45: 6 0 0 0 0 0 GICv3 132 Level ttyS2
46: 0 0 0 0 0 0 GICv3 129 Level rockchip_thermal
47: 6393498 0 0 0 0 0 GICv3 89 Level ff3c0000.i2c
50: 0 0 0 0 0 0 GICv3 147 Level ff650800.iommu
52: 0 0 0 0 0 0 GICv3 149 Level ff660480.iommu
56: 0 0 0 0 0 0 GICv3 151 Level ff8f3f00.iommu
57: 0 0 0 0 0 0 GICv3 150 Level ff903f00.iommu
58: 0 0 0 0 0 0 GICv3 75 Level ff914000.iommu
59: 0 0 0 0 0 0 GICv3 76 Level ff924000.iommu
69: 0 0 0 0 0 0 GICv3 59 Level rockchip_usb2phy
70: 0 0 0 0 0 0 GICv3 137 Level xhci-hcd:usb5
71: 0 0 0 0 0 0 GICv3 142 Level xhci-hcd:usb7
72: 0 0 0 0 0 0 rockchip_gpio_irq 21 Level rk808
78: 0 0 0 0 0 0 rk808 5 Edge RTC alarm
82: 0 0 0 0 0 0 rockchip_gpio_irq 7 Edge fe320000.mmc cd
84: 0 0 0 0 0 0 ITS-MSI 0 Edge PCIe PME, aerdrv
85: 10 0 0 0 0 0 rockchip_gpio_irq 10 Level stmmac-0:01
86: 0 0 0 0 0 0 rockchip_gpio_irq 22 Edge gpio-keys
87: 0 0 0 0 0 1156859750 ITS-MSI 524288 Edge eth1
IPI0: 7085496 10371429 7027071 6124604 310818 114897 Rescheduling interrupts
IPI1: 2817025 2457651 882759 515246 2752519 543745 Function call interrupts
IPI2: 0 0 0 0 0 0 CPU stop interrupts
IPI3: 0 0 0 0 0 0 CPU stop (for crash dump) interrupts
IPI4: 5558568 4633615 2762056 1122565 763629 3435183 Timer broadcast interrupts
IPI5: 413711 300799 161541 117511 109020 76881 IRQ work interrupts
IPI6: 0 0 0 0 0 0 CPU wake-up interrupts
Err: 0
To find the IRQs for your ethernet ports :
grep eth /proc/interrupts
root@OpenWrt:~# grep eth /proc/interrupts
35: 0 0 0 0 165661665 0 GICv3 44 Level eth0
87: 0 0 0 0 0 1157284700 ITS-MSI 524288 Edge eth1
So here eth0 is IRQ 35 and eth1 is 87.
A given hardware interrupt is only serviced by one core at a time, so assign each ethernet IRQ to its own core.
Set eth0 interrupt to core 0
echo 1 > /proc/irq/35/smp_affinity
Set eth1 interrupt to core 1
echo 2 > /proc/irq/87/smp_affinity
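To confirm a change took effect, read the mask back (smp_affinity_list shows the allowed CPUs as a list rather than a bitmask):

cat /proc/irq/35/smp_affinity
cat /proc/irq/35/smp_affinity_list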
You could get the IRQ with a command instead of hard coding it:
#eth0 IRQ
echo f > /proc/irq/$(grep eth0 /proc/interrupts | awk -F ':' '{print $1}' | xargs)/smp_affinity
#eth1 IRQ
echo f > /proc/irq/$(grep eth1 /proc/interrupts | awk -F ':' '{print $1}' | xargs)/smp_affinity
The grep command prints the line from /proc/interrupts, awk prints the first column which is the IRQ number, and xargs trims the whitespace.
For kernel 5.15 you need to use:
echo -n #HEX# > /proc/irq/#IRQ-NUMBER#/smp_affinity
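For example, to pin eth0 (IRQ 35 on this board) to core 0 on a 5.15 kernel:

echo -n 1 > /proc/irq/35/smp_affinity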
If you restart Smart Queue Management or change SQM settings, the CPU affinity will be reset and you will need to re-apply your settings.
Reference: Receive Packet Steering (RPS)
Receive Packet Steering (RPS) is similar to RSS in that it is used to direct packets to specific CPUs for processing. However, RPS is implemented at the software level, and helps to prevent the hardware queue of a single network interface card from becoming a bottleneck in network traffic.
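To check the current RPS mask of a receive queue:

cat /sys/class/net/eth0/queues/rx-0/rps_cpus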
Network Queues can be spread across all CPUs if required or fixed to just one.
Set eth0 queue to core 2 (mask 4)
echo 4 > /sys/class/net/eth0/queues/rx-0/rps_cpus
Set eth1 queue to core 3 (mask 8)
echo 8 > /sys/class/net/eth1/queues/rx-0/rps_cpus
Set eth0 and eth1 to use all 6 cores
echo 3f > /sys/class/net/eth0/queues/rx-0/rps_cpus
echo 3f > /sys/class/net/eth1/queues/rx-0/rps_cpus
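As an alternative on this board, the two faster A72 cores are CPUs 4 and 5, so a mask of 30 (see the table above) steers queue processing onto just the big cores and leaves the A53 cores for the IRQs:

echo 30 > /sys/class/net/eth0/queues/rx-0/rps_cpus
echo 30 > /sys/class/net/eth1/queues/rx-0/rps_cpus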
To make the changes persistent, either edit
/etc/hotplug.d/net/40-net-smp-affinity
or create your own script with your new values and place it in
/etc/hotplug.d/net/
so that it runs after the default script.
eg:
/etc/hotplug.d/net/50-mysettings-for-net-smp-affinity
#eth0 on core 0
echo 1 > /proc/irq/35/smp_affinity
#eth1 on core 1
echo 2 > /proc/irq/87/smp_affinity
#queues on all cores
echo 3f > /sys/class/net/eth0/queues/rx-0/rps_cpus
echo 3f > /sys/class/net/eth1/queues/rx-0/rps_cpus
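A slightly more defensive variant of the same file is sketched below. It reuses the IRQ lookup from earlier, assumes the hotplug ACTION variable is set as it is for the stock scripts in this directory, and the helper name set_irq_affinity is purely illustrative:

# Only act when an interface is added
[ "$ACTION" = add ] || exit 0

# Illustrative helper: look up a device's IRQ and write its affinity mask
set_irq_affinity() {
    irq=$(grep -w "$1" /proc/interrupts | awk -F ':' '{print $1}' | xargs)
    [ -n "$irq" ] && echo "$2" > "/proc/irq/$irq/smp_affinity"
}

set_irq_affinity eth0 1   #eth0 on core 0
set_irq_affinity eth1 2   #eth1 on core 1

#queues on all cores
echo 3f > /sys/class/net/eth0/queues/rx-0/rps_cpus
echo 3f > /sys/class/net/eth1/queues/rx-0/rps_cpus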
Now reboot and check to ensure the settings have taken.
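After the reboot, the settings can be checked with the same files used above (IRQ numbers 35 and 87 on this board):

cat /proc/irq/35/smp_affinity
cat /proc/irq/87/smp_affinity
cat /sys/class/net/eth0/queues/rx-0/rps_cpus
cat /sys/class/net/eth1/queues/rx-0/rps_cpus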