Federated Learning for TinyML Devices

Over the past few years I have focused on privacy-preserving, energy-efficient AI for resource-constrained IoT devices—publishing papers, building hardware demos, and open-sourcing code.
Below you’ll find the highlights, starting with a deep dive into our MDPI Electronics article.

Papers:
Electronics 11(4): 573, 2022 [Open Access]
ACM SenSys ’21 Poster [DOI]
UPC, Full Thesis [Open Access]

Motivation

Typical TinyML workflows train models offline on powerful GPUs, then deploy quantized versions for inference only. We asked: what if training itself could run on-device—collaboratively and privately—via Federated Learning (FL)?

Prototype Hardware

**Figure 1.** Triple-board setup for federated keyword-spotting experiments: each Arduino Nano 33 BLE Sense acts as an FL client, connected via USB to a Python FL server. Buttons trigger local recording and training rounds.
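To make the server's role in a round concrete, here is a minimal sketch of the aggregation step, assuming a FedAvg-style weighted average over flat weight vectors. The function name `fed_avg` and the sample counts are illustrative assumptions, not taken from the repo.

```python
from typing import List, Sequence


def fed_avg(client_weights: Sequence[Sequence[float]],
            client_samples: Sequence[int]) -> List[float]:
    """Weighted average of flat client weight vectors (FedAvg-style).

    Assumption: every client reports its weights as one flat list plus
    the number of local samples it trained on during the round.
    """
    if not client_weights:
        raise ValueError("need at least one client")
    if len(client_weights) != len(client_samples):
        raise ValueError("one sample count per client is required")
    n_params = len(client_weights[0])
    if any(len(w) != n_params for w in client_weights):
        raise ValueError("all clients must report the same number of weights")
    total = sum(client_samples)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    # Each client contributes in proportion to its local sample count.
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_samples)) / total
        for i in range(n_params)
    ]


# Three hypothetical boards, two weights each; the third saw twice the data.
global_weights = fed_avg([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], [1, 1, 2])
print(global_weights)  # [3.5, 4.5]
```

After aggregation the server would send `global_weights` back to every board, which then continues local training from the shared model.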

Contributions

First fully on-device FL demo on 64-kB microcontrollers—no simulated nodes.
Design-space exploration of FL round frequency vs. bandwidth, memory, and convergence.
Open, reproducible dataset & code for keyword-spotting on Arduino hardware.

Experimental Highlights

Parameter	Value(s) tested	Insight
FL round interval	5 – 50 epochs	Faster rounds ↓ loss sooner but ↑ UART traffic / wall time
Hidden-layer size	16 – 64 nodes	25–32 nodes best trade-off between RAM (<64 kB) and accuracy
Learning rate / momentum	0.4–0.8 / 0.5	LR 0.6 with momentum 0.5 converged quickest for KWS
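For readers unfamiliar with the last row, the sketch below shows one classical SGD-with-momentum update using the learning rate and momentum from the table as defaults. It assumes the standard velocity formulation; the firmware's exact update rule may differ, and `sgd_momentum_step` is a name invented for this example.

```python
from typing import List, Sequence, Tuple


def sgd_momentum_step(weights: Sequence[float],
                      grads: Sequence[float],
                      velocity: Sequence[float],
                      lr: float = 0.6,
                      momentum: float = 0.5) -> Tuple[List[float], List[float]]:
    """One classical momentum update: v <- m*v - lr*g, then w <- w + v."""
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


# Toy problem: minimise f(w) = 0.5 * w**2, whose gradient is simply w.
w, v = [1.0], [0.0]
for _ in range(40):
    w, v = sgd_momentum_step(w, grads=w, velocity=v)
print(abs(w[0]) < 1e-3)  # True: the iterate has settled near the minimum at 0
```

On a microcontroller the velocity buffer costs as much RAM as the weights themselves, which is one reason momentum interacts with the hidden-layer size budget.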

Even with ≤65 kB model footprints, the global model achieved 92 % accuracy on user-recorded keywords after just 10 FL rounds—demonstrating that practical on-device training is within reach for low-power IoT deployments.
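A quick way to reason about such footprints is to count the parameters of a single-hidden-layer network and multiply by the bytes per weight. The sketch below assumes float32 weights and uses placeholder input and output sizes (650 features, 3 classes) chosen purely for illustration; they are not stated in this post.

```python
def mlp_param_count(n_inputs: int, n_hidden: int, n_outputs: int) -> int:
    """Weights plus biases of a fully connected net with one hidden layer."""
    hidden_layer = n_inputs * n_hidden + n_hidden
    output_layer = n_hidden * n_outputs + n_outputs
    return hidden_layer + output_layer


def mlp_weight_bytes(n_inputs: int, n_hidden: int, n_outputs: int,
                     bytes_per_param: int = 4) -> int:
    """Storage for the parameters alone (float32 by default)."""
    return mlp_param_count(n_inputs, n_hidden, n_outputs) * bytes_per_param


# Hypothetical dimensions: 650 input features, 25 hidden nodes, 3 keywords.
params = mlp_param_count(650, 25, 3)
size = mlp_weight_bytes(650, 25, 3)
print(params, size)  # 16353 65412
```

The input-to-hidden matrix dominates the total, so the hidden-layer size is the main lever on model size. Gradients, momentum buffers, and audio buffers come on top of this figure.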

Lightweight FL Framework for Heterogeneous IoT (SenSys 2021)

Venue: ACM SenSys ’21 Poster Session
We proposed a compression-aware aggregation scheme that cut uplink traffic by a factor of three while matching FedAvg accuracy, paving the way for FL across mixed 802.15.4 / Wi-Fi sensor networks.
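The poster's actual scheme is not described here. As a stand-in, the sketch below shows one common way to shrink uplink payloads: uniform 8-bit quantization of the weight vector, with the offset and step size sent in a small header. The functions `quantize_u8` and `dequantize_u8` and the header layout are assumptions made for this example.

```python
import struct
from typing import List, Sequence


def quantize_u8(weights: Sequence[float]) -> bytes:
    """Pack weights as one byte each, preceded by (offset, step) as float32."""
    if not weights:
        raise ValueError("nothing to quantize")
    lo, hi = min(weights), max(weights)
    step = (hi - lo) / 255 or 1.0  # avoid a zero step when all weights are equal
    codes = bytes(min(255, max(0, round((w - lo) / step))) for w in weights)
    return struct.pack("<ff", lo, step) + codes


def dequantize_u8(payload: bytes) -> List[float]:
    """Server-side inverse: rebuild approximate float weights."""
    lo, step = struct.unpack_from("<ff", payload)
    return [lo + step * code for code in payload[8:]]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
payload = quantize_u8(weights)
restored = dequantize_u8(payload)
print(len(payload))  # 13 bytes instead of 20 for raw float32
```

The server would dequantize each client's payload before averaging. The reconstruction error per weight is bounded by roughly half a quantization step, which is the accuracy cost any such scheme has to keep in check.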

TinyML-FederatedLearning GitHub Repo

Code: https://github.com/marcmonfort/TinyML-FederatedLearning
The repo contains:

Arduino/Nano client firmware (TensorFlow Lite-Micro + serial FL layer)
Python-based FL server & monitoring scripts
Step-by-step tutorial to reproduce the Electronics paper results on off-the-shelf boards
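To give a feel for what a serial FL layer has to do, here is a minimal sketch of framing a weight vector for transfer over a byte stream. The frame layout (little-endian count, float32 payload, CRC-32 trailer) and the names `encode_frame` and `decode_frame` are hypothetical; the repo's actual wire format may be different.

```python
import struct
import zlib
from typing import List, Sequence


def encode_frame(weights: Sequence[float]) -> bytes:
    """Hypothetical frame: uint16 count, float32 weights, CRC-32 trailer."""
    body = struct.pack(f"<H{len(weights)}f", len(weights), *weights)
    return body + struct.pack("<I", zlib.crc32(body))


def decode_frame(frame: bytes) -> List[float]:
    """Validate the checksum and length, then unpack the weights."""
    if len(frame) < 6:
        raise ValueError("frame too short")
    body = frame[:-4]
    (crc,) = struct.unpack("<I", frame[-4:])
    if zlib.crc32(body) != crc:
        raise ValueError("checksum mismatch")
    (count,) = struct.unpack_from("<H", body)
    if len(body) != 2 + 4 * count:
        raise ValueError("length does not match weight count")
    return list(struct.unpack_from(f"<{count}f", body, 2))


frame = encode_frame([0.5, -1.25, 3.0])
print(len(frame), decode_frame(frame))  # 18 [0.5, -1.25, 3.0]
```

In a real setup the bytes would be written to and read from the USB serial port, for example with a serial library on the Python side. A checksum matters here because a single dropped byte would otherwise silently corrupt every weight that follows it.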

Closing Thoughts

Taken together, these projects show that even 64 kB MCUs can learn in the field while keeping raw data local. Next steps include:

Evaluating non-IID data splits in larger fleets
Integrating LoRaWAN transport for truly untethered FL
Exploring on-device differential privacy to harden the pipeline

Looking forward to building on this foundation and collaborating with the community!