Federated Learning for TinyML Devices

Over the past few years I have focused on privacy-preserving, energy-efficient AI for resource-constrained IoT devices—publishing papers, building hardware demos, and open-sourcing code.
Below you’ll find the highlights, starting with a deep dive into our MDPI Electronics article.

Papers:
Electronics 11(4): 573, 2022 [Open Access]
ACM SenSys ’21 Poster [DOI]
UPC, Full Thesis [Open Access]

Motivation

Typical TinyML workflows train models offline on powerful GPUs, then deploy quantized versions for inference only. We asked: what if training itself could run on-device—collaboratively and privately—via Federated Learning (FL)?

Prototype Hardware

Figure 1. Three Arduino Nano 33 BLE Sense boards wired for federated keyword-spotting experiments. Each board acts as an FL client, connected via USB to a Python FL server. Buttons trigger local recording and training rounds.
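To make the server's role concrete, here is a minimal sketch of FedAvg-style aggregation, assuming each client sends its weights as a flat list of floats together with the number of local samples it trained on. The function name and the data layout are illustrative assumptions, not the code from our repository.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      sample_counts: Sequence[int]) -> List[float]:
    """Sample-weighted average of flattened client weight vectors."""
    if not client_weights or len(client_weights) != len(sample_counts):
        raise ValueError("need one sample count per client")
    n_params = len(client_weights[0])
    if any(len(w) != n_params for w in client_weights):
        raise ValueError("all clients must send the same number of weights")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    global_weights = [0.0] * n_params
    for weights, count in zip(client_weights, sample_counts):
        share = count / total  # this client's contribution to the average
        for i, w in enumerate(weights):
            global_weights[i] += share * w
    return global_weights


# Three hypothetical clients with two weights each; the third has twice the data.
clients = [[0.0, 1.0], [1.0, 1.0], [2.0, 4.0]]
counts = [1, 1, 2]
print(federated_average(clients, counts))  # [1.25, 2.5]
```

Weighting by sample count means a board that recorded more keywords pulls the global model further toward its local solution; with equal counts this reduces to a plain mean.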

Contributions

  • First fully on-device FL demo on 64-kB microcontrollers—no simulated nodes.
  • Design-space exploration of FL round frequency vs. bandwidth, memory, and convergence.
  • Open, reproducible dataset & code for keyword-spotting on Arduino hardware.

Experimental Highlights

Parameter | Value(s) tested | Insight
FL round interval | 5–50 epochs | Faster rounds ↓ loss sooner but ↑ UART traffic / wall time
Hidden-layer size | 16–64 nodes | 25–32 nodes best trade-off between RAM (<64 kB) and accuracy
Learning rate / momentum | 0.4–0.8 / 0.5 | LR 0.6 with momentum 0.5 converged quickest for KWS
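The round-interval trade-off can be approximated with simple arithmetic. The sketch below assumes that every round moves the full model once in each direction per client, and that the link behaves like an 8N1 serial line (10 bits on the wire per byte). The epoch budget, model size, and baud rate are hypothetical values chosen for illustration, not measurements from the paper.

```python
def uart_traffic(total_epochs: int, round_interval: int, model_bytes: int,
                 n_clients: int, baud: int = 115200):
    """Estimate FL rounds, total bytes moved, and serial transfer time in seconds."""
    rounds = total_epochs // round_interval
    # Each round: every client uploads its model and downloads the global one.
    total_bytes = rounds * 2 * model_bytes * n_clients
    seconds = total_bytes * 10 / baud  # 8 data bits + start + stop bit per byte
    return rounds, total_bytes, seconds


# Hypothetical budget: 100 local epochs, 64,000-byte model, 3 clients.
for interval in (5, 10, 25, 50):
    rounds, total_bytes, seconds = uart_traffic(100, interval, 64_000, 3)
    print(f"interval={interval:2d} rounds={rounds:2d} "
          f"bytes={total_bytes:>9,d} transfer_s={seconds:.0f}")
```

Under these assumptions, traffic scales inversely with the interval: going from 50 to 5 epochs per round multiplies the bytes on the wire by ten.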

Even with ≤65 kB model footprints, the global model achieved 92 % accuracy on user-recorded keywords after just 10 FL rounds—demonstrating that practical on-device training is within reach for low-power IoT deployments.
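A rough way to reason about model footprint is to count parameters in a single-hidden-layer network and multiply by the storage size per parameter. The layer sizes below (650 inputs, 25 hidden nodes, 3 output classes) and the 4-byte float32 storage are assumptions for illustration; this post does not state the exact architecture.

```python
def mlp_param_count(n_in: int, n_hidden: int, n_out: int) -> int:
    """Weights plus biases of a fully connected net with one hidden layer."""
    return n_in * n_hidden + n_hidden + n_hidden * n_out + n_out


def footprint_bytes(n_in: int, n_hidden: int, n_out: int,
                    bytes_per_param: int = 4) -> int:
    """Storage for the parameters alone (float32 by default)."""
    return mlp_param_count(n_in, n_hidden, n_out) * bytes_per_param


# Assumed sizes: 650 input features, 25 hidden nodes, 3 keyword classes.
print(mlp_param_count(650, 25, 3))   # 16353
print(footprint_bytes(650, 25, 3))   # 65412
```

Note that this counts parameters only. Training on-device also needs room for activations and gradients, and momentum-based SGD keeps one extra value per parameter, so the working set during training is larger than the stored model.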


Lightweight FL Framework for Heterogeneous IoT (SenSys 2021)

Venue: ACM SenSys ’21 Poster Session
We proposed a compression-aware aggregation scheme that cut uplink bytes by a factor of 3 while matching FedAvg accuracy, paving the way for FL across mixed 802.15.4 / Wi-Fi sensor networks.
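The poster's scheme is not detailed here, so the following is not that method. It is a generic sketch of one common way to shrink an uplink payload: uniform 8-bit quantization of a weight update, with the minimum and step size sent alongside so the server can reconstruct approximate floats before aggregating. All names are hypothetical, and the reduction this achieves differs from the factor reported above.

```python
from typing import List, Sequence, Tuple


def quantize(update: Sequence[float]) -> Tuple[bytes, float, float]:
    """Map floats to one byte each; returns (codes, minimum, step size)."""
    if not update:
        raise ValueError("update must not be empty")
    lo, hi = min(update), max(update)
    scale = (hi - lo) / 255 or 1.0  # avoid a zero step when all values are equal
    codes = bytes(min(255, round((x - lo) / scale)) for x in update)
    return codes, lo, scale


def dequantize(codes: bytes, lo: float, scale: float) -> List[float]:
    """Server-side reconstruction of the approximate update."""
    return [lo + c * scale for c in codes]


update = [-0.5, -0.1, 0.0, 0.25, 0.5]
codes, lo, scale = quantize(update)
restored = dequantize(codes, lo, scale)
worst = max(abs(a - b) for a, b in zip(update, restored))
print(len(codes), "bytes instead of", 4 * len(update))
print("max reconstruction error:", worst)
```

The reconstruction error per weight is bounded by half a quantization step, which is why the value range of the update matters: a few outliers widen the step for every other weight.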


TinyML-FederatedLearning GitHub Repo

Code: https://github.com/marcmonfort/TinyML-FederatedLearning
The repo contains:

  • Arduino/Nano client firmware (TensorFlow Lite-Micro + serial FL layer)
  • Python-based FL server & monitoring scripts
  • Step-by-step tutorial to reproduce the Electronics paper results on off-the-shelf boards
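For readers wondering what a serial FL layer has to do, here is a minimal sketch of framing a weight vector for transfer over a byte stream. The frame layout (round id, weight count, float32 payload, CRC-32 trailer) is a hypothetical format invented for this example and is not the wire protocol used in the repo.

```python
import struct
import zlib
from typing import List, Sequence, Tuple

# Hypothetical header: round id (uint16) and weight count (uint32), little-endian.
HEADER = struct.Struct("<HI")


def pack_weights(round_id: int, weights: Sequence[float]) -> bytes:
    """Serialize weights as float32 and append a CRC-32 of header + payload."""
    payload = struct.pack(f"<{len(weights)}f", *weights)
    body = HEADER.pack(round_id, len(weights)) + payload
    return body + struct.pack("<I", zlib.crc32(body))


def unpack_weights(frame: bytes) -> Tuple[int, List[float]]:
    """Verify the CRC and recover (round id, weights) from a frame."""
    body = frame[:-4]
    (crc,) = struct.unpack("<I", frame[-4:])
    if zlib.crc32(body) != crc:
        raise ValueError("CRC mismatch: frame corrupted in transit")
    round_id, count = HEADER.unpack_from(body)
    weights = list(struct.unpack_from(f"<{count}f", body, HEADER.size))
    return round_id, weights


frame = pack_weights(7, [0.5, -1.25, 3.0])
print(len(frame))             # 22
print(unpack_weights(frame))  # (7, [0.5, -1.25, 3.0])
```

Little-endian float32 is chosen here so that a Cortex-M client could copy the payload straight into its weight buffer without byte swapping; the checksum lets either side reject a frame that was damaged on the line.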

Closing Thoughts

These projects collectively push the envelope of edge autonomy—showing that even 64 kB MCUs can learn in the field while keeping raw data local. Next steps include:

  1. Evaluating non-IID data splits in larger fleets
  2. Integrating LoRaWAN transport for truly untethered FL
  3. Exploring on-device differential privacy to harden the pipeline
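As a pointer for the third item, the usual building block is to clip each client's update to a fixed L2 norm and add Gaussian noise before it leaves the device. The sketch below shows only that mechanical step. The function name and parameters are hypothetical, and choosing a noise level that corresponds to a formal privacy budget is a separate analysis that is not shown here.

```python
import math
import random
from typing import List, Sequence


def privatize_update(update: Sequence[float], clip_norm: float,
                     noise_std: float, rng: random.Random) -> List[float]:
    """Clip the update to an L2 norm of clip_norm, then add Gaussian noise."""
    norm = math.sqrt(sum(x * x for x in update))
    # Scale down only if the update is longer than the clipping threshold.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale + rng.gauss(0.0, noise_std) for x in update]


rng = random.Random(0)  # seeded only to make the example repeatable
noisy = privatize_update([3.0, 4.0], clip_norm=1.0, noise_std=0.1, rng=rng)
print(noisy)
```

Clipping bounds how much any single device can shift the global model, which is what makes the added noise meaningful; on a microcontroller the open questions are the cost of generating good random numbers and the accuracy lost to noise with only a handful of clients.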

Looking forward to building on this foundation and collaborating with the community!

Author: Marc Monfort
Publish Date: Jul 30, 2025
License: CC BY 4.0