Mixture of experts (MoE) is a machine learning technique in which multiple expert networks are used to divide a problem space into homogeneous regions. MoE is a form of ensemble learning; such systems were also historically called committee machines (Wikipedia).
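To make the idea concrete, here is a minimal sketch of an MoE layer, assuming PyTorch; the expert count, hidden sizes, and top-2 routing are illustrative choices, not details from the text above. A gating network scores the experts for each input, and only the top-scoring experts process it.

```python
# Minimal mixture-of-experts sketch (illustrative sizes; assumes PyTorch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixtureOfExperts(nn.Module):
    def __init__(self, dim: int, num_experts: int = 4, top_k: int = 2):
        super().__init__()
        # Each expert is a small feed-forward network responsible for part of the input space.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.ReLU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )
        # The gating network scores experts per input.
        self.gate = nn.Linear(dim, num_experts)
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scores = self.gate(x)                               # (batch, num_experts)
        weights, indices = scores.topk(self.top_k, dim=-1)  # keep only the top-k experts
        weights = F.softmax(weights, dim=-1)                # renormalise their weights
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e                # inputs routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

# Usage: a batch of 8 vectors of width 16.
layer = MixtureOfExperts(dim=16)
print(layer(torch.randn(8, 16)).shape)  # torch.Size([8, 16])
```

Because only k experts run per input, compute grows with k rather than with the total number of experts, which is what makes sparse MoE models efficient.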
Researchers validated a metric for predicting the compute efficiency of sparse models, and developed Hessian-aware low-bit inference with expert offloading that reduces on-device memory by roughly 60%.
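Expert offloading in general means keeping only a few experts resident in fast memory and loading the rest on demand as the gate routes to them. The sketch below illustrates that generic idea with a small LRU cache; it is not the Hessian-aware method referenced above, and the cache size and loader callable are assumptions for illustration.

```python
# Generic expert-offloading sketch: an LRU cache of resident expert weights.
from collections import OrderedDict

class ExpertCache:
    def __init__(self, load_expert, max_resident: int = 2):
        self.load_expert = load_expert    # callable: expert_id -> weights (e.g. read from disk)
        self.max_resident = max_resident  # how many experts stay in fast memory at once
        self.resident = OrderedDict()     # expert_id -> weights, ordered by recency of use

    def get(self, expert_id):
        if expert_id in self.resident:
            self.resident.move_to_end(expert_id)       # mark as most recently used
        else:
            if len(self.resident) >= self.max_resident:
                self.resident.popitem(last=False)      # evict the least recently used expert
            self.resident[expert_id] = self.load_expert(expert_id)
        return self.resident[expert_id]

# Usage: pretend loading returns a placeholder weight blob.
cache = ExpertCache(load_expert=lambda i: f"weights-for-expert-{i}", max_resident=2)
for eid in [0, 1, 0, 3]:        # routing decisions from the gate
    print(eid, cache.get(eid))  # expert 1 is evicted when expert 3 is loaded
```

Memory savings then depend on how many experts must stay resident to keep cache misses (and hence load latency) acceptable for the observed routing pattern.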