Visit my GitHub for more. The following are some selected samples.

Paper Implementations

Switch Transformers.
Efficient PyTorch implementation of the Switch Transformer, with an optional auxiliary loss for each layer and a configurable number of experts and expert capacity.
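For readers unfamiliar with the terms, here is a minimal sketch of top-1 routing with an expert capacity limit and the auxiliary load-balancing loss, following the formulas in the Switch Transformer paper. It is plain Python rather than PyTorch and is not the code from the repository; the function names (`expert_capacity`, `switch_route`), the rounding with `ceil`, and the choice to drop overflow tokens in arrival order are assumptions made for illustration.

```python
import math


def expert_capacity(num_tokens, num_experts, capacity_factor=1.0):
    """Tokens each expert may accept: (tokens / experts) * capacity factor.

    Rounding up with ceil is an assumption of this sketch.
    """
    return math.ceil(num_tokens / num_experts * capacity_factor)


def softmax(logits):
    """Numerically stable softmax over one token's router logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def switch_route(router_logits, capacity_factor=1.0, alpha=0.01):
    """Top-1 routing with a capacity limit and an auxiliary loss.

    router_logits: one list of per-expert logits for each token.
    Returns (assignments, aux_loss). A token whose chosen expert is
    already full gets the assignment None (it is dropped).
    """
    num_tokens = len(router_logits)
    num_experts = len(router_logits[0])
    capacity = expert_capacity(num_tokens, num_experts, capacity_factor)

    probs = [softmax(row) for row in router_logits]
    routed = [0] * num_experts  # tokens whose argmax is each expert
    load = [0] * num_experts    # tokens actually accepted by each expert
    assignments = []
    for p in probs:
        expert = max(range(num_experts), key=p.__getitem__)
        routed[expert] += 1
        if load[expert] < capacity:
            load[expert] += 1
            assignments.append(expert)
        else:
            assignments.append(None)  # over capacity: token is dropped

    # f_i: fraction of tokens routed to expert i.
    # P_i: mean router probability assigned to expert i.
    f = [count / num_tokens for count in routed]
    P = [sum(p[i] for p in probs) / num_tokens for i in range(num_experts)]
    aux_loss = alpha * num_experts * sum(fi * pi for fi, pi in zip(f, P))
    return assignments, aux_loss
```

With `alpha=1.0`, the loss equals 1.0 when tokens are spread evenly across experts and grows as the router concentrates tokens on a few experts, which is what pushes routing toward balance during training.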