WideNet. WideNet in PyTorch.
Switch Transformers. Switch Transformer in PyTorch with an optional auxiliary (aux) loss for each layer, a configurable number of experts and expert capacity, and support for aux-loss-free load balancing.
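To make the terms "expert capacity" and "aux loss" concrete, here is a minimal sketch of top-1 (switch) routing as described in the Switch Transformer paper. It is not the code from this repository: it is written in plain Python rather than PyTorch so it runs without dependencies, and the names `expert_capacity`, `switch_route`, `capacity_factor`, and `alpha` are hypothetical choices made for illustration. Aux-loss-free load balancing is not covered here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one token's router logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expert_capacity(num_tokens, num_experts, capacity_factor=1.0):
    """Maximum tokens one expert accepts: (tokens / experts) * capacity_factor.

    Rounding up with ceil is an assumption of this sketch.
    """
    return math.ceil(num_tokens / num_experts * capacity_factor)


def switch_route(router_logits, capacity_factor=1.0, alpha=0.01):
    """Top-1 routing with a capacity limit and the load-balancing aux loss.

    router_logits: list of per-token lists, one logit per expert.
    Returns (assignments, aux_loss); a dropped token is assigned None.
    """
    num_tokens = len(router_logits)
    num_experts = len(router_logits[0])
    capacity = expert_capacity(num_tokens, num_experts, capacity_factor)

    probs = [softmax(row) for row in router_logits]
    chosen = [0] * num_experts   # tokens whose argmax is each expert
    load = [0] * num_experts     # tokens actually accepted by each expert
    assignments = []
    for p in probs:
        expert = max(range(num_experts), key=p.__getitem__)
        chosen[expert] += 1
        if load[expert] < capacity:
            load[expert] += 1
            assignments.append(expert)
        else:
            assignments.append(None)  # over capacity: token is dropped

    # f_i: fraction of tokens routed to expert i (by argmax).
    f = [c / num_tokens for c in chosen]
    # P_i: mean router probability assigned to expert i.
    P = [sum(p[i] for p in probs) / num_tokens for i in range(num_experts)]
    # Aux loss: alpha * N * sum_i f_i * P_i, minimized by uniform routing.
    aux_loss = alpha * num_experts * sum(fi * pi for fi, pi in zip(f, P))
    return assignments, aux_loss


if __name__ == "__main__":
    logits = [[2.0, 0.0], [2.0, 0.0], [2.0, 0.0], [0.0, 2.0]]
    assignments, aux = switch_route(logits, capacity_factor=1.0)
    print(assignments, aux)
```

In this example three of four tokens prefer expert 0, but with a capacity factor of 1.0 each expert accepts only two tokens, so the third is dropped. Under perfectly uniform routing the aux loss equals `alpha`; any imbalance pushes it higher, which is what lets it act as a balancing penalty.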