Below are some of the talks that I’ve delivered on Cohere’s Discord.

ML Efficiency

- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection (Talk, Slides)
- SpQR: A Sparse-Quantized Representation for Near-lossless LLM Weight Compression (Talk, Slides)
- LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale (Talk, Slides)

NLP

- GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints (Talk, Slides)
- Universal and Transferable Adversarial Attacks on Aligned Language Models (Talk, Slides)
- Extending Context Window of Large Language Models via Positional Interpolation (Talk, Slides)