Model building blocks
  Attention Mechanisms
    Logit Softcapping
Multi-GPU
  Model Parallelism: Tensor Parallelism
Quantization
  Quantization
    GPTQ Checkpoint Format