Eliminating matrix multiplication from LLM processing can massively increase performance-per-watt when paired with the right optimizations, researchers from UC Santa Cruz have demonstrated. It remains to be seen how applicable this approach is to AI in general.