Particle.news

Apple and NVIDIA Collaborate to Boost AI Text Generation with ReDrafter

The partnership integrates Apple's ReDrafter technique into NVIDIA's TensorRT-LLM, achieving faster and more efficient large language model performance.

  • Apple's open-source ReDrafter combines beam search and dynamic tree attention to speed up text generation in large language models (LLMs).
  • ReDrafter has been integrated into NVIDIA's TensorRT-LLM framework, enabling faster and more efficient LLM inference on NVIDIA GPUs.
  • The integration required NVIDIA to add new operators or modify existing ones, which improved TensorRT-LLM's ability to handle complex models and decoding methods.
  • Benchmarks show a 2.7x increase in tokens generated per second for greedy decoding, which reduces latency, power consumption, and computational costs.
  • This collaboration underscores the potential for short-term partnerships between Apple and NVIDIA on AI technologies, despite their historically limited business ties.