Nvidia's KV Cache Transform Coding (KVTC) compresses an LLM's key-value (KV) cache by 20x without model changes, cutting GPU memory costs and reducing time-to-first-token by up to 8x for multi-turn AI applications.
Choosing an AI model is no longer about “best model wins.” Instead, the right choice is the one that meets accuracy targets, ...
Microsoft's Phi-4-reasoning-vision-15B uses careful data curation and selective reasoning to compete with models trained on ...
Touting its status as the “world’s largest contributor to open-source AI,” Nvidia Corp. is doubling down on open artificial ...