
Apr 9, 2024

Paper page — InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD

Posted in category: robotics/AI

From SenseTime, Shanghai #AI Lab, & Tsinghua U

InternLM-XComposer2-4KHD

A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD https://huggingface.co/papers/2404.

The Large Vision-Language Model (LVLM) field has seen significant advancements, yet its progression…


