News

Previous studies focused on the long-text generation problem of image paragraph generation, ignoring the characteristics of the image and the auxiliary role that patient background information plays in diagnosis.
(1) We released a 50-step diffusion model (instead of 1000 steps), which runs 20X faster with comparable results. (2) Calling CLIP just once and caching the result makes all models run 2X faster.
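
The caching idea in (2) can be illustrated with a minimal sketch. This is not the project's actual code: `CountingEncoder` is a hypothetical stand-in for an expensive text encoder such as CLIP, and `sample` is a dummy loop that only counts steps. It shows that encoding the prompt once before the loop replaces one encoder call per diffusion step with a single call.

```python
from typing import Callable, List


class CountingEncoder:
    """Hypothetical stand-in for an expensive text encoder such as CLIP."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, prompt: str) -> List[float]:
        self.calls += 1
        # Toy embedding for illustration only; not a real CLIP output.
        return [float(ord(c)) for c in prompt[:4]]


def sample(
    prompt: str,
    encoder: Callable[[str], List[float]],
    num_steps: int,
    cache_text: bool,
) -> int:
    """Run a dummy denoising loop and return the number of steps executed."""
    # With caching, the prompt is encoded once, before the loop starts.
    cached = encoder(prompt) if cache_text else None
    steps_run = 0
    for _ in range(num_steps):
        # Without caching, the encoder is called again at every step.
        cond = cached if cache_text else encoder(prompt)
        # A real model would run one denoising step conditioned on `cond` here.
        assert cond is not None
        steps_run += 1
    return steps_run


if __name__ == "__main__":
    uncached = CountingEncoder()
    sample("a person walks forward", uncached, num_steps=50, cache_text=False)

    cached = CountingEncoder()
    sample("a person walks forward", cached, num_steps=50, cache_text=True)

    print("encoder calls without cache:", uncached.calls)
    print("encoder calls with cache:", cached.calls)
```

The actual speedup depends on how expensive the encoder is relative to one denoising step, so the 2X figure above is the project's own measurement and is not something this toy loop reproduces.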