SEAGULL: No-reference Image Quality Assessment for Regions of Interest via Vision-Language Instruction Tuning
2024-11-15
Status: Published
Abstract: Existing Image Quality Assessment (IQA) methods achieve remarkable success in analyzing the quality of whole images, but few works explore quality analysis for Regions of Interest (ROIs). Quality analysis of ROIs can provide fine-grained guidance for image quality improvement and is crucial for scenarios that focus on region-level quality. This paper proposes a novel network, SEAGULL, which can SEe and Assess ROIs quality with GUidance from a Large vision-Language model. SEAGULL incorporates a vision-language model (VLM), masks generated by the Segment Anything Model (SAM) to specify ROIs, and a meticulously designed Mask-based Feature Extractor (MFE) that extracts global and local tokens for the specified ROIs, enabling accurate fine-grained IQA for ROIs. Moreover, this paper constructs two ROI-based IQA datasets, SEAGULL-100w and SEAGULL-3k, for training and evaluating ROI-based IQA. SEAGULL-100w comprises about 100w (roughly one million) synthetically distorted images with 33 million ROIs, used for pre-training to improve the model's ability to perceive regional quality, and SEAGULL-3k contains about 3k ROIs with authentic distortions to enhance the model's ability to perceive real-world distortions. After pre-training on SEAGULL-100w and fine-tuning on SEAGULL-3k, SEAGULL shows remarkable performance on fine-grained ROI quality assessment.
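Language: English
DOI: arXiv:2411.10161
Related URL: View original
Source: arXiv
Indexed in: PPRN.PPRN
WOS accession number: PPRN:119244441
WOS category: Computer Science, Software Engineering
Document type: Preprint
Item identifier: https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/464724
Collection: School of Information Science and Technology
Corresponding author: Li, Bing
Author affiliations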
1.CASIA, State Key Lab Multimodal Artificial Intelligence Syst, Beijing, Peoples R China
2.Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
3.Beijing Jiaotong Univ, Beijing, Peoples R China
4.Beijing Union Univ, Beijing, Peoples R China
5.China Univ Petr, Beijing, Peoples R China
6.PeopleAI Inc, Beijing, Peoples R China
7.ShanghaiTech Univ, Sch Informat Sci & Technol, Shanghai, Peoples R China
Recommended citation
GB/T 7714
Chen, Zewen,Wang, Juan,Wang, Wen,et al. SEAGULL: No-reference Image Quality Assessment for Regions of Interest via Vision-Language Instruction Tuning. 2024.
Files in this item
No files are associated with this item.
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.