nsa@kbin.social · 1 year agoWhat's In My Big Data?plus-squarearxiv.orgexternal-linkmessage-square0fedilinkarrow-up12arrow-down10
arrow-up12arrow-down1external-linkWhat's In My Big Data?plus-squarearxiv.orgnsa@kbin.social · 1 year agomessage-square0fedilink
nsa@kbin.social · 1 year agoThe Data Provenance Initiative: A Large Scale Audit of Dataset Licensing & Attribution in AIplus-squarearxiv.orgexternal-linkmessage-square0fedilinkarrow-up12arrow-down10
arrow-up12arrow-down1external-linkThe Data Provenance Initiative: A Large Scale Audit of Dataset Licensing & Attribution in AIplus-squarearxiv.orgnsa@kbin.social · 1 year agomessage-square0fedilink
KingsmanVince@kbin.social · 1 year agoMM-SHAP: A Performance-agnostic Metric for Measuring Multimodal Contributions in Vision and Language Models & Tasksplus-squareaclanthology.orgexternal-linkmessage-square0fedilinkarrow-up12arrow-down10
arrow-up12arrow-down1external-linkMM-SHAP: A Performance-agnostic Metric for Measuring Multimodal Contributions in Vision and Language Models & Tasksplus-squareaclanthology.orgKingsmanVince@kbin.social · 1 year agomessage-square0fedilink
KingsmanVince@kbin.social · 1 year agoDemystifying CLIP Dataplus-squarearxiv.orgexternal-linkmessage-square0fedilinkarrow-up12arrow-down10
arrow-up12arrow-down1external-linkDemystifying CLIP Dataplus-squarearxiv.orgKingsmanVince@kbin.social · 1 year agomessage-square0fedilink
nsa@kbin.social · 1 year agoGPT-4 Doesn't Know It's Wrong: An Analysis of Iterative Prompting for Reasoning Problemsplus-squarearxiv.orgexternal-linkmessage-square0fedilinkarrow-up11arrow-down10
arrow-up11arrow-down1external-linkGPT-4 Doesn't Know It's Wrong: An Analysis of Iterative Prompting for Reasoning Problemsplus-squarearxiv.orgnsa@kbin.social · 1 year agomessage-square0fedilink
KingsmanVince@kbin.social · 1 year agoPaLI-3 Vision Language Models: Smaller, Faster, Strongerplus-squarearxiv.orgexternal-linkmessage-square3fedilinkarrow-up13arrow-down10
arrow-up13arrow-down1external-linkPaLI-3 Vision Language Models: Smaller, Faster, Strongerplus-squarearxiv.orgKingsmanVince@kbin.social · 1 year agomessage-square3fedilink
KingsmanVince@kbin.social · 1 year agoMiniGPT-v2: large language model as a unified interface for vision-language multi-task learningplus-squarearxiv.orgexternal-linkmessage-square0fedilinkarrow-up15arrow-down10
arrow-up15arrow-down1external-linkMiniGPT-v2: large language model as a unified interface for vision-language multi-task learningplus-squarearxiv.orgKingsmanVince@kbin.social · 1 year agomessage-square0fedilink
KingsmanVince@kbin.social · 1 year agoFinetune Like You Pretrain: Improved Finetuning of Zero-Shot Vision Modelsplus-squareopenaccess.thecvf.comexternal-linkmessage-square0fedilinkarrow-up12arrow-down10
arrow-up12arrow-down1external-linkFinetune Like You Pretrain: Improved Finetuning of Zero-Shot Vision Modelsplus-squareopenaccess.thecvf.comKingsmanVince@kbin.social · 1 year agomessage-square0fedilink
nsa@kbin.social · 1 year agoA Long Way to Go: Investigating Length Correlations in RLHFplus-squarearxiv.orgexternal-linkmessage-square0fedilinkarrow-up12arrow-down10
arrow-up12arrow-down1external-linkA Long Way to Go: Investigating Length Correlations in RLHFplus-squarearxiv.orgnsa@kbin.social · 1 year agomessage-square0fedilink
nsa@kbin.social · 1 year agoThink before you speak: Training Language Models With Pause Tokensplus-squarearxiv.orgexternal-linkmessage-square1fedilinkarrow-up14arrow-down10
arrow-up14arrow-down1external-linkThink before you speak: Training Language Models With Pause Tokensplus-squarearxiv.orgnsa@kbin.social · 1 year agomessage-square1fedilink
KingsmanVince@kbin.social · 1 year agoCLIPN for Zero-Shot OOD Detection: Teaching CLIP to Say Noplus-squarearxiv.orgexternal-linkmessage-square0fedilinkarrow-up13arrow-down10
arrow-up13arrow-down1external-linkCLIPN for Zero-Shot OOD Detection: Teaching CLIP to Say Noplus-squarearxiv.orgKingsmanVince@kbin.social · 1 year agomessage-square0fedilink
nsa@kbin.social · 1 year agoLanguage Modeling Is Compressionplus-squarearxiv.orgexternal-linkmessage-square0fedilinkarrow-up12arrow-down10
arrow-up12arrow-down1external-linkLanguage Modeling Is Compressionplus-squarearxiv.orgnsa@kbin.social · 1 year agomessage-square0fedilink
KingsmanVince@kbin.social · 1 year agoScaling Vision-Language Models with Sparse Mixture of Expertsplus-squarearxiv.orgexternal-linkmessage-square0fedilinkarrow-up13arrow-down10
arrow-up13arrow-down1external-linkScaling Vision-Language Models with Sparse Mixture of Expertsplus-squarearxiv.orgKingsmanVince@kbin.social · 1 year agomessage-square0fedilink
KingsmanVince@kbin.social · 1 year agoHydra-MoE: A new class of Open-Source Mixture of Expertsplus-squaregithub.comexternal-linkmessage-square0fedilinkarrow-up15arrow-down10
arrow-up15arrow-down1external-linkHydra-MoE: A new class of Open-Source Mixture of Expertsplus-squaregithub.comKingsmanVince@kbin.social · 1 year agomessage-square0fedilink
KingsmanVince@kbin.social · 1 year agoBridging the Gap: Exploring the Capabilities of Bridge-Architectures for Complex Visual Reasoning Tasksplus-squarearxiv.orgexternal-linkmessage-square0fedilinkarrow-up13arrow-down10
arrow-up13arrow-down1external-linkBridging the Gap: Exploring the Capabilities of Bridge-Architectures for Complex Visual Reasoning Tasksplus-squarearxiv.orgKingsmanVince@kbin.social · 1 year agomessage-square0fedilink
KingsmanVince@kbin.social · 1 year agoFoundational Models Defining a New Era in Vision: A Survey and Outlookplus-squarearxiv.orgexternal-linkmessage-square0fedilinkarrow-up13arrow-down10
arrow-up13arrow-down1external-linkFoundational Models Defining a New Era in Vision: A Survey and Outlookplus-squarearxiv.orgKingsmanVince@kbin.social · 1 year agomessage-square0fedilink
KingsmanVince@kbin.social · 1 year agoUnifying Cross-Lingual and Cross-Modal Modeling Towards Weakly Supervised Multilingual Vision-Language Pre-trainingplus-squareaclanthology.orgexternal-linkmessage-square1fedilinkarrow-up15arrow-down10
arrow-up15arrow-down1external-linkUnifying Cross-Lingual and Cross-Modal Modeling Towards Weakly Supervised Multilingual Vision-Language Pre-trainingplus-squareaclanthology.orgKingsmanVince@kbin.social · 1 year agomessage-square1fedilink
KingsmanVince@kbin.social · 1 year agoMaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasksplus-squarearxiv.orgexternal-linkmessage-square1fedilinkarrow-up13arrow-down10
arrow-up13arrow-down1external-linkMaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasksplus-squarearxiv.orgKingsmanVince@kbin.social · 1 year agomessage-square1fedilink
KingsmanVince@kbin.social · 1 year agoVision Language Transformers: A Surveyplus-squarearxiv.orgexternal-linkmessage-square0fedilinkarrow-up13arrow-down10
arrow-up13arrow-down1external-linkVision Language Transformers: A Surveyplus-squarearxiv.orgKingsmanVince@kbin.social · 1 year agomessage-square0fedilink
KingsmanVince@kbin.social · 1 year agoVisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Useplus-squarearxiv.orgexternal-linkmessage-square0fedilinkarrow-up14arrow-down10
arrow-up14arrow-down1external-linkVisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Useplus-squarearxiv.orgKingsmanVince@kbin.social · 1 year agomessage-square0fedilink