{"id":37841,"date":"2025-06-25T09:17:00","date_gmt":"2025-06-25T07:17:00","guid":{"rendered":"https:\/\/kinit.sk\/event\/knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-nlp\/"},"modified":"2025-09-11T15:36:15","modified_gmt":"2025-09-11T13:36:15","slug":"knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-nlp","status":"publish","type":"event","link":"https:\/\/kinit.sk\/sk\/podujatie\/knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-nlp\/","title":{"rendered":"A knowledge sharing seminar on multimodal models in computer vision and NLP"},"content":{"rendered":"<div id=\"\" class=\"element core-paragraph\">\n<p><strong>Jana Ko\u0161eck\u00e1, an Associate Professor of Computer Science at George Mason University, Virginia, gave a lecture on <em>From Pixels to Words: Frontiers of Multi-Modal Vision-Language Learning<\/em>.<\/strong><\/p>\n<\/div>\n\n<div id=\"\" class=\"element core-heading\">\n<h4 class=\"wp-block-heading\">Lecture abstract<\/h4>\n<\/div>\n\n<div id=\"\" class=\"element core-paragraph\">\n<p>In recent years, foundational models have reshaped the field of computer vision, achieving remarkable advances across tasks such as classification, detection, segmentation, and image generation. This talk surveys the current state of foundational vision models, highlighting advances in self-supervised learning, masked image modelling, contrastive learning, and vision transformers. These models are pre-trained on massive, diverse datasets and then adapted to a wide range of downstream applications. Beyond vision alone, we are witnessing a rapid convergence toward multi-modal vision-language models that align visual and textual information in shared embedding spaces. Models such as Segment Anything, CLIP, BLIP, Flamingo, and LLava bridge the gap between visual understanding and natural language, enabling powerful new capabilities like open-vocabulary recognition, image captioning, visual question answering, and interactive grounding. This talk will survey the state of the art in foundational vision models, explore the underlying principles that enable their generalisation, and highlight how vision-language integration is transforming both research and real-world applications. I will also discuss emerging challenges and future directions at the intersection of perception, language, and reasoning.<\/p>\n<\/div>\n\n<div id=\"\" class=\"element core-heading\">\n<h2 class=\"wp-block-heading\">Photos from the lecture<\/h2>\n<\/div>\n\n<div id=\"\" class=\"element core-columns\">\n<div class=\"wp-block-columns has-small-font-size is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\"><div id=\"\" class=\"element core-column\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\"><div id=\"\" class=\"element core-image\">\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1024\" height=\"768\" data-src=\"https:\/\/kinit.sk\/wp-content\/uploads\/2025\/09\/kosecka1-1024x768.jpg\" alt=\"\" class=\"wp-image-37810 lazyload\" data-srcset=\"https:\/\/kinit.sk\/wp-content\/uploads\/2025\/09\/kosecka1-1024x768.jpg 1024w, https:\/\/kinit.sk\/wp-content\/uploads\/2025\/09\/kosecka1-300x225.jpg 300w, https:\/\/kinit.sk\/wp-content\/uploads\/2025\/09\/kosecka1-768x576.jpg 768w, https:\/\/kinit.sk\/wp-content\/uploads\/2025\/09\/kosecka1-1536x1152.jpg 1536w, https:\/\/kinit.sk\/wp-content\/uploads\/2025\/09\/kosecka1-2048x1536.jpg 2048w\" data-sizes=\"(max-width: 1024px) 100vw, 1024px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1024px; --smush-placeholder-aspect-ratio: 1024\/768;\" \/><\/figure>\n<\/div><\/div>\n<\/div>\n\n<div id=\"\" class=\"element core-column\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\"><div id=\"\" class=\"element core-image\">\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"2560\" height=\"1920\" data-src=\"https:\/\/kinit.sk\/wp-content\/uploads\/2025\/09\/kosecka2-scaled.jpg\" alt=\"\" class=\"wp-image-37812 lazyload\" data-srcset=\"https:\/\/kinit.sk\/wp-content\/uploads\/2025\/09\/kosecka2-scaled.jpg 2560w, https:\/\/kinit.sk\/wp-content\/uploads\/2025\/09\/kosecka2-300x225.jpg 300w, https:\/\/kinit.sk\/wp-content\/uploads\/2025\/09\/kosecka2-1024x768.jpg 1024w, https:\/\/kinit.sk\/wp-content\/uploads\/2025\/09\/kosecka2-768x576.jpg 768w, https:\/\/kinit.sk\/wp-content\/uploads\/2025\/09\/kosecka2-1536x1152.jpg 1536w, https:\/\/kinit.sk\/wp-content\/uploads\/2025\/09\/kosecka2-2048x1536.jpg 2048w\" data-sizes=\"(max-width: 2560px) 100vw, 2560px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 2560px; --smush-placeholder-aspect-ratio: 2560\/1920;\" \/><\/figure>\n<\/div><\/div>\n<\/div>\n\n<div id=\"\" class=\"element core-column\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\"><div id=\"\" class=\"element core-image\">\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1024\" height=\"768\" data-src=\"https:\/\/kinit.sk\/wp-content\/uploads\/2025\/09\/kosecka3-1024x768.jpg\" alt=\"\" class=\"wp-image-37814 lazyload\" data-srcset=\"https:\/\/kinit.sk\/wp-content\/uploads\/2025\/09\/kosecka3-1024x768.jpg 1024w, https:\/\/kinit.sk\/wp-content\/uploads\/2025\/09\/kosecka3-300x225.jpg 300w, https:\/\/kinit.sk\/wp-content\/uploads\/2025\/09\/kosecka3-768x576.jpg 768w, https:\/\/kinit.sk\/wp-content\/uploads\/2025\/09\/kosecka3-1536x1152.jpg 1536w, https:\/\/kinit.sk\/wp-content\/uploads\/2025\/09\/kosecka3-2048x1536.jpg 2048w\" data-sizes=\"(max-width: 1024px) 100vw, 1024px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1024px; --smush-placeholder-aspect-ratio: 1024\/768;\" \/><\/figure>\n<\/div><\/div>\n<\/div><\/div>\n<\/div>","protected":false},"featured_media":37797,"template":"","meta":{"_acf_changed":true,"footnotes":""},"categories":[87,520],"class_list":["post-37841","event","type-event","status-publish","has-post-thumbnail","hentry","category-past-events-sk","category-2025-sk"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>A knowledge sharing seminar on multimodal models in computer vision and NLP - KInIT<\/title>\n<meta name=\"description\" content=\"Jana Ko\u0161eck\u00e1, an Associate Professor of Computer Science at George Mason University, Virginia, gave a lecture on From Pixels to...\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/kinit.sk\/sk\/podujatie\/knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-nlp\/\" \/>\n<meta property=\"og:locale\" content=\"sk_SK\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"A knowledge sharing seminar on multimodal models in computer vision and NLP - KInIT\" \/>\n<meta property=\"og:description\" content=\"Jana Ko\u0161eck\u00e1, an Associate Professor of Computer Science at George Mason University, Virginia, gave a lecture on From Pixels to...\" \/>\n<meta property=\"og:url\" content=\"https:\/\/kinit.sk\/sk\/podujatie\/knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-nlp\/\" \/>\n<meta property=\"og:site_name\" content=\"KInIT\" \/>\n<meta property=\"article:modified_time\" content=\"2025-09-11T13:36:15+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/kinit.sk\/wp-content\/uploads\/2024\/10\/slovaks.ai-event_A-knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-natural-language-processing-scaled.png\" \/>\n\t<meta property=\"og:image:width\" content=\"2560\" \/>\n\t<meta property=\"og:image:height\" content=\"1340\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:site\" content=\"@kinit\" \/>\n<meta name=\"twitter:label1\" content=\"Predpokladan\u00fd \u010das \u010d\u00edtania\" \/>\n\t<meta name=\"twitter:data1\" content=\"2 min\u00faty\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/kinit.sk\\\/sk\\\/podujatie\\\/knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-nlp\\\/\",\"url\":\"https:\\\/\\\/kinit.sk\\\/sk\\\/podujatie\\\/knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-nlp\\\/\",\"name\":\"A knowledge sharing seminar on multimodal models in computer vision and NLP - KInIT\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/kinit.sk\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/kinit.sk\\\/sk\\\/podujatie\\\/knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-nlp\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/kinit.sk\\\/sk\\\/podujatie\\\/knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-nlp\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/kinit.sk\\\/wp-content\\\/uploads\\\/2024\\\/10\\\/slovaks.ai-event_A-knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-natural-language-processing-scaled.png\",\"datePublished\":\"2025-06-25T07:17:00+00:00\",\"dateModified\":\"2025-09-11T13:36:15+00:00\",\"description\":\"Jana Ko\u0161eck\u00e1, an Associate Professor of Computer Science at George Mason University, Virginia, gave a lecture on From Pixels to...\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/kinit.sk\\\/sk\\\/podujatie\\\/knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-nlp\\\/#breadcrumb\"},\"inLanguage\":\"sk-SK\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/kinit.sk\\\/sk\\\/podujatie\\\/knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-nlp\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"sk-SK\",\"@id\":\"https:\\\/\\\/kinit.sk\\\/sk\\\/podujatie\\\/knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-nlp\\\/#primaryimage\",\"url\":\"https:\\\/\\\/kinit.sk\\\/wp-content\\\/uploads\\\/2024\\\/10\\\/slovaks.ai-event_A-knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-natural-language-processing-scaled.png\",\"contentUrl\":\"https:\\\/\\\/kinit.sk\\\/wp-content\\\/uploads\\\/2024\\\/10\\\/slovaks.ai-event_A-knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-natural-language-processing-scaled.png\",\"width\":2560,\"height\":1340},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/kinit.sk\\\/sk\\\/podujatie\\\/knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-nlp\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/kinit.sk\\\/sk\\\/uvod\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Past events\",\"item\":\"https:\\\/\\\/kinit.sk\\\/category\\\/past-events\\\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"A knowledge sharing seminar on multimodal models in computer vision and NLP\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/kinit.sk\\\/#website\",\"url\":\"https:\\\/\\\/kinit.sk\\\/\",\"name\":\"KInIT\",\"description\":\"Vyu\u017e\u00edvame v\u00fdskum pre \u013eud\u00ed a priemysel\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/kinit.sk\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"sk-SK\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"A knowledge sharing seminar on multimodal models in computer vision and NLP - KInIT","description":"Jana Ko\u0161eck\u00e1, an Associate Professor of Computer Science at George Mason University, Virginia, gave a lecture on From Pixels to...","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/kinit.sk\/sk\/podujatie\/knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-nlp\/","og_locale":"sk_SK","og_type":"article","og_title":"A knowledge sharing seminar on multimodal models in computer vision and NLP - KInIT","og_description":"Jana Ko\u0161eck\u00e1, an Associate Professor of Computer Science at George Mason University, Virginia, gave a lecture on From Pixels to...","og_url":"https:\/\/kinit.sk\/sk\/podujatie\/knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-nlp\/","og_site_name":"KInIT","article_modified_time":"2025-09-11T13:36:15+00:00","og_image":[{"width":2560,"height":1340,"url":"https:\/\/kinit.sk\/wp-content\/uploads\/2024\/10\/slovaks.ai-event_A-knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-natural-language-processing-scaled.png","type":"image\/png"}],"twitter_card":"summary_large_image","twitter_site":"@kinit","twitter_misc":{"Predpokladan\u00fd \u010das \u010d\u00edtania":"2 min\u00faty"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/kinit.sk\/sk\/podujatie\/knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-nlp\/","url":"https:\/\/kinit.sk\/sk\/podujatie\/knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-nlp\/","name":"A knowledge sharing seminar on multimodal models in computer vision and NLP - KInIT","isPartOf":{"@id":"https:\/\/kinit.sk\/#website"},"primaryImageOfPage":{"@id":"https:\/\/kinit.sk\/sk\/podujatie\/knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-nlp\/#primaryimage"},"image":{"@id":"https:\/\/kinit.sk\/sk\/podujatie\/knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-nlp\/#primaryimage"},"thumbnailUrl":"https:\/\/kinit.sk\/wp-content\/uploads\/2024\/10\/slovaks.ai-event_A-knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-natural-language-processing-scaled.png","datePublished":"2025-06-25T07:17:00+00:00","dateModified":"2025-09-11T13:36:15+00:00","description":"Jana Ko\u0161eck\u00e1, an Associate Professor of Computer Science at George Mason University, Virginia, gave a lecture on From Pixels to...","breadcrumb":{"@id":"https:\/\/kinit.sk\/sk\/podujatie\/knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-nlp\/#breadcrumb"},"inLanguage":"sk-SK","potentialAction":[{"@type":"ReadAction","target":["https:\/\/kinit.sk\/sk\/podujatie\/knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-nlp\/"]}]},{"@type":"ImageObject","inLanguage":"sk-SK","@id":"https:\/\/kinit.sk\/sk\/podujatie\/knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-nlp\/#primaryimage","url":"https:\/\/kinit.sk\/wp-content\/uploads\/2024\/10\/slovaks.ai-event_A-knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-natural-language-processing-scaled.png","contentUrl":"https:\/\/kinit.sk\/wp-content\/uploads\/2024\/10\/slovaks.ai-event_A-knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-natural-language-processing-scaled.png","width":2560,"height":1340},{"@type":"BreadcrumbList","@id":"https:\/\/kinit.sk\/sk\/podujatie\/knowledge-sharing-seminar-on-multimodal-models-in-computer-vision-and-nlp\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/kinit.sk\/sk\/uvod\/"},{"@type":"ListItem","position":2,"name":"Past events","item":"https:\/\/kinit.sk\/category\/past-events\/"},{"@type":"ListItem","position":3,"name":"A knowledge sharing seminar on multimodal models in computer vision and NLP"}]},{"@type":"WebSite","@id":"https:\/\/kinit.sk\/#website","url":"https:\/\/kinit.sk\/","name":"KInIT","description":"Vyu\u017e\u00edvame v\u00fdskum pre \u013eud\u00ed a priemysel","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/kinit.sk\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"sk-SK"}]}},"_links":{"self":[{"href":"https:\/\/kinit.sk\/sk\/wp-json\/wp\/v2\/event\/37841","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/kinit.sk\/sk\/wp-json\/wp\/v2\/event"}],"about":[{"href":"https:\/\/kinit.sk\/sk\/wp-json\/wp\/v2\/types\/event"}],"version-history":[{"count":1,"href":"https:\/\/kinit.sk\/sk\/wp-json\/wp\/v2\/event\/37841\/revisions"}],"predecessor-version":[{"id":38007,"href":"https:\/\/kinit.sk\/sk\/wp-json\/wp\/v2\/event\/37841\/revisions\/38007"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/kinit.sk\/sk\/wp-json\/wp\/v2\/media\/37797"}],"wp:attachment":[{"href":"https:\/\/kinit.sk\/sk\/wp-json\/wp\/v2\/media?parent=37841"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kinit.sk\/sk\/wp-json\/wp\/v2\/categories?post=37841"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}