in conjunction with IEEE CVPR 2024
June 17, 2024


All times are in PDT

June 17

  •  8:45-9:00 
    Welcome Remarks
  •  9:00-9:25 
    Scale Learning in Image Semantics: A 15-Year Review
    Liangliang Cao

  •  9:30-9:40 
    DIA: Diffusion based Inverse Network Attack on Collaborative Inference
  •  9:40-9:50 
    Fast-NTK: Parameter-Efficient Unlearning for Large-Scale Models
  •  9:50-10:00 
    AR-CP: Uncertainty-Aware Perception in Adverse Conditions with Conformal Prediction and Augmented Reality for Assisted Driving
  •  10:00-10:20 
    Coffee Break
  •  10:20-10:45 
    Rubber Hits the Road: Lessons Learned from DeepFake Detection in real-world
    Siwei Lyu

  •  10:45-10:55 
    Mitigating Bias Using Model-Agnostic Data Attribution
  •  10:55-11:05 
    Practical Region-level Attack against Segment Anything Models
  •  11:05-11:15 
    Towards Explainable Visual Vessel Recognition Using Fine-Grained Classification and Image Retrieval
  •  11:15-11:25 
    Enforcing Conditional Independence for Fair Representation Learning and Causal Image Generation
  •  11:25-11:50 
    Incentivizing Opt-in and Enabling Opt-out for Text-to-Image Models
    Richard Zhang

  •  11:50-13:00 
    Lunch Break
  •  13:00-13:25 
    Uncovering and Addressing Biases in Diffusion Models
    R. Venkatesh Babu

  •  13:25-13:35 
    Towards Efficient Machine Unlearning with Data Augmentation: Guided Loss-Increasing (GLI) to Prevent the Catastrophic Model Utility Drop
  •  13:35-13:45 
    ReweightOOD: Loss Reweighting for Distance-based OOD Detection
  •  13:45-13:55 
    T2FNorm: Train-time Feature Normalization for OOD Detection in Image Classification
  •  13:55-14:05 
    Our Deep CNN Matchers have Developed Achromatopsia
  •  14:05-14:15 
    Test-time Assessment of a Model's Performance on Unseen Domains via Optimal Transport
  •  14:15-14:40 
    Content Creation Beyond Text to Pixel
    Yu-Chuan Su

  •  14:40-14:50 
    Improving the Robustness of 3D Human Pose Estimation: A Benchmark and Learning from Noisy Input
  •  14:50-15:00 
    RLNet: Robust Linearized Networks for Efficient Private Inference
  •  15:00-15:20 
    Coffee Break
  •  15:20-15:30 
    Robust and Explainable Fine-Grained Visual Classification with Transfer Learning: A Dual-Carriageway Framework
  •  15:30-15:40 
    Data-free Defense of Black Box Models Against Adversarial Attacks
  •  15:40-15:50 
    Fractals as Pre-training Datasets for Anomaly Detection and Localization
  •  15:50-16:00 
    SkipPLUS: Skip the First Few Layers to Better Explain Vision Transformers
  •  16:00-16:25 
    Brain-inspired Design of Vision Transformers
    Tianming Liu

  •  16:30-17:30 
    Closing Remarks and Disperse for Poster Session

About the Workshop

Computer vision and AI systems are playing a significant and increasing role in every walk of life. They are employed for mundane day-to-day decisions, such as choosing healthy food or picking clothes from the wardrobe to match the occasion, as well as for mission-critical and life-changing decisions, such as diagnosing diseases, detecting financial fraud, and selecting new employees. Many upcoming applications, from autonomous driving to automated cancer treatment recommendations, have everyone worrying about the level of trust associated with vision systems today. The concerns are genuine, as many weaknesses of modern, rapidly evolving vision systems have been exposed through adversarial attacks, bias, and lack of explainability. While these vision systems are reaping the advantages of novel learning methods, they exhibit brittleness to minor changes in the input data and lack the capability to explain their decisions to a human. Furthermore, they are unable to address the bias in their training data and are often highly opaque about their lineage and how they were trained and tested. It has been conjectured that the current use of AI is based on only about 20% of the data the world has access to. The remaining 80% of the data that could help AI systems is not available because of regulations and compliance requirements around security and privacy. Present AI systems have not demonstrated the ability to learn without compromising the privacy and security of data, nor can they assign appropriate credit to the data sources.

With the ever-increasing appetite for data in machine learning, we need to face the reality that for many applications, sufficient data may not be available. Even if raw data is plentiful, quality labeled data may be scarce, and even if it is not, relevant labeled data for a particular objective function may not be sufficient. The latter is often the case in tail-end-of-the-distribution problems, such as recognizing in autonomous driving that a baby stroller is rolling on the street. The event is rare in training and testing data, but highly critical for an objective function concerned with personal and property damage. Even evaluating performance in such a situation is challenging. One may stage experiments geared towards particular situations, but this does not guarantee that the staging conforms to the natural distribution of events. Even if it does, high-dimensional distributions have many tail ends that are by their nature hard to enumerate manually.

Many publicly available computer vision datasets are responsible for great progress in visual recognition and analytics. These datasets serve both as a source of large amounts of training data and as a means of assessing the performance of competing state-of-the-art algorithms. Performance saturation on such datasets has led the community to believe that many general visual recognition problems are close to being solved, with various commercial offerings stemming from models trained on such data. However, such datasets present significant biases in terms of both categories and image quality, creating a significant gap between their distribution and the data coming from the real world. For example, many of the publicly available datasets underrepresent certain ethnic and cultural communities and overrepresent others. Many variations have been observed to impact visual recognition, including resolution, illumination, and simple cultural variations of similar objects. Systems based on a skewed training dataset are bound to produce skewed results. This mismatch is evidenced by the significant drop in performance of state-of-the-art models trained on those datasets when they are applied, for example, to images of particular gender and/or ethnic groups in face analytics. It has been shown that such biases may have serious impacts on performance in challenging situations where the outcome is critical either for the subject or for a community. Research evaluations are often unaware of these issues and focus instead on saturating performance on skewed datasets.

In order to progress toward fair visual recognition truly in the wild, we propose this workshop to understand the underlying issues in bias-free and culturally diverse visual recognition.

Under such circumstances, our workshop on Fair, Data Efficient and Trusted Computer Vision will address critical issues in enhancing user trust in AI and computer vision systems, namely (i) fairness, (ii) data-efficient learning, and critical aspects of trust, including (iii) explainability, (iv) robust mitigation of adversarial attacks, and (v) improved privacy and security in model building, with the right level of credit assignment to the data sources and transparency in lineage.



Submission Instructions

We solicit submissions of technical papers. Accepted papers will be published in the CVPR 2024 workshop proceedings and presented at the workshop. Please submit at the CMT Submission Site.

Submitted technical papers must follow the CVPR paper format and guidelines (see CVPR2024 Author Guidelines). All accepted submissions must be presented by one of the authors. 

Submission deadline for technical papers: March 25, 2024, 11:59pm Pacific Time
Notification to authors: April 7, 2024, 11:59pm Pacific Time
Camera-ready deadline: April 14, 2024, 11:59pm Pacific Time

We invite submissions of original work. The review will be double-blind. 


We solicit original research papers in the following areas:

  • Vision/AI and bias
  • Secure machine learning in vision and AI
  • Vision/AI model security using blockchain
  • Explainability in Vision/AI decisions
  • Analytics in encrypted domain
  • Secure Vision/AI computing and blockchain
  • Vision/AI provenance and lineage
  • Trust in Vision/AI
  • Privacy in Vision/AI
  • Robustness of Vision/AI models
  • Vision/AI forensics
  • Vision/AI models attribution
  • Work that spans across the many dimensions of trust
  • Algorithms and theories for learning computer vision models under bias and scarcity
  • Methods for exploiting prior knowledge to learn models under bias/scarcity
  • Optimization methods designed for learning models from side-channel/alternative/synthetic sources of data
  • Domain adaptation methods to bridge train/test data gap
  • Methods for studying generalization characteristics of vision models trained from alternative data sources
  • Methods of evaluating performance of models under bias/scarcity
  • Domain-specific methods designed for important computer vision applications
  • Performance characterization of vision algorithms and systems under bias and scarcity
  • Continuous refinement of vision models using active/online learning
  • Meta-learning models from various existing task-specific models
  • Brave new ideas to learn computer vision models under bias and scarcity
  • New algorithms and architectures explicitly designed to reduce bias in visual analytics
  • New techniques to balance/manipulate data to reduce bias in visual analytics
  • New datasets to improve and measure bias/diversity in visual analytics
  • New evaluation protocols to assess and measure bias/diversity in visual analytics
  • Generative methods to reduce bias in visual analytics
  • Evaluations of bias/diversity of state-of-the-art techniques in visual analytics
  • Transfer learning/domain adaptation techniques for fairer visual analytics