in conjunction with IEEE CVPR 2024
June 17, 2024
Computer vision and AI systems are playing a significant and increasing role in every walk of life. They are employed for mundane day-to-day decisions, such as choosing healthy food or picking clothes from the wardrobe to match the occasion, as well as for mission-critical and life-changing decisions, such as diagnosing diseases, detecting financial fraud, and selecting new employees. Upcoming applications, from autonomous driving to automated cancer treatment recommendations, have everyone worrying about the level of trust that can be placed in today's vision systems. The concerns are genuine: adversarial attacks, bias, and a lack of explainability have exposed many weaknesses of modern, rapidly evolving vision systems. While these systems reap the advantages of novel learning methods, they are brittle to minor changes in the input data and cannot explain their decisions to a human. Furthermore, they are unable to address the bias in their training data and are often highly opaque about their lineage and how they were trained and tested. It has been conjectured that the current use of AI is based on only about 20% of the data the world has access to. The remaining 80% of the data that could help AI systems is not available because of regulations and compliance requirements around security and privacy. Present AI systems have not demonstrated the ability to learn without compromising the privacy and security of data, nor can they assign appropriate credit to the data sources.
With the ever-increasing appetite for data in machine learning, we need to face the reality that for many applications, sufficient data may not be available. Even if raw data is plentiful, quality labeled data may be scarce, and even if it is not, relevant labeled data for a particular objective function may not be sufficient. The latter is often the case in tail-of-the-distribution problems, such as recognizing in autonomous driving that a baby stroller is rolling on the street. The event is rare in training and testing data, but highly critical for an objective function concerned with personal and property damage. Even evaluating performance in such a situation is challenging. One may stage experiments geared towards particular situations, but there is no guarantee that the staging conforms to the natural distribution of events. Even if it does, high-dimensional distributions have many tail ends, which are by their nature hard to enumerate manually.
Many publicly available computer vision datasets are responsible for great progress in visual recognition and analytics. These datasets serve as a source of large amounts of training data and as benchmarks for assessing the performance of competing state-of-the-art algorithms. Performance saturation on such datasets has led the community to believe that many general visual recognition problems are close to being solved, with various commercial offerings stemming from models trained on such data. However, such datasets present significant biases in terms of both categories and image quality, creating a significant gap between their distribution and the data coming from the real world. For example, many of the publicly available datasets underrepresent certain ethnic and cultural communities and overrepresent others. Many variations have been observed to impact visual recognition, including resolution, illumination, and simple cultural variations of similar objects. Systems based on a skewed training dataset are bound to produce skewed results. This mismatch is evidenced by the significant drop in performance of state-of-the-art models trained on those datasets when they are applied, for example, to images of particular gender and/or ethnicity groups in face analytics. It has been shown that such biases may seriously affect performance in challenging situations where the outcome is critical either for the subject or for a community. Research evaluations are often unaware of those issues and focus instead on saturating performance on skewed datasets.
To progress toward fair visual recognition truly in the wild, we propose this workshop to understand the underlying issues in bias-free and culturally diverse visual recognition.
Under such circumstances, our workshop on Fair, Data Efficient and Trusted Computer Vision will address critical issues in enhancing user trust in AI and computer vision systems, namely: (i) fairness, (ii) data-efficient learning, and critical aspects of trust, including (iii) explainability, (iv) robust mitigation of adversarial attacks, and (v) improved privacy and security in model building, with the right level of credit assignment to the data sources along with transparency in lineage.
We solicit submissions of technical papers. The accepted papers will be published in the CVPR 2024 workshop proceedings and presented at the workshop. Please submit at the CMT Submission Site.
Submitted technical papers must follow the CVPR paper format and guidelines (see CVPR 2024 Author Guidelines). All accepted submissions must be presented by one of the authors.
Submission deadline for technical papers: March 25, 2024, 11:59pm Pacific Time
Notification to authors: April 7, 2024, 11:59pm Pacific Time
Camera ready deadline: April 14, 2024, 11:59pm Pacific Time
We invite submissions of original work. The review will be double-blind.
We solicit original research papers covering the following areas: