Zichong Wang, Avash Palikhe, Zhipeng Yin, Jiale Zhang and Wenbin Zhang
The 34th International Joint Conference on Artificial Intelligence (IJCAI), 2025
The 34th ACM International Conference on Information and Knowledge Management (CIKM), 2025
The 25th IEEE International Conference on Data Mining (ICDM), 2025
Large Language Models achieve outstanding performance across diverse applications but often produce biased outcomes, raising concerns about their trustworthy deployment. While fairness has been extensively studied in machine learning, most existing tutorials focus on settings where model internals or training data are accessible, assumptions that often do not hold for LLMs. As LLMs increasingly influence society, addressing fairness becomes crucial, including understanding how bias emerges, quantifying it effectively, and mitigating it. This tutorial systematically reviews recent fairness advancements specific to LLMs. It begins by presenting real-world examples of bias and identifying underlying bias sources. Next, it defines fairness concepts tailored to LLMs and reviews various bias evaluation methods and fairness-enhancing algorithms. We also present a multi-dimensional taxonomy of benchmark datasets for fairness evaluation and conclude with a discussion of open research challenges. All tutorial resources are publicly accessible at https://github.com/lavinWong/fairness-in-large-language-models.
Zhipeng Yin, Zichong Wang, Avash Palikhe and Wenbin Zhang
The 34th ACM International Conference on Information and Knowledge Management (CIKM), 2025
As generative artificial intelligence (AI) becomes increasingly prevalent in creative industries, intellectual property issues have come to the forefront, especially regarding AI-generated content that closely resembles human-created works. Recent high-profile incidents involving AI-generated outputs reproducing copyrighted materials underscore the urgent need to reassess current copyright frameworks and establish effective safeguards against infringement. To this end, this tutorial provides a structured overview of copyright challenges in generative AI across the entire development lifecycle. It begins by outlining key copyright principles relevant to generative models, then explores methods for detecting and evaluating potential infringement in generated outputs. The session also introduces strategies to safeguard creative content and training data from unauthorized replication, including mitigation techniques during model training. Finally, it reviews existing regulatory frameworks, highlights unresolved research questions, and offers recommendations to guide future work in this evolving area.
Archer Amon, Zichong Wang, Zhipeng Yin and Wenbin Zhang
The 24th IEEE International Conference on Data Mining (ICDM), 2024
In the rapidly evolving landscape of generative artificial intelligence (AI), the increasingly pertinent issue of copyright infringement arises as AI advances to generate content from scraped copyrighted data, prompting questions about ownership and protection that affect professionals across many careers. With this in mind, this survey provides an extensive examination of copyright infringement as it pertains to generative AI, aiming to stay abreast of the latest developments and open problems. Specifically, it will first outline methods of detecting copyright infringement in mediums such as text, image, and video. Next, it will delve into existing techniques aimed at safeguarding copyrighted works from generative models. Furthermore, this survey will discuss resources and tools for users to evaluate copyright violations. Finally, insights into ongoing regulations and proposals for AI will be explored and compared. By combining these disciplines, the survey thoroughly illustrates and critically examines the implications of AI-driven content for copyright.
Thang Viet Doan, Zichong Wang, Nhat Hoang and Wenbin Zhang
The 33rd ACM International Conference on Information and Knowledge Management (CIKM), 2024
Large Language Models (LLMs) have demonstrated remarkable success across various domains but often lack fairness considerations, potentially leading to discriminatory outcomes against marginalized populations. Unlike fairness in traditional machine learning, fairness in LLMs involves unique backgrounds, taxonomies, and fulfillment techniques. This tutorial provides a systematic overview of recent advances in the literature concerning fair LLMs, beginning with real-world case studies to introduce LLMs, followed by an analysis of bias causes therein. The concept of fairness in LLMs is then explored, summarizing the metrics for evaluating bias and the algorithms designed to promote fairness. Additionally, resources for assessing bias in LLMs, including toolkits and datasets, are compiled, and current research challenges and open questions in the field are discussed.
Eric Xu, Wenbin Zhang and Weifeng Xu
The 33rd ACM International Conference on Information and Knowledge Management (CIKM), 2024
In the pursuit of justice and accountability in the digital age, the integration of Large Language Models (LLMs) with digital forensics holds immense promise. This half-day tutorial provides a comprehensive exploration of the transformative potential of LLMs in automating digital investigations and uncovering hidden insights. Through a combination of real-world case studies, interactive exercises, and hands-on labs, participants will gain a deep understanding of how to harness LLMs for evidence analysis, entity identification, and knowledge graph reconstruction. By fostering a collaborative learning environment, this tutorial aims to empower professionals, researchers, and students with the skills and knowledge needed to drive innovation in digital forensics. As LLMs continue to revolutionize the field, this tutorial will have far-reaching implications for enhancing justice outcomes, promoting accountability, and shaping the future of digital investigations.
All course materials are available on Canvas
Florida International University
This course offers a comprehensive exploration of fairness and bias in AI systems, emphasizing the responsible and human-centered use of algorithms and data. It focuses on the intersection of data, language, networks, and machine learning with fairness, introducing both foundational and advanced concepts in algorithmic bias and its societal impacts. Topics include fairness issues in large language models (LLMs) and generative AI, alongside traditional learning systems, examining how these technologies can perpetuate or mitigate bias in various domains. Students will engage with real-world challenges and apply quantitative methods to detect and address bias in areas such as healthcare, education, and housing. Open to graduate students from all disciplines, the course supports collaboration across technical and applied fields to guide the development of responsible and context-aware AI systems.
Florida International University
In this course, you will learn the fundamental ideas and techniques used in the construction of problem solvers that employ artificial intelligence (AI) technology. Topics include knowledge representation and reasoning, problem solving, search heuristics, inference mechanisms, and machine learning, as well as advanced topics such as AI fairness, natural language processing, and knowledge graphs.