Gods and Robots

In this episode of the podcast we shake things up! Neil is on the guest side of the table with his partner Rabbi Laura Janner-Klausner to discuss their upcoming project Gods and Robots. Katherine is joined on the host side by friend of the show professor Michael Littman. See...

arXiv Whitepapers

XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models
Without proper safeguards, large language models will readily follow malicious instructions and generate toxic content. This motivates safety efforts such as red-teaming and large-scale feedback learning, which aim to make models both helpful and harmless. However, there is a tension between these...

Robust Counterfactual Explanations for Neural Networks With Probabilistic Guarantees
There is an emerging interest in generating robust counterfactual explanations that would remain valid if the model is updated or changed even slightly. Towards finding robust counterfactuals, existing literature often assumes that the original model m and the new model M are bounded in the...

Bound by the Bounty: Collaboratively Shaping Evaluation Processes for Queer AI Harms
Bias evaluation benchmarks and dataset and model documentation have emerged as central processes for assessing the biases and harms of artificial intelligence (AI) systems. However, these auditing processes have been criticized for their failure to integrate the knowledge of marginalized communities...

News Articles

5 Essential Papers on AI Training Data
Algorithms workers can’t see are increasingly pulling the management strings
Privacy-preserving AI – Why do we need it?
Rewriting the rules of machine-generated art
Data systems that learn to be better
An automated health care system that understands when to step in
Dataset Splitting Best Practices in Python
Looking into the black box
Algorithm finds hidden connections between paintings at the Met
Appropriately Handling Missing Values for Statistical Modelling and Prediction