Routine safety training can largely neutralize such simple backdoors.