Fixes That Fail: Self-Defeating Improvements in Machine-Learning Systems
Machine-learning systems such as self-driving cars or virtual assistants are composed of a large number of machine-learning models that recognize image content, transcribe speech, analyze natural language, infer preferences, rank options, etc. These systems can be represented as directed acyclic graphs in which each vertex is a model, and models feed each other information over the edges. Often, the models are developed and trained independently, which raises an obvious concern: Can improving a machine-learning model make the overall system worse? We answer this question affirmatively by showing that improving a model can degrade the performance of downstream models, even after those downstream models are retrained. Such self-defeating improvements are the result of entanglement between the models. We identify different types of entanglement and demonstrate via simple experiments how they can produce self-defeating improvements. We also show that self-defeating improvements emerge in a realistic stereo-based object detection system.
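The toy simulation below is a minimal sketch of this phenomenon, not an experiment from the paper; the setup and variable names are illustrative assumptions. It models one form of entanglement: an upstream model whose errors happen to leak information the downstream task needs. Swapping in an upstream model that is strictly better on its own metric removes that leak, and the downstream model gets worse even after being retrained on the new upstream outputs.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test = 10_000, 10_000

def make_data(n):
    # Two independent latent features; only the upstream model sees them.
    x1 = rng.normal(size=n)
    x2 = rng.normal(size=n)
    u = x1            # upstream target
    y = x1 + x2       # downstream target
    return x1, x2, u, y

def evaluate(upstream_predict):
    # Retrain the downstream model (1-D least squares) on the upstream
    # model's outputs, then evaluate both models on fresh data.
    x1, x2, u, y = make_data(n_train)
    a = upstream_predict(x1, x2)
    w = np.dot(a, y) / np.dot(a, a)          # least-squares fit y ~ w * a
    x1t, x2t, ut, yt = make_data(n_test)
    at = upstream_predict(x1t, x2t)
    downstream_mse = np.mean((w * at - yt) ** 2)
    upstream_mse = np.mean((at - ut) ** 2)
    return upstream_mse, downstream_mse

# "Old" upstream model: imperfect on its own task, but its error leaks x2,
# which is exactly the information the downstream task needs.
old = lambda x1, x2: x1 + 0.5 * x2
# "Improved" upstream model: perfect on its own task, but the leak is gone.
new = lambda x1, x2: x1

for name, model in [("old upstream", old), ("improved upstream", new)]:
    u_mse, d_mse = evaluate(model)
    print(f"{name}: upstream MSE = {u_mse:.2f}, "
          f"downstream MSE after retraining = {d_mse:.2f}")
```

Under this setup, the improved upstream model drives its own error to (near) zero while the retrained downstream error roughly quintuples, which is the signature of a self-defeating improvement caused by entanglement between the two models.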