Cultural Evolution and Deep Learning
Evaluation and the Future of Science
Why did AI research stop exploring alternative technologies alongside deep learning?
Deep learning – the AI approach underpinning the current generative AI revolution – is a technology achieving marvelous things. Nevertheless, its hunger for data and processing power creates an array of intractable ethical and epistemic challenges, ranging from unsustainability to low interpretability and the amplification of undesirable social biases.
Weaving computational analyses of AI research papers with interviews across the history of AI, Koch suggests that the field’s transition from peer review to a system called benchmarking was pivotal in its collapse onto deep learning. Although evaluation is usually considered retrospective – coming at the end of the process – he argues that it is fundamentally prospective, emphasizing that the work researchers collectively choose to publish and promote has implications for where the science goes. The history of AI thus illustrates the strengths and weaknesses of different types of evaluation in creating distinct epistemic cultures. More specifically, the success of deep learning raises the provocative question of whether peer review, a plodding institution that has long drawn criticism of its own, is really necessary for effective science at all. Future work will turn this theoretical lens to contemporary developments in generative AI, which presents new and exciting challenges for scientific evaluation.
How does culture change?
Wedding concepts from cultural evolution, cognition, and macroevolutionary biology, Koch develops populational theories that link the ideas in people’s heads to broader patterns of cultural change. He is particularly interested in how competition between ideas for people’s time and attention drives dynamics in popular culture. In parallel, he builds formal statistical and machine learning models to operationalize these theories and test for competition between ideas in real cultural datasets. Current work applies this approach to a dataset of the formation and dissolution of tens of thousands of metal bands to argue that it is primarily cultural competition, not economic or historical trends, that has driven the popularity and emergence of subgenres over time.
Koch is also interested in integrating text- and graph-based AI with rigorous statistical frameworks. Past work has focused on how graph neural networks can be used to predict the parameters of citation time series from evolving graphs. He has also written a primer on the use of deep learning to estimate heterogeneous causal effects within the potential outcomes framework. Ongoing evaluation research is developing LLM bioethics classifiers to quantitatively estimate how and why problematic racial hereditarian science gets published.
Collaborators: David Peterson, Kushan Dasgupta, Daniele Silvestro, Aaron Panofsky, Jacob Foster, BJ Moses-Rosenthal

