Research

I do research on reinforcement learning (RL), with a focus on generalization and curriculum learning (CL). My publications can be found on my Google Scholar page, and some are listed with details below. During my Ph.D. so far, I have done research internships at the Bosch Center for Artificial Intelligence and the Honda Research Institute. At Bosch, I developed CL4AD, the first integration of CL into batched autonomous driving simulators, accelerating training by 77%. At HRI, I developed an action advising framework, Gen2Spec, that distills knowledge from generalist agents to specialists in a continual learning setting. Recently, I have been focusing on the following research directions:

Curriculum Learning for Reinforcement Learning

The design of task sequences, i.e., curricula, improves the performance of RL agents and speeds up convergence in complex tasks. An effective curriculum typically begins with easy tasks and gradually shifts them toward the target tasks. Common approaches tailor curricula manually, which requires domain knowledge, possibly unavailable, to identify easy and hard tasks. My research focuses on developing automated curriculum generation algorithms that utilize generative models, introduce notions of risk and uncertainty, address constrained RL, and exploit task specifications. A minimal sketch of the core idea appears after the publication list below.

Publications
- Cevahir Koprulu, Thiago D. Simão, Nils Jansen, Ufuk Topcu. International Conference on Learning Representations (ICLR), 2025.
- Cevahir Koprulu, Thiago D. Simão, Nils Jansen, Ufuk Topcu. Conference on Uncertainty in Artificial Intelligence (UAI), 2023.
- Cevahir Koprulu, Ufuk Topcu. Conference on Uncertainty in Artificial Intelligence (UAI), 2023.
- Cevahir Koprulu, Ufuk Topcu. International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2023 (accepted as an extended abstract).

Workshops
- Cevahir Koprulu, Thiago D. Simão, Nils Jansen, Ufuk Topcu. RLBRew and RLSW at the Reinforcement Learning Conference, 2024.
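To make the easy-to-hard progression concrete, here is a minimal, hypothetical sketch of a self-paced task sampler: it samples task parameters around an easy starting context and shifts the sampling distribution toward the target context as the agent's performance improves. The class name, interpolation rule, and thresholds are illustrative assumptions, not the algorithms from the publications above.

```python
import numpy as np

class SelfPacedTaskSampler:
    """Hypothetical curriculum sampler: tasks are parameterized by a
    context vector, and the sampling distribution is nudged from an
    easy starting context toward the target context whenever the
    agent's recent return clears a performance threshold."""

    def __init__(self, easy_ctx, target_ctx, std=0.1,
                 perf_threshold=0.7, step_size=0.05):
        self.mean = np.asarray(easy_ctx, dtype=float)    # start easy
        self.target = np.asarray(target_ctx, dtype=float)
        self.std = std
        self.perf_threshold = perf_threshold
        self.step_size = step_size

    def sample_task(self):
        # Draw a task context around the current curriculum mean.
        return np.random.normal(self.mean, self.std)

    def update(self, mean_return, return_scale=1.0):
        # If the agent performs well enough on the current tasks,
        # move the curriculum a small step toward the target.
        if mean_return / return_scale >= self.perf_threshold:
            self.mean = self.mean + self.step_size * (self.target - self.mean)
```

In use, sample_task would supply the context for each training episode, and update would be called with the agent's recent average return after each evaluation round.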
Exploiting Side Information for Offline and Mixed RL

We investigate the use of side information to improve the performance of RL agents in offline and mixed (offline + online) settings. For offline RL, we develop NUNO, an uncertainty-aware, offline model-based reinforcement learning approach built on neural stochastic differential equations, which leverages prior physics knowledge as an inductive bias and improves on the state of the art in low-quality data regimes. For mixed settings, we introduce a systematic reward-shaping framework that distills the information contained in (1) a task-agnostic prior data set and (2) a few task-specific expert demonstrations for dense, dynamics-aware reward synthesis. A sketch of demonstration-based reward shaping appears after the publication list below.

Publications
- Cevahir Koprulu, Franck Djeumou, Ufuk Topcu. International Conference on Learning Representations (ICLR), 2025.
- Cevahir Koprulu, Po-han Li, Tianyu Qiu, Ruihan Zhao, Tyler Westenbroek, David Fridovich-Keil, Sandeep Chinchali, Ufuk Topcu. Learning for Dynamics and Control (L4DC) Conference, 2025.
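As a simplified illustration of reward shaping from demonstrations, the sketch below uses a potential-based shaping term whose potential is the negative distance to the nearest expert-demonstration state. This distance-based potential is a stand-in for the dynamics-aware synthesis in the paper; the function names and the choice of potential are assumptions for illustration.

```python
import numpy as np

def make_demo_shaped_reward(demo_states, gamma=0.99, scale=1.0):
    """Hypothetical dense reward shaping: the potential of a state is
    its negative distance to the nearest expert-demonstration state."""
    demos = np.asarray(demo_states, dtype=float)  # shape (N, state_dim)

    def potential(s):
        # Negative distance to the closest demonstrated state.
        dists = np.linalg.norm(demos - np.asarray(s, dtype=float), axis=1)
        return -scale * dists.min()

    def shaped_reward(s, s_next, env_reward):
        # Potential-based shaping preserves the optimal policy
        # (Ng et al., 1999): r' = r + gamma * phi(s') - phi(s).
        return env_reward + gamma * potential(s_next) - potential(s)

    return shaped_reward
```

Because the shaping term is potential-based, it densifies the learning signal without changing which policies are optimal.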
Learning Reward Machines and Policies

We study the problem of reinforcement learning for a task encoded by a reward machine. The task is defined over a set of properties of the environment, called atomic propositions, which are represented by Boolean variables. One unrealistic assumption commonly made in the literature is that the truth values of these propositions are accurately known. In real situations, however, these truth values are uncertain, since they come from imperfect sensors. At the same time, reward machines can be difficult to model explicitly, especially when they encode complicated tasks. We develop a reinforcement-learning algorithm that infers a reward machine encoding the underlying task while learning how to execute it, despite the uncertainty in the propositions' truth values. A minimal reward-machine sketch appears after the publication entry below.

Publications
- Christos Verginis, Cevahir Koprulu, Sandeep Chinchali, Ufuk Topcu. Artificial Intelligence, 2024.
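For readers unfamiliar with reward machines, the following is a minimal sketch of the data structure: an automaton whose transitions are triggered by truth assignments over atomic propositions and emit rewards. The labeling here is assumed noiseless, which is exactly the assumption the work above relaxes; the names and the example task are illustrative.

```python
class RewardMachine:
    """Minimal reward machine: an automaton over truth assignments of
    atomic propositions. Unmatched assignments self-loop with zero reward."""

    def __init__(self, initial_state, transitions):
        # transitions: dict mapping (rm_state, frozenset of true
        # propositions) -> (next_rm_state, reward)
        self.state = initial_state
        self.transitions = transitions

    def step(self, true_props):
        # Advance the machine on the set of currently true propositions.
        key = (self.state, frozenset(true_props))
        next_state, reward = self.transitions.get(key, (self.state, 0.0))
        self.state = next_state
        return reward

# Example task: "reach the goal g, then return to the base b".
rm = RewardMachine(
    initial_state="u0",
    transitions={
        ("u0", frozenset({"g"})): ("u1", 0.0),  # goal reached
        ("u1", frozenset({"b"})): ("u2", 1.0),  # back at base: reward
    },
)
```

An RL agent would call step with the propositions reported by its sensors at each environment transition; sensor noise makes these reported truth values, and hence the machine's state, uncertain.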
Level-k Game Theory to Model Human Drivers

Level-k game theory is a hierarchical multi-agent decision-making model in which a level-k player is a best responder to a level-(k-1) player. In this work, we studied level-k game theory to model the reasoning levels of human drivers. Unlike existing methods, we proposed a dynamic approach in which the actions are the reasoning levels themselves, resulting in dynamic behavior: the agent adapts to its environment by exploiting different behavior models as available moves to choose from, depending on the requirements of the traffic situation. A minimal sketch of the level-k hierarchy appears after the publication list below.

Publications
- Cevahir Koprulu, Yildiray Yildiz. IEEE Conference on Control Technology and Applications (CCTA), 2021.
- Cevahir Koprulu, Yildiray Yildiz. arXiv preprint, 2021.
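The sketch below shows the level-k hierarchy on a generic matrix game: a level-0 player acts uniformly at random, and each level-k player best-responds to the level-(k-1) policy. It illustrates the hierarchy only, not the driving models or the dynamic level-selection scheme from the papers above; the payoff matrix and function names are assumptions.

```python
import numpy as np

def level_k_policy(payoff, k, n_actions):
    """Action distribution of a level-k player in a symmetric two-player
    game, where payoff[i, j] is the payoff of playing i against j."""
    policy = np.full(n_actions, 1.0 / n_actions)  # level-0: uniform play
    for _ in range(k):
        expected = payoff @ policy           # value of each action vs. level-(k-1)
        best = np.argmax(expected)           # best response
        policy = np.eye(n_actions)[best]     # deterministic best reply
    return policy

# Example: level-2 play in a 2-action game.
payoff = np.array([[3.0, 0.0],
                   [5.0, 1.0]])
print(level_k_policy(payoff, k=2, n_actions=2))
```

In the dynamic variant described above, the agent would treat the level k itself as its action, switching among these precomputed behavior models as the traffic situation demands.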
This website is based on Jon Barron's source code.
His website can be found here.