publications
publications by categories in reversed chronological order. generated by jekyll-scholar.
2025
- ArXivThe Lagrangian Method for Solving Constrained Markov GamesSoham Das, Santiago Paternain, Luiz F. O. Chamon, and Ceyhun EksinarXiv, 2025
We propose the concept of a Lagrangian game to solve constrained Markov games. Such games model scenarios where agents face cost constraints in addition to their individual rewards, that depend on both agent joint actions and the evolving environment state over time. Constrained Markov games form the formal mechanism behind safe multiagent reinforcement learning, providing a structured model for dynamic multiagent interactions in a multitude of settings, such as autonomous teams operating under local energy and time constraints, for example. We develop a primal-dual approach in which agents solve a Lagrangian game associated with the current Lagrange multiplier, simulate cost and reward trajectories over a fixed horizon, and update the multiplier using accrued experience. This update rule generates a new Lagrangian game, initiating the next iteration. Our key result consists in showing that the sequence of solutions to these Lagrangian games yields a nonstationary Nash solution for the original constrained Markov game.
@article{das2025lagrangian, title = {The Lagrangian Method for Solving Constrained Markov Games}, author = {Das, Soham and Paternain, Santiago and Chamon, Luiz F. O. and Eksin, Ceyhun}, journal = {arXiv}, year = {2025}, } - NeurIPS CoMLThe Lagrangian Method for Solving Locally Constrained Markov GamesSoham Das, Santiago Paternain, Luiz F. O. Chamon, and Ceyhun EksinNeurIPS 2025 Workshop on Constrained Optimization for Machine Learning (CoML), Non-archival workshop paper (Upcoming), 2025
We introduce a dual gradient descent method to solve Markov games subject to local but coupled constraints. These constrained Markov games (CMGs) model multiagent systems operating in dynamic environments where each agent seeks to maximize its individual utility while respecting specific local constraints. These constraints, which capture real-world limitations such as resource budgets or safety requirements, are local in the sense that they pertain to individual agents but are coupled through dependencies on other agents’ actions and the shared environment state. Our approach builds on the concept of the unconstrained Lagrangian game, which agents repeatedly solve based on the current values of their own local Lagrange multipliers. Agents simulate trajectories to estimate cumulative rewards and constraint violations, then adjust the multipliers using a dual gradient step grounded in their local observations. This primal-dual process produces a sequence of policy profiles that converge to a nonstationary and feasible Nash equilibrium of the original constrained game. This work extends previous frameworks for solving Markov games with global constraints, providing a principled approach to safe multiagent learning under localized, interdependent feasibility conditions.
@article{das2025localconstraints, title = {The Lagrangian Method for Solving Locally Constrained Markov Games}, author = {Das, Soham and Paternain, Santiago and Chamon, Luiz F. O. and Eksin, Ceyhun}, journal = {NeurIPS 2025 Workshop on Constrained Optimization for Machine Learning (CoML), Non-archival workshop paper (Upcoming)}, year = {2025}, } - IEEE CDCA Lagrangian Framework for Safe Cooperative Reinforcement LearningSoham Das, Luiz F. O. Chamon, Santiago Paternain, and Ceyhun EksinIEEE Conference on Decision and Control (Upcoming), 2025
We consider the problem of safe cooperative multiagent reinforcement learning (MARL) within the framework of a constrained multiagent Markov decision process (MDP). Agents share a common value function and learn to coordinate their actions to maximize a joint objective while adhering to system-level constraints. These constraints can enforce safety, reliability, or additional regulatory requirements governing the evolution of the multiagent system. We propose a Lagrangian-based approach, where agents iteratively solve a relaxed Lagrangian MDP using a joint learning mechanism. During execution, agents independently follow their policies, accumulating constraint violations over an epoch, which are then used to update the Lagrange multipliers. We show that continuous execution of this primal-dual algorithm produces episodes which are feasible almost surely. Further, we prove that the sequence of policies generated by the algorithm yields a nonstationary approximately optimal solution for the safe cooperative MARL problem.
@article{das2025cooperative, title = {A Lagrangian Framework for Safe Cooperative Reinforcement Learning}, author = {Das, Soham and Chamon, Luiz F. O. and Paternain, Santiago and Eksin, Ceyhun}, journal = {IEEE Conference on Decision and Control (Upcoming)}, year = {2025}, }
2024
- SIAMAverage Submodularity of Maximizing Anticoordination in Network GamesSoham Das, and Ceyhun EksinSIAM Journal on Control and Optimization, 2024
Abstract. We consider the control of decentralized learning dynamics for agents in an anticoordination network game. In the anticoordination network game, there is a preferred action in the absence of neighbors’ actions, and the utility an agent receives from the preferred action decreases as more of its neighbors select the preferred action, potentially causing the agent to select a less desirable action. The decentralized dynamics that are based on the synchronous best-response dynamics converge for the considered payoffs. Given a convergent action profile, we measure anticoordination by the number of edges in the underlying graph that have at least one agent in either end of the edge not taking the preferred action. A designer wants to find an optimal set of agents to control under a finite budget in order to achieve maximum anticoordination (MAC) on game convergence as a result of the dynamics. We show that the MAC is submodular in expectation over all realizations of the payoff interaction constants in bipartite networks. The proof relies on characterizing well-behavedness of MAC instances for bipartite networks, and designing a coupling between the dynamics and another distribution preserving selection protocol, for which we can show the diminishing returns property. Utilizing this result, we obtain a performance guarantee for the greedy optimization of MAC. Finally, we provide a computational study to show the effectiveness of greedy node selection strategies to solve MAC on general bipartite networks.
@article{doi:10.1137/22M1506614, author = {Das, Soham and Eksin, Ceyhun}, title = {Average Submodularity of Maximizing Anticoordination in Network Games}, journal = {SIAM Journal on Control and Optimization}, volume = {62}, number = {5}, pages = {2639-2663}, year = {2024}, doi = {10.1137/22M1506614}, eprint = {https://epubs.siam.org/doi/pdf/10.1137/22M1506614} } - IEEE LCSSLearning Nash in Constrained Markov Games With an α -PotentialSoham Das, and Ceyhun EksinIEEE Control Systems Letters. Winner, Outstanding Student Paper Prize, IEEE Control Systems Society, TC Networked Systems , 2024
We develop a best-response algorithm for solving constrained Markov games assuming limited violations for the potential game property. The limited violations of the potential game property mean that changes in value function due to unilateral policy alterations can be measured by the potential function up to an error α . We show the existence of stationary ϵ -approximate constrained Nash policy whenever the set of feasible stationary policies is non-empty. Our setting has agents accessing an efficient probably approximately correct solver for a constrained Markov decision process which they use for generating best-response policies against the other agents’ former policies. For an accuracy threshold ϵ>4α , the best-response dynamics generate provable convergence to ϵ -Nash policy in finite time with probability at least 1−δ at the expense of polynomial bounds on sample complexity that scales with the reciprocal of ϵ and δ .
@article{10531766, author = {Das, Soham and Eksin, Ceyhun}, journal = {IEEE Control Systems Letters}, title = {Learning Nash in Constrained Markov Games With an α -Potential}, year = {2024}, volume = {8}, number = {}, pages = {808-813}, keywords = {Games;Stochastic processes;Picture archiving and communication systems;Kernel;Finite element analysis;Complexity theory;Vehicle dynamics;Game theory;constrained control;optimization;machine learning}, doi = {10.1109/LCSYS.2024.3402132}, }
2023
- IISE Tran.Statistical and dynamic model of surface morphology evolution during polishing in additive manufacturingAdithyaa Karthikeyan, Soham Das, Satish TS Bukkapatnam, and Ceyhun EksinIISE Transactions. Featured in ISE Magazine , 2023
Many industrial components, especially those realized through 3D printing undergo surface finishing processes, predominantly, in the form of mechanical polishing. The polishing processes for custom components remain manual and iterative. Determination of the polishing endpoints, i.e., when to stop the process to achieve a desired surface finish, remains a major obstacle to process automation and in the cost-effective custom/3D printing process chains. With the motivation to automate the polishing process of 3D printed materials to a desired level of surface smoothness, we propose a dynamic model of surface morphology evolution of 3D printed materials during a polishing process. The dynamic model can account for both material removal and redistribution during the polishing process. In addition, the model accounts for increased material flow due to heat generated during the polishing process. We also provide an initial random surface model that matches the initial surface statistics. We propose an optimization problem for model parameter estimation based on empirical data using KL-divergence and surface roughness as two metrics of the objective. We validate the proposed model using data from polishing of a 3D printed sample. The procedures developed makes the model applicable to other 3D printed materials and polishing processes. We obtain a network formation model as a representation of the surface evolution from the heights and radii of asperities. We use the network connectivity (Fiedler number) as a metric for surface smoothness that can be used to determine whether a desired level of smoothness is reached or not.
@article{karthikeyan2023statistical, title = {Statistical and dynamic model of surface morphology evolution during polishing in additive manufacturing}, author = {Karthikeyan, Adithyaa and Das, Soham and Bukkapatnam, Satish TS and Eksin, Ceyhun}, journal = {IISE Transactions}, pages = {1--15}, year = {2023}, publisher = {Taylor \& Francis}, }
2022
- IEEE CDCApproximate Submodularity of Maximizing Anticoordination in Network GamesSoham Das, and Ceyhun EksinIn 2022 IEEE 61st Conference on Decision and Control (CDC), 2022
In this paper we show that MAC is approximately submodular, hence a greedy approach to selecting control agents guarantees fractionally sub-optimal solutions for the line network. Moreover, we provide a special scenario for bipartite graphs that shows that the violation of submodularity can be arbitrarily bad, highlighting limitations of our provided guarantees in such worst case situations, thereby indicating that greedy approaches may not be always desirable in attaining anti-coordination. We further provide encouraging computational results which strongly suggest the effectiveness of greedy selection in practical scenarios.
@inproceedings{9993180, author = {Das, Soham and Eksin, Ceyhun}, booktitle = {2022 IEEE 61st Conference on Decision and Control (CDC)}, title = {Approximate Submodularity of Maximizing Anticoordination in Network Games}, year = {2022}, volume = {}, number = {}, pages = {3151-3157}, keywords = {Heuristic algorithms;Sociology;Games;Approximation algorithms;Cognition;Statistics;Convergence}, doi = {10.1109/CDC51059.2022.9993180}, }