Feature-Conditioned Reinforcement Learning for Generalizable Engineering Optimization

Varun S. Chavan, Mitsuru Endo, Zexin Shan, Yukio Tsutsui

IEEE/SICE International Symposium on System Integration (SII2026)

MoAT2.2


Background

Black-box optimization is often essential for global optimization problems. However, many existing methods rely on fixed hyperparameters and require manual tuning, which limits their ability to generalize across problem instances. Feature-Conditioned Reinforcement Learning (FC-RL) introduces a meta-network on top of the actor network of the Deep Deterministic Policy Gradient (DDPG) algorithm [2]. This meta-network applies Feature-wise Linear Modulation (FiLM) [1] to condition policy behavior on explicit problem features, enabling adaptability and generalization.
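
As a rough illustration, the PyTorch sketch below shows how a meta-network could emit per-layer FiLM parameters that modulate a DDPG-style actor. The class name, layer sizes, and feature encoding are assumptions for illustration, not the authors' implementation.

import torch
import torch.nn as nn

class FiLMActor(nn.Module):
    """Sketch of a DDPG-style actor whose hidden layers are modulated by FiLM
    parameters produced from explicit problem features (all sizes assumed)."""

    def __init__(self, state_dim, action_dim, feature_dim, hidden_dim=256):
        super().__init__()
        self.fc1 = nn.Linear(state_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, action_dim)
        # Meta-network: problem features -> per-layer FiLM parameters (gamma, beta)
        self.meta = nn.Sequential(
            nn.Linear(feature_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 4 * hidden_dim),  # gamma1 | beta1 | gamma2 | beta2
        )

    def forward(self, state, problem_features):
        g1, b1, g2, b2 = self.meta(problem_features).chunk(4, dim=-1)
        h = torch.relu(g1 * self.fc1(state) + b1)   # FiLM: gamma * h + beta
        h = torch.relu(g2 * self.fc2(h) + b2)
        return torch.tanh(self.out(h))              # bounded continuous action

In this sketch, changing problem_features re-conditions the same trained weights, which is the mechanism that would let one policy adapt to a new problem configuration without retraining.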


Method

The optimization task is formulated as a Markov Decision Process (MDP), and the FC-RL agent is trained on it in a staged manner [2]. The actor learns to produce actions based on the current state, while the meta-network modulates the actor's behavior according to the problem type and configuration. As a result, even when the problem configuration changes, the conditioned policy adapts its actions accordingly, enabling generalization without retraining.

Fig. 1: Conceptual diagram of the FC-RL architecture illustrating feature conditioning via FiLM layers.
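
To make the MDP formulation concrete, the sketch below shows one plausible rollout loop in which the conditioned policy iteratively proposes search moves and is rewarded for improving the black-box objective. The state definition, action bounds, and improvement-based reward are assumptions for illustration only.

import numpy as np

def run_episode(objective, actor, problem_features, x0, horizon=50):
    """One rollout of the assumed optimization MDP: the FiLM-conditioned
    policy proposes search moves; reward is objective improvement."""
    x = np.asarray(x0, dtype=np.float32)
    best = float(objective(x))
    total_reward = 0.0
    for _ in range(horizon):
        state = np.concatenate([x, [best]])      # assumed state: candidate + best value so far
        action = actor(state, problem_features)  # deterministic, feature-conditioned action
        x = np.clip(x + action, -1.0, 1.0)       # assumed normalized search space
        value = float(objective(x))
        reward = best - value                    # reward: improvement (minimization)
        best = min(best, value)
        total_reward += reward
    return best, total_reward

Because the policy is deterministic, repeating this rollout with the same initial point and problem features yields the same trajectory, which is consistent with the reproducibility property noted in the results.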


Results

Comparative evaluations show that FC-RL outperforms existing optimization methods such as Particle Swarm Optimization (PSO), Genetic Algorithms (GA), DIRECT (DIviding RECTangles), and a baseline DDPG approach. The proposed method demonstrates strong generalization across varying problem configurations without the need for retuning or retraining. In addition, the deterministic nature of the policy leads to reproducible outputs, supporting scalable optimization workflows.


Conclusion

FC-RL provides a practical solution for repetitive engineering optimization tasks, handling varying problems and configurations without manual tuning or retraining. This property improves optimization efficiency and makes the approach suitable for engineering design workflows that require repeated optimization under changing conditions.


References

[1] E. Perez et al., "FiLM: Visual reasoning with a general conditioning layer," Proc. AAAI Conf. on Artificial Intelligence, 2018.

[2] T. P. Lillicrap et al., "Continuous control with deep reinforcement learning," Proc. Int. Conf. on Learning Representations (ICLR), 2016.

