Unsupervised Learning System for a Self-Adaptive Learning Model with Imitation Learning

1. Imitation Learning with Unsupervised Learning Techniques

Figure 1. Self-Adaptive Learning Model with Imitation Learning

A core part of IDX Foundation technology is developing self-adaptive agents that recognize their own context and plan their actions accordingly. Such agents are intended for use in various fields, including autonomous driving and robotics control. To this end, machine learning algorithms based on deep neural networks, such as reinforcement learning and imitation learning (learning from expert demonstrations), are emerging. However, current algorithms do not yet guarantee agent performance in complex decision-making environments where the scope of the state data, the action data, and the reward for agent actions is not clearly defined. To address this, a self-adaptive agent must extract, from the state data it currently encounters, the high-level features that help it make decisions. Unsupervised learning is an effective learning paradigm for extracting the features inherent in raw data, and the aim is to apply it to the learning algorithm so that trained agents can make accurate decisions even on unseen or complex state data.
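The idea of extracting features without labels and then learning a policy from expert data can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's model: PCA stands in for the unsupervised learner, a least-squares linear map stands in for the imitation (behavioral cloning) policy, and the data, dimensions, and function names (`fit_pca`, `encode`, `fit_bc_policy`, `act`) are all hypothetical.

```python
import numpy as np

def fit_pca(states, n_components):
    """Unsupervised step: learn a linear feature extractor from states alone."""
    mean = states.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(states - mean, full_matrices=False)
    return mean, vt[:n_components]

def encode(states, mean, components):
    """Project raw states onto the learned feature directions."""
    return (states - mean) @ components.T

def fit_bc_policy(features, actions):
    """Imitation step: least-squares regression from features to expert actions."""
    x = np.hstack([features, np.ones((len(features), 1))])  # add a bias column
    weights, *_ = np.linalg.lstsq(x, actions, rcond=None)
    return weights

def act(features, weights):
    """Predict actions for a batch of encoded states."""
    x = np.hstack([features, np.ones((len(features), 1))])
    return x @ weights

# Synthetic stand-in data (assumption): 10-D raw states driven by a 2-D latent
# factor, with expert actions that depend only on that latent factor.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
states = latent @ rng.normal(size=(2, 10)) + 0.01 * rng.normal(size=(500, 10))
expert_actions = latent @ np.array([[1.0], [-0.5]])

mean, components = fit_pca(states, n_components=2)
features = encode(states, mean, components)
weights = fit_bc_policy(features, expert_actions)
bc_error = float(np.mean((act(features, weights) - expert_actions) ** 2))
print(f"behavioral cloning MSE on expert states: {bc_error:.6f}")
```

The feature extractor never sees the expert actions, so it could in principle be fitted on a larger pool of unlabeled states than the demonstrations cover.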

Figure 2. MCFS-BC Model


2. Analysis of the Effect of Unseen States on Single Imitation Learning by Measuring the Cascading Error

  • The significance of imitation learning lies in its ability to learn context-sensitive policies efficiently from expert state and action data, even in difficult situations where the reward for an action is not mathematically defined or is sparse.
  • Imitation learning is therefore well suited to strategic behavior learning, where subsequent performance is difficult to quantify clearly.
  • However, this creates a contradiction: because practical strategic behavior learning situations lack clear rewards, the performance of imitation learning on unseen states is difficult to verify.
  • To overcome this contradiction and demonstrate the effectiveness of imitation learning, performance is verified using physics engines that simulate physical characteristics, together with simulated environments built on them in which a reward function can be defined.
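The verification idea in the points above can be illustrated with a toy simulation. This is a sketch under stated assumptions, not the physics-engine setup used in the study: the 1-D dynamics, the reward `-x**2`, the proportional expert controller, the nearest-neighbor cloned policy, and the `seen_radius` threshold are all hypothetical choices made so that the reward is defined and the cascading error can be measured directly.

```python
import numpy as np

DT = 0.1  # integration step of the toy simulator (assumption)

def step(x, u):
    """Toy dynamics: a 1-D integrator."""
    return x + DT * u

def reward(x):
    """Reward is defined in simulation, so performance can be scored."""
    return -x ** 2

def expert_policy(x):
    """Hypothetical expert: proportional controller driving x to 0."""
    return -2.0 * x

def rollout(policy, x0, horizon):
    """Run a policy from x0 and return the visited states (length horizon + 1)."""
    xs = [float(x0)]
    for _ in range(horizon):
        xs.append(step(xs[-1], policy(xs[-1])))
    return np.array(xs)

# Expert demonstrations only cover start states in [-1, 1].
demo_states = np.concatenate(
    [rollout(expert_policy, s, 30)[:-1] for s in np.linspace(-1.0, 1.0, 21)]
)
demo_actions = expert_policy(demo_states)

def cloned_policy(x):
    """Imitation learner: copy the expert action of the nearest demonstrated state."""
    return demo_actions[np.argmin(np.abs(demo_states - x))]

def evaluate(x0, horizon=30, seen_radius=0.1):
    """Compare learner and expert rollouts started from the same state."""
    expert_xs = rollout(expert_policy, x0, horizon)
    learner_xs = rollout(cloned_policy, x0, horizon)
    gap = np.abs(learner_xs - expert_xs)  # per-step deviation from the expert
    nearest = np.abs(learner_xs[:, None] - demo_states[None, :]).min(axis=1)
    return {
        "max_gap": float(gap.max()),
        "unseen_steps": int((nearest > seen_radius).sum()),
        "learner_return": float(reward(learner_xs).sum()),
        "expert_return": float(reward(expert_xs).sum()),
    }

seen = evaluate(0.75)   # start inside the demonstrated region
unseen = evaluate(3.0)  # start outside it
print("seen start:  ", seen)
print("unseen start:", unseen)
```

Starting outside the demonstrated region, the learner keeps copying the action of the closest demonstrated state, so its trajectory drifts away from the expert's over several steps before it re-enters familiar states; the per-step gap and the return difference quantify that drift.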

Figure 3. Unseen State Analysis


3. Data Preprocessing Method to Improve Performance
The agent does not learn sufficiently for states that are absent from the expert trajectory data given in imitation learning, i.e., unseen states. In addition, the simulators to be implemented in this study must be supplied with unobserved states so that they can present a variety of states. Therefore, this study investigated a method for generating unseen states.
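The generation method itself is not described in the text, so the following is only one possible way to produce unseen states, given as an assumption: perturb expert states with Gaussian noise and keep the candidates that lie farther than a chosen distance from every expert state. The function name `generate_unseen_states` and the parameters `noise_scale`, `min_distance`, and `max_tries` are hypothetical.

```python
import numpy as np

def generate_unseen_states(expert_states, n_samples, noise_scale,
                           min_distance, seed=0, max_tries=100):
    """Rejection-sample states that are at least min_distance from all expert states.

    May return fewer than n_samples rows if max_tries is exhausted.
    """
    rng = np.random.default_rng(seed)
    accepted, n_accepted = [], 0
    for _ in range(max_tries):
        # Perturb randomly chosen expert states.
        base = expert_states[rng.integers(len(expert_states), size=n_samples)]
        candidates = base + noise_scale * rng.normal(size=base.shape)
        # Distance from each candidate to its nearest expert state.
        dists = np.linalg.norm(
            candidates[:, None, :] - expert_states[None, :, :], axis=2
        ).min(axis=1)
        kept = candidates[dists > min_distance]
        accepted.append(kept)
        n_accepted += len(kept)
        if n_accepted >= n_samples:
            break
    return np.concatenate(accepted)[:n_samples]

# Stand-in expert data (assumption): 200 two-dimensional states in [-1, 1]^2.
expert_states = np.random.default_rng(1).uniform(-1.0, 1.0, size=(200, 2))
unseen = generate_unseen_states(
    expert_states, n_samples=50, noise_scale=1.0, min_distance=0.3
)
print(unseen.shape)
```

The `min_distance` threshold controls how far from the demonstrations a state must be before it counts as unseen, and `noise_scale` controls how far the candidates spread; both would need tuning for a real state space.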

Figure 4. Unseen State Generation Model