Lateral Control of an Autonomous Vehicle Using Reinforcement Learning
(강화 학습을 이용한 자율주행 차량의 횡 방향 제어)

ㆍ Authors: 이정훈, 오세영, 최두현
ㆍ Journal: 電子工學會論文誌. Journal of the Korean Institute of Telematics and Electronics. C
ㆍ Volume/Issue: 1998 | No. 11 | pp. 76-88 (13 pages)
ㆍ Publisher: 대한전자공학회
ㆍ File information: Periodical | PDF text
ㆍ Subject area: Other

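The full text of this paper is provided free of charge through a paper-linking arrangement with the Korea Institute of Science and Technology Information (한국과학기술정보연구원).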


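Abstract (translated from the Korean)

Most research on reinforcement learning has assumed a discrete space, but many practical control problems are set in continuous spaces. If the value function and the action function are made continuous functions, the reinforcement learning architecture can be used in a continuous space. Two points must be considered in that case, however. One is what kind of function representation to use, and the other is how to determine the amount of noise to add. Neural networks were used for the value function and the policy function (the controller). A reinforcement predictor predicts the reinforcement signal at the next time step and also determines the amount of noise to add. Using the proposed reinforcement learning architecture, the characteristics of on-line learning were confirmed in simulations of lateral vehicle control. The proposed architecture was also applied to experiments on a real vehicle to verify its usefulness and validity.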

Abstract (English)

While most research on reinforcement learning has assumed a discrete control space, many real-world control problems require continuous output. This can be achieved by using continuous mapping functions for the value and action functions of the reinforcement learning architecture. Two questions arise here, however. One is what sort of function representation to use, and the other is how to determine the amount of noise for search in the action space. Neural networks are used here to learn the value and policy functions. A reinforcement predictor, intended to predict the next reinforcement, is then introduced; it also determines the amount of noise to add to the controller output. The proposed reinforcement learning architecture is found to have sound on-line learning control performance, especially in high-speed following of high-curvature roads. Both computer simulations and actual experiments on a test vehicle have been performed, and the efficiency and effectiveness of the architecture have been verified.
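
The abstract gives no equations, so the following Python code is only a minimal sketch of how such an architecture might be wired together, not the authors' implementation. Several things in it are assumptions made for illustration: linear function approximators stand in for the neural networks, the toy kinematics in `step`, the reinforcement signal in `reinforcement`, the noise mapping in `noise_scale`, and all names and parameter values are hypothetical. The sketch shows a continuous-output controller, a value function trained by a temporal-difference error, and a reinforcement predictor whose output sets the standard deviation of the search noise added to the controller output.

```python
import math
import random


def clamp(value, low, high):
    return max(low, min(high, value))


def features(state):
    """Bounded features of (lateral offset, heading error), plus a bias term."""
    offset, heading = state
    return [clamp(offset, -1.0, 1.0), clamp(heading, -1.0, 1.0), 1.0]


def dot(weights, x):
    return sum(w * xi for w, xi in zip(weights, x))


def noise_scale(predicted_r, sigma_max=0.5):
    """Exploration std from the predicted reinforcement (assumed in [-1, 0]).

    Assumption: the worse the predicted reinforcement, the larger the noise.
    """
    return sigma_max * abs(clamp(predicted_r, -1.0, 0.0))


def step(state, steer, dt=0.05, speed=10.0):
    """Hypothetical toy kinematics, not the vehicle model of the paper."""
    offset, heading = state
    return (offset + speed * math.sin(heading) * dt, heading + steer * dt)


def reinforcement(state):
    """Assumed reinforcement in [-1, 0]: zero on the lane centre, -1 far away."""
    return -min(1.0, abs(state[0]))


def train(episodes=20, steps=50, seed=0, gamma=0.9, lr=0.02):
    """Run on-line learning and return (actor, critic, predictor) weights."""
    rng = random.Random(seed)
    w_actor, w_critic, w_pred = [0.0] * 3, [0.0] * 3, [0.0] * 3
    for _ in range(episodes):
        state = (rng.uniform(-0.5, 0.5), 0.0)
        for _ in range(steps):
            x = features(state)
            r_hat = dot(w_pred, x)  # predicted next reinforcement
            noise = rng.gauss(0.0, noise_scale(r_hat))  # search noise
            steer = clamp(dot(w_actor, x) + noise, -0.5, 0.5)
            next_state = step(state, steer)
            r = reinforcement(next_state)
            td = r + gamma * dot(w_critic, features(next_state)) - dot(w_critic, x)
            for i, xi in enumerate(x):
                w_critic[i] += lr * td * xi         # TD(0) value update
                w_pred[i] += lr * (r - r_hat) * xi  # LMS predictor update
                w_actor[i] += lr * td * noise * xi  # reinforce noise that helped
            state = next_state
    return w_actor, w_critic, w_pred
```

Calling `w_actor, w_critic, w_pred = train()` runs the loop with a fixed seed. Tying the noise to the predicted reinforcement means that, in this sketch, search shrinks toward zero as the predictor expects the vehicle to stay near the lane centre.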
