bsuir

Доклады БГУИР

Doklady BGUIR

1729-7648, 2708-0382

БГУИР

10.35596/1729-7648-2019-126-8-125-132

bsuir-2481

Research Article

ELECTRONICS, RADIOPHYSICS, RADIOENGINEERING, INFORMATICS

METHOD OF CONSTRUCTION OF A NEUROREGULATOR MODEL WHEN OPTIMIZING THE CONTROL STRUCTURE OF A TECHNOLOGICAL CYCLE

Smorodin V. S.

Smorodin Victor Sergeevich, D.Sci., Professor, Head of the Department of Mathematical Problems of Control and Informatics

246019, Gomel, Sovetskaya st., 104

smorodin@gsu.by

Prokhorenko V. A.

Prokhorenko V.A., M.Sci., Assistant of the Department of Mathematical Problems of Control and Informatics

246019, Gomel, Sovetskaya st., 104

Gomel State University named after Francisk Skorina

2019, no. 7-8, pp. 125-132

29.12.2019

© 2019 Smorodin V.S., Prokhorenko V.A.

This work is licensed under a Creative Commons Attribution 4.0 License.

https://doklady.bsuir.by/jour/article/view/2481

This paper presents a method for constructing a neuroregulator model for optimizing the control structure of a technological cycle. The cycle is implemented with production-process automation tools and includes a physical controller that operates the technological process according to a given program. To this end, neural network techniques were applied to build a mathematical model of the neuroregulator. The model is based on a physical prototype, and real-time control synthesis (adaptive control) is based on training a recurrent neural network built from LSTM blocks, which can store information over long periods of time. The method is proposed for controlling a production cycle when solving the problem of finding an optimal trajectory on the phase plane of the technological cycle states. In this problem, at each moment of time the neuroregulator model receives information about the current system state, the adjacent states of the controlled object, and the direction of movement on the phase plane of states, which is determined by the control optimization criteria in effect. The results show that recurrent networks with LSTM modules can be used successfully as an approximator of the agent's Q-function for this problem when the partially observable region of system states has a complex structure.

The proposed method of adaptation to control actions and external environmental disturbances satisfies the requirements for the speed of the adaptation process, as well as the requirements for the quality of the control processes, in cases where current information about the nature of random control disturbances is unavailable. The experimental environment and the neural network models were implemented in Python using the TensorFlow library.
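The idea of using an LSTM-based recurrent network as an approximator of the agent's Q-function can be illustrated with a minimal sketch. The code below is not the authors' model: it is a dependency-free illustration in plain Python (the paper's models were built with TensorFlow), the weights are random and untrained, biases are omitted for brevity, and the class name `LSTMQSketch`, the observation layout (current cell, four adjacent cells, and a two-component direction), and the number of actions are all assumptions made for the example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_matrix(rows, cols, rng):
    """Small random weights; a trained network would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


class LSTMQSketch:
    """One LSTM cell followed by a linear layer with one Q-value per action.

    Hypothetical illustration only: biases are omitted and nothing is trained.
    """

    def __init__(self, obs_size, hidden_size, n_actions, seed=0):
        rng = random.Random(seed)
        in_size = obs_size + hidden_size
        # One weight matrix per gate: input (i), forget (f), output (o), candidate (g).
        self.w = {gate: make_matrix(hidden_size, in_size, rng) for gate in "ifog"}
        self.w_q = make_matrix(n_actions, hidden_size, rng)
        self.hidden_size = hidden_size

    def initial_state(self):
        zeros = [0.0] * self.hidden_size
        return zeros, list(zeros)

    def step(self, obs, state):
        """Consume one observation; return Q-values and the new (h, c) state."""
        h, c = state
        x = list(obs) + h  # concatenate observation with previous hidden state
        i = [sigmoid(a) for a in matvec(self.w["i"], x)]
        f = [sigmoid(a) for a in matvec(self.w["f"], x)]
        o = [sigmoid(a) for a in matvec(self.w["o"], x)]
        g = [math.tanh(a) for a in matvec(self.w["g"], x)]
        # The cell state carries memory across time steps.
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        q_values = matvec(self.w_q, h_new)
        return q_values, (h_new, c_new)


def greedy_action(q_values):
    """Index of the action with the largest estimated Q-value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Assumed observation layout: [current cell, 4 adjacent cells, direction dx, dy].
observations = [
    [0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0],
]
net = LSTMQSketch(obs_size=7, hidden_size=8, n_actions=4)
state = net.initial_state()
for obs in observations:
    q_values, state = net.step(obs, state)
    action = greedy_action(q_values)
```

Because the hidden and cell states are passed from one step to the next, the Q-values at each step depend on the whole sequence of partial observations rather than on the latest one alone, which is the property that makes a recurrent approximator a natural fit for a partially observable state space.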

neuroregulator model; adaptive control; optimization of functioning parameters; phase plane of states; optimal trajectory

The authors declare that there are no conflicts of interest present.