Fig. 2. Block diagram of the hardware/software architecture for embedded
DC-ELM. The data cache and the external memory controller (EMC) are included
only if external memory is used. The diagram shows the main components, such
as GPIOs, buses, and memories.


On the other hand, the hardware is a high-performance co-processor designed
specifically for SLFN computation. It evaluates an SLFN faster than an
ordinary processor can. The sum of products computed by the neurons
corresponds directly to the output. Both ROM and RAM are implemented as part
of the neurons. Therefore, the path between the memory outputs and the inputs
of the neurons' DSP blocks is as short as possible, which decreases the
propagation delay of the signals. In summary, the FPGA resources are used to
build a dedicated neural network co-processor. The implementation of the
processor and its on-chip subsystem requires about 2000 flip-flops, 2500
LUTs, 6 DSP cores, and a number of RAM blocks [3]. In addition to internal
memory, the RAM blocks are used for the data cache and for the external
memory controller that interfaces with SDRAM.
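
As a rough illustration of the sum-of-products computation that the
co-processor accelerates, the following Python sketch evaluates an SLFN in
software. The function name `slfn_forward`, the sigmoid activation, and the
argument layout are assumptions made for illustration; they are not taken
from the hardware design described here.

```python
import math


def slfn_forward(x, hidden_w, hidden_b, out_w):
    """Forward pass of a single-hidden-layer feedforward network (SLFN).

    x        : list of input values
    hidden_w : one weight row per hidden neuron
    hidden_b : one bias per hidden neuron
    out_w    : one weight row per output neuron
    All names and the sigmoid activation are illustrative assumptions.
    """
    hidden = []
    for w_row, bias in zip(hidden_w, hidden_b):
        # Multiply-accumulate loop: the kind of operation a DSP block
        # would carry out for each neuron.
        acc = bias
        for xi, wi in zip(x, w_row):
            acc += xi * wi
        hidden.append(1.0 / (1.0 + math.exp(-acc)))  # sigmoid (assumed)
    # Output layer: a plain sum of products over the hidden activations.
    return [sum(h * w for h, w in zip(hidden, w_row)) for w_row in out_w]
```

With zero inputs and zero biases, every hidden neuron outputs 0.5, so two
hidden neurons with unit output weights yield an output of 1.0. In hardware,
the inner loops would run in parallel across neurons rather than
sequentially as they do here.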



The work reviewed above presents an intelligent embedded system for the Deep
Convolution Extreme Learning Machine (DC-ELM). It is a scalable HW/SW
architecture based on a reconfigurable device (FPGA) that performs fast
neural network computations, allowing rapid machine learning. The hardware
consists of an SLFN co-processor core designed for high-speed parallel
computation of the neural network, while the software implements the DC-ELM
learning algorithm on a MicroBlaze soft microprocessor. DC-ELM employs
multiple alternating convolution and pooling layers that can extract
sophisticated and robust features from the raw input images. It is free of
local minima, and its performance depends on a single design parameter, the
size of the hidden layer. Therefore, it requires less human intervention and
offers more flexibility for real-time adaptation than other well-known
machine learning techniques. Experiments conducted on handwritten digit
recognition tasks show that the proposed DC-ELM achieves better test accuracy
in several cases than ELM, LRF-ELM, and state-of-the-art deep learning
methods [6]. Moreover, the HW/SW architecture described allows it to work
well under constraints on memory size, autonomy, power consumption, and
real-time adaptation. Hence, I argue that the suggested DC-ELM is more
efficient for classification and regression applications.
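
To make the "single design parameter" point concrete, here is a minimal
sketch of basic ELM training in plain Python. It is not the DC-ELM algorithm
itself: the convolution and pooling layers are omitted, and the function
names, the sigmoid activation, the weight range, and the small ridge term are
all assumptions made for illustration. The hidden weights are drawn at random
and never trained; only the output weights are obtained, by solving one
linear least-squares problem, which is why no iterative search and no local
minima are involved.

```python
import math
import random


def hidden_layer(X, W, b):
    """Sigmoid activations of the fixed, randomly weighted hidden layer."""
    return [[1.0 / (1.0 + math.exp(-(sum(x * w for x, w in zip(row, wj)) + bj)))
             for wj, bj in zip(W, b)] for row in X]


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(a) + list(rhs) for a, rhs in zip(A, B)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]


def elm_train(X, T, n_hidden, ridge=1e-6, seed=0):
    """Train a basic ELM; n_hidden is the only structural parameter."""
    rng = random.Random(seed)
    n_in, n_out = len(X[0]), len(T[0])
    # Random hidden weights and biases (range is an assumption).
    W = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    H = hidden_layer(X, W, b)
    # Regularized normal equations: (H^T H + ridge * I) beta = H^T T
    HtH = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    HtT = [[sum(h[i] * t[k] for h, t in zip(H, T)) for k in range(n_out)]
           for i in range(n_hidden)]
    beta = solve(HtH, HtT)
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Apply the trained output weights to the hidden activations."""
    H = hidden_layer(X, W, b)
    return [[sum(h[i] * beta[i][k] for i in range(len(beta)))
             for k in range(len(beta[0]))] for h in H]
```

A call such as `W, b, beta = elm_train(X, T, n_hidden=10)` followed by
`elm_predict(X, W, b, beta)` covers both training and inference. On an
embedded target the linear solve would typically use a more numerically
robust method than the normal equations shown here.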