Name: | Description: | Size: | Format:
---|---|---|---
 | | 3.6 MB | Adobe PDF
Authors
Advisor(s)
Abstract(s)
About a decade ago, the landscape of computer systems architecture took an evolutionary leap with the emergence of heterogeneous systems. In these systems, the central processing unit (CPU), designed for general-purpose use, was joined by co-processor devices of different architectures, such as GPUs and FPGAs. Originally conceived for very specific purposes (such as graphics or signal processing), these co-processors came to be seen as auxiliary processing elements capable of accelerating the execution of computationally demanding applications.

To allow efficient exploitation of heterogeneous systems and to ensure code portability, open standards such as OpenCL were defined, supporting co-processors of virtually any type. In other cases, proprietary frameworks emerged, targeting devices from specific manufacturers, such as the CUDA framework for NVIDIA GPUs. Common to all these approaches is that they originally only provided for the use of local co-processors, attached to a single host system, and did not allow the exploitation of accelerators attached to other systems and accessible over the network, thereby limiting the acceleration potential of applications.

The work developed in this dissertation addresses this limitation. It consisted of the creation of remote OpenCL (rOpenCL), a middleware and a set of services that together allow an OpenCL application (even a pre-compiled one) to transparently and efficiently exploit the accelerators available in a distributed environment of Linux systems, using portable communication based on BSD sockets. The approach is validated with reference OpenCL benchmarks, which demonstrate the compliance of rOpenCL with the OpenCL 1.2 specification, as well as the robustness and scalability of the implementation.
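To make the transparency claim concrete, the sketch below shows an ordinary OpenCL 1.2 host program. It is not taken from the dissertation and only uses standard API calls; under rOpenCL, a program like this would run without modification or recompilation, enumerating devices hosted on remote Linux nodes as if they were local, since the network layer sits entirely inside the middleware.

```c
/* Hypothetical illustration: a plain OpenCL 1.2 host program with no
 * knowledge of the network. Under rOpenCL, the devices it lists could
 * reside on remote machines, yet the code stays unchanged. */
#include <stdio.h>
#include <CL/cl.h>

int main(void) {
    cl_platform_id platform;
    cl_uint num_platforms = 0;

    /* Standard platform discovery; rOpenCL would answer this query
     * on behalf of the remote accelerators it aggregates. */
    if (clGetPlatformIDs(1, &platform, &num_platforms) != CL_SUCCESS || num_platforms == 0) {
        fprintf(stderr, "no OpenCL platform found\n");
        return 1;
    }

    cl_device_id devices[8];
    cl_uint num_devices = 0;
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, 8, devices, &num_devices);

    /* Print the name of each visible device, local or (with rOpenCL) remote. */
    for (cl_uint i = 0; i < num_devices; i++) {
        char name[256];
        clGetDeviceInfo(devices[i], CL_DEVICE_NAME, sizeof(name), name, NULL);
        printf("device %u: %s\n", i, name);
    }
    return 0;
}
```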
Description
Keywords
OpenCL, Sockets, C, Heterogeneous systems, Distributed systems