December 04, 2012 by Tony DeYoung
AMD has released V1.0 of CodeXL, a unified developer tool suite that enables developers to quickly and easily identify performance issues and programming errors in applications, without requiring source code modifications.
CodeXL includes comprehensive GPU debugging, GPU and CPU profiling, static OpenCL kernel analysis and a standalone user interface on Windows and Linux for enhanced accessibility and navigation.
Highlights of AMD CodeXL v1.0 include:
- GPU Debugger – provides comprehensive debugging on AMD APUs/GPUs for OpenCL and OpenGL API calls and OpenCL kernels. It allows you to step from API calls into OpenCL kernels in real time, set breakpoints and debug inside the kernel, view all variable values and track API call histories – all on a single computer with a single GPU.
- CPU Profiler – a profiling suite that helps you to identify, investigate and tune application performance on AMD CPUs. It pinpoints time-critical hotspots in your code with time-based, event-based and instruction-based sampling, and also allows you to narrow profiling to a single process and capture profiling data for OpenCL code running on the CPU. In addition, call graph profiling provides a butterfly view of your function calls with the trace history.
- GPU Profiler – a complete GPU profiler that you can use to discover bottlenecks in your OpenCL and DirectCompute applications, and find ways to improve performance on AMD APUs/GPUs. It collects and visualizes GPU counter data, application trace, kernel occupancy and hotspot analysis, with comprehensive timeline and summary views of host execution, kernel execution and the data transfers in between.
- Static Analyzer – a handy utility to analyze your OpenCL application statically, without having to run it on the actual hardware. It enables you to compile and analyze your OpenCL code, estimate the performance of kernels and view the disassembly of the generated hardware kernel.
For further information about CodeXL, visit the CodeXL homepage.
December 04, 2012 by Tony DeYoung
The new APP SDK 2.8 includes dozens of new and improved samples for OpenCL, Aparapi and C++ AMP that deliver significantly faster performance than APP SDK 2.7 – up to 2.3x faster on average in nine key benchmarks.
The APP SDK 2.8 also includes a preview version of AMD’s new open source C++ template library, codename “Bolt.”
Bolt is an STL-compatible template library of data-parallel primitives. It provides a standard way to develop an application that can execute on either a regular CPU or any available OpenCL™ capable accelerated compute unit, with a single code path.
APP SDK 2.8 also improves and extends OpenCL capabilities by including support for the Direct3D 11 sharing Khronos extension, in addition to 64-bit atomics.
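To illustrate what a data-parallel primitive behind an STL-style interface looks like, here is a minimal sketch written purely against the C++ standard library. It does not use Bolt itself: the preview's actual headers, namespaces and functor requirements are not shown in this post and are not assumed here. The names Saxpy and saxpy are hypothetical, chosen only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical element-wise functor: computes a * x + y for one pair of elements.
struct Saxpy {
    float a;
    float operator()(float x, float y) const { return a * x + y; }
};

// Applies the functor to every element pair via std::transform.
// The calling code only describes *what* happens per element, which is
// the property that lets an STL-compatible library pick the device.
std::vector<float> saxpy(float a,
                         const std::vector<float>& x,
                         const std::vector<float>& y) {
    assert(x.size() == y.size());  // both inputs must have the same length
    std::vector<float> out(x.size());
    std::transform(x.begin(), x.end(), y.begin(), out.begin(), Saxpy{a});
    return out;
}
```

Because the per-element work lives in a functor and the iteration is delegated to a library algorithm, the calling code stays the same whether that algorithm runs on the CPU or is dispatched to an accelerator, which is the single-code-path idea described above.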
December 02, 2012 by Tony DeYoung
The new OpenCL 1.2 extensions provide enhanced parallel programming flexibility, functionality and performance through updates and additions including:
- enabling an OpenCL image to be created from an OpenGL multi-sampled texture that is designed for multi-sampled anti-aliasing using color or depth, providing more flexibility in interoperating 3D graphics and compute
- creating 2D images from an OpenCL buffer to enable flexibility in which memory structures are processed using the advanced properties of OpenCL images
- providing security features for WebCL implementations layered over OpenCL, including the ability to initialize local and private memory before a kernel begins execution, and a new query and API to terminate an OpenCL context to ensure a long-running kernel does not affect system stability
- loading an OpenCL program object from a Standard Portable Intermediate Representation (SPIR) instance. SPIR is a vendor neutral non-source representation for OpenCL C programs that enables increased tool chain flexibility and avoids the need to ship kernel source in commercial applications
November 19, 2012 by Tony DeYoung
Altera has added its OpenCL SDK to the new Quartus II V12.1. The Quartus II design suite supports and simplifies the development of today’s advanced programmable systems, including CPU cores, digital signal processing (DSP) blocks and multiple IP sub-systems. The tools can dramatically increase designer productivity by offering higher levels of design abstraction such as OpenCL.
November 15, 2012 by Tony DeYoung
At SC12, AMD not only took the #1 spot among GPU-powered systems for its powerful and energy-efficient supercomputer (SANAM), which is powered primarily by GPUs, but also announced an expansion of its software ecosystem, launching a series of tools that enable HPC developers to take advantage of GPU compute with programming methodologies that integrate OpenCL. (See press release.)
Green500 ranking
Powered by 420 AMD FirePro S10000 dual-GPU server graphics cards, the SANAM supercomputer was ranked #2 on the Green500 list overall and #1 for GPU-powered systems, beating out the Tesla K20X-based systems. The FirePro-powered system can sustain 420 TFLOPS at a system energy efficiency of over 2.3 GFLOPS per watt (2,351 million calculations per second per watt).
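As a rough sanity check on how these figures relate, the sketch below divides sustained performance by energy efficiency to get the implied power draw. It assumes that "calculations per second per watt" means floating-point operations per second per watt, and it uses the rounded numbers quoted above, so the result is an estimate only. The function name implied_power_kw is hypothetical.

```cpp
#include <cassert>

// Implied power draw in kilowatts, from:
//   power [W] = performance [FLOPS] / efficiency [FLOPS per watt]
// Assumption: the efficiency figure is given in MFLOPS per watt.
double implied_power_kw(double sustained_tflops, double mflops_per_watt) {
    assert(mflops_per_watt > 0.0);  // guard against division by zero
    const double flops = sustained_tflops * 1e12;         // TFLOPS -> FLOPS
    const double flops_per_watt = mflops_per_watt * 1e6;  // MFLOPS/W -> FLOPS/W
    return flops / flops_per_watt / 1e3;                  // W -> kW
}
```

Calling implied_power_kw(420.0, 2351.0) gives a value a little under 179 kW, the order of magnitude implied by the two quoted figures taken together.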
Maturing OpenCL Tools for HPC Developers
Accelereyes ArrayFire: Accelereyes is dedicated to delivering fast, simple GPU software. The general availability release of ArrayFire, a GPU software acceleration library that provides hundreds of functions already optimized for speed by top GPU computing experts, allows for easy integration into C, C++, Fortran and Python applications;
Portland Group (PGI) Accelerator compilers: PGI Accelerator Fortran, C and C++ compilers target the AMD line of APUs as well as the AMD line of discrete GPU accelerators. PGI continues to work closely with AMD to extend its PGI Accelerator directive-based compilers. The goal is to generate code directly for AMD GPU accelerators, and to generate heterogeneous x64+GPU executable files that automatically use both the CPU and GPU compute capabilities of AMD APUs;
CAPS Entreprise HMPP compiler: CAPS Entreprise is a leading provider of solutions for deploying applications on “many-core” systems. The CAPS source-to-source HMPP compiler is based on C, C++, and Fortran directives and supports the OpenACC and OpenHMPP standards. With help from AMD, the compiler incorporates a powerful OpenCL parallel data generator;
AMD CodeXL: AMD CodeXL is a comprehensive tool suite that enables developers to harness the benefits of AMD CPUs, GPUs and APUs. It includes powerful GPU debugging, comprehensive GPU and CPU profiling, and static OpenCL kernel analysis capabilities, enhancing accessibility for software developers to enter the era of heterogeneous computing. AMD CodeXL is available as both a Visual Studio extension and as a standalone user interface application for Windows and Linux.