CUDA

CUDA 5.5.20

A C language development environment

The CUDA Toolkit is a C language development environment for CUDA-enabled GPUs.

In just a few years, the programmable graphics processing unit (GPU) has developed into a general-purpose computing workhorse.

With multiple cores driven by very high memory bandwidth, today's GPUs offer incredible resources for both graphics and non-graphics processing.

CUDA 5.5.20 details

License: Freeware
Price: FREE
File size: 1.00 MB
Downloads: 350
Keywords: development environment, CUDA GPU, graphic processor, graphic, development, GPU
Author URL: http://www.nvidia.com

Windows 10 x32, Windows 10 x64

CUDA for Windows 10 - Full description

The CUDA™ architecture enables developers to leverage the massively parallel processing power of NVIDIA GPUs, bringing the performance of NVIDIA’s graphics processor technology to general-purpose GPU computing.

With the CUDA architecture and tools, developers are achieving dramatic speedups in fields such as medical imaging and natural resource exploration, and creating breakthrough applications in areas such as image recognition and real-time HD video playback and encoding.

CUDA enables this performance via standard APIs such as OpenCL and DirectCompute, and high-level programming languages such as C/C++, Fortran, Java, Python, and the Microsoft .NET Framework.
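
To give a sense of what CUDA C looks like in practice, here is a minimal sketch of a SAXPY routine (y = a*x + y) in which each GPU thread updates one array element. The names saxpy_kernel, check, and saxpy are illustrative and not part of the toolkit; only the cuda* calls and the <<<...>>> launch syntax come from the CUDA runtime. The block size of 256 is an arbitrary assumption, not a tuned value.

```cuda
#include <cuda_runtime.h>
#include <stdexcept>
#include <vector>

// Each GPU thread computes one element of y = a * x + y.
__global__ void saxpy_kernel(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // the last block may contain threads past the end of the array
        y[i] = a * x[i] + y[i];
    }
}

// Illustrative helper: turn a CUDA error code into a C++ exception.
// Error paths in this sketch skip device-memory cleanup for brevity.
static void check(cudaError_t err) {
    if (err != cudaSuccess) {
        throw std::runtime_error(cudaGetErrorString(err));
    }
}

// Hypothetical host wrapper: copies the inputs to the GPU, launches the
// kernel, and returns the updated y vector.
std::vector<float> saxpy(float a, const std::vector<float>& x, std::vector<float> y) {
    if (x.size() != y.size()) {
        throw std::invalid_argument("x and y must have the same length");
    }
    const int n = static_cast<int>(x.size());
    if (n == 0) {
        return y;
    }
    const size_t bytes = static_cast<size_t>(n) * sizeof(float);

    float* d_x = 0;
    float* d_y = 0;
    check(cudaMalloc(&d_x, bytes));
    check(cudaMalloc(&d_y, bytes));
    check(cudaMemcpy(d_x, &x[0], bytes, cudaMemcpyHostToDevice));
    check(cudaMemcpy(d_y, &y[0], bytes, cudaMemcpyHostToDevice));

    const int threads = 256;                         // assumed block size
    const int blocks = (n + threads - 1) / threads;  // round up to cover all n elements
    saxpy_kernel<<<blocks, threads>>>(n, a, d_x, d_y);
    check(cudaGetLastError());

    // This blocking copy also waits for the kernel to finish.
    check(cudaMemcpy(&y[0], d_y, bytes, cudaMemcpyDeviceToHost));
    check(cudaFree(d_x));
    check(cudaFree(d_y));
    return y;
}
```

Compiled with nvcc and run on a machine with a CUDA-enabled GPU, calling saxpy(2.0f, x, y) with every element of x equal to 1 and every element of y equal to 2 should return a vector in which every element is 4.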

Release Highlights

* Support for the new Fermi architecture, with:
    o Native 64-bit GPU support
    o Multiple Copy Engine support
    o ECC reporting
    o Concurrent Kernel Execution
    o Fermi HW debugging support in cuda-gdb
    o Fermi HW profiling support for CUDA C and OpenCL in Visual Profiler
* C++ Class Inheritance and Template Inheritance support for increased programmer productivity
* A new unified interoperability API for Direct3D and OpenGL, with support for:
    o OpenGL texture interop
    o Direct3D 11 interop
* CUDA Driver / Runtime Buffer Interoperability, which allows applications using the CUDA Driver API to also use libraries implemented using the CUDA C Runtime, such as CUFFT and CUBLAS
* CUBLAS now supports all BLAS Level 1, 2, and 3 routines, including those for single- and double-precision complex numbers
* Up to 100x performance improvement while debugging applications with cuda-gdb
* cuda-gdb hardware debugging support for applications that use the CUDA Driver API
* cuda-gdb support for JIT-compiled kernels
* New CUDA Memory Checker reports misalignment and out-of-bounds errors, available as a stand-alone utility and as a debugging mode within cuda-gdb
* CUDA Toolkit libraries are now versioned, enabling applications to require a specific version or to support multiple versions explicitly
* CUDA C/C++ kernels are now compiled to standard ELF format
* Support for device emulation mode has been packaged in a separate version of the CUDA C Runtime (CUDART), and is deprecated in this release. Now that more sophisticated hardware debugging tools are available and more are on the way, NVIDIA will focus on supporting these tools instead of the legacy device emulation functionality.
    o On Windows, use the new Parallel Nsight development environment for Visual Studio, with integrated GPU debugging and profiling tools (previously code-named "Nexus"). Please see www.nvidia.com/nsight for details.
    o On Linux, use cuda-gdb and cuda-memcheck, and check out the solutions from Allinea and TotalView that will be available soon.
* Support for all the OpenCL features in the latest R195 production driver package:
    o Double Precision
    o Graphics Interoperability with OpenGL, Direct3D 9, Direct3D 10, and Direct3D 11 for high-performance visualization
    o Query for Compute Capability, so you can target optimizations for GPU architectures (cl_nv_device_attribute_query)
    o Ability to control compiler optimization settings via support for pragma unroll in OpenCL kernels and an extension that allows programmers to set compiler flags (cl_nv_compiler_options)
    o OpenCL Images support, for better/faster image filtering
    o 32-bit global and local atomics for fast, convenient data manipulation
    o Byte Addressable Stores, for faster video/image processing and compression algorithms
    o Support for the latest OpenCL spec revision 1.0.48 and latest official Khronos OpenCL headers as of 2010-02-17
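
Several of the highlights above (compute capability, ECC reporting, concurrent kernel execution, multiple copy engines, versioned libraries) can be checked at run time from CUDA C. The following is a minimal sketch, not NVIDIA's own sample code: DeviceFeatures, query_device_features, and print_device_features are hypothetical names introduced here, while cudaGetDeviceProperties, cudaRuntimeGetVersion, and the cudaDeviceProp fields are part of the CUDA runtime. It assumes a toolkit recent enough to expose the asyncEngineCount field (CUDA 4.0 or later).

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Hypothetical summary struct for this sketch; not a CUDA type.
struct DeviceFeatures {
    int  major;               // compute capability, major part (2 on Fermi)
    int  minor;               // compute capability, minor part
    bool concurrent_kernels;  // device can run several kernels at once
    bool ecc_enabled;         // ECC memory protection is currently on
    int  copy_engines;        // number of asynchronous copy engines
    int  runtime_version;     // e.g. 5050 for CUDA 5.5
};

// Fills *out for the given device ordinal. Returns false if no usable
// device or runtime is present, so callers can fall back to a CPU path.
bool query_device_features(int device, DeviceFeatures* out) {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) {
        return false;
    }
    int version = 0;
    if (cudaRuntimeGetVersion(&version) != cudaSuccess) {
        return false;
    }
    out->major = prop.major;
    out->minor = prop.minor;
    out->concurrent_kernels = (prop.concurrentKernels != 0);
    out->ecc_enabled = (prop.ECCEnabled != 0);
    out->copy_engines = prop.asyncEngineCount;
    out->runtime_version = version;
    return true;
}

// Prints a short human-readable report for one device.
void print_device_features(int device) {
    DeviceFeatures f;
    if (!query_device_features(device, &f)) {
        std::printf("device %d: not available\n", device);
        return;
    }
    std::printf("device %d: compute capability %d.%d\n", device, f.major, f.minor);
    std::printf("  concurrent kernels: %s\n", f.concurrent_kernels ? "yes" : "no");
    std::printf("  ECC enabled:        %s\n", f.ecc_enabled ? "yes" : "no");
    std::printf("  copy engines:       %d\n", f.copy_engines);
    std::printf("  runtime version:    %d\n", f.runtime_version);
}
```

Calling print_device_features(0) from main reports on the first GPU. An application could use the same query to enable a multi-stream code path only when concurrent_kernels is true, or to refuse to start when the runtime version is older than the one it was built against.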
