Re: VGPR usage questions
"Any other clues as to what would explain the discrepancy between what the profiler and analyzer are reporting?" It is normal. The static analyzer performs evaluations in a completely insulated world...
Re: Can't Debug Kernel in Teapot Example
Thanks for offering to help. This is what I get by running C:\Windows\System32\clinfo.exe: C:\Windows\System32>clinfo Number of platforms: 2 Platform Profile:...
Re: Relational built-in function 'select' causes "Segmentation fault"
While trying to find out the set-up details, I've updated the drivers of my AMD GPUs and my CodeXL installations to the latest version and now it compiles fine on both of my AMD GPUs. Thank you for your...
Re: How to use OpenCL via SSH for normal users?
Fantastic! I'll try this out. I'm not sure if this solution would be safe to deploy but will be fine for an internal proof of concept. I hope AMD have a solution for the driver to be shipped with the...
GL_AMD_bus_addressable_memory - W5100
Hello, I am trying to use DirectGMA but I cannot find the required extension, GL_AMD_bus_addressable_memory, either in glxinfo or when using CodeXL. I use the new APP-SDKv2.9.1 and fglrx-14.41 on the FirePro W5100....
How to download AMD-APP-SDK-linux-v2.9-1.599.381-GA-x64.tar.bz2
I've been trying to download the AMD APP SDK from http://developer.amd.com/tools-and-sdks/opencl-zone/amd-accelerated-parallel-processing-app-sdk/ but it doesn't start. Is there someone that knows the...
Re: VGPR usage questions
Well, I am uncertain whether fewer VGPRs would yield better performance or not. In order for me to check this, it would seem that I would need to remove code to reduce VGPRs and then check the performance....
Re: GL_AMD_bus_addressable_memory - W5100
Thank you very much. That worked. Do you know whether there is any documentation on DirectGMA? I found only a tiny bit of information in the OpenCL Programming Guide v2.7.
Re: segfault at clBuildProgram
In case anyone from the compiler team is interested, I think I have tracked down the main cause of my segfaults: implicit type casting from single to double or double to single. When I remove all of...
Cannot run CodeXL
Hello! I have a problem when running CodeXL on Windows 7 x64. Nothing happens when I try to open CodeXL by double-clicking in . What could be the problem? Thanks in advance.
Re: Re: Global to private memory copy
cjb80 wrote: ... If I were to change to using vloadn then do I need to worry about the alignment? (assuming that I am on a float2 boundary then I should be at a 64-bit boundary). Thanks! I recently...
Re: Re: VGPR usage questions
"Currently I am addicted to reducing VGPR usage so that I can get to the next threshold..." It is indeed something I can understand! Never forget to look for VALUBusy% and the memory stall metrics. With...
Troubleshooting a driver hang
Hello! I am having some serious trouble with a kernel I'm testing. As I'm testing this, I have a small framework to help me check validity and performance. The work dispatch is as follows: (I dream of a...
Re: Cannot run CodeXL
evo wrote: Hello! I have a problem when running CodeXL on Windows 7 x64. Nothing happens when I try to open CodeXL by double-clicking in . What could be the problem? Thanks in advance. I'd try launching the...
Re: VGPR usage questions
OK, it looks like my VALUBusy% is around 37 to 40% and SALUBusy% is around 6.5%. VALUUtilization is nearly 100%. MemUnitStalled% is ~46-55% and WriteUnitStalled% is ~30%. Does this tell you anything...
clBuildProgram() Dereference Bug -- Best place to submit a Bug report?
Hello, I've been trying to build an OpenCL application with pyopencl, but lately I've been hitting a segfault when I call program.build(), which is a wrapper for clBuildProgram(). Oddly, the problem...
Re: clBuildProgram() Dereference Bug -- Best place to submit a Bug report?
P.S., data_t is a typedef for float. Additionally, for those who find this via Google, this, oddly, does compile: //Expected Grid Size: num_inputs x num_outputs x 1 __kernel void...
Re: segfault at clBuildProgram
It's hard to say what the actual reason behind this segfault may be. However, I agree with you that implicit type casting should not cause a segfault. If that is the case, it would be...
Re: Re: VGPR usage questions
I speculate you're thinking "1 work item ~ 1 thread" and have the output from each WI being written as: output[0] = finalValue[0]; output[1] = finalValue[1]; ... output[N-1] = finalValue[N-1]; This will...
Hardware accelerated decoder
Hi to all, we have been developing an H.264 decoder that uses the graphics card as a hardware accelerator. The main goal is having multiple instances of such a decoder and, at the same...