Re: Re: Cuckoo hashing in OpenCL

himanshu.gautam wrote:

Looks like your sample needs boost to compile. Any more surprises, in case i try to compile it?
The code anyways looks complex, as i have no idea what cuckoo is. Maybe you can create a smaller testcase, which is easy to compile and run by other developers.

I decided to use boost (1.46.1) because it's cross platform and makes my life easy with strings and random numbers. It would be difficult to replace it with something simpler and still cross platform.

I don't expect any more surprises.

The code is already pretty much the small test case, so I'll try to explain it a bit.

What we want

We have an array of (key, value) pairs and we want to store them in a way that makes retrieving a specific pair fast.

In this piece of code we focus on building the hash table, not on retrieval.

What we do

We use a two-level hashing scheme, with the first level shuffling the input pairs and the second implementing the cuckoo hashing. The first 3 kernels are the first part; the last kernel is the second part.

In the first part, all threads work together, so if one fails, all of them restart. In the second part, each workgroup takes one part of the shuffled data and works independently, so if one thread fails, all threads in this workgroup restart (this will make sense later).

Right now, we will ignore the first part and focus on the cuckoo hashing.

Theory

Cuckoo hashing is a dynamic hashing procedure, which means that the position of each key in the hash table is not based on a deterministic function, but on a probabilistic one.

The whole hash table is broken into a number of subtables, 3 in our case (SUBTABLES in my code).

In the serial version of the algorithm, each key draws a random number and, based on that, tries to enter its value in the first subtable. If another key has already entered its value at this position, it draws another random number and tries again in the next subtable.

The procedure continues until all pairs have been written to an empty location.

If a pair hasn't managed to get into any subtable, the hash table is destroyed and the procedure restarts with different seeds for the random numbers. Hopefully, after a number of attempts, the table will be built...
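
To make the procedure above concrete, here is a minimal serial sketch in plain C. It follows the description literally (try subtable 0, then 1, then 2, restart with new seeds on failure) and stores only keys for brevity. The names seed_t, slot_of, build_serial and next_random, the value of PRIME, the restart limit and the xorshift generator are my assumptions for illustration, not taken from the actual code. Note that this literal version never evicts a resident key, so it will likely need many restarts when the table is nearly full.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SUBTABLES    3            /* as in the post */
#define EMPTY        0xffffffffu  /* MAX_UINT marks a free slot */
#define PRIME        4294967291u  /* assumed value: 2^32 - 5 */
#define MAX_ATTEMPTS 100          /* assumed restart limit */

typedef struct { uint32_t a, b; } seed_t;  /* hypothetical seed pair */

/* Small xorshift generator so the sketch does not depend on rand(). */
static uint32_t next_random(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13; x ^= x >> 17; x ^= x << 5;
    return *state = x;
}

/* Slot of a key inside one subtable; 64-bit math avoids overflow. */
static size_t slot_of(seed_t s, uint32_t key, size_t subtable_size)
{
    return (size_t)((s.a + (uint64_t)s.b * key) % PRIME % subtable_size);
}

/* Returns the number of attempts used, or 0 if every attempt failed.
   table must hold SUBTABLES * subtable_size entries. */
static int build_serial(const uint32_t *keys, size_t n, uint32_t *table,
                        size_t subtable_size, seed_t seeds[SUBTABLES],
                        uint32_t rng_state)
{
    assert(subtable_size > 0 && rng_state != 0);
    for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
        for (int t = 0; t < SUBTABLES; t++) {      /* fresh seeds */
            seeds[t].a = next_random(&rng_state);
            seeds[t].b = next_random(&rng_state);
        }
        for (size_t i = 0; i < SUBTABLES * subtable_size; i++)
            table[i] = EMPTY;                      /* destroy the old table */

        size_t placed = 0;
        for (size_t k = 0; k < n; k++) {
            for (int t = 0; t < SUBTABLES; t++) {
                size_t pos = t * subtable_size
                           + slot_of(seeds[t], keys[k], subtable_size);
                if (table[pos] == EMPTY) {         /* free slot: take it */
                    table[pos] = keys[k];
                    placed++;
                    break;
                }
            }
        }
        if (placed == n) return attempt;           /* every key found a slot */
    }
    return 0;
}
```

The seeds left in seeds[] after a successful call are the ones a later lookup would need to recompute the three candidate slots of a key.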

Note that the hash table is bigger than the input, in order to reduce the number of conflicts. For example, if we have 100 pairs, the hash table will have room for 100(1+gamma) pairs, where 0<gamma<1. That means that in the end the table will have some empty slots.
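
As a small worked example of the sizing rule, the helper below computes how many slots each subtable gets. The function name subtable_slots and the choice to round up so that all subtables have the same size are assumptions of mine for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SUBTABLES 3  /* as in the post */

/* Slots per subtable so that SUBTABLES * result >= n * (1 + gamma). */
static size_t subtable_slots(size_t n, double gamma)
{
    assert(gamma > 0.0 && gamma < 1.0);
    double wanted = (double)n * (1.0 + gamma);
    size_t total = (size_t)wanted;
    if ((double)total < wanted)
        total++;                                 /* round up to whole slots */
    return (total + SUBTABLES - 1) / SUBTABLES;  /* round up per subtable */
}
```

For 100 pairs and gamma = 0.25 this gives 42 slots per subtable, so 126 slots in total, 26 of which stay empty.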

In order to retrieve the pair, we also need to store the random numbers used when building the table.

GPU Cuckoo

In the parallel GPU version, the cuckoo hashing is performed inside a workgroup. Each workgroup initializes a hash table (hltables) in local memory with (key, value) = (MAX_UINT, MAX_UINT), where MAX_UINT = 0xffffffff.

Instead of having 3 separate subtables, I keep one array (hltables) and move to the right index by calculating the offset of each subtable:

cur_ofst = tries * subtable_size; // Start offset of the current subtable

Then I calculate the index within the subtable using the function given in the algorithm and add it to cur_ofst:

bucket_id = ((newRandoms[tries].x + newRandoms[tries].y * value_in) % PRIME) % subtable_size;

cur_ofst += bucket_id; // Update the offset with the index in the subtable
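
For checking the kernel's results on the host, the same index computation can be written as a plain C function. This is only a sketch: the struct uint2_t stands in for OpenCL's uint2, the function name slot_index is hypothetical, and the value of PRIME is an assumption. One thing it makes explicit is the 64-bit intermediate: if newRandoms holds 32-bit uint values, the product y * value_in wraps modulo 2^32 before the % PRIME is applied unless it is widened first.

```c
#include <assert.h>
#include <stdint.h>

#define PRIME 4294967291u  /* assumed value: 2^32 - 5 */

typedef struct { uint32_t x, y; } uint2_t;  /* stand-in for OpenCL uint2 */

/* Offset into the single hltables array for the given subtable (tries). */
static uint32_t slot_index(uint2_t seed, uint32_t value_in,
                           uint32_t tries, uint32_t subtable_size)
{
    assert(subtable_size > 0);
    /* Widen to 64 bits so x + y * value_in cannot wrap. */
    uint64_t h = (seed.x + (uint64_t)seed.y * value_in) % PRIME;
    uint32_t bucket_id = (uint32_t)(h % subtable_size);
    return tries * subtable_size + bucket_id;  /* subtable start + bucket */
}
```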

After that, each thread writes its value into the subtable, waits at the local barrier and then checks whether its value managed to remain in the table. If not, it moves on to the next subtable:

hltables[cur_ofst].x = value_in; // Insert your value and hope no one else overwrites it!

barrier(CLK_LOCAL_MEM_FENCE); // Synchronize all the workgroup threads so that the following read makes sense

value_out = hltables[cur_ofst].x; // Now check if your value was actually inserted
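
The write, barrier, read-back pattern can be simulated serially to see which threads are supposed to move on. In this sketch (the function write_then_verify and the done[] flags are hypothetical names), all writes happen first and all reads second, which is what the barrier guarantees in the kernel. Here the thread with the highest index wins a conflict; on the GPU the surviving writer is not defined. The check also assumes that no two threads carry the same value, otherwise both would believe they own the slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One round on one subtable. Returns how many pending threads kept a slot. */
static size_t write_then_verify(uint32_t *hltable_x, const size_t *cur_ofst,
                                const uint32_t *value_in, int *done,
                                size_t n_threads)
{
    assert(hltable_x != NULL && cur_ofst != NULL);
    size_t winners = 0;

    /* Phase 1: every pending thread writes; later writers overwrite earlier ones. */
    for (size_t t = 0; t < n_threads; t++)
        if (!done[t])
            hltable_x[cur_ofst[t]] = value_in[t];

    /* barrier(CLK_LOCAL_MEM_FENCE) sits here in the kernel. */

    /* Phase 2: every pending thread reads back and compares. */
    for (size_t t = 0; t < n_threads; t++) {
        if (!done[t] && hltable_x[cur_ofst[t]] == value_in[t]) {
            done[t] = 1;   /* value survived: this thread is finished */
            winners++;
        }
    }
    return winners;
}
```

Threads whose done flag is still 0 after the call are the ones that would recompute cur_ofst for the next subtable.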

If all goes well, every thread has entered its pair into one of the subtables. If not, the threads that failed signal the rest through the variable alert and the cuckoo hashing restarts:

for(attempts=0; attempts<MAX_ATTEMPTS; attempts++) {

...

barrier(CLK_LOCAL_MEM_FENCE);

if(!alert) break; // if nobody has alerted failure, break and save the hashtable

}

In the end, each workgroup copies the hash table it built from local memory to global and saves the seeds for the random numbers that built the table.

Problem

I write the cuckoo hash table to a file for checking.

This table should contain unique values (except for those that are still 0xffffffff) in the range [0, num_uniqs-1], but it doesn't. It looks like some of them are written in contiguous positions and some are overwritten.
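
The property described above can be checked mechanically once the dumped values are loaded into an array. The sketch below is one way to do it; the names table_report and check_table are hypothetical and it only looks at the value field of each slot.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define EMPTY 0xffffffffu  /* MAX_UINT, a slot that was never written */

typedef struct {
    size_t out_of_range;  /* values >= num_uniqs */
    size_t duplicates;    /* repeated occurrences of a value */
    size_t stored;        /* distinct valid values found */
} table_report;

static table_report check_table(const uint32_t *values, size_t table_len,
                                uint32_t num_uniqs)
{
    table_report r = {0, 0, 0};
    unsigned char *seen = calloc(num_uniqs ? num_uniqs : 1, 1);
    assert(seen != NULL);

    for (size_t i = 0; i < table_len; i++) {
        uint32_t v = values[i];
        if (v == EMPTY)
            continue;                 /* empty slots are allowed */
        if (v >= num_uniqs) {
            r.out_of_range++;         /* outside [0, num_uniqs-1] */
        } else if (seen[v]) {
            r.duplicates++;           /* value stored more than once */
        } else {
            seen[v] = 1;
            r.stored++;
        }
    }
    free(seen);
    return r;
}
```

A correct table would report no duplicates, nothing out of range, and stored equal to num_uniqs; num_uniqs minus stored is the number of values that were lost.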

References

On the page I link in my first post, there is a dissertation and a paper.

In the dissertation, there is an implementation for CUDA on pages 68-69.

In the paper, there is a description of the algorithm on page 4 under the paragraph Phase 2.

Thank you for your interest,

Andrew Paschos
