Author Topic: $5000 Bounty - Momentum Proof-of-Work Hacking [PAID] (Read 107254 times)

digitalindustry

Seriously though I have no idea how to compile it anyhow, but if it works I guess its a share changer.

digitalindustry

Quote from: barwizi on November 10, 2013, 09:31:38 am

Quote from: gigawatt on November 10, 2013, 08:48:14 am
I'm happy to report that it works like a charm.
I've updated the source code.
Feel free to test it for yourself.

It correctly finds nonce collisions and follows all the predictions made in my earlier post.
I've conclusively demonstrated that memory requirements can be scaled arbitrarily while retaining an equivalent memory-to-computation ratio. Latency has been made a non-issue due to immediate return on collision detection and external thread termination.
Quote from: barwizi on November 10, 2013, 09:31:38 am
Quote from: gigawatt on November 10, 2013, 08:48:14 am
I'm happy to report that it works like a charm.
I've updated the source code.
Feel free to test it for yourself.

It correctly finds nonce collisions and follows all the predictions made in my earlier post.
I've conclusively demonstrated that memory requirements can be scaled arbitrarily while retaining an equivalent memory-to-computation ratio. Latency has been made a non-issue due to immediate return on collision detection and external thread termination.

Here's a snippet of what the output looks like using CHAIN_LENGTH = 4:
Step 31 of 128... HITS: 743672 MISS: 1353480 TOTAL: 1966081 PCT: 35 Step 32 of 128... HITS: 760208 MISS: 1336944 TOTAL: 2031617 PCT: 36 RAINBOW HIT SEEK HASH: 423473256423089 ORIG NONCE: 28524699 RAIN NONCE: 5439512 + 1 0: 203104387400228 1: 423473256423089 2: 1041792957055939 3: 780112116212973 4: 629783454762940 5: 675829228796259 6: 738539531672861 7: 577969205224283 NAILED IT! Nonce: 28524696 + 3 0: 959239184726270 1: 186912846539772 2: 633556876464486 3: 423473256423089 4: 553069074389047 5: 746169572180730 6: 539623141106693 7: 588368479534609 Checking Work: Nonce A: 5439513 Nonce B: 28524699 Hash A: 423473256423089 Hash B: 423473256423089

And the view from the outside: http://i.imgur.com/rMUKK5t.png

so this is for cpu? how to compile for windows?

Here's a snippet of what the output looks like using CHAIN_LENGTH = 4:
Step 31 of 128... HITS: 743672 MISS: 1353480 TOTAL: 1966081 PCT: 35 Step 32 of 128... HITS: 760208 MISS: 1336944 TOTAL: 2031617 PCT: 36 RAINBOW HIT SEEK HASH: 423473256423089 ORIG NONCE: 28524699 RAIN NONCE: 5439512 + 1 0: 203104387400228 1: 423473256423089 2: 1041792957055939 3: 780112116212973 4: 629783454762940 5: 675829228796259 6: 738539531672861 7: 577969205224283 NAILED IT! Nonce: 28524696 + 3 0: 959239184726270 1: 186912846539772 2: 633556876464486 3: 423473256423089 4: 553069074389047 5: 746169572180730 6: 539623141106693 7: 588368479534609 Checking Work: Nonce A: 5439513 Nonce B: 28524699 Hash A: 423473256423089 Hash B: 423473256423089

And the view from the outside: http://i.imgur.com/rMUKK5t.png

so this is for cpu? how to compile for windows?

^
This is barwizi hating life lol , I think ill ramp up the core 2 duo barwizi...

bytemaster

Quote from: gigawatt on November 10, 2013, 08:48:14 am

I'm happy to report that it works like a charm.
I've updated the source code.
Feel free to test it for yourself.

It correctly finds nonce collisions and follows all the predictions made in my earlier post.
I've conclusively demonstrated that memory requirements can be scaled arbitrarily while retaining an equivalent memory-to-computation ratio. Latency has been made a non-issue due to immediate return on collision detection and external thread termination.

Here's a snippet of what the output looks like using CHAIN_LENGTH = 4:
Step 31 of 128... HITS: 743672 MISS: 1353480 TOTAL: 1966081 PCT: 35 Step 32 of 128... HITS: 760208 MISS: 1336944 TOTAL: 2031617 PCT: 36 RAINBOW HIT SEEK HASH: 423473256423089 ORIG NONCE: 28524699 RAIN NONCE: 5439512 + 1 0: 203104387400228 1: 423473256423089 2: 1041792957055939 3: 780112116212973 4: 629783454762940 5: 675829228796259 6: 738539531672861 7: 577969205224283 NAILED IT! Nonce: 28524696 + 3 0: 959239184726270 1: 186912846539772 2: 633556876464486 3: 423473256423089 4: 553069074389047 5: 746169572180730 6: 539623141106693 7: 588368479534609 Checking Work: Nonce A: 5439513 Nonce B: 28524699 Hash A: 423473256423089 Hash B: 423473256423089

And the view from the outside: http://i.imgur.com/rMUKK5t.png

Wow, this looks good.... I need to understand it a bit better first. You have come up with a new algorithm, how many HPM do you get with it on a CPU and how much RAM are you using?

barwizi

Quote from: gigawatt on November 10, 2013, 08:48:14 am

I'm happy to report that it works like a charm.
I've updated the source code.
Feel free to test it for yourself.

It correctly finds nonce collisions and follows all the predictions made in my earlier post.
I've conclusively demonstrated that memory requirements can be scaled arbitrarily while retaining an equivalent memory-to-computation ratio. Latency has been made a non-issue due to immediate return on collision detection and external thread termination.

Here's a snippet of what the output looks like using CHAIN_LENGTH = 4:
Step 31 of 128... HITS: 743672 MISS: 1353480 TOTAL: 1966081 PCT: 35 Step 32 of 128... HITS: 760208 MISS: 1336944 TOTAL: 2031617 PCT: 36 RAINBOW HIT SEEK HASH: 423473256423089 ORIG NONCE: 28524699 RAIN NONCE: 5439512 + 1 0: 203104387400228 1: 423473256423089 2: 1041792957055939 3: 780112116212973 4: 629783454762940 5: 675829228796259 6: 738539531672861 7: 577969205224283 NAILED IT! Nonce: 28524696 + 3 0: 959239184726270 1: 186912846539772 2: 633556876464486 3: 423473256423089 4: 553069074389047 5: 746169572180730 6: 539623141106693 7: 588368479534609 Checking Work: Nonce A: 5439513 Nonce B: 28524699 Hash A: 423473256423089 Hash B: 423473256423089

And the view from the outside: http://i.imgur.com/rMUKK5t.png

so this is for cpu? how to compile for windows?

iruu · « *Last Edit: November 10, 2013, 09:32:22 am by iruu* »

So what are the precise terms of the $5000 bounty now? So that there're no doubts whether something meets it or not.

Let's say a gpu miner uses x amount of memory and searches the 2^23 sha space in y seconds, on a modern graphics card. What're the x and y for the bounty?
Or is memory unimportant, just that it can work effectively on a gpu? Effectively defined how?
Or is it the relation? When memory use is cut in half, speed drops less than a half (ie. relation is lower than linear).

Can a proof of concept use simpler hash functions than sha512, if the algorithm is hash algorithm-agnostic?

Or is it off?

gigawatt · « *Last Edit: November 10, 2013, 08:54:58 am by gigawatt* »

I'm happy to report that it works like a charm.
I've updated the source code.
Feel free to test it for yourself.

It correctly finds nonce collisions and follows all the predictions made in my earlier post.
I've conclusively demonstrated that memory requirements can be scaled arbitrarily while retaining an equivalent memory-to-computation ratio. Latency has been made a non-issue due to immediate return on collision detection and external thread termination.

Here's a snippet of what the output looks like using CHAIN_LENGTH = 4:
Step 31 of 128... HITS: 743672 MISS: 1353480 TOTAL: 1966081 PCT: 35 Step 32 of 128... HITS: 760208 MISS: 1336944 TOTAL: 2031617 PCT: 36 RAINBOW HIT SEEK HASH: 423473256423089 ORIG NONCE: 28524699 RAIN NONCE: 5439512 + 1 0: 203104387400228 1: 423473256423089 2: 1041792957055939 3: 780112116212973 4: 629783454762940 5: 675829228796259 6: 738539531672861 7: 577969205224283 NAILED IT! Nonce: 28524696 + 3 0: 959239184726270 1: 186912846539772 2: 633556876464486 3: 423473256423089 4: 553069074389047 5: 746169572180730 6: 539623141106693 7: 588368479534609 Checking Work: Nonce A: 5439513 Nonce B: 28524699 Hash A: 423473256423089 Hash B: 423473256423089

And the view from the outside: http://i.imgur.com/rMUKK5t.png

gigawatt

Quote from: bytemaster on November 10, 2013, 06:30:51 am

I don't have time at the moment to go through your method, but does momentum_verify(...) work on your results?

I'm working out a few kinks right now. It said it found a hit earlier, but I accidentally had it dump the wrong info, so I had no way to verify.

Quote from: bytemaster on November 10, 2013, 06:30:51 am

This may be true for bandwidth of calculations, but what of latency? Suppose you turn this from a 1 minute search into a 768 minute search... if you perform it a billion times in parallel you will have amazing bandwidth, but terrible latency.

Absolutely correct. I edited my main post to reflect handling that.
The simple solution: as soon as you find any birthday collisions, return immediately. If it found a hit 2 seconds in, spending the next ~750 minutes to finish off the run before returning makes the collisions stale, regardless how many were found.

The second is allowing the parent thread to terminate the children whenever a new block is found. This is already handled in the main ProtoShares branch since commit 3dfc3cc33.

Given both of these capabilities, latency is no longer an issue.

bytemaster

I don't have time at the moment to go through your method, but does momentum_verify(...) work on your results?

Quote

This sounds useless at first, but consider the following: given that we can arbitrarily scale the memory requirements, the only limiting factor how fast we can find collisions is our brute computing power. What's good at brute computing power? GPUs and ASICs.

This may be true for bandwidth of calculations, but what of latency? Suppose you turn this from a 1 minute search into a 768 minute search... if you perform it a billion times in parallel you will have amazing bandwidth, but terrible latency.

gigawatt · « *Last Edit: November 10, 2013, 06:10:36 am by gigawatt* »

Quote from: bytemaster on November 09, 2013, 11:57:36 pm

You are right... I owe you that tip... sorry I have been very busy can you send me a BTC address. Thanks

Ditto.

In other news, I think I've finally, finally, actually done it (again).
As we discussed previously, it looks like using a cycle-detection algorithm is no-go because the hash doesn't map to itself.

However, what does work is making a pseudo-rainbow table in memory, on the fly.
The result: the ratio of memory used vs search rate remains exactly the same, but we can scale the size arbitrarily. For example, instead of taking 768 MB memory and 1 minute per search, we can use (768 MB / X) memory and X minutes per search.

This sounds useless at first, but consider the following: given that we can arbitrarily scale the memory requirements, the only limiting factor how fast we can find collisions is our brute computing power. What's good at brute computing power? GPUs and ASICs.

Here's my proof:
Overview: https://en.wikipedia.org/wiki/Rainbow_table Make a "rainbow table" map to hold the equivalent hash space of SEARCH_SPACE For chains of length CHAIN_LENGTH, rainbow table needs (SEARCH_SPACE / CHAIN_LENGTH) elements [name: RAINBOW_ENTRIES] Rainbow table will require (8 bytes (hash) + 4 bytes (nonce)) * RAINBOW_ENTRIES bytes total To compress each SEARCH_SPACE_BITS hash to a new nonce in the range MAX_MOMENTUM_NONCE, simply use (hash % MAX_MOMENTUM_NONCE) Pseudocode: For the first RAINBOW_ENTRIES nonce = element number For i = 1 to CHAIN_LENGTH cur_hash = HASH(nonce) Check if any part of cur_hash is in rainbow table If yes and generated by different nonce: add to result list For each chunk in cur_hash rainbow_map[chunk] = element number nonce = compress(cur_hash) For the next (SEARCH_SPACE - RAINBOW_ENTRIES) elements: nonce = element number For i = 1 to CHAIN_LENGTH cur_hash = Hash(nonce) Check if any part of cur_hash is in rainbow table If yes and generated by different nonce: add to result list nonce = compress(cur_hash) return result list // MATH SECTION ///// Given: CALLS(CHAIN_LENGTH) = total calls to SHA512 using a rainbow table with chain length CHAIN_LENGTH CALLS(CHAIN_LENGTH) = (SEARCH_SPACE / CHAIN_LENGTH) + (SEARCH_SPACE * CHAIN_LENGTH) - (SEARCH_SPACE / CHAIN_LENGTH) + (SEARCH_SPACE / CHAIN_LENGTH) = Calls to generate rainbow table + (SEARCH_SPACE * CHAIN_LENGTH) = Calls to check SEARCH_SPACE nonce values - (SEARCH_SPACE * CHAIN_LENGTH) = Except we skip the nonces used to make the rainbow table CALLS(CHAIN_LENGTH) = (SEARCH_SPACE * CHAIN_LENGTH) Given: MEMORY(CHAIN_LENGTH) = number of bytes required to store a rainbow table representing SEARCH_SPACE elements with chain length CHAIN_LENGTH MEMORY(CHAIN_LENGTH) = rainbow table elements * (size of hash chunk + size of nonce) MEMORY(CHAIN_LENGTH) = (8 + 4) * SEARCH_SPACE / CHAIN_LENGTH Examples: Minimum number of SHA512 calls is CHAIN_LENGTH = 1 (equivalent to the "default" semiordered map setup) Requires (SEARCH_SPACE * 12) = 768 MB memory Requires (SEARCH_SPACE) calls to SHA512 For CHAIN_LENGTH = N Requires (SEARCH_SPACE * 12 / N) bytes of memory Requires (SEARCH_SPACE * N) calls to SHA512 For CHAIN_LENGTH = 1024 Requires (2^26 * 12 / 1024) = 768 KB of memory Requires (2*26 * 1024) calls to SHA512 Bottom line: Using a rainbow table map of chain length N, memory usage is multiplied by a factor of 1/N and SHA512 calls are multiplied by a factor of N This means that regardless the CHAIN_LENGTH selected, the ratio of CALLS(CHAIN_LENGTH) to MEMORY(CHAIN_LENGTH) is inversely proportional. EFFICIENCY IS CONSTANT AT ANY MEMORY TARGET. All that matters is brute computational strength. Example: If a GPU/ASIC/FPGA has PROCESSOR number of parallel computing units, and MEM_MAX bytes of available memory, then each Momentum hash instance can use up to (MEM_MAX / PROCESSOR) bytes of memory. To calculate the CHAIN_LEN to acheive this: CHAIN_LEN(MEM_MAX, PROCESSOR) = (8 + 4) * SEARCH_SPACE / (MEM_MAX / PROCESSOR) CHAIN_LEN(MEM_MAX, PROCESSOR) = (8 + 4) * SEARCH_SPACE * PROCESSOR / MEM_MAX Note: This calculation is just the inverse of MEMORY(CHAIN_LEN) Assuming: GPU with 128 compute units, 1 GB memory, and SEARCH_SPACE = MAX_MOMENTUM_NONCE CHAIN_LEN(2^30, 2^7) = (8 + 4) * SEARCH_SPACE * PROCESSOR / MEM_MAX CHAIN_LEN(2^30, 2^7) = (8 + 4) * SEARCH_SPACE * 2^7 / 2^30 CHAIN_LEN(2^30, 2^7) = 12 * SEARCH_SPACE / 2^23 CHAIN_LEN(2^30, 2^7) = 12 * 2^26 / 2^23 CHAIN_LEN(2^30, 2^7) = 12 * 2^3 CHAIN_LEN(2^30, 2^7) = 96 Verification: MEMORY(CHAIN_LENGTH) = (8 + 4) * SEARCH_SPACE / CHAIN_LENGTH MEMORY(96) = (8 + 4) * 2^26 / 96 MEMORY(96) = 8388608 bytes MEMORY(96) = 8 MB CALLS(CHAIN_LENGTH) = (SEARCH_SPACE * CHAIN_LENGTH) Result: 1/96th the normal memory usage of 768 MB, 96x the SHA512 calls. The problem is no longer about memory, it's about computation speed.

I've actually got a somewhat working proof of concept, although it's quite sub-optimal. Right now the main issue is using the overkill boost::unordered_map for storage instead of a more bespoke memory structure. The boost implementation likes to decide it needs to redo the bucket arrangement from time to time, resulting in slowdowns.

It also has an unholy number of debug statements.

If you'd like to check on my progress, the source is right here. It's a drop-in replacement for "momentum.cpp".

Edit: There's other optimizations possible when using the time/memory tradeoff. For example, if we normally took 1 minute per nonce space search, cranking up the time/memory ratio to 16 would mean we'd expect 16 minutes per search, which is considerably longer than the target time per block. To compensate, we can have the algorithm return immediately on first collision discovery. When a new block is detected, all the miner threads should be stopped and recreated.

bytemaster

Quote from: AnonyMint on November 09, 2013, 09:30:06 pm

Quote from: bytemaster on November 03, 2013, 09:28:00 pm
I am looking for algorithmic weaknesses in the Momentum Proof of Work: http://bitsharestalk.org/index.php?topic=21.msg24#msg24

If you can identify an alternative algorithm for solving the proof of work that does not require the target RAM and yet is able to find hashes at a rate greater than MomentumAlgoHashRate / (TARGET_RAM_USE/ALT_RAM_USE) then you could win this bounty.

For example: Suppose momentum as specked requires about 1 GB of RAM for maximum performance and produces hashes at 1hz. If you design an algorithm that only requires 1 MB of RAM and can find hashes at .001 hz then you win because you will have proven that computational complexity is linear with RAM use rather than non-linear like I believe the problem calls for. I paid out a tip for showing this property doesn't quite hold as I thought, but this was more a weakness in my devising an effective / objective measure for determining whether or not momentum is "broken' or 'hacked'.

For the record, I did not receive any tip.

You are right... I owe you that tip... sorry I have been very busy can you send me a BTC address. Thanks

AnonyMint · « *Last Edit: November 09, 2013, 09:52:21 pm by AnonyMint* »

Quote from: bytemaster on November 03, 2013, 09:28:00 pm

I am looking for algorithmic weaknesses in the Momentum Proof of Work: http://bitsharestalk.org/index.php?topic=21.msg24#msg24

If you can identify an alternative algorithm for solving the proof of work that does not require the target RAM and yet is able to find hashes at a rate greater than MomentumAlgoHashRate / (TARGET_RAM_USE/ALT_RAM_USE) then you could win this bounty.

For example: Suppose momentum as specked requires about 1 GB of RAM for maximum performance and produces hashes at 1hz. If you design an algorithm that only requires 1 MB of RAM and can find hashes at .001 hz then you win because you will have proven that computational complexity is linear with RAM use rather than non-linear like I believe the problem calls for. I paid out a tip for showing this property doesn't quite hold as I thought, but this was more a weakness in my devising an effective / objective measure for determining whether or not momentum is "broken' or 'hacked'.

For the record, I did not receive any tip.

Quote from: AnonyMint on November 08, 2013, 10:47:15 pm

Quote from: bytemaster on November 08, 2013, 09:10:12 pm
1) What is the performance of SCRYPT on CPU vs GPU?

From the Litecoin Wiki, I seem to remember roughly 5 - 10 times faster for high-end GPU compared i7. Compared to roughly 100x faster for Bitcoin. You can double-check.

https://litecoin.info/Comparison_between_Litecoin_and_Bitcoin
https://litecoin.info/Mining_hardware_comparison
https://en.bitcoin.it/wiki/Mining_hardware_comparison

If we compare the i7 2600 both on pooler's cpuminer, then ratio to the HD7970 is roughly 30 for Bitcoin and 15 for Litecoin. So Litecoin is only a 2x improvement over Bitcoin, not the order-of-magnitude I had indicated.

AnonyMint · « *Last Edit: November 08, 2013, 10:54:32 pm by AnonyMint* »

Quote from: bytemaster on November 08, 2013, 10:22:44 pm

The whole theory/practice thing is what makes this whole discussion so hard.

The additional threads masking latency you have already the evidence of your own experience with hyperthreads, and also you can see this in analyzing the reports of cpuminer settings with Litecoin on the GPU.

On cpuminer, the "lookup gap" is the trading off memory for computation. The other main setting is the number of threads to run.

AnonyMint · « *Last Edit: November 08, 2013, 11:43:02 pm by AnonyMint* »

Some high-end CPUs don't have AGPUs. Look at the Bitcointalk thread for your protoshares launch, you see many Xeons and high-end i7. The server chips don't have AGPUs.

Quote from: bytemaster on November 08, 2013, 09:10:12 pm

Integrated Graphics can saturate its memory bus (60 GB/Sec on latest hardware)

Is that limited amount of sideport memory or the full main DRAM?

The fastest I'd seen was the 40 GB/sec for Ivy-E and those don't have AGPUs.

On Haswell, Ivy, Sandy with AGPU I saw roughly 20 GB/sec, but I wasn't clear if the AGPU can attain that since it has worse latency to main memory than the CPU.

Quote from: bytemaster on November 08, 2013, 09:10:12 pm

though I think in this particular use case moving more than 64 bytes (cache line size) per cycle doesn't help due to the random access nature. In fact, it may hurt the GPU by being forced to consume its memory bandwidth in larger than necessary chunks when it is randomly accessed.

If you have enough parallel threads they will to some extent statistically clump in locality of memory. I haven't attempted to quantify that effect.

Hashtables would point to arrays of items in the bucket.

Quote from: bytemaster on November 08, 2013, 09:10:12 pm

So there is a 26x difference between average CPU based systems and high end graphics cards.

Rough estimate agreed. On Haswell or latest, maybe half that.

Then higher if no AGPU integrated.

Quote from: bytemaster on November 08, 2013, 09:10:12 pm

1) What is the performance of SCRYPT on CPU vs GPU?

From the Litecoin Wiki, I seem to remember roughly 5 - 10 times faster for high-end GPU compared i7. Compared to roughly 100x faster for Bitcoin. You can double-check.

Quote from: bytemaster on November 08, 2013, 09:10:12 pm

2) Does momentum have a lower performance ratio than SCRYPT?

Appears to be worse than Litecoin. But you gain fast verification, so for you that might be the correct tradeoff, especially if integrated graphics will be improving on CPUs (so you might reach parity with Litecoin or better in future) and the ASIC risk is not higher than Scrypt (or if you are not so worried about ASIC near-term).

Quote from: bytemaster on November 08, 2013, 09:10:12 pm

3) When it comes to making an ASIC, which would be harder SCRYPT or Momentum?

I haven't analyzed this for Momentum. I have analyzed it for Scrypt. I can comment after others have summarized if I have something to add.

Quote from: bytemaster on November 08, 2013, 09:10:12 pm

As far as how to divided the bounty among the various contributors I am thinking that AnonyMint and gigawatt have both provided some reasonable contributions. I will wait to see gigawatt's proof-of-concept implementation to be sure there is a cycle and that the computational time is not larger than available parallelism before making the final decision about relative payout. Gigawatts proof-of-concept shows more time and energy invested and is also more compelling than theoretical estimations of performance gains provided by AnonyMint. Clearly writing code is of greater value than writing forum posts and also more definitive.

I have no problem with that. I am not expecting anything, but I think it does help you in the community to be fair and reward help. But I see we have a new comment claiming that Gigawatts vulnerability is not valid. I would go slowly and be sure first.

Quote from: bytemaster on November 08, 2013, 09:10:12 pm

Good work everyone, it is a pleasure debating this with you all. I hope you recognize that I am in this to find the best algorithm and NOT to defend my ego and put my head in the sand. This bounty and ProtoShares has brought out great minds and we sharpen one another.

I appreciate your clarification. I know you were sick the prior day, because you mentioned it.

bytemaster

Quote from: iruu on November 08, 2013, 09:58:13 pm

The cycle algorithm idea is wrong, there's no cycle in the 50 bit hash. The cycle is in 512 bit sha output, but that's equivalent to finding sha512 collisions.

Regardless of birthday algorithm, generating 2^23 sha512 on radeon 5870 (no oc) takes 180ms. 64 bit values and 1024 block makes parallelism on current compute units hard. That's close to output of new intel cpu. So gpus aren't going to win by a big margin, like with bitcoin's sha256.
The collision finding time, even on a cpu, is almost insignificant, 2^26 is a small value.
I'm writing gpu miner as an opencl learning experience. Looks like finding all birthdays in 2^26 hashes is going to take 200-400ms on radeon 5870.

iruu, thank you for joining the discussion. You confirmed my belief that there were no cycles and that the GPU would have a hard time in practice doing this. The whole theory/practice thing is what makes this whole discussion so hard.

iruu · « *Last Edit: November 08, 2013, 11:36:59 pm by iruu* »

The cycle algorithm idea is wrong, there's no cycle in the 50 bit hash. The cycle is in 512 bit sha output, but that's equivalent to finding sha512 collisions.

Regardless of birthday algorithm, generating 2^23 sha512 on radeon 5870 (no oc) takes 180ms. 64 bit values and 1024 block makes parallelism on current compute units hard. That's close to output of new intel cpu. So gpus aren't going to win by a big margin, like with bitcoin's sha256.
The collision finding time, even on a cpu, is almost insignificant, 2^26 is a small value.
I'm writing gpu miner as an opencl learning experience. Looks like finding all birthdays in 2^26 hashes is going to take 200-400ms on radeon 5870.

edit: 70ms for sha512, so not that bad. (optimizing opencl is completely different from cpu, and I'm not talking about obvious sequential vs parallel differences), time for the harder part.

Author Topic: $5000 Bounty - Momentum Proof-of-Work Hacking [PAID] (Read 107254 times)

digitalindustry

Re: $5000 Bounty - Momentum Proof-of-Work Hacking

digitalindustry

Re: $5000 Bounty - Momentum Proof-of-Work Hacking

bytemaster

Re: $5000 Bounty - Momentum Proof-of-Work Hacking

barwizi

Re: $5000 Bounty - Momentum Proof-of-Work Hacking

iruu

Re: $5000 Bounty - Momentum Proof-of-Work Hacking

gigawatt

Re: $5000 Bounty - Momentum Proof-of-Work Hacking

gigawatt

Re: $5000 Bounty - Momentum Proof-of-Work Hacking

bytemaster

Re: $5000 Bounty - Momentum Proof-of-Work Hacking

gigawatt

Re: $5000 Bounty - Momentum Proof-of-Work Hacking

bytemaster

Re: $5000 Bounty - Momentum Proof-of-Work Hacking

AnonyMint

Re: $5000 Bounty - Momentum Proof-of-Work Hacking

AnonyMint

Re: $5000 Bounty - Momentum Proof-of-Work Hacking

AnonyMint

Re: $5000 Bounty - Momentum Proof-of-Work Hacking

bytemaster

Re: $5000 Bounty - Momentum Proof-of-Work Hacking

iruu

Re: $5000 Bounty - Momentum Proof-of-Work Hacking