BitShares Forum

Main => General Discussion => Topic started by: bytemaster on November 03, 2013, 09:28:00 pm

Title: $5000 Bounty - Momentum Proof-of-Work Hacking [PAID]
Post by: bytemaster on November 03, 2013, 09:28:00 pm
I am looking for algorithmic weaknesses in the Momentum Proof of Work: http://bitsharestalk.org/index.php?topic=21.msg24#msg24 

If you can identify an alternative algorithm for solving the proof of work that does not require the target RAM and yet is able to find hashes at a rate greater than  MomentumAlgoHashRate / (TARGET_RAM_USE/ALT_RAM_USE)  then you could win this bounty.

For example:  Suppose momentum as specked requires about 1 GB of RAM for maximum performance and produces hashes at 1hz.   If you design an algorithm that only requires 1 MB of RAM and can find hashes at .001 hz then you win because you will have proven that computational complexity is linear with RAM use rather than non-linear like I believe the problem calls for.   
      I paid out a tip for showing this property doesn't quite hold as I thought, but this was more a weakness in my devising an effective / objective measure for determining whether or not momentum is "broken' or 'hacked'. 

My original phrasing of this bounty on Bitcoin talk was:

Quote
One goal is to produce an implementation of this proof of work system that negates the algorithmic advantage given to the use of memory and allows much faster solutions.  If you are able to convince me this proof-of-work is no better than Scrypt then you will win the 30 btc bounty.   The objective proof that someone has convinced me will be whether or not I use momentum for BitShares.  I have far more on the line for choosing a bad proof of work than not paying this out.

At the time of the original post 30 BTC was $5000... so I will keep it at $5000.   

Note:  Now that protoshares has been launched based upon Momentum the bounty is much higher if you can implement it and gain a massive mining advantage and then sell the resulting ProtoShares.   Then after you have earned enough profit that way you can share your approach with the community to get another $5000.

Variables I reserve the right to change while still calling it momentum:

Best of luck to you all!
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: gatra on November 04, 2013, 08:48:38 pm
is there a deadline for the bounty?
I think I found a weakness but I'm not sure I can prove it before tomorrow.
Is there a working source code that I can download and modify in order to test my theory?

Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 04, 2013, 08:56:11 pm
is there a deadline for the bounty?
I think I found a weakness but I'm not sure I can prove it before tomorrow.
Is there a working source code that I can download and modify in order to test my theory?

Bounty has no deadline. 
https://github.com/InvictusInnovations/ProtoShares/blob/psforkinit/src/momentum.cpp

Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: gatra on November 05, 2013, 02:27:47 am
is there a deadline for the bounty?
I think I found a weakness but I'm not sure I can prove it before tomorrow.
Is there a working source code that I can download and modify in order to test my theory?

I thought that I could use rainbow tables in order to store only a small part of the birthday/nonce pairs and still be able to find collisions.
But since the space of nonces is 2^26 and the space of birthdays is 2^50 it wouldn't work because it wuuld find too many artificial collisions.
IF the nonce and birhday spaces were of the same order, then it would work perfectly.... but it looks like you are safe from rainbow tables: I can't think of a way to make them work.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 05, 2013, 02:56:11 am
is there a deadline for the bounty?
I think I found a weakness but I'm not sure I can prove it before tomorrow.
Is there a working source code that I can download and modify in order to test my theory?

I thought that I could use rainbow tables in order to store only a small part of the birthday/nonce pairs and still be able to find collisions.
But since the space of nonces is 2^26 and the space of birthdays is 2^50 it wouldn't work because it wuuld find too many artificial collisions.
IF the nonce and birhday spaces were of the same order, then it would work perfectly.... but it looks like you are safe from rainbow tables: I can't think of a way to make them work.

Thanks for trying!  The more people who try the better :)   One ProtoShares is out there will be a far bigger bounty :)
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: AnonyMint on November 06, 2013, 02:57:58 pm
For example:  Suppose momentum as specked requires about 1 GB of RAM for maximum performance and produces hashes at 1hz.   If you design an algorithm that only requires 1 MB of RAM and can find hashes at .001 hz then you win because you will have proven that computational complexity is linear with RAM use rather than non-linear like I believe the problem calls for.   

Best of luck to you all!

I believe there is a weakness where GPUs can run much faster, but it doesn't requiring lowering the memory. If I explain it to you and am correct, do I qualify for at least part of the bounty?

It is possible I am incorrect or have misunderstood some aspect of the algorithm, so let's not jump to conclusions yet.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: Nathaniel B on November 06, 2013, 03:33:39 pm
Would you please clarify exactly what TARGET_RAM_USE is? Is it the total memory required to store the hashtable of all nonce results or some smaller value?

Should we use your example from http://bitsharestalk.org/index.php?topic=21.msg26#msg26 to empirically confirm? Also could you release FreeTrade's current implementation or standalone subset you mentioned there for testing?

Thanks,

Nathaniel
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 06, 2013, 05:13:53 pm
Would you please clarify exactly what TARGET_RAM_USE is? Is it the total memory required to store the hashtable of all nonce results or some smaller value?

Should we use your example from http://bitsharestalk.org/index.php?topic=21.msg26#msg26 to empirically confirm? Also could you release FreeTrade's current implementation or standalone subset you mentioned there for testing?

Thanks,

Nathaniel

The algorithm should work at any scale, but for the sake of this bounty lets set TARGET_RAM_USE at 768MB.

https://github.com/InvictusInnovations/ProtoShares/blob/psforkinit/src/momentum.cpp
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 06, 2013, 05:25:52 pm
For example:  Suppose momentum as specked requires about 1 GB of RAM for maximum performance and produces hashes at 1hz.   If you design an algorithm that only requires 1 MB of RAM and can find hashes at .001 hz then you win because you will have proven that computational complexity is linear with RAM use rather than non-linear like I believe the problem calls for.   

Best of luck to you all!

I believe there is a weakness where GPUs can run much faster, but it doesn't requiring lowering the memory. If I explain it to you and am correct, do I qualify for at least part of the bounty?

It is possible I am incorrect or have misunderstood some aspect of the algorithm, so let's not jump to conclusions yet.

There is a big difference between theory and practice when it comes to GPUs and considering the price on the bounty you can afford to provide an implementation.  Your implementation can use a simpler hash function than SHA512 because we are testing the algorithmic complexity not the cryptographic security.

Here are my thoughts on why the GPU cannot do it faster, test them against your theory:
1) I am sure a GPU can calculate the SHA512 birthdays faster.... but it will do no good because the process of comparing every birthday to every other birthday is either:

      a) brute force linear search which could be done in parallel.  The challenge is that such a linear search would be very taxing on the GPU's memory bus to have every thread read every value.

      b) slightly better... use the GPU to sort the data after generating it... such sorting also requires heavy memory bandwidth use and then ultimately ends up with a linear scan of the memory looking for matching neighbors

      d) implement a hash table to some kind on the GPU.  Unfortunately, for the hash table to be effective at finding duplicates items have to be inserted sequentially.

2) Assume we divide and conquer this work between the GPU and CPU... in this case you generate the birthday's in parallel on the GPU then search for collisions on the CPU.  In the middle you have to transfer all of the data across the PCI-E bus which is slower than your memory bus which is already close to the bottleneck. 

3) I count integrated graphics as the CPU


Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: AnonyMint on November 06, 2013, 05:30:29 pm
bytemaster, I don't have spare time to implement. How about I share my thoughts and then you decide if I deserve even a small token payment for my contribution? You might gain some insight into how I think a CPU-only PoW could be implemented, but I won't be providing a complete algorithm today.

I don't even have a GPU handy, nor the development environment for them. I would be speaking on a theoretical basis after studying Scrypt and understanding why Litecoin failed (yet still improved the ratio for GPU advantage compared to Bitcoin).
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 06, 2013, 05:35:07 pm
I will consider a tip if it leads me to make improvements.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: AnonyMint on November 06, 2013, 05:45:09 pm
Okay if I understand correctly your proposed PoW algorithm, we are generating pseudo-random values that can range over the address space of the memory size we target until we generate a duplicate (or triplicate). The Birthday math tells us that we will find a solution with 50% probability before visiting less than roughly 10% of the addresses, or 99% before visiting less than roughly 20% of the addresses. Thus this is a way to require large memory without actually writing to every memory address, i.e. a sparse memory access algorithm. Btw, I also considered this sort of algorithm before and dumped it.

Assuming the generation of the pseudo-random values can be accelerated by parallelism, i.e. that the algorithm is main memory latency bound, then the GPU is going to be much faster, because the GPU masks memory latency by employing 1000s of threads, i.e. we can test 1000s of pseudo-random values in parallel. This was the key myopia that caused Litecoin to fail at stopping GPUs from outperforming CPUs, although it did improve the difference relative to Bitcoin's PoW.

I don't see how the use of a hash table really changes the outcome significantly, as it will only serialize perhaps every 10 buckets and we assume the pseudo-random values are uniformly distributed over the entire address space. Collisions with a 1000 threads are going to be rare.

I have revealed my key insight into PoW. If you weren't aware of this and I am correct, I hope you tip me consumerately regardless if it leads you to make changes or not (I should not be responsible for your success, only for proving what doesn't work).

Edited: to insert word "not" after "I should" that was missing.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 06, 2013, 06:05:10 pm
Okay if I understand correctly your proposed PoW algorithm, we are generating pseudo-random values that can range over the address space of the memory size we target until we generate a duplicate (or triplicate). The Birthday math tells us that we will find a solution with 50% probability before visiting less than roughly 10% of the addresses, or 99% before visiting less than roughly 20% of the addresses. Thus this is a way to require large memory without actually writing to every memory address, i.e. a sparse memory access algorithm. Btw, I also considered this sort of algorithm before and dumped it.

Assuming the generation of the pseudo-random values can be accelerated by parallelism, i.e. that the algorithm is main memory latency bound, then the GPU is going to be much faster, because the GPU masks memory latency by employing 1000s of threads, i.e. we can test 1000s of pseudo-random values in parallel. This was the key myopia that caused Litecoin to fail at stopping GPUs from outperforming CPUs, although it did improve the difference relative to Bitcoin's PoW.

I don't see how the use of a hash table really changes the outcome significantly, as it will only serialize perhaps every 10 buckets and we assume the pseudo-random values are uniformly distributed over the entire address space. Collisions with a 1000 threads are going to be rare.

I have revealed my key insight into PoW. If you weren't aware of this and I am correct, I hope you tip me consumerately.

Ok... a few corrections for you to consider.   The address space of the birthdays is 50 bits.   The TARGET_MEMORY is the number of birthdays that must be stored to have a 90+% probability of finding 1 collision once you have filled TARGET_MEMORY.    As a result, this isn't sparse memory access unless you are operating on some MULTIPLE of the TARGET_MEMORY. 

Your GPU algorithm is still O(C * N^2) where C is some constant factor improvement.
CPU algorithm is O( C * N )

While the GPU can hide latency it cannot change bandwidth and your bandwidth usage is still N^2.

Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 06, 2013, 06:08:47 pm
Okay if I understand correctly your proposed PoW algorithm, we are generating pseudo-random values that can range over the address space of the memory size we target until we generate a duplicate (or triplicate). The Birthday math tells us that we will find a solution with 50% probability before visiting less than roughly 10% of the addresses, or 99% before visiting less than roughly 20% of the addresses. Thus this is a way to require large memory without actually writing to every memory address, i.e. a sparse memory access algorithm. Btw, I also considered this sort of algorithm before and dumped it.

Assuming the generation of the pseudo-random values can be accelerated by parallelism, i.e. that the algorithm is main memory latency bound, then the GPU is going to be much faster, because the GPU masks memory latency by employing 1000s of threads, i.e. we can test 1000s of pseudo-random values in parallel. This was the key myopia that caused Litecoin to fail at stopping GPUs from outperforming CPUs, although it did improve the difference relative to Bitcoin's PoW.

I don't see how the use of a hash table really changes the outcome significantly, as it will only serialize perhaps every 10 buckets and we assume the pseudo-random values are uniformly distributed over the entire address space. Collisions with a 1000 threads are going to be rare.

I have revealed my key insight into PoW. If you weren't aware of this and I am correct, I hope you tip me consumerately.

Ok... a few corrections for you to consider.   The address space of the birthdays is 50 bits.   The TARGET_MEMORY is the number of birthdays that must be stored to have a 90+% probability of finding 1 collision once you have filled TARGET_MEMORY.    As a result, this isn't sparse memory access unless you are operating on some MULTIPLE of the TARGET_MEMORY and that multiple would have to be very high (scaling with the number of GPU threads) if you want to avoid threads stomping on each other with high probability. 

Your GPU algorithm is still O(C * N^2) where C is some constant factor improvement.
CPU algorithm is O( C * N )

While the GPU can hide latency it cannot change bandwidth and your bandwidth usage is still N^2.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 06, 2013, 06:13:51 pm
Here is how I would implement this on a GPU:

Generate 2^26 BIRTHDAYS
GPU SORT
Perform 2^25 checks against neighbors for collisions in parallel.

This would probably be the FASTEST way to implement this proof of work on a GPU but still requires TARGET_MEMORY and thus does not qualify for the bounty.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: AnonyMint on November 06, 2013, 09:36:12 pm
While the GPU can hide latency it cannot change bandwidth and your bandwidth usage is still N^2.

Memory bandwidth is several times faster on the GPU than CPU and you won't get close to saturating memory bandwidth on the CPU because the full random memory access latency is too high and you can only run at most 8 hardware hyperthreads on a core i7 thus unable to hide the latency entirely. Software threads above that 8 count don't matter.

You have leveraged a weakness of the CPU relative to the GPU. To make a CPU-only PoW, you must leverage a strength of the CPU relative to the GPU and ASIC.

On the CPU your algorithm is random main memory latency bound while on the GPU it is main memory bandwidth bound. Thus on the CPU approximately at best 64 bytes cache line size at several hundred clock cycles per random access, and on the GPU up to 260 GB per second if utilizing the full 128 byte cache line size.

Also if the GPU has 6 GB and your target memory is 0.75 GB, then the GPU can run 8 instances and each can reach maximum memory bandwidth.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 06, 2013, 09:39:51 pm
While the GPU can hide latency it cannot change bandwidth and your bandwidth usage is still N^2.

Memory bandwidth is several times faster on the GPU than CPU and you won't get close to saturating memory bandwidth on the CPU because the full random memory access latency is too high and you can only run at most 8 threads on a core i7 thus unable to hide the latency entirely. Software threads above that 8 count don't matter.

Your constant time hardware gains in memory bandwidth and parallelism are combating a N^2 algorithmic disadvantage.    It may be possible to develop a GPU based solution, but how much faster it will be is TBD... will it be 10x faster?  I don't think so.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: AnonyMint on November 06, 2013, 09:55:09 pm
Ok... a few corrections for you to consider.   The address space of the birthdays is 50 bits.   The TARGET_MEMORY is the number of birthdays that must be stored to have a 90+% probability of finding 1 collision once you have filled TARGET_MEMORY.    As a result, this isn't sparse memory access unless you are operating on some MULTIPLE of the TARGET_MEMORY.

Correct that by using a hash table it is no longer physically sparse, so CPU and GPU only need to have the target physical memory. So thus the birthday aspect is useless except for causing the collisions on the serial hash bucks.

Your GPU algorithm is still O(C * N^2) where C is some constant factor improvement.
CPU algorithm is O( C * N )

While the GPU can hide latency it cannot change bandwidth and your bandwidth usage is still N^2.

Only if you assume the GPU can't use a hash table too. I see no reason for that to be the case, and as I wrote in my prior post. And I expect the collisions to be irrelevant. If that last sentence is the only one that is in contention, I can explain more?

Also re-read my prior post, I added to it.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: AnonyMint on November 06, 2013, 09:59:06 pm
This would probably be the FASTEST way to implement this proof of work on a GPU but still requires TARGET_MEMORY and thus does not qualify for the bounty.

The CPU requires the target memory also and would thus be significantly slower due to not even saturating its slower memory bandwidth due to being memory latency bound.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: AnonyMint on November 06, 2013, 10:02:13 pm
While the GPU can hide latency it cannot change bandwidth and your bandwidth usage is still N^2.

Memory bandwidth is several times faster on the GPU than CPU and you won't get close to saturating memory bandwidth on the CPU because the full random memory access latency is too high and you can only run at most 8 threads on a core i7 thus unable to hide the latency entirely. Software threads above that 8 count don't matter.

Your constant time hardware gains in memory bandwidth and parallelism are combating a N^2 algorithmic disadvantage.    It may be possible to develop a GPU based solution, but how much faster it will be is TBD... will it be 10x faster?  I don't think so.

Why do you assume the GPU can't use a hash table also?
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: iruu on November 06, 2013, 11:41:17 pm
What about the bounty on bitcointalk. So if someone provides an implementation before or at 15th November, he gets max(30 BTC, $5000), but after, only $5000?

Also, do you have access to windows with a mining capable gpu?

Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: digitalindustry on November 07, 2013, 04:07:53 am
I will independently verify that Anonymint engendered  these ideas first , (well spoke about then anyhow) and if an implementation come from these ideas , surely (some) credit has to go his way.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: Nathaniel B on November 07, 2013, 04:09:44 am
Here's a proof that your algorithm scales sublinear with RAM i.e. adding more RAM up to the TARGET_RAM_USE actually has slight diminishing returns and even in the asymptomatic case is at best linear in speed up.
Informally, my impression is that you misunderstand what the actual steps in your algorithm look like. Because of the difficulty function, a user wants to find as many BirthdayHash collisions as possible rather than just the first match and each time a new nonce is BirthdayHashed and put into the hashtable it is compared to the others rather than all hashes being compared in one n^2 computation at the end.
I'll reference my original post on the bitcoin forum and quote extensively from it.
https://bitcointalk.org/index.php?topic=313479.msg3365654#msg3365654

Let B = the number of ‘birthdays’ (hashspace of BirthdayHash).
Let S = the number of BirthdayHash(nonce+H) stored in RAM.
Let M = maximum number that S can be given a particular system’s RAM.

If we find a new BirthdayHash(A + H), the expected number of matches ( BirthdayHash(A+H)==BirthdayHash(B+H) ) = S / B

Now we calculate the total number of expected matches during a time frame (average block creation time for instance). we take the number of BirthdayHash() we compute in that time (N). S will increase from 0 to N unless M<N, meaning we do not have enough RAM to store N results, from your comments I gather that you intend for M (TARGET_RAM_USE/(nounce size +hashtable overhead per nounce)) >= N.

N * Mean(S) / B = Total expected matches
If M>=N:
Mean(S) = (1 + 2 + 3 + … + N)/N = (N+1)/2

Now let's see what happens when a user only has M/2 RAM available and M/2<N<=M. To when your challenge I must prove that with M/2 RAM a user would still have at least half as many matches ((N+1)/2)*.5.

For the first M/2 hashes both users produce the same results. Then, the user stuck at M/2 RAM only has M/2 nonces to compare each new one to like this:
(1+2+3 + … M/2 + M/2 + … + M/2) = ((M/2)*(M/2 + 1)/2 + (M/2)*(N-M/2))
The mean then is  ((M/2)*(M/2 + 1)/2 + (M/2)*(N-M/2))/N
Total expected matches for M/2 RAM is (Ns cancel out): ((M/2)*(M/2 + 1)/2 + (M/2)*(N-M/2))/B

(M/2 scenario) ((M/2)*(M/2 + 1)/2 + (M/2)*(N-M/2))/B vs (N*(N+1)/2)/B (N<=M scenario)
Plug in the numbers M = 2^26 nonces worth of RAM, N = 2^26 hashes during the time period since it’s always more efficient to continue hashing until the nonce space is complete to avoid extra cost in rebuild the hashtable of potential matches.
(M/2 scenario)
((2^26/2)*(2^26/2 + 1)/2 + (2^26/2)*(2^26-2^26/2)) /2^50= 1.5000000149
vs,
(M scenario)
(2^26*(2^26+1)/2)/2^50 = 2.0000000298

So in the same time frame with half the RAM, I will find ~1.5 matches vs your ~2 matches your scaling is actually worse than a linear increase when you double the RAM.

Let me know if you have any questions. I suggest you run the empirical experiment yourself if you’re not convinced. Just change the code to stop adding new results to the hash table once you’re half way through the nonce space to simulate halving your RAM.

Hope this helps your design. You can get asymptotically linear increases per RAM increase if you make the TARGET_RAM_USE ridiculously huge like 1TB+.


Thanks,

Nathaniel
16EeLa9BrZc1A4sasztaRdnfp3aU8DmuVC
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 07, 2013, 04:55:10 am
Nathaniel B,
   Let me say that your post has a lot of math that I am too sick to follow completely to identify the specific problem.  Instead, I would like to go back to some first principles of the algorithm and see if I can summarize what you say your math proves.

   (http://upload.wikimedia.org/wikipedia/commons/thumb/e/e7/Birthday_Paradox.svg/800px-Birthday_Paradox.svg.png)
     
   Based upon this graph it would appear that it all depends upon how far up the curve you set your target.    If you set the target below 23 then you have non linear increases in probability of finding someone.  If you set it above 23 it scales approximately linearly until you hit 40 at which point you start getting less and less benefit by increasing your RAM. 

   Now, with this hash algorithm the total hash power is proportional to the AREA under the curve.   The first person you add has 0 chance.  Everyone less than 10 has less than 10% chance.  Every hash over 23 has a 50% chance.    So performance is clearly tied to how quickly you can increase your probability of finding a result which of course has diminishing returns. 

   Clearly, once you get near 50 people your memory usage can become constant and you will have 95% of the hash power that you would have if your doubled your memory.   Fortunately, we limit the NONCE search space which prevents you from ever going past ~23.    So I stay on the portion of the curve that does gain by increasing RAM.   

  (http://upload.wikimedia.org/wikipedia/commons/thumb/9/9f/Birthday_paradox.svg/575px-Birthday_paradox.svg.png)
 
   Taken to the extreme, of only having memory for 1 birthday you would end up on Q(n) curve. 

   There are two approaches to finding a collision and you can make a trade off of memory for parallelism.    So let me try some math here...

   Lets assume you only have memory for 10 items (50% of target) then your probability of finding a result is .1 * P  where P is the level of parallelism.  As a result if P is 10 then the probability of finding a result after 1 step is 50/50.

   Lets assume P is 1 in the CPU case.   Now instead of doing 10 calculating in parallel you do them sequentially.   In this case after you have performed 10 steps your next calculation is 50/50 which is equal to the parallel case, except that you get the benefit of the area under the curve which means that on average it will take you less than 10 steps to find the result.  Also, in the GPU case the cost of trying again is another 10 checks in parallel, where as the cost of the CPU case with memory increase is less than 1 additional calculation.   

   Conclusion:  reducing memory does cause a linear reduction in the probability of finding one collision,  but you must take the area under the curve to see the true benefits.

   Now at this point you may point out that the GPU could do the first 10 steps in parallel and thus the GPU could have a 50/50 chance of having solved it in 2 steps while the CPU would be 10 steps behind.    I do not think this would be the case however with a 50 bit birthday space. 

Anyway, the proof is in the implementation.   
   
 

Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: AnonyMint on November 07, 2013, 06:14:58 am
Anyway, the proof is in the implementation.   

I think we are sufficiently expert with 3 decades of programming experience, that we can make correct analysis without implementing.

I am still waiting for your replies. Hope you feel better. Try taking 50,000 IU of Vit D3. That kicks my colds to the curb within 24 hours.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: Nathaniel B on November 07, 2013, 07:03:21 am
All your graphs are about finding 1 birthday collision, but miners want to find as many collisions as possible which completely changes the problem. Run the empirical experiment I suggested. Just add the one check so that after half the nounce space is exhausted you stop adding new birthdayHashed nounces to the hashmap to simulate having half the memory. You will see the result I presented.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: Nathaniel B on November 07, 2013, 05:49:47 pm
To clarify, the 50% chance for at least 1 collision for 23 people in your example is with doing n^2 operations, but when you're hashing you check each one against the ones you already have saved each time you add it to the hash table. You're not doing all n^2 operations each time since you've already compared the previous hashes to each other. You're just going n comparisons each time to see if the new hash matches. There's a disconnect between what you describe in your graphs and the actual algorithm. That 50% chance would be after 23 hashes you'd find AT LEAST 1 collision, but miners want to find MULTIPLE collisions which changes the whole scaling.

I like the core idea of using the bday hash to force high memory usage. I think you're idea of saturating the memory bandwidth has promise. I think that with the right parameters you could design mining that would have a very low improvement (perhaps none without the scale of ATI behind it) going to ASICs over GPUs due to GPU optimization of parallel operations and memory bandwidth. You could give CPUs a fighting chance as well since they can leverage high amounts of memory to make up for less raw hashes/memory bandwidth.

Your algorithm as it is though does not scale the way you think it does.

Nathaniel
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 07, 2013, 06:43:40 pm
To clarify, the 50% chance for at least 1 collision for 23 people in your example is with doing n^2 operations, but when you're hashing you check each one against the ones you already have saved each time you add it to the hash table. You're not doing all n^2 operations each time since you've already compared the previous hashes to each other. You're just going n comparisons each time to see if the new hash matches. There's a disconnect between what you describe in your graphs and the actual algorithm. That 50% chance would be after 23 hashes you'd find AT LEAST 1 collision, but miners want to find MULTIPLE collisions which changes the whole scaling.

I like the core idea of using the bday hash to force high memory usage. I think you're idea of saturating the memory bandwidth has promise. I think that with the right parameters you could design mining that would have a very low improvement (perhaps none without the scale of ATI behind it) going to ASICs over GPUs due to GPU optimization of parallel operations and memory bandwidth. You could give CPUs a fighting chance as well since they can leverage high amounts of memory to make up for less raw hashes/memory bandwidth.

Your algorithm as it is though does not scale the way you think it does.

Nathaniel

Nathaniel... this is why in my description I reference the AREA under the graph as the sum of probability.    If I cut off the memory then I am losing part of that area with every hash I perform.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 07, 2013, 06:52:46 pm
Imagine a coin toss, with only one unit of memory your odds of finding a collision are 1/N.     
With M units of memory your odds are M/N

Because these are independent events.... we can calculate the probability of NOT finding a collision after X birthdays...

(n-1)/N  * (n-2)/N * (n-3)/N............. * (n-x)/N

As x increases so does your chance of finding a match and it multiplies out significantly over time.

If you limit your memory to 2... then the math looks like:
(n-1)/N  * (n-2)/N * (n-2)/N............. * (n-2)/N

Clearly your odds of not finding a match drop much faster the higher X gets... but if you cap X then you lose performance.

The code is out there, the bounty is for someone to prove it wrong with an implementation not fancy handwaving and unbacked claims. 



 
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 07, 2013, 07:26:21 pm
I decided I would graph the probability of NOT FINDING a hash after searching the NONCE space using three different amounts of memory... 100%   50% and 25%

(http://the-iland.net/static/images/graphs.png)

Both use cases search the same number of nonces and thus do the same number of SHA512... but you take a performance hit if you reduce your memory...

Obviously, slight reductions in memory don't hurt too much, but if you attempted to reduce your memory from 768 MB down to 1 MB so that you could operate 1000 runs in parallel then your probability of NOT finding a match would be: 
0.368196 WITH MEMORY,
0.997402 WITHOUT MEMORY

So your probability of finding a match is 243x smaller for the same number of computations. 

So I guess this means that you can trade 765 MB of RAM for 243 parallel processes each with 1 MB of RAM and have the SAME hash rate.    Now you only actually decreased your RAM requirements by 33%...  while increasing the number of computations required by 243x...
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: AnonyMint on November 07, 2013, 07:58:57 pm
The point of Scrypt was to make parallel operation difficult and its primary weakness is that validation time suffers as you scale it.

If you don't require that the first match in the sequence of generated pseudo-random numbers, then my upthread described parallelism attack works to allow GPUs to blow away CPUs, assuming the pseudo-random sequence can be generated much faster than the random memory latency bound on the CPU so the GPU can try 100s of values from the same pseudo-random sequence in parallel to hide the memory latency.

If you do require it, then you lose the fast validation time.

Do you get it now or you going to continue to ignore me?
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: AnonyMint on November 07, 2013, 08:05:07 pm
From bitcointalk.org thread,  link=topic=313479.msg3363346#msg3363346 date=1382116292

Quote from: bytemaster
If you are able to convince me this proof-of-work is no better than Scrypt then you will win the 30 btc bounty.

Given my immediately prior post, I hereby claim the 30 BTC bounty.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 07, 2013, 08:07:50 pm
The point of Scrypt was to make parallel operation difficult and its primary weakness is that validation time suffers as you scale it.

If you don't require that the first match in the sequence of generated pseudo-random numbers, then my upthread described parallelism attack works to allow GPUs to blow away CPUs, assuming the pseudo-random sequence can be generated much faster than the random memory latency bound on the CPU so the GPU can try 100s of values from the same pseudo-random sequence in parallel to hide the memory latency.

If you do require it, then you lose the fast validation time.

Do you get it now or you going to continue to ignore me?

Ok, lets do this on a GPU..... I have found through experience that theoretical gains on the GPU often do not pan out as you expect.   So I need to see an implementation.  $5000 is more than enough to pay for an implementation if you are so sure that you are right.

Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: AnonyMint on November 07, 2013, 08:12:39 pm
Your original bounty didn't ask for an implementation. And $5000 doesn't buy enough time from me.

It is up to you how far you want to go down the wrong road.

There is no way to make CPU-only and fast verification. I realized that recently in analysis and research. I realize this is big problem for your coin design.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 07, 2013, 08:15:06 pm
From bitcointalk.org thread,  link=topic=313479.msg3363346#msg3363346 date=1382116292

Quote from: bytemaster
If you are able to convince me this proof-of-work is no better than Scrypt then you will win the 30 btc bounty.

Given my immediately prior post, I hereby claim the 30 BTC bounty.

Wow... so you are a mind reader and a troll.    I have dealt with you extensively in the past and set bitcointalk to IGNORE you... So I will do the same here.   I kindly suggest you go else where because I do not have the time to reply to you with this attitude.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 07, 2013, 08:22:57 pm
Your original bounty didn't ask for an implementation. And $5000 doesn't buy enough time from me.

It is up to you how far you want to go down the wrong road.

There is no way to make CPU-only and fast verification. I realized that recently in analysis and research. I realize this is big problem for your coin design.

If you cannot build a GPU algorithm in 6 days at $100 an hour... someone else will.   Even so, you have to CONVINCE ME it is not better than SCRYPT and a GPU implementation would have to have as much gain in performance vs the CPU as the GPU implementation of SCRYPT vs the CPU implementation of SCRYPT just to be in the running.       
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: AnonyMint on November 07, 2013, 08:45:27 pm
[redacted my explanation of why $100 per hour is an insult to me]

You wanted help on insights into weaknesses. I have explained it. If there is something you don't understand, ask.

Your attitude sucks. If you want to discuss why you doubt my claim, then fine we can engage. But if you expect everyone to be your servant, then I will stop helping you.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: AnonyMint on November 07, 2013, 08:50:36 pm
Ok, lets do this on a GPU..... I have found through experience that theoretical gains on the GPU often do not pan out as you expect.   So I need to see an implementation.

Your bounty only called for showing it is NO BETTER than Scrypt. I already did. I don't have to prove it is much faster on the GPU (but it will also be, yet that is irrelevant to how you stated your bounty).

If you choose to blissfully ignore what has been revealed to you, then so be it. I love it if you go down the wrong road. Please do. Worth much more to me than the $5000.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 07, 2013, 08:59:21 pm
Ok, lets do this on a GPU..... I have found through experience that theoretical gains on the GPU often do not pan out as you expect.   So I need to see an implementation.

Your bounty only called for showing it is NO BETTER than Scrypt. I already did. I don't have to prove it is much faster on the CPU (but it will also be, yet that is irrelevant to how you stated your bounty).

If you choose to blissfully ignore what has been revealed to you, then so be it. I love it if you go down the wrong road. Please do. Worth much more to me than the $5000.

You have NOT CONVINCED ME... perhaps you have convinced yourself, but that is not the same thing.   If I had to choose between momentum and scrypt I would pick momentum... I have hashed (pun intended) this out with many other intelligent people who also disagree with your assessments.    It is possible to convince me, but it will require much more than handwaving about theoretical performance on theoretical GPUs with a theoretical GPU algorithm...   

ProtoShares is ultimately a big bounty on Momentum that can easily be claimed by anyone who is able to execute the designs you describe with any meaningful effect.    Trust me or not... I have much more to lose than $5000 if this is no better than SCRYPT.   
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: AnonyMint on November 07, 2013, 09:26:00 pm
You have NOT CONVINCED ME...

Perfect. Confirmed my suspicions about what I've heard.

I have hashed (pun intended) this out with many other intelligent people who also disagree with your assessments.

And so why don't you list the technical reasons why you and they disagree.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: Nathaniel B on November 07, 2013, 09:35:07 pm
I decided I would graph the probability of NOT FINDING a hash after searching the NONCE space using three different amounts of memory... 100%   50% and 25%

(http://the-iland.net/static/images/graphs.png)

Both use cases search the same number of nonces and thus do the same number of SHA512... but you take a performance hit if you reduce your memory...

Obviously, slight reductions in memory don't hurt too much, but if you attempted to reduce your memory from 768 MB down to 1 MB so that you could operate 1000 runs in parallel then your probability of NOT finding a match would be: 
0.368196 WITH MEMORY,
0.997402 WITHOUT MEMORY

So your probability of finding a match is 243x smaller for the same number of computations. 

So I guess this means that you can trade 765 MB of RAM for 243 parallel processes each with 1 MB of RAM and have the SAME hash rate.    Now you only actually decreased your RAM requirements by 33%...  while increasing the number of computations required by 243x...

The issue is that mining is NOT about finding 1 or more matches it's about finding as many matches as possible. Your numbers and graphs ALL describe mining as if the goal is to find just the first match whereas your algorithm let's any match be tested against the difficulty function to find a block. A miner wants to find many matches not just the first one.

What sort of implementation proof do you want? If I find the hashtable in the code base and make the change I suggested to artificially limit the number of nonces stored to half and then show it mining at more than half the speed would that be sufficient to convince you and win the bounty?
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 07, 2013, 10:10:59 pm
Ok... my own graphs show that the area 'over' the green curve is less than half the area under the 'blue' curve which would seem to indicate the principle you are referring to.

Namely, that reducing memory by 50% only hurts performance by ~20% (eyeballing it).       However, it does hurt performance.   

I calculated the area under the curve for a 1/1000 reduction in memory requirement (down to the size of SCRYPT) and performance is 253.58x slower...

To get back to equal performance with single threading you would have to go parallel 254 threads each using 1 MB of memory for a total of 253 MB of memory, a 75% memory reduction.

To run this on a GPU the most parallel threads you could get going on a graphics card your would need  2GB video memory.   The most CPU-core equivalent performance you could get is 8 x.   

All of this assumes of course that the GPU threads are of equal performance to the CPU threads (which they are not).   So an 8-core CPU is still in the same neighborhood has a 2GB GPU using your trick to reduce memory usage.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 07, 2013, 10:23:26 pm
Nathaniel B...  I have been thinking about it and will tip you 0.5 BTC for your contribution to understanding this proof of work and because I specified a poor metric for evaluating whether or not the proof of work was 'broken'.   You have clearly helped.

Thank you.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: Nathaniel B on November 07, 2013, 10:42:43 pm
Thanks. I still think you're underestimating the issue. I think you should change it to have a large nonce space much larger than anyone could store in memory so that you have asymptotically linear scaling in terms of RAM.

Could you clarify what exactly you require in order to claim the full bounty?
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 08, 2013, 12:40:37 am
Thanks. I still think you're underestimating the issue. I think you should change it to have a large nonce space much larger than anyone could store in memory so that you have asymptotically linear scaling in terms of RAM.

Could you clarify what exactly you require in order to claim the full bounty?

Nathaniel,  to collect the full bounty I must be convinced that a momentum derived proof-of-work is not much better than SCRYPT at hindering an ASIC take over.    There is clearly a lot to learn about the structure and behavior of this class of proof-of-work.   From what I can tell it is still clearly the best proof-of-work system conceived of thus far.     

Let me repost an attack on this proof of work from bitcointalk for discussion here.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 08, 2013, 12:41:29 am
gigawatt:
Quote
I've managed to make a huge step forward in proving Momentum being not nearly as good for proof-of-work as intended.

The magic lies in using a Bloom filter to store the intermediate hashes.
As a result, instead of using 12 bytes per hash/nonce in a Semi-Ordered Map (which results in ~750 MB of memory), the required memory is (-1 * (2^26 * ln(0.01)) / (ln(2)^2)), or about ~76 MB.

This number can be reduced arbitrarily if we're willing to have a false positive rate greater than 1%.  For example, if we allowed up to a 50% chance of having a false positive, the memory requirement drops to ~11 MB.


Here's a overview of how the algorithm works:

Make a "main" bloom filter of size 2^26 with a false positive rate 1%: ~76 MB
Make a tiny "clash" bloom filter of size 2^16 and false positive rate 2^-26: ~0.7 MB
Make a vector of pairs< hash, nonce > to store candidate birthday collisions.
For each nonce in the search space, check if its hash exists in the "main" bloom filter.  If it is, add it's entry to the "clash" bloom filter.
The "main" bloom is no longer required and can be discarded.
For each nonce in the search check if its hash exists in the "clash" bloom filter.  If it does, add < hash, nonce > to a candidate list for investigation.
Sort the list of candidates by hash.
For each pair in the candidate list, see if the previous element has the same hash.  If it does, add it to the output list.  This step removes false positives by comparing the actual hash instead of the bloom filter's idea of a hash.
Return the output list as normal.


For your testing pleasure, I also built a working proof of concept.
(Most of the magic is right here.  The bloom filter is a modified version of "bloom.cpp" called "bigbloom.cpp")

Unmodified source: http://i.imgur.com/k2cNrmd.png

Source using bloom filters: http://i.imgur.com/w8Enf9e.png

In exchange for lowering the memory requirements by a factor of ~10, the algorithm runs at about 1/4th speed, mainly due to the doubling calls to SHA512 and the hash calls within the bloom filters.  The overall result is a net efficiency gain of approximately 2.5
The reduction in memory requirement means that if we could fit N instances of Momentum in GPU memory, we can instead fit 10*N instances.  If we up the false positive rate in exchange for more time spent sorting, we can achieve ratios of up to 70*N.

Given that bloom filters, SHA512, and sorting data are all parallel/GPU friendly, we can conclude that Momentum as a proof-of-work isn't nearly as GPU-hard as initially intended.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 08, 2013, 12:41:59 am
Quote
gigawatt,
     Thank you for providing the first innovative algorithm for reducing the memory requirements.  Let me attempt to post mitigating factors to your algorithm.

     From a CPU miner perspective, your reduction in memory comes at the expense of performance so does break the algorithmic complexity of the algorithm.

     From a GPU perspective you have to populate a bloom filter with 2^26 results... based upon my understanding of how bloom filters operate this would require updating a common data-structure from every thread and the resulting memory race conditions could create false negatives.   If you have to do this step sequentially, then you might as well use a CPU with memory.   

     So do you have any solid algorithms that can populate a bloom filter with 2^26 results in parallel? 
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 08, 2013, 12:43:36 am
My replies..
Quote
I suppose the performance of your algorithm would also suffer if instead of SHA512(X),  SCRYPT(X) was used because the cost of doing this step twice would be much higher and less GPU friendly.

Quote
I wonder what would happen if we used NESTED momentum proof of work? 

Change the nonce-space of the outer proof of work to a pair of 16 bit numbers that result in a X-bit collision?

Now you have a more memory intensive inner hash that is still quick to validate, but would significantly complicate GPU miners.

Note the reason we selected a fast hash like SHA512 rather than a slow hash like SCRYPT is to maximize memory bus bandwidth. 
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: gigawatt on November 08, 2013, 01:38:19 am
Quote
gigawatt,
     Thank you for providing the first innovative algorithm for reducing the memory requirements.  Let me attempt to post mitigating factors to your algorithm.

     From a CPU miner perspective, your reduction in memory comes at the expense of performance so does break the algorithmic complexity of the algorithm.

     From a GPU perspective you have to populate a bloom filter with 2^26 results... based upon my understanding of how bloom filters operate this would require updating a common data-structure from every thread and the resulting memory race conditions could create false negatives.   If you have to do this step sequentially, then you might as well use a CPU with memory.   

     So do you have any solid algorithms that can populate a bloom filter with 2^26 results in parallel? 

There's actually a few academic papers on the topic, mainly because bloom filters have lots of applications.

https://sites.google.com/site/venkateshnrao/projects/parallel-bloom-filter-implementation-on-gpu-using-cuda
https://sbs.cse.wustl.edu/pubs/mcbf11.pdf‎
http://www.gpucomputing.net/?q=node/13381
http://dl.acm.org/citation.cfm?id=1982363
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 08, 2013, 01:49:03 am
Quote
The results of the experiment in table 4 evoke some interesting discussion. If one excludes the preprocessing step, the
speed up is significant.

The preprocessing step however is an integral part of the
algorithm to port the Bloom Filter on to the GPU. Thus we
need to come up with better ways to preprocess a given set
of keys.

One more notable observation is that the actual
filter construction time and communication latency between
GPU and CPU is independent of the key size.

I haven't read the other papers... but so far it appears non-trivial.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: AnonyMint on November 08, 2013, 04:20:13 pm
The point of Scrypt was to make parallel operation difficult and its primary weakness is that validation time suffers as you scale it.

If you don't require that the first match in the sequence of generated pseudo-random numbers, then my upthread described parallelism attack works to allow GPUs to blow away CPUs, assuming the pseudo-random sequence can be generated much faster than the random memory latency bound on the CPU so the GPU can try 100s of values from the same pseudo-random sequence in parallel to hide the memory latency.

If you do require it, then you lose the fast validation time.

Do you get it now or you going to continue to ignore me?

I expended the time to explain this in more detail:

https://bitcointalk.org/index.php?topic=325261.msg3520992#msg3520992

I think that is the last effort I will apply to this.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: gigawatt on November 08, 2013, 04:41:45 pm
The point of Scrypt was to make parallel operation difficult and its primary weakness is that validation time suffers as you scale it.

If you don't require that the first match in the sequence of generated pseudo-random numbers, then my upthread described parallelism attack works to allow GPUs to blow away CPUs, assuming the pseudo-random sequence can be generated much faster than the random memory latency bound on the CPU so the GPU can try 100s of values from the same pseudo-random sequence in parallel to hide the memory latency.

If you do require it, then you lose the fast validation time.

Do you get it now or you going to continue to ignore me?

I expended the time to explain this in more detail:

https://bitcointalk.org/index.php?topic=325261.msg3520992#msg3520992

I think that is the last effort I will apply to this.

Even if you allow that a GPU is "leaps and bounds" faster than CPUs, you still run into major problems.

The time complexity of searching with a CPU using memory lookups is O(n), whereas brute-forcing combinations with a GPU is still O(n^2).
Regardless how much faster a GPU is, it's faster by a constant multiple, but the number of calculations is almost squared.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: AnonyMint on November 08, 2013, 04:45:07 pm
The point of Scrypt was to make parallel operation difficult and its primary weakness is that validation time suffers as you scale it.

If you don't require that the first match in the sequence of generated pseudo-random numbers, then my upthread described parallelism attack works to allow GPUs to blow away CPUs, assuming the pseudo-random sequence can be generated much faster than the random memory latency bound on the CPU so the GPU can try 100s of values from the same pseudo-random sequence in parallel to hide the memory latency.

If you do require it, then you lose the fast validation time.

Do you get it now or you going to continue to ignore me?

I expended the time to explain this in more detail:

https://bitcointalk.org/index.php?topic=325261.msg3520992#msg3520992

I think that is the last effort I will apply to this.

Even if you allow that a GPU is "leaps and bounds" faster than CPUs, you still run into major problems.

The time complexity of searching with a CPU using memory lookups is O(n), whereas brute-forcing combinations with a GPU is still O(n^2).
Regardless how much faster a GPU is, it's faster by a constant multiple, but the number of calculations is almost squared.

bytemaster wrote that upthread but I don't understand why you say so? He never answered.

The GPU is using the same amount of memory and the same number sequence. It is searching the same thing.

Why do you claim that the GPU has a squared time complexity? I see no basis whatsoever for that claim.

I have only described how to speed up the search of the same sequence and memory as the CPU is using, by employing more simultaneous lookups.

I must assume both of you are not reading carefully. I am not writing about replacing any memory storage with computation. I made that clear from my very first post. You are apparently not taking the time to understand what I wrote.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: gigawatt on November 08, 2013, 05:28:01 pm
It looks like there's a major flaw with birthday collisions as a proof of work:

For any hash algorithm of length L bits, a collision can be found in 2^(L/2) + 1 computations using a constant amount of memory using a cycle detection algorithm (https://en.wikipedia.org/wiki/Cycle_detection).  Given that Momentum uses a 50-bit hash, that means that a collision can be found in 2^25 + 1 steps which 1) is, ironically, less than MAX_MOMENTUM_NONCE, and 2) because it relies almost exclusively on SHA512 calls, can be made massively parallel.
Not only does it require fewer steps, it can also be made stupid parallel.

In fact, there's a whole section of the book "Introduction to Modern Cryptography" dedicated to the subject of collision attacks.  An errata referring to the section can be found here: http://www.cs.umd.edu/~jkatz/imc/hash-erratum.pdf (contains mathematical proofs of the claims listed above)

Additional discussion on the matter can be found here: http://crypto.stackexchange.com/questions/3295/how-does-a-birthday-attack-on-a-hashing-algorithm-work

The fact that the algorithm uses a constant amount of memory regardless of the hash output size means that no matter how large MAX_MOMENTUM_NONCE or SEARCH_SPACE_BITS values are, the memory requirements are essentially zero.  This means that no matter how slow the algorithm, it's infinitely more efficient (memory vs collision rate) than the current implementation.
Given that it requires a constant amount of memory, Momentum as a proof-of-work is essentially equivalent to scrypt/md5/sha512 on a higher difficulty.


To reinforce these claims, I'm actively working on a proof of concept that uses constant memory and almost nothing but SHA512 calls.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: AnonyMint on November 08, 2013, 05:43:46 pm
It looks like there's a major flaw with birthday collisions as a proof of work:

For any hash algorithm of length L bits, a collision can be found in 2^(L/2) + 1 computations using a constant amount of memory using a cycle detection algorithm (https://en.wikipedia.org/wiki/Cycle_detection).  Given that Momentum uses a 50-bit hash, that means that a collision can be found in 2^25 + 1 steps which 1) is, ironically, less than MAX_MOMENTUM_NONCE, and 2) because it relies almost exclusively on SHA512 calls, can be made massively parallel.
Not only does it require fewer steps, it can also be made stupid parallel.

In fact, there's a whole section of the book "Introduction to Modern Cryptography" dedicated to the subject of collision attacks.  An errata referring to the section can be found here: http://www.cs.umd.edu/~jkatz/imc/hash-erratum.pdf (contains mathematical proofs of the claims listed above)

Additional discussion on the matter can be found here: http://crypto.stackexchange.com/questions/3295/how-does-a-birthday-attack-on-a-hashing-algorithm-work

The fact that the algorithm uses a constant amount of memory regardless of the hash output size means that no matter how large MAX_MOMENTUM_NONCE or SEARCH_SPACE_BITS values are, the memory requirements are essentially zero.  This means that no matter how slow the algorithm, it's infinitely more efficient (memory vs collision rate) than the current implementation.
Given that it requires a constant amount of memory, Momentum as a proof-of-work is essentially equivalent to scrypt/md5/sha512 on a higher difficulty.


To reinforce these claims, I'm actively working on a proof of concept that uses constant memory and almost nothing but SHA512 calls.

This is essentially taking advantage of the same point I made, which is there are numerous solution candidates in each sequence, yet employing an mathematical insight on detecting duplicate candidates method instead of leveraging the GPU's memory and thread advantage with parallelism.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 08, 2013, 07:13:15 pm
Wow great work guys!   I am taking some time to digest it all and really wish the bounty would have been taken more seriously prior to launch.

Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: AnonyMint on November 08, 2013, 07:25:53 pm
Wow great work guys!   I am taking some time to digest it all and really wish the bounty would have been taken more seriously prior to launch.

I didn't know about it. You could have PM'ed me.  I had told you in the past I was researching CPU-only.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 08, 2013, 07:31:34 pm
Wow great work guys!   I am taking some time to digest it all and really wish the bounty would have been taken more seriously prior to launch.
Quote
If the input is given as a subroutine for calculating ƒ, the cycle detection problem may be trivially solved using only λ+μ function applications, simply by computing the sequence of values xi and using a data structure such as a hash table to store these values and test whether each subsequent value has already been stored. However, the space complexity of this algorithm is λ+μ, unnecessarily large. Additionally, to implement this method as a pointer algorithm would require applying the equality test to each pair of values, resulting in quadratic time overall. Thus, research in this area has concentrated on two goals: using less space than this naive algorithm, and finding pointer algorithms that use fewer equality tests.

So assuming an ASIC can implement f(x) very quickly... then this algorithm can be reduced to 2 memory locations using something like the tortoise and hare algorithm. 

There is one thing it seems to require and that is mapping an input space to itself.   In the case of momentum the input space does not fit this criteria.   Nonces are 20 bit, birthdays are 50 bits.   But a birthday is not just a SHA512(nonce), but SHA512( nonce + header ) and this generates 8+ birthdays which would not have the property of mapping back to itself in a cycle.

Let me look into some of the other techniques and their requirements.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: AnonyMint on November 08, 2013, 08:28:37 pm
bytemaster, hope you saw my response. I think you underestimated the issue:

https://bitcointalk.org/index.php?topic=325261.msg3523201#msg3523201
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 08, 2013, 08:30:51 pm
Quote
The fact that the algorithm uses a constant amount of memory regardless of the hash output size means that no matter how large MAX_MOMENTUM_NONCE or SEARCH_SPACE_BITS values are, the memory requirements are essentially zero.  This means that no matter how slow the algorithm, it's infinitely more efficient (memory vs collision rate) than the current implementation.
Given that it requires a constant amount of memory, Momentum as a proof-of-work is essentially equivalent to scrypt/md5/sha512 on a higher difficulty.

Quote
Now, if H is a random function on an m-element set, then, by the birthday paradox, the expected number of steps
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 08, 2013, 09:10:12 pm
bytemaster, hope you saw my response. I think you underestimated the issue:

https://bitcointalk.org/index.php?topic=325261.msg3523201#msg3523201

Integrated Graphics can saturate its memory bus (60 GB/Sec on latest hardware) using the same techniques as hyper-threading to perform parallel execution.  GPUs can get 5x faster memory performance, though I think in this particular use case moving more than 64 bytes (cache line size) per cycle doesn't help due to the random access nature.   In fact, it may hurt the GPU by being forced to consume its memory bandwidth in larger than necessary chunks when it is randomly accessed.

I think it is fair to say that performance of the memory-based approach is limited by memory bandwidth and that 'in theory' a GPU could have 260 Gbs while high end desktops can get 60 Gbs and low end systems get 10 Gbs.  So there is a 26x difference between average CPU based systems and high end graphics cards.

By the way, I am very much considering paying the bounty based upon information provided thus far.  I just need to understand the degree to which Momentum has been harmed by the algorithms mentioned. 

1) What is the performance of SCRYPT on CPU vs GPU? 
2) Does momentum have a lower performance ratio than SCRYPT?
3) When it comes to making an ASIC, which would be harder SCRYPT or Momentum?

As far as how to divided the bounty among the various contributors I am thinking that AnonyMint and gigawatt have both provided some reasonable contributions.   I will wait to see gigawatt's proof-of-concept implementation to be sure there is a cycle and that the computational time is not larger than available parallelism before making the final decision about relative payout.   Gigawatts proof-of-concept shows more time and energy invested and is also more compelling than theoretical estimations of performance gains provided by AnonyMint.  Clearly writing code is of greater value than writing forum posts and also more definitive.

Summary of Algorithms Presented
Bloom Filter to Reduce Memory  -  comes at performance cost, may be able to perform in parallel with R&D which leads to GPUs...
Memory Bandwidth of GPU arch vs CPU arch - gives a potential 4 to 24x advantage to GPU based upon specs alone, must be discounted by overhead of accessing memory in larger chunks than necessary and/or bloom filter and/or different time complexity sort/search algorithms
Constant-Memory Cycle Search Algorithms - trade CPU for Memory in a manner that might make parallelism free from memory bus constraints.  Must prove we have cycles and that calculation time does not exceed  level of parallelism. 

Good work everyone, it is a pleasure debating this with you all.  I hope you recognize that I am in this to find the best algorithm and NOT to defend my ego and put my head in the sand.   This bounty and ProtoShares has brought out great minds and we sharpen one another. 



 
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: iruu on November 08, 2013, 09:58:13 pm
The cycle algorithm idea is wrong, there's no cycle in the 50 bit hash. The cycle is in 512 bit sha output, but that's equivalent to finding sha512 collisions.

Regardless of birthday algorithm, generating 2^23 sha512 on radeon 5870 (no oc) takes 180ms. 64 bit values and 1024 block makes parallelism on current compute units hard. That's close to output of new intel cpu. So gpus aren't going to win by a big margin, like with bitcoin's sha256. 
The collision finding time, even on a cpu, is almost insignificant, 2^26 is a small value. 
I'm writing gpu miner as an opencl learning experience. Looks like finding all birthdays in 2^26 hashes is going to take 200-400ms on radeon 5870.

edit: 70ms for sha512, so not that bad. (optimizing opencl is completely different from cpu, and I'm not talking about obvious sequential vs parallel differences), time for the harder part.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 08, 2013, 10:22:44 pm
The cycle algorithm idea is wrong, there's no cycle in the 50 bit hash. The cycle is in 512 bit sha output, but that's equivalent to finding sha512 collisions.

Regardless of birthday algorithm, generating 2^23 sha512 on radeon 5870 (no oc) takes 180ms. 64 bit values and 1024 block makes parallelism on current compute units hard. That's close to output of new intel cpu. So gpus aren't going to win by a big margin, like with bitcoin's sha256. 
The collision finding time, even on a cpu, is almost insignificant, 2^26 is a small value. 
I'm writing gpu miner as an opencl learning experience. Looks like finding all birthdays in 2^26 hashes is going to take 200-400ms on radeon 5870.

iruu, thank you for joining the discussion.  You confirmed my belief that there were no cycles and that the GPU would have a hard time in practice doing this.  The whole theory/practice thing is what makes this whole discussion so hard.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: AnonyMint on November 08, 2013, 10:47:15 pm
Some high-end CPUs don't have AGPUs. Look at the Bitcointalk thread for your protoshares launch, you see many Xeons and high-end i7. The server chips don't have AGPUs.

Integrated Graphics can saturate its memory bus (60 GB/Sec on latest hardware)

Is that limited amount of sideport memory or the full main DRAM?

The fastest I'd seen was the 40 GB/sec for Ivy-E and those don't have AGPUs.

On Haswell, Ivy, Sandy with AGPU I saw roughly 20 GB/sec, but I wasn't clear if the AGPU can attain that since it has worse latency to main memory than the CPU.

though I think in this particular use case moving more than 64 bytes (cache line size) per cycle doesn't help due to the random access nature.   In fact, it may hurt the GPU by being forced to consume its memory bandwidth in larger than necessary chunks when it is randomly accessed.

If you have enough parallel threads they will to some extent statistically clump in locality of memory. I haven't attempted to quantify that effect.

Hashtables would point to arrays of items in the bucket.

So there is a 26x difference between average CPU based systems and high end graphics cards.

Rough estimate agreed. On Haswell or latest, maybe half that.

Then higher if no AGPU integrated.

1) What is the performance of SCRYPT on CPU vs GPU? 

From the Litecoin Wiki, I seem to remember roughly 5 - 10 times faster for high-end GPU compared i7. Compared to roughly 100x faster for Bitcoin. You can double-check.

2) Does momentum have a lower performance ratio than SCRYPT?

Appears to be worse than Litecoin. But you gain fast verification, so for you that might be the correct tradeoff, especially if integrated graphics will be improving on CPUs (so you might reach parity with Litecoin or better in future) and the ASIC risk is not higher than Scrypt (or if you are not so worried about ASIC near-term).

3) When it comes to making an ASIC, which would be harder SCRYPT or Momentum?

I haven't analyzed this for Momentum. I have analyzed it for Scrypt. I can comment after others have summarized if I have something to add.

As far as how to divided the bounty among the various contributors I am thinking that AnonyMint and gigawatt have both provided some reasonable contributions.   I will wait to see gigawatt's proof-of-concept implementation to be sure there is a cycle and that the computational time is not larger than available parallelism before making the final decision about relative payout.   Gigawatts proof-of-concept shows more time and energy invested and is also more compelling than theoretical estimations of performance gains provided by AnonyMint.  Clearly writing code is of greater value than writing forum posts and also more definitive.

I have no problem with that. I am not expecting anything, but I think it does help you in the community to be fair and reward help. But I see we have a new comment claiming that Gigawatts vulnerability is not valid. I would go slowly and be sure first.

Good work everyone, it is a pleasure debating this with you all.  I hope you recognize that I am in this to find the best algorithm and NOT to defend my ego and put my head in the sand.   This bounty and ProtoShares has brought out great minds and we sharpen one another.

I appreciate your clarification. I know you were sick the prior day, because you mentioned it.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: AnonyMint on November 08, 2013, 10:52:34 pm
The whole theory/practice thing is what makes this whole discussion so hard.

The additional threads masking latency you have already the evidence of your own experience with hyperthreads, and also you can see this in analyzing the reports of cpuminer settings with Litecoin on the GPU.

On cpuminer, the "lookup gap" is the trading off memory for computation. The other main setting is the number of threads to run.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: AnonyMint on November 09, 2013, 09:30:06 pm
I am looking for algorithmic weaknesses in the Momentum Proof of Work: http://bitsharestalk.org/index.php?topic=21.msg24#msg24 

If you can identify an alternative algorithm for solving the proof of work that does not require the target RAM and yet is able to find hashes at a rate greater than  MomentumAlgoHashRate / (TARGET_RAM_USE/ALT_RAM_USE)  then you could win this bounty.

For example:  Suppose momentum as specked requires about 1 GB of RAM for maximum performance and produces hashes at 1hz.   If you design an algorithm that only requires 1 MB of RAM and can find hashes at .001 hz then you win because you will have proven that computational complexity is linear with RAM use rather than non-linear like I believe the problem calls for.   
      I paid out a tip for showing this property doesn't quite hold as I thought, but this was more a weakness in my devising an effective / objective measure for determining whether or not momentum is "broken' or 'hacked'.

For the record, I did not receive any tip.

1) What is the performance of SCRYPT on CPU vs GPU? 

From the Litecoin Wiki, I seem to remember roughly 5 - 10 times faster for high-end GPU compared i7. Compared to roughly 100x faster for Bitcoin. You can double-check.

https://litecoin.info/Comparison_between_Litecoin_and_Bitcoin
https://litecoin.info/Mining_hardware_comparison
https://en.bitcoin.it/wiki/Mining_hardware_comparison

If we compare the i7 2600 both on pooler's cpuminer, then ratio to the HD7970 is roughly 30 for Bitcoin and 15 for Litecoin. So Litecoin is only a 2x improvement over Bitcoin, not the order-of-magnitude I had indicated.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 09, 2013, 11:57:36 pm
I am looking for algorithmic weaknesses in the Momentum Proof of Work: http://bitsharestalk.org/index.php?topic=21.msg24#msg24 

If you can identify an alternative algorithm for solving the proof of work that does not require the target RAM and yet is able to find hashes at a rate greater than  MomentumAlgoHashRate / (TARGET_RAM_USE/ALT_RAM_USE)  then you could win this bounty.

For example:  Suppose momentum as specked requires about 1 GB of RAM for maximum performance and produces hashes at 1hz.   If you design an algorithm that only requires 1 MB of RAM and can find hashes at .001 hz then you win because you will have proven that computational complexity is linear with RAM use rather than non-linear like I believe the problem calls for.   
      I paid out a tip for showing this property doesn't quite hold as I thought, but this was more a weakness in my devising an effective / objective measure for determining whether or not momentum is "broken' or 'hacked'.

For the record, I did not receive any tip.

You are right... I owe you that tip... sorry I have been very busy  can you send me a BTC address.  Thanks
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: gigawatt on November 10, 2013, 05:55:46 am
You are right... I owe you that tip... sorry I have been very busy  can you send me a BTC address.  Thanks

Ditto. (https://bitcointalk.org/index.php?topic=313479.msg3522649#msg3522649)  ;D



In other news, I think I've finally, finally, actually done it (again).
As we discussed previously, it looks like using a cycle-detection algorithm is no-go because the hash doesn't map to itself.

However, what does work is making a pseudo-rainbow table in memory, on the fly.
The result: the ratio of memory used vs search rate remains exactly the same, but we can scale the size arbitrarily.  For example, instead of taking 768 MB memory and 1 minute per search, we can use (768 MB / X) memory and X minutes per search.

This sounds useless at first, but consider the following: given that we can arbitrarily scale the memory requirements, the only limiting factor how fast we can find collisions is our brute computing power.  What's good at brute computing power?  GPUs and ASICs.


Here's my proof:

Overview:
https://en.wikipedia.org/wiki/Rainbow_table
Make a "rainbow table" map to hold the equivalent hash space of SEARCH_SPACE
  For chains of length CHAIN_LENGTH,
    rainbow table needs (SEARCH_SPACE / CHAIN_LENGTH) elements [name: RAINBOW_ENTRIES]
  Rainbow table will require (8 bytes (hash) + 4 bytes (nonce)) * RAINBOW_ENTRIES bytes total
To compress each SEARCH_SPACE_BITS hash to a new nonce in the range MAX_MOMENTUM_NONCE,
  simply use (hash % MAX_MOMENTUM_NONCE)

Pseudocode:
For the first RAINBOW_ENTRIES
  nonce = element number
  For i = 1 to CHAIN_LENGTH
    cur_hash = HASH(nonce)
    Check if any part of cur_hash is in rainbow table
      If yes and generated by different nonce: add to result list
    For each chunk in cur_hash
      rainbow_map[chunk] = element number
    nonce = compress(cur_hash)

For the next (SEARCH_SPACE - RAINBOW_ENTRIES) elements:
  nonce = element number
  For i = 1 to CHAIN_LENGTH
    cur_hash = Hash(nonce)
    Check if any part of cur_hash is in rainbow table
      If yes and generated by different nonce: add to result list
    nonce = compress(cur_hash)

return result list


// MATH SECTION /////
Given:
CALLS(CHAIN_LENGTH) = total calls to SHA512 using a rainbow table with chain length CHAIN_LENGTH
CALLS(CHAIN_LENGTH) = (SEARCH_SPACE / CHAIN_LENGTH) + (SEARCH_SPACE * CHAIN_LENGTH) - (SEARCH_SPACE / CHAIN_LENGTH)
  + (SEARCH_SPACE / CHAIN_LENGTH) = Calls to generate rainbow table
  + (SEARCH_SPACE * CHAIN_LENGTH) = Calls to check SEARCH_SPACE nonce values
  - (SEARCH_SPACE * CHAIN_LENGTH) = Except we skip the nonces used to make the rainbow table
CALLS(CHAIN_LENGTH) = (SEARCH_SPACE * CHAIN_LENGTH)

Given:
MEMORY(CHAIN_LENGTH) = number of bytes required to store a rainbow table representing
                         SEARCH_SPACE elements with chain length CHAIN_LENGTH
MEMORY(CHAIN_LENGTH) = rainbow table elements * (size of hash chunk + size of nonce)
MEMORY(CHAIN_LENGTH) = (8 + 4) * SEARCH_SPACE / CHAIN_LENGTH

Examples:
Minimum number of SHA512 calls is CHAIN_LENGTH = 1 (equivalent to the "default" semiordered map setup)
  Requires (SEARCH_SPACE * 12) = 768 MB memory
  Requires (SEARCH_SPACE) calls to SHA512
For CHAIN_LENGTH = N
  Requires (SEARCH_SPACE * 12 / N) bytes of memory
  Requires (SEARCH_SPACE * N) calls to SHA512
For CHAIN_LENGTH = 1024
  Requires (2^26 * 12 / 1024) = 768 KB of memory
  Requires (2*26 * 1024) calls to SHA512

Bottom line:
Using a rainbow table map of chain length N,
  memory usage is multiplied by a factor of 1/N
  and SHA512 calls are multiplied by a factor of N
This means that regardless the CHAIN_LENGTH selected,
  the ratio of CALLS(CHAIN_LENGTH) to MEMORY(CHAIN_LENGTH)
  is inversely proportional.
EFFICIENCY IS CONSTANT AT ANY MEMORY TARGET.
All that matters is brute computational strength.

Example:
If a GPU/ASIC/FPGA has PROCESSOR number of parallel computing units,
  and MEM_MAX bytes of available memory, then each
  Momentum hash instance can use up to (MEM_MAX / PROCESSOR) bytes of memory.
To calculate the CHAIN_LEN to acheive this:
  CHAIN_LEN(MEM_MAX, PROCESSOR) = (8 + 4) * SEARCH_SPACE / (MEM_MAX / PROCESSOR)
  CHAIN_LEN(MEM_MAX, PROCESSOR) = (8 + 4) * SEARCH_SPACE * PROCESSOR / MEM_MAX
  Note: This calculation is just the inverse of MEMORY(CHAIN_LEN)

Assuming: GPU with 128 compute units, 1 GB memory, and SEARCH_SPACE = MAX_MOMENTUM_NONCE
  CHAIN_LEN(2^30, 2^7) = (8 + 4) * SEARCH_SPACE * PROCESSOR / MEM_MAX
  CHAIN_LEN(2^30, 2^7) = (8 + 4) * SEARCH_SPACE * 2^7 / 2^30
  CHAIN_LEN(2^30, 2^7) = 12 * SEARCH_SPACE / 2^23
  CHAIN_LEN(2^30, 2^7) = 12 * 2^26 / 2^23
  CHAIN_LEN(2^30, 2^7) = 12 * 2^3
  CHAIN_LEN(2^30, 2^7) = 96
  Verification:
  MEMORY(CHAIN_LENGTH) = (8 + 4) * SEARCH_SPACE / CHAIN_LENGTH
  MEMORY(96) = (8 + 4) * 2^26 / 96
  MEMORY(96) = 8388608 bytes
  MEMORY(96) = 8 MB
  CALLS(CHAIN_LENGTH) = (SEARCH_SPACE * CHAIN_LENGTH)
Result: 1/96th the normal memory usage of 768 MB, 96x the SHA512 calls.
  The problem is no longer about memory, it's about computation speed.



I've actually got a somewhat working proof of concept, although it's quite sub-optimal.  Right now the main issue is using the overkill boost::unordered_map for storage instead of a more bespoke memory structure.  The boost implementation likes to decide it needs to redo the bucket arrangement from time to time, resulting in slowdowns.

It also has an unholy number of debug statements.

If you'd like to check on my progress, the source is right here (http://pastebin.com/VTCMFQXg).  It's a drop-in replacement for "momentum.cpp".



Edit: There's other optimizations possible when using the time/memory tradeoff.  For example, if we normally took 1 minute per nonce space search, cranking up the time/memory ratio to 16 would mean we'd expect 16 minutes per search, which is considerably longer than the target time per block.  To compensate, we can have the algorithm return immediately on first collision discovery.  When a new block is detected, all the miner threads should be stopped and recreated.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 10, 2013, 06:30:51 am
I don't have time at the moment to go through your method, but does momentum_verify(...) work on your results?

Quote
This sounds useless at first, but consider the following: given that we can arbitrarily scale the memory requirements, the only limiting factor how fast we can find collisions is our brute computing power.  What's good at brute computing power?  GPUs and ASICs.

This may be true for bandwidth of calculations, but what of latency?    Suppose you turn this from a 1 minute search into a 768 minute search... if you perform it a billion times in parallel you will have amazing bandwidth, but terrible latency.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: gigawatt on November 10, 2013, 07:22:51 am
I don't have time at the moment to go through your method, but does momentum_verify(...) work on your results?

I'm working out a few kinks right now.  It said it found a hit earlier, but I accidentally had it dump the wrong info, so I had no way to verify.

This may be true for bandwidth of calculations, but what of latency?    Suppose you turn this from a 1 minute search into a 768 minute search... if you perform it a billion times in parallel you will have amazing bandwidth, but terrible latency.

Absolutely correct.  I edited my main post to reflect handling that.
The simple solution: as soon as you find any birthday collisions, return immediately.  If it found a hit 2 seconds in, spending the next ~750 minutes to finish off the run before returning makes the collisions stale, regardless how many were found.

The second is allowing the parent thread to terminate the children whenever a new block is found.  This is already handled in the main ProtoShares branch since commit 3dfc3cc33 (https://github.com/InvictusInnovations/ProtoShares/commit/3dfc3cc33cff63b184a24f60ffbee7e6acb546b6).

Given both of these capabilities, latency is no longer an issue.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: gigawatt on November 10, 2013, 08:48:14 am
I'm happy to report that it works like a charm.
I've updated the source code. (http://pastebin.com/FJMb2hL8)
Feel free to test it for yourself.


It correctly finds nonce collisions and follows all the predictions made in my earlier post.
I've conclusively demonstrated that memory requirements can be scaled arbitrarily while retaining an equivalent memory-to-computation ratio.  Latency has been made a non-issue due to immediate return on collision detection and external thread termination.

Here's a snippet of what the output looks like using CHAIN_LENGTH = 4:

Step 31 of 128...
  HITS: 743672
  MISS: 1353480
 TOTAL: 1966081
   PCT: 35
Step 32 of 128...
  HITS: 760208
  MISS: 1336944
 TOTAL: 2031617
   PCT: 36
RAINBOW HIT
  SEEK HASH: 423473256423089
 ORIG NONCE: 28524699
 RAIN NONCE: 5439512 + 1
         0: 203104387400228
         1: 423473256423089
         2: 1041792957055939
         3: 780112116212973
         4: 629783454762940
         5: 675829228796259
         6: 738539531672861
         7: 577969205224283
NAILED IT! Nonce: 28524696 + 3
         0: 959239184726270
         1: 186912846539772
         2: 633556876464486
         3: 423473256423089
         4: 553069074389047
         5: 746169572180730
         6: 539623141106693
         7: 588368479534609
Checking Work:
    Nonce A: 5439513
    Nonce B: 28524699
     Hash A: 423473256423089
     Hash B: 423473256423089



And the view from the outside: http://i.imgur.com/rMUKK5t.png
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: iruu on November 10, 2013, 09:20:16 am
So what are the precise terms of the $5000 bounty now? So that there're no doubts whether something meets it or not. 

Let's say a gpu miner uses x amount of memory and searches the 2^23 sha space in y seconds, on a modern graphics card. What're the x and y for the bounty? 
Or is memory unimportant, just that it can work effectively on a gpu? Effectively defined how?
Or is it the relation? When memory use is cut in half, speed drops less than a half (ie. relation is lower than linear). 

Can a proof of concept use simpler hash functions than sha512, if the algorithm is hash algorithm-agnostic? 

Or is it off?
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: barwizi on November 10, 2013, 09:31:38 am
I'm happy to report that it works like a charm.
I've updated the source code. (http://pastebin.com/FJMb2hL8)
Feel free to test it for yourself.


It correctly finds nonce collisions and follows all the predictions made in my earlier post.
I've conclusively demonstrated that memory requirements can be scaled arbitrarily while retaining an equivalent memory-to-computation ratio.  Latency has been made a non-issue due to immediate return on collision detection and external thread termination.

Here's a snippet of what the output looks like using CHAIN_LENGTH = 4:

Step 31 of 128...
  HITS: 743672
  MISS: 1353480
 TOTAL: 1966081
   PCT: 35
Step 32 of 128...
  HITS: 760208
  MISS: 1336944
 TOTAL: 2031617
   PCT: 36
RAINBOW HIT
  SEEK HASH: 423473256423089
 ORIG NONCE: 28524699
 RAIN NONCE: 5439512 + 1
         0: 203104387400228
         1: 423473256423089
         2: 1041792957055939
         3: 780112116212973
         4: 629783454762940
         5: 675829228796259
         6: 738539531672861
         7: 577969205224283
NAILED IT! Nonce: 28524696 + 3
         0: 959239184726270
         1: 186912846539772
         2: 633556876464486
         3: 423473256423089
         4: 553069074389047
         5: 746169572180730
         6: 539623141106693
         7: 588368479534609
Checking Work:
    Nonce A: 5439513
    Nonce B: 28524699
     Hash A: 423473256423089
     Hash B: 423473256423089



And the view from the outside: http://i.imgur.com/rMUKK5t.png

so this is for cpu? how to compile for windows?
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 10, 2013, 10:39:23 am
I'm happy to report that it works like a charm.
I've updated the source code. (http://pastebin.com/FJMb2hL8)
Feel free to test it for yourself.


It correctly finds nonce collisions and follows all the predictions made in my earlier post.
I've conclusively demonstrated that memory requirements can be scaled arbitrarily while retaining an equivalent memory-to-computation ratio.  Latency has been made a non-issue due to immediate return on collision detection and external thread termination.

Here's a snippet of what the output looks like using CHAIN_LENGTH = 4:

Step 31 of 128...
  HITS: 743672
  MISS: 1353480
 TOTAL: 1966081
   PCT: 35
Step 32 of 128...
  HITS: 760208
  MISS: 1336944
 TOTAL: 2031617
   PCT: 36
RAINBOW HIT
  SEEK HASH: 423473256423089
 ORIG NONCE: 28524699
 RAIN NONCE: 5439512 + 1
         0: 203104387400228
         1: 423473256423089
         2: 1041792957055939
         3: 780112116212973
         4: 629783454762940
         5: 675829228796259
         6: 738539531672861
         7: 577969205224283
NAILED IT! Nonce: 28524696 + 3
         0: 959239184726270
         1: 186912846539772
         2: 633556876464486
         3: 423473256423089
         4: 553069074389047
         5: 746169572180730
         6: 539623141106693
         7: 588368479534609
Checking Work:
    Nonce A: 5439513
    Nonce B: 28524699
     Hash A: 423473256423089
     Hash B: 423473256423089



And the view from the outside: http://i.imgur.com/rMUKK5t.png

Wow, this looks good.... I need to understand it a bit better first.    You have come up with a new algorithm, how many HPM do you get with it on a CPU and how much RAM are you using?
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: digitalindustry on November 10, 2013, 03:24:50 pm
I'm happy to report that it works like a charm.
I've updated the source code. (http://pastebin.com/FJMb2hL8)
Feel free to test it for yourself.


It correctly finds nonce collisions and follows all the predictions made in my earlier post.
I've conclusively demonstrated that memory requirements can be scaled arbitrarily while retaining an equivalent memory-to-computation ratio.  Latency has been made a non-issue due to immediate return on collision detection and external thread termination.
I'm happy to report that it works like a charm.
I've updated the source code. (http://pastebin.com/FJMb2hL8)
Feel free to test it for yourself.


It correctly finds nonce collisions and follows all the predictions made in my earlier post.
I've conclusively demonstrated that memory requirements can be scaled arbitrarily while retaining an equivalent memory-to-computation ratio.  Latency has been made a non-issue due to immediate return on collision detection and external thread termination.

Here's a snippet of what the output looks like using CHAIN_LENGTH = 4:

Step 31 of 128...
  HITS: 743672
  MISS: 1353480
 TOTAL: 1966081
   PCT: 35
Step 32 of 128...
  HITS: 760208
  MISS: 1336944
 TOTAL: 2031617
   PCT: 36
RAINBOW HIT
  SEEK HASH: 423473256423089
 ORIG NONCE: 28524699
 RAIN NONCE: 5439512 + 1
         0: 203104387400228
         1: 423473256423089
         2: 1041792957055939
         3: 780112116212973
         4: 629783454762940
         5: 675829228796259
         6: 738539531672861
         7: 577969205224283
NAILED IT! Nonce: 28524696 + 3
         0: 959239184726270
         1: 186912846539772
         2: 633556876464486
         3: 423473256423089
         4: 553069074389047
         5: 746169572180730
         6: 539623141106693
         7: 588368479534609
Checking Work:
    Nonce A: 5439513
    Nonce B: 28524699
     Hash A: 423473256423089
     Hash B: 423473256423089



And the view from the outside: http://i.imgur.com/rMUKK5t.png

so this is for cpu? how to compile for windows?

Here's a snippet of what the output looks like using CHAIN_LENGTH = 4:

Step 31 of 128...
  HITS: 743672
  MISS: 1353480
 TOTAL: 1966081
   PCT: 35
Step 32 of 128...
  HITS: 760208
  MISS: 1336944
 TOTAL: 2031617
   PCT: 36
RAINBOW HIT
  SEEK HASH: 423473256423089
 ORIG NONCE: 28524699
 RAIN NONCE: 5439512 + 1
         0: 203104387400228
         1: 423473256423089
         2: 1041792957055939
         3: 780112116212973
         4: 629783454762940
         5: 675829228796259
         6: 738539531672861
         7: 577969205224283
NAILED IT! Nonce: 28524696 + 3
         0: 959239184726270
         1: 186912846539772
         2: 633556876464486
         3: 423473256423089
         4: 553069074389047
         5: 746169572180730
         6: 539623141106693
         7: 588368479534609
Checking Work:
    Nonce A: 5439513
    Nonce B: 28524699
     Hash A: 423473256423089
     Hash B: 423473256423089



And the view from the outside: http://i.imgur.com/rMUKK5t.png

so this is for cpu? how to compile for windows?

^
This is barwizi hating life lol , I think ill ramp up the core 2 duo barwizi...
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: digitalindustry on November 10, 2013, 03:29:29 pm
Seriously though I have no idea how to compile it anyhow, but if it works I guess its a share changer. 
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: Proto on November 10, 2013, 04:18:56 pm
I tested gigawatts implementation.

14.33 collisionspermin instead of 180 collisionspermin stock client, 32 threads.

The point is that gigawatts algorithm could yield a GPU implementation which could be a 100x faster version of a slower algorithm, with a net gain.

But I will try to change the memory requirements and see what happens on the CPU if we bring it down to using very little memory.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 10, 2013, 06:42:53 pm
I tested gigawatts implementation.

14.33 collisionspermin instead of 180 collisionspermin stock client, 32 threads.

The point is that gigawatts algorithm could yield a GPU implementation which could be a 100x faster version of a slower algorithm, with a net gain.

But I will try to change the memory requirements and see what happens on the CPU if we bring it down to using very little memory.

Here is my GPU algorithm...

Calculate SHA512((2^26)/8) times and store it in memory... very GPU friendly.
GPU SORT( DATA )
PARALLEL COMPARE LEFT

Over all it should be simple and efficient and faster than CPU...  though it still requires the full memory, it would be much lower latency(better).   
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: gigawatt on November 10, 2013, 07:25:07 pm
I tested gigawatts implementation.

14.33 collisionspermin instead of 180 collisionspermin stock client, 32 threads.

The point is that gigawatts algorithm could yield a GPU implementation which could be a 100x faster version of a slower algorithm, with a net gain.

But I will try to change the memory requirements and see what happens on the CPU if we bring it down to using very little memory.

What were you using for CHAIN_LENGTH?

I've been running a test with CHAIN_LENGTH = 2 on a 1 CPU/1GB test box overnight and got this:
http://i.imgur.com/wknTeLb.png

If I try to run the default implementation, it crashes because it requires too much memory.  However, on a 2 CPU/2 GB test box running only one thread, I get around 2.2 hpm.  However, I need to rerun these tests because the hashperminute calculation was updated with a recent commit (https://github.com/InvictusInnovations/ProtoShares/commit/776832f419649211cb7b173ffb437e7d2af68b8b).


I've also updated the source code: http://pastebin.com/VTCMFQXg
It now uses the thread interruption from the recent commit and makes debug messages optional.


To compile: Pull the source from the main github branch, then take my code and overwrite momentum.cpp with it.  Compile as normal.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: AnonyMint on November 10, 2013, 08:32:44 pm
I am looking for algorithmic weaknesses in the Momentum Proof of Work: http://bitsharestalk.org/index.php?topic=21.msg24#msg24 

If you can identify an alternative algorithm for solving the proof of work that does not require the target RAM and yet is able to find hashes at a rate greater than  MomentumAlgoHashRate / (TARGET_RAM_USE/ALT_RAM_USE)  then you could win this bounty.

For example:  Suppose momentum as specked requires about 1 GB of RAM for maximum performance and produces hashes at 1hz.   If you design an algorithm that only requires 1 MB of RAM and can find hashes at .001 hz then you win because you will have proven that computational complexity is linear with RAM use rather than non-linear like I believe the problem calls for.   
      I paid out a tip for showing this property doesn't quite hold as I thought, but this was more a weakness in my devising an effective / objective measure for determining whether or not momentum is "broken' or 'hacked'.

For the record, I did not receive any tip.

You are right... I owe you that tip... sorry I have been very busy  can you send me a BTC address.  Thanks

1KxTtf1AjoS5hne974YoKjWL6fqcBgJUpE

I will make a post here in this thread after I've received the tip. If will not mention the amount of the tip publicly, it is your choice whether to mention it or not. The gesture is most significant I think, not the amount, at least in my case since I did not provide an implementation only an algorithm. However, I did help reveal how you were thinking incorrectly about the fundamental tradeoffs.

Good work gigawatt!
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 10, 2013, 11:42:04 pm
I am about to crash, but wanted to acknowledge your post and contribution.   I will be away for at least 12 hours recovering from this week!

Later.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: digitalindustry on November 11, 2013, 04:01:31 am
I tested gigawatts implementation.

14.33 collisionspermin instead of 180 collisionspermin stock client, 32 threads.

The point is that gigawatts algorithm could yield a GPU implementation which could be a 100x faster version of a slower algorithm, with a net gain.

But I will try to change the memory requirements and see what happens on the CPU if we bring it down to using very little memory.

What were you using for CHAIN_LENGTH?

I've been running a test with CHAIN_LENGTH = 2 on a 1 CPU/1GB test box overnight and got this:
http://i.imgur.com/wknTeLb.png

If I try to run the default implementation, it crashes because it requires too much memory.  However, on a 2 CPU/2 GB test box running only one thread, I get around 2.2 hpm.  However, I need to rerun these tests because the hashperminute calculation was updated with a recent commit (https://github.com/InvictusInnovations/ProtoShares/commit/776832f419649211cb7b173ffb437e7d2af68b8b).


I've also updated the source code: http://pastebin.com/VTCMFQXg
It now uses the thread interruption from the recent commit and makes debug messages optional.


To compile: Pull the source from the main github branch, then take my code and overwrite momentum.cpp with it.  Compile as normal.

hey thanks - silly me i was :

g++ -o mpow_1 mpow_1.cpp

i renamed it duh ....

cheers - interesting just to see ..

so i was thinking and was going to ask, but it looks like its been answered - an implementation like this would be up to 100x efficient because of the ability to scale .  then run parallel.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: digitalindustry on November 11, 2013, 04:07:50 am
If this is the case and it is confirmed - I have one last question

AnonyMint and gigawatt :

will you know help  bytemaster either implement a more effective code ?

though either 

1. forking this design and continuing

or

2. reissuing a new design

for the correct incentives would you both be willing to contribute - I think Anony its time to realize the single persons don't create cryptocurrencies , its a community effort.

I couldn't see why those incentives wouldn't include a payment in the  currency itself .

I think its time to step up - clearly the community likes the idea of something new and innovative -
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: o3u on November 11, 2013, 04:41:09 am
If this is the case and it is confirmed - I have one last question

AnonyMint and gigawatt :

will you know help  bytemaster either implement a more effective code ?

though either 

1. forking this design and continuing

or

2. reissuing a new design

for the correct incentives would you both be willing to contribute - I think Anony its time to realize the single persons don't create cryptocurrencies , its a community effort.

I couldn't see why those incentives wouldn't include a payment in the  currency itself .

I think its time to step up - clearly the community likes the idea of something new and innovative -
+1

CPU only miner would definitely help cryptos at a lot of levels. Hopefully those who can help do so!
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: gigawatt on November 11, 2013, 05:23:51 am

CPU only miner would definitely help cryptos at a lot of levels. Hopefully those who can help do so!

I completely agree.



I also have a few ideas on how the momentum algorithm can be salvaged, but I'd like to work directly with the developers on the topic if possible.
If one of the Invictus staff could drop me a PM, it would be much appreciated.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: digitalindustry on November 11, 2013, 05:50:45 am

CPU only miner would definitely help cryptos at a lot of levels. Hopefully those who can help do so!

I completely agree.



I also have a few ideas on how the momentum algorithm can be salvaged, but I'd like to work directly with the developers on the topic if possible.
If one of the Invictus staff could drop me a PM, it would be much appreciated.

+1

Great work .

AnonyMint I hope you step up to this you will benifit everyone will .
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: Proto on November 11, 2013, 01:16:08 pm
I tested gigawatts implementation.

14.33 collisionspermin instead of 180 collisionspermin stock client, 32 threads.

The point is that gigawatts algorithm could yield a GPU implementation which could be a 100x faster version of a slower algorithm, with a net gain.

But I will try to change the memory requirements and see what happens on the CPU if we bring it down to using very little memory.

What were you using for CHAIN_LENGTH?

I've been running a test with CHAIN_LENGTH = 2 on a 1 CPU/1GB test box overnight and got this:
http://i.imgur.com/wknTeLb.png

If I try to run the default implementation, it crashes because it requires too much memory.  However, on a 2 CPU/2 GB test box running only one thread, I get around 2.2 hpm.  However, I need to rerun these tests because the hashperminute calculation was updated with a recent commit (https://github.com/InvictusInnovations/ProtoShares/commit/776832f419649211cb7b173ffb437e7d2af68b8b).


I've also updated the source code: http://pastebin.com/VTCMFQXg
It now uses the thread interruption from the recent commit and makes debug messages optional.


To compile: Pull the source from the main github branch, then take my code and overwrite momentum.cpp with it.  Compile as normal.

I first tested with CHAIN_LENGTH = 2 (default), this got me 14.33 collisionspermin. If I remember it well, it still used a lot of memory: 20 GB instead of 26GB stock client.

Setting it to 4 got me around 2 collisionspermin, with a notable reduction in memory usage.

Setting it to 32 didn't get me a collision at all in a couple of minutes. But it used very little memory.

I can play with a 16 core (32 threads w hyperthreading) machine and 64 gig memory. Let me know if you want me to test something new.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: FreeTrade on November 11, 2013, 02:07:19 pm
I can play with a 16 core (32 threads w hyperthreading) machine and 64 gig memory. Let me know if you want me to test something new.

Would it be possible to set it for so little memory that everything takes place on L2 cache? That memory should be a lot faster than main RAM - so might see some dramatic improvements at that level - don't know if they could be reproduced on GPU though.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: gigawatt on November 11, 2013, 03:57:54 pm
I can play with a 16 core (32 threads w hyperthreading) machine and 64 gig memory. Let me know if you want me to test something new.

Would it be possible to set it for so little memory that everything takes place on L2 cache? That memory should be a lot faster than main RAM - so might see some dramatic improvements at that level - don't know if they could be reproduced on GPU though.

It should be possible, but I doubt it would be advisable.  You'd be looking at using a CHAIN_LENGTH of 768 to get it down to 1MB, but the flip side is requiring 768x the processing.  I'm not an expert, but I doubt the miner would run fast enough to make up the difference.


bytemaster: Could you send me a PM?  I'd like to send you a few ideas on how to harden momentum hash.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: AnonyMint on November 11, 2013, 04:53:08 pm
I can play with a 16 core (32 threads w hyperthreading) machine and 64 gig memory. Let me know if you want me to test something new.

Would it be possible to set it for so little memory that everything takes place on L2 cache? That memory should be a lot faster than main RAM - so might see some dramatic improvements at that level - don't know if they could be reproduced on GPU though.

That won't beat the GPU, because then you use much less memory so the GPU can run so many more threads and more assuredly hit its 10x faster main memory bandwidth. At best if you increase throughput compared to main memory and if the GPU was already at full memory bandwidth (had hid all the latency) on the existing algorithm, you might cut the advantage of the GPU down by a multiple, which is what Litecoin obtained. However, the GPU will then also be running more copies of the hash in that case too (more threads and memory available), so I doubt this would be an improvement at all.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking
Post by: bytemaster on November 15, 2013, 06:25:22 pm
I have decided to award gigawatt the bounty for his excellent contributions to this effort combined with a working algorithm.  He has also privately shared with me useful enhancements to the momentum proof of work.   

After careful consideration of AnonyMints posts I have concluded that he did not provide any new algorithms that break the complexity and merely argued that the existing algorithm could be ported to GPU with relatively little modification and that if that were done it would be faster.   

Thank you gigawatt... I could use 1.21 of you working on this project to really accelerate the timeline.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking [PAID]
Post by: AnonyMint on November 19, 2013, 12:26:17 am
Hahaha. You will soon see how incorrect that selfish appraisal is. As well, you reneged on what you already said you would do, which was send a tip. And I told you, I didn't care the amount, only the gesture.

But this is fine, because it is excellent motivation for what you will soon see.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking [PAID]
Post by: iruu on November 19, 2013, 04:26:58 am
@AnonyMint at this point gpu miner isn't going to prove anything really
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking [PAID]
Post by: bytemaster on November 19, 2013, 05:57:12 am
Hahaha. You will soon see how incorrect that selfish appraisal is. As well, you reneged on what you already said you would do, which was send a tip. And I told you, I didn't care the amount, only the gesture.

But this is fine, because it is excellent motivation for what you will soon see.

Tip paid.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking [PAID]
Post by: barwizi on November 19, 2013, 06:08:02 am
I think a gpu implementation is still a relevant idea. though if kept private it would further skew the distribution of PTS
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking [PAID]
Post by: Brekyrself on November 19, 2013, 07:33:08 am
bytemaster

As we see a lowered memory requirement for mining and possibility of a gpu miner are there any changes planned to the basis of momentum algorithm?
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking [PAID]
Post by: bytemaster on November 19, 2013, 04:19:00 pm
bytemaster

As we see a lowered memory requirement for mining and possibility of a gpu miner are there any changes planned to the basis of momentum algorithm?

I am working with Gigawatt on some enhancements.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking [PAID]
Post by: AnonyMint on November 20, 2013, 03:00:59 am
Hahaha. You will soon see how incorrect that selfish appraisal is. As well, you reneged on what you already said you would do, which was send a tip. And I told you, I didn't care the amount, only the gesture.

But this is fine, because it is excellent motivation for what you will soon see.

Tip paid.

Confirmed. Thank you. Favor returned here:

https://bitcointalk.org/index.php?topic=339902.msg3646754#msg3646754

Gpu miner still matters because perhaps if he thought deeply about the information I gave him, he could eliminate gpus, if that is desirable. Yet all my work on  a CPU-only miner, I can't get the hash to be fast enough for what I think bytemaster needs in his design.

What is the slowest hash your design can tolerate?

Where you able to defeat gpus with Gigawatt's help?

Not having a gpu miner doesn't mean someone won't have one and thus this is very risky unless you've proven it can't run faster on a gpu. So the way I presented thinking about it, is very important I think. But I don't know what Gigawatt has come up with.

To recap, either you provide a gpu miner to all, or you prove beyond any reasonable doubt that no one can make one. That is very important I think. But you may have other priorities?
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking [PAID]
Post by: bytemaster on November 20, 2013, 03:23:59 am
Hahaha. You will soon see how incorrect that selfish appraisal is. As well, you reneged on what you already said you would do, which was send a tip. And I told you, I didn't care the amount, only the gesture.

But this is fine, because it is excellent motivation for what you will soon see.

Tip paid.

Confirmed. Thank you. Favor returned here:

https://bitcointalk.org/index.php?topic=339902.msg3646754#msg3646754

Gpu miner still matters because perhaps if he thought deeply about the information I gave him, he could eliminate gpus, if that is desirable. Yet all my work on  a CPU-only miner, I can't get the hash to be fast enough for what I think bytemaster needs in his design.

What is the slowest hash your design can tolerate?

Where you able to defeat gpus with Gigawatt's help?

Not having a gpu miner doesn't mean someone won't have one and thus this is very risky unless you've proven it can't run faster on a gpu. So the way I presented thinking about it, is very important I think. But I don't know what Gigawatt has come up with.

To recap, either you provide a gpu miner to all, or you prove beyond any reasonable doubt that no one can make one. That is very important I think. But you may have other priorities?

Someone is being paid to attempt to build a GPU miner.  If they are successful I will know about it and will then let the forum know and will open source it. 
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking [PAID]
Post by: AnonyMint on November 20, 2013, 03:37:18 am
Thanks. So to clarify, the reason I didn't build a gpu miner and go for the $5000 is I didn't think it was most helpful to bytemaster nor to myself. A cpu-only coin needs a very different structure that is afaics incompatible with the need for a fast hash. So I shared my thoughts and left the $5000 for someone who might provide more help for maximizing benefits for bytemaster. I didn't want to expend so much effort and then end up disappointed or arguing, especially that is not enough to get me so excited. I am busy working on my own project(s) that is most exciting for me. Yet I did try to share and help in a small way. Thanks again for the tip.

Perhaps bytemaster and gigawatt will find some way to leverage the on die AGPU + CPU or otherwise keep the GPU at minimal advantage to the CPU. The main goal is for the GPU to not have a huge performance/watt advantage.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking [PAID]
Post by: SpeedDemon13 on January 06, 2014, 07:25:22 am
So, is a gpu miner for PTS still in the works? Surprised it hasn't been released yet since MMC has a GPU miner now.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking [PAID]
Post by: bytemaster on January 06, 2014, 07:29:14 am
So, is a gpu miner for PTS still in the works? Surprised it hasn't been released yet since MMC has a GPU miner now.

We are not building one, rumors indicate one may exist in the wild and that it is 3-4x faster than a CPU but that is all I know.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking [PAID]
Post by: Bitcoinfan on January 06, 2014, 11:47:41 am
Just to sum things up, is momentum (or the enhanced version) still resistent to GPU mining?
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking [PAID]
Post by: bytemaster on January 06, 2014, 04:53:03 pm
Just to sum things up, is momentum (or the enhanced version) still resistent to GPU mining?

So far I believe it is the most GPU resistant algorithm.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking [PAID]
Post by: 5chdn on January 06, 2014, 05:46:21 pm
Just to sum things up, is momentum (or the enhanced version) still resistent to GPU mining?

So far I believe it is the most GPU resistant algorithm.

I think the quest for prime chains in Primecoin is the most GPU-resistant algorithm as they calculate with pure integers.
Title: Re: $5000 Bounty - Momentum Proof-of-Work Hacking [PAID]
Post by: tromp on January 09, 2014, 03:14:48 pm
Just to sum things up, is momentum (or the enhanced version) still resistent to GPU mining?

So far I believe it is the most GPU resistant algorithm.

I believe the new Cuckoo Cycle proof-of-work is more GPU resistant, since it doesn't allow
for any time-memory trade-off. See https://github.com/tromp/cuckoo