Author Topic: $5000 Bounty - Momentum Proof-of-Work Hacking [PAID]  (Read 54060 times)

0 Members and 1 Guest are viewing this topic.

Offline bytemaster

I have decided to award gigawatt the bounty for his excellent contributions to this effort combined with a working algorithm.  He has also privately shared with me useful enhancements to the momentum proof of work.   

After careful consideration of AnonyMints posts I have concluded that he did not provide any new algorithms that break the complexity and merely argued that the existing algorithm could be ported to GPU with relatively little modification and that if that were done it would be faster.   

Thank you gigawatt... I could use 1.21 of you working on this project to really accelerate the timeline.
For the latest updates checkout my blog: http://bytemaster.bitshares.org
Anything said on these forums does not constitute an intent to create a legal obligation or contract between myself and anyone else.   These are merely my opinions and I reserve the right to change them at any time.

Offline AnonyMint

  • Jr. Member
  • **
  • Posts: 30
    • View Profile
I can play with a 16 core (32 threads w hyperthreading) machine and 64 gig memory. Let me know if you want me to test something new.

Would it be possible to set it for so little memory that everything takes place on L2 cache? That memory should be a lot faster than main RAM - so might see some dramatic improvements at that level - don't know if they could be reproduced on GPU though.

That won't beat the GPU, because then you use much less memory so the GPU can run so many more threads and more assuredly hit its 10x faster main memory bandwidth. At best if you increase throughput compared to main memory and if the GPU was already at full memory bandwidth (had hid all the latency) on the existing algorithm, you might cut the advantage of the GPU down by a multiple, which is what Litecoin obtained. However, the GPU will then also be running more copies of the hash in that case too (more threads and memory available), so I doubt this would be an improvement at all.
« Last Edit: November 11, 2013, 04:56:05 pm by AnonyMint »

Offline gigawatt

  • Jr. Member
  • **
  • Posts: 27
    • View Profile
I can play with a 16 core (32 threads w hyperthreading) machine and 64 gig memory. Let me know if you want me to test something new.

Would it be possible to set it for so little memory that everything takes place on L2 cache? That memory should be a lot faster than main RAM - so might see some dramatic improvements at that level - don't know if they could be reproduced on GPU though.

It should be possible, but I doubt it would be advisable.  You'd be looking at using a CHAIN_LENGTH of 768 to get it down to 1MB, but the flip side is requiring 768x the processing.  I'm not an expert, but I doubt the miner would run fast enough to make up the difference.


bytemaster: Could you send me a PM?  I'd like to send you a few ideas on how to harden momentum hash.
BTC: 1E2egHUcLDAmcxcqZqpL18TPLx9Xj1akcV
PTS: Pstkk1gZCxc4GEwS1eBAykYwVmcubU1P8L

Offline FreeTrade

  • Board Moderator
  • Hero Member
  • *****
  • Posts: 700
    • View Profile
I can play with a 16 core (32 threads w hyperthreading) machine and 64 gig memory. Let me know if you want me to test something new.

Would it be possible to set it for so little memory that everything takes place on L2 cache? That memory should be a lot faster than main RAM - so might see some dramatic improvements at that level - don't know if they could be reproduced on GPU though.
“People should be more sophisticated? How are you gonna get that done?” - Jerry Seinfeld reply to Bill Maher

Offline Proto

  • Full Member
  • ***
  • Posts: 72
    • View Profile
I tested gigawatts implementation.

14.33 collisionspermin instead of 180 collisionspermin stock client, 32 threads.

The point is that gigawatts algorithm could yield a GPU implementation which could be a 100x faster version of a slower algorithm, with a net gain.

But I will try to change the memory requirements and see what happens on the CPU if we bring it down to using very little memory.

What were you using for CHAIN_LENGTH?

I've been running a test with CHAIN_LENGTH = 2 on a 1 CPU/1GB test box overnight and got this:
http://i.imgur.com/wknTeLb.png

If I try to run the default implementation, it crashes because it requires too much memory.  However, on a 2 CPU/2 GB test box running only one thread, I get around 2.2 hpm.  However, I need to rerun these tests because the hashperminute calculation was updated with a recent commit.


I've also updated the source code: http://pastebin.com/VTCMFQXg
It now uses the thread interruption from the recent commit and makes debug messages optional.


To compile: Pull the source from the main github branch, then take my code and overwrite momentum.cpp with it.  Compile as normal.

I first tested with CHAIN_LENGTH = 2 (default), this got me 14.33 collisionspermin. If I remember it well, it still used a lot of memory: 20 GB instead of 26GB stock client.

Setting it to 4 got me around 2 collisionspermin, with a notable reduction in memory usage.

Setting it to 32 didn't get me a collision at all in a couple of minutes. But it used very little memory.

I can play with a 16 core (32 threads w hyperthreading) machine and 64 gig memory. Let me know if you want me to test something new.

Offline digitalindustry

  • Jr. Member
  • **
  • Posts: 41
    • View Profile

CPU only miner would definitely help cryptos at a lot of levels. Hopefully those who can help do so!

I completely agree.



I also have a few ideas on how the momentum algorithm can be salvaged, but I'd like to work directly with the developers on the topic if possible.
If one of the Invictus staff could drop me a PM, it would be much appreciated.

+1

Great work .

AnonyMint I hope you step up to this you will benifit everyone will .

Offline gigawatt

  • Jr. Member
  • **
  • Posts: 27
    • View Profile

CPU only miner would definitely help cryptos at a lot of levels. Hopefully those who can help do so!

I completely agree.



I also have a few ideas on how the momentum algorithm can be salvaged, but I'd like to work directly with the developers on the topic if possible.
If one of the Invictus staff could drop me a PM, it would be much appreciated.
BTC: 1E2egHUcLDAmcxcqZqpL18TPLx9Xj1akcV
PTS: Pstkk1gZCxc4GEwS1eBAykYwVmcubU1P8L

Offline o3u

  • Full Member
  • ***
  • Posts: 58
  • Money comes, money goes.
    • View Profile
    • Fistful of Bitcoins (IN DEVELOPMENT)
If this is the case and it is confirmed - I have one last question

AnonyMint and gigawatt :

will you know help  bytemaster either implement a more effective code ?

though either 

1. forking this design and continuing

or

2. reissuing a new design

for the correct incentives would you both be willing to contribute - I think Anony its time to realize the single persons don't create cryptocurrencies , its a community effort.

I couldn't see why those incentives wouldn't include a payment in the  currency itself .

I think its time to step up - clearly the community likes the idea of something new and innovative -
+1

CPU only miner would definitely help cryptos at a lot of levels. Hopefully those who can help do so!
{BTC: 1BCnkcUPNd3jRnwzkT3EeAn8KLBNfZF41r , PTS: Pc71VSazYe1QoYZ9RKeW1cyoetrbZnFGnm, LTC: Lbiy1khbsnLxGbxv3WeKE6h753YFCGJ99t }
http://fistfulofbitcoins.com/

Offline digitalindustry

  • Jr. Member
  • **
  • Posts: 41
    • View Profile
If this is the case and it is confirmed - I have one last question

AnonyMint and gigawatt :

will you know help  bytemaster either implement a more effective code ?

though either 

1. forking this design and continuing

or

2. reissuing a new design

for the correct incentives would you both be willing to contribute - I think Anony its time to realize the single persons don't create cryptocurrencies , its a community effort.

I couldn't see why those incentives wouldn't include a payment in the  currency itself .

I think its time to step up - clearly the community likes the idea of something new and innovative -

Offline digitalindustry

  • Jr. Member
  • **
  • Posts: 41
    • View Profile
I tested gigawatts implementation.

14.33 collisionspermin instead of 180 collisionspermin stock client, 32 threads.

The point is that gigawatts algorithm could yield a GPU implementation which could be a 100x faster version of a slower algorithm, with a net gain.

But I will try to change the memory requirements and see what happens on the CPU if we bring it down to using very little memory.

What were you using for CHAIN_LENGTH?

I've been running a test with CHAIN_LENGTH = 2 on a 1 CPU/1GB test box overnight and got this:
http://i.imgur.com/wknTeLb.png

If I try to run the default implementation, it crashes because it requires too much memory.  However, on a 2 CPU/2 GB test box running only one thread, I get around 2.2 hpm.  However, I need to rerun these tests because the hashperminute calculation was updated with a recent commit.


I've also updated the source code: http://pastebin.com/VTCMFQXg
It now uses the thread interruption from the recent commit and makes debug messages optional.


To compile: Pull the source from the main github branch, then take my code and overwrite momentum.cpp with it.  Compile as normal.

hey thanks - silly me i was :

g++ -o mpow_1 mpow_1.cpp

i renamed it duh ....

cheers - interesting just to see ..

so i was thinking and was going to ask, but it looks like its been answered - an implementation like this would be up to 100x efficient because of the ability to scale .  then run parallel.

Offline bytemaster

I am about to crash, but wanted to acknowledge your post and contribution.   I will be away for at least 12 hours recovering from this week!

Later.
For the latest updates checkout my blog: http://bytemaster.bitshares.org
Anything said on these forums does not constitute an intent to create a legal obligation or contract between myself and anyone else.   These are merely my opinions and I reserve the right to change them at any time.

Offline AnonyMint

  • Jr. Member
  • **
  • Posts: 30
    • View Profile
I am looking for algorithmic weaknesses in the Momentum Proof of Work: http://bitsharestalk.org/index.php?topic=21.msg24#msg24 

If you can identify an alternative algorithm for solving the proof of work that does not require the target RAM and yet is able to find hashes at a rate greater than  MomentumAlgoHashRate / (TARGET_RAM_USE/ALT_RAM_USE)  then you could win this bounty.

For example:  Suppose momentum as specked requires about 1 GB of RAM for maximum performance and produces hashes at 1hz.   If you design an algorithm that only requires 1 MB of RAM and can find hashes at .001 hz then you win because you will have proven that computational complexity is linear with RAM use rather than non-linear like I believe the problem calls for.   
      I paid out a tip for showing this property doesn't quite hold as I thought, but this was more a weakness in my devising an effective / objective measure for determining whether or not momentum is "broken' or 'hacked'.

For the record, I did not receive any tip.

You are right... I owe you that tip... sorry I have been very busy  can you send me a BTC address.  Thanks

1KxTtf1AjoS5hne974YoKjWL6fqcBgJUpE

I will make a post here in this thread after I've received the tip. If will not mention the amount of the tip publicly, it is your choice whether to mention it or not. The gesture is most significant I think, not the amount, at least in my case since I did not provide an implementation only an algorithm. However, I did help reveal how you were thinking incorrectly about the fundamental tradeoffs.

Good work gigawatt!

Offline gigawatt

  • Jr. Member
  • **
  • Posts: 27
    • View Profile
I tested gigawatts implementation.

14.33 collisionspermin instead of 180 collisionspermin stock client, 32 threads.

The point is that gigawatts algorithm could yield a GPU implementation which could be a 100x faster version of a slower algorithm, with a net gain.

But I will try to change the memory requirements and see what happens on the CPU if we bring it down to using very little memory.

What were you using for CHAIN_LENGTH?

I've been running a test with CHAIN_LENGTH = 2 on a 1 CPU/1GB test box overnight and got this:
http://i.imgur.com/wknTeLb.png

If I try to run the default implementation, it crashes because it requires too much memory.  However, on a 2 CPU/2 GB test box running only one thread, I get around 2.2 hpm.  However, I need to rerun these tests because the hashperminute calculation was updated with a recent commit.


I've also updated the source code: http://pastebin.com/VTCMFQXg
It now uses the thread interruption from the recent commit and makes debug messages optional.


To compile: Pull the source from the main github branch, then take my code and overwrite momentum.cpp with it.  Compile as normal.
BTC: 1E2egHUcLDAmcxcqZqpL18TPLx9Xj1akcV
PTS: Pstkk1gZCxc4GEwS1eBAykYwVmcubU1P8L

Offline bytemaster

I tested gigawatts implementation.

14.33 collisionspermin instead of 180 collisionspermin stock client, 32 threads.

The point is that gigawatts algorithm could yield a GPU implementation which could be a 100x faster version of a slower algorithm, with a net gain.

But I will try to change the memory requirements and see what happens on the CPU if we bring it down to using very little memory.

Here is my GPU algorithm...

Calculate SHA512((2^26)/8) times and store it in memory... very GPU friendly.
GPU SORT( DATA )
PARALLEL COMPARE LEFT

Over all it should be simple and efficient and faster than CPU...  though it still requires the full memory, it would be much lower latency(better).   
For the latest updates checkout my blog: http://bytemaster.bitshares.org
Anything said on these forums does not constitute an intent to create a legal obligation or contract between myself and anyone else.   These are merely my opinions and I reserve the right to change them at any time.

Offline Proto

  • Full Member
  • ***
  • Posts: 72
    • View Profile
I tested gigawatts implementation.

14.33 collisionspermin instead of 180 collisionspermin stock client, 32 threads.

The point is that gigawatts algorithm could yield a GPU implementation which could be a 100x faster version of a slower algorithm, with a net gain.

But I will try to change the memory requirements and see what happens on the CPU if we bring it down to using very little memory.
« Last Edit: November 10, 2013, 04:24:40 pm by Proto »