Show Posts



Messages - gigawatt

16
My patch should fix this issue:

Code:
--- ProtoShares/src/makefile.unix       2013-11-11 11:46:13.019922787 -0800
+++ ProtoShares-new/src/makefile.unix   2013-11-11 11:47:52.255921671 -0800
@@ -141,7 +141,8 @@
     obj/bloom.o \
     obj/noui.o \
     obj/leveldb.o \
-    obj/txdb.o
+    obj/txdb.o \
+    obj/momentum.o
 
 
 all: bitcoind



Save the code above as makefile.patch, run "patch -p0 < makefile.patch", and you should be able to compile again. 8)

That's weird.  I wonder why "momentum.o" wasn't included in the makefile to begin with?  The makefile's last edit date was 7 months ago, so it must have been like that from the beginning.

How the hell did mine compile?

17
BitShares PTS / Re: Client keeps crashing
« on: November 11, 2013, 04:50:55 pm »
Running a few tests to see if I can replicate the issue.  I've got a watchdog wrapper for my protoshare client, so if it crashes I'll capture the debug.log file for analysis.

I'll let you know if I spot anything.

18
Check the "src" folder.  Are the "momentum.cpp" and "momentum.h" files there?

You might want to delete the whole directory and pull it from git again.  I ran into a few cases where git didn't fetch the files correctly.

19
General Discussion / Re: $5000 Bounty - Momentum Proof-of-Work Hacking
« on: November 11, 2013, 03:57:54 pm »
I can play with a 16-core (32 threads with hyperthreading) machine and 64 GB of memory. Let me know if you want me to test something new.

Would it be possible to set it for so little memory that everything takes place on L2 cache? That memory should be a lot faster than main RAM - so might see some dramatic improvements at that level - don't know if they could be reproduced on GPU though.

It should be possible, but I doubt it would be advisable.  You'd be looking at a CHAIN_LENGTH of 768 to get it down to 1 MB, but the flip side is requiring 768x the processing.  I'm not an expert, but I can't see the miner running fast enough to make up the difference.
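
To spell out that arithmetic (using the MEMORY and CALLS formulas from my rainbow-table post further down the page, with SEARCH_SPACE = 2^26 and 12 bytes per table entry):

MEMORY(768) = 12 * 2^26 / 768 = 1048576 bytes = 1 MB
CALLS(768)  = 2^26 * 768 (768x the SHA512 calls of the full-memory search)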


bytemaster: Could you send me a PM?  I'd like to send you a few ideas on how to harden momentum hash.

20
General Discussion / Re: $5000 Bounty - Momentum Proof-of-Work Hacking
« on: November 11, 2013, 05:23:51 am »

A CPU-only miner would definitely help cryptos on a lot of levels. Hopefully those who can help do so!

I completely agree.



I also have a few ideas on how the momentum algorithm can be salvaged, but I'd like to work directly with the developers on the topic if possible.
If one of the Invictus staff could drop me a PM, it would be much appreciated.

21
General Discussion / Re: $5000 Bounty - Momentum Proof-of-Work Hacking
« on: November 10, 2013, 07:25:07 pm »
I tested gigawatt's implementation.

14.33 collisions per minute instead of 180 collisions per minute with the stock client, 32 threads.

The point is that gigawatt's algorithm could yield a GPU implementation that runs the slower algorithm 100x faster, for a net gain.

But I will try to change the memory requirements and see what happens on the CPU if we bring it down to using very little memory.

What were you using for CHAIN_LENGTH?

I've been running a test with CHAIN_LENGTH = 2 on a 1 CPU/1GB test box overnight and got this:
http://i.imgur.com/wknTeLb.png

If I try to run the default implementation, it crashes because it requires too much memory.  On a 2 CPU/2 GB test box running only one thread, I get around 2.2 hpm.  However, I need to rerun these tests because the hash-per-minute calculation was updated in a recent commit.


I've also updated the source code: http://pastebin.com/VTCMFQXg
It now uses the thread interruption from the recent commit and makes debug messages optional.


To compile: Pull the source from the main github branch, then take my code and overwrite momentum.cpp with it.  Compile as normal.

22
General Discussion / Re: $5000 Bounty - Momentum Proof-of-Work Hacking
« on: November 10, 2013, 08:48:14 am »
I'm happy to report that it works like a charm.
I've updated the source code.
Feel free to test it for yourself.


It correctly finds nonce collisions and follows all the predictions made in my earlier post.
I've conclusively demonstrated that memory requirements can be scaled arbitrarily while keeping the memory-times-computation product constant.  Latency has been made a non-issue by returning immediately on collision detection and allowing external thread termination.
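
To restate the tradeoff with the formulas from my earlier post: the calls-times-memory product doesn't depend on CHAIN_LENGTH at all,

CALLS(C) * MEMORY(C) = (SEARCH_SPACE * C) * ((8 + 4) * SEARCH_SPACE / C) = 12 * SEARCH_SPACE^2

so halving memory always costs exactly a doubling of SHA512 calls, never more, never less.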

Here's a snippet of what the output looks like using CHAIN_LENGTH = 4:

Step 31 of 128...
  HITS: 743672
  MISS: 1353480
 TOTAL: 1966081
   PCT: 35
Step 32 of 128...
  HITS: 760208
  MISS: 1336944
 TOTAL: 2031617
   PCT: 36
RAINBOW HIT
  SEEK HASH: 423473256423089
 ORIG NONCE: 28524699
 RAIN NONCE: 5439512 + 1
         0: 203104387400228
         1: 423473256423089
         2: 1041792957055939
         3: 780112116212973
         4: 629783454762940
         5: 675829228796259
         6: 738539531672861
         7: 577969205224283
NAILED IT! Nonce: 28524696 + 3
         0: 959239184726270
         1: 186912846539772
         2: 633556876464486
         3: 423473256423089
         4: 553069074389047
         5: 746169572180730
         6: 539623141106693
         7: 588368479534609
Checking Work:
    Nonce A: 5439513
    Nonce B: 28524699
     Hash A: 423473256423089
     Hash B: 423473256423089




And the view from the outside: http://i.imgur.com/rMUKK5t.png

23
General Discussion / Re: $5000 Bounty - Momentum Proof-of-Work Hacking
« on: November 10, 2013, 07:22:51 am »
I don't have time at the moment to go through your method, but does momentum_verify(...) work on your results?

I'm working out a few kinks right now.  It said it found a hit earlier, but I accidentally had it dump the wrong info, so I had no way to verify.

This may be true for bandwidth of calculations, but what of latency?  Suppose you turn this from a 1 minute search into a 768 minute search... if you perform it a billion times in parallel you will have amazing bandwidth, but terrible latency.

Absolutely correct.  I edited my main post to reflect handling that.
The first fix is simple: as soon as you find any birthday collision, return immediately.  If a hit turns up 2 seconds in, spending the next ~750 minutes finishing the run before returning makes the collisions stale, regardless of how many were found.

The second is allowing the parent thread to terminate the children whenever a new block is found.  This is already handled in the main ProtoShares branch since commit 3dfc3cc33.

Given both of these capabilities, latency is no longer an issue.
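
To make that concrete, here's a minimal sketch of the search loop shape (illustrative only, not the actual client code: check_nonce_for_collision is a hypothetical helper, while boost::this_thread::interruption_point() is the standard Boost mechanism for this kind of cooperative cancellation):

Code:
#include <boost/thread.hpp>
#include <stdint.h>
#include <utility>
#include <vector>

// Hypothetical helper: checks one nonce against the collision table and
// records any (nonceA, nonceB) pair it finds; returns true on a hit.
bool check_nonce_for_collision(uint32_t nonce,
                               std::vector<std::pair<uint32_t, uint32_t> >& out);

std::vector<std::pair<uint32_t, uint32_t> > search_one_pass()
{
    std::vector<std::pair<uint32_t, uint32_t> > results;
    for (uint32_t nonce = 0; nonce < (1 << 26); ++nonce)  // MAX_MOMENTUM_NONCE
    {
        // Throws boost::thread_interrupted if the parent called interrupt(),
        // e.g. because a new block just arrived.
        boost::this_thread::interruption_point();

        // Return the moment we have anything: a collision found 2 seconds in
        // is worthless if we sit on it for another ~750 minutes.
        if (check_nonce_for_collision(nonce, results))
            return results;
    }
    return results;
}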

24
General Discussion / Re: $5000 Bounty - Momentum Proof-of-Work Hacking
« on: November 10, 2013, 05:55:46 am »
You are right... I owe you that tip... sorry, I have been very busy.  Can you send me a BTC address?  Thanks

Ditto.  ;D



In other news, I think I've finally, finally, actually done it (again).
As we discussed previously, using a cycle-detection algorithm is a no-go because the hash doesn't map to itself.

However, what does work is making a pseudo-rainbow table in memory, on the fly.
The result: the memory-to-computation tradeoff stays exactly as efficient, but we can scale the footprint arbitrarily.  For example, instead of taking 768 MB of memory and 1 minute per search, we can use (768 MB / X) of memory and X minutes per search.

This sounds useless at first, but consider the following: given that we can arbitrarily scale the memory requirements, the only limiting factor in how fast we can find collisions is our brute computing power.  What's good at brute computing power?  GPUs and ASICs.


Here's my proof:

Overview:
https://en.wikipedia.org/wiki/Rainbow_table
Make a "rainbow table" map to hold the equivalent hash space of SEARCH_SPACE
  For chains of length CHAIN_LENGTH,
    rainbow table needs (SEARCH_SPACE / CHAIN_LENGTH) elements [name: RAINBOW_ENTRIES]
  Rainbow table will require (8 bytes (hash) + 4 bytes (nonce)) * RAINBOW_ENTRIES bytes total
To compress each SEARCH_SPACE_BITS hash to a new nonce in the range MAX_MOMENTUM_NONCE,
  simply use (hash % MAX_MOMENTUM_NONCE)

Pseudocode:
For the first RAINBOW_ENTRIES
  nonce = element number
  For i = 1 to CHAIN_LENGTH
    cur_hash = HASH(nonce)
    Check if any part of cur_hash is in rainbow table
      If yes and generated by different nonce: add to result list
    For each chunk in cur_hash
      rainbow_map[chunk] = element number
    nonce = compress(cur_hash)

For the next (SEARCH_SPACE - RAINBOW_ENTRIES) elements:
  nonce = element number
  For i = 1 to CHAIN_LENGTH
    cur_hash = Hash(nonce)
    Check if any part of cur_hash is in rainbow table
      If yes and generated by different nonce: add to result list
    nonce = compress(cur_hash)

return result list
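
For the C++-inclined, here's a single-threaded transcription of the pseudocode above (my sketch, not the pastebin code: it uses one hash chunk per nonce instead of eight, stores the generating nonce directly rather than the chain's element number so no chain re-walking is needed, and substitutes a toy mixer for the real SHA512 step):

Code:
#include <boost/unordered_map.hpp>
#include <stdint.h>
#include <utility>
#include <vector>

static const uint64_t SEARCH_SPACE    = 1ULL << 26;  // MAX_MOMENTUM_NONCE
static const uint64_t CHAIN_LENGTH    = 96;
static const uint64_t RAINBOW_ENTRIES = SEARCH_SPACE / CHAIN_LENGTH;

// Toy stand-in for the real SHA512-based hash; NOT cryptographic.
uint64_t HASH(uint64_t nonce)
{
    uint64_t x = (nonce + 1) * 0x9E3779B97F4A7C15ULL;
    x = (x ^ (x >> 29)) * 0xBF58476D1CE4E5B9ULL;
    return (x ^ (x >> 32)) & ((1ULL << 50) - 1);  // truncate to SEARCH_SPACE_BITS
}

// Compress a hash back into the nonce range so the chain can continue.
uint64_t compress(uint64_t h) { return h % SEARCH_SPACE; }

typedef boost::unordered_map<uint64_t, uint64_t> RainbowMap;  // hash -> nonce

// Walk one chain of CHAIN_LENGTH hashes from 'start', checking every hash
// against the table; only the table-building pass inserts entries.
static void walk_chain(uint64_t start, RainbowMap& rainbow, bool build,
                       std::vector<std::pair<uint64_t, uint64_t> >& results)
{
    uint64_t nonce = start;
    for (uint64_t i = 0; i < CHAIN_LENGTH; ++i) {
        uint64_t h = HASH(nonce);
        RainbowMap::const_iterator it = rainbow.find(h);
        if (it != rainbow.end() && it->second != nonce)
            results.push_back(std::make_pair(it->second, nonce));  // birthday collision
        if (build)
            rainbow[h] = nonce;
        nonce = compress(h);
    }
}

std::vector<std::pair<uint64_t, uint64_t> > find_collisions()
{
    RainbowMap rainbow;
    std::vector<std::pair<uint64_t, uint64_t> > results;
    for (uint64_t s = 0; s < RAINBOW_ENTRIES; ++s)             // build the table
        walk_chain(s, rainbow, true, results);
    for (uint64_t s = RAINBOW_ENTRIES; s < SEARCH_SPACE; ++s)  // scan the rest
        walk_chain(s, rainbow, false, results);
    return results;
}

Phase 1 both checks and fills the table; phase 2 only checks, and that asymmetry is exactly where the CHAIN_LENGTH-for-memory trade comes from.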


// MATH SECTION /////
Given:
CALLS(CHAIN_LENGTH) = total calls to SHA512 using a rainbow table with chain length CHAIN_LENGTH
CALLS(CHAIN_LENGTH) = (SEARCH_SPACE / CHAIN_LENGTH) + (SEARCH_SPACE * CHAIN_LENGTH) - (SEARCH_SPACE / CHAIN_LENGTH)
  + (SEARCH_SPACE / CHAIN_LENGTH) = Calls to generate rainbow table
  + (SEARCH_SPACE * CHAIN_LENGTH) = Calls to check SEARCH_SPACE nonce values
  - (SEARCH_SPACE / CHAIN_LENGTH) = Except we skip the nonces used to make the rainbow table
CALLS(CHAIN_LENGTH) = (SEARCH_SPACE * CHAIN_LENGTH)

Given:
MEMORY(CHAIN_LENGTH) = number of bytes required to store a rainbow table representing
                         SEARCH_SPACE elements with chain length CHAIN_LENGTH
MEMORY(CHAIN_LENGTH) = rainbow table elements * (size of hash chunk + size of nonce)
MEMORY(CHAIN_LENGTH) = (8 + 4) * SEARCH_SPACE / CHAIN_LENGTH

Examples:
Minimum number of SHA512 calls is CHAIN_LENGTH = 1 (equivalent to the "default" semiordered map setup)
  Requires (SEARCH_SPACE * 12) = 768 MB memory
  Requires (SEARCH_SPACE) calls to SHA512
For CHAIN_LENGTH = N
  Requires (SEARCH_SPACE * 12 / N) bytes of memory
  Requires (SEARCH_SPACE * N) calls to SHA512
For CHAIN_LENGTH = 1024
  Requires (2^26 * 12 / 1024) = 768 KB of memory
  Requires (2^26 * 1024) calls to SHA512

Bottom line:
Using a rainbow table map of chain length N,
  memory usage is multiplied by a factor of 1/N
  and SHA512 calls are multiplied by a factor of N
This means that regardless of the CHAIN_LENGTH selected,
  CALLS(CHAIN_LENGTH) and MEMORY(CHAIN_LENGTH) are inversely
  proportional, so their product is constant.
EFFICIENCY IS CONSTANT AT ANY MEMORY TARGET.
All that matters is brute computational strength.

Example:
If a GPU/ASIC/FPGA has PROCESSOR number of parallel computing units,
  and MEM_MAX bytes of available memory, then each
  Momentum hash instance can use up to (MEM_MAX / PROCESSOR) bytes of memory.
To calculate the CHAIN_LEN to achieve this:
  CHAIN_LEN(MEM_MAX, PROCESSOR) = (8 + 4) * SEARCH_SPACE / (MEM_MAX / PROCESSOR)
  CHAIN_LEN(MEM_MAX, PROCESSOR) = (8 + 4) * SEARCH_SPACE * PROCESSOR / MEM_MAX
  Note: This calculation is just the inverse of MEMORY(CHAIN_LEN)

Assuming: GPU with 128 compute units, 1 GB memory, and SEARCH_SPACE = MAX_MOMENTUM_NONCE
  CHAIN_LEN(2^30, 2^7) = (8 + 4) * SEARCH_SPACE * PROCESSOR / MEM_MAX
  CHAIN_LEN(2^30, 2^7) = (8 + 4) * SEARCH_SPACE * 2^7 / 2^30
  CHAIN_LEN(2^30, 2^7) = 12 * SEARCH_SPACE / 2^23
  CHAIN_LEN(2^30, 2^7) = 12 * 2^26 / 2^23
  CHAIN_LEN(2^30, 2^7) = 12 * 2^3
  CHAIN_LEN(2^30, 2^7) = 96
  Verification:
  MEMORY(CHAIN_LENGTH) = (8 + 4) * SEARCH_SPACE / CHAIN_LENGTH
  MEMORY(96) = (8 + 4) * 2^26 / 96
  MEMORY(96) = 8388608 bytes
  MEMORY(96) = 8 MB
  CALLS(CHAIN_LENGTH) = (SEARCH_SPACE * CHAIN_LENGTH)
Result: 1/96th the normal memory usage of 768 MB, 96x the SHA512 calls.
  The problem is no longer about memory, it's about computation speed.



I've actually got a somewhat working proof of concept, although it's quite sub-optimal.  Right now the main issue is using the overkill boost::unordered_map for storage instead of a more bespoke memory structure.  The boost implementation periodically decides it needs to redo the bucket arrangement, resulting in slowdowns.

It also has an unholy number of debug statements.
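
One band-aid I'm considering for the rehashing (an idea, not something in the pastebin code): the final entry count is known up front, so the bucket allocation can be paid for once at startup:

Code:
#include <boost/unordered_map.hpp>
#include <stdint.h>
#include <stddef.h>

// Pre-size the table so boost never rearranges buckets mid-search.
// expected_entries would be RAINBOW_ENTRIES (SEARCH_SPACE / CHAIN_LENGTH).
void presize(boost::unordered_map<uint64_t, uint64_t>& table,
             size_t expected_entries)
{
    table.max_load_factor(0.9f);  // let buckets fill further before any rehash
    // Request enough buckets to hold every entry under that load factor.
    table.rehash(static_cast<size_t>(expected_entries / 0.9f) + 1);
}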

If you'd like to check on my progress, the source is right here.  It's a drop-in replacement for "momentum.cpp".



Edit: There are other optimizations possible when using the time/memory tradeoff.  For example, if we normally took 1 minute per nonce-space search, cranking the time/memory ratio up to 16 means we'd expect 16 minutes per search, which is considerably longer than the target time per block.  To compensate, we can have the algorithm return immediately on the first collision discovery.  When a new block is detected, all the miner threads should be stopped and recreated.

25
General Discussion / Re: $5000 Bounty - Momentum Proof-of-Work Hacking
« on: November 08, 2013, 05:28:01 pm »
It looks like there's a major flaw with birthday collisions as a proof of work:

For any hash algorithm with an L-bit output, a collision can be found in 2^(L/2) + 1 computations and a constant amount of memory using a cycle-detection algorithm.  Given that Momentum uses a 50-bit hash, a collision can be found in 2^25 + 1 steps, which 1) is, ironically, less than MAX_MOMENTUM_NONCE, and 2) relies almost exclusively on SHA512 calls, so it can be made stupid parallel.
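
For reference, the textbook version of that attack is Floyd's cycle-finding applied to iterating the hash on its own output. A sketch (the toy H() below stands in for a hash that maps the 50-bit space to itself; getting Momentum's hash into that form is the catch, as discussed above):

Code:
#include <stdint.h>
#include <utility>

// Toy 50-bit mixer standing in for the hash; NOT cryptographic.
uint64_t H(uint64_t x)
{
    x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;
    x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;
    return (x ^ (x >> 31)) & ((1ULL << 50) - 1);
}

// Constant-memory collision search: expected ~2^(L/2) calls to H for an
// L-bit hash, using only a handful of 64-bit variables.
std::pair<uint64_t, uint64_t> find_collision(uint64_t seed)
{
    // Phase 1: find a meeting point inside the cycle of x, H(x), H(H(x)), ...
    uint64_t tortoise = H(seed);
    uint64_t hare     = H(H(seed));
    while (tortoise != hare) {
        tortoise = H(tortoise);
        hare     = H(H(hare));
    }

    // Phase 2: restart from the seed; both pointers now meet at the start of
    // the cycle, and their predecessors are two distinct inputs with equal
    // hashes. (A seed that already lies on the cycle gives prev_t == prev_h;
    // just retry with a different seed.)
    uint64_t prev_t = seed, prev_h = hare;
    tortoise = seed;
    while (tortoise != hare) {
        prev_t   = tortoise;
        prev_h   = hare;
        tortoise = H(tortoise);
        hare     = H(hare);
    }
    return std::make_pair(prev_t, prev_h);  // H(prev_t) == H(prev_h)
}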

In fact, there's a whole section of the book "Introduction to Modern Cryptography" dedicated to collision attacks.  An erratum covering that section can be found here: http://www.cs.umd.edu/~jkatz/imc/hash-erratum.pdf (it contains mathematical proofs of the claims above).

Additional discussion on the matter can be found here: http://crypto.stackexchange.com/questions/3295/how-does-a-birthday-attack-on-a-hashing-algorithm-work

The fact that the algorithm uses a constant amount of memory regardless of the hash output size means that no matter how large the MAX_MOMENTUM_NONCE or SEARCH_SPACE_BITS values are, the memory requirements are essentially zero.  This means that no matter how slow the algorithm is, it's infinitely more efficient (memory vs collision rate) than the current implementation.
Given that it requires a constant amount of memory, Momentum as a proof-of-work is essentially equivalent to scrypt/md5/sha512 on a higher difficulty.


To reinforce these claims, I'm actively working on a proof of concept that uses constant memory and almost nothing but SHA512 calls.

26
General Discussion / Re: $5000 Bounty - Momentum Proof-of-Work Hacking
« on: November 08, 2013, 04:41:45 pm »
The point of Scrypt was to make parallel operation difficult and its primary weakness is that validation time suffers as you scale it.

If you don't require the first match in the sequence of generated pseudo-random numbers, then the parallelism attack I described upthread lets GPUs blow away CPUs, assuming the pseudo-random sequence can be generated much faster than the random-memory-latency bound on the CPU, so the GPU can try hundreds of values from the same pseudo-random sequence in parallel to hide the memory latency.

If you do require it, then you lose the fast validation time.

Do you get it now, or are you going to continue to ignore me?

I expended the time to explain this in more detail:

https://bitcointalk.org/index.php?topic=325261.msg3520992#msg3520992

I think that is the last effort I will apply to this.

Even if you allow that a GPU is "leaps and bounds" faster than CPUs, you still run into major problems.

Searching with a CPU using memory lookups is O(n), whereas brute-forcing combinations with a GPU is still O(n^2).
No matter how much faster the GPU is, it's faster by a constant multiple, while the required number of calculations is effectively squared.
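
To put rough numbers on that (my own back-of-the-envelope, ignoring constant factors):

n = 2^26 nonces
GPU work / CPU work = n^2 / n = n = 2^26, roughly 6.7 * 10^7

So even a GPU that hashes 1,000x faster than a CPU would still finish the O(n^2) search about 67,000x slower than a CPU doing O(n) memory lookups.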

27
General Discussion / Re: $5000 Bounty - Momentum Proof-of-Work Hacking
« on: November 08, 2013, 01:38:19 am »
Quote
gigawatt,
     Thank you for providing the first innovative algorithm for reducing the memory requirements.  Let me attempt to post mitigating factors to your algorithm.

     From a CPU miner perspective, your reduction in memory comes at the expense of performance, and so does not break the algorithmic complexity of the algorithm.

     From a GPU perspective you have to populate a bloom filter with 2^26 results... based upon my understanding of how bloom filters operate, this would require updating a common data structure from every thread, and the resulting memory race conditions could create false negatives.  If you have to do this step sequentially, then you might as well use a CPU with memory.

     So do you have any solid algorithms that can populate a bloom filter with 2^26 results in parallel? 

There are actually a few academic papers on the topic, mainly because bloom filters have lots of applications; see the sketch after the links.

https://sites.google.com/site/venkateshnrao/projects/parallel-bloom-filter-implementation-on-gpu-using-cuda
https://sbs.cse.wustl.edu/pubs/mcbf11.pdf
http://www.gpucomputing.net/?q=node/13381
http://dl.acm.org/citation.cfm?id=1982363
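
At least part of the answer, as I understand it, is to set the filter's bits with atomic OR operations: an atomic read-modify-write can't lose a concurrent update, so parallel population can't introduce the false negatives you're describing. A minimal CPU-side sketch of the principle (my illustration, single hash function for brevity, not code from any of the papers):

Code:
#include <atomic>
#include <stdint.h>
#include <stddef.h>
#include <vector>

// Bloom filter that many threads may populate concurrently. fetch_or is an
// atomic read-modify-write, so two threads setting bits in the same word
// can never clobber each other; the race that would cause false negatives
// only exists with plain, non-atomic updates.
class ConcurrentBloom {
public:
    explicit ConcurrentBloom(size_t bits)
        : words_((bits + 63) / 64) {}  // sized once, never resized (atomics can't move)

    void insert(uint64_t h) {          // h: a precomputed hash of the item
        size_t bit = static_cast<size_t>(h % (words_.size() * 64));
        words_[bit / 64].fetch_or(1ULL << (bit % 64), std::memory_order_relaxed);
    }

    bool maybe_contains(uint64_t h) const {
        size_t bit = static_cast<size_t>(h % (words_.size() * 64));
        return (words_[bit / 64].load(std::memory_order_relaxed) >> (bit % 64)) & 1;
    }

private:
    std::vector<std::atomic<uint64_t> > words_;
};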
