Messages - dga

61
For those who don't feel like reading through a bunch of source code but find the algorithmic issues interesting, I've posted a little writeup on my blog about the basic problem and solution used in the GPU PTS miner:

http://da-data.blogspot.com/2014/01/gaining-momentum-duplicate-detection-in.html

  -Dave
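
For readers who want a feel for the basic problem before reading the writeup, here is a naive sketch of duplicate detection over truncated hashes. It is not the method used in the GPU miner: the function name `find_collisions`, the 4-byte nonce encoding, and the small `bits` truncation are assumptions chosen only so the example runs quickly.

```python
import hashlib
from collections import defaultdict

def find_collisions(header: bytes, nonce_count: int, bits: int = 20):
    """Return pairs of distinct nonces whose truncated SHA-512 values match.

    Hypothetical helper: the header encoding, nonce width, and `bits`
    truncation are illustrative assumptions, not the real PTS parameters.
    """
    mask = (1 << bits) - 1
    seen = defaultdict(list)   # truncated hash -> nonces that produced it
    pairs = []
    for nonce in range(nonce_count):
        digest = hashlib.sha512(header + nonce.to_bytes(4, "little")).digest()
        key = int.from_bytes(digest[:8], "little") & mask
        # Every earlier nonce with the same key forms a colliding pair.
        for earlier in seen[key]:
            pairs.append((earlier, nonce))
        seen[key].append(nonce)
    return pairs
```

A dictionary like this is simple but memory hungry and cache unfriendly, which is why the duplicate-detection step is the interesting part to optimize.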

62
BitShares PTS / Re: Open source optimized PTS CPU miner (BETA)
« on: January 13, 2014, 05:56:41 pm »
Why not AVX2 for SHA256 as well?

Why bother?

Won't improve anything?

There are 4-10 hash collisions per group of 2^23 SHA512 hashes that have to be pushed through SHA256.  Making SHA256 faster would make 1/1,000,000th of the computation faster. :-)
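
The arithmetic behind that estimate can be checked with a few lines. This is a back-of-the-envelope sketch that counts hash invocations only and assumes one SHA256 call per collision; it ignores the relative cost of a SHA512 versus a SHA256 call.

```python
HASHES_PER_GROUP = 2 ** 23                 # SHA512 hashes per group (from the post)
COLLISIONS_LOW, COLLISIONS_HIGH = 4, 10    # collisions per group (from the post)

def sha256_fraction(collisions: int, hashes: int = HASHES_PER_GROUP) -> float:
    """Fraction of all hash calls that are SHA256, assuming one call per collision."""
    return collisions / (hashes + collisions)

low = sha256_fraction(COLLISIONS_LOW)
high = sha256_fraction(COLLISIONS_HIGH)
print(f"SHA256 share of hash calls: {low:.2e} to {high:.2e}")
```

Either end of the range is on the order of one in a million, so speeding up SHA256 cannot move the overall rate in any measurable way.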

63
BitShares PTS / Re: Open source optimized PTS CPU miner (BETA)
« on: January 13, 2014, 05:54:03 pm »
Why not AVX2 for SHA256 as well?

Why bother?

64
MemoryCoin / Re: Typo in comments in momentum.cpp
« on: January 13, 2014, 04:43:52 pm »
Not too much of a big deal, still functional. Although well done for pointing it out I'm sure he will appreciate it

Yup, it's just a comment typo.  :)  It just makes the PoW function a little harder to understand when reading the source.

65
BitShares PTS / Re: Open source optimized PTS CPU miner (BETA)
« on: January 13, 2014, 03:29:26 pm »

Is there any reason not to make it perm by putting vm.nr_hugepages = 2048 in sysctl.conf?

It's what I do on my machines.  It may tie up a little more memory on your system, but if it's used for a lot of mining, it's a good plan.

My miner uses them less than yam does, I believe, so you'll get a boost on yam also, if it's not going under the covers and enabling them for you. :-)

  -Dave
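
To put a number on how much memory that setting ties up, here is a small sketch. It assumes the common x86-64 default huge page size of 2048 kB; the helper names are made up for illustration, and the actual size on a given machine is reported on the `Hugepagesize:` line of /proc/meminfo.

```python
def hugepage_size_kib(meminfo_text: str, default: int = 2048) -> int:
    """Read the huge page size (in kB) from /proc/meminfo-style text.

    Falls back to `default` (an assumed 2048 kB) if the line is missing.
    """
    for line in meminfo_text.splitlines():
        if line.startswith("Hugepagesize:"):
            return int(line.split()[1])
    return default

def reserved_hugepage_bytes(nr_hugepages: int, hugepage_kib: int = 2048) -> int:
    """Memory reserved by vm.nr_hugepages, in bytes."""
    return nr_hugepages * hugepage_kib * 1024

# vm.nr_hugepages = 2048 with 2048 kB pages reserves 4 GiB.
print(reserved_hugepage_bytes(2048) / 1024 ** 3, "GiB")
```

Under that assumption the reservation is 4 GiB, so a smaller value may be a better fit on machines that are short on RAM.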

66
MemoryCoin / Typo in comments in momentum.cpp
« on: January 13, 2014, 12:52:51 pm »
Hi, FreeTrade -

Line 94 of momentum.cpp has an incorrect comment:

//use last 4 bits of first cache as next location

4 should be 14 (or abstracted in terms of the other constants).
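
To show why the difference matters when reading the code, here is a small sketch of low-bit masking. It does not reproduce momentum.cpp: `low_bits` is a hypothetical helper, and reading "last" as "least significant" is an assumption.

```python
def low_bits(value: int, bits: int) -> int:
    """Keep only the `bits` least significant bits of `value`."""
    return value & ((1 << bits) - 1)

word = 0xDEADBEEF
four = low_bits(word, 4)        # one of 2**4  = 16 possible locations
fourteen = low_bits(word, 14)   # one of 2**14 = 16384 possible locations
print(hex(four), hex(fourteen))
```

With 4 bits the next location could only take 16 values, while 14 bits gives 16384, so the comment badly understates how much of the cache is reachable.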

67
BitShares PTS / Re: Open source optimized PTS CPU miner (BETA)
« on: January 13, 2014, 12:29:16 pm »
you replaced by "Hello, World!" log output >:(

keep up the good work & thank you :D

Oops.  I didn't think anyone would notice.  *grins*

Also, I've reduced the severity of that "oh my god no huge pages" message.  It now presents it as a suggestion for how to get better throughput.

68
BitShares PTS / Re: Open source optimized PTS CPU miner (BETA)
« on: January 13, 2014, 01:56:12 am »
Looked like it was faster than yam on my two junk servers, but slower on the rest

it was also dumping cores everywhere with mmap failing

Thanks for the report.  The mmap failure is just a warning - it falls back to malloc.

To silence - and run a little faster with both yam and my code - run:

echo "2048" > /proc/sys/vm/nr_hugepages
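
The "try huge pages, otherwise fall back" pattern looks roughly like the sketch below. This is an illustration in Python rather than the miner's own code (which falls back to malloc); `allocate_buffer` is a hypothetical name, and the `MAP_HUGETLB` value is the usual Linux constant, assumed here in case the `mmap` module does not expose it.

```python
import mmap

# Linux value for MAP_HUGETLB; an assumption if the mmap module lacks it.
MAP_HUGETLB = getattr(mmap, "MAP_HUGETLB", 0x40000)

def allocate_buffer(size: int):
    """Try a huge-page-backed anonymous mapping; fall back to a normal one.

    Returns (buffer, used_huge_pages). `size` should be a multiple of the
    huge page size (2 MiB is assumed by the caller below).
    """
    try:
        flags = mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS | MAP_HUGETLB
        return mmap.mmap(-1, size, flags=flags), True
    except (OSError, ValueError):
        # No huge pages reserved, or not supported here: carry on without them.
        print("suggestion: reserve huge pages for better throughput")
        return mmap.mmap(-1, size), False

buf, used_huge = allocate_buffer(2 * 1024 * 1024)
```

Either way the caller gets a usable buffer; the only difference is throughput.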

But the core dumps are bad.  Could you send me a little more detail, or a stack trace?  (And what kind of machine was it on?)

Being slower on the rest isn't too surprising.  There are a lot of optimizations to be done yet, particularly for huge servers with respect to thread affinity and other things.  And the SHA512 code is virtually untouched from the Intel release.  The goal isn't to beat yam with this release; it's just to start the ball rolling a little bit.

There are some constants to play with to tune for different platforms, but it's not worth going there yet (unless you're interested in poking in the code).

  -Dave

69
BitShares PTS / Re: GPU miners comparsion
« on: January 13, 2014, 01:39:51 am »
What is "cudapts"?  shrug

I am getting 1750 cpm with gtx 780 +150 core -100 memory on "PtsGPUz0.2ab"

ptsgpuz0.2ab is a windows port that combines jhprotominer with the GPU core code from cudapts.  Your performance should be similar to that achieved by cudapts, at least as of a version or two ago.

70
BitShares PTS / Open source optimized PTS CPU miner (BETA)
« on: January 13, 2014, 01:22:25 am »
Following up on yvg1900's release of yam, I figured I'd improve the state of the art of the open source versions a bit:

https://github.com/dave-andersen/ptsminer

I haven't made it build yet on windows (it just needs to compile the avx2 assembly code - should be straightforward if someone wants to clue me in on how to appropriately invoke gcc there), but it should work on other platforms.  As a warning, I've only really tried it on avx2, since I'm a fan of Haswell.  THIS SOFTWARE SHOULD BE CONSIDERED A BETA QUALITY RELEASE.  At best. 

As with my GPU release, this one is based very directly on ptsminer, so it's tied to beeeeer for the moment.  I plan to fix that and let it be used with other pools in the near future, but that's going to take some dev work.  sigh.

There's a lot of optimization left to do, but this covers the basics of memory subsystem optimization and bridges a lot of the gap between the old OSS version and yam M7i.  I haven't tried out M7j, mind you; it's probably a bit faster still.

It incorporates the same optional, extendible 1% dev fee that the gpu miner does.  Prior ptsminer devs, if you feel like you should be in the list, please PM me and I'll get you added!

With gratitude to FreeTrade for the donation that kept me interested in hacking on and releasing this stuff, and to yvg1900 for some very engaging unofficial competition. *grin*

  -Dave

71
BitShares PTS / Re: GPU miners comparsion
« on: January 12, 2014, 10:51:48 pm »
1gh v1.2 - windows 8 64 bit.  sapphire 7970 ghz edition 3gb, 1040 core 1500 ram clock speed. ~1120 CPM

Just for fun:

cudapts (linux) v2014-01-12 GTX690, stock clock settings, 1780 c/m.
Would you mind trying the 1gh miner on this one? That would give a general idea of how much OpenCL is inferior to CUDA.

Unfortunately, that machine also has one of my coin wallets on it, and I don't run untrusted binaries on it. :(

(Mind you, I'm not saying there's anything wrong with the 1gh miner - but there's been a lot of malware floating around.)

But - probably the best case thus far is the EC2 results showing 480 c/m vs about 780.  One might expect the same per-core results to extend to the 690, so maybe it would get about 1150?  (total SWAG)
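
The scaling behind that guess can be written out as follows. It is only a sketch of the proportional reasoning: it assumes the OpenCL-to-CUDA ratio measured on EC2 carries over unchanged to the 690, and it takes the CUDA reference rate as somewhere between 750 and 780 c/m.

```python
def scaled_estimate(cuda_rate: float, opencl_ref: float, cuda_ref: float) -> float:
    """Scale a CUDA rate by the OpenCL/CUDA ratio seen on a reference machine."""
    return cuda_rate * opencl_ref / cuda_ref

GTX690_CUDA = 1780    # c/m, cudapts on the GTX690 (from the thread)
EC2_OPENCL = 480      # c/m, OpenCL miner on EC2 (from the thread)

low = scaled_estimate(GTX690_CUDA, EC2_OPENCL, 780)
high = scaled_estimate(GTX690_CUDA, EC2_OPENCL, 750)
print(f"estimated OpenCL rate on the 690: {low:.0f} to {high:.0f} c/m")
```

That lands in the same ballpark as the rough figure quoted above.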

72
BitShares PTS / Re: GPU miners comparsion
« on: January 12, 2014, 10:44:16 pm »
1gh v1.2 - windows 8 64 bit.  sapphire 7970 ghz edition 3gb, 1040 core 1500 ram clock speed. ~1120 CPM

Just for fun:

cudapts (linux) v2014-01-12 GTX690, stock clock settings, 1780 c/m.
Would you mind trying the 1gh miner on this one? That would give a general idea of how much OpenCL is inferior to CUDA.

Unfortunately, that machine also has one of my coin wallets on it, and I don't run untrusted binaries on it. :(

(Mind you, I'm not saying there's anything wrong with the 1gh miner - but there's been a lot of malware floating around.)

73
BitShares PTS / Re: GPU miners comparsion
« on: January 12, 2014, 10:32:57 pm »
1gh v1.2 - windows 8 64 bit.  sapphire 7970 ghz edition 3gb, 1040 core 1500 ram clock speed. ~1120 CPM

Just for fun:

cudapts (linux) v2014-01-12 GTX690, stock clock settings, 1780 c/m.

74

(...)

I am adapting my version of ptsminer to work with opencl. I can't get the same performance as cudapts yet (on an amazon g2.2xlarge his version gets 550 cpm and mine gets 290 cpm, so lots of room to improve). But you are welcome to try it (on AMD GPUs, for example, where there's no CUDA), and of course, suggest improvements.

https://github.com/girino/ptsminer

Got 480 CPM on amazon g2.2xlarge. Still not as good as cudapts, but getting close!
What is cudapts cpm there?

It's currently about 750-780.  480 is pretty good from OpenCL on Nvidia.

  -Dave

75
With the new version it looks like I'm down 100 c/m per GTX 295.
Each GPU in the GTX 295 was getting 340+; now it is only getting 290 c/m.
+1 to this - my GT210 dropped from 80c/m to 50c/m with the latest update.

That's no good. :(

Did you both let it run for a while before comparing the #s?  It can take an hour or two (particularly with slower cards) for the rate to become steady.  I'm going to add another "speed" metric that's a bit more relevant to quick benchmarking, but for now, c/m it is. :)

I'm surprised.  The changes I made shouldn't be large, and slower cards, if anything, should see almost no effect.
