Messages - dga

16
BitShares PTS / Re: Open source optimized PTS CPU miner (BETA)
« on: February 01, 2014, 08:24:32 pm »
beta9 for AVX2 is now online in the usual place:  http://www.cs.cmu.edu/~dga/ptsminer/

beta9 for AVX is also now online.  This one should be a good speed boost - I'm seeing my test machine go from about 780 cpm to 1020 cpm.

Note:  Unlike prior avxsse releases, this avx release really does require AVX.  It's compiled to target Sandy Bridge and higher.  I've changed the name of the binary to reflect this, and left the old avxsse one (which will run on SSE4) online.
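For anyone unsure which binary matches their CPU, here is a minimal Python sketch that reads the feature flags Linux reports in /proc/cpuinfo. The function name pick_build and the returned labels are made up for illustration; the flag names (avx2, avx, sse4_1) are the ones the kernel prints, and the mapping to builds just follows the note above.

```python
def pick_build(cpuinfo_text):
    """Suggest a build label from the text of /proc/cpuinfo.

    Returns "avx2", "avx", "avxsse", or None if SSE4.1 is not reported.
    The labels are illustrative, not official binary names.
    """
    flags = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # The line looks like "flags\t\t: fpu vme ... avx avx2 ..."
            flags = set(line.split(":", 1)[1].split())
            break
    if "avx2" in flags:
        return "avx2"      # Haswell and later
    if "avx" in flags:
        return "avx"       # Sandy Bridge and later
    if "sse4_1" in flags:
        return "avxsse"    # older build that runs on SSE4
    return None
```

On a Linux box it could be called as pick_build(open("/proc/cpuinfo").read()).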

Direct link:  http://www.cs.cmu.edu/~dga/ptsminer/ptsminer-dga-beta9-avx-linux64-static.bin

Happy mining!

Update:  This one is producing very mixed results.  Try beta8 and beta9 and use whichever is better for you.  Beta9 is rocking on my AMD test CPU, but it seems slower on some others.  Definitely needs improvement still.

17
BitShares PTS / Re: Open source optimized PTS CPU miner (BETA)
« on: February 01, 2014, 07:19:38 pm »
beta9 for AVX2 is now online in the usual place:  http://www.cs.cmu.edu/~dga/ptsminer/

This is a speed-boost release.  I'm still doing the benchmarking runs, but on my i7-4770, it's the first of my releases to crack 600 cpm.  Looks like it's going to settle in between 610 and 620 cpm with 7 threads running on my test box.

beta9 is Haswell-only right now;  its optimizations are specific to AVX2.  I plan to address some of the portability/pool selection issues soon (because I'm running out of great ideas for how to make this thing faster without getting ugly).

18
Huge thanks for leaving yours open source.  It's educational to see what you did for optimizing it for OpenCL - I haven't quite wrapped my head around when it's best to use the int4/int8 types, and I appreciate you providing an example of how the CUDA version translates into optimized CL.

As an appreciation:

An optimization that I know the CUDA compiler gets through automated loop unrolling, but that I'm not sure the OpenCL compiler gets, is that in your step0to15 function, you can eliminate the + w[i] for values of i between 5 and 14.  It might be worth testing a variant with two versions of step0to15, one "normal" and one for the known-zero values in w.
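To illustrate the idea in isolation, here is a rough Python sketch of one SHA-512 round written two ways: the normal form that adds the message word, and a variant that drops the add for words known to be zero. It is not the actual step0to15 code from either miner; the function names step and step_zero_w, the sample state, and the choice of sample constant are assumptions for the example. Only the rotation amounts and the round structure come from the SHA-512 definition.

```python
MASK = (1 << 64) - 1  # SHA-512 works on 64-bit words


def rotr(x, n):
    """Rotate a 64-bit word right by n bits."""
    return ((x >> n) | (x << (64 - n))) & MASK


def _t1_without_w(e, f, g, h, k):
    big_sigma1 = rotr(e, 14) ^ rotr(e, 18) ^ rotr(e, 41)
    ch = (e & f) ^ (~e & MASK & g)
    return (h + big_sigma1 + ch + k) & MASK


def _finish(state, t1):
    a, b, c, d, e, f, g, _h = state
    big_sigma0 = rotr(a, 28) ^ rotr(a, 34) ^ rotr(a, 39)
    maj = (a & b) ^ (a & c) ^ (b & c)
    t2 = (big_sigma0 + maj) & MASK
    return ((t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g)


def step(state, k, w):
    """Normal round: T1 includes the message word w."""
    _a, _b, _c, _d, e, f, g, h = state
    t1 = (_t1_without_w(e, f, g, h, k) + w) & MASK
    return _finish(state, t1)


def step_zero_w(state, k):
    """Round for a message word known to be zero: the add is skipped."""
    _a, _b, _c, _d, e, f, g, h = state
    return _finish(state, _t1_without_w(e, f, g, h, k))


# Sample values only: an arbitrary state and one sample round constant.
sample_state = tuple(range(1, 9))
sample_k = 0x59F111F1B605D019
same = step(sample_state, sample_k, 0) == step_zero_w(sample_state, sample_k)
```

The two functions agree whenever w is zero, so the saving is one 64-bit add per affected round. Whether that matters depends on whether the compiler already folds the constant after unrolling.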

  -Dave

19
BitShares PTS / Re: Open source optimized PTS CPU miner (BETA)
« on: January 30, 2014, 11:04:27 am »
Hi dga,

After 3 or 4 days, the v8 avx build slowed down from 320 to 260 cpm.
I'm not sure about v8 avx2, since it runs as a daemon, but profit seems to have dropped to 75%.

Killing ptsminer and restarting it seems to fix it.

I'm not sure what's happening, since yam gets a lot of rejects from beeeeer.org too.

yam on 1GH has no problem using the new xpt2h protocol on port 18120.

Right now ptsminer is locked in to beeeeer.org.

Any plans to support the xpt2h protocol?

I haven't seen avx2 slow down, but beeeeer has had a bad string of luck lately with block finding - my profit is also down a fair bit. 

My own avx beta8 client hasn't slowed down:

2014-Jan-25 17:00:19 | 760.0 c/m | 14.1 sh/m
2014-Jan-30 06:00:07 | 774.9 c/m | 12.2 sh/m

but that doesn't mean there's not something wrong.  What CPU are you running the avx one on and with how many threads?

I do hope to add more protocol support.  I have real work taking up all of my time until this weekend, but I'll check out xpt2h then.

  -Dave

20
BitShares PTS / Re: Open source optimized PTS CPU miner (BETA)
« on: January 26, 2014, 12:40:56 am »
Upgraded to a Haswell E3-1230 v3.

beta7 avx2 cpm : 530
beta8 avx2 cpm : 542

It's now faster than my 530 cpm GT 560!

By the way, it seems ptsminer works 60 s for the developer, then 200 s for the miner.
The next round is 1200 s for the developer and 40,000 s for the miner.

That's fine if I'm running a server and never power off.

But when I run ptsminer on my desktop, in most cases I run it for about 20,000 s
and then power off.

So 1200 / 20000 = 6%.

For avx2, even at 6% ptsminer is still far better than yam, so this is just for your information.

Glad to hear it's running well on the E3.

Noted about the devmine fee.  I'll fix that in a few betas.  What it really should be is an exponentially increasing sequence (with a cap) -- dev 60, user 2000;  dev 120, user 4000;  dev 240, user 8000; etc., which would reduce the problem you're seeing if you kill at exactly the wrong time, while still reducing the amount of interruption due to mining switches.  There are a few other things I want to do to make the dev mining more robustly fair under disconnects/etc., which is why I haven't just thrown out the exponential version.
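A rough Python sketch of the two schedules, to show how much of a 20,000 s desktop session each spends dev-mining. This is not the miner's actual code: the names dev_schedule and dev_fraction are made up, and the cap value of 960 s is an assumption, since no cap is given above.

```python
def dev_schedule(dev_start=60, user_start=2000, dev_cap=960):
    """Yield (dev_seconds, user_seconds) rounds, doubling until the cap.

    dev_cap is a hypothetical value chosen for the example.
    """
    dev, user = dev_start, user_start
    while True:
        yield dev, user
        if dev * 2 <= dev_cap:
            dev, user = dev * 2, user * 2


def dev_fraction(schedule, runtime):
    """Fraction of `runtime` seconds spent dev-mining if killed at `runtime`."""
    elapsed = 0
    dev_time = 0
    for dev, user in schedule:
        d = min(dev, runtime - elapsed)      # dev phase, possibly cut short
        dev_time += d
        elapsed += d
        if elapsed >= runtime:
            break
        elapsed += min(user, runtime - elapsed)  # user phase
        if elapsed >= runtime:
            break
    return dev_time / runtime


# Schedule as reported in the quoted post: 60/200, then 1200/40000.
reported = [(60, 200), (1200, 40000)]
print(f"reported:    {dev_fraction(reported, 20000):.1%}")
print(f"exponential: {dev_fraction(dev_schedule(), 20000):.1%}")
```

Under these assumptions the reported schedule comes out at about 6.3% for a 20,000 s run and the exponential one at about 4.5%. Both settle toward roughly 3% on long runs.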

21
BitShares PTS / Re: Open source optimized PTS CPU miner (BETA)
« on: January 25, 2014, 08:18:14 pm »
I've placed beta8 for haswell/avx2 and now for avxsse online.  It fixes (I believe) the bug that was causing failures when not enough hugepages were available, and incorporates the latest round of speed improvements for Haswell/AVX2.  Speed isn't changed much for avxsse, but I'm narrowing in on some more general improvements that should help there too.

http://www.cs.cmu.edu/~dga/ptsminer/

This is a quite worthwhile upgrade for the Haswell/AVX2 crowd.  Expect at least a 20 cpm jump and probably more - I'm still letting the cpm benchmarks run, but my dev benchmarks suggest a 5-10% speedup over beta7.  I'll update this post tonight with some actual CPM numbers from an i7-4770.

Update 2:
Totally rough guesstimate:

[STATS] 2014-Jan-25 17:34:57 | 570.1 c/m | 8.9 sh/m | VL: 299 (99.7%), RJ: 1 (0.3%), ST: 0 (0.0%)

I expect sustained rates of 565 c/m over a longer period of time.  Not bad, little CPU, not bad.

  -Dave

22
BitShares PTS / Re: "Protoshares is a CPU-mined coin"
« on: January 25, 2014, 01:33:49 am »
Quote

Dude, come on, it's not even relevant... A (big) bounty was put up just to see if it could withstand GPU mining. Of course there are people going after the bounty and trying their best to 'defeat' it, and yes, they succeeded, but it's certainly not like GPU mining has taken over. I suggest you read up on what this is all about and what the people here are doing with passion, and I'm sure you'd change your opinion instantly. Seems like you just wanted some quick 'n easy cash while 'investing' $20, but it didn't work out, and now you're looking for someone to blame.

Chiming in here as one of the bounty claimants:

Almost anything can be done on a GPU.  The real question is what the efficiency ratio is relative to CPU.  I've been pretty public in stating that I believe that the "GPU proof"-ness is bogus, but it's worth taking a closer look at the coins/sec/$ earned on CPUs vs GPUs for different coins - and, while we haven't seen the end of advances in either area, at this point, PTS and MMC are actually really interesting:  The coins/sec/$ is much closer between GPU and CPU than for most other currencies (except XPM, because nobody's bothered yet).

I keep a little spreadsheet for my own miners, and the entries in there are really interesting:

GTX 640 GDDR5 PTS:  250 c/m for $85 - about 2.9 c/m/$.
Intel Xeon E3 Haswell - estimated 525 c/m for about $268 - about 1.96 c/m/$.

Shopping the GPU around might double that c/m/$ if you found the right cheap card, but the difference there -- 50%?  100%? -- is negligible compared to the 10-50x difference with scrypt and the massive gap for SHA512 ASICs.
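The arithmetic behind those two entries, as a small Python sketch. The prices and rates are the ones quoted above; the function name cpm_per_dollar is just for illustration. The figures work out to roughly 2.9 and 1.96 c/m/$.

```python
def cpm_per_dollar(cpm, price_usd):
    """Collisions per minute bought per dollar of hardware."""
    return cpm / price_usd


gpu = cpm_per_dollar(250, 85)    # GPU entry from the spreadsheet
cpu = cpm_per_dollar(525, 268)   # Xeon E3 Haswell entry (estimated rate)
advantage = gpu / cpu            # how far ahead the GPU is per dollar

print(f"GPU {gpu:.2f} c/m/$, CPU {cpu:.2f} c/m/$, ratio {advantage:.2f}x")
```

A ratio near 1.5x is the "50%" gap mentioned above, which is small next to the 10-50x gap cited for scrypt.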

I don't personally think that having a "CPU-friendly" coin is very important - all of the mining blah is just a mechanism for double-spend prevention - but the algorithms are technically interesting in that they do go a long way to evening the field compared to prior ones, if an even field is your goal.

23
BitShares PTS / Re: Open source optimized PTS CPU miner (BETA)
« on: January 24, 2014, 07:02:20 pm »
Hm.  What happens if you run with only 1 thread?

What happens if you first run, as root:
   echo 2048 > /proc/sys/vm/nr_hugepages

and then run with only one thread?

If it works with 1 thread, does it also work with 2?
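To confirm that the echo above actually reserved pages, the counters in /proc/meminfo can be read back. A minimal Python sketch, assuming the standard HugePages_Total and HugePages_Free lines; the helper name hugepage_status is made up for illustration.

```python
def hugepage_status(meminfo_text):
    """Extract the huge page counters from the text of /proc/meminfo."""
    wanted = ("HugePages_Total", "HugePages_Free")
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted and rest.strip():
            # Lines look like "HugePages_Total:    2048"
            fields[key] = int(rest.split()[0])
    return fields
```

On Linux it could be called as hugepage_status(open("/proc/meminfo").read()). If HugePages_Total comes back lower than the number requested, the kernel could not find enough contiguous memory, and requesting the pages soon after boot may help.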

  -Dave

I downloaded the beta7.1 static binary and ran it on my CentOS 6 box, but it failed:

Code:
using SSE4
spawning 4 worker thread(s)
[WORKER[WORKER[WORKER2] starting
3] starting
1] starting
[WORKER0] starting
Couldn't use the hugepage speed optimization for big table.  Enable huge pages for a slight speed boost.
Couldn't use the hugepage speed optimization for big table.  Enable huge pages for a slight speed boost.
Couldn't use the hugepage speed optimization for big table.  Enable huge pages for a slight speed boost.
Couldn't use the hugepage speed optimization for big table.  Enable huge pages for a slight speed boost.
Couldn't use the hugepage speed optimization for small table.  Enable huge pages for a slight speed boost
Couldn't use the hugepage speed optimization for small table.  Enable huge pages for a slight speed boost
Couldn't use the hugepage speed optimization for small table.  Enable huge pages for a slight speed boost
Couldn't use the hugepage speed optimization for small table.  Enable huge pages for a slight speed boost
[WORKER1] GoGoGo!
[WORKER2] GoGoGo!
[WORKER0] GoGoGo!
[WORKER3] GoGoGo!
connecting to 54.201.26.128:1337
Mining for approx 60 seconds to support further development
Payments to: Pr8cnhz5eDsUegBZD4VZmGDARcKaozWbBc
[MASTER] work received - sharetarget: 03ffffffffffffffffffffffffffffffffffffffffffffffffffffffbeefde4d
ptsminer-dga-beta7.1-avxsse-linux64-static.bin: malloc.c:2369: sysmalloc: Assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 * (sizeof(size_t)) < __alignof__ (long double) ? __alignof__ (long double) : 2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t)) < __alignof__ (long double) ? __alignof__ (long double) : 2 * (sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) == 0)' failed.
aborted (core dumped)

It also failed in sph mode:
Code:
using SPHLIB
spawning 4 worker thread(s)
[WORKER[WORKER12[WORKER] starting] starting

3] starting
[WORKER0] starting
Couldn't use the hugepage speed optimization for big table.  Enable huge pages for a slight speed boost.
Couldn't use the hugepage speed optimization for big table.  Enable huge pages for a slight speed boost.
Couldn't use the hugepage speed optimization for big table.  Enable huge pages for a slight speed boost.
Couldn't use the hugepage speed optimization for big table.  Enable huge pages for a slight speed boost.
Couldn't use the hugepage speed optimization for small table.  Enable huge pages for a slight speed boost
Couldn't use the hugepage speed optimization for small table.  Enable huge pages for a slight speed boost
Couldn't use the hugepage speed optimization for small table.  Enable huge pages for a slight speed boost
Couldn't use the hugepage speed optimization for small table.  Enable huge pages for a slight speed boost
[WORKER2] GoGoGo!
[WORKER3] GoGoGo!
[WORKER0] GoGoGo!
[WORKER1] GoGoGo!
connecting to 54.201.26.128:1337
Mining for approx 60 seconds to support further development
Payments to: Pr8cnhz5eDsUegBZD4VZmGDARcKaozWbBc
[MASTER] work received - sharetarget: 03ffffffffffffffffffffffffffffffffffffffffffffffffffffffbeefde4d
Segmentation fault (core dumped)

24
BitShares PTS / Re: GPU miners comparsion
« on: January 22, 2014, 09:23:35 pm »
2200 cpm on my GTX Titan with my SM35 variant of DGA's code with an improved SHA-512 core.

three thumbs up.

A good place to admit that I didn't think this optimization would be worth doing (and my naive test of it didn't work), and I was completely wrong.  go go open source version.  *grin*

25
BitShares PTS / Re: Open source optimized PTS CPU miner (BETA)
« on: January 22, 2014, 08:40:21 pm »
Using the same Internet connection, the RJ rate on my Ivy Bridge (avxsse beta7) is higher than on Haswell (avx2 beta7).

Now on several machines, VL 1013 and avxsse 14.3% : avx2 0.1%.

Ok, replying to a few of these:
 - I've put beta 7.1 online just for avxsse.  I've let a few small tweaks from what will be beta8 slip in, but it's basically the same as beta7 from a performance perspective.
 Changes:
    - It should improve the reject rate.  It's a bit more aggressive about checking for updates now without slowing down mining. (ptsrush)
    - Better handling of and diagnostic messages for out-of-memory / allocation errors. (dclark)

Update:  After about an hour of testing on a 64 core AMD machine:

    789.7 c/m | 12.3 sh/m | VL: 623 (99.7%), RJ: 2 (0.3%), ST: 0 (0.0%)

Looks like this one successfully pulls the submitted rejects down for avxsse also, though an hour isn't quite long enough to say what the overall reject rate will be.

Update 2:  After a day (the c/m and sh/m got reset but VL/RJ didn't):
    758.6 c/m | 12.0 sh/m | VL: 16570 (98.6%), RJ: 228 (1.4%), ST: 0 (0.0%)

Looks solid on rejects.

RJ is reject, ST is stale, VL is valid. 
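For comparing runs, a stats line like the ones above can be parsed and the reject rate recomputed from the raw counts. A small Python sketch; the regular expression is written against the lines shown in this thread and may not match other miner versions, and the name parse_stats is made up for illustration.

```python
import re

# Matches e.g. "758.6 c/m | 12.0 sh/m | VL: 16570 (98.6%), RJ: 228 (1.4%), ST: 0 (0.0%)"
STATS_RE = re.compile(
    r"(?P<cpm>[\d.]+) c/m \| (?P<shpm>[\d.]+) sh/m \| "
    r"VL: (?P<vl>\d+) \([\d.]+%\), RJ: (?P<rj>\d+) \([\d.]+%\), ST: (?P<st>\d+)"
)


def parse_stats(line):
    """Return the counters from one stats line, or None if it does not match."""
    m = STATS_RE.search(line)
    if m is None:
        return None
    valid, rejected, stale = int(m["vl"]), int(m["rj"]), int(m["st"])
    total = valid + rejected + stale
    return {
        "cpm": float(m["cpm"]),
        "shpm": float(m["shpm"]),
        "valid": valid,
        "rejected": rejected,
        "stale": stale,
        "reject_rate": rejected / total if total else 0.0,
    }
```

Because search is used rather than match, a leading "[STATS] date |" prefix is ignored.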

I'm pondering the open-sourceness.  In the case of GPU, I'm happy - Invictus paid for the release.  Now that I'm doing unpaid improvements to the CPU miner, I want to see how it plays out - but I'm increasingly leaning towards keeping at least some of the cutting edge private as a way to get a bit of return on development time, and trying to keep the open source version updated at a reasonable level that lags a bit behind the latest and greatest but is still a good basis for people who want to learn / explore / improve.  It's a tough question.

26
BitShares PTS / Re: Open source optimized PTS CPU miner (BETA)
« on: January 19, 2014, 09:37:15 am »
it's a centos6 box
Code:
%gcc --version
gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4)

Ow.  gcc doesn't support avx2 until somewhere in the 4.7 release.  Best suggestion for now is to try the static avxsse binary - does it work for you?

Or if you're on an avx2 machine, upgrade. :-)

Next best is that it's getting higher on the TODO to make it easy to disable avx2 building.

I installed gcc 4.7.2 but got the same errors.
I hit exactly the same problem when compiling girino's OpenCL miner.
What version of gcc do you use?

Odd.  I use a more recent one:

gcc --version
gcc (Ubuntu/Linaro 4.8.1-10ubuntu9) 4.8.1

This may also be about your version of the assembler, though:

as --version
GNU assembler (GNU Binutils for Ubuntu) 2.23.52.20130913
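A quick way to test the compiler side of this is to parse the first line of gcc --version and compare it against 4.7. A minimal Python sketch; the names gcc_version and supports_avx2 are made up for illustration, and it says nothing about the assembler, which has to understand the AVX2 instructions as well.

```python
import re


def gcc_version(first_line):
    """Parse (major, minor, patch) from the first line of `gcc --version`."""
    # Drop parenthesised vendor strings such as "(Ubuntu/Linaro 4.8.1-10ubuntu9)".
    stripped = re.sub(r"\([^)]*\)", "", first_line)
    m = re.search(r"(\d+)\.(\d+)\.(\d+)", stripped)
    if m is None:
        return None
    return tuple(int(part) for part in m.groups())


def supports_avx2(version):
    """True if the gcc version is at least 4.7, per the reply above."""
    return version is not None and version[:2] >= (4, 7)
```

With the two version strings from this thread, the CentOS 6 compiler (4.4.7) fails the check and the Ubuntu one (4.8.1) passes.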

27
BitShares PTS / Re: Open source optimized PTS CPU miner (BETA)
« on: January 18, 2014, 09:50:08 pm »
Is a new sse4 version going to be available?

I get

core 2 duo t8100 @ [STATS] 2014-Jan-18 21:44:54 | 60.6 c/m | 1.0 sh/m | VL: 502 (83.8%), RJ: 97 (16.2%), ST: 0 (0.0%)

core i3 380um @ [STATS] 2014-Jan-18 21:43:03 | 47.2 c/m | 0.7 sh/m | VL: 409 (85.9%), RJ: 67 (14.1%), ST: 0 (0.0%)

Damn, I wish I had a faster CPU.
Thanks.

Yes.  beta7 is now online for both avx2 and sse/avx. 

There aren't major changes from beta6 for avx2 users;  if you're running really happily, I wouldn't bother upgrading.  The changes are mostly internal, aimed at making it easier to build and at making the reject-reducing improvements available for SSE and AMD CPUs.  It ought to be even more aggressive for slower CPUs, but it's quite a bit better than it was.  Note:  You may still see a batch of rejects at the very start when the miner switches out of dev-mining mode for the first time.  It just depends on where things were when it switches.

One note:  More individually slower cores will result in a higher reject rate.  I'm seeing this, for example, if I push the hyperthreading too hard, and on AMD CPUs, which have more cores but no hyperthreading, with each core a bit slower.  Not horrible, but something you'll notice.

Updated:  I've also put the Mac build online for beta7, and tried to improve the static-ness of this one so it should be easier to run.  This also meets a second personal goal of mine:  It's now about as fast to mine with the CPU on the MacBook Pro as it is to use the GPU with cudapts.  Take that, cudapts!  It's about time to put the GPU-hardness back in Protoshares.  <grin>  (It does, however, make the fans spin more.)  I'm getting about 200 cpm using 4 threads on the MBP, which is pretty close to the GPU.  I don't recommend mining on a laptop, though, unless you don't like your laptop.

28
BitShares PTS / Re: Open source optimized PTS CPU miner (BETA)
« on: January 18, 2014, 07:12:55 pm »
beta6 is now available as a dynamically linked build as well.  I think the static is a better way to go in general, but I threw this one up there in case anyone wants to test it.  I've removed the dependencies upon boost_filesystem and boost_chrono (which is the first step towards getting rid of at least one of those darned makefiles, and simplifying compilation on other platforms).

Me being the high-quality software engineering house that I am, there are a few other hopefully-insignificant tweaks in the one I just put online vs the static beta6, 'cause you're just getting builds out of my dev directory as I muddle through this, but (ha ..) nothing that should cause noticeable performance difference.

Happy mining.

29
BitShares PTS / Re: Open source optimized PTS CPU miner (BETA)
« on: January 18, 2014, 02:48:16 pm »
I'm not a native speaker, but I guess in this case I should say it's "damn fast"!

Anyway, once a new block starts, it seems all 6 of my workers print "Aborting scan run because of new work."

It's a little annoying. Could you print it only once?

And sometimes it says "Not inserted: <298324783427234980742398> at 7744". Is that OK?

The RJ is about 5% in China. I'm not sure whether that is due to a bad internet connection or something else.

Thanks for these bug reports.  I've fixed both and pushed beta6 for haswell/avx2.

My benchmark hasn't been running long enough to really stabilize, but...

Beta5:   546.7 c/m | 8.7 sh/m | VL: 3937 (97.8%), RJ: 90 (2.2%), ST: 0 (0.0%)

Beta6:   543.0 c/m | 8.7 sh/m | VL: 142 (100.0%), RJ: 0 (0.0%), ST: 0 (0.0%)

I wouldn't read too much into the c/m and sh/m differences - it hasn't been running long enough - but the reject rate is reduced substantially.  The speed should be within 10 c/m plus or minus once it's been running long enough to tell.

With more data:  540.2 c/m | 7.8 sh/m | VL: 1219 (99.5%), RJ: 6 (0.5%), ST: 0 (0.0%)

Much better reject rate with beta6.
  -Dave

30
BitShares PTS / Re: Open source optimized PTS CPU miner (BETA)
« on: January 18, 2014, 02:09:30 pm »
it's a centos6 box
Code:
%gcc --version
gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4)

Ow.  gcc doesn't support avx2 until somewhere in the 4.7 release.  Best suggestion for now is to try the static avxsse binary - does it work for you?

Or if you're on an avx2 machine, upgrade. :-)

Next best is that it's getting higher on the TODO to make it easy to disable avx2 building.
