Author Topic: Open source OpenCL GPU miner by girino - Binaries for win, linux, osx (1% fee)  (Read 14845 times)

Offline girino

  • Full Member
  • ***
  • Posts: 73
    • View Profile
Re: Open source OpenCL GPU miner by girino
« Reply #75 on: January 23, 2014, 06:18:59 am »
Using some optimizations by DGA, I got another 10% improvement in performance. Still not the fastest around, but getting closer ;)

Please use the link on the first page to test!

Offline girino

  • Full Member
  • ***
  • Posts: 73
    • View Profile
Hello all,

I just released binaries for this miner (see first post). Those binaries use new sha512 code, tuned and optimized by me over the last few days. It is at least 15% faster than the old code. Binaries have a 1% fee added.

Short Version: faster! gpuv7 is default, gpuv9 is the low memory mode.

Download link: https://www.dropbox.com/sh/n4ta5olqp2g5i9l/xkr0sCTrUu

Long version: The new "-a" options "gpuv7", "gpuv8" and "gpuv9" all use the new code. "gpuv9" is the low memory version, same as "gpuv3", using the new sha512 code. "gpuv8" is the same as "gpuv4" with the new code, and "gpuv7" is based on "gpuv6", but uses a linear collision avoidance hashtable instead of the plain hashtable used before. It is bound to find slightly more hashes at the cost of being a little bit slower. Since it produces better CPM in all my tests, I made "gpuv7" the default mode. The old modes are still kept for compatibility (if my optimizations turn out not to be compatible with all GPUs, you can always fall back to the old modes).
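As a rough illustration of why a table with linear probing can find more collisions than a plain one, here is a minimal sketch in Python. It is not the miner's OpenCL kernel: the function names, the `max_probes` limit and the tiny table size are all assumptions made for the example. A plain table overwrites whatever sits in a slot, so an earlier entry can be lost before its partner arrives; a probing table looks at the next few slots first, which costs extra memory accesses.

```python
def insert_plain(table, key, nonce):
    """Plain table: one slot per index, a clash simply overwrites the slot.
    Returns the earlier nonce if the slot held the same key, else None."""
    slot = key % len(table)
    old = table[slot]
    table[slot] = (key, nonce)
    if old is not None and old[0] == key:
        return old[1]
    return None


def insert_linear(table, key, nonce, max_probes=4):
    """Linear probing (hypothetical variant): when the slot holds a different
    key, try the following slots before giving up and overwriting."""
    size = len(table)
    start = key % size
    for step in range(max_probes):
        slot = (start + step) % size
        entry = table[slot]
        if entry is None:
            table[slot] = (key, nonce)   # free slot: store and stop
            return None
        if entry[0] == key:
            return entry[1]              # same key seen before: a collision
    table[start] = (key, nonce)          # every probed slot is taken: overwrite
    return None


def count_collisions(insert, keys, size):
    """Feed keys into a fresh table and count how many collisions are reported."""
    table = [None] * size
    found = 0
    for nonce, key in enumerate(keys):
        if insert(table, key, nonce) is not None:
            found += 1
    return found


# Keys 3 and 11 share slot 3 in a table of 8 slots; key 3 then shows up again.
keys = [3, 11, 3]
plain_found = count_collisions(insert_plain, keys, 8)
linear_found = count_collisions(insert_linear, keys, 8)
```

In this toy run the plain table loses the first entry for key 3 when key 11 overwrites it, while the probing table keeps both and reports the repeat.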

Please download and enjoy.

Thanks,
girino.

Offline dully

  • Newbie
  • *
  • Posts: 2
    • View Profile
Thanks Girino,

Good PTS miner. I've tried it today and can say it is quite quick compared to the crz one (cudaPTSwin). I am getting 50-60 c/m more with this one.
I really appreciate making the source available! Thank you.

Offline hammurabi

  • Full Member
  • ***
  • Posts: 63
    • View Profile
Awesome piece of software
AMD R9 280x ~1040 c/m
built on OpenSuse 13.1

Thank You.

Offline Instigater

  • Newbie
  • *
  • Posts: 5
    • View Profile
Compilation on Ubuntu 13.10 stops here
Code: [Select]
obj/main_poolminer.o:main_poolminer.cpp:(.text._ZN5boost4asio6detail16resolver_serviceINS0_2ip3tcpEE7resolveERNS_10shared_ptrIvEERKNS3_20basic_resolver_queryIS4_EERNS_6system10error_codeE[_ZN5boost4asio6detail16resolver_serviceINS0_2ip3tcpEE7resolveERNS_10shared_ptrIvEERKNS3_20basic_resolver_queryIS4_EERNS_6system10error_codeE]+0x281): more undefined references to `boost::system::system_category()' follow
obj/main_poolminer.o: In function `CMasterThread::run()':
main_poolminer.cpp:(.text._ZN13CMasterThread3runEv[_ZN13CMasterThread3runEv]+0x127): undefined reference to `boost::thread::start_thread_noexcept()'
main_poolminer.cpp:(.text._ZN13CMasterThread3runEv[_ZN13CMasterThread3runEv]+0x1dd): undefined reference to `vtable for boost::detail::thread_data_base'
main_poolminer.cpp:(.text._ZN13CMasterThread3runEv[_ZN13CMasterThread3runEv]+0x529): undefined reference to `boost::system::system_category()'
main_poolminer.cpp:(.text._ZN13CMasterThread3runEv[_ZN13CMasterThread3runEv]+0x697): undefined reference to `boost::system::system_category()'
main_poolminer.cpp:(.text._ZN13CMasterThread3runEv[_ZN13CMasterThread3runEv]+0x7a9): undefined reference to `boost::system::system_category()'
main_poolminer.cpp:(.text._ZN13CMasterThread3runEv[_ZN13CMasterThread3runEv]+0x7c0): undefined reference to `boost::system::system_category()'
main_poolminer.cpp:(.text._ZN13CMasterThread3runEv[_ZN13CMasterThread3runEv]+0x877): undefined reference to `boost::system::system_category()'
obj/main_poolminer.o:main_poolminer.cpp:(.text._ZN13CMasterThread3runEv[_ZN13CMasterThread3runEv]+0x896): more undefined references to `boost::system::system_category()' follow
obj/main_poolminer.o: In function `_GLOBAL__sub_I_collision_table_bits':
main_poolminer.cpp:(.text.startup+0xf63): undefined reference to `boost::system::generic_category()'
main_poolminer.cpp:(.text.startup+0xf6f): undefined reference to `boost::system::generic_category()'
main_poolminer.cpp:(.text.startup+0xf7b): undefined reference to `boost::system::system_category()'
main_poolminer.cpp:(.text.startup+0xfa1): undefined reference to `boost::system::system_category()'
obj/main_poolminer.o:(.rodata._ZTIN5boost6detail11thread_dataINS_3_bi6bind_tIvNS_4_mfi3mf0Iv13CWorkerThreadEENS2_5list1INS2_5valueIPS6_EEEEEEEE[_ZTIN5boost6detail11thread_dataINS_3_bi6bind_tIvNS_4_mfi3mf0Iv13CWorkerThreadEENS2_5list1INS2_5valueIPS6_EEEEEEEE]+0x10): undefined reference to `typeinfo for boost::detail::thread_data_base'
obj/CProtoshareProcessor.o: In function `_GLOBAL__sub_I__Z31protoshares_revalidateCollisionP13blockHeader_tPhjjmP14CBlockProviderPFvS1_jS1_Ej':
CProtoshareProcessor.cpp:(.text.startup+0x33): undefined reference to `boost::system::generic_category()'
CProtoshareProcessor.cpp:(.text.startup+0x3f): undefined reference to `boost::system::generic_category()'
CProtoshareProcessor.cpp:(.text.startup+0x4b): undefined reference to `boost::system::system_category()'
CProtoshareProcessor.cpp:(.text.startup+0x71): undefined reference to `boost::system::system_category()'
collect2: error: ld returned 1 exit status
make: *** [ptsminer] Error 1
What is wrong?

Offline girino

  • Full Member
  • ***
  • Posts: 73
    • View Profile
You need to install Boost. I'm not sure what the package name is on Ubuntu 13.10; try this:

Code: [Select]
sudo apt-get install libboost-all-dev

Offline dga

  • Full Member
  • ***
  • Posts: 122
    • View Profile
Huge thanks for leaving yours open source.  It's educational to see what you did for optimizing it for OpenCL - I haven't quite wrapped my head around when it's best to use the int4/int8 types, and I appreciate you providing an example of how the CUDA version translates into optimized CL.

As an appreciation:

An optimization that I know the CUDA compiler gets through automated loop unrolling, but that I'm not sure the OpenCL compiler gets, is that in your step0to15 function, you can eliminate the + w[ i ]  for values of i between 5 and 14.  It might be worth testing a variant with two versions of step0to15, one "normal" and one for the known-zero-values in w.

  -Dave
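To make the suggestion above concrete, here is a small Python sketch (not the miner's OpenCL code). It assumes the hashed message is 36 bytes long, which is an assumption for illustration; with that length, the single padded SHA-512 block has zero in words 5 to 14, so the `+ w[i]` term in those rounds adds nothing. The helper names are hypothetical.

```python
import struct

MASK = (1 << 64) - 1  # SHA-512 works on 64-bit words


def sha512_pad_single_block(msg):
    """Pad a message shorter than 112 bytes into one 128-byte SHA-512 block
    and return its sixteen big-endian 64-bit words."""
    assert len(msg) < 112
    block = msg + b"\x80"                        # mandatory 1 bit after the data
    block += b"\x00" * (112 - len(block))        # zero fill up to the length field
    block += (len(msg) * 8).to_bytes(16, "big")  # 128-bit message length in bits
    return list(struct.unpack(">16Q", block))


def rotr(x, n):
    return ((x >> n) | (x << (64 - n))) & MASK


def big_sigma1(e):
    return rotr(e, 14) ^ rotr(e, 18) ^ rotr(e, 41)


def ch(e, f, g):
    return (e & f) ^ (~e & g & MASK)


def t1_general(h, e, f, g, k, w):
    """T1 term of a SHA-512 round, including the message word w."""
    return (h + big_sigma1(e) + ch(e, f, g) + k + w) & MASK


def t1_zero_word(h, e, f, g, k):
    """Same term for rounds where the message word is known to be zero."""
    return (h + big_sigma1(e) + ch(e, f, g) + k) & MASK


words = sha512_pad_single_block(bytes(range(36)))  # 36-byte dummy message
zero_rounds = [i for i, w in enumerate(words) if w == 0]
```

Here `zero_rounds` comes out as rounds 5 through 14, and for those rounds `t1_zero_word` gives the same result as `t1_general` with one addition fewer. Whether a given OpenCL compiler already removes that addition on its own would have to be measured.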

Offline girino

  • Full Member
  • ***
  • Posts: 73
    • View Profile
I am really quite new to OpenCL. I am used to optimizing C and C++ code, so I just tried to apply the same techniques. I still have a hard time thinking in parallel ;) As far as I have read in articles and specs, CUDA does a much better job of automatic vectorization and loop unrolling than OpenCL. It's usually recommended to leave those to the compiler in CUDA, while in OpenCL people tend to do them manually. In all my tests, only "long2" really improves speed (there are 128 registers in most GPUs), but I used long8 anyway because it really makes the code more readable ;)

I was planning to inspect the intermediate values of the vectors so I could optimize a little more, but my son had surgery (removing his tonsils, nothing serious) and I am spending all my free time with him instead of coding ;)

Possibly next week I can continue working on this.

Offline Rhorho

  • Newbie
  • *
  • Posts: 1
    • View Profile
I'm not getting any errors upon execution in Windows 7, other than this:

Code: [Select]
C:\[edited]\ptsminer-win64-cygwin64-build1\ptsminer\ptsminer.exe -u [edited] -device 1 -m 26
*********************************************************
*** GPU PTS miner by girino v0.2.1 Alpha 2 <experimental>
*** based on Pts Pool Miner v0.7 RC2 <experimental>
*** by xolokram/TB - www.beeeeer.org - glhf
***
*** GPU support and performance improvements by girino
***    if you like, donate:
***    PTS: PkyeQNn1yGV5psGeZ4sDu6nz2vWHTujf4h
***    BTC: 1GiRiNoKznfGbt8bkU1Ley85TgVV7ZTXce
*** thanks to wjchen for SSE4 improvements.
***
*** press CTRL+C to exit
*********************************************************
**SSE4/AVX auto-detection
using AVX
spawning 1 worker thread(s)
[WORKER0] Hello, World!
[WORKER0] GoGoGo!
connecting to 54.201.26.128:1337
Mining for approx 30 seconds to support further development
Payments to: PkyeQNn1yGV5psGeZ4sDu6nz2vWHTujf4h
[MASTER] work received - sharetarget: 03ffffffffffffffffffffffffffffffffffffffffffffffffffffffbeefde4d
[STATS] 2014-Feb-04 06:13:22 | VL: 0 (0.0%), RJ: 0 (0.0%), ST: 0 (0.0%)
      0 [unknown (0x2A3C)] ptsminer 9552 open_stackdumpfile: Dumping stack trace to ptsminer.exe.stackdump

The stackdumpfile contains this:

Code: [Select]
Exception: STATUS_ACCESS_VIOLATION at rip=00100401339
rax=0000000000000000 rbx=00000000FFFFC760 rcx=00000000FFFFC7C0
rdx=00000000FFFFC778 rsi=0000000000000010 rdi=0000000000000000
r8 =0000000000000001 r9 =0000000000000000 r10=0000000000000000
r11=00000000FFFFC830 r12=00000000FFFFC7C0 r13=0000000000000080
r14=00000000FFFFC778 r15=00000000FFFFC8E0
rbp=00000000FFFFC720 rsp=00000000FFFFC3D0
program=C:\Users\[edited]\ptsminer-win64-cygwin64-build1\ptsminer\ptsminer.exe, pid 7680, thread unknown (0x1F6C)
cs=0033 ds=002B es=002B fs=0053 gs=002B ss=002B
Stack trace:
Frame        Function    Args
000FFFFC720  00100401339 (00000000000, 00000000000, 00000000000, 00000000000)
End of stack trace

Any idea how to fix this?

Offline girino

  • Full Member
  • ***
  • Posts: 73
    • View Profile
You need to run with "-a gpu" (or any of the variations, gpuv2 to gpuv9). The way you ran it, the miner tries to CPU mine with the AVX instruction set, which your machine does not support.

Offline curiouser

  • Newbie
  • *
  • Posts: 5
    • View Profile
I appreciate the open-source miner, girino!

I needed to do the following, in addition to the instructions, to get a successful compile and run on Ubuntu 12.04:

1.  export BOOST_LIB_PATH=/usr/local/lib/boost
2.  export BOOST_INCLUDE_PATH=/usr/include/boost
3.  export LD_LIBRARY_PATH=/usr/local/lib/boost
4.  make osfinder.sh executable

Note: Boost install locations can differ; adjust accordingly.

Also, it is not necessary to logout/login or reboot as per the AMD SDK instructions.  Just source /etc/profile.

Also, does your code honor the "-m" memory buffer size option?  I suspect not, because it errors out regardless of the specified memory size when I am already running the clpts miner.  My hope was that, on my 2GB R9 270x, I could run both, just as one can run multiple clpts threads on higher-memory GPUs.
« Last Edit: February 18, 2014, 04:39:07 pm by curiouser »
PTS: PgMZYufqBS5MSoRuX1sMTgP7Tv37P4owRx

Offline barwizi

  • Hero Member
  • *****
  • Posts: 764
  • Noirbits, NoirShares, NoirEx.....lol, noir anyone?
    • View Profile
    • Noirbitstalk.org
can this be used to mine to a client?
--Bar--  PiNEJGUv4AZVZkLuF6hV4xwbYTRp5etWWJ

The magical land of crypto, no freebies people.

Offline girino

  • Full Member
  • ***
  • Posts: 73
    • View Profile
For algorithms v4, v6, v7 and v8, 512 MB is added on top of the -m value. For algorithms v3 and v9, the value selected by -m is exact. (-m determines the size of the hash table to be used. Algorithms v4, v6, v7 and v8 also precalculate all the hashes in a batch, which uses an extra 512 MB. v3 and v9 calculate hashes on the fly, so they need no extra memory.)
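The memory rule described above can be written down as a small Python helper, purely as a sketch. The function name is hypothetical, the hash table size is taken directly in bytes (how a "-m" value maps to bytes is not stated here, so that conversion is left out), and "512 MB" is assumed to mean 512 * 2^20 bytes.

```python
# Extra buffer for modes that precalculate all hashes in a batch
# (assumed to be 512 * 2**20 bytes).
PRECOMPUTED_HASH_BYTES = 512 * 1024 * 1024

BATCH_MODES = {"gpuv4", "gpuv6", "gpuv7", "gpuv8"}  # table + extra hash buffer
ON_THE_FLY_MODES = {"gpuv3", "gpuv9"}               # table only


def estimated_gpu_bytes(mode, table_bytes):
    """Rough GPU memory estimate for a mode, given the hash table size in bytes."""
    if mode in BATCH_MODES:
        return table_bytes + PRECOMPUTED_HASH_BYTES
    if mode in ON_THE_FLY_MODES:
        return table_bytes
    raise ValueError("memory rule not described for mode: " + mode)


one_gib = 1024 ** 3
default_mode_total = estimated_gpu_bytes("gpuv7", one_gib)
low_memory_total = estimated_gpu_bytes("gpuv9", one_gib)
```

Under these assumptions, a 1 GiB table needs about 1.5 GiB in the default "gpuv7" mode and 1 GiB in "gpuv9", which suggests the low memory modes are the ones to try when sharing a 2 GB card with another miner.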

Offline girino

  • Full Member
  • ***
  • Posts: 73
    • View Profile
can this be used to mine to a client?

I'm not sure what this means.