BitShares Forum
Other => Graveyard => BitShares PTS => Topic started by: girino on January 16, 2014, 03:37:55 am
-
I have posted some time ago in the beeeeer.org thread about my OpenCL version of ptsminer, and now I think it is mature enough to be "released". So here it goes. You can find it at the following address:
http://github.com/girino/ptsminer
Binaries are available here: https://www.dropbox.com/sh/n4ta5olqp2g5i9l/xkr0sCTrUu
This is even faster than the old version (it is faster than the CUDA miners on Amazon EC2). The same options are available, plus some new modes: gpuv7 (default), gpuv8 and gpuv9 (low-memory version). gpuv7 is generally faster, but choose the one most appropriate for you.
This is an experimental miner, use at your own risk. Fully open source, no hidden "mining for the author" tricks. So please, donate some PTS if you like it or if it is useful to you. Edit: There is a 1% fee now. (30 seconds every hour or so). You can still remove it in the source code and recompile.
It is preconfigured to mine at beeeeer.org pool, but this can be changed with the -o command line switch.
Since this is a modified CPU miner, default is to mine using CPU. To mine using GPU, use the "-a gpu" switch.
Please read installation instructions, and good luck.
Edit: Added the option "-device" where you can choose which devices will be used. Use a comma separated list, with no spaces, like this:
./ptsminer -u PkyeQNn1yGV5psGeZ4sDu6nz2vWHTujf4h -a gpu -device 0,3,4
The old way (assigning threads sequentially to each device) still works, so if you want to use the first devices or all of them you can still use "-t" option instead.
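For anyone wrapping the miner in a script, here is a minimal sketch of how a "-device" value like the one above could be validated before launching. This is not the miner's own parser: the function name `parse_device_list` and the choice to drop repeated entries are assumptions made for illustration.

```python
def parse_device_list(arg):
    """Parse a '-device' style value such as '0,3,4' into a list of ints."""
    if not arg or " " in arg:
        raise ValueError("expected a comma-separated list with no spaces")
    devices = []
    for part in arg.split(","):
        if not part.isdigit():
            raise ValueError("not a device number: %r" % part)
        index = int(part)
        if index not in devices:  # assumption: repeated entries are ignored
            devices.append(index)
    return devices


print(parse_device_list("0,3,4"))
```

A value with spaces, such as "0, 3", is rejected up front, which matches the "no spaces" rule stated in the post.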
-
What's gpuv2, gpuv3, gpuv4 and gpuv5?
-
hidden options for testing different algorithms.
v4 is faster but uses huge amounts of memory (3 passes: 1 precalculates all the hashes, 2 fills the hash table, and 3 finds collisions).
v3 is slower but uses way less memory (calculates the hash and populates the DB in the same step).
v2 is even slower (does everything in a single step, just like the original algorithm from ptsminer).
v5 is garbage (I was experimenting with binary search instead of a hash table, but sorting is still too slow).
Default is gpuv4, which should work for all high end GPUs.
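To make the three passes of v4 concrete, here is a toy CPU sketch in Python. It only illustrates the pass structure described above; the real OpenCL kernels, the hash layout and the table size are not reproduced, and `birthday`, `find_collisions` and the 16-bit value width are assumptions chosen so the example runs quickly.

```python
import hashlib
import itertools
import struct


def birthday(seed, nonce, bits):
    """Toy 'birthday' value: the top `bits` bits of SHA-512(nonce || seed)."""
    digest = hashlib.sha512(struct.pack("<I", nonce) + seed).digest()
    return int.from_bytes(digest[:8], "big") >> (64 - bits)


def find_collisions(seed, nonce_count, bits):
    # Pass 1: precalculate every hash (this is the memory-hungry part).
    hashes = [birthday(seed, n, bits) for n in range(nonce_count)]
    # Pass 2: fill the table, mapping each value to the nonces that produced it.
    table = {}
    for nonce, value in enumerate(hashes):
        table.setdefault(value, []).append(nonce)
    # Pass 3: every pair of nonces that share a value is a collision.
    pairs = []
    for nonces in table.values():
        pairs.extend(itertools.combinations(nonces, 2))
    return pairs


pairs = find_collisions(b"example block header", 1 << 12, 16)
print(len(pairs), "collisions found")
```

Doing pass 1 and pass 2 in the same loop, without keeping the `hashes` list, would correspond to the lower-memory behaviour described for v3.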
-
What's the memory requirement for v4?
-
Hi. thank you!
can you release any windows binary for test?
cygwin installing taking a lot of time.
-
I can, soon
-
What's the memory requirement for v4?
1.3 GB. Not sure how much of it is allocated in main memory and how much on the graphics card, but my 32-bit Linux machine with an old GeForce couldn't run it.
-
i get segmentation faults... any idea why?
happens with any setting .. using cygwin under win7 64bit with an ATI card (Vram 512 mb) - from what i understand at least the normal mode should work....
or is there a chance to get windows binaries maybe? :)
-
Soon, I am implementing this in my miner
-
your aropencl miner is crashing when the actual work should begin :/
i can encourage you to at least catch all exceptions and print them to the console, so that we can give you at least some kind of feedback and not just the "it crashed" feedback... just an idea
-
i get segmentation faults... any idea why?
happens with any setting .. using cygwin under win7 64bit with an ATI card (Vram 512 mb) - from what i understand at least the normal mode should work....
or is there a chance to get windows binaries maybe? :)
can you post the output of the program and the parameters used to launch it? Did you try with the latest version or some previous one? Are you using cygwin 32 or 64 bits? What about the SDK?
-
It's cygwin64, and I used it with different parameters; all led to the same result. I tried the latest version. The AMD SDK is installed (otherwise I wouldn't be able to build it), and the output is something like "segmentation fault (core dumped)". Maybe catching all exceptions and printing the stack trace to the console could help to debug it, so that you at least know where in the code the exception occurred...
-
That's a very good suggestion and i might implement it in a later version. Or someone else might do it, since the project is open source. Now, for the moment being, I don't have this option and if you really want to make this work, i need you to send me the exact parameters AND the exact output of the program. So please, please, pretty please, pretty pretty please, be so kind as to send me a screenshot or to copy and paste the contents of your terminal.
-
girino, iruu told me that the radeons before r9x didn't allow continuous buffers to be larger than 512 MB. Did you implement a work around for that?
-
@ archit: your miner is also crashing on 2gb VRam... so the problem is not fixed to that... even tho a version that requires less vram would be awesome
@ girino:
it won't help you much more than what I already noted, but here you go:
$ ./ptsminer -u x.x -p x -t 1 -m 28 -a gpu -o http://ypool.net -q 10034
********************************************
*** ptsminer - Pts Pool Miner v0.7 RC2 <experimental>
*** by xolokram/TB - www.beeeeer.org - glhf
***
*** performance improvements by girino
*** if you like, donate:
*** PTS: PkyeQNn1yGV5psGeZ4sDu6nz2vWHTujf4h
*** BTC: 1GiRiNoKznfGbt8bkU1Ley85TgVV7ZTXce
*** thanks to wjchen for SSE4 improvements.
***
*** press CTRL+C to exit
********************************************
using GPU
spawning 1 worker thread(s)
[WORKER0] Hello, World!
[WORKER0] GoGoGo!
Segmentation fault (core dumped)
-
But it WAS much help! The problem is, ypool does not use ptsminer protocol. This miner does not work with ypool, since it is based on ptsminer.
-
i see ..
tho same problem using this:
./ptsminer -u x.x -p x -t 1 -m 28 -o http://ptsmine.beeeeer -q 1337 -a gpu
:)
-
Those are not http protocols either. Try removing the "http://". Also, there's a missing bit on the domain name, should be: "ptsmine.beeeeer.org".
But in this case, instead of a segfault you should get a "host not found" message. Can you please send the full output you get from the following command:
./ptsminer -u YOUR_ADDRESS -p x -t 1 -m 28 -o ptsmine.beeeeer.org -q 1337 -a gpu
thanks.
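Since the "http://" prefix keeps tripping people up, a launcher script could clean the pool address before passing it to -o. A minimal sketch follows; `normalize_pool_host` is a hypothetical helper, not something the miner provides, and it only strips the scheme and a trailing slash. It cannot repair a wrong domain name.

```python
def normalize_pool_host(value):
    """Strip a URL scheme and trailing slash, leaving a bare host name."""
    host = value.strip()
    if "://" in host:
        host = host.split("://", 1)[1]
    return host.rstrip("/")


print(normalize_pool_host("http://ptsmine.beeeeer.org"))
```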
-
girino, iruu told me that the radeons before r9x didn't allow continuous buffers to be larger than 512 MB. Did you implement a work around for that?
Not really, but by using "-a gpuv4 -m 26" you can avoid this problem, since only 2^26 * sizeof(uint32_t) bytes (256 MB) will be allocated.
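The arithmetic behind that figure, as a small sketch. It assumes, as the post suggests, that the buffer holds 2^m entries of 4 bytes each; `table_bytes` is just an illustrative name, and any other buffers the miner allocates are not counted.

```python
def table_bytes(m, entry_size=4):
    """Bytes needed for 2**m entries of entry_size bytes (4 = sizeof(uint32_t))."""
    return (1 << m) * entry_size


for m in (25, 26, 27, 28):
    print("-m", m, "->", table_bytes(m) // (1024 * 1024), "MiB")
```

Each step in -m doubles the size: 128, 256, 512 and 1024 MiB for 25 through 28.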
-
i see ..
tho same problem using this:
./ptsminer -u x.x -p x -t 1 -m 28 -o http://ptsmine.beeeeer -q 1337 -a gpu
:)
comes to mind, what GPU do you have? does it have updated drivers? looks like the miner is not finding any GPU available and thus crashing. I'll see to it that it reports missing GPUs in a friendlier way.
-
Hi, I already installed the OpenCL Headers, but now I get this error:
/usr/bin/ld: cannot find -lOpenCL
collect2: error: ld returned 1 exit status
make: *** [ptsminer] Error 1
I'm using Ubuntu 13.10 64b, with AMD app SDK installed.
Thanks in advance.
-
I do not know where AMD installs libOpenCL.so. If you know where AMD installed the SDK, the libraries should be in some subfolder called "lib/" or "lib64/", or even "lib/x86_64". Anyway, when you find the right folder, just point LD_LIBRARY_PATH to it like this:
export LD_LIBRARY_PATH=/PATH/TO/SDK/LIB/FOLDER:$LD_LIBRARY_PATH
where "/PATH/TO/SDK/LIB/FOLDER" is the folder you just found to contain libOpenCL.so
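If you are not sure which subfolder holds the library, a short script can search the SDK directory for it. This is only a sketch: `find_library_dirs` is a made-up helper, and "/opt/AMD_SDK" is just an example root (one poster in this thread links with -L/opt/AMD_SDK/lib/x86_64), so substitute wherever your SDK actually lives.

```python
import os


def find_library_dirs(root, name="libOpenCL.so"):
    """Return every directory under root holding a file whose name starts with `name`."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if any(f.startswith(name) for f in filenames):
            hits.append(dirpath)
    return sorted(hits)


# Example root only; prints nothing if the directory does not exist.
for directory in find_library_dirs("/opt/AMD_SDK"):
    print(directory)
```

Matching on the prefix also catches versioned names such as libOpenCL.so.1.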
-
[STATS] 2014-Jan-16 19:18:13 | 0.0 c/m | 0.0 sh/m | VL: 0 (0.0%), RJ: 0 (0.0%), ST: 0 (0.0%)
Platform: AMD Accelerated Parallel Processing
Device: Tahiti
Device: Tahiti
Device: AMD Athlon(tm) X2 340 Dual Core Processor
Starting OpenCLMomentum V4
Device: Tahiti
Max work group size: 256
[STATS] 2014-Jan-16 19:18:42 | 0.0 c/m | 0.0 sh/m | VL: 0 (0.0%), RJ: 0 (0.0%), ST: 0 (0.0%)
[STATS] 2014-Jan-16 19:19:42 | 0.0 c/m | 0.0 sh/m | VL: 0 (0.0%), RJ: 0 (0.0%), ST: 0 (0.0%)
OK, after 5 minutes it started. I'll let it run a bit.
I think only one card is working (I only get "collision found ... by 0").
-
Hi
I built it and it works,
but performance is not great: on my 7950 I get only ~715 cpm, while the 1gh binary miner gets ~1000 cpm.
Any ideas how to increase the speed?
My options: -a gpuv4 -m 28
-
280x ~804
But it is now reconnecting from time to time
-
To mine on more than one card you should use more than one thread. In your case, -t 2 will do (-t 3 will also use the CPU, but running the OpenCL code on the CPU is really slow). The first run takes some time because it has to compile the OpenCL code. After that the driver caches everything, and any restart should be fast.
-
You can try the different algorithms, V3 and V2. Maybe one of them is best for your card. Also, if there is more than one card/GPU/OpenCLDevice, you should run more than one thread. (i am implementing code to select which GPUs you want it to run from the command line).
-
1350 cpm for 2x280x (vs abt 2200 on 1gh miner), driver crash after 15 min.
-
Any screenshot of crash so i can try to find out what happened?
-
ok some updates:
on the mining hardware (with 3 graphic chips) using the following command
./ptsminer -u <ptsadress> -t 3 -m 28 -o ptsmine.beeeeer.org -q 1337 -a gpuv4
it starts working but then crashes after some minutes - see screenshot:
(http://i.pictr.com/9iplacagjg.jpg)
using -a gpuv3 it works like a charm - starting at low c/m but then rising up
on my notebook with 512mb vram graphics i get the following output (identical behavior with "-a gpu"):
$ ./ptsminer -u <pts id> -t 1 -m 26 -o ptsmine.beeeeer.org -q 1337 -a gpuv3
********************************************
*** ptsminer - Pts Pool Miner v0.7 RC2 <experimental>
*** by xolokram/TB - www.beeeeer.org - glhf
***
*** performance improvements by girino
*** if you like, donate:
*** PTS: PkyeQNn1yGV5psGeZ4sDu6nz2vWHTujf4h
*** BTC: 1GiRiNoKznfGbt8bkU1Ley85TgVV7ZTXce
*** thanks to wjchen for SSE4 improvements.
***
*** press CTRL+C to exit
********************************************
using GPU
spawning 1 worker thread(s)
[WORKER0] Hello, World!
[WORKER0] GoGoGo!
connecting to 54.201.26.128:1337
[MASTER] work received - sharetarget: 03ffffffffffffffffffffffffffffffffffffffffffffffffffffffbeefde4d
[STATS] 2014-Jan-16 23:41:43 | VL: 0 (0.0%), RJ: 0 (0.0%), ST: 0 (0.0%)
Platform: AMD Accelerated Parallel Processing
Device: ATI RV710
Device: Intel(R) Core(TM)2 Duo CPU U9400 @ 1.40GHz
Starting OpenCLMomentum V3
Device: ATI RV710
Max work group size: 128
ERROR: -48, CL_INVALID_KERNEL, if kernel is not a valid kernel object.
assertion "_MY_ERR_X == CL_SUCCESS" failed: file "OpenCLObjects.cpp", line 456, function: size_t OpenCLKernel::getWorkGroupSize(OpenCLDevice*)
Aborted
-
correction: after half an hour it also crashed with "gpuv3" mode with the following output:
[WORKER] collision found: 29176733 <-> 34093243 #116120 @ 1389914984 by 2
[MASTER] submitted share -> SHARE
[STATS] 2014-Jan-16 15:12:27 | 2152.4 c/m | 33.2 sh/m | VL: 1788 (99.8%), RJ: 4 (0.2%), ST: 0 (0.0%)
ERROR: -4, CL_MEM_OBJECT_ALLOCATION_FAILURE, if there is a failure to allocate memory for buffer object.
assertion "_MY_ERR_X == CL_SUCCESS" failed: file "OpenCLObjects.cpp", line 427, function: _cl_event* OpenCLCommandQueue::enqueueKernel1D(OpenCLKernel*, size_t, size_t, _cl_event**, size_t)
Aborted
-
looks like there's some memory leak. I'll look into it and fix ASAP.
-
Awesome work!
-
looks like doing clReleaseEvent is not enough to free the memory allocated by cl_events on some platforms. I just refactored the code to eliminate the need for events and should release a new version soon. (new version will also support selecting the devices in which to run).
Those in a hurry to test the changes can download the branch "opencl-code-cleanup" by doing: git clone -b opencl-code-cleanup https://github.com/girino/ptsminer
edit: sorry for bad english, it's late at night here, and i'm almost sleeping over my keyboard.
-
update to the new version. There was a memory leak that should be solved in this new version.
-
1350 cpm for 2x280x (vs abt 2200 on 1gh miner), driver crash after 15 min.
try the new version. I corrected some memory leaks that should solve the crashing problem. Also there's a slight performance increase.
-
With V3 it is a little better, close to 800 cpm. No, I have only one card.
Your work is great, but the difference is still too big.
I will also look into the code in depth; if I find any ideas I will let you know.
-
Please edit it so that we can select platforms also
-
What do you mean by choosing platforms? You can choose all devices from a single platform by listing them. Suppose that you have 2 platforms, one with 3 devices, the other with 4, and you want all devices from the second, just use: -device 3,4,5,6. Or is there something I am missing?
-
I feel like you are missing something.
OpenCL devices have a platform id and a device id too. In your code, you simply assume the platform to be 0, but often that isn't the case.
-
Actually, i have already changed that. I list all devices from all platforms and put them into a big list. When you choose device 3 you are actually choosing device 0 from platform 1, so you can just use -device 3 to use it. It was easier to do this than to parse another command line for platform.
-
john the ripper has sha512 code specific for AMD, but i was unable to test it since i have no AMD card. I will create a "gpuv6" that uses it instead of the generic version, and see if we get closer to them. Any other tips and tricks for tuning opencl code on AMD cards are welcome.
-
I still can't understand it.
Let's say I have 2 platforms and I want to select device 0 on platform 1. What would I use?
-
Thanks for adding gpuv6! I will test it as soon as it is added.
-
I still can't understand it.
Let's say I have 2 platforms and I want to select device 0 on platform 1. What would I use?
Just run it the first time, note the numbers that appear in front of the devices you want to use, then use those numbers. Example, with my old desktop:
Platform 00: AMD Accelerated Parallel Processing
Device 00: Intel(R) Core(TM)2 CPU 4300 @ 1.80GHz
Platform 01: NVIDIA CUDA
Device 01: GeForce 8400GS
If I want to use my GeForce 8400GS, I just run:
./ptsminer -u xxxxx -a gpu -device 01
if you know exactly how many devices you have on each platform, no need to run the first time, just do the math: if platform 0 has 3 devices, first device on platform 1 is "3".
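The numbering scheme described here (one big list across all platforms) can be sketched like this. The function `flatten_devices` and the placeholder device names are illustrative assumptions; the miner's actual C++ code is not shown in this thread.

```python
def flatten_devices(platforms):
    """Number devices across all platforms in one running sequence.

    platforms: list of (platform_name, [device_names]).
    Returns tuples of (global_index, platform_index, device_index, device_name).
    """
    flat = []
    for p_index, (_p_name, devices) in enumerate(platforms):
        for d_index, d_name in enumerate(devices):
            flat.append((len(flat), p_index, d_index, d_name))
    return flat


# Two platforms with 3 and 4 devices, as in the example discussed earlier.
platforms = [
    ("platform A", ["a0", "a1", "a2"]),
    ("platform B", ["b0", "b1", "b2", "b3"]),
]
for entry in flatten_devices(platforms):
    print(entry)
```

With this layout, global index 3 maps to device 0 on platform 1, which is why "-device 3" selects it.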
-
Got it finally
-
I just added a new option, "-list-devices", that simply lists all devices and quits. That way there is no need to run the miner once just to find the device numbers. Just use "-list-devices" and choose the ones you want.
-
error: illegal implicit conversion between two pointers with different address spaces
It seems that ctx_update is declared as
void ctx_update(sha512_ctx * ctx,
global uint8_t * string, uint32_t len)
but when an int declared in the cl code is passed to this function it gives the error. If I remove the global part it has problem with global char* message parameter
-
Hi. thank you!
can you release any windows binary for test?
cygwin installing taking a lot of time.
It may be that installing mingw by itself:
http://www.mingw.org/
--and using its provided virtual shell and gcc compiler (with appropriate exported path changes) will do the trick. I've only thought of this as I'm halfway through the seemingly eternal cygwin install myself . . .
-
Windows executable under testing :D :o
-
error: illegal implicit conversion between two pointers with different address spaces
It seems that ctx_update is declared as
void ctx_update(sha512_ctx * ctx,
global uint8_t * string, uint32_t len)
but when an int declared in the cl code is passed to this function it gives the error. If I remove the global part it has problem with global char* message parameter
I was doing some changes yesterday. Should be corrected by now.
-
Yeah it is, windows binary with ypool support under going test
-
./ptsminer -u ***** -p ***** -a gpu -d 0 -o mining.ypool.net -q 10034 -m 27 -t 1
*********************************************************
*** GPU PTS miner by girino v0.2.1 Alpha 2 <experimental>
*** based on Pts Pool Miner v0.7 RC2 <experimental>
*** by xolokram/TB - www.beeeeer.org - glhf
***
*** GPU support and performance improvements by girino
*** if you like, donate:
*** PTS: PkyeQNn1yGV5psGeZ4sDu6nz2vWHTujf4h
*** BTC: 1GiRiNoKznfGbt8bkU1Ley85TgVV7ZTXce
*** thanks to wjchen for SSE4 improvements.
***
*** press CTRL+C to exit
*********************************************************
using GPU
Available devices:
Platform 00: AMD Accelerated Parallel Processing
Device 00: Pitcairn
Device 01: Intel(R) Core(TM) i5-3570K CPU @ 3.40GHz
Adjusting num threads to match device list: 1
Initializing GPU...
Initing device 0.
Starting OpenCLMomentum V4
Device 00: Pitcairn
Max work group size: 256
ERROR: -61, CL_INVALID_BUFFER_SIZE, if size is 0.Implementations may return CL_INVALID_BUFFER_SIZE if size is greater than the CL_DEVICE_MAX_MEM_ALLOC_SIZE value specified in the table of allowed values for param_name for clGetDeviceInfo for all devices in context.
ptsminer: OpenCLObjects.cpp:373: OpenCLBuffer* OpenCLContext::createBuffer(size_t, cl_mem_flags, void*): Assertion `_MY_ERR_X == 0' failed.
Aborted (core dumped)
I have tried all of the gpuv options, and none of them work. Looking into it, it would seem my graphics card only has 1GB of VRAM, although I do have 16GB of system RAM. For some reason I thought it was 3 :S
-
You might try -a gpuv3 -m 26. This works for me on my old GeForce board. It seems that some cards do not allow allocating a contiguous block of more than 512 MB. gpuv3 does not use any extra memory besides the hash table, so using it with -m 26 should guarantee you are below 512 MB.
If 26 does not work, try 25. I do not recommend less than that, since it will become very inefficient.
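The same reasoning can be turned into a small helper that picks the largest -m value whose table stays strictly below a given allocation limit. This is a sketch under assumptions: 4 bytes per entry, a lower bound of 25 as recommended above, and a limit you would read from CL_DEVICE_MAX_MEM_ALLOC_SIZE (the value named in the error message). `largest_m_below` is a hypothetical name, not a miner option.

```python
def largest_m_below(limit_bytes, entry_size=4, lowest=25, highest=28):
    """Largest m in [lowest, highest] with 2**m * entry_size strictly below the limit."""
    for m in range(highest, lowest - 1, -1):
        if (1 << m) * entry_size < limit_bytes:
            return m
    return None  # nothing fits without dropping below the recommended minimum


MIB = 1024 * 1024
print(largest_m_below(512 * MIB))  # a card limited to 512 MiB per buffer
```

The strict comparison is deliberately conservative: it keeps the table under the limit rather than exactly at it.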
-
This might do, but it would need an entirely new makefile. mingw and cygwin have very different paths and libs... It's always a pain to port from one to the other. (I use cygwin because it's closer to linux, and since my time with the windows machine is limited, it is faster to port)
-
That's the mining working with 25 now thanks :)
However, it doesn't seem able to connect to any pools...
-
Only beeer works
-
Won't connect to there either
./ptsminer -u * -a gpuv3 -d 0 -m 25 -t 1
*********************************************************
*** GPU PTS miner by girino v0.2.1 Alpha 2 <experimental>
*** based on Pts Pool Miner v0.7 RC2 <experimental>
*** by xolokram/TB - www.beeeeer.org - glhf
***
*** GPU support and performance improvements by girino
*** if you like, donate:
*** PTS: PkyeQNn1yGV5psGeZ4sDu6nz2vWHTujf4h
*** BTC: 1GiRiNoKznfGbt8bkU1Ley85TgVV7ZTXce
*** thanks to wjchen for SSE4 improvements.
***
*** press CTRL+C to exit
*********************************************************
using GPU
Available devices:
Platform 00: AMD Accelerated Parallel Processing
Device 00: Pitcairn
Device 01: Intel(R) Core(TM) i5-3570K CPU @ 3.40GHz
Adjusting num threads to match device list: 1
Initializing GPU...
Initing device 0.
Starting OpenCLMomentum V3
Device 00: Pitcairn
Max work group size: 256
Device 0 Inited.
All GPUs Initialized...
spawning 1 worker thread(s)
[WORKER0] Hello, World!
[WORKER0] GoGoGo!
connecting to 54.201.26.128:1337
no connection to the server, reconnecting in 10 seconds
-
Does any other miner connect? Are you behind a firewall or something?
(BTW, it works with other pools that support the same protocol as beeeeer.org. I've tried http://ptspool.com/ and http://54.238.185.113/ both work great. not sure if 1GH supports this protocol.) Does this protocol have a name at all?
-
I've a CPU miner running that connects to both just fine, and my firewall allows outgoing connections
-
Comes to mind, this error happens when the server refuses the username/password. Are you sure you typed in your pts address correctly?
-
Tried it a few times with both ypool username and password and my PTS address for beeeeer. Same
-
It does not work on ypool. for ypool try my other miner: https://bitsharestalk.org/index.php?topic=2460.0
About beeeeer.org, please can you post or private message me the exact command line parameters you are using?
-
John the Ripper has sha512 code specific to AMD, but I was unable to test it since I have no AMD card. I will create a "gpuv6" that uses it instead of the generic version and see if we get closer to them. Any other tips and tricks for tuning OpenCL code on AMD cards are welcome.
Thanks for adding gpuv6! I will test it as soon as it is added.
Just got it ready. 80% improvement on my low-end AMD at work (from 28 to 45 cpm). The option is -a gpuv4amd, since it does not work on NVIDIA. Please test it and tell me if it improves performance for you.
-
I got an error building the latest version. It is a linking error (the linker messages were originally in Russian; I don't know how to change the message language on Cygwin):
g++ -o ptsminer obj/cpuid.o obj/sha512_avx.o obj/sha512_sse4.o obj/sha512.o obj/sph_sha2.o obj/sph_sha2big.o obj/CProtoshareProcessor.o obj/AbstractMomentum.o obj/OpenCLMomentum2.o obj/OpenCLMomentumV3.o obj/OpenCLMomentumV4.o obj/OpenCLMomentumV5.o obj/OpenCLMomentumV4_AMD.o obj/OpenCLObjects.o obj/sha_utils.o obj/fileutils.o obj/sha2.o obj/main_poolminer.o -Wl,-Bdynamic -l boost_system-mt -l boost_filesystem-mt -l boost_program_options-mt -l boost_thread-mt -l boost_chrono-mt -Wl,-Bdynamic -l z -l dl -l pthread -L/opt/AMD_SDK/lib/x86_64 -lOpenCL
obj/sha512.o:sha512.c:(.rdata$.refptr.sha512_transform_single_rorx[.refptr.sha512_transform_single_rorx]+0x0): undefined reference to `sha512_transform_single_rorx'
obj/sha512.o:sha512.c:(.rdata$.refptr.sha512_transform_rorx[.refptr.sha512_transform_rorx]+0x0): undefined reference to `sha512_transform_rorx'
/usr/lib/gcc/x86_64-pc-cygwin/4.8.2/../../../../x86_64-pc-cygwin/bin/ld: obj/sha512.o: bad reloc address 0x0 in section `.rdata$.refptr.sha512_transform_rorx[.refptr.sha512_transform_rorx]'
collect2: error: ld returned 1 exit status
makefile.cygwin:158: recipe for target 'ptsminer' failed
make: *** [ptsminer] Error 1
-
Sorry. This was left over from the CPU version. I commented out the code; please update and it should compile.
-
Thanks! It builds now.
-
Unfortunately, it is not working.
./ptsminer.exe -u x -m 28 -a gpuv4amd (I have 3+ GB of GPU memory)
Starting OpenCLMomentum V4 AMD Optimized
Device 00: Tahiti
Max work group size: 256
ERROR: -61, CL_INVALID_BUFFER_SIZE, if size is 0.Implementations may return CL_INVALID_BUFFER_SIZE if size is greater than the CL_DEVICE_MAX_MEM_ALLOC_SIZE value specified in the table of allowed values for param_name for clGetDeviceInfo for all devices in context.
assertion "_MY_ERR_X == CL_SUCCESS" failed: file "OpenCLObjects.cpp", line 379, function: OpenCLBuffer* OpenCLContext::createBuffer(size_t, cl_mem_flags, void*)
Aborted (core dumped)
With ./ptsminer.exe -u x -m 27 -a gpuv4amd
it shows around 17k cpm:
Warning: found more candidate collisions than storage space available
[WORKER] collision found: 17088 <-> 16407 #28 @ 1390279937 by 0
[WORKER] collision found: 36544 <-> 35863 #82 @ 1390279937 by 0
[WORKER] collision found: 56000 <-> 55319 #96 @ 1390279937 by 0
[WORKER] collision found: 57399 <-> 58352 #126 @ 1390279937 by 0
[WORKER] collision found: 91992 <-> 91175 #142 @ 1390279937 by 0
[WORKER] collision found: 123888 <-> 122935 #154 @ 1390279937 by 0
[WORKER] collision found: 99160 <-> 98343 #164 @ 1390279937 by 0
[WORKER] collision found: 79552 <-> 78871 #196 @ 1390279937 by 0
[MASTER] submitted share -> REJECTED
[STATS] 2014-Jan-21 11:52:16 | 17280.0 c/m | 480.0 sh/m | VL: 0 (0.0%), RJ: 10 (100.0%), ST: 0 (0.0%)
[MASTER] submitted share -> REJECTED
[STATS] 2014-Jan-21 11:52:16 | 17280.0 c/m | 480.0 sh/m | VL: 0 (0.0%), RJ: 11 (100.0%), ST: 0 (0.0%)
[MASTER] submitted share -> REJECTED
too many rejects (3) in a row, forcing reconnect.
[STATS] 2014-Jan-21 11:52:16 | 17280.0 c/m | 480.0 sh/m | VL: 0 (0.0%), RJ: 12 (100.0%), ST: 0 (0.0%)
no connection to the server, reconnecting in 10 seconds
Warning: found more candidate collisions than storage space available
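For anyone collecting logs like the one above, the [STATS] lines can be checked mechanically. This is only a minimal Python sketch: the regular expression is inferred from the log lines quoted in this thread, and the names `parse_stats` and `all_rejected` are made up for illustration, not part of ptsminer.

```python
import re

# Layout assumed from the [STATS] lines quoted in this thread.
# The c/m and sh/m fields are optional because some lines omit them.
STATS_RE = re.compile(
    r"\[STATS\] (?P<when>\S+ \S+) \|"
    r"(?: (?P<cpm>[\d.]+) c/m \| (?P<shpm>[\d.]+) sh/m \|)?"
    r" VL: (?P<valid>\d+) \([\d.]+%\),"
    r" RJ: (?P<rejected>\d+) \([\d.]+%\),"
    r" ST: (?P<stale>\d+) \([\d.]+%\)"
)


def parse_stats(line):
    """Return the counters of one [STATS] line as a dict, or None if no match."""
    m = STATS_RE.search(line)
    if m is None:
        return None
    d = m.groupdict()
    return {
        "when": d["when"],
        "cpm": float(d["cpm"]) if d["cpm"] else None,
        "valid": int(d["valid"]),
        "rejected": int(d["rejected"]),
        "stale": int(d["stale"]),
    }


def all_rejected(stats):
    """True when at least one share was submitted and every one was rejected."""
    total = stats["valid"] + stats["rejected"] + stats["stale"]
    return total > 0 and stats["rejected"] == total
```

A high c/m figure together with `all_rejected` returning True is the pattern seen in this post: the miner reports many collisions, but the pool accepts none of them.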
-
Yeah, that's the problem with not having an AMD GPU at home to test :( It is probably some memory sync problem, but it will be impossible to debug without a GPU of my own :( The 17K cpm is misleading: all the hashes are probably 0 because the sha512 is not working...
Thanks anyway. I will find a way to debug it...
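To illustrate why a broken hash inflates the collision count: if every hash comes back as zero, every pair of nonces looks like a collision. The sketch below is plain Python, not the miner's OpenCL code; the 4-byte truncation and the function names are assumptions chosen only to keep the demonstration small.

```python
import hashlib
from collections import defaultdict


def count_collisions(nonces, hash_fn):
    """Count pairs of nonces whose hash values are equal."""
    buckets = defaultdict(int)
    for n in nonces:
        buckets[hash_fn(n)] += 1
    # k equal values form k * (k - 1) / 2 colliding pairs.
    return sum(k * (k - 1) // 2 for k in buckets.values())


def good_hash(n):
    # Reference SHA-512 from the standard library, truncated to 4 bytes
    # (the truncation length is an assumption for this demo).
    return hashlib.sha512(n.to_bytes(4, "little")).digest()[:4]


def broken_hash(n):
    # Stand-in for a kernel that returns all zeros.
    return bytes(4)


if __name__ == "__main__":
    nonces = range(1000)
    print("working hash:", count_collisions(nonces, good_hash))
    print("broken hash: ", count_collisions(nonces, broken_hash))
```

With the broken hash, 1000 nonces give 499500 "collisions", none of which would survive revalidation, which matches a high c/m together with 100% rejects.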
-
I understand.
You are doing great work. If you need something tested, add debug info and I will send you the output.
-
Tried it a few times with both ypool username and password and my PTS address for beeeeer. Same
It does not work on ypool. for ypool try my other miner: https://bitsharestalk.org/index.php?topic=2460.0
About beeeeer.org, please can you post or private message me the exact command line parameters you are using?
I think I should pay more attention to what I'm typing in in the future. It is working now thanks :) Now I've just got to get a decent graphics card.....
-
Hi. Another Cygwin make error:
g++ -o ptsminer obj/cpuid.o obj/sha512_avx.o obj/sha512_sse4.o obj/sha512.o obj/sph_sha2.o obj/sph_sha2big.o obj/CProtoshareProcessor.o obj/AbstractMomentum.o obj/OpenCLMomentum2.o obj/OpenCLMomentumV3.o obj/OpenCLMomentumV4.o obj/OpenCLMomentumV5.o obj/OpenCLMomentumV4_AMD.o obj/OpenCLMomentumV3_AMD.o obj/OpenCLObjects.o obj/sha_utils.o obj/fileutils.o obj/sha2.o obj/main_poolminer.o -Wl,-Bdynamic -Wl,-Bdynamic -l z -l dl -l pthread -L/opt/AMD_SDK/lib/x86_64 -lOpenCL
obj/CProtoshareProcessor.o:CProtoshareProcessor.cpp:(.text$_ZN5boost4asio6detail20posix_tss_ptr_createERP15__pthread_key_t[_ZN5boost4asio6detail20posix_tss_ptr_createERP15__pthread_key_t]+0xf): undefined reference to `boost::system::system_category()'
obj/CProtoshareProcessor.o:CProtoshareProcessor.cpp:(.text$_ZN5boost4asio6detail20posix_tss_ptr_createERP15__pthread_key_t[_ZN5boost4asio6detail20posix_tss_ptr_createERP15__pthread_key_t]+0xf): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `boost::system::system_category()'
/usr/lib/gcc/x86_64-pc-cygwin/4.8.2/../../../../x86_64-pc-cygwin/bin/ld: obj/CProtoshareProcessor.o: bad reloc address 0x28 in section `.text$_ZN5boost4asio6detail20posix_tss_ptr_createERP15__pthread_key_t[_ZN5boost4asio6detail20posix_tss_ptr_createERP15__pthread_key_t]'
collect2: error: ld returned 1 exit status
makefile.cygwin:159: recipe for target 'ptsminer' failed
make: *** [ptsminer] Error 1
-
Listen, which IDE do you use for debugging?
Is it possible to use something with Cygwin? I want to do some debugging too, but right now I don't have Linux (I have good experience with C++ and Linux but none with OpenCL, and I'm very interested).
-
I use Eclipse for editing code, but no debugging IDE. Just a bunch of "#ifdef" and "printf" all over the code. I am new to OpenCL too. As of today, I have exactly 3 weeks of experience with it ;)
-
OK, I understand.
I get this error with most options:
ERROR: -61, CL_INVALID_BUFFER_SIZE, if size is 0.Implementations may return CL_INVALID_BUFFER_SIZE if size is greater than the CL_DEVICE_MAX_MEM_ALLOC_SIZE value specified in the table of allowed values for param_name for clGetDeviceInfo for all devices in context.
assertion "_MY_ERR_X == CL_SUCCESS" failed: file "OpenCLObjects.cpp", line 379, function: OpenCLBuffer* OpenCLContext::createBuffer(size_t, cl_mem_flags, void*)
Aborted (core dumped)
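The error text itself points at CL_DEVICE_MAX_MEM_ALLOC_SIZE: a single buffer cannot exceed that limit even when the card has more memory in total. A rough Python sketch of the arithmetic follows. It assumes that -m is the number of bits of the hash table and that each slot takes 4 bytes; neither is confirmed in this thread, and the 768 MiB limit below is a made-up example value (the real one is whatever clGetDeviceInfo reports for the device).

```python
def table_bytes(bits, bytes_per_slot=4):
    """Size of a table with 2**bits slots (slot size is an assumption)."""
    return (1 << bits) * bytes_per_slot


def largest_fitting_bits(max_alloc_bytes, bytes_per_slot=4, max_bits=32):
    """Largest bit count whose table fits in one allocation, or None."""
    best = None
    for bits in range(1, max_bits + 1):
        if table_bytes(bits, bytes_per_slot) <= max_alloc_bytes:
            best = bits
    return best


if __name__ == "__main__":
    MIB = 1024 ** 2
    hypothetical_max_alloc = 768 * MIB  # example value only
    print(table_bytes(28) // MIB, "MiB needed for 28 bits")
    print("largest fitting value:", largest_fitting_bits(hypothetical_max_alloc))
```

Under these assumptions a 28-bit table needs one 1024 MiB buffer, which would be refused on a device with a smaller per-allocation limit, while 27 bits (512 MiB) would fit.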
-
Using some optimizations by DGA I could get another 10% improvement in performance. Still not the fastest around, but getting closer ;)
Please use the link on the first page to test!
-
Hello all,
I just released binaries for this miner (see first post). Those binaries use new sha512 code, tuned and optimized by me during the last few days. It is at least 15% faster than the old code. Binaries have a 1% fee added.
Short Version: faster! gpuv7 is default, gpuv9 is the low memory mode.
Download link: https://www.dropbox.com/sh/n4ta5olqp2g5i9l/xkr0sCTrUu
Long version: The new "-a" options "gpuv7", "gpuv8" and "gpuv9" all use the new code. "gpuv9" is the low-memory version: the same as "gpuv3", but using the new sha512 code. "gpuv8" is the same as "gpuv4" with the new code, and "gpuv7" is based on "gpuv6" but uses a hashtable with linear collision avoidance instead of the plain hashtable used before. It is bound to find slightly more collisions at the cost of being a little bit slower. Since it produces better CPM in all my tests, I made "gpuv7" the default mode. The old modes are kept for compatibility (if my optimizations turn out not to be compatible with your GPU, you can always fall back to the old modes).
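The difference between a plain table and one with linear collision avoidance can be sketched in a few lines of Python. This is not the OpenCL kernel from the miner, only an illustration under my own assumptions: the function name, the `max_probes` parameter and the eviction rule are hypothetical.

```python
def find_collisions(items, table_size, max_probes):
    """Find pairs of nonces that share the same birthday value.

    items is an iterable of (nonce, birthday) pairs.
    max_probes=1 behaves like a plain table: a slot already held by a
    different birthday is overwritten, so the older entry is lost.
    max_probes>1 tries the following slots first (linear probing).
    """
    table = [None] * table_size
    found = []
    for nonce, birthday in items:
        home = birthday % table_size
        for step in range(max_probes):
            slot = (home + step) % table_size
            entry = table[slot]
            if entry is None:
                table[slot] = (birthday, nonce)
                break
            if entry[0] == birthday:
                found.append((entry[1], nonce))
                break
        else:
            # Every probed slot holds another birthday: evict the home slot.
            table[home] = (birthday, nonce)
    return found


if __name__ == "__main__":
    # Birthdays 1 and 5 land in the same slot of a 4-slot table.
    items = [(0, 1), (1, 5), (2, 1)]
    print("plain table:   ", find_collisions(items, 4, max_probes=1))
    print("linear probing:", find_collisions(items, 4, max_probes=2))
```

In the plain variant the entry for nonce 0 is overwritten by nonce 1, so the real match between nonces 0 and 2 is missed. With probing, nonce 1 moves to the next slot and the match is found, at the price of extra memory reads per lookup.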
Please download and enjoy.
Thanks,
girino.
-
Thanks Girino,
Good PTS miner. I've tried it today and can say it is quite quick compared to the crz one (cudaPTSwin). I am getting 50-60 c/m more with this one.
I really appreciate making the source available! Thank you.
-
Awesome piece of software
AMD R9 280x ~1040 c/m
built on OpenSuse 13.1
Thank You.
-
Compilation on Ubuntu 13.10 stops here
obj/main_poolminer.o:main_poolminer.cpp:(.text._ZN5boost4asio6detail16resolver_serviceINS0_2ip3tcpEE7resolveERNS_10shared_ptrIvEERKNS3_20basic_resolver_queryIS4_EERNS_6system10error_codeE[_ZN5boost4asio6detail16resolver_serviceINS0_2ip3tcpEE7resolveERNS_10shared_ptrIvEERKNS3_20basic_resolver_queryIS4_EERNS_6system10error_codeE]+0x281): more undefined references to `boost::system::system_category()' follow
obj/main_poolminer.o: In function `CMasterThread::run()':
main_poolminer.cpp:(.text._ZN13CMasterThread3runEv[_ZN13CMasterThread3runEv]+0x127): undefined reference to `boost::thread::start_thread_noexcept()'
main_poolminer.cpp:(.text._ZN13CMasterThread3runEv[_ZN13CMasterThread3runEv]+0x1dd): undefined reference to `vtable for boost::detail::thread_data_base'
main_poolminer.cpp:(.text._ZN13CMasterThread3runEv[_ZN13CMasterThread3runEv]+0x529): undefined reference to `boost::system::system_category()'
main_poolminer.cpp:(.text._ZN13CMasterThread3runEv[_ZN13CMasterThread3runEv]+0x697): undefined reference to `boost::system::system_category()'
main_poolminer.cpp:(.text._ZN13CMasterThread3runEv[_ZN13CMasterThread3runEv]+0x7a9): undefined reference to `boost::system::system_category()'
main_poolminer.cpp:(.text._ZN13CMasterThread3runEv[_ZN13CMasterThread3runEv]+0x7c0): undefined reference to `boost::system::system_category()'
main_poolminer.cpp:(.text._ZN13CMasterThread3runEv[_ZN13CMasterThread3runEv]+0x877): undefined reference to `boost::system::system_category()'
obj/main_poolminer.o:main_poolminer.cpp:(.text._ZN13CMasterThread3runEv[_ZN13CMasterThread3runEv]+0x896): more undefined references to `boost::system::system_category()' follow
obj/main_poolminer.o: In function `_GLOBAL__sub_I_collision_table_bits':
main_poolminer.cpp:(.text.startup+0xf63): undefined reference to `boost::system::generic_category()'
main_poolminer.cpp:(.text.startup+0xf6f): undefined reference to `boost::system::generic_category()'
main_poolminer.cpp:(.text.startup+0xf7b): undefined reference to `boost::system::system_category()'
main_poolminer.cpp:(.text.startup+0xfa1): undefined reference to `boost::system::system_category()'
obj/main_poolminer.o:(.rodata._ZTIN5boost6detail11thread_dataINS_3_bi6bind_tIvNS_4_mfi3mf0Iv13CWorkerThreadEENS2_5list1INS2_5valueIPS6_EEEEEEEE[_ZTIN5boost6detail11thread_dataINS_3_bi6bind_tIvNS_4_mfi3mf0Iv13CWorkerThreadEENS2_5list1INS2_5valueIPS6_EEEEEEEE]+0x10): undefined reference to `typeinfo for boost::detail::thread_data_base'
obj/CProtoshareProcessor.o: In function `_GLOBAL__sub_I__Z31protoshares_revalidateCollisionP13blockHeader_tPhjjmP14CBlockProviderPFvS1_jS1_Ej':
CProtoshareProcessor.cpp:(.text.startup+0x33): undefined reference to `boost::system::generic_category()'
CProtoshareProcessor.cpp:(.text.startup+0x3f): undefined reference to `boost::system::generic_category()'
CProtoshareProcessor.cpp:(.text.startup+0x4b): undefined reference to `boost::system::system_category()'
CProtoshareProcessor.cpp:(.text.startup+0x71): undefined reference to `boost::system::system_category()'
collect2: error: ld returned 1 exit status
make: *** [ptsminer] Error 1
What is wrong?
-
You need to install Boost. I'm not sure what the package name is on Ubuntu 13.10. Try this:
sudo apt-get install libboost-all-dev
-
Huge thanks for leaving yours open source. It's educational to see what you did for optimizing it for OpenCL - I haven't quite wrapped my head around when it's best to use the int4/int8 types, and I appreciate you providing an example of how the CUDA version translates into optimized CL.
As an appreciation:
An optimization that I know the CUDA compiler gets through automated loop unrolling, but that I'm not sure the OpenCL compiler gets, is that in your step0to15 function, you can eliminate the + w[ i ] for values of i between 5 and 14. It might be worth testing a variant with two versions of step0to15, one "normal" and one for the known-zero-values in w.
-Dave
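To see where the known-zero words come from, one can pad a short message the way SHA-512 does and look at the 16 message words used by the first 16 rounds. The Python sketch below assumes a 36-byte input (for example a 4-byte nonce followed by a 32-byte hash); that length is my assumption, chosen because it reproduces the range 5 to 14 mentioned above, and the function name is made up.

```python
import struct


def sha512_first_block_words(message):
    """Return the 16 big-endian 64-bit words of the padded SHA-512 block.

    Only handles messages short enough to fit in a single 128-byte block.
    """
    if len(message) >= 112:
        raise ValueError("sketch handles single-block messages only")
    padded = message + b"\x80"                        # mandatory 1 bit
    padded += b"\x00" * (112 - len(padded))           # zero fill
    padded += (8 * len(message)).to_bytes(16, "big")  # 128-bit bit length
    return list(struct.unpack(">16Q", padded))


if __name__ == "__main__":
    words = sha512_first_block_words(bytes(range(1, 37)))  # 36 bytes
    for i, w in enumerate(words):
        print(f"w[{i:2d}] = {w:016x}")
```

For a 36-byte input, w[0] to w[4] carry the message and the padding marker, w[15] carries the length in bits (288), and w[5] to w[14] are all zero, so adding them in those rounds changes nothing.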
-
I am really quite new to OpenCL. I am used to optimizing C and C++ code, so I just tried to apply the same techniques. I still have a hard time thinking in parallel ;) As far as I have read in articles and specs, CUDA does a much better job than OpenCL at automatic vectorization and loop unrolling. It is usually recommended to leave those to the compiler in CUDA, while in OpenCL people tend to do them manually. In all my tests, only "long2" really improves speed (there are 128 registers in most GPUs), but I used long8 anyway because it really makes the code more readable ;)
I was planning to inspect the intermediate values of the vectors so I could optimize a little more, but my son had surgery (a tonsillectomy, nothing serious) and I am spending all my free time with him instead of coding ;)
Possibly next week I can continue working on this.
-
I'm not getting any errors upon execution in Windows7, other than this:
C:\[edited]\ptsminer-win64-cygwin64-build1\ptsminer\ptsminer.exe -u [edited] -device 1 -m 26
*********************************************************
*** GPU PTS miner by girino v0.2.1 Alpha 2 <experimental>
*** based on Pts Pool Miner v0.7 RC2 <experimental>
*** by xolokram/TB - www.beeeeer.org - glhf
***
*** GPU support and performance improvements by girino
*** if you like, donate:
*** PTS: PkyeQNn1yGV5psGeZ4sDu6nz2vWHTujf4h
*** BTC: 1GiRiNoKznfGbt8bkU1Ley85TgVV7ZTXce
*** thanks to wjchen for SSE4 improvements.
***
*** press CTRL+C to exit
*********************************************************
**SSE4/AVX auto-detection
using AVX
spawning 1 worker thread(s)
[WORKER0] Hello, World!
[WORKER0] GoGoGo!
connecting to 54.201.26.128:1337
Mining for approx 30 seconds to support further development
Payments to: PkyeQNn1yGV5psGeZ4sDu6nz2vWHTujf4h
[MASTER] work received - sharetarget: 03ffffffffffffffffffffffffffffffffffffffffffffffffffffffbeefde4d
[STATS] 2014-Feb-04 06:13:22 | VL: 0 (0.0%), RJ: 0 (0.0%), ST: 0 (0.0%)
0 [unknown (0x2A3C)] ptsminer 9552 open_stackdumpfile: Dumping stack trace to ptsminer.exe.stackdump
The stackdumpfile contains this:
Exception: STATUS_ACCESS_VIOLATION at rip=00100401339
rax=0000000000000000 rbx=00000000FFFFC760 rcx=00000000FFFFC7C0
rdx=00000000FFFFC778 rsi=0000000000000010 rdi=0000000000000000
r8 =0000000000000001 r9 =0000000000000000 r10=0000000000000000
r11=00000000FFFFC830 r12=00000000FFFFC7C0 r13=0000000000000080
r14=00000000FFFFC778 r15=00000000FFFFC8E0
rbp=00000000FFFFC720 rsp=00000000FFFFC3D0
program=C:\Users\[edited]\ptsminer-win64-cygwin64-build1\ptsminer\ptsminer.exe, pid 7680, thread unknown (0x1F6C)
cs=0033 ds=002B es=002B fs=0053 gs=002B ss=002B
Stack trace:
Frame Function Args
000FFFFC720 00100401339 (00000000000, 00000000000, 00000000000, 00000000000)
End of stack trace
Any idea how to fix this?
-
You need to run with "-a gpu" (or any of the variations, gpuv2 to gpuv9). The way you ran it, the miner tries to CPU-mine using the AVX instruction set, which your machine does not support.
-
I appreciate the open-source miner, girino!
I needed to do the following, in addition to the instructions, to get a successful compile and run on Ubuntu 12.04:
1. export BOOST_LIB_PATH=/usr/local/lib/boost
2. export BOOST_INCLUDE_PATH=/usr/include/boost
3. export LD_LIBRARY_PATH=/usr/local/lib/boost
4. make osfinder.sh executable
Note, boost install locations can differ, adjust accordingly.
Also, it is not necessary to logout/login or reboot as per the AMD SDK instructions. Just source /etc/profile.
And, does your code honor the "-m" memory buffer size options? I suspect not, because it errors out regardless of specified memory size when I am already running the clpts miner. My hope was that, on my 2GB R9 270x, I could run both, just as how one can run multiple clpts threads on higher-memory GPUs.
-
can this be used to mine to a client?
-
I appreciate the open-source miner, girino!
I needed to do the following, in addition to the instructions, to get a successful compile and run on Ubuntu 12.04:
1. export BOOST_LIB_PATH=/usr/local/lib/boost
2. export BOOST_INCLUDE_PATH=/usr/include/boost
3. export LD_LIBRARY_PATH=/usr/local/lib/boost
4. make osfinder.sh executable
Note, boost install locations can differ, adjust accordingly.
Also, it is not necessary to logout/login or reboot as per the AMD SDK instructions. Just source /etc/profile.
And, does your code honor the "-m" memory buffer size options? I suspect not, because it errors out regardless of specified memory size when I am already running the clpts miner. My hope was that, on my 2GB R9 270x, I could run both, just as how one can run multiple clpts threads on higher-memory GPUs.
For algorithms v4, v6, v7 and v8, 512 MB is added on top of the -m value. For v3 and v9, the value selected by -m is exact. (-m determines the size of the hash table to be used. Algorithms v4, v6, v7 and v8 also pre-calculate all the hashes in a batch, which uses an extra 512 MB. v3 and v9 calculate hashes on the fly, so they need no extra memory.)
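A rough way to estimate the footprint described above, as a Python sketch. The extra 512 MB for the batch algorithms comes from this post; treating -m as a bit count with 4 bytes per table slot, and reading "512 MB" as 512 MiB, are assumptions that should be checked against the source code.

```python
MIB = 1024 ** 2
EXTRA_BYTES = 512 * MIB  # pre-calculated hashes, per the post above

BATCH_ALGOS = {"gpuv4", "gpuv6", "gpuv7", "gpuv8"}
ON_THE_FLY_ALGOS = {"gpuv3", "gpuv9"}


def estimated_gpu_bytes(algo, m_bits, bytes_per_slot=4):
    """Estimate GPU memory for one miner instance (slot size is assumed)."""
    table = (1 << m_bits) * bytes_per_slot
    if algo in BATCH_ALGOS:
        return table + EXTRA_BYTES
    if algo in ON_THE_FLY_ALGOS:
        return table
    raise ValueError("unknown algorithm: " + algo)


if __name__ == "__main__":
    for algo in ("gpuv7", "gpuv9"):
        print(algo, "-m 26:", estimated_gpu_bytes(algo, 26) // MIB, "MiB")
```

Under these assumptions, running a second miner on a 2 GB card is more likely to fit with gpuv9 and a small -m than with the default gpuv7.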
-
can this be used to mine to a client?
I'm not sure what this means.