Messages - girino

1
I got this error on 6870 1GB. There are some settings in which it uses 1024 MB of memory:

Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\Users\Mile>cd C:\Users\Mile\Desktop\jhProtominer

C:\Users\Mile\Desktop\jhProtominer>proba2

C:\Users\Mile\Desktop\jhProtominer>jhProtominer.exe -o ypool.net -u x.x
 -p x -a gpuv9 -m1024
╔══════════════════════════════════════════════════╗
║  jhProtominer (v0.2a) + OpenCL GPU Support       ║
║  author: girino, based on code by jh             ║
║                                                  ║
║  If you like it, please donate:                  ║
║  PTS: PkyeQNn1yGV5psGeZ4sDu6nz2vWHTujf4h         ║
║  BTC: 1GiRiNoKznfGbt8bkU1Ley85TgVV7ZTXce         ║
║                                                  ║
║  Please note  that on  pools that support  it    ║
║  (currently only ypool.net), there is an 2.5║
║  mining fee to support further development of    ║
║  this software.                                  ║
╚══════════════════════════════════════════════════╝
Launching miner...
Using 1024 megabytes of memory per thread
Using 1 threads
Available devices:
Platform 00: AMD Accelerated Parallel Processing
  Device 00: Barts
  Device 01: AMD Phenom(tm) II X4 925 Processor
Adjusting num threads to match device list: 1
Initializing GPU...
Initing device 0.
Starting OpenCLMomentum V9
Device 00: Barts
Max work group size: 256
ERROR: -61, CL_INVALID_BUFFER_SIZE, if size is 0.Implementations may return CL_I
NVALID_BUFFER_SIZE if size is greater than the CL_DEVICE_MAX_MEM_ALLOC_SIZE valu
e specified in the table of allowed values for param_name for clGetDeviceInfo fo
r all devices in context.
Assertion failed!

Program: C:\Users\Mile\Desktop\jhProtominer\jhProtominer.exe
File: OpenCLObjects.cpp, Line 373

Expression: _MY_ERR_X == CL_SUCCESS

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

C:\Users\Mile\Desktop\jhProtominer>

You won't be able to use the full 1 GB for the miner, since part of the memory is used by the card for other purposes. To see what you can run, check the amount of free memory, not the total memory.

2
Need help running miner on Ubuntu 13.10.

5850 1GB won't run. I get the following error:
ERROR: -61, CL_INVALID_BUFFER_SIZE, if size is 0.Implementations may return CL_INVALID_BUFFER_SIZE if size is greater than the CL_DEVICE_MAX_MEM_ALLOC_SIZE value specified in the table of allowed values for param_name for clGetDeviceInfo for all devices in context.
ptsminer: OpenCLObjects.cpp:373: OpenCLBuffer* OpenCLContext::createBuffer(size_t, cl_mem_flags, void*): Assertion `_MY_ERR_X == 0' failed.
Aborted (core dumped)

Any advice? What should I do?

Try running with "-a gpuv9" and different values for -m (-m128 is sure to work, -m256 probably works, and -m512, if it works, will be the best).
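The CL_INVALID_BUFFER_SIZE message quoted above says a buffer is rejected when its size is 0 or larger than CL_DEVICE_MAX_MEM_ALLOC_SIZE. A minimal sketch of picking an -m value under that rule, assuming the hash table is allocated as a single buffer of -m megabytes and that you have looked up the device limit yourself (for example with a tool such as clinfo). The function name `largest_fitting_m` and the candidate list are hypothetical, for illustration only; they are not part of the miner.

```python
MB = 1024 * 1024

def largest_fitting_m(max_alloc_bytes, candidates=(1024, 512, 256, 128)):
    """Return the largest candidate -m value (in MB) whose buffer would not
    exceed max_alloc_bytes, or None if no candidate fits.

    Assumption: the -m hash table is one buffer of exactly m megabytes.
    """
    for m in sorted(candidates, reverse=True):
        if 0 < m * MB <= max_alloc_bytes:
            return m
    return None

# Example: a device that reports a 512 MB maximum single allocation
print(largest_fitting_m(512 * MB))  # prints 512
```

If the result is None, even the smallest candidate is too large for a single allocation on that device.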

3
can this be used to mine to a client?

I'm not sure what this means.

4
I appreciate the open-source miner, girino!

I needed to do the following, in addition to the instructions, to get a successful compile and run on Ubuntu 12.04:

1.  export BOOST_LIB_PATH=/usr/local/lib/boost
2.  export BOOST_INCLUDE_PATH=/usr/include/boost
3.  export LD_LIBRARY_PATH=/usr/local/lib/boost
4.  make osfinder.sh executable

Note: Boost install locations can differ; adjust accordingly.

Also, it is not necessary to logout/login or reboot as per the AMD SDK instructions.  Just source /etc/profile.

And, does your code honor the "-m" memory buffer size options?  I suspect not, because it errors out regardless of specified memory size when I am already running the clpts miner.  My hope was that, on my 2GB R9 270x, I could run both, just as how one can run multiple clpts threads on higher-memory GPUs.

For algorithms v4, v6, v7 and v8, an extra 512 MB is used on top of the -m value. For algorithms v3 and v9, the value selected by -m is exact. (-m determines the size of the hash table to be used. Algorithms v4, v6, v7 and v8 also pre-calculate all the hashes in a batch, which uses the extra 512 MB. v3 and v9 calculate hashes on the fly, so they need no extra memory.)
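The rule described above can be written down as a small calculation. This is only a sketch of the arithmetic in the post, not code from the miner; `estimated_gpu_memory_mb` is a hypothetical helper, and algorithms the post does not mention are rejected rather than guessed.

```python
# Algorithms that pre-calculate hashes in a batch (extra 512 MB), per the post
BATCH_ALGOS = {"gpuv4", "gpuv6", "gpuv7", "gpuv8"}
# Algorithms that hash on the fly (the -m value is exact), per the post
EXACT_ALGOS = {"gpuv3", "gpuv9"}
BATCH_EXTRA_MB = 512

def estimated_gpu_memory_mb(algo, m_mb):
    """Estimate GPU memory use in MB for a given -a algorithm and -m value."""
    if algo in BATCH_ALGOS:
        return m_mb + BATCH_EXTRA_MB
    if algo in EXACT_ALGOS:
        return m_mb
    raise ValueError("memory behaviour not described for " + algo)

print(estimated_gpu_memory_mb("gpuv7", 512))  # prints 1024
print(estimated_gpu_memory_mb("gpuv9", 512))  # prints 512
```

So with the batch algorithms, -m512 already asks for about 1 GB, which matters when another miner is sharing the same card.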

5
I'm not getting any errors upon execution in Windows7, other than this:

Code:
C:\[edited]\ptsminer-win64-cygwin64-build1\ptsmi
ner\ptsminer.exe -u [edited] -device 1 -m 26
*********************************************************
*** GPU PTS miner by girino v0.2.1 Alpha 2 <experimental>
*** based on Pts Pool Miner v0.7 RC2 <experimental>
*** by xolokram/TB - www.beeeeer.org - glhf
***
*** GPU support and performance improvements by girino
***    if you like, donate:
***    PTS: PkyeQNn1yGV5psGeZ4sDu6nz2vWHTujf4h
***    BTC: 1GiRiNoKznfGbt8bkU1Ley85TgVV7ZTXce
*** thanks to wjchen for SSE4 improvements.
***
*** press CTRL+C to exit
*********************************************************
**SSE4/AVX auto-detection
using AVX
spawning 1 worker thread(s)
[WORKER0] Hello, World!
[WORKER0] GoGoGo!
connecting to 54.201.26.128:1337
Mining for approx 30 seconds to support further development
Payments to: PkyeQNn1yGV5psGeZ4sDu6nz2vWHTujf4h
[MASTER] work received - sharetarget: 03ffffffffffffffffffffffffffffffffffffffff
ffffffffffffffbeefde4d
[STATS] 2014-Feb-04 06:13:22 | VL: 0 (0.0%), RJ: 0 (0.0%), ST: 0 (0.0%)
      0 [unknown (0x2A3C)] ptsminer 9552 open_stackdumpfile: Dumping stack trace
 to ptsminer.exe.stackdump

The stackdumpfile contains this:

Code:
Exception: STATUS_ACCESS_VIOLATION at rip=00100401339
rax=0000000000000000 rbx=00000000FFFFC760 rcx=00000000FFFFC7C0
rdx=00000000FFFFC778 rsi=0000000000000010 rdi=0000000000000000
r8 =0000000000000001 r9 =0000000000000000 r10=0000000000000000
r11=00000000FFFFC830 r12=00000000FFFFC7C0 r13=0000000000000080
r14=00000000FFFFC778 r15=00000000FFFFC8E0
rbp=00000000FFFFC720 rsp=00000000FFFFC3D0
program=C:\Users\[edited]\ptsminer-win64-cygwin64-build1\ptsminer\ptsminer.exe, pid 7680, thread unknown (0x1F6C)
cs=0033 ds=002B es=002B fs=0053 gs=002B ss=002B
Stack trace:
Frame        Function    Args
000FFFFC720  00100401339 (00000000000, 00000000000, 00000000000, 00000000000)
End of stack trace

Any idea how to fix this?

You need to run with "-a gpu" (or any of the variations, gpuv2 to gpuv9). The way you ran it, the miner tries to CPU mine with the AVX instruction set, which your machine does not support.

6
Huge thanks for leaving yours open source.  It's educational to see what you did for optimizing it for OpenCL - I haven't quite wrapped my head around when it's best to use the int4/int8 types, and I appreciate you providing an example of how the CUDA version translates into optimized CL.

As an appreciation:

An optimization that I know the CUDA compiler gets through automated loop unrolling, but that I'm not sure the OpenCL compiler gets, is that in your step0to15 function, you can eliminate the + w[ i ]  for values of i between 5 and 14.  It might be worth testing a variant with two versions of step0to15, one "normal" and one for the known-zero-values in w.

  -Dave
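Why some w[i] are known to be zero follows from SHA-512 padding: a short message is followed by a 0x80 byte, zero bytes, and a 128-bit length, so most words of the first block are zero. A minimal sketch in Python, assuming a 36-byte input; that length is an assumption for illustration and is not stated in this thread. The helper names are hypothetical.

```python
import struct

def first_block_words(message):
    """Return the 16 big-endian 64-bit words of a single padded SHA-512 block."""
    if len(message) > 111:
        raise ValueError("message does not fit in one SHA-512 block")
    pad_len = 111 - len(message)          # zero bytes after the 0x80 marker
    bit_len = len(message) * 8            # message length in bits
    block = message + b"\x80" + b"\x00" * pad_len + bit_len.to_bytes(16, "big")
    return struct.unpack(">16Q", block)

def zero_word_indices(message):
    """Indices i for which w[i] is zero in the first block."""
    return [i for i, w in enumerate(first_block_words(message)) if w == 0]

# 36 bytes of non-zero data (the length is an assumption)
print(zero_word_indices(b"\xff" * 36))  # prints [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
```

Under that assumption, words 5 to 14 are zero regardless of the data, which matches the range mentioned above; w[15] holds the length and is not zero.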

I am really quite new to OpenCL. I am used to optimizing C and C++ code, so I just tried to apply the same techniques. I still have a hard time thinking in parallel ;) As far as I have read in articles and specs, CUDA does a much better job of automatic vectorization and loop unrolling than OpenCL. It is usually recommended to leave those to the compiler in CUDA, while in OpenCL people tend to do them manually. In all my tests, only "long2" really improves speed (there are 128 registers in most GPUs), but I used long8 anyway because it really makes the code more readable ;)

I was planning to inspect the intermediate values of the vectors so I could optimize a little more, but my son had surgery (tonsil removal, nothing serious) and I am spending all my free time with him instead of coding ;)

I can possibly continue working on this next week.

7
Compilation on Ubuntu 13.10 stops here
Code:
obj/main_poolminer.o:main_poolminer.cpp:(.text._ZN5boost4asio6detail16resolver_serviceINS0_2ip3tcpEE7resolveERNS_10shared_ptrIvEERKNS3_20basic_resolver_queryIS4_EERNS_6system10error_codeE[_ZN5boost4asio6detail16resolver_serviceINS0_2ip3tcpEE7resolveERNS_10shared_ptrIvEERKNS3_20basic_resolver_queryIS4_EERNS_6system10error_codeE]+0x281): more undefined references to `boost::system::system_category()' follow
obj/main_poolminer.o: In function `CMasterThread::run()':
main_poolminer.cpp:(.text._ZN13CMasterThread3runEv[_ZN13CMasterThread3runEv]+0x127): undefined reference to `boost::thread::start_thread_noexcept()'
main_poolminer.cpp:(.text._ZN13CMasterThread3runEv[_ZN13CMasterThread3runEv]+0x1dd): undefined reference to `vtable for boost::detail::thread_data_base'
main_poolminer.cpp:(.text._ZN13CMasterThread3runEv[_ZN13CMasterThread3runEv]+0x529): undefined reference to `boost::system::system_category()'
main_poolminer.cpp:(.text._ZN13CMasterThread3runEv[_ZN13CMasterThread3runEv]+0x697): undefined reference to `boost::system::system_category()'
main_poolminer.cpp:(.text._ZN13CMasterThread3runEv[_ZN13CMasterThread3runEv]+0x7a9): undefined reference to `boost::system::system_category()'
main_poolminer.cpp:(.text._ZN13CMasterThread3runEv[_ZN13CMasterThread3runEv]+0x7c0): undefined reference to `boost::system::system_category()'
main_poolminer.cpp:(.text._ZN13CMasterThread3runEv[_ZN13CMasterThread3runEv]+0x877): undefined reference to `boost::system::system_category()'
obj/main_poolminer.o:main_poolminer.cpp:(.text._ZN13CMasterThread3runEv[_ZN13CMasterThread3runEv]+0x896): more undefined references to `boost::system::system_category()' follow
obj/main_poolminer.o: In function `_GLOBAL__sub_I_collision_table_bits':
main_poolminer.cpp:(.text.startup+0xf63): undefined reference to `boost::system::generic_category()'
main_poolminer.cpp:(.text.startup+0xf6f): undefined reference to `boost::system::generic_category()'
main_poolminer.cpp:(.text.startup+0xf7b): undefined reference to `boost::system::system_category()'
main_poolminer.cpp:(.text.startup+0xfa1): undefined reference to `boost::system::system_category()'
obj/main_poolminer.o:(.rodata._ZTIN5boost6detail11thread_dataINS_3_bi6bind_tIvNS_4_mfi3mf0Iv13CWorkerThreadEENS2_5list1INS2_5valueIPS6_EEEEEEEE[_ZTIN5boost6detail11thread_dataINS_3_bi6bind_tIvNS_4_mfi3mf0Iv13CWorkerThreadEENS2_5list1INS2_5valueIPS6_EEEEEEEE]+0x10): undefined reference to `typeinfo for boost::detail::thread_data_base'
obj/CProtoshareProcessor.o: In function `_GLOBAL__sub_I__Z31protoshares_revalidateCollisionP13blockHeader_tPhjjmP14CBlockProviderPFvS1_jS1_Ej':
CProtoshareProcessor.cpp:(.text.startup+0x33): undefined reference to `boost::system::generic_category()'
CProtoshareProcessor.cpp:(.text.startup+0x3f): undefined reference to `boost::system::generic_category()'
CProtoshareProcessor.cpp:(.text.startup+0x4b): undefined reference to `boost::system::system_category()'
CProtoshareProcessor.cpp:(.text.startup+0x71): undefined reference to `boost::system::system_category()'
collect2: error: ld returned 1 exit status
make: *** [ptsminer] Error 1
What is wrong?

You need to install Boost. I'm not sure what the package name is on Ubuntu 13.10; try this:

Code:
sudo apt-get install libboost-all-dev

8
Hello all,

I just released binaries for this miner (see first post). These binaries use new sha512 code, tuned and optimized by me over the last few days. It is at least 15% faster than the old code. The binaries have a 1% fee added.

Short Version: faster! gpuv7 is default, gpuv9 is the low memory mode.

Download link: https://www.dropbox.com/sh/n4ta5olqp2g5i9l/xkr0sCTrUu

Long version: The new "-a" options "gpuv7", "gpuv8" and "gpuv9" all use the new code. "gpuv9" is the low-memory version, the same as "gpuv3" but using the new sha512 code. "gpuv8" is the same as "gpuv4" with the new code, and "gpuv7" is based on "gpuv6" but uses a linear collision avoidance hashtable instead of the plain hashtable used before. It is bound to find slightly more hashes, at the cost of being a little slower. Since it produced better CPM in all my tests, I made "gpuv7" the default mode. The old modes are still kept for compatibility (if my optimizations are not compatible with your GPU, you can always fall back to the old modes).
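To illustrate the trade-off described above, here is a toy comparison in Python. It reads "linear collision avoidance" as linear probing, which is an interpretation and not confirmed by the post; the function names, the probe limit, and the key set are all hypothetical and unrelated to the miner's real kernels.

```python
def fill_plain(keys, size):
    """Plain table: each key maps to one slot; a later key overwrites an earlier one."""
    table = [None] * size
    for k in keys:
        table[k % size] = k
    return table

def fill_linear_probing(keys, size, max_probes=4):
    """Linear probing: if the slot is busy, try the next slots before giving up."""
    table = [None] * size
    for k in keys:
        for step in range(max_probes):
            slot = (k + step) % size
            if table[slot] is None:
                table[slot] = k
                break
    return table

def stored(table):
    """Number of occupied slots."""
    return sum(1 for v in table if v is not None)

keys = [0, 8, 16, 3, 11]  # several keys share a slot in a table of size 8
print(stored(fill_plain(keys, 8)), stored(fill_linear_probing(keys, 8)))  # prints 2 5
```

The probing version keeps more entries, so more candidates survive to be compared, but each insert may touch several slots, which is the "a little slower" part.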

Please download and enjoy.

Thanks,
girino.

9
Same error here...  :(
win7_x64, 2x R9280
Hope you can fix this!  :-*

I fixed the new implementation so it works on AMD. Everyone who had " error: expression must have pointer-to-struct-or-union type" errors, please download the new version (build6), or simply update the sha512vectorized.cl file in your already downloaded build5.

Thanks to all for the bug reports and for the patience.
girino.

10
Error when running build5.

Windows 7 64-bit, HD7950; build 4 runs normally.




                                                  ^

"C:\Users\John\AppData\Local\Temp\OCL469B.tmp.cl", line 255: error: expression
          must have pointer-to-struct-or-union type
        step0to15(0); step0to15(1); step0to15(2); step0to15(3);
                                                  ^

Error limit reached.
100 errors detected in the compilation of "C:\Users\John\AppData\Local\Temp\OCL4
69B.tmp.cl".
Compilation terminated.

Frontend phase failed compilation.

---  End log  ---
Assertion failed!

Program: H:\jhProtominer-win64-mingw-build5\jhProtominer\jhProtominer.exe
File: OpenCLObjects.cpp, Line 310

Expression: !error

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

I probably used some dialect that AMD does not recognize (I tested on NVIDIA only). I will test with AMD later today and correct the errors. Thanks for the report.

11
Hello all,

I just released new binaries (ending with build5) in the same old place (see first post). These binaries use new sha512 code, tuned and optimized by me over the last few days. It is at least 15% faster than the old code.

Short Version: faster! gpuv7 is default, gpuv9 is the low memory mode.

Long version: The new "-a" options "gpuv7", "gpuv8" and "gpuv9" all use the new code. "gpuv9" is the low-memory version, the same as "gpuv3" but using the new sha512 code. "gpuv8" is the same as "gpuv4" with the new code, and "gpuv7" is based on "gpuv6" but uses a linear collision avoidance hashtable instead of the plain hashtable used before. It is bound to find slightly more hashes, at the cost of being a little slower. Since it produced better CPM in all my tests, I made "gpuv7" the default mode. The old modes are still kept for compatibility (if my optimizations are not compatible with your GPU, you can always fall back to the old modes).

Please download and enjoy.

Thanks,
girino.

12
Tested: 2x R9 290x with -m512, -m1024 or -m2048.
The result is the same:


I had the same bug as you, fredz. It is enough to launch it twice, and then it works correctly...

How strange. Does it also happen with the other algorithms? (Command line options: "-a gpuv3" and "-a gpuv4".)


13
I just released another Windows binary with new optimizations in the hashing code. This version is about 10% faster than the previous one.

Links are in the first post of this thread (the new file is jhProtominer-win64-mingw-build4.zip).

Please update and test.

14
BitShares PTS / Re: Open source OpenCL GPU miner by girino
« on: January 23, 2014, 06:18:59 am »
Using some optimizations by DGA, I got another 10% improvement in performance. Still not the fastest around, but getting closer ;)

Please use the link on the first page to test!

15
BitShares PTS / Re: Open source OpenCL GPU miner by girino
« on: January 23, 2014, 01:58:14 am »
Listen, what IDE do you use for debugging?
Is it possible to use something with Cygwin? I want to do some debugging too, but right now I don't have Linux (I have good experience with C++ and Linux but no experience with OpenCL; I'm very interested).

I use Eclipse for editing code, but no debugging IDE. Just a bunch of "#ifdef" and "printf" all over the code. I am new to OpenCL too. As of today, I have exactly 3 weeks of experience with it ;)
