BitShares Forum
		Other => Graveyard => BitShares PTS => Topic started by: slothlike on January 16, 2014, 06:06:59 am
		
			
			- 
				---- UPDATE -----
 I have changed the core GPU code a bit and managed to eke out about 10% more performance on sm35 devices (GTX Titan, GTX 780, GTX 780 Ti, and some odd variant of the 640).  On my GTX Titan I now average 2200 cpm (overclocked) or 2000 cpm (stock); before the change I got 2000 (overclocked) and 1800 (stock).  I haven't made any improvements for older devices, since I don't own any and so was less interested in non-sm35 devices.  I will send DGA a pull request to get the sm35 changes backported to his code, or just submit a patch, depending on what he prefers.
 My changes have increased register pressure in a way I haven't been able to get rid of, so I am sure it could get even faster.  Lastly, I updated the donation to 2%: 1% for me and 1% for DGA.  As always, I look forward to hearing whether this release works better for those of you with top-end cards, as I don't have a GTX 780 or 780 Ti to test on.
 The new sm35 binary can be downloaded below, or as always you can pull the updated source from GitHub and compile it yourself.  If you incorporate these changes into your own miners I would love a small donation ;).  Pc9oQoKptcwnQMoTj3RBvHzDVxx97fu6Kq
 
 Updated Binary for SM35:
 https://dl.dropboxusercontent.com/u/33838/cudaptswin-0.2-SM35.7z
 
 
 As a warning: while it is stable and fast right now, threading still needs work.  If you have multiple devices you must specify a device number or it will crash.  Thanks to DGA, this appears to be the fastest Windows miner I have used.  I get 1950 cpm on my Titan with it, versus about 1500 cpm with abc's miner, which was the previous best miner I had seen for Windows.  I have posted a rough draft of the Windows README below.  This miner only works with the ptsweb.beeeeer.org pool (YPOOL controls way too much of the network anyway; 51%+ is bad for PTS).  You do not need to register: just use your address and they will send you PTS every time you reach 0.1 PTS.
 --Adam
 
 PURPOSE:
 The purpose of CUDAPTSWIN is that I got sick of seeing all these ports of DGA's code with no one releasing their source code or explaining how they got it to compile under Windows, so I decided to do it myself.  (Note: I also added my own address to the donation list, so this miner has a 1% fee with 2/3 going to DGA and 1/3 to me; feel free to take it out when you build.)  The most current Windows code can always be found at:
 https://github.com/acarasso/cudapts.git
 I have changed as little code as possible, as my intention is to keep pulling in DGA's improvements from the Linux miner as he makes them.  I will do my best to keep it up to date with his latest changes and any other changes people offer that might improve performance.
 BINARIES:
 The latest binary distribution can be found at:
 https://dl.dropboxusercontent.com/u/33838/CUDAPTSWINx64sm30.zip
 This binary is for x64 systems and requires compute capability 3.0 (GTX 650 and up, I believe).  I currently get 1950 to 2000 cpm on my GTX Titan with this release.
 You might also need to download and install the Microsoft Visual C++ redistributable: http://www.microsoft.com/en-au/download/details.aspx?id=30679.  It comes with many Windows games and applications, though, so if you already have Skyrim, Photoshop, or a host of other programs installed, you probably won't need it.
 IF YOU HAVE MORE THAN ONE CUDA DEVICE YOU MUST SPECIFY A DEVICE NUMBER, like the example below.  (I broke threading when I ported it to Windows and you will get an error if you don't; hopefully I will fix that soon.)
 CUDAPTSWIN Pc9oQoKptcwnQMoTj3RBvHzDVxx97fu6Kq 0
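 For anyone building from source, the rule above (address first, device number optional only when there is a single GPU) could be checked up front with something like the sketch below.  This is not code from the cudapts repository: MinerArgs and parse_miner_args are hypothetical names, and the device count is passed in as a plain integer so the sketch does not depend on the CUDA runtime.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical result type; not taken from the cudapts source.
struct MinerArgs {
    std::string address;  // PTS payout address
    int device;           // CUDA device index to mine on
};

// Parse "<address> [device]" (program name already stripped).
// device_count is supplied by the caller, so no CUDA call is made here.
// Returns false when the arguments are unusable.
bool parse_miner_args(const std::vector<std::string>& args,
                      int device_count,
                      MinerArgs& out) {
    if (args.empty() || args.size() > 2 || device_count < 1) return false;
    out.address = args[0];
    out.device = 0;
    if (args.size() == 1) {
        // With several GPUs the device number is mandatory.
        return device_count == 1;
    }
    char* end = nullptr;
    const long dev = std::strtol(args[1].c_str(), &end, 10);
    if (end == args[1].c_str() || *end != '\0') return false;  // not a number
    if (dev < 0 || dev >= device_count) return false;          // out of range
    out.device = static_cast<int>(dev);
    return true;
}
```

 With two cards installed, the command line shown above would parse to device 0, while leaving the number off would be rejected with a clear failure instead of a crash.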
 
 
 REQUIREMENTS TO COMPILE:
 To compile this code yourself you will need the CUDA SDK 5.5 for Windows.  You can get it from NVIDIA: https://developer.nvidia.com/cuda-toolkit
 You will also need boost libraries and headers for windows:
 http://www.boost.org/doc/libs/1_55_0/more/getting_started/windows.html
 I REALLY REALLY suggest downloading the binary release rather than compiling Boost yourself, as a self-compile of Boost can take a very long time.  The binaries can be found here:
 http://sourceforge.net/projects/boost/files/boost-binaries/1.55.0/
 I put my version of Boost in c:/local/boost_1_55_0/ and my project file expects it to be there.  If you move it you need to change three settings on the project.  Under Visual C++ Directories in the project config, add your Boost header location to the header path and your Boost library location to the library path.  Then add your Boost library path to the linker library path.
 If you have any questions feel free to ask me and I will try to answer them.
 
- 
				Which commit is it based on? 
 
 EDIT: Doesn't work for me on a GTX 560 ti
- 
				I pulled in his very latest changes from the 14th
 
 "EDIT: Doesn't work for me on a GTX 560 ti"
 
 The 560 Ti is compute capability 2.1, not 3.0.  Only Kepler cards (GTX 650 and up) support compute 3.0.  If you want to give this build a go, you can change the CUDA architecture to 2.0 in the project properties under CUDA.  Building for compute 3.0 is a huge performance boost for those of us whose cards support it.  I will try to add another build config to the project to support lower compute capabilities for older cards.
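 To make the compatibility rule concrete, here is a minimal sketch that maps a card's compute capability to the builds mentioned in this thread (sm20, sm30, sm35).  pick_build is a hypothetical helper, not part of the miner, and it only answers "which is the newest build this card can run", not "which build is fastest".

```cpp
#include <string>

// Hypothetical helper: pick the newest build whose minimum compute
// capability the card meets, e.g. (2, 1) for a GTX 560 Ti.
// Returns an empty string when none of the builds fits.
std::string pick_build(int major, int minor) {
    const int cc = major * 10 + minor;  // 3.5 -> 35, 2.1 -> 21
    if (cc >= 35) return "sm35";
    if (cc >= 30) return "sm30";
    if (cc >= 20) return "sm20";
    return "";
}
```

 A 560 Ti (2.1) lands on the sm20 build and a Titan (3.5) on sm35.  Treat the result as a starting point and benchmark the lower builds too, since the highest supported architecture is not guaranteed to give the best cpm.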
- 
				What sm are you using? And are you sure that a higher sm gives better performance?
			
- 
				please add cuda 2.1 version, 560ti
			
- 
				Thanks for this, do you have any plans on adding in the ability to use other pools?
			
- 
				I won't add ypool.  Is there another pool you want?  Ypool has way too much of the market already.  Plus their fees are outrageous.
			
- 
				please add cuda 2.1 version, 560ti
 
 An SM 2.0 build should work on your 560 Ti.  At least in my testing, it is slower than the sm30 build on cards that can run both:   https://dl.dropboxusercontent.com/u/33838/CUDAAPTSWINSM20.7z
- 
				What sm are you using? And are you sure that a higher sm gives better performance?
 
 sm30.  It gives me 300 extra cpm, not sure why.  sm35 is also slower than sm30, so it may be something specific to the GTX Titan.
- 
				Hello.
 
 I tried the compiled version. I get the error:
 
 ---------------------------
 CUDAPTSWIN.exe - System Error
 ---------------------------
 The program can't start because MSVCP110.dll is missing from your computer. Try reinstalling the program to fix this problem.
 
 Windows 8 x64 | Phenom II X6 | Asus Geforce 660 GTX
 
 Any idea what I should do? Thanks.
 
- 
 Ah, my apologies.  That means you need the Microsoft Visual C++ redistributable, which can be found here.  Installing it should fix your problem.
 http://www.microsoft.com/en-au/download/details.aspx?id=30679
- 
				THANK YOU.
 
 I will do donation mining. Absolutely.
- 
				Only curious--did my posts about iruu's suggestion and also any of my code help out?
 
 (I still vote you get mining donation prizes for getting it done.)
- 
 Nope.  I started playing with making some example CUDA projects from scratch and realized I had simply been setting things up wrong in the project file, so I started over with a fresh CUDA project and imported DGA's source.  The other gotcha is making sure the sha helper file is not set to compile and is set to include only.  DGA also tried to make a few changes to make it easier to compile on Windows.  Sadly Visual Studio 2012 isn't C99 compatible, so you have to compile it as a C++ project and remove all the restricts as well.
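 On the restrict point: one common alternative to deleting the qualifier outright is to hide it behind a macro, since C99 `restrict` is not a C++ keyword but many compilers accept `__restrict` as an extension.  The sketch below only illustrates that idea; PTS_RESTRICT and xor_words are made-up names and this is not what the cudapts source actually does.

```cpp
#include <cstddef>

// C99 `restrict` is not valid C++.  MSVC, GCC and Clang accept
// `__restrict` as an extension; elsewhere the macro expands to nothing.
#if defined(_MSC_VER) || defined(__GNUC__) || defined(__clang__)
#define PTS_RESTRICT __restrict
#else
#define PTS_RESTRICT
#endif

// Hypothetical function, only here to show the qualifier in use:
// XOR n words from src into dst, promising the buffers do not overlap.
void xor_words(unsigned long long* PTS_RESTRICT dst,
               const unsigned long long* PTS_RESTRICT src,
               std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] ^= src[i];
    }
}
```

 The upside is that the same source still carries the no-aliasing hint on every compiler that understands it, instead of losing it everywhere just to satisfy one toolchain.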
 
- 
				I won't add ypool.  Is there another pool you want?  Ypool has way too much of the market already.  Plus their fees are outrageous.
 
 
 If you'd add support for pts.1gh.com/ that would be nice.
- 
				Only curious--did my posts about iruu's suggestion and also any of my code help out?
 
 (I still vote you get mining donation prizes for getting it done.)
 
 
 I should mention... I tried out your code yesterday and it worked for me.  Initially it wasn't printing anything to the screen after the startup messages; however, I merged the latest changes DGA made with your source and it worked fine.  I was going to post but then saw this, so I didn't bother.  This version seems to perform the same, 1600-1700 c/m on a 780.
- 
 None of these versions should differ in speed; all of them are based on DGA's code.
- 
 Actually, I find what matters most for speed is whether you pick the right architecture for your card.  Archit, when I run your code I get 1500 cpm on my Titans.  When I run this code compiled with sm35 or sm20 I get 1500 cpm, but when I run this code compiled with sm30 I get 2000 cpm per Titan.  I haven't had the time to decompile and see what the CUDA compiler is doing.  By all rights sm35 should be the faster, as the SHA-512 code can take advantage of the new funnel shift operators, but it's not.
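 For readers wondering why funnel shift matters for SHA-512: the hash is built from 64-bit rotations, and a 64-bit rotate can be expressed as two 32-bit funnel shifts over the high and low halves of the word.  The sketch below emulates that on the host in plain C++.  The funnelshift_r function here is a stand-in written to behave like CUDA's `__funnelshift_r(lo, hi, shift)` intrinsic (low 32 bits of hi:lo shifted right by shift & 31), and rotr64_funnel is a hypothetical name; neither is taken from the miner's source.

```cpp
#include <cstdint>
#include <utility>

// Host-side stand-in for a 32-bit funnel shift right:
// concatenate hi:lo, shift right by (shift & 31), keep the low 32 bits.
inline std::uint32_t funnelshift_r(std::uint32_t lo, std::uint32_t hi,
                                   std::uint32_t shift) {
    const std::uint64_t v = (static_cast<std::uint64_t>(hi) << 32) | lo;
    return static_cast<std::uint32_t>(v >> (shift & 31u));
}

// 64-bit rotate right built from two 32-bit funnel shifts.
inline std::uint64_t rotr64_funnel(std::uint64_t x, unsigned n) {
    n &= 63u;
    std::uint32_t lo = static_cast<std::uint32_t>(x);
    std::uint32_t hi = static_cast<std::uint32_t>(x >> 32);
    if (n >= 32u) {
        // Rotating by 32 just swaps the halves; rotate by the rest after.
        std::swap(lo, hi);
        n -= 32u;
    }
    const std::uint32_t new_lo = funnelshift_r(lo, hi, n);  // bits from hi:lo
    const std::uint32_t new_hi = funnelshift_r(hi, lo, n);  // bits from lo:hi
    return (static_cast<std::uint64_t>(new_hi) << 32) | new_lo;
}
```

 Each rotation becomes two shift instructions rather than a shift/shift/or sequence per half, which is why sm35 would be expected to come out ahead on paper.  Whether the compiler actually emits that, and what it costs in registers, is a separate question that only the generated code can answer.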
 
- 
 sm30 fast
 gtx780
 cudaPTSwin-0.3-cuda-3.0: 1800cpm
 cudaPTSwin-0.3-cuda-3.5: 1200cpm
- 
 Wow, did you use my optimized versions? Try the 0.4 version :)
- 
				How do I start this miner? Do I make a .bat file with my address pointed to CUDAPTSWIN.exe?
			
- 
				Hello.
 
 I tried the compiled version. I get the error:
 
 ---------------------------
 CUDAPTSWIN.exe - System Error
 ---------------------------
 The program can't start because MSVCP110.dll is missing from your computer. Try reinstalling the program to fix this problem.
 
 Windows 8 x64 | Phenom II X6 | Asus Geforce 660 GTX
 
 Any idea what I should do? Thanks.
 
 
 
 Ah, my apologies.  That means you need the Microsoft Visual C++ redistributable, which can be found here.  Installing it should fix your problem.
 http://www.microsoft.com/en-au/download/details.aspx?id=30679
 
 
 Thank you very much, that was it.
- 
 sm30 fast
 gtx780
 cudaPTSwin-0.3-cuda-3.0: 1800cpm
 cudaPTSwin-0.3-cuda-3.5: 1200cpm
 
 
 Wow, did you use my optimized versions? Try the 0.4 version :)
 
 0.4 = 1200 cpm  :(
- 
				It's fast, good job!
 Forgive me, my English is not good.
 Is ypool the only pool you block? My connection to beeeeer.org is slow, so I would like to connect to another pool. Thanks!
- 
				Please add a CUDA 2.0 build for the GTX 480. Thanks!
 
- 
				This works great dude! I'm very happy with it :D
 
 (http://s22.postimg.org/vbchymt19/GTX780.jpg) (http://postimg.org/image/vbchymt19/)
 
 As the screenshot shows, I managed to get 2100-2200 on my GTX 780.
 
 Is there any possibility you can add support for other pools such as http://pts.1gh.com/?