Author Topic: Open source optimized PTS CPU miner (BETA)  (Read 47686 times)

Offline dga

  • Full Member
  • ***
  • Posts: 122
    • View Profile
Re: Open source optimized PTS CPU miner (BETA)
« Reply #41 on: January 16, 2014, 07:51:21 pm »
yea, i did that already but thanks anyway :D

and I'm getting this now:
Code:
Could not mmap hugepage, reverting to malloc: Cannot allocate memory


btw is there any way to reduce memory usage to say 512 or 768 MB per thread?

Not from the command line.  That's my next planned optimization.  I need to finish poking at some other constants to figure out how aggressive I want to be about pushing the memory.

Stay tuned.  I think I can get that into the binary by tonight.  For now, you can run on fewer threads -- you'll find that 4 is actually nearly as happy as 6, and 6 is typically happier for me than 8.

  -Dave

Ok.  I've replaced the binary at the old URL with a new build that uses about 600MB of RAM per thread.  Thanks for the feature request - I'd been meaning to implement this optimization, and it looks from here like it's giving a very pleasant speedup just from using less memory (for those who care, this helps reduce TLB misses).  I haven't run it long enough to get a stable number out of it, but it's looking like 460-475 cpm on an i7-4770.  The 4770k users should be cracking 500cpm.

  -Dave
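
A rough illustration of why a smaller working set can reduce TLB misses: the number of page mappings needed to cover a thread's memory. This is a back-of-the-envelope sketch, not anything taken from the miner. The 600 MB figure comes from the post above; the 4 KiB and 2 MiB page sizes are the usual x86_64 values and are an assumption about the machine.

```python
# Page-count arithmetic only; page sizes are assumed (typical x86_64 values).
MIB = 1024 * 1024


def pages_needed(working_set_bytes, page_bytes):
    """Number of pages required to map the working set, rounded up."""
    return -(-working_set_bytes // page_bytes)


working_set = 600 * MIB                           # per-thread figure quoted above
small_pages = pages_needed(working_set, 4 * 1024)  # 4 KiB base pages
huge_pages = pages_needed(working_set, 2 * MIB)    # 2 MiB huge pages

print(small_pages, huge_pages)
```

Fewer distinct pages means fewer address translations competing for the TLB, which is also why the huge page option discussed elsewhere in this thread helps.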

Offline Aber

  • Jr. Member
  • **
  • Posts: 23
    • View Profile
Re: Open source optimized PTS CPU miner (BETA)
« Reply #40 on: January 16, 2014, 06:53:27 pm »
Nice work dga :) can u add 1gh?

Offline dga

  • Full Member
  • ***
  • Posts: 122
    • View Profile
Re: Open source optimized PTS CPU miner (BETA)
« Reply #39 on: January 16, 2014, 06:17:04 pm »
yea, i did that already but thanks anyway :D

and I'm getting this now:
Code:
Could not mmap hugepage, reverting to malloc: Cannot allocate memory


btw is there any way to reduce memory usage to say 512 or 768 MB per thread?

Not from the command line.  That's my next planned optimization.  I need to finish poking at some other constants to figure out how aggressive I want to be about pushing the memory.

Stay tuned.  I think I can get that into the binary by tonight.  For now, you can run on fewer threads -- you'll find that 4 is actually nearly as happy as 6, and 6 is typically happier for me than 8.

  -Dave

Offline noobster

  • Jr. Member
  • **
  • Posts: 35
  • cryptocurrencies vs. fed
    • View Profile
Re: Open source optimized PTS CPU miner (BETA)
« Reply #38 on: January 16, 2014, 06:08:48 pm »
yea, i did that already but thanks anyway :D

and I'm getting this now:
Code:
Could not mmap hugepage, reverting to malloc: Cannot allocate memory


btw is there any way to reduce memory usage to say 512 or 768 MB per thread?
BTC: 15mey7vTkkvHm4UoZgVEP4Yo3REDpH87KW
PTS: PkzbnN7Nkv6TcqJuNjpcLfmPqpPUphpu5W
drop some =)

Offline dga

  • Full Member
  • ***
  • Posts: 122
    • View Profile
Re: Open source optimized PTS CPU miner (BETA)
« Reply #37 on: January 16, 2014, 05:46:39 pm »
Code:
Couldn't use the hugepage speed optimization.  Enable huge pages for a slight speed boost.
kernel config:
Code:
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
# CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y

Code:
$ cat /proc/meminfo | grep HugePages
AnonHugePages:     14336 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0

followed https://wiki.archlinux.org/index.php/KVM#Enabling_huge_pages

What am I missing here?

sudo bash
echo "4096" > /proc/sys/vm/nr_hugepages
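
To check whether the reservation took effect, the HugePages_* counters in /proc/meminfo can be read back. The sketch below parses that text and reports how much memory is reserved. The sample text is made up for illustration, and the 2048 kB page size is an assumption (note that `grep HugePages` does not match the `Hugepagesize` line, so it is missing from the output quoted above).

```python
def hugepage_summary(meminfo_text):
    """Return (total_pages, free_pages, page_size_kib) from /proc/meminfo text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            try:
                fields[key.strip()] = int(parts[0])
            except ValueError:
                pass  # skip any line whose value is not a plain integer
    return (fields.get("HugePages_Total", 0),
            fields.get("HugePages_Free", 0),
            fields.get("Hugepagesize", 0))


# Hypothetical sample of what a successful reservation might look like.
sample = """HugePages_Total:    4096
HugePages_Free:     4096
Hugepagesize:       2048 kB
"""

total, free, page_kib = hugepage_summary(sample)
print(total * page_kib // 1024, "MiB reserved")
```

On a real machine, pass in `open("/proc/meminfo").read()` instead of the sample. With 2 MiB pages, 4096 pages is 8 GiB, so on a box with less free memory the kernel may reserve fewer pages than requested and a smaller number may be more appropriate.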

Offline noobster

  • Jr. Member
  • **
  • Posts: 35
  • cryptocurrencies vs. fed
    • View Profile
Re: Open source optimized PTS CPU miner (BETA)
« Reply #36 on: January 16, 2014, 05:29:36 pm »
Code:
Couldn't use the hugepage speed optimization.  Enable huge pages for a slight speed boost.
kernel config:
Code:
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
# CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y

Code:
$ cat /proc/meminfo | grep HugePages
AnonHugePages:     14336 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0

followed https://wiki.archlinux.org/index.php/KVM#Enabling_huge_pages

What am I missing here?

Offline dga

  • Full Member
  • ***
  • Posts: 122
    • View Profile
Re: Open source optimized PTS CPU miner (BETA)
« Reply #35 on: January 16, 2014, 04:18:14 pm »
I've released beta 2 of my AVX2-optimized build for Linux x64:

http://www.cs.cmu.edu/~dga/ptsminer/ptsminer-dga-beta2-avx2-linux64.bin

EEeeeeeek.  If you grabbed it in the prior 30 minutes, download again.  I botched the dev-fee switching when I implemented the new dev mining code and it's not switching properly.

Sorry about that.  Re-tested and it's happy.

Offline dga

  • Full Member
  • ***
  • Posts: 122
    • View Profile
Re: Open source optimized PTS CPU miner (BETA)
« Reply #34 on: January 16, 2014, 04:09:57 pm »
dga any plans of blessing the people who only have avx?

It's a lot harder to beat Intel's assembly-optimized sha512 on avx than it was on avx2.  I'll port my most recent speed improvements back, but the biggest speed gain came from rewriting the sha512 computation, and I'm not going to do that for avx.  I'll give a few more % in the avx version of my code, but it won't be the same as the 80cpm jump I just introduced for avx2.

It'll be a while.  I've used up my free time coding quota for the week. :)

Offline archit

  • Full Member
  • ***
  • Posts: 161
    • View Profile
Re: Open source optimized PTS CPU miner (BETA)
« Reply #33 on: January 16, 2014, 04:04:08 pm »
dga any plans of blessing the people who only have avx?

Offline dga

  • Full Member
  • ***
  • Posts: 122
    • View Profile
Re: Open source optimized PTS CPU miner (BETA)
« Reply #32 on: January 16, 2014, 04:01:49 pm »
I've released beta 2 of my AVX2-optimized build for Linux x64:

http://www.cs.cmu.edu/~dga/ptsminer/ptsminer-dga-beta2-avx2-linux64.bin

(Note the changed URL).  This one is still binary-only -- I've been focusing on speed, not making it possible for anyone else to build this hunk o'junk code.

This is the first version of my code that beats 400 cpm on a stock i7-4770.  You i7-4770k overclocked folks should see very happy results.  I've affectionately termed this release "herbivore", because, of course, that's what eats yams for dinner.   :)

This version has a 3% dev fee, which I'll reduce further in later builds.  If it's not clear, I'm using the ratcheting-down dev fee as a good reason for people to upgrade to the later releases and not have old versions of the code floating around.

I've updated the dev fee mechanism a little, so don't freak out:
  - It mines for the first 60 seconds for dev
  - It mines for the next 2000 seconds for the user
  -- After that, those numbers are multiplied by 20, so that the miner runs with fewer interruptions:  20 minutes of dev mining followed by 1.3 days of blissfully uninterrupted user mining.
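
For anyone checking the numbers, here is a small sketch of the schedule arithmetic using only the figures given above (60 s dev, 2000 s user, then both multiplied by 20). The function name is made up for illustration and this is not the miner's actual code.

```python
def dev_share(dev_seconds, user_seconds):
    """Fraction of one full cycle spent mining for the developer."""
    return dev_seconds / (dev_seconds + user_seconds)


first_cycle = (60, 2000)                        # seconds: dev, user
later_cycle = tuple(20 * s for s in first_cycle)  # both phases scaled by 20

print(round(100 * dev_share(*first_cycle), 2), "% dev share")
print(later_cycle[0] / 60, "minutes dev,", later_cycle[1] / 3600, "hours user")
```

The share works out to a little under 3% in both phases, since scaling both numbers by the same factor leaves the ratio unchanged. Taken literally, 2000 s times 20 is 40,000 s, or roughly 11 hours of user mining, which is shorter than the 1.3 days quoted, so one of the two figures in the post is presumably approximate.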
 
Still tied to beeeeeeer.  Are there other pools that use the same protocol as beer?  I can support those easily.

  -Dave

Offline honger18

  • Newbie
  • *
  • Posts: 5
    • View Profile
Re: Open source optimized PTS CPU miner (BETA)
« Reply #31 on: January 16, 2014, 03:14:23 pm »
Quote
Oof.  This is going to be a problem on a 32 bit system.  There are some very x86_64 specific chunks of code in the assembly-optimized sha512 routines (which you need if you want this thing to be fast).

Sorry.

I was afraid of that, no problem. Maybe finally a good reason to convert my main desktop to 64bit...

Offline dga

  • Full Member
  • ***
  • Posts: 122
    • View Profile
Re: Open source optimized PTS CPU miner (BETA)
« Reply #30 on: January 16, 2014, 02:50:25 pm »
I googled Haswell, and since mine isn't one I tried commenting out the failing sha512_avx2.S bit in makefile.unix, since apparently I can't use it anyway, but now it fails with the following. Not sure if I need to do more than mess with the makefile...


Code:
~/comps/ptsminer/src$ make -f makefile.unix
/usr/bin/ld: /usr/lib/debug/usr/lib/i386-linux-gnu/crt1.o(.debug_info): relocation 11 has invalid symbol index 13
make: *** [ptsminer] Error 1

Oof.  This is going to be a problem on a 32 bit system.  There are some very x86_64 specific chunks of code in the assembly-optimized sha512 routines (which you need if you want this thing to be fast).

Sorry. :(
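
One way to see what a given machine could run before fighting the makefile is to look at the CPU flags in /proc/cpuinfo. The sketch below pulls out the flags that matter here: `lm` (the CPU is 64-bit capable), `avx`, and `avx2`. The sample text is hypothetical, and having `lm` only says the CPU can run a 64-bit OS; a 32-bit install still has a 32-bit toolchain.

```python
def cpu_features(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key.strip() == "flags":
            return set(rest.split())
    return set()


# Hypothetical excerpt; on a real system use open("/proc/cpuinfo").read().
sample = "processor\t: 0\nflags\t\t: fpu sse2 sse4_1 lm avx\n"

flags = cpu_features(sample)
for name in ("lm", "avx", "avx2"):
    print(name, "yes" if name in flags else "no")
```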

Offline daem0n

  • Jr. Member
  • **
  • Posts: 22
    • View Profile
Re: Open source optimized PTS CPU miner (BETA)
« Reply #29 on: January 16, 2014, 02:45:16 pm »
Works on Ubuntu 13.10!  :D

1 - git clone https://github.com/dave-andersen/ptsminer

2 - sudo apt-get install build-essential libboost-system-dev libboost-filesystem-dev libboost-program-options-dev libboost-thread-dev zlib1g-dev yasm

3 - cd ptsminer/src

4 - make -f makefile.unix

5 - ./ptsminer <address> <threads> avx

Example: ./ptsminer Padf809dfgdf9OP23nht8f02j3f0 8 avx

 8)

Offline honger18

  • Newbie
  • *
  • Posts: 5
    • View Profile
Re: Open source optimized PTS CPU miner (BETA)
« Reply #28 on: January 16, 2014, 02:12:30 pm »
I googled Haswell, and since mine isn't one I tried commenting out the failing sha512_avx2.S bit in makefile.unix, since apparently I can't use it anyway, but now it fails with the following. Not sure if I need to do more than mess with the makefile...


Code:
~/comps/ptsminer/src$ make -f makefile.unix
g++ -c -O3  -fpermissive -o obj/cpuid.o cpuid.c
yasm -f elf32 -o obj/sha512_avx.o intel/sha512_avx.asm
yasm -f elf32 -o obj/sha512_sse4.o intel/sha512_sse4.asm
g++ -Wl,-z,relro -Wl,-z,now  -o ptsminer  obj/cpuid.o obj/sha512_avx.o obj/sha512_sse4.o  -Wl,-Bdynamic -l boost_system -l boost_filesystem -l boost_program_options -l boost_thread -l boost_chrono -Wl,-Bdynamic -l z -l dl -l pthread
/usr/bin/ld: /usr/lib/debug/usr/lib/i386-linux-gnu/crt1.o(.debug_info): relocation 0 has invalid symbol index 11
/usr/bin/ld: /usr/lib/debug/usr/lib/i386-linux-gnu/crt1.o(.debug_info): relocation 1 has invalid symbol index 12
/usr/bin/ld: /usr/lib/debug/usr/lib/i386-linux-gnu/crt1.o(.debug_info): relocation 2 has invalid symbol index 2
/usr/bin/ld: /usr/lib/debug/usr/lib/i386-linux-gnu/crt1.o(.debug_info): relocation 3 has invalid symbol index 2
/usr/bin/ld: /usr/lib/debug/usr/lib/i386-linux-gnu/crt1.o(.debug_info): relocation 4 has invalid symbol index 11
/usr/bin/ld: /usr/lib/debug/usr/lib/i386-linux-gnu/crt1.o(.debug_info): relocation 5 has invalid symbol index 13
/usr/bin/ld: /usr/lib/debug/usr/lib/i386-linux-gnu/crt1.o(.debug_info): relocation 6 has invalid symbol index 13
/usr/bin/ld: /usr/lib/debug/usr/lib/i386-linux-gnu/crt1.o(.debug_info): relocation 7 has invalid symbol index 13
/usr/bin/ld: /usr/lib/debug/usr/lib/i386-linux-gnu/crt1.o(.debug_info): relocation 8 has invalid symbol index 12
/usr/bin/ld: /usr/lib/debug/usr/lib/i386-linux-gnu/crt1.o(.debug_info): relocation 9 has invalid symbol index 13
/usr/bin/ld: /usr/lib/debug/usr/lib/i386-linux-gnu/crt1.o(.debug_info): relocation 10 has invalid symbol index 13
/usr/bin/ld: /usr/lib/debug/usr/lib/i386-linux-gnu/crt1.o(.debug_info): relocation 11 has invalid symbol index 13
/usr/bin/ld: /usr/lib/debug/usr/lib/i386-linux-gnu/crt1.o(.debug_info): relocation 12 has invalid symbol index 13
/usr/bin/ld: /usr/lib/debug/usr/lib/i386-linux-gnu/crt1.o(.debug_info): relocation 13 has invalid symbol index 13
/usr/bin/ld: /usr/lib/debug/usr/lib/i386-linux-gnu/crt1.o(.debug_info): relocation 14 has invalid symbol index 13
/usr/bin/ld: /usr/lib/debug/usr/lib/i386-linux-gnu/crt1.o(.debug_info): relocation 15 has invalid symbol index 13
/usr/bin/ld: /usr/lib/debug/usr/lib/i386-linux-gnu/crt1.o(.debug_info): relocation 16 has invalid symbol index 13
/usr/bin/ld: /usr/lib/debug/usr/lib/i386-linux-gnu/crt1.o(.debug_info): relocation 17 has invalid symbol index 13
/usr/bin/ld: /usr/lib/debug/usr/lib/i386-linux-gnu/crt1.o(.debug_info): relocation 18 has invalid symbol index 13
/usr/bin/ld: /usr/lib/debug/usr/lib/i386-linux-gnu/crt1.o(.debug_info): relocation 19 has invalid symbol index 13
/usr/bin/ld: /usr/lib/debug/usr/lib/i386-linux-gnu/crt1.o(.debug_info): relocation 20 has invalid symbol index 13
/usr/bin/ld: /usr/lib/debug/usr/lib/i386-linux-gnu/crt1.o(.debug_info): relocation 21 has invalid symbol index 22
/usr/bin/ld: /usr/lib/debug/usr/lib/i386-linux-gnu/crt1.o(.debug_line): relocation 0 has invalid symbol index 2
/usr/lib/gcc/i686-linux-gnu/4.8/../../../i386-linux-gnu/crt1.o: In function `_start':
(.text+0x18): undefined reference to `main'
collect2: error: ld returned 1 exit status
make: *** [ptsminer] Error 1

Offline fishrat

  • Full Member
  • ***
  • Posts: 50
    • View Profile
Re: Open source optimized PTS CPU miner (BETA)
« Reply #27 on: January 16, 2014, 02:12:00 pm »
very good