Messages - yvg1900

Pages: 1 2 3 4 5 [6] 7 8 9 10 11 12 13 14
76
I am working on more AVs (algo variations) for MMC now, so it will probably be possible to get a better balance then...

But as I said, the CPU is really squeezed out by av=1, as well as by av=3.

77
The yam miner really stresses the CPU and memory when maxed out. It causes many overclocked systems to run hotter than they do under IntelBurnTest in its most aggressive mode.

If you are running Linux or Windows with HugePages, the miner adjusts the memory (DTLB) layout to better fit the memory access patterns of MMC, so HPM will climb, but temperatures will climb as well.

Can you check the memory usage of the miner? It should be free of memory leaks, but who knows, maybe I overlooked something.

yvg1900

78
I've been doing some analysis and tweaking this morning.

Test machine: an i7 3770 with 8GB

This has 4 real cores and uses hyperthreading for 8 virtual cores.

I'm finding that most of the performance (~85%) is achievable with only 4 threads - assuming those threads are maxing out the AES-NI instructions.

Mining with the other 4 threads is not efficient - it's better to use those to mine another CPU coin that doesn't require AES-NI - ProtoShares or Primecoin are good candidates.

I've just tested ProtoShares, it does impact the MemoryCoin HPM, probably because of contention for memory.

Total figures I'm seeing on the i7 3770 - 8GB

MMC - 5.2 HPM
plus
PTS - 115 CPM

This is in contrast to just running MMC exclusively - which gives about 9HPM
Might see less impact on MMC mining if using the remaining cores to mine a non-memory intensive CPU coin like Prime.

Okay - tested a Prime miner - using the non-AVX version gives better results, so I'm thinking the contention is not for memory but for the instruction set. Here are the figures:

MMC - 6.2 HPM
plus
XPM - 14,000 PPS

- This might be a good combination of coins to mine on the same machine
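
As a rough check of the figures above, the share of the exclusive MMC rate that survives in each combination can be worked out directly. This is only a sketch: the function name is made up for illustration, and 9 HPM is the approximate exclusive figure quoted above, so the percentages are approximate too.

```python
def retained_fraction(shared_hpm, exclusive_hpm):
    """Fraction of the exclusive MMC rate kept while a second miner runs."""
    if exclusive_hpm <= 0:
        raise ValueError("exclusive_hpm must be positive")
    return shared_hpm / exclusive_hpm


# "about 9HPM" when MMC is mined exclusively (figure taken from the post).
EXCLUSIVE_MMC_HPM = 9.0

# MMC rates measured while PTS or XPM was mined on the remaining threads.
for label, hpm in (("MMC + PTS", 5.2), ("MMC + XPM", 6.2)):
    share = retained_fraction(hpm, EXCLUSIVE_MMC_HPM)
    print(f"{label}: {share:.0%} of the exclusive MMC rate")
```

By these numbers MMC keeps roughly 58% of its exclusive rate next to PTS and roughly 69% next to XPM, which is the gap the post attributes to contention.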

The major point in such setups is the algo variation used (configured via the av parameter).

av=1 maximizes TOTAL HPM on HyperThreading-enabled machines, so av=1 is not efficient if you run 4 threads on a 4-core machine with HT. For that setup there is av=3, which really maxes out the AES-NI capabilities of a core that is not shared between HT threads.

Another issue is that you have to pin your process properly to logical CPUs, keeping in mind that core sharing must be avoided. This is a non-trivial task, but it can be accomplished with stock OS tools, too.
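
To illustrate the pinning idea, here is a minimal sketch for Linux only. It assumes the kernel exposes each logical CPU's HT siblings under /sys/devices/system/cpu/cpuN/topology/thread_siblings_list; all function names are made up for illustration, and none of this is part of yam. It keeps one logical CPU per physical core and restricts the current process to that set.

```python
import os
from pathlib import Path
from typing import Dict, List


def parse_cpu_list(text: str) -> List[int]:
    """Parse a Linux CPU list such as '0,4' or '0-3,8' into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def one_cpu_per_core(siblings: Dict[int, str]) -> List[int]:
    """From {logical_cpu: sibling list text}, keep the lowest sibling of each core."""
    return sorted({min(parse_cpu_list(text)) for text in siblings.values()})


def read_siblings(sysfs_root: str = "/sys/devices/system/cpu") -> Dict[int, str]:
    """Read the sibling list of every logical CPU that exposes topology info."""
    result = {}
    pattern = "cpu[0-9]*/topology/thread_siblings_list"
    for path in Path(sysfs_root).glob(pattern):
        cpu = int(path.parent.parent.name[3:])  # directory name is "cpuN"
        result[cpu] = path.read_text()
    return result


def pin_current_process() -> List[int]:
    """Restrict this process to one logical CPU per core (Linux only)."""
    cpus = one_cpu_per_core(read_siblings())
    if cpus:
        os.sched_setaffinity(0, cpus)  # pid 0 means the calling process
    return cpus
```

On a 4-core machine with HT where logical CPU n and n+4 share a core, one_cpu_per_core returns [0, 1, 2, 3]. Affinity is inherited by child processes, so a launcher that calls pin_current_process() before starting the miner would pass the restriction on.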

Multicoin CPU mining is actually the initial design idea of yam, and this is exactly why I am combining coins in a monolithic miner.

I am currently experimenting with more algo variations, so we will probably have other possibilities as well.

yvg1900

79
MemoryCoin / Re: YAM Miner and solo mining
« on: January 16, 2014, 09:34:40 pm »
Pool load balancing was planned, but it is delayed because other, more interesting and fun features are on the way, such as performance enhancements.

yvg1900

80
MemoryCoin / Re: YAM Miner and solo mining
« on: January 16, 2014, 09:03:27 pm »
The yam miner supports the getwork protocol, so if you can configure your wallet as a getwork server (which should be possible), you can try solo mining. Unfortunately, this functionality has not been tested at all.

You need to define an RPC user and password and specify your connection parameters in the yam config file. Check readme.txt for details of the mining target URI format; you have to create your own.

yvg1900

81
Also ensure that either yam-mmc.cfg is located in the same directory as the yam executable, or that you specified a reachable path to it in the --config parameter.



82
Marketplace / Re: 200 PTS - Bounty Rules and Procedures Document
« on: January 15, 2014, 12:53:30 am »
I find the rules for software bounties far too restrictive, so I propose adding the option for the Bounty Poster to decide whether following some of the steps (for example, GitHub submission and platform-specific testing) is really necessary to declare the real goal achieved.

For example, a top-level dev may implement a very good algo, but creating all the infrastructure for it (git repo, step-by-step instructions, etc.), as well as teaming up with others to fulfil that, may easily become boring. He may decide just to dump a source tree/workspace archive to Dropbox and continue with other tasks, so placing the software on GitHub can be done by anyone else.

I personally think that software development (as in software design and coding) can easily be separated from open source infrastructure support, so I suggest thinking about how to prevent these rules from converting developers into managers.

Another issue is the bounty split. Imagine developing some software optimizations or a solution to a complicated cryptography protocol problem. There are people who can come up with a clear explanation of the concept/idea/optimization approach, but who will refuse to code it and will even refuse to apply for the bounty. The proposed system with a record of work may completely discount the initial concept contribution while focusing on coding/implementation details, leaving the "opportunity opener" out of the process. So there should be a statement/guideline for the bounty poster to specifically take care of such situations.

yvg1900

P.S. This is my personal opinion only, given in response to a personal request for comment from barwizi and to support his efforts in putting these things together.

83
Done properly, if you use yam M7j for combined GPU+CPU mining on a 4-core Haswell with HyperThreading, you should run 4 threads in yam with av=3. This fully engages the AES-NI capabilities of the CPU but leaves 4 threads free for driving the GPU.

84
With the new coin, the miner fully justifies its name. :)

Yes! Thanks for bringing mining home to CPUs and Memory. 5000 MMC tip on its way to yvg1900 from MCF.

Got it. I appreciate it.

In response to the tip, I announce that I will shift my priorities toward MMC CPU mining and will put more effort into additional optimizations of MMC CPU mining:

- Better support for non-AES-NI CPUs to bring small miners closer to profitable mining;
- A variation of the mining setup with 1GB of memory shared by all threads (expect lower performance for such setups due to inter-core contention);
- More optimizations to the AES-NI algos to come even closer to GPU;
- More pool and protocol support (I encourage pool maintainers to contact me in order to provide better integration and improve the general mining experience);
- Built-in NUMA support for better CPU utilization and proper data flow routing.

I will keep concentrating on optimizing miners for physical CPUs, because cloud servers have numerous issues with memory management, NUMA compatibility, advanced instruction set exposure, etc.

I will not give any ETA; as you can all imagine, there is a lot of research work going on, and tens of different algo implementations are being tested.

yvg1900

85
Unfortunately, here we have nearly the same situation as we had with cloud servers running first versions of PTS mining (before yam, it was jhProtominer M7c-M7h, probably you followed a bit).

I recommend the following:

1. Post here the miner output that comes right after the banner message. It has info on the algo in use, the number of threads selected and the optimization compatibility checks, so we will see whether AES-NI is really working.
2. Try running it with ONE thread. You should get an ART value within a reasonable time, and this factors out many potential issues.
3. Post here the output of numactl --hardware (it is TYPICAL for cloud services NOT to support NUMA properly; to be honest, I am NOT aware of any that does it right, and their memory management is completely unacceptable for high performance computing).

From what I see, you have a multi-socket system (some Dual Xeon E5-26XX), so proper NUMA support is critical. The default memory allocation policy for such systems is typically Interleaved, which causes huge underperformance under heavy memory load due to the QPI bandwidth limitation, and that is exactly our case.
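
To make the numactl check concrete, here is a small sketch that pulls the node layout out of numactl --hardware text. The SAMPLE text below is made up for illustration and is not output from any real machine; the function name is hypothetical as well. More than one node means the box is multi-socket (or otherwise NUMA) and the miner's threads and memory should be kept on the same node.

```python
import re


def parse_numactl_hardware(text):
    """Return {node_id: {"cpus": [...], "size_mb": int}} from numactl --hardware text."""
    nodes = {}
    for line in text.splitlines():
        m = re.match(r"node (\d+) cpus:(.*)", line)
        if m:
            cpus = [int(c) for c in m.group(2).split()]
            nodes.setdefault(int(m.group(1)), {})["cpus"] = cpus
            continue
        m = re.match(r"node (\d+) size: (\d+) MB", line)
        if m:
            nodes.setdefault(int(m.group(1)), {})["size_mb"] = int(m.group(2))
    return nodes


# Illustrative text only, shaped like a two-socket system.
SAMPLE = """\
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 8 9 10 11
node 0 size: 32768 MB
node 0 free: 30000 MB
node 1 cpus: 4 5 6 7 12 13 14 15
node 1 size: 32768 MB
node 1 free: 31000 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10
"""

nodes = parse_numactl_hardware(SAMPLE)
for node_id in sorted(nodes):
    info = nodes[node_id]
    print(f"node {node_id}: {len(info['cpus'])} logical CPUs, {info['size_mb']} MB")
print("NUMA binding needed:", len(nodes) > 1)
```

With numactl itself, binding is usually done through its --cpunodebind and --membind options, so that a process stays on one node's CPUs and memory instead of crossing QPI.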

Note that the miner updates statistics in between rounds, so if no statistics are shown, a round is taking way too long. Compare this with a physical machine, such as some SB, IB or Haswell i7.

As for the crash: EXACTLY which CPU is it, and which build do you use?

yvg1900

P.S. Otherwise I would not support cloud services and low-end configs, because they simply use outdated and improperly configured software. But let us try to figure out what we can do.

86
One of the advantages of HugePages is that they bypass the OS memory swapping mechanism, so if you are unable to get them allocated, there is a high chance that some of the memory will get swapped out to disk, and this will lead to extreme underperformance.
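
On Linux, whether HugePages were actually allocated (and whether swap is configured at all) can be read from /proc/meminfo. The sketch below parses the relevant fields; the SAMPLE text is invented for illustration, the function names are hypothetical, and Windows large pages need a different check.

```python
def hugepages_status(meminfo_text):
    """Extract HugePages and swap counters from /proc/meminfo-style text."""
    wanted = ("HugePages_", "Hugepagesize", "SwapTotal")
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and key.startswith(wanted):
            fields[key] = int(parts[0])  # page counts, or kB for sizes
    return fields


def read_meminfo(path="/proc/meminfo"):
    """Read the live values (Linux only)."""
    with open(path) as handle:
        return hugepages_status(handle.read())


# Illustrative text only, not taken from a real machine.
SAMPLE = """\
MemTotal:       12288000 kB
SwapTotal:       2097148 kB
HugePages_Total:    4096
HugePages_Free:     4000
Hugepagesize:       2048 kB
"""

status = hugepages_status(SAMPLE)
reserved_mb = status["HugePages_Total"] * status["Hugepagesize"] // 1024
print("HugePages reserved:", reserved_mb, "MB")
print("swap configured:", status["SwapTotal"] > 0)
```

If HugePages_Total is 0 while SwapTotal is non-zero, the miner's buffers are ordinary pages that the OS may swap out, which is the slow case described above.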

The possibilities in your case are to add more RAM, further reduce the number of threads, or wait until I come up with an alternate version of the miner with lower RAM requirements. But expect that version to be slower. How much slower is a matter of tests and of the effort put into optimizing.

yvg1900

87
At the moment I believe there is no point in running yam M7j on non-AES-NI configs. I expect it may be even slower than the original miner, because the non-AES-NI codepath did not get enough attention from me. Probably in the next versions.

As for lower memory usage, I am working on that. A major part of the optimization was to reduce data exchange and contention between cores, which is why a separate buffer per thread is used. Reverting to the old scheme is possible in theory, but it means completely rethinking the optimizations made and needs time. You can imagine that even this set of optimizations took a long time to implement.

As for miner behavior in low-RAM situations, there is absolutely no special handling for that except for some null pointer checks. At first, application performance will degrade due to swapping (if you have swap configured; I don't), and if there is not enough RAM even with swapping, it will (may) just crash, but it will do so at a very early stage.

If you are not getting a hashrate, then the system is running extremely slowly due to swapping (=> more RAM or fewer threads), inter-socket data exchange over QPI (=> numactl or proper affinity setup) or missing AES-NI. There is an ART parameter (Average Round Time in milliseconds) that should be less than 60000 in regular cases. 45000 is typical, less than 30000 is fast.
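
The ART thresholds above can be turned into a quick check. This is just a sketch of the rule of thumb from this post; the function name and the labels are made up for illustration, and the miner itself prints only the ART number.

```python
# Likely causes of a slow round, as listed in the post above.
SLOW_CAUSES = (
    "swapping (more RAM or fewer threads)",
    "inter-socket data exchange over QPI (numactl or proper affinity setup)",
    "no AES-NI",
)


def classify_art(art_ms):
    """Classify an Average Round Time given in milliseconds."""
    if art_ms < 0:
        raise ValueError("ART cannot be negative")
    if art_ms < 30000:
        return "fast"
    if art_ms < 60000:
        return "regular"  # 45000 is the typical value
    return "too slow"


for art in (25000, 45000, 90000):
    verdict = classify_art(art)
    print(art, "ms:", verdict)
    if verdict == "too slow":
        for cause in SLOW_CAUSES:
            print("  check:", cause)
```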

If you reduced the number of threads due to RAM constraints, consider using av=2 or av=3 to gain from the non-HT algo variations.

yvg1900

P.S. At the end, it is MemoryCoin :) It is supposed to consume A LOT of memory for PoW :)

88
Azure VPS is known to have problems with AES-NI and AVX virtualization support.

You may try running the barcelona or generic build, or try these builds in PTS mining mode.


89
sorry, was still testing.

8 threads in use
all 12gb ram was free
hugepages enabled/disabled - no difference

and now, sth new :D miner doesn't start anymore with aesni on
and with it off it shows the same msg X_X

i tested on all my win2008/2012 servers
all behave the same: miner closes, or this error

EXACT CPU model? K10 is a family, and did you try generic and barcelona builds?

90
The info you provided is simply not enough to assist you.

How many threads are you running? Is AES-NI enabled? What are the compat check messages and pool connection states? How much RAM is free before the miner starts? Are hugepages enabled, or is there a warning?
