Author Topic: Shuffle Bounty $200 BitUSD  (Read 14389 times)


Offline BTSdac

  • Hero Member
  • *****
  • Posts: 1219
    • View Profile
  • BitShares: K1

Actually, no matter how you shuffle, the average interval between publishing the hash of a secret and revealing it is 101 blocks, but you cannot keep every delegate's interval at exactly 101: some delegates must have an interval shorter than 101 and some longer.
If we keep the MIN interval above 90 with a good shuffle, I think that is easy enough, and the randomness is honest enough for BTS.
But maybe not for a game, which is why I suggest using a two-round history.

I might be misunderstanding you, but even though some delegates have an interval of more than 101 and some less, the idea "one delegate is honest, the random is honest" is essentially true for the last random number of the round under the current system, because the interval for the last revealed number in the round is at least 101.

But yeah, it's a simplification and ignores intentionally not producing a block (this doesn't let you determine the random number, but does let you influence it).  It also ignores skipping honest delegates' blocks on purpose... At least these things might raise red flags or be detectable.  Newly elected delegates would also have to be handled correctly.

I think if we want to use the random number to play games, maybe we should update DPOS a little to suit short-wait games, to avoid the attack where a delegate refuses to produce a block once he knows he doesn't win in his turn's block.
I mean, even if we use a two-round history hash or a very good shuffle, we still cannot defend against this type of attack.
Yeah, I think for gambling it's best to look at the specific application; delegate-provided random numbers are likely not the best option.  Maybe Dan has other things in mind.
Yes, you are right, so I edited my reply before yours :D :D  I think the random number is only used once per 101 blocks in BTS. Though the randoms from delegates in the middle of the round can be influenced, the random from the block at the end of the round is perfect. (I hope my English expresses me well :P)
« Last Edit: April 09, 2015, 05:34:05 pm by BTSdac »
github.com :pureland
BTS2.0 API :ws://139.196.37.179:8091
BTS2.0 API data source: ws://139.196.37.179:8091

Offline Agent86

  • Sr. Member
  • ****
  • Posts: 471
  • BTSX: agent86
    • View Profile

Actually, no matter how you shuffle, the average interval between publishing the hash of a secret and revealing it is 101 blocks, but you cannot keep every delegate's interval at exactly 101: some delegates must have an interval shorter than 101 and some longer.
So it is not really "one delegate is honest, the random is honest", but we can get very, very close to it.  If we keep the MIN interval above 90 with a good shuffle, I think that is easy enough, and the randomness is honest enough for BTS.
But maybe not for a game, which is why I suggest using a two-round history.

I might be misunderstanding you, but even though some delegates have an interval of more than 101 and some less, the idea "one delegate is honest, the random is honest" is essentially true for the last random number of the round under the current system, because the interval for the last revealed number in the round is at least 101.

But yeah, it's a simplification and ignores intentionally not producing a block (this doesn't let you determine the random number, but does let you influence it).  It also ignores skipping honest delegates' blocks on purpose... At least these things might raise red flags or be detectable.  Newly elected delegates would also have to be handled correctly.

I think if we want to use the random number to play games, maybe we should update DPOS a little to suit short-wait games, to avoid the attack where a delegate refuses to produce a block once he knows he doesn't win in his turn's block.
I mean, even if we use a two-round history hash or a very good shuffle, we still cannot defend against this type of attack.
Yeah, I think for gambling it's best to look at the specific application; delegate-provided random numbers are likely not the best option.  Maybe Dan has other things in mind.

Offline logxing

block:    A          A          A          A          A
reveals:  S(n-2)     S(n-1)     S(n)       S(n+1)     S(n+2)
commits:  H(S(n))    H(S(n+1))  H(S(n+2))  H(S(n+3))  H(S(n+4))

A 2-round history is a really simple method to solve the AXXXA problem (all of those blocks controlled by one person), and it doesn't take more blockchain space.
And I like the new shuffle algorithm too. Can we do both?
BTS Account:logxing

Offline BTSdac

  • Hero Member
  • *****
  • Posts: 1219
    • View Profile
  • BitShares: K1
I think if we want to use the random number to play games, maybe we should update DPOS a little to suit short-wait games, to avoid the attack where a delegate refuses to produce a block once he knows he doesn't win in his turn's block.
I mean, even if we use a two-round history hash or a very good shuffle, we still cannot defend against this type of attack.
« Last Edit: April 09, 2015, 03:14:41 pm by BTSdac »
github.com :pureland
BTS2.0 API :ws://139.196.37.179:8091
BTS2.0 API data source: ws://139.196.37.179:8091

Offline BTSdac

  • Hero Member
  • *****
  • Posts: 1219
    • View Profile
  • BitShares: K1
IMO It's better to just have delegates reveal the number they hashed 2 rounds ago instead of prior round.  The shuffling isn't the problem.  You're just trying to have a random number each block to use for gambling.  This way the random number is more robust (1 honest delegate is always enough)… it's also less complicated.  Your way, collusion of a large number of delegates (but not all) could still be a problem and it would be very difficult to detect.  I think for most gambling applications, allowing the players to participate in the secret/reveal random number gen process is probably the way to go anyway.

A two round history would make the blockchain take more space and would be a much more involved change than simply changing the shuffle algorithm.     There are benefits beyond gambling too.
Why would a two-round history take more blockchain space?  It's still only one hash and one reveal per block... You'd have to keep a rolling database of 303 hashes instead of 202, which doesn't seem like an issue.  What are the benefits beyond gambling?  Using the prior round's hash for shuffling is not an issue, and the current shuffling should be more random than the proposed new shuffling.  The only issue with the current system is that the random numbers from early in a round could possibly be manipulated... but if you're not using them for gambling, then who cares?
Actually, no matter how you shuffle, the average interval between publishing the hash of a secret and revealing it is 101 blocks, but you cannot keep every delegate's interval at exactly 101: some delegates must have an interval shorter than 101 and some longer.
If we keep the MIN interval above 90 with a good shuffle, I think that is easy enough, and the randomness is honest enough for BTS.
But maybe not for a game, which is why I suggest using a two-round history.

3.
1). When a delegate creates his first block, he has to publish two hashes of secret randoms: one for the next round and one for the round after that.
2). In each subsequent block, the delegate reveals the secret random whose hash he published two rounds earlier, and publishes the hash of a new secret random to be revealed two rounds later.
The minimum interval between a delegate publishing the hash of a secret random and revealing the secret is then larger than 101 blocks (for all delegates), so he cannot attack as follows:
A-2 | XXXXA-1 | A0
At the time of A0, the hash of the secret random that the delegate reveals was published in A-2 (two rounds earlier). So although during A-1 delegate A can know his turn in the next round, he cannot change a secret random whose hash was published two rounds before.

equivalent interval=101
« Last Edit: April 09, 2015, 04:47:25 pm by BTSdac »
github.com :pureland
BTS2.0 API :ws://139.196.37.179:8091
BTS2.0 API data source: ws://139.196.37.179:8091

Offline Agent86

  • Sr. Member
  • ****
  • Posts: 471
  • BTSX: agent86
    • View Profile
IMO It's better to just have delegates reveal the number they hashed 2 rounds ago instead of prior round.  The shuffling isn't the problem.  You're just trying to have a random number each block to use for gambling.  This way the random number is more robust (1 honest delegate is always enough)… it's also less complicated.  Your way, collusion of a large number of delegates (but not all) could still be a problem and it would be very difficult to detect.  I think for most gambling applications, allowing the players to participate in the secret/reveal random number gen process is probably the way to go anyway.

A two round history would make the blockchain take more space and would be a much more involved change than simply changing the shuffle algorithm.     There are benefits beyond gambling too.
Why would a two-round history take more blockchain space?  It's still only one hash and one reveal per block... You'd have to keep a rolling database of 303 hashes instead of 202, which doesn't seem like an issue.  What are the benefits beyond gambling?  Using the prior round's hash for shuffling is not an issue, and the current shuffling should be more random than the proposed new shuffling.  The only issue with the current system is that the random numbers from early in a round could possibly be manipulated... but if you're not using them for gambling, then who cares?
« Last Edit: April 09, 2015, 01:46:42 am by Agent86 »

Offline bytemaster

IMO It's better to just have delegates reveal the number they hashed 2 rounds ago instead of prior round.  The shuffling isn't the problem.  You're just trying to have a random number each block to use for gambling.  This way the random number is more robust (1 honest delegate is always enough)… it's also less complicated.  Your way, collusion of a large number of delegates (but not all) could still be a problem and it would be very difficult to detect.  I think for most gambling applications, allowing the players to participate in the secret/reveal random number gen process is probably the way to go anyway.

A two round history would make the blockchain take more space and would be a much more involved change than simply changing the shuffle algorithm.     There are benefits beyond gambling too. 
For the latest updates checkout my blog: http://bytemaster.bitshares.org
Anything said on these forums does not constitute an intent to create a legal obligation or contract between myself and anyone else.   These are merely my opinions and I reserve the right to change them at any time.

Offline Agent86

  • Sr. Member
  • ****
  • Posts: 471
  • BTSX: agent86
    • View Profile
IMO It's better to just have delegates reveal the number they hashed 2 rounds ago instead of prior round.  The shuffling isn't the problem.  You're just trying to have a random number each block to use for gambling.  This way the random number is more robust (1 honest delegate is always enough)… it's also less complicated.  Your way, collusion of a large number of delegates (but not all) could still be a problem and it would be very difficult to detect.  I think for most gambling applications, allowing the players to participate in the secret/reveal random number gen process is probably the way to go anyway.

Offline hrossik

  • Jr. Member
  • **
  • Posts: 38
    • View Profile
Ok, I added the binary search and removed the unnecessary set. Overall speedup is around 27% (measured on 10k runs of the old and new approach). New implementation is in "delegate_shuffle", the old one is in "delegate_shuffle_linear".

Code:
/*
 * File:   main.cpp
 * Author: Hrossik
 *
 * Created on April 7, 2015, 11:27 PM
 */

#include <cstdlib>
#include <vector>
#include <cmath>
#include <set>

#include <stdio.h>
#include <time.h>
#include <algorithm>

#define NUM_DELEGATES 101
#define INVALID_DELEGATE -1
#define delegate_id int
#define sha256 int

using namespace std;

vector<delegate_id> delegate_shuffle(vector<delegate_id>, vector<delegate_id>, sha256);
vector<delegate_id> delegate_shuffle_linear(vector<delegate_id>, vector<delegate_id>, sha256);
int binary_search(const vector<delegate_id>&, delegate_id);
void print_vec(vector<delegate_id>);

/*
 *
 */
int main(int argc, char** argv) {

    vector<delegate_id> prev_delegate_order;

    for (int i = 0; i < NUM_DELEGATES; i++) {
        prev_delegate_order.push_back(i);
    }

    vector<delegate_id> top_101_delegates_sorted_by_vote;

    for (int i = 0; i < NUM_DELEGATES; i++) {
        top_101_delegates_sorted_by_vote.push_back(i);
    }

    const int num_of_tests = 100000;

    struct timespec start, stop;

    double time_bin = 0;
    for (int i = 0; i < num_of_tests; i++) {
        clock_gettime(CLOCK_REALTIME, &start);
        delegate_shuffle(prev_delegate_order, top_101_delegates_sorted_by_vote, time(NULL));
        clock_gettime(CLOCK_REALTIME, &stop);
        time_bin += (stop.tv_sec - start.tv_sec)*1000.0 + (stop.tv_nsec - start.tv_nsec) / 1000000.0;
    }

    double time_lin = 0;
    for (int i = 0; i < num_of_tests; i++) {
        clock_gettime(CLOCK_REALTIME, &start);
        delegate_shuffle_linear(prev_delegate_order, top_101_delegates_sorted_by_vote, time(NULL));
        clock_gettime(CLOCK_REALTIME, &stop);
        time_lin += (stop.tv_sec - start.tv_sec)*1000.0 + (stop.tv_nsec - start.tv_nsec) / 1000000.0;
    }

    printf("time_bin = %.6lf ms, time_lin = %.6lf ms\n", time_bin, time_lin);

    // test of binary_search
    /*int count = 0;
    for (int i = 0; i < prev_delegate_order.size(); i++) {
        int ind = binary_search(prev_delegate_order, i);
        if(ind == i) {
            count++;
        }       
    }
    printf("count = %d\n",count);*/

    return 0;
}

vector<delegate_id> delegate_shuffle(vector<delegate_id> prev_delegate_order, vector<delegate_id> top_101_delegates_sorted_by_vote, sha256 random_seed) {

    // init
    srand(random_seed);
    vector<delegate_id> result;
    const int lower_half = NUM_DELEGATES / 2; // integer division: 50
    const int upper_half = NUM_DELEGATES / 2; // also 50: ceil() of an already-truncated int division is a no-op
   
    // sort top_101_delegates_sorted_by_vote by delegates
    sort(top_101_delegates_sorted_by_vote.begin(),top_101_delegates_sorted_by_vote.end());

    // prefill with empty/invalid delegates
    for (int i = 0; i < NUM_DELEGATES; i++) {
        result.push_back(INVALID_DELEGATE);
    }

    vector<delegate_id> constrained_delegates;

    // find the delegates, which are constrained by the n/2 distance and are at top_101_delegates_sorted_by_vote so should be placed in the result
    for (int i = NUM_DELEGATES - 1; i > lower_half; i--) { // scan the last half of the previous order
        int actual = binary_search(top_101_delegates_sorted_by_vote, prev_delegate_order[i]);
        if (actual != -1) {
            top_101_delegates_sorted_by_vote.erase(top_101_delegates_sorted_by_vote.begin() + actual);
            constrained_delegates.push_back(prev_delegate_order[i]);
        }
    }

    // init free slots for constrained delegates
    vector<int> free_slots;
    for (int i = 0; i < upper_half; i++) {
        free_slots.push_back(NUM_DELEGATES - 1 - i);
    }

    // fill the result with constrained delegates. Each of them has n/2 possible locations
    for (int i = 0; i < constrained_delegates.size(); i++) {
        int uniform_rand_free_slot = rand() % free_slots.size();
        result[free_slots[uniform_rand_free_slot]] = constrained_delegates[i];

        // erase used slot
        free_slots.erase(free_slots.begin() + uniform_rand_free_slot);

        // add new slot, because next delegate's constraint is lower
        free_slots.push_back(lower_half - i);
    }

    // fill free slots for the rest of the delegates
    for (int i = lower_half - constrained_delegates.size(); i > -1; i--) {
        free_slots.push_back(i);
    }

    // place the rest of the delegates. i-th delegate has X possible locations, where X = NUM_DELEGATES - constrained_delegates.size() - i
    for (int i = 0; i < top_101_delegates_sorted_by_vote.size(); i++) {
        int uniform_rand_free_slot = rand() % free_slots.size();
        result[free_slots[uniform_rand_free_slot]] = top_101_delegates_sorted_by_vote[i];
        free_slots.erase(free_slots.begin() + uniform_rand_free_slot);
    }

    return result;
}

void print_vec(vector<delegate_id> vec) {
    for (int i = 0; i < vec.size(); i++) {
        printf("%d ", vec[i]);
    }
    printf("\n");
}

int binary_search(const vector<delegate_id>& sorted_vec, delegate_id wanted) {
    int lower_bound = 0;
    int upper_bound = sorted_vec.size() - 1;
    while (lower_bound < upper_bound) {
        int mid = (lower_bound + upper_bound) / 2;
        if (wanted > sorted_vec[mid]) {
            lower_bound = mid + 1;
        } else {
            upper_bound = mid;
        }
    }
    if (sorted_vec[lower_bound] != wanted) {
        lower_bound = -1;
    }
    return lower_bound;
}

vector<delegate_id> delegate_shuffle_linear(vector<delegate_id> prev_delegate_order, vector<delegate_id> top_101_delegates_sorted_by_vote, sha256 random_seed) {

    // init
    srand(random_seed);
    vector<delegate_id> result;
    const int lower_half = NUM_DELEGATES / 2; // integer division: 50
    const int upper_half = NUM_DELEGATES / 2; // also 50: ceil() of an already-truncated int division is a no-op
    std::set<delegate_id> remaining(top_101_delegates_sorted_by_vote.begin(),
            top_101_delegates_sorted_by_vote.end());

    // prefill with empty/invalid delegates
    for (int i = 0; i < NUM_DELEGATES; i++) {
        result.push_back(INVALID_DELEGATE);
    }

    vector<delegate_id> constrained_delegates;

    // find the delegates, which are constrained by the n/2 distance and are at top_101_delegates_sorted_by_vote so should be placed in the result
    for (int i = NUM_DELEGATES - 1; i > lower_half; i--) { // scan the last half of the previous order
        set<delegate_id>::iterator actual = remaining.find((delegate_id) prev_delegate_order[i]);
        if (actual != remaining.end()) {
            remaining.erase(actual);
            constrained_delegates.push_back(prev_delegate_order[i]);
        }
    }

    // init free slots for constrained delegates
    vector<int> free_slots;
    for (int i = 0; i < upper_half; i++) {
        free_slots.push_back(NUM_DELEGATES - 1 - i);
    }

    // fill the result with constrained delegates. Each of them has n/2 possible locations
    for (int i = 0; i < constrained_delegates.size(); i++) {
        int uniform_rand_free_slot = rand() % free_slots.size();
        result[free_slots[uniform_rand_free_slot]] = constrained_delegates[i];

        // erase used slot
        free_slots.erase(free_slots.begin() + uniform_rand_free_slot);

        // add new slot, because next delegate's constraint is lower
        free_slots.push_back(lower_half - i);
    }

    // fill free slots for the rest of the delegates
    for (int i = lower_half - constrained_delegates.size(); i > -1; i--) {
        free_slots.push_back(i);
    }

    // place the rest of the delegates. i-th delegate has X possible locations, where X = NUM_DELEGATES - constrained_delegates.size() - i
    for (set<delegate_id>::iterator it = remaining.begin(); it != remaining.end(); ++it) {
        int uniform_rand_free_slot = rand() % free_slots.size();
        result[free_slots[uniform_rand_free_slot]] = *it;
        free_slots.erase(free_slots.begin() + uniform_rand_free_slot);
    }

    return result;
}
BTS: hr0550

Offline hrossik

  • Jr. Member
  • **
  • Posts: 38
    • View Profile
1) and 3) are easily evaluated empirically.
2) needs to be better defined.

Actually, I think 2) is pretty clear. With 1) and 3) I can't help that much, because C++ is not my native programming language. From an algorithmic point of view, the only unpredictable part of my code is the intersection of the set of constrained delegates with top_101_delegates_sorted_by_vote: you never know how big the intersection will be. But I agree that it could be computed more efficiently. top_101_delegates are sorted, so we could use binary search for the lookup instead of find(), which has linear complexity. It would have to be tested, because I am not sure my implementation of binary search on just 101 elements will beat the native find(), even though find() has worse asymptotic complexity.

From the C++ implementation point of view, there might also be a better approach to tracking unused indexes in the result. I like the free_slots in my implementation, because it allows not only the tracking but also generating the correct random index.


Edit: Hehe, delegates are sorted by votes, not by id :) Still, it might be reasonable to sort them by id in order to speed up the lookup. But this really depends on the implementation and the number of delegates.

Edit2: To sum up the worst-case analysis for the intersection: with linear find we have n * n/2 complexity, which gives about 5101 operations for 101 delegates. With logarithmic find it's n*log n + (n/2)*log n, which gives roughly 1000 operations (using base-2 logarithms). This might be worth a try.
« Last Edit: April 08, 2015, 09:56:10 am by hrossik »
BTS: hr0550

Offline pc

  • Hero Member
  • *****
  • Posts: 1530
    • View Profile
    • Bitcoin - Perspektive oder Risiko?
  • BitShares: cyrano
1) and 3) are easily evaluated empirically.
2) needs to be better defined.

I think we're clear about 2. The list below maps an old index to a range of possible new indices, with the special cases that newly voted-in delegates are also mapped to {0, 100} and delegates voted out aren't mapped at all.

0 ->{0, 100}
1 ->{0, 100}
...
50 ->{0, 100}
51 ->{1, 100}
...
99 ->{49, 100}
100 ->{50, 100}

I think hrossik's implementation and mine are mostly equivalent, and optimal with respect to mixing. Not sure whether using a set for the free slots instead of a vector is faster, but the difference is probably hardly measurable anyway.

I like arhag's prng best (because I also wanted to use only as many bits from the seed as absolutely required, but I chose the easy way out - call it KISS, or call it lazy ;-) ), but apart from that I think his implementation is too complicated.
Bitcoin - Perspektive oder Risiko? ISBN 978-3-8442-6568-2 http://bitcoin.quisquis.de

Offline emski

  • Hero Member
  • *****
  • Posts: 1282
    • View Profile
    • http://lnkd.in/nPbhxG
1) and 3) are easily evaluated empirically.
2) needs to be better defined.

Offline bytemaster

Ok, I finally got to implement my former idea.

I separated the delegates into those constrained by the n/2 condition and the rest. Each constrained delegate is placed at a random position chosen from the n/2 allowed positions (the maximum randomization the constraint allows). The i-th unconstrained delegate can be placed at X positions, where X = NUM_DELEGATES - NUM_CONSTRAINED_DELEGATES - i (the classical way of generating permutations, so it is also optimal). I also account for the fact that the delegates to be shuffled can (but don't have to) differ from the previous delegates.

Code:
/*
 * File:   main.cpp
 * Author: Hrossik
 *
 * Created on April 7, 2015, 11:27 PM
 */

#include <cstdlib>
#include <vector>
#include <cmath>
#include <set>

#define NUM_DELEGATES 101
#define INVALID_DELEGATE -1
#define delegate_id int64_t
#define sha256 int

using namespace std;

vector<delegate_id> delegate_shuffle(vector<delegate_id>, vector<delegate_id>, sha256);
void printVec(vector<delegate_id>);

vector<delegate_id> delegate_shuffle(vector<delegate_id> prev_delegate_order, vector<delegate_id> top_101_delegates_sorted_by_vote, sha256 random_seed) {

    // init
    srand(random_seed);
    vector<delegate_id> result;
    const int lower_half = NUM_DELEGATES / 2; // integer division: 50
    const int upper_half = NUM_DELEGATES / 2; // also 50: ceil() of an already-truncated int division is a no-op
    std::set<delegate_id> remaining(top_101_delegates_sorted_by_vote.begin(),
            top_101_delegates_sorted_by_vote.end());

    // prefill with empty/invalid delegates
    for (int i = 0; i < NUM_DELEGATES; i++) {
        result.push_back(INVALID_DELEGATE);
    }

    vector<delegate_id> constrained_delegates;

    // find the delegates, which are constrained by the n/2 distance and are at top_101_delegates_sorted_by_vote so should be placed in the result
    for (int i = NUM_DELEGATES - 1; i > lower_half; i--) { // scan the last half of the previous order
        set<delegate_id>::iterator actual = remaining.find((delegate_id)prev_delegate_order[i]);
        if (actual != remaining.end()) {
            remaining.erase(actual);
            constrained_delegates.push_back(prev_delegate_order[i]);
        }
    }

    // init free slots for constrained delegates
    vector<int> free_slots;
    for (int i = 0; i < upper_half; i++) {
        free_slots.push_back(NUM_DELEGATES - 1 - i);
    }

    // fill the result with constrained delegates. Each of them has n/2 possible locations
    for (int i = 0; i < constrained_delegates.size(); i++) {
        int uniform_rand_free_slot = rand() % free_slots.size();
        result[free_slots[uniform_rand_free_slot]] = constrained_delegates[i];

        // erase used slot
        free_slots.erase(free_slots.begin() + uniform_rand_free_slot);

        // add new slot, because next delegate's constraint is lower
        free_slots.push_back(lower_half - i);
    }

    // fill free slots for the rest of the delegates
    for (int i = lower_half - constrained_delegates.size(); i > -1; i--) {
        free_slots.push_back(i);
    }

    // place the rest of the delegates. i-th delegate has X possible locations, where X = NUM_DELEGATES - constrained_delegates.size() - i
    for (set<delegate_id>::iterator it = remaining.begin(); it != remaining.end(); ++it) {
        int uniform_rand_free_slot = rand() % free_slots.size();
        result[free_slots[uniform_rand_free_slot]] = *it;
        free_slots.erase(free_slots.begin() + uniform_rand_free_slot);
    }

    return result;
}

So far, after a quick glance, I think I like your approach best, but I'll be honest: I have not evaluated it closely enough relative to the other ideas to come to a final conclusion.  Would anyone like to suggest a way of judging the best solution?

1) Predictable/constant run time is a requirement.
2) Best mixing.
3) Best performance.

How can we actually "TEST" these methods to see whether or not they are doing a "good job"? 
For the latest updates checkout my blog: http://bytemaster.bitshares.org
Anything said on these forums does not constitute an intent to create a legal obligation or contract between myself and anyone else.   These are merely my opinions and I reserve the right to change them at any time.

Offline bytemaster

However, I think BM meant the word "distance" this way: if a delegate X has index i in the previous order and index j in the next order, then | (n - i) + j | >= n/2. Imagine the two orders as two sequences laid one after another; there should be a gap of n/2 other delegates before X can go again.

Ah, of course. That makes sense, thanks. It would still be great to get clarification from bytemaster on this, but I can see why that would be useful.

I was trying to figure out how to communicate that without drawing it on a whiteboard; I assumed you smart people would see the problem and figure out what I meant.  This is a good description of what I wanted.
For the latest updates checkout my blog: http://bytemaster.bitshares.org
Anything said on these forums does not constitute an intent to create a legal obligation or contract between myself and anyone else.   These are merely my opinions and I reserve the right to change them at any time.

Offline arhag

  • Hero Member
  • *****
  • Posts: 1214
    • View Profile
    • My posts on Steem
  • BitShares: arhag
  • GitHub: arhag
So I had to change it considerably compared to the previous method in order to generalize to N delegates.

Instead of trying to use all 256 bits of entropy directly to select the permutation, I generate enough pseudorandom data from the random_seed to break it into a sequence of just the number of bits needed for an appropriately sized offset for each delegate added to the new order. I can then use the same process to regenerate more pseudorandom data to replace the original, iterating until an offset within the appropriate limit is selected for each delegate, or until I hit the maximum iteration count, in which case I bias the result by selecting an offset of 0 (an incredibly low-probability event: less than 2^-100 with the current MAX_ITERATION constant below).

I also now actually update the set of delegates (remove voted-out delegates, add voted-in delegates). To maximize entropy and mixing, the old delegates that have not been voted out are added first, in reverse order of prev_delegate_order (because the last half of prev_delegate_order are the ones with constraints on where they can go), and then the new delegates are added.

Code:
#include <algorithm>
#include <vector> 
#include <tuple>
#include <utility>   
#include <fc/crypto/sha256.hpp>
#include <fc/exception/exception.hpp>
#include <string.h>

namespace bts { namespace blockchain {

typedef int64_t delegate_id;

// Calculates minimum number of bits needed to fit n
inline uint8_t
calculate_bit_size(uint32_t n)
{
    uint8_t size = 1;
    for (uint32_t bound = 2; n >= bound; bound = (1 << (++size)));
    return size;
}

// Fills data with pseudorandom bytes using random seed and returns updated seed.
fc::sha256 generate_random_data(std::vector<uint8_t>& data, const fc::sha256& random_seed)
{
    uint64_t num_bytes = data.size();
    uint8_t hash_size = sizeof(random_seed._hash);
    fc::sha256 seed(random_seed.data(), (size_t) hash_size);
    for (uint64_t i = 0; i < num_bytes; i += hash_size) {
        memcpy(data.data() + i, seed.data(), std::min((uint64_t) hash_size, num_bytes - i));
        seed = fc::sha256::hash(seed);
    }
    return seed;
}

std::vector< std::pair<delegate_id, bool> >
prepend_new_delegates(const std::vector<delegate_id>& prev_delegate_order,
                      std::vector<delegate_id> top_101_delegates)
{
    std::vector< std::tuple<delegate_id, uint32_t, bool> > prev_delegates;
    prev_delegates.reserve(prev_delegate_order.size());
    for (uint32_t i = 0; i < prev_delegate_order.size(); ++i)
        prev_delegates.emplace_back(prev_delegate_order[i], i, false);
    std::sort(prev_delegates.begin(), prev_delegates.end(),
                 [] (const std::tuple<delegate_id, uint32_t, bool>& a,
                     const std::tuple<delegate_id, uint32_t, bool>& b) {
                    return (std::get<0>(a) < std::get<0>(b));
                 });
    std::sort(top_101_delegates.begin(), top_101_delegates.end());

    std::vector< std::pair<delegate_id, bool> > delegate_list;
    delegate_list.reserve(top_101_delegates.size() + prev_delegate_order.size());
   
    auto old_it = prev_delegates.begin();
    auto new_it = top_101_delegates.begin();
    while (old_it != prev_delegates.end() || new_it != top_101_delegates.end()) {
        if ((new_it == top_101_delegates.end()) || (old_it != prev_delegates.end() && std::get<0>(*old_it) < *new_it)) {
            // delegate std::get<0>(*old_it) has been removed
            *old_it = std::make_tuple(std::get<0>(*old_it), std::get<1>(*old_it), true);
            ++old_it;
        } else if ((old_it == prev_delegates.end()) || (std::get<0>(*old_it) > *new_it)) {
            // delegate *new_it has been added
            delegate_list.emplace_back(*new_it, false);
            ++new_it;
        } else {
            ++old_it;
            ++new_it;
        }
    }   
   
    uint32_t new_delegates = delegate_list.size();
    delegate_list.resize(new_delegates + prev_delegates.size());
    for (auto it = prev_delegates.begin(); it != prev_delegates.end(); ++it) {
        uint32_t i = new_delegates + std::get<1>(*it);
        delegate_list[i].first = std::get<0>(*it);
        delegate_list[i].second = std::get<2>(*it);
    }

    return delegate_list;
}

struct ReorderData {
    enum Status : uint8_t { PENDING = 0, COMPLETE };
   
    delegate_id d;
    uint32_t offset;
    uint32_t limit;
    uint64_t mask;
    uint64_t byte_index;
    uint8_t  bit_shift;
    uint8_t  num_bytes;
    Status  status;
};

std::vector<delegate_id>
next_delegate_order(const std::vector<delegate_id>& prev_delegate_order,
                    const std::vector<delegate_id>& top_101_delegates_sorted_by_vote,
                    const fc::sha256& random_seed)
{
    uint32_t num_delegates = prev_delegate_order.size();
    auto delegate_list = prepend_new_delegates(prev_delegate_order, top_101_delegates_sorted_by_vote);

    std::vector<ReorderData> reorder(num_delegates);

    uint64_t byte_index = 0, bit_index = 0;
    uint32_t middle = (num_delegates + 1)/2;
    uint32_t limit = middle;
    uint32_t i = num_delegates;
    uint32_t j = num_delegates;
    for (auto it = delegate_list.rbegin(); it != delegate_list.rend(); ++it) {
        if (j < middle) {
            --limit;
        }
        if (j > 0)
            --j;
        if (it->second) { // if missing
            ++limit;
            continue;
        }
        FC_ASSERT( i > 0 );
        --i;
        reorder[i].d = it->first;
        reorder[i].byte_index = byte_index;
        reorder[i].limit = limit;
        reorder[i].num_bytes = 1;
        uint8_t remaining_bits = calculate_bit_size(limit);
        uint64_t mask = 0;
        while ((remaining_bits + bit_index) >= 8) {
            remaining_bits = remaining_bits + bit_index - 8;
            mask = mask | ( ((1 << (8 - bit_index)) - 1) << remaining_bits );
            bit_index = 0;
            ++byte_index;
            ++(reorder[i].num_bytes);
        }
        reorder[i].bit_shift = 8 - (remaining_bits + bit_index);
        reorder[i].mask = (mask | ((1 << remaining_bits) - 1)) << reorder[i].bit_shift;
        bit_index += remaining_bits;
    }

    const uint32_t MAX_ITERATION = 100; // Probability of reaching max iteration is less than 2^(-MAX_ITERATION)
    fc::sha256 seed = random_seed;
    std::vector<uint8_t> random_data(byte_index + 1);
    for (uint32_t k = 0; k < MAX_ITERATION; ++k) {
        bool repeat = false;
        seed = generate_random_data(random_data, seed);
        for (int64_t i = num_delegates - 1; i >= 0; --i) {
            if (reorder[i].status != ReorderData::Status::PENDING)
                continue;
            uint64_t value = 0;
            uint16_t shift = (reorder[i].num_bytes - 1) * 8;
            for (uint64_t j = reorder[i].byte_index, end = j + reorder[i].num_bytes; j < end; ++j) {
                value = value | (random_data[j] << shift);
                shift -= 8;
            }
            value = (value & reorder[i].mask) >> reorder[i].bit_shift;
            if (value < reorder[i].limit) {
                reorder[i].offset = value;
                reorder[i].status = ReorderData::Status::COMPLETE;
            } else {
                repeat = true;
            }
        }
        if (!repeat)
            break;
    }

    const delegate_id free_slot = (delegate_id) (-1);
    std::vector<delegate_id> new_order(num_delegates, free_slot);
    for (int64_t i = num_delegates - 1; i >= 0; --i) {
        uint32_t offset = (reorder[i].status == ReorderData::Status::PENDING) ?
                           0 : reorder[i].offset; // Biases result in case of max iteration
        uint32_t j = num_delegates - 1;
        for (; offset > 0; --j) {
            if (new_order[j] == free_slot)
                --offset;
        }
        for (; new_order[j] != free_slot; --j);
        new_order[j] = reorder[i].d;
    }

    return new_order;
}

}}
« Last Edit: April 08, 2015, 05:18:25 am by arhag »