
I want to create a vector of random 1s and 0s in a proportion that I set (in the program I call it dropout). The vector has the same size as a previously created vector, CSUM.

In MATLAB it would be

dropout=0.9;
n_elements=size(CSUM);
dropoutmask = (rand(n_elements) > dropout);

In C++ I have

size_t elements = Csum.size();
std::vector<float> y(elements);
std::uniform_real_distribution<float> distribution(0.0f, 1.0f);
std::mt19937 engine; // Mersenne twister MT19937
auto generator = std::bind(distribution, engine);
std::generate_n(y.begin(), elements, generator);
std::vector<int> dropoutmask(elements, 0);
float dropout = 0.9f;

for (size_t i = 0; i < elements; i++)
{
    if (y.at(i) > dropout)
    {
        dropoutmask.at(i) = 1;
    }
}

This works, but it is very slow for huge vectors. Is there a faster way to do this? I am very new to C++.

Any help will be much appreciated.

  • In MATLAB I meant to say n_elements=length(CSUM) not n_elements=size(CSUM)
    – Diego Fernando Pava, 2016-12-05 21:08:40 +00:00
  • consider std::bitset for dropoutmask if it's just a flag
    – Steve Townsend, 2016-12-05 21:09:02 +00:00
  • Silly question, are you compiling with optimisations turned on?
    – Borgleader, 2016-12-05 21:09:22 +00:00
  • @DiegoFernandoPava you may edit your post by clicking the edit link just below the tags list on your post.
    – jaggedSpire, 2016-12-05 21:09:34 +00:00
  • afaik std::bitset would trade speed for space. As always, measure performance using a profiler. If heap usage is a hog std::bitset might be a better choice. If nothing else, use char not int as the mask member. Using int for a 1/0 flag is quite wasteful.
    – Steve Townsend, 2016-12-05 23:20:24 +00:00

1 Answer

6
  1. You do know about the Bernoulli distribution, right? You can use std::bernoulli_distribution to generate your integer vector directly.
    Example:

    #include <iostream>
    #include <algorithm>
    #include <random>
    #include <vector>
    
    int main()
    {
        constexpr double dropout = 0.9; // Chance of 0
        constexpr size_t size = 1000;
        std::random_device rd;
        std::mt19937 gen(rd());
        std::bernoulli_distribution dist(1 - dropout); // bernoulli_distribution takes the chance of true in its constructor
    
        std::vector<int> dropoutmask(size);
        // Each call to dist(gen) yields a bool, stored as 0 or 1
        std::generate(dropoutmask.begin(), dropoutmask.end(), [&]{ return dist(gen); });
        size_t ones = std::count(dropoutmask.begin(), dropoutmask.end(), 1);
        // Multiply by 100 so the printed value really is a percentage
        std::cout << "vector contains " << ones << " 1's, out of " << size << ". " << 100.0 * ones / size << "%\n";
        std::cout << "vector contains " << size - ones << " 0's, out of " << size << ". " << 100.0 * (size - ones) / size << "%\n";
    }
    

    Live example: http://coliru.stacked-crooked.com/a/a160743185ded5c5

  2. Alternatively, you can create an integer vector of the desired size (this sets all elements to 0), set the first N elements to 1, where N is (1 - dropout) * size (you said you want a proportion, not a random amount close to the proportion), and then shuffle the vector.

    #include <iostream>
    #include <algorithm>
    #include <random>
    #include <vector>
    
    int main()
    {
        constexpr double dropout = 0.9; // Chance of 0
        constexpr size_t size = 77;
        std::random_device rd;
        std::mt19937 gen(rd());
    
        std::vector<int> dropoutmask(size);
        // Number of 1's; the fractional part is truncated
        const auto n_ones = static_cast<size_t>(size * (1 - dropout));
        std::fill_n(dropoutmask.begin(), n_ones, 1);
        std::shuffle(dropoutmask.begin(), dropoutmask.end(), gen);
    
        size_t ones = std::count(dropoutmask.begin(), dropoutmask.end(), 1);
        // Multiply by 100 so the printed value really is a percentage
        std::cout << "vector contains " << ones << " 1's, out of " << size << ". " << 100.0 * ones / size << "%\n";
        std::cout << "vector contains " << size - ones << " 0's, out of " << size << ". " << 100.0 * (size - ones) / size << "%\n";
    
        for (auto i : dropoutmask) {
            std::cout << i << ' ';
        }
        std::cout << '\n';
    }
    

    Live example: http://coliru.stacked-crooked.com/a/0a9dacd7629e1605
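
To tie this back to the question: the Bernoulli approach can be wrapped in a small function that takes the mask length (for example `Csum.size()`) and skips the intermediate vector of floats entirely. The following is only a minimal sketch. The name `make_dropout_mask` and its signature are hypothetical, not part of the standard library or the answer above, and the `std::uint8_t` element type is an assumption that follows the comment suggesting a one-byte type instead of `int`.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper (not from the answer above): builds a 0/1 mask of
// length n in which each element is independently 1 with probability
// (1 - dropout). Assumes 0 <= dropout <= 1.
std::vector<std::uint8_t> make_dropout_mask(std::size_t n, double dropout,
                                            std::mt19937& gen)
{
    // bernoulli_distribution takes the probability of true
    std::bernoulli_distribution keep(1.0 - dropout);

    std::vector<std::uint8_t> mask(n); // value-initialised to 0
    for (auto& m : mask) {
        m = keep(gen) ? 1 : 0;
    }
    return mask;
}
```

A call might look like `auto dropoutmask = make_dropout_mask(Csum.size(), 0.9, gen);`, where `gen` is a `std::mt19937` that is seeded once (for instance from `std::random_device`) and reused between calls. As the comments on the question point out, whether this is fast enough should be checked by measuring with optimisations turned on.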


3 Comments

Good find. But maybe you could provide some code, not a "link-only" answer for the first part.
@Jean-FrançoisFabre Done.
Thank you for referring me to the Bernoulli distribution; I did not know it was part of the standard library. It works great!
