Ever wondered how to reset an entire array of N elements in a constant slice of time? This post will introduce the algorithm along with an implementation.

Let me lay out the problem. There’s an array of N integers. We would like to be able to reset that array (set all elements to zero), in a set amount of time – regardless of the value of N. The reset operation should take the same amount of time whether it operates on 100 elements or 10,000 of them.

Like many other algorithms and data structures in computer science, we will require a little more allocated memory (3 times more, to be exact. Still O(N) space) to make such unreasonable request actually feasible. The first piece of memory is going to be our data section and hold all N elements. The other two arrays will be called “forward” and “backward” and will be used to check if a certain index is in use or not. Here’s a sketch of the setup, along with a detailed algorithm:

For every added element in index id, we are going to do the following (“used” is the number of elements in use since last reset):

- forward[id] = used
- backward[used] = id
- used++

Analyzing this algorithm we can verify that: index id is in use IFF (if and only if) 0 <= forward[id] && forward[id] < used && backward[forward[id]] == id. This is because we know that all the elements in "backward" with indices 0 to used-1 have been set by us since the last reset. Consequently, all we need to do in a reset operation is set the "used" variable to zero. Here's a proper implementation of such an array, for the integer case, just as an example:

// — array.hpp — #include <stdexcept> #include <boost/shared_array.hpp> class Array { public: explicit Array (unsigned size); // All operations are done in constant O(1) time. void reset (); int get (unsigned id) const; void set (unsigned id, int i); protected: bool used (unsigned id) const; bool check (unsigned id) const throw(std::out_of_range); unsigned size_, used_; boost::shared_array<int> arr_; boost::shared_array<unsigned> forward_, backward_; private: // no implementation for these: Array (const Array &arr); Array &operator= (const Array &arr); }; // — array.cc — template <typename T> boost::shared_array<T> make_shared_arr (size_t size) { // this is crucial for exception safety. // see: http://www.gotw.ca/gotw/056.htm return boost::shared_array<T>(new T[size]); } Array::Array (unsigned size) : size_(size), used_(0), arr_(make_shared_arr<int>(size)), forward_(make_shared_arr<unsigned>(size)), backward_(make_shared_arr<unsigned>(size)) { } void Array::reset () { used_ = 0; } int Array::get (unsigned id) const { if (check(id) && used(id)) return arr_[id]; return 0; } void Array::set (unsigned id, int i) { if (check(id) && !used(id) && ++used_) forward_[backward_[used_-1] = id] = used_-1; arr_[id] = i; } bool Array::used (unsigned id) const { if (forward_[id] >= used_) return false; return backward_[forward_[id]] == id; } bool Array::check (unsigned id) const throw(std::out_of_range) { if (id >= size_) throw std::out_of_range("Index too big!"); return true; } // — main.cc — #include <iostream> int main () { Array a(3); a.set(2,1); std::cout << a.get(2) << std::endl; a.reset(); std::cout << a.get(2) << std::endl; return a.get(1); }

You will have to forgive me for the somewhat obscure implementation of the set() method. I was going for a minimal number of implementation lines – all in the spirit of .

You can generalize the algorithm so that reset will get a parameter, and so all the elements will be “reset” to that value…

Isn’t it possible that the backward[forward[id]] will hold a garbage data that exactly equals id?

It is possible that forward[id] is garbage.

But backward[i] for any i in the range [0, used) was initialized by us in the process, so it can not hold garbage data.

Therefore, if forward[id] is in the correct range (can be garbage) AND backward[forward[id]] is ok (can not be garbage), then we know the cell is initialized for sure.

It beats me as I don’t understand how backword is initialized correctly.

When you call set(2,1) and check the condition of !used(id) how do you ensure you don’t get a garbage data in backword if it hasn’t been initialized yet.

Pretty weird.

Let’s think of it this way:

At the start we initialize used to be 0. That’s O(1) obviously, and pretty straightforward.

Now, as we add members to our array, we populate the backward array in indices 0 to N-1 (N being the number of added elements). Consequently, used = N. This means that all elements backward[i] for i in [0, used) are initialized by us, and can not contain garbage.

Then, when we check if index i is initialized, we first see if forward[i] points to a proper location (as explained in my previous comment, that can be by chance). but then we check if backward[forward[i]] is also correct – and that can only be so if we have set it ourselves that way, for the reasons specified above.

It’s a little complex to grasp and I hope I did not make it any more messy than it has to be, so let me know if there’s still anything unclear.

Let’s look at the call:

a.set(2,1)

at first you check if check(id) && !used(id) is true, if it is you set up the correct value in forward and backword, but when call used(id), and forward[id] <= used just by chance, and you still hasn't initialized any element in the backward array so there is still a chance that backward[forward[id]] will equall id!

It can’t be that forward[id] is ok but I haven’t initialized anything in backward, because forward[id] is ok only if it is equal to something below used_ (forward[id] = x, must be that x < used_) and that represents the number of initialized elements in backward.

Ah, now I’ve groked it. That’s really nice.

Thanks.

I’m a bit confused.

Is this the same problem that the technique described in the post is supposed to solve?

Yes, it looks like a possible solution to that problem.

If that is the case, a far simpler solution is to keep a timestamp on each element of the array, indicating the last time it was individually modified. On a reset operation, we don’t actually touch the array, but we remember the time when it occurred and the value to reset to. Then, to retrieve the value at an element, we examine its timestamp; if it’s later than the last reset, we return the value stored in the array, but if it’s earlier, then we return the reset value. This uses less space and should not be any slower.

Then the initial timestamp initialization is O(N), which is not the case here.