NumPy’s random number routines produce pseudorandom numbers using combinations of a BitGenerator to create sequences and a Generator to use those sequences to sample from different statistical distributions.

**Table of Contents**

- Introduction
- Using the NumPy `random.seed()` function
- The problem with NumPy `random.seed()` function
- Using the NumPy `random.RandomState()` function
- Conclusion

## Introduction

A pseudorandom number generator (PRNG), also known as a deterministic random bit generator (DRBG), is an algorithm for generating a sequence of numbers whose properties approximate the properties of sequences of random numbers. The PRNG-generated sequence is not truly random, because it is completely determined by an initial value, called the PRNG’s **seed** (which may include truly random values).

Although sequences that are closer to truly random can be generated using hardware random number generators, pseudorandom number generators are important in practice for their speed and their reproducibility. In particular, NumPy’s random number routines produce pseudorandom numbers using combinations of a BitGenerator to create sequences and a Generator to use those sequences to sample from different statistical distributions.
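
The BitGenerator and Generator pairing described above can be sketched as follows. This is a minimal illustration, not part of the walkthrough that follows: the seed value `42` and the variable names are arbitrary choices, and `PCG64` is just one of the BitGenerators NumPy ships.

```python
import numpy as np

SEED = 42  # arbitrary seed chosen for illustration

# The BitGenerator (here PCG64) produces the raw pseudorandom bit stream.
bit_generator = np.random.PCG64(SEED)

# The Generator turns that stream into samples from statistical distributions.
rng = np.random.Generator(bit_generator)
uniform_samples = rng.random(3)           # uniform over [0, 1)
normal_samples = rng.standard_normal(3)   # standard normal

# A second pair built from the same seed replays the same sequence.
rng_again = np.random.Generator(np.random.PCG64(SEED))
print(np.array_equal(uniform_samples, rng_again.random(3)))  # True
```

In everyday code, `np.random.default_rng(seed)` is the usual shortcut for building such a pair in one call.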

## Using the NumPy random.seed() function

Recall that the `random.rand(n)` function generates an array of `n` random samples from a uniform distribution over $[0, 1)$. In the following code, we use `random.rand(3)` to generate an array of 3 random numbers twice.

```
import numpy as np
print(np.random.rand(3))
print(np.random.rand(3))
```

```
[0.70814782 0.29090474 0.51082761]
[0.89294695 0.89629309 0.12558531]
```

We see that the two separate calls to the `random.rand(3)` function lead to two completely different random arrays. If you need to reproduce the same results every time you call the random function, you can set a seed with the `random.seed()` function.

```
np.random.seed(3)
print(np.random.rand(3))

np.random.seed(3)
print(np.random.rand(3))
```

```
[0.5507979 0.70814782 0.29090474]
[0.5507979 0.70814782 0.29090474]
```

We see that the 2 arrays generated are identical. Setting a certain seed means that the random generator will produce numbers from a **deterministic** sequence, which means that subsequent random calls (after the pseudorandom number generator is initialized with a seed) will produce the same results.

**Note**: To generate the same random array with each call to a random function, we need to precede the call by an initialization with the same seed each time.

Let’s take a look at the following:

```
np.random.seed(4)
print(np.random.rand(3))
print(np.random.rand(3))
print(np.random.rand(3))
```

```
[0.96702984 0.54723225 0.97268436]
[0.71481599 0.69772882 0.2160895 ]
[0.97627445 0.00623026 0.25298236]
```

```
np.random.seed(4)
print(np.random.rand(3))
print(np.random.rand(3))
print(np.random.rand(3))
```

```
[0.96702984 0.54723225 0.97268436]
[0.71481599 0.69772882 0.2160895 ]
[0.97627445 0.00623026 0.25298236]
```

Note that the whole sequence of random arrays is the same after initializing with the same seed, even though the arrays generated by successive calls within one run differ from each other. Providing a fixed seed ensures that the same series of calls to functions in the `numpy.random` namespace will always produce the same results, which can be helpful in code testing.
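
As a rough sketch of how this helps in testing, the snippet below reseeds the global generator before each run of a function and checks that both runs agree. The function `sample_mean` is a hypothetical stand-in for code under test, and the seed and sample size are arbitrary.

```python
import numpy as np

def sample_mean(n):
    # Hypothetical function under test: it draws from the global generator.
    return np.random.rand(n).mean()

np.random.seed(4)
first_run = sample_mean(100)

np.random.seed(4)
second_run = sample_mean(100)

print(first_run == second_run)  # True
```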


## The problem with NumPy random.seed() function

The `np.random.seed()` function ensures that we can create reproducible results, which means that all random arrays generated (after initialization with the same seed) will be the same on any machine. However, there is a potential problem: the `np.random.seed()` function sets the seed of a **global** instance of the pseudorandom number generator.

This can be a problem for projects that import other modules or packages which also call `np.random.seed()`, because doing so affects all subsequent calls to the NumPy random functions. For instance, an imported module could reset the global random seed to another value, leading to unexpected changes in computed results. Therefore, the preferred best practice for getting reproducible pseudorandom numbers is to instantiate a generator object with a seed and “pass it around”.

> The preferred best practice for getting reproducible pseudorandom numbers is to instantiate a generator object with a seed and pass it around. The implicit global `RandomState` behind the `numpy.random.*` convenience functions can cause problems, especially when threads or other forms of concurrency are involved. Global state is always problematic. We categorically recommend avoiding using the convenience functions when reproducibility is involved.
>
> NEP 19: Random number generator policy, by Robert Kern
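
The global-state problem can be reproduced in a few lines. In this sketch, `third_party_helper` is a hypothetical stand-in for an imported module that reseeds the global generator; the seed values are arbitrary.

```python
import numpy as np

def third_party_helper():
    # Hypothetical imported code that silently reseeds the global generator.
    np.random.seed(0)

np.random.seed(3)
first = np.random.rand(3)

np.random.seed(3)
third_party_helper()          # overrides the seed we just set
second = np.random.rand(3)

print(np.array_equal(first, second))  # False
```

Although the calling code sets the same seed both times, the second array comes from the sequence for seed `0`, so the two results no longer match.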

## Using the NumPy random.RandomState() function

To avoid impacting the global NumPy state, we can create a separate generator with `np.random.RandomState()` instead of calling `random.seed()`. `np.random.RandomState()` has the advantage that it does not change the global `RandomState` instance that underlies the functions in the `numpy.random` namespace.

```
R = np.random.RandomState(32)
print(R.rand(3))

R = np.random.RandomState(32)
print(R.rand(3))
```

```
[0.85888927 0.37271115 0.55512878]
[0.85888927 0.37271115 0.55512878]
```

We see that the two random arrays generated are identical. Also, note that after assigning `R = np.random.RandomState(32)`, we only need to **prefix** the call to the `rand()` function with `R.`
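
One way to check that a local `RandomState` leaves the global generator alone is to compare what the global functions return with and without a local instance in use. This is an illustrative sketch; the seeds and the number of local draws are arbitrary.

```python
import numpy as np

np.random.seed(3)
expected = np.random.rand(3)     # global output right after seed(3)

np.random.seed(3)
R = np.random.RandomState(32)    # separate, local instance
_ = R.rand(1000)                 # drawing from R many times ...
actual = np.random.rand(3)       # ... does not advance the global sequence

print(np.array_equal(expected, actual))  # True
```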

We can also combine the 2 statements into a single statement:

```
np.random.RandomState(32).rand(3)
```

```
array([0.85888927, 0.37271115, 0.55512878])
```

Calling this statement three times results in three identical random arrays, because each call creates a fresh instance with the same seed.

```
np.random.RandomState(32).rand(3)
np.random.RandomState(32).rand(3)
np.random.RandomState(32).rand(3)
```

```
array([0.85888927, 0.37271115, 0.55512878])
array([0.85888927, 0.37271115, 0.55512878])
array([0.85888927, 0.37271115, 0.55512878])
```

However, calling `rand()` several times on the same instance will **not** lead to the same array, because each call advances the generator's internal state.

```
R = np.random.RandomState(32)
print(R.rand(3))
print(R.rand(3))
print(R.rand(3))
```

```
[0.85888927 0.37271115 0.55512878]
[0.95565655 0.7366696 0.81620514]
[0.10108656 0.92848807 0.60910917]
```

If you wish to generate the same random array with each call using `np.random.RandomState()`, without re-creating the seeded instance by hand each time, you may write a simple function as follows:

```
def rng(n):  # n is the length of the random array
    # Re-create the seeded instance on every call so the sequence restarts
    R = np.random.RandomState(12)
    return R.rand(n)

print(rng(4))
print(rng(4))
print(rng(4))
```

```
[0.15416284 0.7400497 0.26331502 0.53373939]
[0.15416284 0.7400497 0.26331502 0.53373939]
[0.15416284 0.7400497 0.26331502 0.53373939]
```
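
The "pass it around" practice recommended earlier can be sketched like this: the seeded instance is created once and handed to every function that needs random numbers. The function names `noisy_signal` and `run_experiment` are hypothetical and exist only for illustration.

```python
import numpy as np

def noisy_signal(n, rng):
    """Return n uniform samples drawn from the generator passed in."""
    return rng.rand(n)

def run_experiment(seed):
    # Create the seeded instance once, then pass it to each consumer.
    rng = np.random.RandomState(seed)
    first = noisy_signal(3, rng)
    second = noisy_signal(3, rng)
    return first, second

a1, a2 = run_experiment(12)
b1, b2 = run_experiment(12)

print(np.array_equal(a1, b1), np.array_equal(a2, b2))  # True True
print(np.array_equal(a1, a2))                          # False
```

With this design the whole experiment is reproducible from a single seed, while successive draws within one run still differ from each other.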

## Conclusion

Initializing the PRNG with a certain seed using `numpy.random.RandomState(seed)` returns a new seeded `RandomState` instance and does not change anything else. If you then call the functions in the `numpy.random` namespace, you will not get the seeded sequence, because they draw from the global `RandomState` instance rather than the one you just created. You have to use the returned `RandomState` instance every time to get consistent pseudorandom numbers.

On the other hand, `numpy.random.seed()` resets the state of the existing global `RandomState` instance that underlies the functions in the `numpy.random` namespace. This may have undesirable and unexpected consequences on computed outputs and is to be avoided.