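1. PyTorch keeps separate random number generators for the CPU and for each GPU, so set, save, and load their states independently. (In recent versions torch.manual_seed(seed) seeds all devices, but torch.get_rng_state() still returns only the CPU state.)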
https://discuss.pytorch.org/t/are-gpu-and-cpu-random-seeds-independent/142
2. If you use randomness on several GPUs, you need to call torch.cuda.manual_seed_all(seed).
If you use cuDNN, you need to set torch.backends.cudnn.deterministic = True.
3. Random seeds are often set in many places, inside different functions, so they are easy to miss. E.g.,
linear_layer_init(seed=1)
sample_next_word(seed=10)
4. Different random generators may be in use, and each one has its own seed and state:
import random as rng
import numpy.random as rng
import torch; torch.cuda.manual_seed(seed)
5. Forgetting to save the random seed/state before testing the model on the test/dev sets (which may also sample, shuffle, or simply set seeds themselves).
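6. Forgetting to test immediately after loading the model and before resuming training. Testing involves random sampling and reports performance, and it always runs at the end of each epoch, so a resumed run that skips it would likely draw different random numbers than the original run.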
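
The points above can be sketched in a few lines. This is only a minimal sketch, not a definitive recipe: the helper names seed_everything, capture_rng_state, and restore_rng_state are hypothetical (they are not part of PyTorch), and it assumes the only generators in play are Python's random, NumPy's global generator, and PyTorch's default CPU/GPU generators.

```python
import random

import numpy as np
import torch


def seed_everything(seed: int) -> None:
    """Hypothetical helper: seed every generator listed in item 4 in one place."""
    random.seed(seed)                 # Python's built-in generator
    np.random.seed(seed)              # NumPy's global generator
    torch.manual_seed(seed)           # PyTorch default generator
    torch.cuda.manual_seed_all(seed)  # every GPU; silently ignored without CUDA
    torch.backends.cudnn.deterministic = True


def capture_rng_state() -> dict:
    """Hypothetical helper: snapshot each generator's state separately."""
    state = {
        "python": random.getstate(),
        "numpy": np.random.get_state(),
        "torch_cpu": torch.get_rng_state(),
    }
    if torch.cuda.is_available():
        # One state tensor per visible GPU.
        state["torch_cuda"] = torch.cuda.get_rng_state_all()
    return state


def restore_rng_state(state: dict) -> None:
    """Hypothetical helper: put every generator back where the snapshot left it."""
    random.setstate(state["python"])
    np.random.set_state(state["numpy"])
    torch.set_rng_state(state["torch_cpu"])
    if "torch_cuda" in state and torch.cuda.is_available():
        torch.cuda.set_rng_state_all(state["torch_cuda"])
```

For item 5, one option is to call capture_rng_state() right before evaluating on the dev/test set and restore_rng_state() right after, so that any sampling or shuffling done during evaluation does not shift the random stream that training sees. The returned dict could also be stored alongside a model checkpoint, though how it is serialized is left open here.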