RNG batching and reproducibility: how do I generate random numbers in batches that continue the random sequence?


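I am running a Monte Carlo simulation in R and, because of memory and time constraints, I need to run it in batches. I want the results to be reproducible. To do that, I would like to save the RNG's "random state" at the end of each batch and load it at the start of the next batch so that the pseudo-random sequence continues. I am having trouble with this: in particular, when I generate several kinds of random numbers in a batch (unif, norm, lognorm), reproducibility does not seem to hold.

Below is an example of what I understand so far, followed by an example where it does not work: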


# check session rng_state
state_0 <- .Random.seed

# changes when a seed is specified
set.seed(42)
state_1 <- .Random.seed
all.equal(state_0,state_1)

# generate random numbers
unif_1 <- runif(10)
state_2 <- .Random.seed

# random state has now changed
all.equal(state_1,state_2)

# new random numbers are generated continuing the random sequence
unif_2 <- runif(10) 
all.equal(unif_1,unif_2)

# reset random state to regenerate the random sequence
.Random.seed <- state_1
unif_1_2 <- runif(20)
all.equal(unif_1_2, c(unif_1,unif_2))

# I want to generate random numbers in batches and continue the random sequence
# whilst making sure it is reproducible

generate_rn <- function(num, rng_state){
  .Random.seed <- rng_state
  
  unif <- runif(num)
  norm <- rnorm(num)
  
  random_numbers <- cbind(unif,norm)
  rng_state <- .Random.seed
  return(list(random_numbers,rng_state))
}

# call function to generate random numbers for first batch using the starting random state
batch_1 <- generate_rn(10,state_1)

# call function to generate random numbers for second batch using the ending random state from batch 1
batch_2 <- generate_rn(10,batch_1[[2]])


batch_1_2 <- generate_rn(20,state_1)

# the random state after both functions are the same as we have generated 40 random numbers in each
all.equal(batch_1_2[[2]],batch_2[[2]])

# but the random numbers produced are not the same
all.equal(batch_1_2[[1]],rbind(batch_1[[1]],batch_2[[1]]))

# and the first 10 uniform random numbers are not the same as the first 10 uniform numbers generated
# above whilst supposedly using the same random state
all.equal(unif_1,batch_1[[1]][,1])

r random montecarlo
1 Answer

From ?.Random.seed:

"It can be saved and restored, but should not be altered by the user."
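The "saved and restored" part can be sketched as follows. Inside a function, `.Random.seed <- rng_state` only creates a local variable, while the generator keeps reading the state stored in the global environment, so the restore in the question has no effect. This is a minimal sketch, assuming the state is written with `assign()` into `globalenv()`; the list element names `numbers` and `rng_state` are illustrative and not part of the original code.

```r
# Sketch: restore and capture the RNG state in the global environment.
generate_rn <- function(num, rng_state) {
  # A plain `.Random.seed <- rng_state` here would only create a local copy;
  # the RNG reads .Random.seed from the global environment.
  assign(".Random.seed", rng_state, envir = globalenv())

  unif <- runif(num)
  norm <- rnorm(num)

  # Return the draws together with the state reached after drawing them.
  list(numbers   = cbind(unif, norm),
       rng_state = get(".Random.seed", envir = globalenv()))
}

set.seed(42)
state_1 <- get(".Random.seed", envir = globalenv())
unif_1  <- runif(10)

# Batch 1 starts from the saved state; batch 2 continues where batch 1 ended.
batch_1 <- generate_rn(10, state_1)
batch_2 <- generate_rn(10, batch_1$rng_state)

# The first 10 uniforms of batch 1 now match the draws made right after set.seed(42).
identical(unif_1, batch_1$numbers[, "unif"])
```

To carry the state across sessions, the returned `rng_state` could be written with `saveRDS()` at the end of one batch and read back with `readRDS()` at the start of the next.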

Could you use set.seed like this? It would be fully reproducible.

generate_rn <- function(seed, num){
  set.seed(seed)
  cbind(runif(num), rnorm(num))
}

(seed0 <- sample(.Machine$integer.max, 1))
#> [1] 1394740963
set.seed(seed0)
nBatches <- 5
seeds <- sample(.Machine$integer.max, nBatches, replace = TRUE)

res_1 <- lapply(seeds, generate_rn, num = 10)
res_2 <- lapply(seeds, generate_rn, num = 10)
identical(res_1, res_2)
#> [1] TRUE
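One point worth checking separately is draw order. Even when the RNG state is continued correctly, two batches of 10 will not reproduce one batch of 20 if each batch draws all of its uniforms and then all of its normals, because the stream is consumed in a different order. The following is a small sketch of that effect using only `set.seed()`; the variable names are illustrative.

```r
# Two "batches" of 10: uniforms then normals, twice, from one continuous stream.
set.seed(42)
u_a <- runif(10); n_a <- rnorm(10)
u_b <- runif(10); n_b <- rnorm(10)

# One "batch" of 20 from the same starting seed.
set.seed(42)
u_all <- runif(20); n_all <- rnorm(20)

# The first 10 uniforms agree, because nothing else was drawn before them.
identical(u_a, u_all[1:10])
#> [1] TRUE

# The next 10 do not: in the batched run, 10 normals were drawn in between.
identical(u_b, u_all[11:20])
#> [1] FALSE
```

So a batched run is reproducible against another batched run with the same batch sizes, but it should not be expected to match a single large run unless the draws are made in the same order.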