[R] Non-reproducible LDA results across machines

Ben Bolker bbo|ker @end|ng |rom gm@||@com
Fri Oct 3 18:41:37 CEST 2025


     To add a little bit of detail to what others have said:

  If you are using the same version of R, on the same operating system, 
with the same processor (e.g. there will be differences between 
Intel/M1/M2 Macs), then as far as I know the only source of 
non-determinism, which could even affect successive runs on the same 
machine, would be parallel operations in BLAS/LAPACK resulting in 
mathematically equivalent operations being done in a different order 
(floating point arithmetic is not associative, so (a+b)+c != a + (b+c) 
in general).
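
A minimal sketch of the point above, using arbitrary values (x, y and z are 
just illustrative names). It shows that regrouping a sum can change the 
last bits of the result, and how to check which BLAS/LAPACK libraries a 
given R session is linked against. It does not show how to control BLAS 
threading, which depends on the particular library in use.

```r
# Floating-point addition is not associative: regrouping changes rounding.
x <- 0.1
y <- 0.2
z <- 0.3

left  <- (x + y) + z
right <- x + (y + z)

print(left == right)                   # exact comparison of the two groupings
print(isTRUE(all.equal(left, right)))  # comparison up to a numerical tolerance
print(left - right)                    # size of the discrepancy

# Report the linear algebra libraries this session uses (base R, >= 3.4.0).
# Two machines with different entries here may differ in low-order bits.
print(extSoftVersion()["BLAS"])
print(La_library())
print(La_version())
```

Comparing fitted models across machines with all.equal() rather than 
identical() is one way to tolerate differences of this size.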

On 10/3/25 05:57, Jeanne Moreau wrote:
> Good Morning,
> 
> I am working with LDA models in R (using both topicmodels::LDA and
> quanteda::textmodel_lda) and noticed that the results differ slightly
> across different machines, even when I use set.seed(1234) and the same
> dataset.
> 
> So, I have a few questions:
> Is this expected due to BLAS/LAPACK or low-level random number generation
> differences?
> Is there a recommended way to enforce bit-for-bit reproducibility of LDA
> results across machines in R?
> Would you recommend always saving fitted models with saveRDS() to ensure
> reproducible outputs instead of re-fitting?
> 
> Thanks a lot for your guidance.
> 
> Best regards,
> 
> Jeanne Moreau

-- 
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Associate chair (graduate), Mathematics & Statistics
Director, School of Computational Science and Engineering
* E-mail is sent at my convenience; I don't expect replies outside of 
working hours.


