R Under development (unstable) (2024-07-10 r86888 ucrt) -- "Unsuffered Consequences" Copyright (C) 2024 The R Foundation for Statistical Computing Platform: x86_64-w64-mingw32/x64 R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. > library(fitdistrplus) Loading required package: MASS Loading required package: survival > > nsample <- 500 > nsample <- 10 > visualize <- FALSE # TRUE for manual tests with visualization of results > set.seed(1234) > > > # (1) check input data > # > data(smokedfish) > fitsf <- fitdistcens(smokedfish, "lnorm") > summary(fitsf) Fitting of the distribution ' lnorm ' By maximum likelihood on censored data Parameters estimate Std. Error meanlog -3.627606 0.4637122 sdlog 3.544570 0.4876610 Loglikelihood: -90.65154 AIC: 185.3031 BIC: 190.5725 Correlation matrix: meanlog sdlog meanlog 1.0000000 -0.4325873 sdlog -0.4325873 1.0000000 > > > smokedfish2 <- tail(smokedfish, 20) > > try(fitdistcens(rbind(smokedfish2, c(NA, NA)) , "lnorm")) Error in checkCensoredDataFrameNAInfNan(censdata) : censdata contain two NA values on the same line. > > try(fitdistcens(rbind(smokedfish2, c(3, Inf)) , "lnorm")) Error in checkCensoredDataFrameNAInfNan(censdata) : censdata contain Inf (infinite) values. > > try(fitdistcens(rbind(smokedfish2, c(NaN, 3)) , "lnorm")) Error in checkCensoredDataFrameNAInfNan(censdata) : censdata contain NaN (not a numeric) values. > > try(fitdistcens(rbind(smokedfish2, c(5, 3)) , "lnorm")) Error in checkCensoredDataFrameNAInfNan(censdata) : each censdata$left value must be less or equal to the corresponding censdata$right value > > fitsf <- fitdistcens(smokedfish, "lnorm", calcvcov = FALSE) > > > > # (6) custom optimisation function - example with the genetic algorithm > # > data(fluazinam) > log10EC50 <-log10(fluazinam) > > #wrap genoud function rgenoud package > mygenoud <- function(fn, par, ...) + { + require(rgenoud) + res <- genoud(fn, starting.values=par, ...) + standardres <- c(res, convergence=0) + + return(standardres) + } > > # call fitdistcens with a 'custom' optimization function > fit.with.genoud <- fitdistcens(log10EC50, "logis", custom.optim=mygenoud, nvars=2, + start=list(location=1, scale=1), + Domains=cbind(c(0,0), c(5, 5)), boundary.enforcement=1, + print.level=1, hessian=TRUE) Loading required package: rgenoud ## rgenoud (Version 5.9-0.10, Build Date: 2023-12-13) ## See http://sekhon.berkeley.edu/rgenoud for additional documentation. ## Please cite software as: ## Walter Mebane, Jr. and Jasjeet S. Sekhon. 2011. ## ``Genetic Optimization Using Derivatives: The rgenoud package for R.'' ## Journal of Statistical Software, 42(11): 1-26. ## Thu Jul 11 15:03:40 2024 Domains: 0.000000e+00 <= X1 <= 5.000000e+00 0.000000e+00 <= X2 <= 5.000000e+00 Data Type: Floating Point Operators (code number, name, population) (1) Cloning........................... 122 (2) Uniform Mutation.................. 125 (3) Boundary Mutation................. 125 (4) Non-Uniform Mutation.............. 125 (5) Polytope Crossover................ 125 (6) Simple Crossover.................. 126 (7) Whole Non-Uniform Mutation........ 125 (8) Heuristic Crossover............... 126 (9) Local-Minimum Crossover........... 0 HARD Maximum Number of Generations: 100 Maximum Nonchanging Generations: 10 Population size : 1000 Convergence Tolerance: 1.000000e-03 Using the BFGS Derivative Based Optimizer on the Best Individual Each Generation. Checking Gradients before Stopping. Not Using Out of Bounds Individuals But Allowing Trespassing. Minimization Problem. Generation# Solution Value 0 2.060254e+01 1 2.055391e+01 'wait.generations' limit reached. No significant improvement in 10 generations. Solution Fitness Value: 2.055391e+01 Parameters at the Solution (parameter, gradient): X[ 1] : 2.151909e+00 G[ 1] : -9.910020e-06 X[ 2] : 6.909662e-01 G[ 2] : 3.172033e-06 Solution Found Generation 1 Number of Generations Run 12 Thu Jul 11 15:03:42 2024 Total run time : 0 hours 0 minutes and 2 seconds > > summary(fit.with.genoud) Fitting of the distribution ' logis ' By maximum likelihood on censored data Parameters estimate Std. Error location 2.1519094 0.3222566 scale 0.6909662 0.1744835 Loglikelihood: -20.55391 AIC: 45.10781 BIC: 46.38593 Correlation matrix: location scale location 1.00000000 0.05106398 scale 0.05106398 1.00000000 > > > # (9) check keepdata > # > if (visualize) # LONG TO RUN ON CRAN AND NEEDS VISALIZATION OF RESULTS + { + set.seed(1234) + x <- rexp(1e3, 5) + # x <- data.frame(left=x, right=x+rexp(x, 1/2)) + x <- data.frame(left=x, right=x) + f1 <- fitdistcens(x, "exp", keepdata=FALSE) + f2 <- fitdistcens(x, "exp", keepdata=TRUE) + f1$censdata + f2$censdata + plot(f1) + plot(f2) + + } > > > # (9) fixing parameters > # > x <- rexp(nsample, 5) > x <- data.frame(left=x, right=x+.1) > > f1 <- fitdistcens(x, "gamma", fix.arg=list(shape=1.5)) > f1 Fitting of the distribution ' gamma ' on censored data by maximum likelihood Parameters: estimate rate 9.841827 Fixed parameters: value shape 1.5 > f1$fix.arg $shape [1] 1.5 > > f1 <- fitdistcens(x, "gamma", fix.arg=function(x) list(shape=1.5)) > f1 Fitting of the distribution ' gamma ' on censored data by maximum likelihood Parameters: estimate rate 9.841827 Fixed parameters (computed by a user-supplied function): value shape 1.5 > f1$fix.arg.fun function (x) list(shape = 1.5) > > > # (10) weights > # > data(salinity) > salinity.unique <- unique(salinity) > string.unique <- paste(salinity.unique$left, salinity.unique$right) > string.salinity <- paste(salinity$left, salinity$right) > nobs <- nrow(salinity.unique) > salinity.weights <- numeric(nobs) > for (i in 1:nobs) + { + salinity.weights[i] <- length(which(string.salinity == string.unique[i])) + } > cbind(salinity.unique, salinity.weights) left right salinity.weights 1 20.00 NA 8 6 21.50 21.5 1 7 15.00 30.0 1 8 20.00 25.0 1 9 23.70 23.7 1 10 25.00 NA 5 13 20.00 30.0 3 18 3.20 47.0 2 20 26.10 26.1 1 21 26.20 26.2 1 22 25.00 30.0 1 23 29.10 29.1 2 25 30.00 NA 10 30 30.00 30.0 1 36 30.00 35.0 2 38 35.00 35.0 1 39 35.00 NA 2 41 25.00 47.0 1 42 30.00 47.0 1 43 30.00 50.0 1 44 35.00 47.0 1 45 43.90 43.9 1 46 35.00 55.0 1 47 45.70 45.7 1 48 47.00 NA 3 50 47.00 47.0 2 53 49.00 49.0 1 54 0.10 NA 5 59 0.23 NA 1 60 3.20 NA 3 63 6.40 NA 6 69 3.20 12.8 1 70 6.40 12.8 1 71 0.10 20.0 1 72 3.20 20.0 1 73 12.60 12.6 1 74 12.80 12.8 2 75 12.80 NA 10 86 12.80 15.0 2 88 15.00 NA 7 90 15.00 15.0 3 98 6.40 25.0 1 99 15.00 20.0 1 100 6.40 30.0 1 101 12.80 25.0 2 103 3.20 35.0 1 104 15.00 25.0 1 105 12.80 20.0 1 > > (fa <- fitdistcens(salinity, "lnorm")) Fitting of the distribution ' lnorm ' on censored data by maximum likelihood Parameters: estimate meanlog 3.3854230 sdlog 0.4961333 > (fb <- fitdistcens(salinity.unique, "lnorm", weights = salinity.weights)) # should give the same results Fitting of the distribution ' lnorm ' on censored data by maximum likelihood Parameters: estimate meanlog 3.3854550 sdlog 0.4961777 Warning message: In mledist(censdata, distname, start = arg_startfix$start.arg, fix.arg = arg_startfix$fix.arg, : weights are not taken into account in the default initial values > > # (11) check the warning messages when using weights in the fit followed by functions > # that do not yet take weights into account > # with an example to be used later to see if weights are well taken into account > # > x <- rexp(100, 5) > x <- sort(x) > x <- data.frame(left=x, right=x+.1) > (f <- fitdistcens(x, "gamma", weights=c(rep(10, 50), rep(1, 50)))) Fitting of the distribution ' gamma ' on censored data by maximum likelihood Parameters: estimate shape 2.413364 rate 19.146928 Warning message: In mledist(censdata, distname, start = arg_startfix$start.arg, fix.arg = arg_startfix$fix.arg, : weights are not taken into account in the default initial values > try(plot(f)) Error in plot.fitdistcens(f) : The plot of the fit is not yet available when using weights > try(cdfcompcens(f)) Error in cdfcompcens(f) : cdfcompcens is not yet available when using weights > (f2 <- fitdistcens(x, "weibull", weights=c(rep(10, 50), rep(1, 50)))) Fitting of the distribution ' weibull ' on censored data by maximum likelihood Parameters: estimate shape 1.3770720 scale 0.1372821 Warning message: In mledist(censdata, distname, start = arg_startfix$start.arg, fix.arg = arg_startfix$fix.arg, : weights are not taken into account in the default initial values > try(cdfcompcens(list(f, f2))) Error in cdfcompcens(list(f, f2)) : cdfcompcens is not yet available when using weights > try(bootdistcens(f)) Error in bootdistcens(f) : Bootstrap is not yet available when using weights > > proc.time() user system elapsed 3.90 0.35 4.23