Implement data preloading #361

kylelang · 2025-01-16T00:23:17Z

This PR makes three meaningful and two trivial adjustments.

Meaningful

Implement data preloading for all analyses
Update unit tests to work with data preloading and typed data
- Adding a typed RDS file "tests/testthat/testData.rds" to make life easier
Adding two workaround functions in "R/commonClassicalRegression.R" to harmonize the column names stored in options$modelTerms.
- These names are not properly coded when the options list is parsed.
  - I.e., they still get the old encoding that doesn't provide type information.
- I suspect that the modelTerms element may be processed too early (i.e., before the column names in dataset are properly encoded), but that's pure speculation.
- The workaround functions simply retrieve the correctly encoded names from dataset by decoding both the modelTerms names and column names in dataset back to the user-supplied versions and mapping from there.
- If the encoding issue is fixed in jaspBase, this hack won't be necessary, but it works for now and doesn't seem to cause any problems.

Trivial

Adjust the loop index in .computeCorBayes() to only calculate the stats requested by the user.
Changed the data used for the Kendall's Tau tests in Bayesian Correlation.
- These tests were persistently crashing somewhere in bstat.
- The putative effect size in the crashy tests was really small, and that small effect looked to be causing the convergence problems.
- The tests ran fine when using data with a larger effect, so I changed the data.
- Since it looks like nobody has touched bstat or the Bayesian correlation analyses in three years, I didn't think it was worth the effort of trying to get the sampler to play nice with the small effect.

added a workaround to harmonize the column encoding for the names in 'modelTerms'

changed the example data used for the kendall's tau tests in bayesian correlation because bstat kept crashing due to the very small effect size in the old example data (at least, that's my hypothesis)

kylelang · 2025-01-16T00:38:50Z

@JorisGoosen, I added you as a reviewer so you could see what I did about the encoding issue we discussed on Tuesday. Specifying the allowed types in the QML and setting allowTypeChange: true had no effect.

kylelang · 2025-02-07T13:40:14Z

@vandenman, I haven't been able to get Johnny's attention (via Mattermost or email). Are you still willing to review this PR?

JohnnyDoorn · 2025-02-10T11:16:42Z

inst/qml/RegressionLinear.qml

@@ -63,7 +63,8 @@ Form
 			availableVariablesList.source: ['covariates', 'factors']
 			startIndex:			0
 			availableVariablesListName: "availableTerms"
-			allowedColumns:		[]
+			allowedColumns:		['scale', 'nominal']


Why did you add these specifications? The column type specification already happens in the main menu, where the user specifies either continuous (covariates) or factors (categorical) predictors, and those types are then respected in the model box. Allowing the type to change seems a bit weird, also since it does not get changed in all models and therefore causing non-nested model comparisons.

This change is a vestige of an attempt to fix the variable name encoding problems that I ended up working around with the routines defined in "R/commonClassicalRegression.R". That attempt didn't work, but I forgot to undo this change.

JohnnyDoorn · 2025-02-10T11:22:06Z

@kylelang great changes! just two things:

the type specification is not necessary I believe
Poisson regression is not working in GLM :

forcing conversion to scale for DV in poisson regression

kylelang · 2025-02-10T15:32:18Z

The crash in Poisson regression was caused by the error-checking routines trying to analyze the factor DV as a numeric variable. Since the data preloading types the variables before they get to R, a nominal or ordinal DV gets to R as a factor.

Since we always want the DV in Poisson regression to be an integer (I think), it seems cleanest to force the appropriate type conversion in the QML form. That's what I've implemented in my latest commit. We could also do the type conversion in R, but it feels simpler and more transparent to do the conversion where the user can see what's happening.

kylelang · 2025-02-10T16:58:55Z

@JohnnyDoorn, this PR is ready for re-review.

kylelang added 3 commits January 16, 2025 00:56

adjusting R scripts to accomodate data preloading

cbc8a60

added a workaround to harmonize the column encoding for the names in 'modelTerms'

updating the unit tests to handle the typed data

8162dcf

changed the example data used for the kendall's tau tests in bayesian correlation because bstat kept crashing due to the very small effect size in the old example data (at least, that's my hypothesis)

removing 'tests/testthat/test.csv' file that snuck in

fc119a0

kylelang requested review from JohnnyDoorn and JorisGoosen January 16, 2025 00:29

kylelang marked this pull request as ready for review January 16, 2025 00:30

kylelang requested a review from vandenman February 7, 2025 13:38

JohnnyDoorn reviewed Feb 10, 2025

View reviewed changes

undoing allowed types specification for model terms in linear regression

1245505

forcing conversion to scale for DV in poisson regression

kylelang force-pushed the dataPreLoading branch from 14bd402 to 1245505 Compare February 10, 2025 15:24

JohnnyDoorn approved these changes Feb 13, 2025

View reviewed changes

JohnnyDoorn merged commit ab90540 into jasp-stats:master Feb 13, 2025
4 of 5 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement data preloading #361

Implement data preloading #361

kylelang commented Jan 16, 2025

kylelang commented Jan 16, 2025

kylelang commented Feb 7, 2025

JohnnyDoorn Feb 10, 2025

kylelang Feb 10, 2025

JohnnyDoorn commented Feb 10, 2025

kylelang commented Feb 10, 2025 •

edited

Loading

kylelang commented Feb 10, 2025

Implement data preloading #361

Implement data preloading #361

Conversation

kylelang commented Jan 16, 2025

Meaningful

Trivial

kylelang commented Jan 16, 2025

kylelang commented Feb 7, 2025

JohnnyDoorn Feb 10, 2025

Choose a reason for hiding this comment

kylelang Feb 10, 2025

Choose a reason for hiding this comment

JohnnyDoorn commented Feb 10, 2025

kylelang commented Feb 10, 2025 • edited Loading

kylelang commented Feb 10, 2025

kylelang commented Feb 10, 2025 •

edited

Loading