Brainstorm overall model validation methods #31

Open
cczhu opened this issue Dec 19, 2019 · 2 comments
Labels
testing and validation (Testing and validating existing code)

Comments

@cczhu
Contributor

cczhu commented Dec 19, 2019

Traffic Prophet currently checks CountMatch's predictive accuracy by creating a set of fake STTCs out of our PTCs. TEPs checks KCOUNT and LSVR accuracy by using CountMatch's outputs as ground truth. While these methods are useful for spotting, e.g., poor fits in KCOUNT or CountMatch, they have extremely limited ability to determine the predictive error on roads where we don't already have empirical AADTs.

We need to brainstorm ways of estimating the predictive accuracy of Traffic Prophet, ideally in ways that minimize the data bias we currently have towards expressways.
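As a concrete starting point, here's a minimal sketch of the fake-STTC style of validation run as a holdout test on the PTCs themselves: each PTC is downsampled to a handful of days, the estimator is run on the downsampled counts, and the result is compared against the AADT computed from the full year. `predict_aadt` is a hypothetical placeholder for whatever wraps the CountMatch call (not part of the Traffic Prophet API), and PTC daily counts are assumed to be pandas Series of full-year daily totals.

```python
import numpy as np
import pandas as pd


def make_fake_sttc(daily_counts, n_days=7, seed=None):
    """Subsample a PTC's full-year daily counts to mimic a short-term count."""
    rng = np.random.default_rng(seed)
    positions = rng.choice(len(daily_counts), size=n_days, replace=False)
    return daily_counts.iloc[np.sort(positions)]


def holdout_errors(ptc_daily_counts, predict_aadt, n_days=7, seed=42):
    """Relative AADT error for each PTC when treated as a fake STTC.

    `ptc_daily_counts` maps PTC ID -> pandas Series of daily count totals;
    `predict_aadt(ptc_id, fake_sttc)` is a stand-in for the estimator under test.
    """
    errors = {}
    for ptc_id, daily_counts in ptc_daily_counts.items():
        true_aadt = daily_counts.mean()
        fake_sttc = make_fake_sttc(daily_counts, n_days=n_days, seed=seed)
        est_aadt = predict_aadt(ptc_id, fake_sttc)
        errors[ptc_id] = (est_aadt - true_aadt) / true_aadt
    return pd.Series(errors)
```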

cczhu added the testing and validation label Dec 19, 2019
@cczhu
Contributor Author

cczhu commented Dec 19, 2019

Sensitivity testing: given variations in the daily count totals at a short-term count location, how much does CountMatch's AADT prediction vary? This can be tested either by perturbing the STTC data by X%, or by "swapping" data between neighbouring STTCs.

The results of this experiment can also point to where we are in greatest need of PTCs, helping to inform the 2020 count program.
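A minimal sketch of the perturbation version of this test, assuming STTC daily totals are available as a numpy array and using a placeholder `predict_aadt` for the CountMatch call (both names are illustrative, not part of the codebase):

```python
import numpy as np


def aadt_sensitivity(sttc_counts, predict_aadt, scale=0.10, n_trials=200, seed=0):
    """Spread in predicted AADT when daily count totals are jittered by ~scale."""
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(n_trials):
        # Multiply each daily total by a random factor centred on 1 with
        # standard deviation `scale` (i.e. roughly +/- X% noise).
        jitter = rng.normal(loc=1.0, scale=scale, size=len(sttc_counts))
        estimates.append(predict_aadt(sttc_counts * jitter))
    estimates = np.asarray(estimates)
    # Coefficient of variation of the predictions summarizes how sensitive the
    # AADT estimate is to noise in the underlying daily counts.
    return estimates.std() / estimates.mean()
```

Locations whose output COV is large relative to the input perturbation would be natural candidates for additional permanent counts.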

@cczhu
Contributor Author

cczhu commented Dec 19, 2019

Deviation from model assumptions: since we calculate an MSE or COV to match an STTC with a nearby PTC, the goodness of the match is encoded in the value of the error metric. Both metrics technically span [0, ∞), but an MSE or COV of ≳ 1 probably indicates a terrible fit. We can use this mismatch as a goodness-of-fit metric between the model and the data. It wouldn't necessarily indicate that something is wrong with either; just that the data and the model assumptions don't align for whatever reason.
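A minimal sketch of using that matching error as a flag, assuming the pattern ratios of each STTC and its matched PTC are available as aligned numpy arrays; the function and variable names here are illustrative only:

```python
import numpy as np


def match_cov(sttc_ratios, ptc_ratios):
    """Root-mean-square relative mismatch between STTC and PTC patterns."""
    residuals = (sttc_ratios - ptc_ratios) / ptc_ratios
    return np.sqrt(np.mean(residuals ** 2))


def flag_poor_matches(matches, threshold=1.0):
    """Return STTC IDs whose match error suggests the model assumptions
    (e.g. that the STTC shares its matched PTC's pattern) don't hold."""
    return [sttc_id for sttc_id, (sttc_ratios, ptc_ratios) in matches.items()
            if match_cov(sttc_ratios, ptc_ratios) > threshold]
```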
