feat: Verification logger (TRX only) (#1552)
This change adds a new `/verificationLogger` option, which enables logging verification results in convenient formats for further analysis. It is meant to imitate the `--logger` option that can be passed to `dotnet test` to choose a logger for test results, and the implementation reuses the same test results model and logger interface.

The motivation is to provide two end-user benefits:

1. Reporting verification results in an easy-to-read format, especially in CI for Dafny code bases, through existing tools that support common test result formats (e.g. https://github.com/dorny/test-reporter#supported-formats).
2. Recording the verification time needed for each procedure, to enable identifying hot spots. We have observed that proof conditions that take a long time to verify are also prone to timing out or failing to verify when Dafny code is changed over time. Once this PR is merged I will add a wiki page calling attention to this and explaining how to use this new option to keep an eye on verification time during development.

The only currently supported `/verificationLogger` value is `trx`, which produces the widely-supported VSTest TRX file format.
The mapping of verification results to "test results" is experimental and still subject to improvement, but here is a sample of what it looks like:

```
<?xml version="1.0" encoding="utf-8"?>
<TestRun id="782a53bd-962c-460a-9c1e-347392c7cfea" name="@a483e79acc1e 2021-11-02 21:18:24" xmlns="http://microsoft.com/schemas/VisualStudio/TeamTest/2010">
  <Times creation="2021-11-02T21:18:24.7568110-07:00" queuing="2021-11-02T21:18:24.7568110-07:00" start="2021-11-02T21:18:24.7373890-07:00" finish="2021-11-02T21:18:24.7872770-07:00" />
  <TestSettings name="default" id="28f50e45-c211-49b3-9464-1878b090a068">
    <Deployment runDeploymentRoot="_a483e79acc1e_2021-11-02_21_18_24" />
  </TestSettings>
  <Results>
    <UnitTestResult executionId="284b9663-40ec-46bd-b5d7-a3b92cf7fdac" testId="b346bd77-ae8b-e18e-94e7-4711e20a10b8" testName="Impl$$_module.__default.Same2" computerName="a483e79acc1e" duration="00:00:00.1680000" startTime="2021-11-02T21:18:24.0000000-07:00" endTime="2021-11-02T21:18:24.0000000-07:00" testType="13cdc9d9-ddb5-4fa4-a97d-d965ccfc6d4b" outcome="Passed" testListId="8c84fa94-04c1-424b-9868-57a2d4851a1d" relativeResultsDirectory="284b9663-40ec-46bd-b5d7-a3b92cf7fdac" />
    <UnitTestResult executionId="309ab74b-92b6-4963-a346-681f26f11ed4" testId="72f213c0-00a3-454b-d21a-61094039a2b0" testName="CheckWellformed$$_module.__default.IRP__Alt" computerName="a483e79acc1e" duration="00:00:00.0870000" startTime="2021-11-02T21:18:23.0000000-07:00" endTime="2021-11-02T21:18:23.0000000-07:00" testType="13cdc9d9-ddb5-4fa4-a97d-d965ccfc6d4b" outcome="Passed" testListId="8c84fa94-04c1-424b-9868-57a2d4851a1d" relativeResultsDirectory="309ab74b-92b6-4963-a346-681f26f11ed4" />
    ...
  </Results>
  <TestDefinitions>
    <UnitTest name="Impl$$_module.__default.Prefix" storage="test/verifythis2015/problem1.dfy" id="20f05010-f9dc-e8d1-a417-9bb05176768a">
      <Execution id="6a99bc61-e035-4f86-a20b-99b03318997e" />
      <TestMethod codeBase="Test/VerifyThis2015/Problem1.dfy" adapterTypeName="executor://dafnyverifier/v1" className="Impl$$_module.__default" name="Prefix" />
    </UnitTest>
    <UnitTest name="Impl$$_module.__default.Same1" storage="test/verifythis2015/problem1.dfy" id="6fb25bcd-2493-278f-b561-850c81ed8687">
      <Execution id="bc130fd9-dbbd-4b83-84d3-fd6592ed1049" />
      <TestMethod codeBase="Test/VerifyThis2015/Problem1.dfy" adapterTypeName="executor://dafnyverifier/v1" className="Impl$$_module.__default" name="Same1" />
    </UnitTest>
    ...
  </TestDefinitions>
  <TestEntries>
    <TestEntry testId="b346bd77-ae8b-e18e-94e7-4711e20a10b8" executionId="284b9663-40ec-46bd-b5d7-a3b92cf7fdac" testListId="8c84fa94-04c1-424b-9868-57a2d4851a1d" />
    <TestEntry testId="72f213c0-00a3-454b-d21a-61094039a2b0" executionId="309ab74b-92b6-4963-a346-681f26f11ed4" testListId="8c84fa94-04c1-424b-9868-57a2d4851a1d" />
    ...
  </TestEntries>
  <TestLists>
    <TestList name="Results Not in a List" id="8c84fa94-04c1-424b-9868-57a2d4851a1d" />
    <TestList name="All Loaded Results" id="19431567-8539-422a-85d7-44ee4e166bda" />
  </TestLists>
  <ResultSummary outcome="Completed">
    <Counters total="18" executed="18" passed="18" failed="0" error="0" timeout="0" aborted="0" inconclusive="0" passedButRunAborted="0" notRunnable="0" notExecuted="0" disconnected="0" warning="0" completed="0" inProgress="0" pending="0" />
  </ResultSummary>
</TestRun>
```

Note in particular the `duration` attribute on `UnitTestResult` nodes. Disappointingly, most tools that handle TRX files do not seem to support sorting by duration, which is how I'd expect users to identify hot spots. I will offer suggestions in the wiki for how to extract this data.
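
In the meantime, here is a rough sketch of one way to rank results by duration. It is not part of this change. It assumes the file has the shape of the sample above: the same `xmlns`, `UnitTestResult` nodes directly under `Results`, and durations written as `hh:mm:ss.fffffff`. The helper names `parse_duration` and `slowest_results` are made up for illustration.

```python
import sys
import xml.etree.ElementTree as ET
from datetime import timedelta

# Namespace copied from the xmlns attribute of the sample <TestRun> element.
TRX_NS = {"t": "http://microsoft.com/schemas/VisualStudio/TeamTest/2010"}


def parse_duration(text):
    """Convert a duration such as "00:00:00.1680000" to a timedelta.

    Assumes the hh:mm:ss.fffffff layout seen in the sample, with no day part.
    """
    hours, minutes, seconds = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))


def slowest_results(source, limit=10):
    """Return up to `limit` (testName, duration) pairs, slowest first.

    `source` is a file path or a binary file object, as accepted by ET.parse.
    """
    root = ET.parse(source).getroot()
    results = [
        (node.get("testName"), parse_duration(node.get("duration")))
        for node in root.iterfind("t:Results/t:UnitTestResult", TRX_NS)
        if node.get("duration") is not None  # skip results with no timing
    ]
    results.sort(key=lambda pair: pair[1], reverse=True)
    return results[:limit]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print the slowest results of the TRX file given as the first argument.
    for name, duration in slowest_results(sys.argv[1]):
        print(f"{duration.total_seconds():8.3f}s  {name}")
```

Saved under a name of your choosing and run with a TRX file path as its only argument, this prints one line per result, slowest first. Parsing with `ET.parse` rather than from a decoded string avoids tripping over a byte-order mark if the writer emits one.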
Known issues:

* I still need to add an integration test to sanity-check that this option is working, but first need to find a portable way to search for a substring in an output TRX file. We weren't using OutputCheck when we were on lit, and the xUnit test runner intentionally doesn't support it.