Skip to content

C/C++ test cases formatted for input to an LSTM

Notifications You must be signed in to change notification settings

ecna/vulnerabilitydataset

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

vulnerabilitydataset

C/C++ test cases formatted for input to an LSTM for natural language processing. All test cases originally come from the SARD database at https://samate.nist.gov/SARD/

finaltestcaselabels.csv holds the filename of each test case. filenames ending in 1 are vulnerable, 0 nonvulnerable. The first number corresponds to the array position of the vulnerability label in cwelabels.txt, and the second number the false positive vulnerability of the testcase.

ie, in the row: cpp\CWE-079\000001973_1.txt,10,0

cpp\CWE-079\000001973_1.txt is the filepath of the testcase.

10 is the 10th array position in cwelabels = "CWE-079: Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')" which is the vulnerability of this testcase.

0 is the 0 position in the cwelabels array = "No Vulnerability" which is the possible false positive vulnerability of the testcase, in this example being none

In the nonvulnerable counterpart testcase "cpp\CWE-079\000001974_0.txt,0,10" the first number being 0 means there is no vulnerability in the testcase, and the second 10 means this case would be a false positive for CWE-079.

About

C/C++ test cases formatted for input to an LSTM

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published