Add fuzzer based on honggfuzz #312

Merged
merged 3 commits into from
Aug 23, 2021

PsiACE
Member

@PsiACE PsiACE commented Jun 3, 2021

fuzzing

Also see: #211

PsiACE added 2 commits June 3, 2021 14:00
Signed-off-by: Chojan Shang <[email protected]>
Signed-off-by: Chojan Shang <[email protected]>
@BohuTANG
Contributor

Hi, @Dandandan @andygrove

With this patch, we can easily run the fuzzer against sqlparser-rs and find more issues, which is great for robustness.
In the future we can add a CI job that runs it for a limited time (e.g. 10 minutes).

@coveralls

Pull Request Test Coverage Report for Build 901777498

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 21 unchanged lines in 6 files lost coverage.
  • Overall coverage increased (+0.1%) to 90.17%

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| src/ast/data_type.rs | 1 | 75.0% |
| src/ast/ddl.rs | 2 | 80.18% |
| src/ast/query.rs | 2 | 86.6% |
| src/parser.rs | 4 | 85.89% |
| src/tokenizer.rs | 4 | 91.18% |
| src/ast/mod.rs | 8 | 70.6% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 676410191 | 0.1% |
| Covered Lines | 5467 |
| Relevant Lines | 6063 |

💛 - Coveralls

Signed-off-by: Chojan Shang <[email protected]>
Contributor

@alamb alamb left a comment

Sorry for the delay in review. I am trying to pitch in and help out now to clear the backlog.

This is very cool @PsiACE and @BohuTANG -- thank you. It is cool that it seems you have found a bug with this fuzzer already.

My biggest question is about the input -- does the fuzzer simply feed in random inputs, or does it somehow know how to generate random SQL statements?

@@ -2358,8 +2358,7 @@ impl<'a> Parser<'a> {
 ]) // This couldn't possibly be a bad idea
 })?
 .into_iter()
-.filter(|i| i.is_some())
-.map(|i| i.unwrap())
+.flatten()
Contributor

Is it correct that this is a real bug that was found/fixed with the fuzzer?

Member Author

No, this change just makes clippy happy.
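
To illustrate why this is a style fix rather than a behavior change, here is a minimal sketch using only the standard library. The function names and the `Vec<Option<i32>>` input are made up for the example and are not taken from `src/parser.rs`; the point is only that `Option<T>` implements `IntoIterator`, so `flatten()` drops the `None` values and unwraps the rest.

```rust
/// Keeps only the `Some` values, written the way the code read before the change.
/// (Hypothetical helper for illustration; not part of sqlparser-rs.)
fn keep_some_verbose(items: Vec<Option<i32>>) -> Vec<i32> {
    items
        .into_iter()
        .filter(|i| i.is_some())
        .map(|i| i.unwrap())
        .collect()
}

/// Same result with `flatten`: each `Option` is iterated as zero or one item.
fn keep_some_flatten(items: Vec<Option<i32>>) -> Vec<i32> {
    items.into_iter().flatten().collect()
}
```

Calling both helpers on the same input, for example `vec![Some(1), None, Some(3)]`, should give identical vectors.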


fn main() {
    loop {
        fuzz!(|data: String| {
Contributor

This is pretty cool. Where does the data come from? Is it just random strings? Or does this somehow know how to make valid SQL statements?

Contributor

I see some related comments here: #211 (comment)

Member Author

For now the input is just random bytes, but that is already enough to find bugs. The input generation can be improved later; see: https://www.cockroachlabs.com/blog/sqlsmith-randomized-sql-testing/
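
As a rough picture of what "random bytes" means in practice, the sketch below is a standard-library-only stand-in, not honggfuzz and not the harness in this PR. `toy_parse`, `XorShift`, and `fuzz_random_bytes` are hypothetical names invented for the example. It assumes the only thing the harness cares about is a panic: a parse error on garbage input is an expected outcome.

```rust
use std::panic;

/// Hypothetical stand-in for the parser under test (NOT sqlparser-rs).
/// It only checks parenthesis balance and returns Err instead of panicking.
fn toy_parse(sql: &str) -> Result<usize, String> {
    let mut depth: usize = 0;
    for c in sql.chars() {
        match c {
            '(' => depth += 1,
            ')' => {
                depth = depth
                    .checked_sub(1)
                    .ok_or_else(|| "unbalanced ')'".to_string())?
            }
            _ => {}
        }
    }
    if depth == 0 {
        Ok(sql.len())
    } else {
        Err("unbalanced '('".to_string())
    }
}

/// Tiny xorshift PRNG so the sketch needs no external crates.
/// The seed must be non-zero.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Feeds `iterations` random byte strings to the parser and counts panics.
fn fuzz_random_bytes(seed: u64, iterations: usize) -> usize {
    let mut rng = XorShift(seed);
    let mut panics = 0;
    for _ in 0..iterations {
        let len = (rng.next_u64() % 32) as usize;
        let bytes: Vec<u8> = (0..len).map(|_| (rng.next_u64() & 0xff) as u8).collect();
        // Invalid UTF-8 is replaced with U+FFFD rather than rejected.
        let input = String::from_utf8_lossy(&bytes).into_owned();
        // A parse error is fine; only a panic counts as a finding.
        let outcome = panic::catch_unwind(|| {
            let _ = toy_parse(&input);
        });
        if outcome.is_err() {
            panics += 1;
        }
    }
    panics
}
```

A real fuzzer such as honggfuzz additionally uses coverage feedback to steer its inputs, which this loop does not attempt.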

Contributor

In a past project, we had a harness that could generate random SQL queries and it found many bugs -- such tests are a wonderful way to help database software mature
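
A generator of that kind can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the harness from the past project and not SQLsmith: the table names, column names, and the `XorShift`, `pick`, and `random_select` helpers are all hypothetical. It builds statements from a tiny fixed grammar so that most inputs get past the tokenizer and exercise deeper parser paths.

```rust
/// Tiny xorshift PRNG so the sketch needs no external crates.
/// The seed must be non-zero.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Picks one of the given alternatives pseudo-randomly.
fn pick<'a>(rng: &mut XorShift, options: &[&'a str]) -> &'a str {
    options[(rng.next_u64() % options.len() as u64) as usize]
}

/// Generates a small SELECT statement from a fixed, made-up grammar:
/// SELECT <column> FROM <table> [WHERE <column> <op> <number>]
fn random_select(rng: &mut XorShift) -> String {
    let column = pick(rng, &["a", "b", "c", "*"]);
    let table = pick(rng, &["t1", "t2"]);
    let mut sql = format!("SELECT {} FROM {}", column, table);
    // Roughly half of the statements get a WHERE clause.
    if rng.next_u64() % 2 == 0 {
        let filter_column = pick(rng, &["a", "b"]);
        let op = pick(rng, &["=", "<", ">"]);
        let value = rng.next_u64() % 100;
        sql.push_str(&format!(" WHERE {} {} {}", filter_column, op, value));
    }
    sql
}
```

Each generated string could then be handed to the parser in place of the random bytes, with the same "only a panic is a bug" rule.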

Contributor

Filed apache/datafusion#913 for DataFusion. Thanks for the pointer @PsiACE

Contributor

@alamb alamb left a comment

I think it is cool and having initial fuzzing, even if just random strings to start, is a good first step.

Member

@houqp houqp left a comment

This is pretty cool. Looking forward to the day when we can have a SQL-syntax-specific fuzzer :)

@alamb alamb merged commit 2d04266 into apache:main Aug 23, 2021