Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

misuse of sqlite3_interrupt #681

Open
rittneje opened this issue Jan 6, 2019 · 11 comments
Open

misuse of sqlite3_interrupt #681

rittneje opened this issue Jan 6, 2019 · 11 comments

Comments

@rittneje
Copy link
Collaborator

rittneje commented Jan 6, 2019

This library is not using sqlite3_interrupt correctly. When called, it interrupts all current and future active statements, until the active statement count reaches 0. (An active statement is any statement for which the last call to sqlite3_step returned SQLITE_ROW, and it has not been reset or finalized since.) Consequently, it is entirely possible for one operation getting canceled to result in other, unrelated operations being canceled.

Consider the following (contrived) example.

db, err := sql.Open("sqlite3", ":memory:")
if err != nil {
	panic(err)
}
defer db.Close()

if _, err := db.Exec(`
	CREATE TABLE Foo(FooID INTEGER NOT NULL PRIMARY KEY);
	INSERT INTO Foo VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10);
`); err != nil {
	panic(err)
}

tx, err := db.Begin()
if err != nil {
	panic(err)
}
defer tx.Rollback()

rows, err := tx.Query("SELECT * FROM Foo")
if err != nil {
	panic(err)
}
defer rows.Close()

for rows.Next() {
	// nested func so defer works properly
	func() {
		ctx, cancel := context.WithCancel(context.Background())
		defer cancel()

		r2, err := tx.QueryContext(ctx, "SELECT * FROM Foo")
		if err != nil {
			panic(err)
		}
		defer r2.Close()

		cancel()

		// sleep so the cancellation go routine has time to run
		time.Sleep(time.Second)
	}()
}

fmt.Println(rows.Err())

This example always outputs "interrupted", even though the outer query was not canceled.

@leslie-wang
Copy link

leslie-wang commented Jan 31, 2019

During longevity test, interrupt error happens randomly and frequently. It also happened in real production sometime. There is only one DB connection. Below are some trace logs. I'm not sure if another row available caused the interruption.

Trace: ev 1 -AC- conn 0x64000950, stmt 0x640189d8 {"BEGIN"} = exp 
Trace: ev 2 +Tx+ conn 0x64000950, stmt 0x640189d8 {""}; time 0
Trace: ev 1 +Tx+ conn 0x64000950, stmt 0x64005478 {"select token, user_id, device_id from token as t inner join (select id from user where user_name = ?) as u on u.id = t.user_id"} expanded {"select token, user_id, device_id from token as t inner join (select id from user where user_name = 'alice') as u on u.id = t.user_id"} 
Trace: ev 4 +Tx+ conn 0x64000950, stmt 0x64005478 {""}
Trace: ev 2 +Tx+ conn 0x64000950, stmt 0x64005478 {""}; time 1 ms; DB error: sqlite3.Error{Code:100, ExtendedCode:100, err:"another row available"}
Trace: ev 1 +Tx+ conn 0x64000950, stmt 0x64007548 {"select id, hash, ext_id from (select * from asset as a inner join (select * from device) as d on a.device_id = d.id where a.user_id = ? and year = ? and month = ? and day = ?) order by hash"} expanded {"select id, hash, ext_id from (select * from asset as a inner join (select * from device) as d on a.device_id = d.id where a.user_id = 1 and year = 2014 and month = 1 and day = 7) order by hash"}
Trace: ev 2 +Tx+ conn 0x64000950, stmt 0x64007548 {""}; time 7 ms
Trace: ev 1 +Tx+ conn 0x64000950, stmt 0x64007548 {"COMMIT"} = exp 
Trace: ev 2 -AC- conn 0x64000950, stmt 0x64007548 {""}; time 1 ms
Trace: ev 1 -AC- conn 0x64000950, stmt 0x64007548 {"BEGIN"} = exp 
Trace: ev 2 +Tx+ conn 0x64000950, stmt 0x64007548 {""}; time 0
Trace: ev 1 +Tx+ conn 0x64000950, stmt 0x6400bb98 {"select token, user_id, device_id from token as t inner join (select id from user where user_name = ?) as u on u.id = t.user_id"} expanded {"select token, user_id, device_id from token as t inner join (select id from user where user_name = 'alice') as u on u.id = t.user_id"} 
Trace: ev 4 +Tx+ conn 0x64000950, stmt 0x6400bb98 {""}
Trace: ev 2 +Tx+ conn 0x64000950, stmt 0x6400bb98 {""}; time 0; DB error: sqlite3.Error{Code:100, ExtendedCode:100, err:"another row available"}
Trace: ev 1 +Tx+ conn 0x64000950, stmt 0x64012c18 {"select id, hash, ext_id from (select * from asset as a inner join (select * from device) as d on a.device_id = d.id where a.user_id = ? and year = ? and month = ? and day = ?) order by hash"} expanded {"select id, hash, ext_id from (select * from asset as a inner join (select * from device) as d on a.device_id = d.id where a.user_id = 1 and year = 2014 and month = 1 and day = 8) order by hash"}
Trace: ev 2 +Tx+ conn 0x64000950, stmt 0x64012c18 {""}; time 7 ms
Trace: ev 1 +Tx+ conn 0x64000950, stmt 0x64012c18 {"COMMIT"} = exp 
Trace: ev 2 -AC- conn 0x64000950, stmt 0x64012c18 {""}; time 0
Trace: ev 2 -AC- conn 0x64000950, stmt 0x64012c18 {""}; time 0
time="2019-01-30T23:52:54-08:00" level=error msg=interrupted
Pop handle 0x64000950: deleted trace config {0x3ac4d0 15 true}.
Trace: ev 8 -AC- conn 0x64000950, stmt 0x0 {""}

@mattn
Copy link
Owner

mattn commented Jan 31, 2019

Could you please send me PR?

@leslie-wang
Copy link

My case seems like different from the author's. Mine just got interrupt error during normal call. Not sure what caused the failure.

@rittneje
Copy link
Collaborator Author

@leslie-wang "another row available" isn't actually an error. When you call "next", SQLite returns SQLITE_ROW, which means the query produced another row of data. That's the "error" you are seeing. Thus it would seem this library is getting its wires crossed somewhere. Would you mind filing a separate issue, including a code snippet so we can see exactly what you were doing?

@leslie-wang
Copy link

Sure. I'll try to make a simplified program, and see if it can reproduce, then create a seperate issue for that.

@leslie-wang
Copy link

#691 was created with simplified program and dataset. Please help taking a look. Thanks.

@rittneje
Copy link
Collaborator Author

rittneje commented Feb 3, 2019

Could you please send me PR?

@mattn Was this question addressed to me? I'm honestly not sure how to fix this problem. It seems to be a fundamental issue - SQLite doesn't actually support cancellation the way Go wants. In theory you could use a progress callback rather than sqlite3_interrupt. However, that effectively resorts to polling, which is a performance hit (especially in the case in which the user is using the default background context).

@mattn
Copy link
Owner

mattn commented Feb 3, 2019

I want to interrupt heavy query. So I think that using context is not enough to do it.

@rittneje
Copy link
Collaborator Author

rittneje commented Feb 3, 2019

I'm not sure what you mean by that. sqlite3_interrupt will interrupt your heavy query. The problem is that it can and will interrupt other things that you didn't mean it to.

@mattn
Copy link
Owner

mattn commented Feb 3, 2019

@rittneje Now I read the doc of progress callback. If sqlite3_iterrupt is not right way to cancel query, I hope to implement another way. If you know the usage of the API, please send me PR.

@rittneje
Copy link
Collaborator Author

rittneje commented Feb 3, 2019

I'm not really familiar with it, I'm afraid. I'm not even sure if it will work. It just appeared to be the only alternative for canceling a query. Your best bet might be to ask for advice on the SQLite mailing list.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants