Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add an endpoint for doing SQL selects #641

Merged
merged 5 commits into from
Sep 4, 2023
Merged

Conversation

paulfitz
Copy link
Member

@paulfitz paulfitz commented Aug 24, 2023

This adds an endpoint for doing SQL selects directly on a Grist document. Other kinds of statements are not supported. There is a default timeout of a second on queries.

Users with the rights to view the entirety of a document could now POST to /api/docs/DOCID/sql a body like:

{
  "sql": "select * from Students where school_id = ? and name like 'Robert%'",
  "args": [ 47 ]
}

(For quick tests, it is also possible to GET that same URL with a q query parameter).

It is scary running user supplied SQL. Security concerns have been the stumbling block to adding such an endpoint. Here's why I've managed to convince myself it might be OK to do:

  • The underlying SQLite functions used will only process the first statement in the supplied text (the remainder is placed in a "tail string" ignored by node-sqlite3). So a Robert'); DROP TABLE Students;-- style attack isn't applicable.
  • The text is wrapped in select * from (USER SUPPLIED TEXT) which puts SQLite unconditionally onto the SELECT branch of its grammar. It is straightforward to break out of such a wrapper with multiple statements, but again, only the first statement is processed.
  • There's a timeout on the query.

There's no limit on memory usage, which is unfortunate. Grist is already vulnerable to memory exhaustion attacks so I personally can live with this one. But perhaps the endpoint should be an option at the installation level? SQL as understood by SQLite is a complicated beast.

Extra security could be arranged by extending node-sqlite3. For example, exposing sqlite3_stmt_readonly could add a bit of defense in depth (gristlabs/node-sqlite3#13). sqlite3_progress_handler and sqlite3_set_authorizer also could be useful.

The API endpoints follow loosely a design by Alex Hall.

This adds an endpoint for doing SQL selects directly on a
Grist document. Other kinds of statements are not supported.
There is a default timeout of a second on queries.

This follows loosely an API design by Alex Hall.
test/server/lib/DocApi.ts Outdated Show resolved Hide resolved
@paulfitz paulfitz marked this pull request as ready for review August 24, 2023 15:41
Copy link
Collaborator

@fflorent fflorent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A case appeared to me this morning: if I am correct, an attacker can mislead the checks with your wrap here by adding a closing bracket and reopening one later.

I proposed below an addition to the unit test to add a check for this case.

What do you think?


// rejected because deletes/updates/... can't be nested within a select
await check(false, 'delete from Table1 where id in (select id from Table1)');
await check(false, 'update Table1 set A = ?', 'test');
Copy link
Collaborator

@fflorent fflorent Aug 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are more checks to add in order to be safe against SQLi.

If I add this test, it seems like it does not pass the check (i.e. the request is accepted):

await check(false, 'select 1); delete from Table1; select * from (select 1');

Copy link
Member Author

@paulfitz paulfitz Aug 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is similar in nature to this check a little later, right?

await check(true, 'select * from Table1); delete from Table1 where id in (select id from Table1');

My comment there was:

   // Of course, we can get out of the wrapping select, but we can't
    // add on more statements. For example, the following runs with no
    // trouble - but only the SELECT part. The DELETE is discarded.
    // (node-sqlite3 doesn't expose enough to give an error message for
    // this, though we could extend it).

So: security wise, only the first statement goes through. It would be great to have an error, but I don't see a way to do it without modifying node-sqlite3 (which we've done before, it is painful though) or adding a parser for SQLite flavor of SQL (which, from a security perspective, we couldn't trust anyway, since the parser might be missing a case). Adding a parser would allow giving a better error message for users who try statements like this, so I'm open to it for that reason.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two points in the PR description are also relevant:

  • The underlying SQLite functions used will only process the first statement in the supplied text (the remainder is placed in a "tail string" ignored by node-sqlite3). ...
  • The text is wrapped in select * from (USER SUPPLIED TEXT) ... It is straightforward to break out of such a wrapper with multiple statements, but again, only the first statement is processed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is similar in nature to this check a little later, right?

You're totally right, I missed that part (my apologies, because you also put a big comment on it which should have raised my attention).

I also see that you check no deletions has been processed:
https://github.com/gristlabs/grist-core/pull/641/files#diff-ab299e5251059a3440c3a9ef4ecbf8efef7929325e762aeace269202f61052aeR4617-R4618

In case node-sqlite3 change that behavior, the test would not pass anymore.
I guess that's fine, regarding that:

  • the attacker should be authenticated with its API key;
  • already have access to the document;
  • an elevation of privilege would only target a single document and we must expect backups are made so the admin can rollback the changes;

I would say that the risk seems low and the danger probably not that high.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I've been looking into is other sqlite wrappers. For example, in grist-static, we replace node-sqlite3 with sql.js. Reading its API docs carefully, the call to all should be safe in the same way, but it is a bit scary how carefully one needs to read. That said, specifically in grist-static, the endpoint wouldn't even be available, and would only be accessing something in the user's own browser page.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are still a bit worrying about this issue, do you think we may make a temporary copy of the SQLite file in order to query it?
The original file would remain untouched, so if an attacker achieved to alter the document, it won't have any effect.

One drawback I see is that it's more expansive with the IO though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't thought of a copy. That's a thought. Another option is a temporary read-only connection to the file (temporary to avoid the possibility of changes to the connection via pragmas). I didn't jump into doing that, since as soon as there is more than one connection to the sqlite file, file-level locking can kick in and some new logic would be needed to handle busy timeouts. Definitely doable though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Crazy additional idea - could offer PostgreSQL-compatible access via postlite

Copy link
Contributor

@berhalak berhalak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't find any new way to break it. From a security perspective, it looks ok to me.

app/server/lib/DocApi.ts Show resolved Hide resolved
app/server/lib/DocApi.ts Outdated Show resolved Hide resolved
statement,
records: records.map(
rec => ({
fields: rec,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that the record format is a good option here (as records look different with id as a primary field, and some other tweaks). Maybe we should return records without any modifications.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were some options and discussion at https://grist.quip.com/ajj0ANBA5Tzd/SQL-API-endpoint#temp:C:KXO745c7072603e4c29be9140d72

I'm also unsure, but I weakly prefer this as a bit more future-safe - it is easy to add more information at both the query and record level in a non-disruptive way.

app/server/lib/DocApi.ts Outdated Show resolved Hide resolved
test/server/lib/DocApi.ts Show resolved Hide resolved
@paulfitz paulfitz requested a review from berhalak August 29, 2023 21:42
@georgegevoian georgegevoian self-requested a review August 30, 2023 02:48
Copy link
Contributor

@berhalak berhalak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Paul, It looks good to me. Can't wait to use it :)

// serious use.
// If SQL statements that modify the DB are ever supported, they should
// not be permitted by this endpoint.
this._app.get(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add some telemetry on API usage. I wonder if this will be widely used feature.

@paulfitz paulfitz requested a review from dsagal September 1, 2023 19:44
// that feels tricky currently with node-sqlite3)
timeout?: number; // In msecs. Can only be reduced from server default,
// not increased. Note timeout of a query can affect
// other queries on same document, because of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does "other queries" mean other simultaneous queries, or all future ones?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the comment. Not future ones.

@@ -518,6 +528,27 @@ export class DocWorkerApi {
})
);

// A GET /sql endpoint that takes a query like ?q=select+*+from+Table1
// Not very useful, apart from testing - see the POST endpoint for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this comment about being not very useful? Doesn't it support everything that the POST endpoint does, except for reducing the timeout? (I imagine args can just be made part of the query string)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once args are part of the query, it'd be useful, but I didn't want to invest in that (e.g. handling types) since I didn't know whether this PR would pass scrutiny. Datasette does a quite beautiful job on a similar endpoint. It would be more work to make beautiful via node-sqlite3, it'd be better with named parameters rather than just a list.

I can see that "useful" is a judgement call, I say it because any SQL interface that is string only is kind of a pain to work with and invites problems.

app/server/lib/DocApi.ts Outdated Show resolved Hide resolved
app/server/lib/DocApi.ts Show resolved Hide resolved
@@ -258,6 +258,10 @@ export class SQLiteDB implements ISQLiteDB {
private constructor(protected _db: MinDB, private _dbPath: string) {
}

public async interrupt(): Promise<void> {
return this._db.interrupt?.();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if it is ok for the interrupt method to be missing. If that's indeed possible in some configurations, should the SQL endpoint still be enabled in those cases?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made this more explicit, and also added a check that the wrapper had been reviewed for how it handles multiple statements.

test/server/lib/DocApi.ts Outdated Show resolved Hide resolved
test/server/lib/DocApi.ts Outdated Show resolved Hide resolved
@paulfitz paulfitz requested a review from dsagal September 2, 2023 21:35
Copy link
Member

@dsagal dsagal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thank you!

@paulfitz paulfitz merged commit bfd0fa8 into main Sep 4, 2023
@paulfitz paulfitz deleted the paulfitz/sql-endpoint branch September 4, 2023 13:21
paulfitz added a commit to gristlabs/grist-help that referenced this pull request Sep 28, 2023
This adds API documentation for the /sql endpoints added in
gristlabs/grist-core#641
paulfitz added a commit to gristlabs/grist-help that referenced this pull request Sep 28, 2023
This adds API documentation for the /sql endpoints added in
gristlabs/grist-core#641
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants