Transpile using lsp #1354

Closed
wants to merge 802 commits into from

@ericvergnaud ericvergnaud commented Dec 12, 2024

This PR implements LSPEngine, a TranspileEngine that uses the Language Server Protocol (LSP) to launch and communicate with a pluggable transpiler implemented as an LSP server.

In our prototype, we used the existing LSP CodeAction mechanism with CodeActionKind.Refactor. This design works well when using VSCode as a client, but is not ideal for batch transpilation:

  • it does not provide a deterministic way to identify the command for transpiling to Databricks
  • it requires 2 interactions with the server for each file (1 to fetch the command, 1 to execute it)
  • the transpile result is sent asynchronously, which complicates the client engine's job
  • the transpile errors are sent asynchronously, which complicates the client engine's job

For those reasons, this PR chooses a different design that relies on LSP custom capabilities.
The LSPEngine supports a "document/transpileToDatabricks" capability, which the server must register during initialization.
Once registration is complete, the LSPEngine can safely invoke that capability, which returns all the results (changes and errors) in a single response.

This mechanism has been tested successfully against a test LSPServer which, for transpiling, simply converts the content to uppercase.
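The single-response exchange described above can be sketched with the standard library alone. This is a minimal illustration, not the PR's implementation: only the method name "document/transpileToDatabricks" and the uppercase test behaviour come from the description, while the `params` and `result` shapes (`uri`, `text`, `changes`, `diagnostics`) and every function name are assumptions made for the example. The client and the stand-in server run in the same process, so no real transport is involved.

```python
import json

# Method name taken from the PR description; everything else is illustrative.
TRANSPILE_METHOD = "document/transpileToDatabricks"


def frame(payload: dict) -> bytes:
    """Encode a JSON-RPC message using LSP's Content-Length framing."""
    body = json.dumps(payload).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def unframe(data: bytes) -> dict:
    """Decode one framed message (assumes `data` holds the whole message)."""
    header, _, body = data.partition(b"\r\n\r\n")
    length = int(header.decode("ascii").split(":", 1)[1].strip())
    return json.loads(body[:length].decode("utf-8"))


def build_transpile_request(request_id: int, uri: str, text: str) -> dict:
    # Hypothetical params shape; the real one is defined by the PR's code.
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": TRANSPILE_METHOD,
        "params": {"uri": uri, "text": text},
    }


def fake_server_handle(request: dict) -> dict:
    """Stand-in for the test server: 'transpiles' by upper-casing the text."""
    if request["method"] != TRANSPILE_METHOD:
        # -32601 is the JSON-RPC "Method not found" error code.
        return {"jsonrpc": "2.0", "id": request["id"],
                "error": {"code": -32601, "message": "Method not found"}}
    text = request["params"]["text"]
    # Changes and errors travel together in one response (hypothetical shape).
    return {"jsonrpc": "2.0", "id": request["id"],
            "result": {"changes": [{"newText": text.upper()}],
                       "diagnostics": []}}


def transpile(uri: str, text: str):
    """One round trip: send the request, read changes and errors together."""
    wire_request = frame(build_transpile_request(1, uri, text))
    wire_response = frame(fake_server_handle(unframe(wire_request)))
    result = unframe(wire_response)["result"]
    return result["changes"][0]["newText"], result["diagnostics"]
```

Compared with the CodeAction approach, the client here needs exactly one request per file and reads both the transpiled text and any errors from the same response object.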

Worth noting: this PR works around a limitation (possibly a bug) in the lsprotocol Python library, where custom capability messages are serialized incorrectly. This is considered tech debt and is tracked in issue #1378.

Fixes #1299
Requires #1364
Requires #1390

ganeshdogiparthi-db and others added 30 commits November 26, 2024 13:06
* Added support for the json_size function in Presto and its alternative in
Databricks using SQL functions
* Fixed `is not null` error for json_extract in Databricks generator.
Added a new test case for this issue.
This doesn't actually fix anything; rather, it adds tests showing that a fix
might not be required.
Progresses #976

Supersedes #1223, which lacked a GPG signature.
This PR trivially refactors the situations where we are using case
classes without parameters to be case objects (as is idiomatic in
Scala).
This PR aims to fix the memory leak causing failure by raising parse
errors immediately.
The generator does not currently enclose subqueries in parentheses, thus
generating incorrect code such as the following:

`SELECT * FROM SELECT * FROM t WHERE a > 'a'  WHERE a > 'b' `

This PR fixes the issue such that the generated code is now:

`SELECT * FROM (SELECT * FROM t WHERE a > 'a')  WHERE a > 'b' `

It takes special care of the `.. IN(SELECT...)` pattern, avoiding
doubled enclosing parentheses.
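The rule can be illustrated with a toy generator. This Python sketch is only an analogy for the behaviour described above, not the project's generator: the `Select` and `InSubquery` node types and both render functions are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Select:
    # Either a table name or a nested query used as the FROM source.
    source: Union[str, "Select"]
    where: Optional[Union[str, "InSubquery"]] = None


@dataclass
class InSubquery:
    column: str
    query: Select


def render_condition(cond: Union[str, InSubquery]) -> str:
    if isinstance(cond, InSubquery):
        # IN supplies its own parentheses, so the subquery is rendered
        # bare here to avoid producing "IN ((SELECT ...))".
        return f"{cond.column} IN ({render_select(cond.query)})"
    return cond


def render_select(select: Select) -> str:
    if isinstance(select.source, Select):
        # A subquery used as a FROM source must be enclosed in parentheses.
        source = f"({render_select(select.source)})"
    else:
        source = select.source
    sql = f"SELECT * FROM {source}"
    if select.where is not None:
        sql += f" WHERE {render_condition(select.where)}"
    return sql
```

For instance, `render_select(Select(Select("t", "a > 'a'"), "a > 'b'"))` wraps the inner query in a single pair of parentheses, and an `InSubquery` condition is rendered with exactly one pair as well.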

Test cases are included in #1233 but not added here because they require
changes unrelated to this PR.
This PR refactors the situations where we use this:

```scala
x.tail.foldLeft(x.head)
```

to use the equivalent operation that does this:
```scala
x.reduceLeft
```

Following a review comment discussing the safety of reduce in these
situations[^1], some further changes have been made to the surrounding
code such that the behaviour remains the same but the safety can be
determined via trivial inspection without needing to refer to the
grammar.

[^1]: Not a new concern; the previous code had the same safety issue due
to the use of `.head`.
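As an analogy only (the PR itself changes Scala code), the same equivalence and the same empty-collection caveat can be sketched in Python with `functools.reduce`. The helper names `fold_from_head` and `reduce_left` are made up for this illustration.

```python
from functools import reduce
from operator import sub


def fold_from_head(items, op):
    # Mirrors x.tail.foldLeft(x.head)(op): seed the fold with the first
    # element, then fold over the remaining ones.
    return reduce(op, items[1:], items[0])


def reduce_left(items, op):
    # Mirrors x.reduceLeft(op): the first element is the implicit seed.
    return reduce(op, items)


def fails_on_empty(fn) -> bool:
    # Both forms need at least one element, matching the footnote above:
    # the safety concern exists before and after the refactoring.
    try:
        fn([], sub)
    except (IndexError, TypeError):
        return True
    return False
```

Using a non-commutative operator such as subtraction shows that both forms associate to the left in the same way, and `fails_on_empty` shows that neither form is safe on an empty input.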
## Changes

This PR implements support for parsing set operations with TSql: `UNION
[ALL]`, `EXCEPT` and `INTERSECT`.

The grammar previously supported these but they were not being converted
to the IR.

### Linked issues

Resolves #1126.
Resolves #1102.

### Tests

- [x] added unit tests
- [x] added transpiler tests
- [x] added functional tests
…1250)

Added support for transpiling the Presto format_datetime function to Databricks
This PR drops an unnecessary unit test.
Bumps
[codecov/codecov-action](https://github.com/codecov/codecov-action) from
4 to 5.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/codecov/codecov-action/releases">codecov/codecov-action's
releases</a>.</em></p>
<blockquote>
<h2>v5.0.0</h2>
<h2>v5 Release</h2>
<p><code>v5</code> of the Codecov GitHub Action will use the <a
href="https://github.com/codecov/wrapper">Codecov Wrapper</a> to
encapsulate the <a
href="https://github.com/codecov/codecov-cli">CLI</a>. This will help
ensure that the Action gets updates quicker.</p>
<h3>Migration Guide</h3>
<p>The <code>v5</code> release also coincides with the opt-out feature
for tokens for public repositories. In the <code>Global Upload
Token</code> section of the settings page of an organization in
codecov.io, you can set the ability for Codecov to receive a coverage
reports from any source. This will allow contributors or other members
of a repository to upload without needing access to the Codecov token.
For more details see <a
href="https://docs.codecov.com/docs/codecov-tokens#uploading-without-a-token">how
to upload without a token</a>.</p>
<blockquote>
<p>[!WARNING]<br />
<strong>The following arguments have been changed</strong></p>
<ul>
<li><code>file</code> (this has been deprecated in favor of
<code>files</code>)</li>
<li><code>plugin</code> (this has been deprecated in favor of
<code>plugins</code>)</li>
</ul>
</blockquote>
<p>The following arguments have been added:</p>
<ul>
<li><code>binary</code></li>
<li><code>gcov_args</code></li>
<li><code>gcov_executable</code></li>
<li><code>gcov_ignore</code></li>
<li><code>gcov_include</code></li>
<li><code>report_type</code></li>
<li><code>skip_validation</code></li>
<li><code>swift_project</code></li>
</ul>
<p>You can see their usage in the <code>action.yml</code> <a
href="https://github.com/codecov/codecov-action/blob/main/action.yml">file</a>.</p>
<h2>What's Changed</h2>
<ul>
<li>chore(deps): bump to eslint9+ and remove eslint-config-google by <a
href="https://github.com/thomasrockhu-codecov"><code>@​thomasrockhu-codecov</code></a>
in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1591">codecov/codecov-action#1591</a></li>
<li>build(deps-dev): bump <code>@​octokit/webhooks-types</code> from
7.5.1 to 7.6.1 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1595">codecov/codecov-action#1595</a></li>
<li>build(deps-dev): bump typescript from 5.6.2 to 5.6.3 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1604">codecov/codecov-action#1604</a></li>
<li>build(deps-dev): bump <code>@​typescript-eslint/parser</code> from
8.8.0 to 8.8.1 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1601">codecov/codecov-action#1601</a></li>
<li>build(deps): bump <code>@​actions/core</code> from 1.11.0 to 1.11.1
by <a href="https://github.com/dependabot"><code>@​dependabot</code></a>
in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1597">codecov/codecov-action#1597</a></li>
<li>build(deps): bump github/codeql-action from 3.26.9 to 3.26.11 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1596">codecov/codecov-action#1596</a></li>
<li>build(deps-dev): bump <code>@​typescript-eslint/eslint-plugin</code>
from 8.8.0 to 8.8.1 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1600">codecov/codecov-action#1600</a></li>
<li>build(deps-dev): bump eslint from 9.11.1 to 9.12.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1598">codecov/codecov-action#1598</a></li>
<li>build(deps): bump github/codeql-action from 3.26.11 to 3.26.12 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1609">codecov/codecov-action#1609</a></li>
<li>build(deps): bump actions/checkout from 4.2.0 to 4.2.1 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1608">codecov/codecov-action#1608</a></li>
<li>build(deps): bump actions/upload-artifact from 4.4.0 to 4.4.3 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1607">codecov/codecov-action#1607</a></li>
<li>build(deps-dev): bump <code>@​typescript-eslint/parser</code> from
8.8.1 to 8.9.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1612">codecov/codecov-action#1612</a></li>
<li>build(deps-dev): bump <code>@​typescript-eslint/eslint-plugin</code>
from 8.8.1 to 8.9.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1611">codecov/codecov-action#1611</a></li>
<li>build(deps-dev): bump <code>@​typescript-eslint/eslint-plugin</code>
from 8.9.0 to 8.10.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1615">codecov/codecov-action#1615</a></li>
<li>build(deps-dev): bump eslint from 9.12.0 to 9.13.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1618">codecov/codecov-action#1618</a></li>
<li>build(deps): bump github/codeql-action from 3.26.12 to 3.26.13 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1617">codecov/codecov-action#1617</a></li>
<li>build(deps-dev): bump <code>@​typescript-eslint/parser</code> from
8.9.0 to 8.10.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1614">codecov/codecov-action#1614</a></li>
<li>build(deps-dev): bump <code>@​typescript-eslint/eslint-plugin</code>
from 8.10.0 to 8.11.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1620">codecov/codecov-action#1620</a></li>
<li>build(deps-dev): bump <code>@​typescript-eslint/parser</code> from
8.10.0 to 8.11.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1619">codecov/codecov-action#1619</a></li>
<li>build(deps-dev): bump <code>@​types/jest</code> from 29.5.13 to
29.5.14 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1622">codecov/codecov-action#1622</a></li>
<li>build(deps): bump actions/checkout from 4.2.1 to 4.2.2 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1625">codecov/codecov-action#1625</a></li>
<li>build(deps): bump github/codeql-action from 3.26.13 to 3.27.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1624">codecov/codecov-action#1624</a></li>
<li>build(deps-dev): bump <code>@​typescript-eslint/eslint-plugin</code>
from 8.11.0 to 8.12.1 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1626">codecov/codecov-action#1626</a></li>
<li>build(deps-dev): bump <code>@​typescript-eslint/eslint-plugin</code>
from 8.12.1 to 8.12.2 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/codecov/codecov-action/pull/1629">codecov/codecov-action#1629</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/codecov/codecov-action/blob/main/CHANGELOG.md">codecov/codecov-action's
changelog</a>.</em></p>
<blockquote>
<h2>4.0.0-beta.2</h2>
<h3>Fixes</h3>
<ul>
<li><a
href="https://redirect.github.com/codecov/codecov-action/issues/1085">#1085</a>
not adding -n if empty to do-upload command</li>
</ul>
<h2>4.0.0-beta.1</h2>
<p><code>v4</code> represents a move from the <a
href="https://github.com/codecov/uploader">universal uploader</a> to the
<a href="https://github.com/codecov/codecov-cli">Codecov CLI</a>.
Although this will unlock new features for our users, the CLI is not yet
at feature parity with the universal uploader.</p>
<h3>Breaking Changes</h3>
<ul>
<li>No current support for <code>aarch64</code> and <code>alpine</code>
architectures.</li>
<li>Tokenless uploading is unsupported</li>
<li>Various arguments to the Action have been removed</li>
</ul>
<h2>3.1.4</h2>
<h3>Fixes</h3>
<ul>
<li><a
href="https://redirect.github.com/codecov/codecov-action/issues/967">#967</a>
Fix typo in README.md</li>
<li><a
href="https://redirect.github.com/codecov/codecov-action/issues/971">#971</a>
fix: add back in working dir</li>
<li><a
href="https://redirect.github.com/codecov/codecov-action/issues/969">#969</a>
fix: CLI option names for uploader</li>
</ul>
<h3>Dependencies</h3>
<ul>
<li><a
href="https://redirect.github.com/codecov/codecov-action/issues/970">#970</a>
build(deps-dev): bump <code>@​types/node</code> from 18.15.12 to
18.16.3</li>
<li><a
href="https://redirect.github.com/codecov/codecov-action/issues/979">#979</a>
build(deps-dev): bump <code>@​types/node</code> from 20.1.0 to
20.1.2</li>
<li><a
href="https://redirect.github.com/codecov/codecov-action/issues/981">#981</a>
build(deps-dev): bump <code>@​types/node</code> from 20.1.2 to
20.1.4</li>
</ul>
<h2>3.1.3</h2>
<h3>Fixes</h3>
<ul>
<li><a
href="https://redirect.github.com/codecov/codecov-action/issues/960">#960</a>
fix: allow for aarch64 build</li>
</ul>
<h3>Dependencies</h3>
<ul>
<li><a
href="https://redirect.github.com/codecov/codecov-action/issues/957">#957</a>
build(deps-dev): bump jest-junit from 15.0.0 to 16.0.0</li>
<li><a
href="https://redirect.github.com/codecov/codecov-action/issues/958">#958</a>
build(deps): bump openpgp from 5.7.0 to 5.8.0</li>
<li><a
href="https://redirect.github.com/codecov/codecov-action/issues/959">#959</a>
build(deps-dev): bump <code>@​types/node</code> from 18.15.10 to
18.15.12</li>
</ul>
<h2>3.1.2</h2>
<h3>Fixes</h3>
<ul>
<li><a
href="https://redirect.github.com/codecov/codecov-action/issues/718">#718</a>
Update README.md</li>
<li><a
href="https://redirect.github.com/codecov/codecov-action/issues/851">#851</a>
Remove unsupported path_to_write_report argument</li>
<li><a
href="https://redirect.github.com/codecov/codecov-action/issues/898">#898</a>
codeql-analysis.yml</li>
<li><a
href="https://redirect.github.com/codecov/codecov-action/issues/901">#901</a>
Update README to contain correct information - inputs and negate
feature</li>
<li><a
href="https://redirect.github.com/codecov/codecov-action/issues/955">#955</a>
fix: add in all the extra arguments for uploader</li>
</ul>
<h3>Dependencies</h3>
<ul>
<li><a
href="https://redirect.github.com/codecov/codecov-action/issues/819">#819</a>
build(deps): bump openpgp from 5.4.0 to 5.5.0</li>
<li><a
href="https://redirect.github.com/codecov/codecov-action/issues/835">#835</a>
build(deps): bump node-fetch from 3.2.4 to 3.2.10</li>
<li><a
href="https://redirect.github.com/codecov/codecov-action/issues/840">#840</a>
build(deps): bump ossf/scorecard-action from 1.1.1 to 2.0.4</li>
<li><a
href="https://redirect.github.com/codecov/codecov-action/issues/841">#841</a>
build(deps): bump <code>@​actions/core</code> from 1.9.1 to 1.10.0</li>
<li><a
href="https://redirect.github.com/codecov/codecov-action/issues/843">#843</a>
build(deps): bump <code>@​actions/github</code> from 5.0.3 to 5.1.1</li>
<li><a
href="https://redirect.github.com/codecov/codecov-action/issues/869">#869</a>
build(deps): bump node-fetch from 3.2.10 to 3.3.0</li>
<li><a
href="https://redirect.github.com/codecov/codecov-action/issues/872">#872</a>
build(deps-dev): bump jest-junit from 13.2.0 to 15.0.0</li>
<li><a
href="https://redirect.github.com/codecov/codecov-action/issues/879">#879</a>
build(deps): bump decode-uri-component from 0.2.0 to 0.2.2</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/codecov/codecov-action/commit/968872560f81e7bdde9272853e65f2507c0eca7c"><code>9688725</code></a>
Update README.md</li>
<li><a
href="https://github.com/codecov/codecov-action/commit/2112eaec1bedbdabc7e93d5312449d0d62b07c60"><code>2112eae</code></a>
chore(deps): bump wrapper to 0.0.23 (<a
href="https://redirect.github.com/codecov/codecov-action/issues/1644">#1644</a>)</li>
<li><a
href="https://github.com/codecov/codecov-action/commit/193421c5b3d1aca4209c9754f224ca0d85729414"><code>193421c</code></a>
fixL use the correct source (<a
href="https://redirect.github.com/codecov/codecov-action/issues/1642">#1642</a>)</li>
<li><a
href="https://github.com/codecov/codecov-action/commit/6018df70b05b191502ce08196e76e30ea3578615"><code>6018df7</code></a>
fix: update container builds (<a
href="https://redirect.github.com/codecov/codecov-action/issues/1640">#1640</a>)</li>
<li><a
href="https://github.com/codecov/codecov-action/commit/eff1a643d6887ee5935d4ca343e9076dc377d416"><code>eff1a64</code></a>
fix: add missing vars (<a
href="https://redirect.github.com/codecov/codecov-action/issues/1638">#1638</a>)</li>
<li><a
href="https://github.com/codecov/codecov-action/commit/4582d54fd3d27d9130327cdb51361c32016fa400"><code>4582d54</code></a>
Update README.md (<a
href="https://redirect.github.com/codecov/codecov-action/issues/1639">#1639</a>)</li>
<li><a
href="https://github.com/codecov/codecov-action/commit/bb7467c2bce05781760a0964d48e35e96ee59505"><code>bb7467c</code></a>
feat: use wrapper (<a
href="https://redirect.github.com/codecov/codecov-action/issues/1621">#1621</a>)</li>
<li><a
href="https://github.com/codecov/codecov-action/commit/1d6059880cab9176d33e31e0f1ab076b20495f5e"><code>1d60598</code></a>
build(deps-dev): bump <code>@​typescript-eslint/eslint-plugin</code>
from 8.12.2 to 8.13.0 ...</li>
<li><a
href="https://github.com/codecov/codecov-action/commit/e587ce276eb45f1fcd960de3c01c83119213efca"><code>e587ce2</code></a>
build(deps-dev): bump <code>@​typescript-eslint/parser</code> from
8.12.2 to 8.13.0 (<a
href="https://redirect.github.com/codecov/codecov-action/issues/1635">#1635</a>)</li>
<li><a
href="https://github.com/codecov/codecov-action/commit/e43f28e103e52bb26d252b5a97fcdfa06175321e"><code>e43f28e</code></a>
build(deps-dev): bump <code>@​typescript-eslint/parser</code> from
8.11.0 to 8.12.2 (<a
href="https://redirect.github.com/codecov/codecov-action/issues/1628">#1628</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/codecov/codecov-action/compare/v4...v5">compare
view</a></li>
</ul>
</details>

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: SundarShankar89 <[email protected]>
Co-authored-by: Andrew Snare <[email protected]>
Snowflake supports both SUBSTR and SUBSTRING, see
https://docs.snowflake.com/fr/sql-reference/functions/substr.
This PR fills the gap for the missing SUBSTR.

Supersedes #1226, which
lacked a GPG signature.

---------

Co-authored-by: Andrew Snare <[email protected]>
Bumps [sqlglot](https://github.com/tobymao/sqlglot) from 25.30.0 to
25.32.1.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/tobymao/sqlglot/blob/main/CHANGELOG.md">sqlglot's
changelog</a>.</em></p>
<blockquote>
<h2>[v25.32.1] - 2024-11-27</h2>
<h3>:bug: Bug Fixes</h3>
<ul>
<li><a
href="https://github.com/tobymao/sqlglot/commit/954d8fd12740071e0951d1df3a405a4b9634868d"><code>954d8fd</code></a>
- parse DEFAULT in VALUES clause into a Var <em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4448">#4448</a>
by <a
href="https://github.com/georgesittas"><code>@​georgesittas</code></a>)</em>
<ul>
<li>:arrow_lower_right: <em>fixes issue <a
href="https://redirect.github.com/tobymao/sqlglot/issues/4446">#4446</a>
opened by <a
href="https://github.com/ddh-5230"><code>@​ddh-5230</code></a></em></li>
</ul>
</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/73afd0f435b7e7ccde831ee311c9a76c14797fdc"><code>73afd0f</code></a>
- <strong>bigquery</strong>: Make JSONPathTokenizer more lenient for new
standards <em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4447">#4447</a>
by <a
href="https://github.com/VaggelisD"><code>@​VaggelisD</code></a>)</em>
<ul>
<li>:arrow_lower_right: <em>fixes issue <a
href="https://redirect.github.com/tobymao/sqlglot/issues/4441">#4441</a>
opened by <a
href="https://github.com/patricksurry"><code>@​patricksurry</code></a></em></li>
</ul>
</li>
</ul>
<h2>[v25.32.0] - 2024-11-22</h2>
<h3>:boom: BREAKING CHANGES</h3>
<ul>
<li>
<p>due to <a
href="https://github.com/tobymao/sqlglot/commit/0eed45cce82681bfbafc8bfb78eb2a1bce86ae53"><code>0eed45c</code></a>
- Add support for ATTACH/DETACH statements <em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4419">#4419</a>
by <a
href="https://github.com/VaggelisD"><code>@​VaggelisD</code></a>)</em>:</p>
<p>Add support for ATTACH/DETACH statements (<a
href="https://redirect.github.com/tobymao/sqlglot/issues/4419">#4419</a>)</p>
</li>
<li>
<p>due to <a
href="https://github.com/tobymao/sqlglot/commit/da48b68a4f1fa6a754fa2a0a789564675d59546f"><code>da48b68</code></a>
- Tokenize hints as comments <em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4426">#4426</a>
by <a
href="https://github.com/VaggelisD"><code>@​VaggelisD</code></a>)</em>:</p>
<p>Tokenize hints as comments (<a
href="https://redirect.github.com/tobymao/sqlglot/issues/4426">#4426</a>)</p>
</li>
<li>
<p>due to <a
href="https://github.com/tobymao/sqlglot/commit/fe3539464a153b1c0bf46975d6221dee48a48f02"><code>fe35394</code></a>
- fix datetime coercion in the canonicalize rule <em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4431">#4431</a>
by <a
href="https://github.com/georgesittas"><code>@​georgesittas</code></a>)</em>:</p>
<p>fix datetime coercion in the canonicalize rule (<a
href="https://redirect.github.com/tobymao/sqlglot/issues/4431">#4431</a>)</p>
</li>
<li>
<p>due to <a
href="https://github.com/tobymao/sqlglot/commit/fddcd3dfc264a645909686c201d2288c0adf9047"><code>fddcd3d</code></a>
- bump sqlglotrs to 0.3.0 <em>(commit by <a
href="https://github.com/georgesittas"><code>@​georgesittas</code></a>)</em>:</p>
<p>bump sqlglotrs to 0.3.0</p>
</li>
</ul>
<h3>:sparkles: New Features</h3>
<ul>
<li><a
href="https://github.com/tobymao/sqlglot/commit/0eed45cce82681bfbafc8bfb78eb2a1bce86ae53"><code>0eed45c</code></a>
- <strong>duckdb</strong>: Add support for ATTACH/DETACH statements
<em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4419">#4419</a>
by <a
href="https://github.com/VaggelisD"><code>@​VaggelisD</code></a>)</em></li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/2db757dfec9ded26572b8e9a71dcc8ea8a2382fe"><code>2db757d</code></a>
- <strong>bigquery</strong>: Support FEATURES_AT_TIME <em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4430">#4430</a>
by <a
href="https://github.com/VaggelisD"><code>@​VaggelisD</code></a>)</em>
<ul>
<li>:arrow_lower_right: <em>addresses issue <a
href="https://redirect.github.com/tobymao/sqlglot/issues/4428">#4428</a>
opened by <a
href="https://github.com/YuvrajSoni-Ksolves"><code>@​YuvrajSoni-Ksolves</code></a></em></li>
</ul>
</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/fc591ae2fa80be5821cb53d78906afe8e5505654"><code>fc591ae</code></a>
- <strong>risingwave</strong>: add support for SINK, SOURCE &amp; other
DDL properties <em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4387">#4387</a>
by <a
href="https://github.com/lin0303-siyuan"><code>@​lin0303-siyuan</code></a>)</em></li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/a2bde2e03e9ef8650756bf304db35b4876746d1f"><code>a2bde2e</code></a>
- <strong>mysql</strong>: improve transpilability of CHAR[ACTER]_LENGTH
<em>(commit by <a
href="https://github.com/georgesittas"><code>@​georgesittas</code></a>)</em></li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/0acc248361f49f68f17d799cbaf6b3de06c57f7e"><code>0acc248</code></a>
- <strong>snowflake</strong>: Support CREATE ... WITH TAG <em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4434">#4434</a>
by <a
href="https://github.com/asikowitz"><code>@​asikowitz</code></a>)</em>
<ul>
<li>:arrow_lower_right: <em>addresses issue <a
href="https://redirect.github.com/tobymao/sqlglot/issues/4427">#4427</a>
opened by <a
href="https://github.com/asikowitz"><code>@​asikowitz</code></a></em></li>
</ul>
</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/37863ffd747cad9c2b9bed60119cc1551faeffda"><code>37863ff</code></a>
- <strong>snowflake</strong>: Transpile non-UNNEST exp.GenerateDateArray
refs <em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4433">#4433</a>
by <a
href="https://github.com/VaggelisD"><code>@​VaggelisD</code></a>)</em></li>
</ul>
<h3>:bug: Bug Fixes</h3>
<ul>
<li><a
href="https://github.com/tobymao/sqlglot/commit/83ee97b34cd0fe269b4820f15147d1ed7523612e"><code>83ee97b</code></a>
- <strong>parser</strong>: Do not parse window function arg as
exp.Column <em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4415">#4415</a>
by <a
href="https://github.com/VaggelisD"><code>@​VaggelisD</code></a>)</em>
<ul>
<li>:arrow_lower_right: <em>fixes issue <a
href="https://redirect.github.com/tobymao/sqlglot/issues/4410">#4410</a>
opened by <a
href="https://github.com/merlindso"><code>@​merlindso</code></a></em></li>
</ul>
</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/b22e0c8680b0ee5a382e57904b698bf21a94f782"><code>b22e0c8</code></a>
- <strong>parser</strong>: Extend DESCRIBE parser for MySQL FORMAT &amp;
statements <em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4417">#4417</a>
by <a
href="https://github.com/VaggelisD"><code>@​VaggelisD</code></a>)</em>
<ul>
<li>:arrow_lower_right: <em>fixes issue <a
href="https://redirect.github.com/tobymao/sqlglot/issues/4414">#4414</a>
opened by <a
href="https://github.com/AhlamHani"><code>@​AhlamHani</code></a></em></li>
</ul>
</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/d1d2ae7d1514abc9477d275352e5e126509157c6"><code>d1d2ae7</code></a>
- <strong>duckdb</strong>: Allow count arg on exp.ArgMax &amp;
exp.ArgMin <em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4413">#4413</a>
by <a href="https://github.com/aersam"><code>@​aersam</code></a>)</em>
<ul>
<li>:arrow_lower_right: <em>fixes issue <a
href="https://redirect.github.com/tobymao/sqlglot/issues/4412">#4412</a>
opened by <a
href="https://github.com/aersam"><code>@​aersam</code></a></em></li>
</ul>
</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/e3c45d5ec0ae6827e4b0bcfb047aeac131379732"><code>e3c45d5</code></a>
- presto reset session closes <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4421">#4421</a>
<em>(commit by <a
href="https://github.com/tobymao"><code>@​tobymao</code></a>)</em></li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/fd81f1bfee9a566b8df8bb501828c20bd72ac481"><code>fd81f1b</code></a>
- more presto commands <em>(commit by <a
href="https://github.com/tobymao"><code>@​tobymao</code></a>)</em></li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/da48b68a4f1fa6a754fa2a0a789564675d59546f"><code>da48b68</code></a>
- Tokenize hints as comments <em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4426">#4426</a>
by <a
href="https://github.com/VaggelisD"><code>@​VaggelisD</code></a>)</em>
<ul>
<li>:arrow_lower_right: <em>fixes issue <a
href="https://redirect.github.com/tobymao/sqlglot/issues/4425">#4425</a>
opened by <a
href="https://github.com/mkmoisen"><code>@​mkmoisen</code></a></em></li>
</ul>
</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/69d4a8ccdf5954f293acbdf61c420b72dde5b8af"><code>69d4a8c</code></a>
- <strong>tsql</strong>: Map weekday to %w <em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4438">#4438</a>
by <a
href="https://github.com/VaggelisD"><code>@​VaggelisD</code></a>)</em>
<ul>
<li>:arrow_lower_right: <em>fixes issue <a
href="https://redirect.github.com/tobymao/sqlglot/issues/4435">#4435</a>
opened by <a
href="https://github.com/travispaice"><code>@​travispaice</code></a></em></li>
</ul>
</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/tobymao/sqlglot/commit/73afd0f435b7e7ccde831ee311c9a76c14797fdc"><code>73afd0f</code></a>
fix(bigquery): Make JSONPathTokenizer more lenient for new standards (<a
href="https://redirect.github.com/tobymao/sqlglot/issues/4447">#4447</a>)</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/954d8fd12740071e0951d1df3a405a4b9634868d"><code>954d8fd</code></a>
Fix: parse DEFAULT in VALUES clause into a Var (<a
href="https://redirect.github.com/tobymao/sqlglot/issues/4448">#4448</a>)</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/74dc39ba8649fd8292c97c82088b39b08f531702"><code>74dc39b</code></a>
docs: update API docs, CHANGELOG.md for v25.32.0 [skip ci]</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/fddcd3dfc264a645909686c201d2288c0adf9047"><code>fddcd3d</code></a>
Chore!: bump sqlglotrs to 0.3.0</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/6aea9f346ef8f91467e1d5da5a3f94cf862b44fe"><code>6aea9f3</code></a>
fix: Refactor NORMALIZE_FUNCTIONS flag usage (<a
href="https://redirect.github.com/tobymao/sqlglot/issues/4437">#4437</a>)</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/fe3539464a153b1c0bf46975d6221dee48a48f02"><code>fe35394</code></a>
Fix(optimizer)!: fix datetime coercion in the canonicalize rule (<a
href="https://redirect.github.com/tobymao/sqlglot/issues/4431">#4431</a>)</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/41d6a13ccfb28fbcf772fd43ea17da3b36567e67"><code>41d6a13</code></a>
fix: add return type (<a
href="https://redirect.github.com/tobymao/sqlglot/issues/4440">#4440</a>)</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/b24aced2dbb7e471d2dd0eb830ea4f2e24f9d267"><code>b24aced</code></a>
Refactor(snowflake): clean up [WITH] TAG property / constraint (<a
href="https://redirect.github.com/tobymao/sqlglot/issues/4439">#4439</a>)</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/37863ffd747cad9c2b9bed60119cc1551faeffda"><code>37863ff</code></a>
feat(snowflake): Transpile non-UNNEST exp.GenerateDateArray refs (<a
href="https://redirect.github.com/tobymao/sqlglot/issues/4433">#4433</a>)</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/69d4a8ccdf5954f293acbdf61c420b72dde5b8af"><code>69d4a8c</code></a>
fix(tsql): Map weekday to %w (<a
href="https://redirect.github.com/tobymao/sqlglot/issues/4438">#4438</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/tobymao/sqlglot/compare/v25.30.0...v25.32.1">compare
view</a></li>
</ul>
</details>
<br />

<details>
<summary>Most Recent Ignore Conditions Applied to This Pull
Request</summary>

| Dependency Name | Ignore Conditions |
| --- | --- |
| sqlglot | [>= 24.a, < 25] |
| sqlglot | [>= 25.31.dev0, < 25.32] |
</details>


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=sqlglot&package-manager=pip&previous-version=25.30.0&new-version=25.32.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: SundarShankar89 <[email protected]>
This PR adjusts the way transpilation results are checked during tests
so that the formatted values can be easily checked in a debugger.
(IntelliJ/PyCharm have better ways of comparing output than the
side-by-side view we display.)
* Added support for format_datetime function in presto to Databricks
([#1250](#1250)). A new
`format_datetime` function has been added to the `Parser` class in the
`presto.py` file to provide support for formatting datetime values in
Presto on Databricks. This function utilizes the
`DateFormat.from_arg_list` method from the `local_expression` module to
format datetime values according to a specified format string. To ensure
compatibility and consistency between Presto and Databricks, a new test
file `test_format_datetime_1.sql` has been added, containing SQL queries
that demonstrate the usage of the `format_datetime` function in Presto
and its equivalent in Databricks, `DATE_FORMAT`. This standalone change
adds new functionality without modifying any existing code.
* Added support for SnowFlake `SUBSTR`
([#1238](#1238)). This
commit enhances the library's SnowFlake support by adding the `SUBSTR`
function, which was previously unsupported and existed only as an
alternative to `SUBSTRING`. The project now fully supports both
functions, and the `SUBSTRING` function can be used interchangeably with
`SUBSTR` via the new `withConversionStrategy(SynonymOf("SUBSTR"))`
method. Additionally, this commit supersedes a previous pull request
that lacked a GPG signature and includes a test for the `SUBSTR`
function. The `ARRAY_SLICE` function has also been updated to match
SnowFlake's behavior, and the project now supports a more comprehensive
list of SQL functions with their corresponding arity.
* Added support for json_size function in presto
([#1236](#1236)). A new
`json_size` function for Presto has been added, which determines the
size of a JSON object or array and returns an integer. Two new methods,
`_build_json_size` and `get_json_object`, have been implemented to
handle JSON objects and arrays differently, and the Parser and Tokenizer
classes of the Presto class have been updated to include the new
json_size function. An alternative implementation for Databricks using
SQL functions is provided, and a test case is added to cover a fixed `is
not null` error for json_extract in the Databricks generator.
Additionally, a new test file for Presto has been added to test the
functionality of the `json_extract` function in Presto, and a new method
`GetJsonObject` is introduced to extract a JSON object from a given
path. The `json_extract` function has also been updated to extract the
value associated with a specified key from JSON data in both Presto and
Databricks.
* Enclosed subqueries in parenthesis
([#1232](#1232)). This
PR introduces changes to the ExpressionGenerator and
LogicalPlanGenerator classes to ensure that subqueries are correctly
enclosed in parentheses during code generation. Previously, subqueries
were not always enclosed in parentheses, leading to incorrect code. This
issue has been addressed by enclosing subqueries in parentheses in the
`in` and `scalarSubquery` methods, and by adding new match cases for
`ir.Filter` in the `LogicalPlanGenerator` class. The changes also take
care to avoid doubling enclosing parentheses in the `.. IN(SELECT...)`
pattern. New methods have not been added, and existing functionality has
been modified to ensure that subqueries are correctly enclosed in
parentheses, leading to the generation of correct SQL code. Test cases
have been included in a separate PR. These changes improve the
correctness of the generated code, avoiding issues such as `SELECT *
FROM SELECT * FROM t WHERE a > 'a' WHERE a > 'b'` and ensuring that the
generated code includes parentheses around subqueries.
* Fixed serialization of MultipleErrors
([#1177](#1177)). In the
latest release, the encoding of errors in the
`com.databricks.labs.remorph.coverage` package has been improved with an
update to the `encoders.scala` file. The change involves a fix for
serializing `MultipleErrors` instances using the `asJson` method on each
error instead of just the message. This modification ensures that all
relevant information about each error is included in the encoded output,
improving the accuracy of serialization for `MultipleErrors` class.
Users who handle multiple errors and require precise serialization
representation will benefit from this enhancement, as it guarantees
comprehensive information encoding for each error instance.
* Fixed presto strpos and array_average functions
([#1196](#1196)). This
PR introduces new classes `Locate` and `NamedStruct` in the
`local_expression.py` file to handle the `STRPOS` and `ARRAY_AVERAGE`
functions in a Databricks environment, ensuring compatibility with
Presto SQL. The `STRPOS` function, used to locate the position of a
substring within a string, now uses the `Locate` class and emits a
warning regarding differences in implementation between Presto and
Databricks SQL. A new method `_build_array_average` has been added to
handle the `ARRAY_AVERAGE` function in Databricks, which calculates the
average of an array, accommodating nulls, integers, and doubles. Two SQL
test cases have been added to demonstrate the use of the `ARRAY_AVERAGE`
function with arrays containing integers and doubles. These changes
promote compatibility and consistent behavior between Presto and
Databricks when dealing with `STRPOS` and `ARRAY_AVERAGE` functions,
enhancing the ability to migrate between the systems smoothly.
* Handled presto Unnest cross join to Databricks lateral view
([#1209](#1209)). This
release introduces new features and updates for handling Presto UNNEST
cross joins in Databricks, utilizing the lateral view feature. New
methods have been added to improve efficiency and robustness when
handling UNNEST cross joins. Additionally, new test cases have been
implemented for Presto and Databricks to ensure compatibility and
consistency between the two systems in handling UNNEST cross joins,
array construction and flattening, and parsing JSON data. Some
limitations and issues remain, which will be addressed in future work.
The acceptance tests have also been updated, with certain tests now
expected to pass, while others may still fail. This release aims to
improve the functionality and compatibility of Presto and Databricks
when handling UNNEST cross joins and JSON data.
* Implemented remaining TSQL set operations
([#1227](#1227)). This
pull request enhances the TSql parser by adding support for parsing and
converting the set operations `UNION [ALL]`, `EXCEPT`, and `INTERSECT`
to the Intermediate Representation (IR). Initially, the grammar
recognized these operations, but they were not being converted to the
IR. This change resolves issues
[#1126](#1126) and
[#1102](#1102) and
includes new unit, transpiler, and functional tests, ensuring the
correct behavior of these set operations, including precedence rules.
The commit also introduces a new test file, `union-all.sql`,
demonstrating the correct handling of simple `UNION ALL` operations,
ensuring consistent output across TSQL and Databricks SQL platforms.
* Supported multiple columns in order by clause for ARRAYAGG
([#1228](#1228)). This
commit enhances the ARRAYAGG and LISTAGG functions by adding support for
multiple columns in the order by clause and sorting in both ascending
and descending order. A new method, sortArray, has been introduced to
handle multiple sort orders. The changes also improve the functionality
of the ARRAYAGG function in the Snowflake dialect by supporting multiple
columns in the ORDER BY clause, with an optional DESC keyword for each
column. The `WithinGroupParams` dataclass has been updated in the local
expression module to include a list of tuples for the order columns and
their sorting direction. These changes provide increased flexibility and
control over the output of the ARRAYAGG and LISTAGG functions.
* Added TSQL parser support for `(LHS) UNION RHS` queries
([#1211](#1211)). In
this release, we have implemented support for a new form of UNION in the
TSQL parser, specifically for queries formatted as `(SELECT a from b)
UNION [ALL] SELECT x from y`. This allows the union of two SELECT
queries with an optional ALL keyword to include duplicate rows. The
implementation includes a new case statement in the
`TSqlRelationBuilder` class that handles this form of UNION, creating a
`SetOperation` object with the left-hand side and right-hand side of the
union, and an `is_all` flag based on the presence of the ALL keyword.
Additionally, we have added support for parsing right-associative UNION
clauses in TSQL queries, enhancing the flexibility and expressiveness of
the TSQL parser for more complex and nuanced queries. The commit also
includes new test cases to verify the correct translation of TSQL set
operations to Databricks SQL, resolving issue
[#1127](#1127). This
enhancement allows for more accurate parsing of TSQL queries that use
the UNION operator in various formats.
* Added support for inline columns in CTEs
([#1184](#1184)). In
this release, we have added support for inline columns in Common Table
Expressions (CTEs) in Snowflake across various components of our
open-source library. This includes updates to the AST (Abstract Syntax
Tree) for better TSQL translation and the introduction of the new case
class `KnownInterval` for handling intervals. We have also implemented a
new method, `DealiasInlineColumnExpressions`, in the
`SnowflakePlanParser` class to parse inline columns in CTEs and modify
the class constructor to include this new method. Additionally, a new
private case class `InlineColumnExpression` has been introduced to allow
for more efficient processing of Snowflake CTEs. The
SnowflakeToDatabricksTranspiler has also been updated to support inline
columns in CTEs, as demonstrated by a new test case. These changes
improve compatibility, precision, and usability of the codebase,
providing a better overall experience for software engineers working
with CTEs in Snowflake.
* Implemented AST for positional column identifiers
([#1181](#1181)). The
recent change introduces an Abstract Syntax Tree (AST) for positional
column identifiers in the Snowflake project, specifically in the
`ExpressionGenerator` class. The new `NameOrPosition` type represents a
column identifier, either by name or position. The `Id` and `Position`
classes inherit from `NameOrPosition`, and the `nameOrPosition` method
has been added to check and return the appropriate SQL representation.
However, due to Databricks' lack of positional column identifier
support, the generator side does not yet support this feature. This
means that the schema of the table is required to properly translate
queries involving positional column identifiers. This enhancement
increases the system's flexibility in handling Snowflake's query
structures, with the potential for more comprehensive generator-side
support in the future.
* Implemented GROUP BY ALL
([#1180](#1180)). The
`GROUP BY ALL` clause is now supported in the LogicalPlanGenerator class
of the remorph project, with the addition of a new case to handle the
GroupByAll type and updated implementation for the Pivot type. A new
case object called `GroupByAll` has been added to the relations.scala
file's sealed trait "GroupType". A new test case has been implemented in
the SnowflakeToDatabricksTranspilerTest class to check the correct
transpilation of the `GROUP BY ALL` clause from Snowflake SQL syntax to
Databricks SQL syntax. These changes allow for more flexibility and
control in grouping operations and enable the implementation of specific
functionality for the GROUP BY ALL clause in Snowflake, improving
compatibility with Snowflake SQL syntax.

Dependency updates:

* Bump codecov/codecov-action from 4 to 5
([#1210](#1210)).
* Bump sqlglot from 25.30.0 to 25.32.1
([#1254](#1254)).
This PR patches our ability to install `remorph transpile` from a
branch.
Fix modifies to run upgrade only when recon_config.
This PR sets up EditorConfig for the project. IntelliJ/PyCharm will pick
this up automatically, as will many other editors.

In addition to the `.editorconfig` file, this PR updates existing files
so that they:

 - have no trailing whitespace;
 - have an EOL at the end of the file.
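
The two whitespace rules above can also be checked mechanically. The following is a minimal Python sketch, not part of this PR; the helper name `normalize_whitespace` is an assumption made for illustration.

```python
def normalize_whitespace(text: str) -> str:
    """Strip trailing whitespace from each line and end every line with an EOL."""
    # splitlines() drops the line terminators, so each line is re-terminated below.
    lines = [line.rstrip() for line in text.splitlines()]
    return "".join(line + "\n" for line in lines)
```

A file already satisfies both rules when `normalize_whitespace(content) == content`.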
… test (#1265)

This PR tidies up a test by removing case class arguments that specify
the default value. Although tests may wish to be explicit about this,
none of these tests seem to be related to the argument so this cuts down
on noise, and eliminates a bunch of IDE warnings.

An incidental change is the chopping down of a long line.
)

This PR updates the TSQL grammar and accompanying IR processing so that
the precedence of `INTERSECT` is handled by the grammar instead of
during transformation to our IR.
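
To illustrate what handling precedence in the grammar means, here is a small Python sketch of a two-level recursive-descent parser in which `INTERSECT` sits one rule deeper than `UNION` and `EXCEPT`, and therefore binds more tightly. It is not the project's ANTLR grammar: the function name, the tuple-based tree, and the use of bare identifiers in place of full `SELECT` statements are all assumptions, and `UNION ALL` and parentheses are left out.

```python
def parse_set_expression(tokens: list[str]):
    """Parse e.g. ['a', 'UNION', 'b', 'INTERSECT', 'c'] into nested tuples.

    Grammar (INTERSECT is one level deeper, so it binds tighter):
        set_expr  := intersect (("UNION" | "EXCEPT") intersect)*
        intersect := operand ("INTERSECT" operand)*
    """
    pos = 0

    def operand():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def intersect():
        nonlocal pos
        left = operand()
        while pos < len(tokens) and tokens[pos] == "INTERSECT":
            pos += 1
            left = ("INTERSECT", left, operand())
        return left

    def set_expr():
        nonlocal pos
        left = intersect()
        while pos < len(tokens) and tokens[pos] in ("UNION", "EXCEPT"):
            op = tokens[pos]
            pos += 1
            left = (op, left, intersect())
        return left

    return set_expr()
```

Because the tree shape already encodes the precedence, no later pass over the IR has to reorder the operations.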

Relates #1255.
This PR replaces all private fields with either:

 - `private[this]`; 
 - or the intended scope.

This is recommended by two guidelines in the style guide:

-
https://github.com/databricks/scala-style-guide?tab=readme-ov-file#private-fields
-
https://github.com/databricks/scala-style-guide?tab=readme-ov-file#privatethis

(Aside from these justifications, my preferred reason is that by
inspection I don't need to go looking for cross-instance access, which
is normally a code smell.)

Some incidental changes:

- A few field initialisers for mutable types were updated to assign the
fully-initialised objects instead of partially-initialised instances
that are then mutated to their intended state.
…1273)

This PR fixes the handling of CTEs with respect to set operations; prior
to this the CTE was associated with only the first `SELECT` in the
sequence of set operations instead of the entire set.

Incidental changes:

- Additional functional tests (that already passed) for a few of the
CTEs.
- Equivalent functional tests for TSQL to verify the behaviour was
already correct.

Resolves: #1267
A preprocessor and workflow to allow conversion of embedded SQL in Jinja
templates while preserving the templates and, as much as possible, the
original text format.

See the preprocessors/README.md for full details of the implementation.
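
As a rough illustration of the general idea only, the Python sketch below swaps each template element for a placeholder identifier, lets a converter run, and then restores the original text. This is a generic technique and not necessarily how the preprocessor in this PR works; `protect_templates`, `restore_templates`, and the `__JINJA_nnnn__` placeholder format are hypothetical names.

```python
import re

# Matches Jinja expressions {{ ... }}, statements {% ... %} and comments {# ... #}.
TEMPLATE_PATTERN = re.compile(r"\{\{.*?\}\}|\{%.*?%\}|\{#.*?#\}", re.DOTALL)


def protect_templates(source: str) -> tuple[str, dict[str, str]]:
    """Replace every template element with a placeholder and remember the original."""
    saved: dict[str, str] = {}

    def stash(match: re.Match) -> str:
        key = f"__JINJA_{len(saved):04d}__"
        saved[key] = match.group(0)
        return key

    return TEMPLATE_PATTERN.sub(stash, source), saved


def restore_templates(sql: str, saved: dict[str, str]) -> str:
    """Put the original template elements back in place of their placeholders."""
    for key, original in saved.items():
        sql = sql.replace(key, original)
    return sql
```

The sketch assumes the converter passes the placeholders through unchanged; a converter that lower-cases or quotes identifiers would need a more robust placeholder scheme.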

Note that this PR implements about 95% of the use cases for templates,
but there are likely other places to support them, such as `if`
templates in expression lists. But we will now address these on a case
by case basis through testing.

NOTE: This PR only supports templates in TSQL, but it is already too
big, so Snowflake support will be added in a separate PR, which is now
trivial to do, but would add another 70+ lines of code changes to this
PR.

---------

Co-authored-by: Valentin Kasas <[email protected]>
)

This PR simply marks an existing functional test for Snowflake that
isn't correctly verifying the handling of `RANDOM()`.

Relates: #1280
These settings are applied automatically.
This PR is stacked upon #1273 (and should not be merged prior to it) and
fixes a problem whereby, for Snowflake, set operations within a CTE were
not being correctly processed to the IR. In this situation the CTE was
dropping everything from the first set operation onwards.

Resolves: #1272 

Testing:

 - [X] Updated existing unit test.
 - [x] New transpilation test.
 - [X] Functional tests.
Implements DealiasLCAs, a transformation rule that replaces LCAs by the
underlying alias expression whenever it's not supported by Databricks.

Replicates in Scala the tests from tests/unit/snow/test_lca_utils.py

Suffers from various limitations (already present in the Python
implementation):
- may call the same function multiple times, which may be a problem if
the function is not idempotent
- may introduce precedence issues if an alias involves more than 2
expressions
- does not raise warnings when the above is encountered
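
The core of the rule, and the first limitation, can be shown with a toy Python sketch. It is not the Scala implementation: expressions are modelled as nested tuples such as `("col", "x")` or `("call", "f", ...)`, the names `substitute` and `dealias_lcas` are hypothetical, and the sketch replaces every alias reference rather than only the ones Databricks does not support.

```python
def substitute(expr, aliases):
    """Replace column references that name a known alias with the aliased expression."""
    if isinstance(expr, tuple):
        if expr[0] == "col" and expr[1] in aliases:
            return aliases[expr[1]]
        return tuple(substitute(part, aliases) for part in expr)
    return expr  # node tags, names and literal values are left untouched


def dealias_lcas(select_items, where=None):
    """select_items is a list of (expression, alias or None) in SELECT order."""
    aliases = {}
    new_items = []
    for expr, alias in select_items:
        expr = substitute(expr, aliases)  # only aliases defined earlier are visible
        new_items.append((expr, alias))
        if alias is not None:
            aliases[alias] = expr
    new_where = substitute(where, aliases) if where is not None else None
    return new_items, new_where
```

For `SELECT f(x) AS y, y + 1 AS z`, the second item becomes `f(x) + 1`, so `f` now appears twice in the query, which is the repeated-call limitation listed above.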

Fixes #1222

Supersedes #1225 
Supersedes #1233

---------

Co-authored-by: Valentin Kasas <[email protected]>
Co-authored-by: Andrew Snare <[email protected]>
Co-authored-by: Andrew Snare <[email protected]>
Added Dependabot for Maven dependency changes
ericvergnaud and others added 12 commits December 19, 2024 12:21
Running unit tests locally creates files in the source hierarchy, which
need to be removed each time before pushing to git.
This PR fixes the issue.
This PR enhances the CLI as follows:
- adds a `transpiler-config-path` arg to the CLI that points to a
pluggable transpiler configuration file
- defaults the `transpiler-config-path` arg to 'sqlglot' such that
built-in transpilers can still be invoked during the transition phase
- makes the 'source-dialect' optional; it is only required if the
transpiler supports more than 1 dialect

It introduces an LSPEngine, which acts as an alternative to the existing
SqlglotEngine. It is not operational yet: this PR is about routing CLI
calls, not executing them.

The LSPEngine is configured from the configuration stored at the above
'transpiler' path.
The configuration file itself is a yaml file with 2 sections:
```
remorph:
  version: 1
  dialects:
    - snowflake
  environment:
    - SOME_ENV: abc
  command_line:
    - python
    - lsp_server.py
custom:
  whatever: xyz

```
The `remorph` section is versioned and must strictly follow our spec.
The `custom` section is free-style, for use by the LSP server itself.

Both engines are loaded by the CLI, and used throughout the code, for
both `transpile` and `lineage` methods.

Requires #1345 

Fixes #1298

---------

Co-authored-by: Guenia Izquierdo Delgado <[email protected]>
* main:
  Create launcher from cli (#1315)
  fix issue where unit tests would create files not ignored by git (#1381)

# Conflicts:
#	README.md
#	labs.yml
#	src/databricks/labs/remorph/cli.py
#	src/databricks/labs/remorph/config.py
#	src/databricks/labs/remorph/install.py
#	src/databricks/labs/remorph/transpiler/lsp/lsp_engine.py
#	src/databricks/labs/remorph/transpiler/sqlglot/sqlglot_engine.py
#	src/databricks/labs/remorph/transpiler/transpile_engine.py
#	tests/unit/conftest.py
#	tests/unit/contexts/test_application.py
#	tests/unit/deployment/test_installation.py
#	tests/unit/test_cli_transpile.py
#	tests/unit/test_install.py
#	tests/unit/transpiler/test_execute.py
…e/multiplexer/perform-checks-within-transpile

* feature/multiplexer/simplify-transpile-api:
  Create launcher from cli (#1315)
  fix issue where unit tests would create files not ignored by git (#1381)
…to feature/multiplexer/transpile-using-lsp

* feature/multiplexer/perform-checks-within-transpile:
  Create launcher from cli (#1315)
  fix issue where unit tests would create files not ignored by git (#1381)
Our current `transpile` APIs expose many types of return values, making
it difficult to condense the logic (as required by LSP).

This PR fixes the issue by providing a common base type for all
transpile errors.
It also clarifies related structures by renaming them or their fields.
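
The benefit of a common base type can be sketched in a few lines of Python. All class and field names below are hypothetical and chosen for illustration; they are not taken from the PR.

```python
from dataclasses import dataclass, field


@dataclass
class TranspileError:
    """Hypothetical common base type: every failure has a file path and a message."""
    file_path: str
    message: str


@dataclass
class ParserError(TranspileError):
    """Hypothetical subtype: the source could not be parsed."""


@dataclass
class ValidationError(TranspileError):
    """Hypothetical subtype: the transpiled output failed validation."""


@dataclass
class TranspileResult:
    transpiled_code: str
    errors: list[TranspileError] = field(default_factory=list)

    def summary(self) -> dict[str, int]:
        """Count errors per concrete type without caring which kinds exist."""
        counts: dict[str, int] = {}
        for error in self.errors:
            name = type(error).__name__
            counts[name] = counts.get(name, 0) + 1
        return counts
```

With a single base type, a caller can carry one list of errors in one response instead of handling several unrelated return shapes.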

Progresses #1299

Requires #1315
Our current implementation performs pre-checks (such as
`check_for_unsupported_lca`) separately from the `transpile` itself.
This puts too much responsibility on `remorph`; the checks should be done by the
`TranspileEngine`.
This also creates complexity for Pluggable Transpilers, which need to
implement several methods instead of just a `transpile` one.

This PR fixes the issue.
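
A minimal Python sketch of the resulting shape, in which the engine exposes a single `transpile` method and runs its own checks internally. The signature, the `UppercaseEngine` stand-in, and the `_pre_checks` helper are assumptions for illustration, not the project's real interface.

```python
from abc import ABC, abstractmethod


class TranspileEngine(ABC):
    """Hypothetical interface: callers only ever invoke `transpile`."""

    @abstractmethod
    def transpile(self, source_dialect: str, source_code: str) -> tuple[str, list[str]]:
        """Return the transpiled code plus any error messages."""


class UppercaseEngine(TranspileEngine):
    """Toy engine that upper-cases its input, standing in for a real transpiler."""

    def transpile(self, source_dialect: str, source_code: str) -> tuple[str, list[str]]:
        errors = self._pre_checks(source_code)  # checks live inside the engine
        if errors:
            return "", errors
        return source_code.upper(), []

    def _pre_checks(self, source_code: str) -> list[str]:
        return ["empty input"] if not source_code.strip() else []
```

A pluggable transpiler then has one method to implement, and the caller no longer needs to know which pre-checks a given engine requires.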

Progresses #1299

Requires #1361
Base automatically changed from feature/multiplexer/perform-checks-within-transpile to main December 20, 2024 17:41
* main:
  Perform checks during transpile (#1364)
  Simplify transpiler api (#1361)

# Conflicts:
#	src/databricks/labs/remorph/transpiler/execute.py
#	src/databricks/labs/remorph/transpiler/lsp/lsp_engine.py
#	src/databricks/labs/remorph/transpiler/sqlglot/sqlglot_engine.py
#	src/databricks/labs/remorph/transpiler/transpile_engine.py
#	tests/unit/transpiler/test_sqlglot_engine.py
@ericvergnaud

@gueniai the `python-no-pylint-disable` CI job fails as expected. All these breaches are required; see inline comments.

ericvergnaud and others added 6 commits December 20, 2024 18:56
Bumps [sqlglot](https://github.com/tobymao/sqlglot) from 26.0.0 to
26.0.1.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/tobymao/sqlglot/blob/main/CHANGELOG.md">sqlglot's
changelog</a>.</em></p>
<blockquote>
<h2>[v26.0.1] - 2024-12-18</h2>
<h3>:sparkles: New Features</h3>
<ul>
<li><a
href="https://github.com/tobymao/sqlglot/commit/5d3ee4cac1c5c9e45cbf6263c32c87fda78f9854"><code>5d3ee4c</code></a>
- <strong>snowflake</strong>: transpile date subtraction <em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4506">#4506</a>
by <a
href="https://github.com/georgesittas"><code>@​georgesittas</code></a>)</em>
<ul>
<li>:arrow_lower_right: <em>addresses issue <a
href="https://redirect.github.com/tobymao/sqlglot/issues/4485">#4485</a>
opened by <a
href="https://github.com/cisenbe"><code>@​cisenbe</code></a></em></li>
</ul>
</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/efeb4bd870dd5c017b31d6b95c9bd6311c75b9ae"><code>efeb4bd</code></a>
- <strong>postgres</strong>: add support for XMLELEMENT <em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4513">#4513</a>
by <a
href="https://github.com/georgesittas"><code>@​georgesittas</code></a>)</em>
<ul>
<li>:arrow_lower_right: <em>addresses issue <a
href="https://redirect.github.com/tobymao/sqlglot/issues/4512">#4512</a>
opened by <a
href="https://github.com/fresioAS"><code>@​fresioAS</code></a></em></li>
</ul>
</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/e495777b8612866041050c96d3df700cd829dc9c"><code>e495777</code></a>
- <strong>clickhouse</strong>: add support for bracket map syntax
<em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4528">#4528</a>
by <a
href="https://github.com/georgesittas"><code>@​georgesittas</code></a>)</em>
<ul>
<li>:arrow_lower_right: <em>addresses issue <a
href="https://redirect.github.com/tobymao/sqlglot/issues/4527">#4527</a>
opened by <a
href="https://github.com/mrcljx"><code>@​mrcljx</code></a></em></li>
</ul>
</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/cc44ed73fa4489e0bcb457b7eae8a9772415db65"><code>cc44ed7</code></a>
- <strong>mysql</strong>: Support SERIAL data type <em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4533">#4533</a>
by <a
href="https://github.com/VaggelisD"><code>@​VaggelisD</code></a>)</em>
<ul>
<li>:arrow_lower_right: <em>addresses issue <a
href="https://redirect.github.com/tobymao/sqlglot/issues/4529">#4529</a>
opened by <a
href="https://github.com/AhlamHani"><code>@​AhlamHani</code></a></em></li>
</ul>
</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/ee7dc966d533228756c3294c66422c27eceae503"><code>ee7dc96</code></a>
- <strong>starrocks</strong>: add partition by range and unique key
<em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4509">#4509</a>
by <a
href="https://github.com/pickfire"><code>@​pickfire</code></a>)</em></li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/84ec47810e0a5c9e71a2b48e686656f9c2eafb39"><code>84ec478</code></a>
- <strong>lineage</strong>: Extend lineage function to work with pivot
operation <em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4471">#4471</a>
by <a
href="https://github.com/step4"><code>@​step4</code></a>)</em></li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/52c8374876bc4037dcb81a50301fdd62cb14bb2a"><code>52c8374</code></a>
- include comments in gen <em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4535">#4535</a>
by <a
href="https://github.com/georgesittas"><code>@​georgesittas</code></a>)</em></li>
</ul>
<h3>:bug: Bug Fixes</h3>
<ul>
<li><a
href="https://github.com/tobymao/sqlglot/commit/8f8e84ae81d60bea224e35b9ca88b0bb4a59512b"><code>8f8e84a</code></a>
- <strong>snowflake</strong>: bitxor third parameter(padside) issue
<em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4501">#4501</a>
by <a
href="https://github.com/ankur334"><code>@​ankur334</code></a>)</em></li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/476024653e5b942faaaaa2b3bce30a3ea1873190"><code>4760246</code></a>
- <strong>snowflake</strong>: generate only one INPUT =&gt; clause in
unnest_sql <em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4505">#4505</a>
by <a
href="https://github.com/georgesittas"><code>@​georgesittas</code></a>)</em>
<ul>
<li>:arrow_lower_right: <em>fixes issue <a
href="https://redirect.github.com/tobymao/sqlglot/issues/4503">#4503</a>
opened by <a
href="https://github.com/harounp"><code>@​harounp</code></a></em></li>
</ul>
</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/7649d5053e3305dadb83769bb5cec52ed8235a19"><code>7649d50</code></a>
- <strong>optimizer</strong>: only expand stars for select scopes
<em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4515">#4515</a>
by <a
href="https://github.com/georgesittas"><code>@​georgesittas</code></a>)</em>
<ul>
<li>:arrow_lower_right: <em>fixes issue <a
href="https://redirect.github.com/tobymao/sqlglot/issues/4514">#4514</a>
opened by <a
href="https://github.com/florian-ernst-alan"><code>@​florian-ernst-alan</code></a></em></li>
</ul>
</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/2b68b9b7967b68465042a1b8c2ee21bb30007712"><code>2b68b9b</code></a>
- <strong>snowflake</strong>: Allow alias expansion inside JOIN
statements <em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4504">#4504</a>
by <a
href="https://github.com/florian-ernst-alan"><code>@​florian-ernst-alan</code></a>)</em>
<ul>
<li>:arrow_lower_right: <em>fixes issue <a
href="https://redirect.github.com/tobymao/sqlglot/issues/4502">#4502</a>
opened by <a
href="https://github.com/florian-ernst-alan"><code>@​florian-ernst-alan</code></a></em></li>
</ul>
</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/e15cd0be1c66e0e72d9815575fa9b210e66cf7c9"><code>e15cd0b</code></a>
- <strong>postgres</strong>: generate float if the type has precision
<em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4516">#4516</a>
by <a
href="https://github.com/georgesittas"><code>@​georgesittas</code></a>)</em>
<ul>
<li>:arrow_lower_right: <em>fixes issue <a
href="https://redirect.github.com/tobymao/sqlglot/issues/4508">#4508</a>
opened by <a
href="https://github.com/RedTailedHawk"><code>@​RedTailedHawk</code></a></em></li>
</ul>
</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/98906d4520a0c582a0534384ee3d0c1449846ee6"><code>98906d4</code></a>
- another interval parsing edge case <em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4519">#4519</a>
by <a
href="https://github.com/georgesittas"><code>@​georgesittas</code></a>)</em>
<ul>
<li>:arrow_lower_right: <em>fixes issue <a
href="https://redirect.github.com/tobymao/sqlglot/issues/4490">#4490</a>
opened by <a
href="https://github.com/fuglaeff"><code>@​fuglaeff</code></a></em></li>
</ul>
</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/992f6e9fc867aa5ad60a255be593b8982a0fbcba"><code>992f6e9</code></a>
- <strong>tsql</strong>: Convert exp.Neg literal to number through
to_py() <em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4523">#4523</a>
by <a
href="https://github.com/VaggelisD"><code>@​VaggelisD</code></a>)</em>
<ul>
<li>:arrow_lower_right: <em>fixes issue <a
href="https://redirect.github.com/tobymao/sqlglot/issues/4520">#4520</a>
opened by <a
href="https://github.com/DzianisKryvasheya"><code>@​DzianisKryvasheya</code></a></em></li>
</ul>
</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/946cd4234a2ca403785b7c6a026a39ef604e8754"><code>946cd42</code></a>
- <strong>optimizer</strong>: qualify snowflake queries with
<code>level</code> pseudocolumn <em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4524">#4524</a>
by <a
href="https://github.com/georgesittas"><code>@​georgesittas</code></a>)</em>
<ul>
<li>:arrow_lower_right: <em>fixes issue <a
href="https://redirect.github.com/tobymao/sqlglot/issues/4518">#4518</a>
opened by <a
href="https://github.com/florian-ernst-alan"><code>@​florian-ernst-alan</code></a></em></li>
</ul>
</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/bc68289d4d368b29241e56b8f0aefc36db65ad47"><code>bc68289</code></a>
- <strong>planner</strong>: ensure aggregate variable is bound <em>(PR
<a
href="https://redirect.github.com/tobymao/sqlglot/pull/4526">#4526</a>
by <a
href="https://github.com/georgesittas"><code>@​georgesittas</code></a>)</em>
<ul>
<li>:arrow_lower_right: <em>fixes issue <a
href="https://redirect.github.com/tobymao/sqlglot/issues/4525">#4525</a>
opened by <a
href="https://github.com/EyalDlph"><code>@​EyalDlph</code></a></em></li>
</ul>
</li>
</ul>
<h3>:recycle: Refactors</h3>
<ul>
<li><a
href="https://github.com/tobymao/sqlglot/commit/cd6e00f55195e26c3d02e255e66b45ab781addad"><code>cd6e00f</code></a>
- clean up pivot lineage <em>(PR <a
href="https://redirect.github.com/tobymao/sqlglot/pull/4534">#4534</a>
by <a
href="https://github.com/georgesittas"><code>@​georgesittas</code></a>)</em></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/tobymao/sqlglot/commit/52c8374876bc4037dcb81a50301fdd62cb14bb2a"><code>52c8374</code></a>
Feat: include comments in gen (<a
href="https://redirect.github.com/tobymao/sqlglot/issues/4535">#4535</a>)</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/cd6e00f55195e26c3d02e255e66b45ab781addad"><code>cd6e00f</code></a>
Refactor: clean up pivot lineage (<a
href="https://redirect.github.com/tobymao/sqlglot/issues/4534">#4534</a>)</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/84ec47810e0a5c9e71a2b48e686656f9c2eafb39"><code>84ec478</code></a>
feat(lineage): Extend lineage function to work with pivot operation (<a
href="https://redirect.github.com/tobymao/sqlglot/issues/4471">#4471</a>)</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/392f99bdf823afd745162c2a6245f175cab4fc9c"><code>392f99b</code></a>
Clean up starrocks properties</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/ee7dc966d533228756c3294c66422c27eceae503"><code>ee7dc96</code></a>
feat(starrocks): add partition by range and unique key (<a
href="https://redirect.github.com/tobymao/sqlglot/issues/4509">#4509</a>)</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/cc44ed73fa4489e0bcb457b7eae8a9772415db65"><code>cc44ed7</code></a>
feat(mysql): Support SERIAL data type (<a
href="https://redirect.github.com/tobymao/sqlglot/issues/4533">#4533</a>)</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/e495777b8612866041050c96d3df700cd829dc9c"><code>e495777</code></a>
Feat(clickhouse): add support for bracket map syntax (<a
href="https://redirect.github.com/tobymao/sqlglot/issues/4528">#4528</a>)</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/bc68289d4d368b29241e56b8f0aefc36db65ad47"><code>bc68289</code></a>
Fix(planner): ensure aggregate variable is bound (<a
href="https://redirect.github.com/tobymao/sqlglot/issues/4526">#4526</a>)</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/946cd4234a2ca403785b7c6a026a39ef604e8754"><code>946cd42</code></a>
Fix(optimizer): qualify snowflake queries with <code>level</code>
pseudocolumn (<a
href="https://redirect.github.com/tobymao/sqlglot/issues/4524">#4524</a>)</li>
<li><a
href="https://github.com/tobymao/sqlglot/commit/992f6e9fc867aa5ad60a255be593b8982a0fbcba"><code>992f6e9</code></a>
fix(tsql): Convert exp.Neg literal to number through to_py() (<a
href="https://redirect.github.com/tobymao/sqlglot/issues/4523">#4523</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/tobymao/sqlglot/compare/v26.0.0...v26.0.1">compare
view</a></li>
</ul>
</details>
<br />

<details>
<summary>Most Recent Ignore Conditions Applied to This Pull
Request</summary>

| Dependency Name | Ignore Conditions |
| --- | --- |
| sqlglot | [>= 24.a, < 25] |
| sqlglot | [>= 25.31.dev0, < 25.32] |
</details>


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=sqlglot&package-manager=pip&previous-version=26.0.0&new-version=26.0.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)


Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: SundarShankar89 <[email protected]>
…e-using-lsp

* fix-pytest-cov-crash:
  fix issue by moving coverage config from pyproject.toml to .coveragerc
        response = await self.transpile_document(file_path)
        self.close_document(file_path)
        transpiled_code = ChangeManager.apply(source_code, response.changes)
        return TranspileResult(transpiled_code, 1, [])

    def analyse_table_lineage(

Maybe this is not an implementation we expect out of multiplexer itself.

 class LSPEngine(TranspileEngine):

     @classmethod
     def from_config_path(cls, config_path: Path) -> LSPEngine:
         config, custom = cls._load_config(config_path)
-        return LSPEngine(config, custom)
+        return LSPEngine(config_path.parent, config, custom)

     @classmethod
     def _load_config(cls, config_path: Path) -> tuple[_LSPRemorphConfigV1, dict[str, Any]]:

TODO: Once this is stable, create it as part of the application context and use Blueprint's `Installation.load_local` to parse the config file.

@sundarshankar89 sundarshankar89 requested review from a team and bishwajit-db December 30, 2024 12:42
@ericvergnaud ericvergnaud deleted the feature/multiplexer/transpile-using-lsp branch January 6, 2025 10:04
@ericvergnaud ericvergnaud mentioned this pull request Jan 6, 2025
gueniai added a commit that referenced this pull request Jan 15, 2025
This PR implements `LSPEngine`, a `TranspileEngine` that leverages the
Language Server Protocol to launch and communicate with a pluggable
transpiler, implemented as an LSP server.

In our prototype, we used the existing LSP `CodeAction` mechanism with
`CodeActionKind.Refactor`. This design works well when using VSCode as a
client, but is not ideal for batch transpilation:
- it does not provide a deterministic way to identify the command for
transpiling to Databricks
- it requires two interactions with the server for each file (one to
fetch the command, one to execute it)
- the transpile result is sent asynchronously, which complicates the
client engine's job
- the transpile errors are sent asynchronously, which complicates the
client engine's job
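
To make the two-interaction cost concrete, here is a rough sketch of the
requests a batch client would need per file under the `CodeAction` design.
`textDocument/codeAction` and `workspace/executeCommand` are standard LSP
methods; the helper names, the file URI and the title-matching heuristic
are hypothetical and are not taken from the prototype.

```python
import itertools
from typing import Optional

_ids = itertools.count(1)


def request(method: str, params: dict) -> dict:
    """Build a JSON-RPC 2.0 request, the message shape LSP uses."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}


def code_action_request(uri: str, last_line: int) -> dict:
    # Interaction 1: ask which refactorings the server offers for the file.
    whole_file = {
        "start": {"line": 0, "character": 0},
        "end": {"line": last_line, "character": 0},
    }
    return request("textDocument/codeAction", {
        "textDocument": {"uri": uri},
        "range": whole_file,
        "context": {"diagnostics": [], "only": ["refactor"]},
    })


def pick_transpile_command(actions: list) -> Optional[dict]:
    # Nothing in the protocol marks "the" transpile action, so the client
    # has to guess. Matching on the title is a hypothetical heuristic.
    for action in actions:
        command = action.get("command")
        if isinstance(command, dict) and "databricks" in action.get("title", "").lower():
            return command
    return None


def execute_command_request(command: dict) -> dict:
    # Interaction 2: run the command found in the first response.
    return request("workspace/executeCommand", {
        "command": command["command"],
        "arguments": command.get("arguments", []),
    })
```

Even after the second request returns, the client would still have to
wait for the edits and diagnostics that the server sends separately,
which is the asynchronous part described above.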

For those reasons, this PR chooses a different design, relying on LSP
custom capabilities.
The `LSPEngine` supports a `document/transpileToDatabricks` capability
that the server must register during initialization.
Once registration is done, the `LSPEngine` can safely invoke that
capability, which returns all the results (changes and errors) in a
single response.
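
As an illustration of that flow, the sketch below tracks registrations
received through the standard `client/registerCapability` request and
only then builds the transpile request, framed with the LSP base
protocol `Content-Length` header. The class and function names, the
request params (`{"uri": ...}`) and the result keys (`changes`,
`errors`) are assumptions made for this example; they are not the
actual `LSPEngine` code or wire format.

```python
import json

# Method name taken from the description above.
TRANSPILE_METHOD = "document/transpileToDatabricks"


class CapabilityRegistry:
    """Records the methods a server registered via client/registerCapability."""

    def __init__(self) -> None:
        self._methods = set()

    def on_register_capability(self, params: dict) -> None:
        # Standard LSP RegistrationParams: {"registrations": [{"id", "method"}]}
        for registration in params.get("registrations", []):
            self._methods.add(registration["method"])

    def supports(self, method: str) -> bool:
        return method in self._methods


def frame(message: dict) -> bytes:
    """Encode a JSON-RPC message with the LSP Content-Length header."""
    body = json.dumps(message).encode("utf-8")
    return f"Content-Length: {len(body)}\r\n\r\n".encode("ascii") + body


def transpile_request(registry: CapabilityRegistry, request_id: int, uri: str) -> bytes:
    if not registry.supports(TRANSPILE_METHOD):
        raise RuntimeError(f"server did not register {TRANSPILE_METHOD}")
    # Hypothetical params shape: only the document URI is sent.
    return frame({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": TRANSPILE_METHOD,
        "params": {"uri": uri},
    })


def split_result(result: dict) -> tuple:
    # Hypothetical result shape: changes and errors arrive together.
    return result.get("changes", []), result.get("errors", [])
```

Because both lists come back in the one response, a batch client can
process each file with a single awaited call.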

This mechanism has been tested successfully against a test LSPServer
which, for transpiling, simply converts the content to uppercase.
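
A minimal sketch of what such a round trip could look like, assuming
the server answers with one LSP-style `TextEdit` covering the whole
document and the client applies it to the source. `uppercase_edit` and
`apply_edits` are hypothetical stand-ins, not the test server or the
`ChangeManager` from this PR. The sketch also treats `character` as a
Python string index and assumes `\n` line endings, which is a
simplification of the LSP position rules.

```python
def uppercase_edit(text: str) -> dict:
    """Toy transpiler: one edit replacing the whole document, uppercased."""
    lines = text.split("\n")
    end = {"line": len(lines) - 1, "character": len(lines[-1])}
    return {
        "range": {"start": {"line": 0, "character": 0}, "end": end},
        "newText": text.upper(),
    }


def _offset(text: str, position: dict) -> int:
    """Convert a line/character position into an index into `text`."""
    lines = text.split("\n")
    preceding = sum(len(line) + 1 for line in lines[:position["line"]])
    return preceding + position["character"]


def apply_edits(text: str, edits: list) -> str:
    """Apply non-overlapping edits, last one first so offsets stay valid."""
    ordered = sorted(
        edits,
        key=lambda e: (e["range"]["start"]["line"], e["range"]["start"]["character"]),
        reverse=True,
    )
    for edit in ordered:
        start = _offset(text, edit["range"]["start"])
        end = _offset(text, edit["range"]["end"])
        text = text[:start] + edit["newText"] + text[end:]
    return text


source = "select a\nfrom t\n"
transpiled = apply_edits(source, [uppercase_edit(source)])
```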

Note that this PR works around a limitation (possibly a bug) in the
`lsprotocol` Python library, where custom capability messages are
incorrectly serialized. This is considered tech debt and is tracked in
#1378

Fixes #1299 
Requires #1364 
Requires #1390
Supersedes #1354

---------

Co-authored-by: Guenia Izquierdo Delgado <[email protected]>
Development

Successfully merging this pull request may close these issues.

[FEATURE]: Multiplexer - have a pluggable transpiler transpile a file
8 participants