Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve implementation of storage of transpiled files #1409

Open
wants to merge 42 commits into
base: main
Choose a base branch
from

Conversation

ericvergnaud
Copy link
Contributor

Our current implementation already stores transpiled files.
However, there is no testing.
Also the corresponding business logic is spread across functions with similar names.

This PR:

  • factorizes test code
  • checks existence of generated files when transpiling a directory
  • refactors transpile functions for improved clarity

Requires #1405 and #1406
Fixes #1302
Supersedes #1392

files: list[Path],
) -> tuple[int, list[TranspileError]]:

output_folder = config.output_folder
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved this logic upwards

output_folder = input_source.parent / "transpiled"
make_dir(output_folder)
for source_dir, _, files in dir_walk(input_source):
relative_path = cast(Path, source_dir).relative_to(input_source)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'relative_path' clarifies the intent

return TranspileStatus([config.input_path], no_of_sqls, error_list)


@timeit
async def transpile(
workspace_client: WorkspaceClient, engine: TranspileEngine, config: TranspileConfig
) -> tuple[list[dict[str, Any]], list[TranspileError]]:
) -> tuple[dict[str, Any], list[TranspileError]]:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return a dict, we never have less or more than 1 status item

@@ -153,7 +146,7 @@ async def transpile(

async def _do_transpile(
workspace_client: WorkspaceClient, engine: TranspileEngine, config: TranspileConfig
) -> tuple[list[dict[str, Any]], list[TranspileError]]:
) -> tuple[dict[str, Any], list[TranspileError]]:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return a dict, we never have less or more than 1 status item

status = {
"total_files_processed": len(result.file_list),
"total_queries_processed": result.no_of_transpiled_queries,
"analysis_error_count": result.analysis_error_count,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

align names

@@ -80,7 +80,7 @@ async def transpile(
read_dialect = get_dialect(source_dialect)
error: TranspileError | None = self._check_supported(read_dialect, source_code, file_path)
if error:
return TranspileResult(str(file_path), 1, [error])
return TranspileResult(source_code, 1, [error])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix silent bug

skip_validation,
catalog_name,
schema_name,
mode,
)


def test_transpile_prints_errors(capsys, tmp_path, mock_workspace_client_cli):
def test_transpile_prints_errors(capsys, output_folder, error_file, mock_workspace_client_cli):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use fixtures to deduplicate test code and guarantee cleanup

@@ -31,260 +31,111 @@ def transpile(workspace_client: WorkspaceClient, engine: SqlglotEngine, config:
return asyncio.run(do_transpile(workspace_client, engine, config))


def safe_remove_dir(dir_path: Path):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved to conftest

return input_dir


def test_with_dir_skip_validation(initial_setup, mock_workspace_client):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed duplicate test

assert (
expected_message in actual_message
), f"Message {actual_message} does not match the expected value {expected_message}"
assert match_count == len(expected_errors), "Not all expected errors were found"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix missing assert when checking errors

@@ -35,12 +35,14 @@ def raise_validation_exception(msg: str) -> Exception:


@remorph.command
# pylint: disable=too-many-arguments
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to increase the max-args in pyproject.toml than setting this at file level.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@ericvergnaud ericvergnaud added stacked PR Should be reviewed, but not merged and removed do-not-merge labels Jan 8, 2025
…ultiplexer/refactor-transpile-error

* feature/multiplexer/transpile-using-lsp:
  fix typo
  add docs
  bump max args
…ure/multiplexer/display-lsp-diagnostics-in-console

* feature/multiplexer/refactor-transpile-error:
  fix typo
  add docs
  bump max args
@ericvergnaud ericvergnaud requested a review from a team January 10, 2025 14:52
… into feature/multiplexer/store-transpiled-files

* feature/multiplexer/display-lsp-diagnostics-in-console:
  fix typo
  add docs
Base automatically changed from feature/multiplexer/display-lsp-diagnostics-in-console to main January 22, 2025 09:52
* main:
  Display lsp diagnostics in console (#1406)
  Refactor transpile error for LSP (#1405)
  Rename mock (#1423)
  Update databricks-sdk requirement from ~=0.29.0 to >=0.29,<0.42 (#1422)
  Transpile using LSP (#1404)
  Bump sqlglot from 26.0.1 to 26.1.3 (#1416)
  Configure transpile errors file (#1408)
  restore lost file and fix gitignore that led to that loss (#1414)
  bump max args (#1410)

# Conflicts:
#	README.md
#	labs.yml
#	src/databricks/labs/remorph/cli.py
#	src/databricks/labs/remorph/config.py
#	src/databricks/labs/remorph/install.py
#	src/databricks/labs/remorph/transpiler/execute.py
#	src/databricks/labs/remorph/transpiler/lsp/lsp_engine.py
#	tests/unit/test_cli_transpile.py
#	tests/unit/test_install.py
#	tests/unit/transpiler/test_execute.py
#	tests/unit/transpiler/test_lsp_engine.py
@ericvergnaud ericvergnaud removed the stacked PR Should be reviewed, but not merged label Jan 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEATURE]: Multiplexer - store transpiled files
2 participants