Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: package_index #93

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

jjmaestro
Copy link
Contributor

@jjmaestro jjmaestro commented Sep 19, 2024

Note

Stacked on top of #92

refactor: apt_dep_resolver_test

Cleanup apt_dep_resolver_test removing the apt_deb_repository mock, it's not really needed once we have proper testing and mocking of apt_deb_repository that we can reuse.

test: apt_deb_repository

  • Add testing for apt_deb_repository mocking the external / side effects (downloads, decompression, etc).
  • Rename _parse_repository back to _parse_package_index

refactor: cleanup _resolve_package virtual_package logic

  • Refactor the virtual_package loop moving if version to the outside

feat: add parse_provides

  • Add a parse_provides method which parses and validates 'Provides'

refactor: nested_dict struct

  • Add a helper nested_dict struct that encapsulates the get-set logic
  • Remove all the (now) unnecessary wrappers like _package_versions, _virtual_packages, etc.
  • Fix get() method to return default_value if no keys are given
  • Add add() to nested_dict struct so that we can simplify the virtual package logic in apt_deb_repository.bzl _add_package

refactor: _resolve_all

  • Add assertions for unresolved_deps in apt_dep_resolver_test
  • Rename resolve_all to resolve
  • Reorder (name, version, arch) args to match the order of the index keys (arch, name, version)
  • Break up the long lines in comments
  • Improve _ITERATION_MAX_ fail message

refactor: _parse_version_constraint

Cleanup the version constraint parsing using the VERSION_OPERATORS that we now have in the version struct plus also validating the version being parsed.

chore: mv resolution_test to apt_dep_resolver_test

Tests should match the file they are testing

chore: move version_constraint tests to their own test file

version_constraint.bzl is a bunch of utils for package version dependency. Either it has enough entity to stand by itself or the functionality should be folded into version.bzl and package_index.bzl and/or package_resolution.bzl.

refactor: _version_relop

Refactor _version_relop into a compare method in version.bzl plus a VERSION_OPERATORS dict so that (1) we use the operator strings everywhere and (2) we can use the keys to validate the operators.

refactor: _fetch_package_index

While working with some flaky mirrors and trying to figure out why they were failing I found the _fetch_package_index code a bit hard to follow so here's my attempt at streamlining it a bit:

  • Change the general flow of the for-loop so that we can directly set the reasons for failure in failed_attempts
  • Remove both integrity as an argument and as a return value since neither is ever used.
  • return the content of the Packages index instead of the path so that (1) we don't need rctx anywhere else and (2) is easier to mock.
  • Reword failure messages adding more context and debug information
  • Shorter lines and templated strings, trying to make the code easier to read and follow.

refactor: apt_dep_resolver

  • Remove unnecessary state struct, just pass the repository struct to the resolver methods
  • Remove unused resolve_package from the public API

@jjmaestro jjmaestro force-pushed the refactor-package_index branch from 95da3ce to 88623f1 Compare September 22, 2024 18:43
@jjmaestro jjmaestro force-pushed the refactor-package_index branch 4 times, most recently from 110f8da to 911ea4b Compare October 29, 2024 08:23
@jjmaestro jjmaestro force-pushed the refactor-package_index branch from 911ea4b to 3f60c24 Compare November 14, 2024 01:07
@jjmaestro jjmaestro force-pushed the refactor-package_index branch from 3f60c24 to 2b09710 Compare November 22, 2024 19:58
@jjmaestro
Copy link
Contributor Author

jjmaestro commented Nov 22, 2024

@thesayyn I've re-done the PR on top of all of the new changes. I've now left the separation of apt_deb_repository.bzl and apt_dep_resolver.bzl.

I'll wait before rebasing the other PRs just in case you want changes in this one. I've re-done the whole stack of PRs because the changes were so big they needed to be fully re-done (manual patching files, git rebase was too hard / "not possible").

Please enable the checks workflow and review it when you can.

Thanks!

@jjmaestro jjmaestro force-pushed the refactor-package_index branch 4 times, most recently from 89a1e45 to 61dcfa2 Compare November 24, 2024 21:54
@jjmaestro jjmaestro mentioned this pull request Nov 24, 2024
@jjmaestro jjmaestro requested a review from thesayyn November 24, 2024 22:28
@thesayyn
Copy link
Collaborator

i don't think there is value in changing things just because it feels better. I can't help but say that most of these changes unnecessary in my opinion. I really don't see how they better objectively. There are some fixes here worth, i'd be happy to land those if you feel like splitting them.

It makes it equally hard for me to review this and decide if something is needed.

@jjmaestro
Copy link
Contributor Author

@thesayyn the changes were done to accommodate the stack of diffs on top, where I've refactored the lockfile (#95), the manifest (#94) the templates to deb_import (#96) and finally the architecture targets (#100) that you just landed in #115.

I'm sorry that you feel many of these are just "because it feels better". It would be good if you could point at the things you feel are that way.

@jjmaestro
Copy link
Contributor Author

There are some fixes here worth, i'd be happy to land those if you feel like splitting them.

This is probably the easiest, let me know what you want from the stack and I'll start splitting them into PRs.

@jjmaestro jjmaestro force-pushed the refactor-package_index branch from 61dcfa2 to 9ecc82e Compare November 28, 2024 12:47
@@ -0,0 +1,45 @@
"nested dict"

def _set(store, keys, value, add = False):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can revert this, this is more involved than i would like the util .set/get_dict to be.

Copy link
Contributor Author

@jjmaestro jjmaestro Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it made sense to have it as a first-class data struct rather than just a helper function on a dict. The struct ensures the access to the dict is through the get-set while just having the functions doesn't prevent any direct change to the dict.

Also, I was using the functionality in other places in the code (e.g. the lockfile refactor) so having it all encapsulated in one data struct made more sense to me.

Do you still prefer it back to naked functions over a dict?

@@ -1,25 +1,5 @@
"utilities"

def _set_dict(struct, value = None, keys = []):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above, revert

@@ -114,6 +114,14 @@ def _version_cmp_part(va, vb):
return res
return 0

VERSION_OPERATORS = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is no reason for this to be a map, as its not going to grow overtime or change. I'd rather see this reverted as there is no value here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned in the commit, the value was in being able to validate the strings against the string keys. But I can remove it back to what it was.

Copy link
Contributor Author

@jjmaestro jjmaestro Dec 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the relevant commit I was referring to: 4f504f4

diff --git a/apt/private/version_constraint.bzl b/apt/private/version_constraint.bzl
index 93e13e5..fe0e6b4 100644
--- a/apt/private/version_constraint.bzl
+++ b/apt/private/version_constraint.bzl
@@ -2,11 +2,22 @@
 
 load(":version.bzl", version_lib = "version")
 
-def _parse_version_constraint(rawv):
-    vconst_i = rawv.find(" ")
-    if vconst_i == -1:
-        fail('invalid version string %s expected a version constraint ">=", "=", ">=", "<<", ">>"' % rawv)
-    return (rawv[:vconst_i], rawv[vconst_i + 1:])
+def _parse_version_constraint(version_and_constraint):
+    chunks = version_and_constraint.split(" ")
+
+    if len(chunks) != 2:
+        fail("Invalid version constraint %s" % version_and_constraint)
+
+    version_constraint = chunks[0]
+    if version_constraint not in version_lib.VERSION_OPERATORS:
+        msg = "Invalid version constraint: %s\nValid constraints are: %s"
+        fail(msg % (version_constraint, version_lib.VERSION_OPERATORS))
+
+    version = chunks[1]
+
+    version_lib.parse(version)  # parsing version to validate it
+
+    return version_constraint, version
 
 def _parse_dep(raw):
     raw = raw.strip()  # remove leading & trailing whitespace

Previously, the validation was just checking for a space and if not found, it was assuming that the version didn't have a constraint.

With my changes, it (1) checks for the space and fails if none (2) it validates the operator and (3) it validates the version itself.

Is this enough reason for it to be a map? :-?

)

DO_NOT_DEPEND_ON_THIS_TEST_ONLY = struct(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, i don't see why this needs to be change, it's already working and there is no value in changing things without a reason. If this change is needed in a larger context, eg if your stack has a fix for something that needs this change, please put it right next to that fix to help me reason why this needs to change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was in the same spirit as the test helpers that I added, standardizing on having a _TEST_SUITE_PREFIX constant, etc.

Basically, it was meant to be a clean way to link the stuff that's not publicly accessible for testing purposes only. This way, we always import the module in the test and test the public parts of the struct and if there are parts that are private and we want to test, I thought it would be easier to have a __test__ sub-struct in-place and not have to create another struct, with a long name in there (the __test__ already does signal that it's for testing only, I think).

I'll go back to the DO_NOT_DEPEND_ON_THIS_TEST_ONLY...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thesayyn I'm really looking into this, I was going to make the change but here's the thing. As mentioned, once I've added proper mocking, the tests don't need any of the _create_test_only() functionality. That is, all of this functionality is superseded by the mocks, which is the "added value".

So, TL;DR added value == better testing, like with the helper stuff you like in the tests. Is that OK? 🙏

@jjmaestro jjmaestro force-pushed the refactor-package_index branch from 9ecc82e to ff8eb6c Compare December 3, 2024 14:17
@jjmaestro
Copy link
Contributor Author

jjmaestro commented Dec 3, 2024

I've rebased the branch on top of main, and per your comments, rolled back my changes to _is_satisfied_by and the "leaking" of unmet dependencies.

I've added more context in your other reviews and left them unresolved to hear your opinion, here's a summary re.

  • nested_dict: TL;DR I also use it in the next PR in the stack so IMO it's better as its own struct VS helper methods that would leave the nested dict "naked" and available to changes via "non-API" (the get-set + other methods offered in the new nested_dict struct).

  • version op map: I use the keys to validate the operator in the commit I linked. If we go back to a struct of functions I think it would require to add a "validate op" method and other code that IMO is not needed by just using "the best data struct" to organize the operator lookup.

  • __test__ once I added the mocks and helpers that you mentioned you like, the DO_NOT_DEPEND_ON_THIS_TEST_ONLY struct becomes dead code.

Again, sorry for such a big PR of mix-and-match stuff, it's definitely hard to review it all, regardless of how much context I added in the commits. Thanks a ton for your reviews, happy to discuss this further and continue making the changes you want.

@jjmaestro jjmaestro force-pushed the refactor-package_index branch from ff8eb6c to 58fcdc9 Compare December 3, 2024 16:52
@jjmaestro
Copy link
Contributor Author

Ran a rebase executing tests and found an issue that I've corrected in the latest push.

* Remove unnecessary state struct, just pass repository struct to the
  resolver methods

* Remove unused resolve_package from the public API
While working with some flaky mirrors and trying to figure out why they
were failing I found the _fetch_package_index code a bit hard to follow
so here's my attempt at streamlining it a bit:

* Change the general flow of the for-loop so that we can directly set
  the reasons for failure in failed_attempts

* Remove both integrity as an argument and as a return value since
  neither is ever used.

* return the content of the Packages index instead of the path so that
  (1) we don't need rctx anywhere else and (2) is easier to mock.

* Reword failure messages adding more context and debug information

* Shorter lines and templated strings, trying to make the code easier to
  read and follow.
Refactor _version_relop into a compare method in version.bzl plus a
VERSION_OPERATORS dict so that (1) we use the operator strings
everywhere and (2) we can use the keys to validate the operators.
`version_constraint.bzl` is a bunch of utils for package version
dependency. Either it has enough entity to stand by itself or the
functionality should be folded into `version.bzl` and
`package_index.bzl` and/or `package_resolution.bzl`.
Tests should match the file they are testing
Cleanup the version constraint parsing using the VERSION_OPERATORS that
we now have in the version struct plus also validating the version being
parsed.
* Add assertions for unresolved_deps in apt_dep_resolver_test

* rename resolve_all to resolve

* Reorder (name, version, arch) args to match the order of the index
  keys (arch, name, version)

* Break up the long lines in comments

* Improve _ITERATION_MAX_ fail message

* Remove the "# buildifier: disable=print" since it's no longer needed.
* Refactor the nested dict get-set logic into its own nested_dict struct

* Remove all the (now) unnecessary wrappers like _package_versions,
  _virtual_packages, etc.

* Fix get() method to return default_value if no keys are given

* Add add() to nested dict struct so that we can simplify the virtual
  package logic in apt_deb_repository.bzl _add_package
Add a parse_provides method wich parses and validates 'Provides'
It's clear that the inside of the loop will never return a package if
there's no version provided so we can move the `if version` from within
the loop all the way out and greatly simplify the inner logic.
* Add testing for apt_deb_repository mocking the external / side effects
  (downloads, decompression, etc).

* Rename _parse_repository back to _parse_package_index

* Add test_util.asserts_dict_equals
Cleanup apt_dep_resolver_test removing the apt_deb_repository mock, it's
not really needed once we have proper testing and mocking of
apt_deb_repository that we can reuse.
@jjmaestro jjmaestro force-pushed the refactor-package_index branch from 58fcdc9 to fc107f5 Compare December 11, 2024 15:58
@jjmaestro
Copy link
Contributor Author

Just rebased to remove 2eda7c0 (superseded by #129, now merged in main)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants