Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update database method names #1829

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

dsblank
Copy link
Member

@dsblank dsblank commented Dec 22, 2024

Now that the raw data is actually useful, and the data is hardly "raw", it would make sense to update the method names for all of the get_object methods to make them more consistent.

This PR allows aliases for the common getters and iters. If this idea is accepted, I would also update the code base, and provide deprecation messages on the older forms.

Suggestions:

First, some of the methods (get_raw_person_data) have an implied from_handle. We can remove all from_handle from most method names for consistency and usability:

NOTE: new names on left, current names on right

    get_person = get_person_from_handle
    get_family = get_family_from_handle
    get_source = get_source_from_handle
    get_citation = get_citation_from_handle
    get_event = get_event_from_handle
    get_media = get_media_from_handle
    get_place = get_place_from_handle
    get_repository = get_repository_from_handle
    get_note = get_note_from_handle

Then, we don't need both raw and data, and is nicely parallel with the above:

    get_person_data = get_raw_person_data
    get_family_data = get_raw_family_data
    get_source_data = get_raw_source_data
    get_citation_data = get_raw_citation_data
    get_event_data = get_raw_event_data
    get_media_data = get_raw_media_data
    get_place_data = get_raw_place_data
    get_repository_data = get_raw_repository_data
    get_note_data = get_raw_note_data

We only have one id, so we don't need to explicitly say that it is gramps:

    get_person_from_id = get_person_from_gramps_id
    get_family_from_id = get_family_from_gramps_id
    get_source_from_id = get_source_from_gramps_id
    get_citation_from_id = get_citation_from_gramps_id
    get_event_from_id = get_event_from_gramps_id
    get_media_from_id = get_media_from_gramps_id
    get_place_from_id = get_place_from_gramps_id
    get_repository_from_id = get_repository_from_gramps_id
    get_note_from_id = get_note_from_gramps_id

These methods are inexplicably private, and using the above rules, suggest:

    get_person_data_from_id = _get_raw_person_from_id_data
    get_family_data_from_id = _get_raw_family_from_id_data
    get_source_data_from_id = _get_raw_source_from_id_data
    get_citation_data_from_id = _get_raw_citation_from_id_data
    get_event_data_from_id = _get_raw_event_from_id_data
    get_media_data_from_id = _get_raw_media_from_id_data
    get_place_data_from_id = _get_raw_place_from_id_data
    get_repository_data_from_id = _get_raw_repository_from_id_data
    get_note_data_from_id = _get_raw_note_from_id_data

Updated with iter-method suggested changes:

    iter_person_data = _iter_raw_person_data
    iter_family_data = _iter_raw_family_data
    iter_event_data = _iter_raw_event_data
    iter_place_data = _iter_raw_place_data
    iter_repository_data = _iter_raw_repository_data
    iter_source_data = _iter_raw_source_data
    iter_citation_data = _iter_raw_citation_data
    iter_media_data = _iter_raw_media_data
    iter_note_data = _iter_raw_note_data
    iter_tag_data = _iter_raw_tag_data

Changes from plural to singular:

    iter_citation = iter_citations
    iter_event = iter_events
    iter_family = iter_families
    iter_media = iter_media
    iter_note = iter_notes
    iter_people = iter_people
    iter_place = iter_places
    iter_repository = iter_repositories
    iter_source = iter_sources
    iter_tag = iter_tags

And these would stay the same:

    iter_citation_handles
    iter_event_handles
    iter_family_handles
    iter_media_handles
    iter_note_handles
    iter_person_handles
    iter_place_handles
    iter_repository_handles
    iter_source_handles
    iter_tag_handles

@dsblank
Copy link
Member Author

dsblank commented Dec 22, 2024

Note that db.get_person_data() is almost 100% compatible with db.get_person() due to merged (and upcoming) implementations of DataDict and DataList.

@dsblank dsblank self-assigned this Dec 22, 2024
@dsblank
Copy link
Member Author

dsblank commented Dec 22, 2024

We can also review the db.iter_* method names.

@emyoulation
Copy link
Contributor

emyoulation commented Dec 22, 2024

Did a search for an example "iter_citation"
https://github.com/search?q=repo%3Agramps-project%2Fgramps++%22iter_citation%22&type=code
7 files

Did a search for an the related comment "Return an iterator over objects for"
repo:gramps-project/gramps "Return an iterator over objects for"
https://github.com/search?q=repo%3Agramps-project%2Fgramps++%22Return+an+iterator+over%22&type=code
3 files

@dsblank
Copy link
Member Author

dsblank commented Dec 22, 2024

This is what I would suggest:

    iter_person_data = _iter_raw_person_data
    iter_family_data = _iter_raw_family_data
    iter_event_data = _iter_raw_event_data
    iter_place_data = _iter_raw_place_data
    iter_repository_data = _iter_raw_repository_data
    iter_source_data = _iter_raw_source_data
    iter_citation_data = _iter_raw_citation_data
    iter_media_data = _iter_raw_media_data
    iter_note_data = _iter_raw_note_data
    iter_tag_data = _iter_raw_tag_data

@dsblank dsblank requested a review from Nick-Hall December 22, 2024 19:25
@emyoulation
Copy link
Contributor

Is it likely that there will ever be a variant that interates through a person, the families where that person is a child or spouse AND the persons of those families? (So, their parents, siblings, themself, spouse, children and the Family objects too.)

That superset of a person is where the action is most of the time.

@dsblank
Copy link
Member Author

dsblank commented Dec 22, 2024

Is it likely that there will ever be a variant that iterates through a person, the families where that person is a child or spouse AND the persons of those families?

No, for technical reasons. All of the iterators just step over each table in the database.

@stevenyoungs
Copy link
Contributor

stevenyoungs commented Dec 23, 2024

The improvements in name consistency seems good to me.
I would also suggest it may be worth the one time cost to upgrade all existing "gramps" code to the new consistent names. We will still need the function aliases for compatibility with addons \ external code.
If we don't then over time we end up with both old and new names in widespread use increasing the learning curve for new developers.
I've restructured the above, for a person object, from which the improvement in consistency is clearer to me

Current Proposed
get_person_from_handle get_person
get_raw_person_data get_person_data
get_person_from_gramps_id get_person_from_id
_get_raw_person_from_id_data get_person_data_from_id
_iter_raw_person_data iter_person_data

Could we go further?

Current Proposed
iter_person_handles iter_person

@dsblank
Copy link
Member Author

dsblank commented Dec 23, 2024

@stevenyoungs, yes that would be the plan if approved.

@dsblank
Copy link
Member Author

dsblank commented Dec 25, 2024

I'd also like to add the following methods (feel free to suggest better names):

    get_people_data
    get_families_data
    get_sources_data
    get_citations_data
    get_events_data
    get_media_objects_data - suggests get_media_object for the above methods
    get_places_data
    get_repositories_data
    get_notes_data

These will be simple loops over the singular versions in the generic db class, but can be performed in SQL to a great speed advantage.

We could also add the get_people versions (returns Person objects) for completeness, but I don't have a strong opinion (or see the need) for those.

@stevenyoungs
Copy link
Contributor

I'd also like to add the following methods (feel free to suggest better names):

    get_people_data
    get_families_data
    get_sources_data
    get_citations_data
    get_events_data
    get_media_objects_data - suggests get_media_object for the above methods
    get_places_data
    get_repositories_data
    get_notes_data

I'm assuming that these methods will return an array of Json data (DataDict) of all records of the relevant type in the db?

get_media_objects_data stuck out and I was thinking about how you could maintain consistency across the full set of methods.
A second smaller point is the use of plurals, which may make it harder for non-native English speakers (person -> people, family -> families, event -> events, media -> media), and may make searching the code base more complex.
So I'd like to put forward an alternative pattern: get_all_*_data

get_all_person_data
get_all_family_data
get_all_source_data
get_all_citation_data
get_all_event_data
get_all_media_data
get_all_place_data
get_all_repository_data
get_all_note_data

This maintains a clear link to the method returning a single record get_person_data etc. and avoids the vagaries of pluralisation\pluralization in the English language.
On the downside, the method names are slightly longer. Is it clear that get_all returns data on all people in the db, not just all the data on a single person? In my view, the absence of a handle or id in the method parameter list should help make that clear.

@dsblank
Copy link
Member Author

dsblank commented Dec 26, 2024

So I'd like to put forward an alternative pattern: get_all_*_data

Yes, I like that better!

@dsblank
Copy link
Member Author

dsblank commented Dec 26, 2024

BTW, if the basic foundation of the work-in-progress #1828 is accepted then we don't need the get_all_*_data methods, because they could be written like:

db.select_from_person("person.handle in ['HANDLE1', 'HANDLE2', 'HANDLE3']")

@stevenyoungs
Copy link
Contributor

stevenyoungs commented Dec 26, 2024

BTW, if the basic foundation of the work-in-progress #1828 is accepted then we don't need the get_all_*_data methods, because they could be written like:

db.select_from_person("person.handle in ['HANDLE1', 'HANDLE2', 'HANDLE3']")

This comment might be better on #1828 but... if all people records were wanted, it would be inefficient to prepare a suitable WHERE clause with thousands of handles.
db.select_from_person() would be more efficient.

See https://github.com/gramps-project/gramps/pull/1828/files#r1897940380

@Nick-Hall
Copy link
Member

Most of these suggestions seem reasonable to me. We should post about this on the gramps-devel mailing list so that all developers are aware of the proposal. I'd welcome more feedback on this one.

The iterators need a little more thought. iter_person_handles actually iterates though handles, and iter_people through objects. Perhaps the _handles is useful here. Do we want to keep the plurals in the naming?

For each group of methods, we could also consider adding a method that takes an object type. For example, db.get(dbconst.PERSON, handle) in addition to db.get_person(handle), etc...

@dsblank
Copy link
Member Author

dsblank commented Jan 2, 2025

The iterators need a little more thought. iter_person_handles actually iterates though handles, and iter_people through objects. Perhaps the _handles is useful here. Do we want to keep the plurals in the naming?

I would say no. If we use singular forms throughout (use "person" rather than "people") then that standardizes the names, and makes db.method() work for all method names.

So, let's leave _handles on on the iter method names. So maybe:

  1. db.iter_person()
  2. db.iter_person_handles()

@dsblank
Copy link
Member Author

dsblank commented Jan 3, 2025

I updated the description with the recommendations. I left out the ones that could have SQL backing as those would be obsolete if #1828 is approved. Next, I'll post on the gramps-devel mailing list as suggested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants