Freebase API to be retired #38
Comments
Thanks @thatandromeda, it looks like this really is happening on June 30th. Wikidata has it on their roadmap to provide a Wikidata Suggest type of service, but who knows if it will be ready in time. Some work that needs to be done:
|
This API call is used by Wikidata's own search, and seems to have the basics of what we would need in the UI to select employers and tags. It supports a JSON-P callback, which may help get around cross-origin restrictions (JavaScript from jobs.code4lib.org that wants to talk to wikidata.org). |
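A minimal sketch of what such a suggest request could look like, assuming the call in question is `wbsearchentities` (it is named later in this thread). The helper names `build_search_url` and `parse_suggestions` are made up for illustration; the parameters are the standard MediaWiki API ones. Nothing here touches the network, it only builds the URL and reads an already-decoded response.

```python
from urllib.parse import urlencode, urlparse, parse_qs

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_search_url(term, language="en", limit=7, callback=None):
    """Build a wbsearchentities URL; pass `callback` to request JSON-P."""
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "limit": limit,
        "format": "json",
    }
    if callback:
        # With format=json, the MediaWiki API wraps the response in this
        # function name, which lets a <script> tag consume it cross-origin.
        params["callback"] = callback
    return WIKIDATA_API + "?" + urlencode(params)


def parse_suggestions(payload):
    """Reduce a decoded response to the fields a suggest widget needs."""
    return [
        {
            "id": hit["id"],
            "label": hit.get("label", ""),
            "description": hit.get("description", ""),
        }
        for hit in payload.get("search", [])
    ]
```

On the browser side the same URL would be loaded with a `callback` name that matches a function defined on the page.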
One possible way to map our Freebase IDs to Wikidata IDs: https://gist.github.com/edsu/c95c9ae9f60ecdf80077 |
Google has said that the shutdown will be delayed. I'm pretty sure it was mentioned on the Freebase mailing list, but I can't find the thread right now. If you look at the Wikidata Freebase project page, you'll see the same info:
Because we're already inside the three month window for June 30, the API retirement won't be happening then. I'd suggest deferring planning of your migration strategy until things are a little clearer, but here are a few random thoughts:
The whole thing is kind of a mess, but it seems unlikely that Freebase will get shut down without a fair amount of notice, so I'd hold off committing to a transition plan until both Wikidata and Google firm up their plans. If/when you need to map Freebase IDs to Wikidata IDs, this bulk dump might be easier to use than an API. |
Thanks for those details @tfmorris; I didn't know that the announcement on the Freebase website was out of date. Still, I think it should be doable to use the wbsearchentities API call to do the suggest portion, and to use WDQ as a temporary way to turn a few thousand Freebase IDs into Wikidata IDs. I'd like to rip this bandaid off now rather than wait, but we'll see, since I'm the only person actually maintaining shortimer at this point and I have other things contending for my attention. |
@edsu - I think it's early days still for Wikidata and I have concerns about performance and stability of the API, but it's your call. I'd be happy to generate the ID mapping table for you, if that helps. At a DPLA Hackathon a few years ago, we hacked up Freebase Suggest to work with the DPLA API. You might consider doing something similar for Wikidata. Suggest is actually one of the nicer autocomplete widgets out there (in my opinion). https://github.com/scande3/dpla-discovery I don't know if you constrain your Suggest searches by type, etc, but if you're using the Freebase schema at all (types or properties), mapping to the Wikidata schema is another task that needs to be added to the list. |
The API may change, but it's hard to imagine it going away entirely after all the integration work that has gone on at Wikimedia. I'm ok with things changing -- in fact that's the best situation, because it means the service isn't dying, and people are working on it. Alas, the writing is definitely on the wall for Freebase. The suggestions are constrained by type in a few places in shortimer: by employer and location. I see that wbsearchentities has a type parameter that could be used similarly, maybe. If a mapping of types/properties is put together that would be very useful. I think I will be OK with mapping the IDs, but I will be in touch if it gets tricky. |
It looks like there may be a path forward using the Google Knowledge Graph, which now has an API and they are planning on adding a suggest widget, similar to the one Freebase offers, and which is so important to the workflow here in shortimer. Apparently even the freebase identifiers are being used, so there may not be a whole lot of cleanup work that needs to happen in the shortimer database. I think I would prefer to use Wikidata on principle, but it may be easier to transition to the Knowledge Graph. |
I think using the KG Suggest is the right call. The KG Search API is much less powerful than the old Freebase Search API, but it should be fine for this application. The Wikidata Refine Reconciliation Service uses the wbsearchentities followed by WDQ/SPARQL approach internally, and it doesn't appear to me that the search is very robust. One of the things that I've got on my (long) list of spare time projects is to improve the coverage of matching for Freebase<->Wikidata mappings, which will help provide an escape path if it's needed in the future (plus having the Wikidata reconciliation service for OpenRefine should help with these types of mapping tasks). BTW, the beta SPARQL endpoint is much faster than the experimental WDQ API, and the data is more current, if you ever have a need to query Wikidata. |
p.s. My interpretation is that the 3 month clock doesn't start until the KG Suggest API is available too, so there's still some time... |
@tfmorris thanks for your comments. If you notice KG Suggest get announced and remember this issue it would be really helpful if you can add a note here. I feel like I only accidentally noticed the KG API announcement! |
In preparation for the shortimer db should be updated to store the Freebase Machine ID or |
Freebase switched to MIDs for most purposes a while ago, so you may find that the IDs coming back from the Suggest API were MIDs already. If you have historical /en/... IDs, you can look up the MID with this query: Replace the (encoded) BTW, haven't heard anything additional on shutdown timeframes... |
over time the db has accumulated a fair number of subjects and employers with duplicate names, which causes problems for views that use a slugified version of the name. this commit tightens up the lookups to use the freebase id and also includes a new command line utility to help diagnose and correct these duplicates. refs #38
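A rough sketch of the kind of duplicate check described in that commit, not the actual shortimer command. The `slugify` here is a simplified stand-in for whatever slug function the site uses, and the record shape (a dict with a `name` key) is assumed for illustration.

```python
import re
from collections import defaultdict


def slugify(name):
    """Simplified slug: lowercase, runs of non-alphanumerics become hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def find_duplicate_slugs(records):
    """Group records by slugified name and keep only the colliding groups."""
    groups = defaultdict(list)
    for record in records:
        groups[slugify(record["name"])].append(record)
    return {slug: recs for slug, recs in groups.items() if len(recs) > 1}
```

Any slug that maps to more than one record is a candidate for merging, with the Freebase ID deciding which rows are really the same entity.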
@tfmorris thanks for the update! I did get the database converted over to the mids. I looked them up by resolving URLs like:
which seemed to work pretty well still... |
It looks like the new Knowledge Graph Search Widget is available. Also some of the old Freebase API calls are starting to fail now, for example getting the location for an organization. |
Well, now the old Freebase APIs for looking up Employers and Locations are dead. So people can't enter in new jobs. I guess it would be good to move over to the Knowledge Graph API now ;-) |
@tfmorris @danbri do you happen to know (or know someone who might know) why topical things like "Semantic Web" don't show up in the Knowledge Graph Search Widget? I get lots of books but not the topic. I even tried with a Search API call to see if I could find the topic in there, but I couldn't find it in 200 results. Using the JSON-LD context I can see that Google has URIs for entities, which is cool. So I can easily turn the old Freebase IDs into Knowledge Graph URIs. For example here's the URI for Semantic Web: So I can see the entity "Semantic Web" is in the Knowledge Graph, but how can I get the search widget to return it? Would one of the available entity types work? |
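The ID-to-URI conversion mentioned above could be sketched roughly like this. It assumes the JSON-LD context expands the `kg:` prefix to `http://g.co/kg`; check the `@context` of a real response before relying on that. The function names and the MID in the usage note are hypothetical.

```python
# Assumption: the "kg" prefix in the response's JSON-LD @context.
KG_PREFIX = "http://g.co/kg"


def mid_to_kg_uri(mid):
    """Turn a Freebase-style ID such as "/m/..." into a Knowledge Graph URI."""
    if not mid.startswith(("/m/", "/g/")):
        raise ValueError("expected an ID starting with /m/ or /g/: %r" % mid)
    return KG_PREFIX + mid


def kg_id_to_mid(kg_id):
    """Strip the compact "kg:" prefix from an @id value, if present."""
    return kg_id[3:] if kg_id.startswith("kg:") else kg_id
```

For a made-up MID, `mid_to_kg_uri("/m/0abc123")` gives `http://g.co/kg/m/0abc123`. Older human-readable `/en/...` IDs are rejected on purpose, since they would need to be resolved to MIDs first.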
Maybe this is the push I need to move over to using Wikidata.... |
I don't know but I'll see what I can find out |
(and +1 for Wikidata, regardless) |
From a quick guess, is it only returning entities whose types are in https://developers.google.com/knowledge-graph/ (and mapped there to schema.org)? |
Hmm, that does seem to be the case? Here are the types returned in the first 200 results when searching for 'semantic web' from the search API:
Unfortunately it seems like a lot of terms used to tag jobs in shortimer are rendered invisible in the KG search api ... |
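For anyone wanting to repeat that check, here is a small sketch of tallying the types in a decoded Search API response. It assumes the documented response shape (an `itemListElement` list whose entries carry a `result` with an `@type` value); `count_types` is a name made up for this example.

```python
from collections import Counter


def count_types(response):
    """Count how often each @type appears across the returned entities."""
    counts = Counter()
    for element in response.get("itemListElement", []):
        types = element.get("result", {}).get("@type", [])
        if isinstance(types, str):
            # A single type may be given as a bare string rather than a list.
            types = [types]
        counts.update(types)
    return counts
```

Calling `count_types(response).most_common()` on a few hundred results makes it easy to see which kinds of entities the API is willing to return and which never appear.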
this is step one in moving from freebase to wikidata. I added wikidata_id to the Employer, Location and Subject models. Then I added a migration to look up the existing entities in Wikidata using Wikidata's SPARQL endpoint. The matching logic thus far is:
1. Look up the entity using the Freebase ID
2. Use the name of the entity to derive the Wikipedia URL and look that up
3. Search for the label
The next step is to purge entities that don't have Wikidata IDs, and then to create new suggest functionality that uses Wikidata instead of Freebase. refs #38 refs #57
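The matching order above can be sketched as a simple fallback chain. This is an illustration under assumptions, not the migration from the commit: `match_entity` and the lookup callables are hypothetical, and only the first step's query is spelled out, using Wikidata property P646 (Freebase ID) with the `wdt:` prefix that query.wikidata.org predefines.

```python
import re

MID_PATTERN = re.compile(r"^/[mg]/[0-9a-z_]+$")


def freebase_id_query(mid):
    """SPARQL for step 1: find the item whose Freebase ID (P646) is `mid`."""
    if not MID_PATTERN.match(mid):
        # Refuse anything that is not a plain MID, so nothing odd ends up
        # inside the quoted literal.
        raise ValueError("not a Freebase MID: %r" % mid)
    return 'SELECT ?item WHERE { ?item wdt:P646 "%s" } LIMIT 1' % mid


def match_entity(entity, lookups):
    """Try each lookup in order and return the first Wikidata ID found.

    `lookups` is a list of callables (by Freebase ID, by Wikipedia URL,
    by label) that each take the entity and return an ID or None.
    """
    for lookup in lookups:
        qid = lookup(entity)
        if qid:
            return qid
    return None
```

Keeping the lookups as separate callables means the cheaper, more precise match runs first and the label search is only attempted when the others fail.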
I've been doing some preliminary work trying to migrate things to Wikidata. If you are interested you can track the work over on the wikidata branch. |
Wikidata does offer an autosuggest API, but it doesn't allow you to limit results to particular entity types (locations, organizations, etc.). This leads to a lot of noise when looking things up. I also tried using the SPARQL endpoint with regex filters, but it seemed very unstable: there were lots of 502 errors. Perhaps that was just something else going on at the time, but it doesn't inspire much confidence as a foundation for building on. Actually, it does look like other people were experiencing problems. |
So, even with the Wikidata SPARQL endpoint back to functioning normally, regex queries (what is needed for autosuggest) can still take multiple seconds to come back. Unfortunately this won't be good enough. The wbsearchentities API call is fast, but it doesn't return much information, and can't be limited to entities of a particular type (Locations, Organizations, etc.). So, my current thinking is to use the entities that have already been collected in jobs.code4lib.org and run autosuggest against them, and let people enter new entities as needed. This will have the downside that they aren't mapped to Google Knowledge Graph or Wikidata, but I just don't have the cycles to do that at the moment...and the site risks dying completely if it's not possible to post new jobs. |
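A minimal sketch of what autosuggest over the already-collected entities might look like, assuming the names fit in memory. The `suggest` function and its ranking (prefix matches before substring matches) are choices made for this example, not something taken from shortimer.

```python
def suggest(prefix, names, limit=10):
    """Return locally known names matching what the user has typed so far."""
    needle = prefix.strip().casefold()
    if not needle:
        return []
    # Names that start with the typed text rank ahead of names that merely
    # contain it; each group is sorted alphabetically.
    starts = sorted(n for n in names if n.casefold().startswith(needle))
    contains = sorted(
        n for n in names
        if needle in n.casefold() and not n.casefold().startswith(needle)
    )
    return (starts + contains)[:limit]
```

With a few thousand employers and subjects a linear scan like this is fast enough; in a database-backed site the same thing could be done with a case-insensitive prefix query.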
Could the Geonames service be used to look up institutions and locations? It has a rich and snappy API, and support for linked data: http://www.geonames.org/ |
It could, but that's only part of the puzzle. Unfortunately I don't have the bandwidth to fully address this problem. I'm planning on shutting the site down on November 1st after making static snapshots of the data and website available on Internet Archive. |
Hi! Sorry to ping this thread. I came across this trying to figure out how to map Freebase MIDs to their entities without downloading and searching the 200 GB data dump. Does anyone know if the Google Knowledge Graph API contains Freebase MIDs and if it can be queried using them? Thanks! |
It should have that, but query.wikidata.org might also have it... |
A late reply to the late question (I apparently had this accidentally muted - yay, gmail keyboard shortcuts). The Freebase MIDs were retained in the Google Knowledge Graph and can be used for lookups. The /g IDs (as opposed to the /m IDs which are MIDs) post-date Freebase. As @danbri mentioned, some of them have been mapped to Wikidata entities, but only a small fraction of them. The Google Knowledge Graph will have many more (but the mapping to Wikidata is potentially more useful, if it exists). |
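A lookup by MID against the Knowledge Graph Search API could be sketched as below. It only builds the request URL, using the API's `ids` and `key` query parameters; `build_lookup_url` is a hypothetical helper and the API key is a placeholder you would supply yourself.

```python
from urllib.parse import urlencode

KG_SEARCH = "https://kgsearch.googleapis.com/v1/entities:search"


def build_lookup_url(mids, api_key):
    """Build a Search API URL that looks entities up by ID instead of by text."""
    # The `ids` parameter is repeated once per ID being looked up.
    params = [("ids", mid) for mid in mids] + [("key", api_key)]
    return KG_SEARCH + "?" + urlencode(params)
```

Fetching that URL should return the same `itemListElement` structure as a text search, restricted to the requested entities.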
Google's retiring the Freebase API on 30 June 2015. Parts of this code depend on Freebase. What's the fallback?