ICU (Unicode) support? #145
Comments
To the best of my understanding, SQLite should at least support storing Unicode UTF-8 strings? So I attempted to run a simple query I also tried running I then swapped to I found this article, which sums up 5 ways to handle Unicode, all of them slow or impractical in various ways. So are custom collations and bad performance what people put up with in JS? Or does everyone build their own binaries from source?
I was able to reproduce your issue with a straight SQL string - However, if it is parameterized, there is no problem. So in practice this is not a big deal. SQLite's CLI does allow Unicode literals, though, so I suppose it should be supported.
Can confirm, it works with parameter binding. 👍
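For anyone landing here later, a minimal sketch of the parameter-binding workaround confirmed above. It uses Python's standard `sqlite3` module only because that is dependency-free; the `people` table, the `find_by_name` helper and the sample names are made up for illustration, and the Deno module's API will look different.

```python
import sqlite3

# In-memory database with a hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")

# Non-ASCII values are passed as bound parameters, never spliced into the SQL text.
conn.executemany(
    "INSERT INTO people (name) VALUES (?)",
    [("Åsa",), ("Zoë",), ("Émile",)],
)


def find_by_name(connection, name):
    """Return all stored names that exactly match `name`, using a bound parameter."""
    cursor = connection.execute(
        "SELECT name FROM people WHERE name = ?", (name,)
    )
    return [row[0] for row in cursor.fetchall()]


print(find_by_name(conn, "Zoë"))
```

Binding also sidesteps quoting and injection problems, so it is probably the better habit regardless of how Unicode literals in SQL strings end up being handled.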
I think most clients allow it? I guess queries containing Unicode literals might be rare - I just did it for testing. 🙂 I guess I've kind of mixed two different issues into one though.
Really the main thing that baffles me is why no client ships with something as basic and essential as Unicode support? I know ICU is "big" and makes things "slower". I also understand there are some issues with OS kernel differences, leading to practical subtleties around DB file portability. Yet, the internet runs on Unicode - everything right down to something as trivial as JSON requires Unicode.
So I just spent another half day on this, and finally came up with a tolerable solution - this relies on a lot of fragile, user-supported tooling though.
Do you think it would make sense to bundle some loadable extensions like Alternatively, would it make sense to integrate Perhaps providing a function like e.g. Alternatively, maybe some way to literally
I'm just wondering how we can save the next person from a day or two of adventuring through a long list of half solutions and bad ideas before discovering how to do something as basic as just searching and sorting by Unicode strings. 😅
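To make the "custom collation" option concrete, here is a rough sketch of searching and sorting without the ICU extension, by registering an application-defined collation and SQL function. This is not ICU and not locale-aware: it only approximates case- and accent-insensitive comparison by stripping combining marks and case-folding. It is written against Python's standard `sqlite3` module for convenience; the names `fold`, `unicode_nocase`, `people`, `sorted_names` and `search` are all hypothetical.

```python
import sqlite3
import unicodedata


def fold(text):
    """Approximate accent- and case-insensitive key (an assumption, not ICU)."""
    if text is None:
        return None
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def unicode_nocase(a, b):
    """Three-way comparison on folded keys, as SQLite collations require."""
    ka, kb = fold(a), fold(b)
    return (ka > kb) - (ka < kb)


conn = sqlite3.connect(":memory:")
conn.create_collation("unicode_nocase", unicode_nocase)  # usable in ORDER BY
conn.create_function("fold", 1, fold)                    # usable in WHERE

conn.execute("CREATE TABLE people (name TEXT)")
conn.executemany(
    "INSERT INTO people (name) VALUES (?)",
    [("zebra",), ("Émile",), ("apple",), ("Åsa",), ("eve",)],
)


def sorted_names(connection):
    """All names, ordered by the custom collation."""
    query = "SELECT name FROM people ORDER BY name COLLATE unicode_nocase"
    return [row[0] for row in connection.execute(query)]


def search(connection, fragment):
    """Names containing `fragment`, ignoring case and accents."""
    query = "SELECT name FROM people WHERE fold(name) LIKE '%' || fold(?) || '%'"
    return [row[0] for row in connection.execute(query, (fragment,))]


print(sorted_names(conn))
print(search(conn, "ÉMI"))
```

The trade-offs mentioned in this thread still apply: every comparison calls back into the host language, and any index created with such a collation is only usable by connections that register the same collation first.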
Is there any particular reason you ship a binary without Unicode (ICU) support?
You ship with the full-text search extension, which makes sense - but how is full-text search any use without Unicode support? Literally all the text on the internet is in Unicode format.
I can understand shipping SQLite without Unicode support for things like internal databases in apps, or the system registry in your computer or phone - but (I would assume) a primary reason people are interested in SQLite in Deno, is to build microservices, or smaller/simpler web apps, and I can't imagine what those services or apps would even be doing that doesn't require Unicode support?
Maybe this is a cultural divide of some sort? But I swear, I have researched this topic everywhere, and the only explanation I can find is, the ICU extension is "big" and makes some things "slow" - and I can understand this position from SQLite itself, being designed and optimized for use as an embedded database.
But shipping this as a Deno package is very different from the SQLite default use-case, I think? Are we more concerned about being "small" and "fast", or being actually useful?
I understand it's possible to build the extension from source and load it myself, but I'm just a humble web developer, and Unicode support seems very much like something basically everyone would need?
I can't understand how people are building anything with SQLite without Unicode?
Or are they all going out on their own to solve this problem first? Am I just dumb or lazy? 😅