Should the json and jsonb types be treated more like bytea under pg_enable_utf8? #90

Open
rafl opened this issue Jan 22, 2022 · 3 comments

rafl commented Jan 22, 2022

DBD::Pg provides a pg_enable_utf8 option to automatically decode textual values from bytes to character strings when your database encoding is utf8. This is great!

DBD::Pg also goes out of its way to treat values of type bytea specially when pg_enable_utf8 is enabled, so that it does not attempt to decode arbitrary binary data as if it were utf8 encoded. This too is really helpful and what I'd expect for bytea values.

As per https://datatracker.ietf.org/doc/html/rfc8259#section-8.1, I'd expect JSON data to be a sequence of bytes in utf8 encoding and would've expected any jsonb data I retrieve using DBD::Pg to be valid JSON which I can pass to the standard decode_json functions from modules like JSON::XS, Cpanel::JSON::XS, Mojo::JSON, etc, independently of whether or not I chose to automatically decode textual data to character strings with pg_enable_utf8. Having to use different JSON encoding/decoding functions depending on the pg_enable_utf8 setting doesn't seem ideal.
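
To illustrate the mismatch, here is a minimal sketch that involves no database at all. It assumes only the core modules Encode and JSON::PP; the variables `$json_bytes` and `$json_chars` are hypothetical stand-ins for a jsonb value as fetched without and with pg_enable_utf8 decoding, not actual DBD::Pg output.

```perl
use strict;
use warnings;
use Encode qw(encode decode);
use JSON::PP ();

# Hypothetical stand-in for a jsonb value fetched as raw UTF-8 encoded bytes.
my $json_bytes = encode('UTF-8', qq({"name":"caf\x{e9}"}));

# Hypothetical stand-in for the same value after it has been decoded
# to a Perl character string, as pg_enable_utf8 does for textual values.
my $json_chars = decode('UTF-8', $json_bytes);

# decode_json expects UTF-8 bytes, so it fits the undecoded value.
my $from_bytes = JSON::PP::decode_json($json_bytes);

# A decoder object without the utf8 option expects characters,
# so it fits the already-decoded value.
my $from_chars = JSON::PP->new->decode($json_chars);

# Feeding bytes to the character-oriented decoder treats every byte as a
# character, so the two-byte encoding of U+00E9 becomes two characters.
my $mismatch = JSON::PP->new->decode($json_bytes);

printf "%d %d %d\n",
    length($from_bytes->{name}),
    length($from_chars->{name}),
    length($mismatch->{name});    # prints "4 4 5"
```

The first two calls agree only because each decoder is paired with the kind of string it expects, which is exactly the dependency on the pg_enable_utf8 setting described above.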

I'm wondering if the set of byte-like types that are not decoded under pg_enable_utf8, which currently only includes bytea, should perhaps be (optionally?) extended to also include jsonb and/or json, and I would like to hear others' thoughts on that.

I'd be happy to provide a patch to that effect, but I thought it'd be better to reach out first to discuss what such a change would look like, as it has potential implications on backwards compatibility and might need to be something to explicitly opt into by users of the library.

Thanks!

@jonjensen (Member)

I see your point, and it makes sense, but I expect DBD::Pg to return a native Perl hash or array from a jsonb or json type, doing the work of unmarshaling as it does with arrays, rather than leaving me to decode it manually.

rafl commented Jan 25, 2022

@jonjensen I think what you're asking for is more related to #32 than the issue I'm trying to address, though I can certainly see how allowing json/jsonb types to not be treated as character strings could be a first helpful step in the direction of that larger goal.

@jonjensen (Member)

@rafl Yes, I think you're right. We want to hand raw bytes to any JSON decoder anyway, whether we manually decode or have DBD::Pg do it automatically.
