RFC: native algebra data type/tagged union/sum type/enum support #48883
Replies: 6 comments 49 replies
-
Type inference in Julia doesn't emit warnings, it's purely for optimization, not type checking. |
Beta Was this translation helpful? Give feedback.
-
It's great to see that there are already several package implementations of Algebraic Data Types (ADT) in Julia, and that they are being used for different use cases. However, you raise an interesting point about the limitations of package implementations of ADT, specifically with regards to type instability and memory layout. Having ADT supported natively in Julia would certainly be beneficial in terms of performance and memory management. However, it is important to carefully consider the design and implementation details of such a feature to ensure that it is consistent with Julia's type system and type inference. In addition to pattern matching, which is a common feature of ADT in other programming languages, it would also be useful to have some reflection capabilities for built-in ADT. This would allow users to introspect on the structure of the ADT and its fields, which could be useful in certain use cases. Overall, I think it's a great idea to explore the possibility of adding native support for ADT in Julia, and it's encouraging to see that there are already several package implementations that can serve as a starting point for such a feature. It will be interesting to see how the Julia community approaches this topic and how a native implementation of ADT could be integrated into the language. |
Beta Was this translation helpful? Give feedback.
-
I'm not sure we really need a new This LLVM guide is a good read if you want to think about how this is done. Again, we already kind of do this, but haven't settled on a single way of making this happen internally and efficiently. |
Beta Was this translation helpful? Give feedback.
-
Yes. Rust's Enum type is just a named union instance that uses tag bits, almost exactly how that LLVM guide does it just under the hood. Now our unions aren't exactly that because as soon as it's not a bits type within a certain size, then we just do a pointer to a boxed value. It may be worth exploring if it's worth having a specific type that adjusts this storage model a bit, taking advantage of quick runtime inference on the bits type. With the exception of generating optimized methods to access those bit tags, I don't think it's a compiler issue as much at this point. You can jerry rig this with mutable types but immutable types you really need to do this in C first with how stuff is designed right now. So if someone is really interested in making this happen I'd start with making an ergonomic named and tagged Union type work first. We can do compiler optimizations in places like the "compiler/tfuncs.jl" once we know what works and won't take too long in JIT compilation. |
Beta Was this translation helpful? Give feedback.
-
Thanks for bringing this up @Roger-luo. After thinking about this conversation, I've made some updates to SumTypes.jl that I think makes some steps towards the concerns raised here.
SumTypes.jl is able to handle this pretty well now, fully generically and without type instability.
My structs are a bit bigger than Expronicon's / Unityper's structs because I can't compactify at macroexpansion time, but IMO they're still quite small. It'd be great to go smaller though. I think the answer to this would be
and
I've sidestepped this by overloading julia> @sum_type Foo{T} :hidden begin
A
B{T}(::T)
end
julia> A
ERROR: UndefVarError: A not defined
julia> B
ERROR: UndefVarError: B not defined
julia> Foo{Int}'.A
A::Foo{Int64}
julia> Foo'.B(1)
B(1)::Foo{Int64}
julia> let (; B) = Foo'
B(1)
end
B(1)::Foo{Int64} |
Beta Was this translation helpful? Give feedback.
-
I found now this discussion and I'd just like to mention that I built a new package for all of this: https://github.com/JuliaDynamics/DynamicSumTypes.jl. What distinguish this package from others is that it avoids pattern-matching and also if-else statements (mostly). To define a function differently for each (possibly parametrized) variant, it implements a "fake dispatch" mechanism, I think operating this way is a bit more idiomatic and intuitive. It can be also faster than all the other approaches in some cases, because the default option is to merge fields in a unique struct, while if memory efficiency is required it can use a sum type instead. I managed to integrate it in a few packages easily enough because types implemented in this way behave mostly like normal structs. |
Beta Was this translation helpful? Give feedback.
-
I tried to find an issue that contains a discussion for native support of this but didn't find any.
I think this is a very well-explored feature in the community already and people seem to have been using it for many different cases through packages, I collect some implementations by searching JuliaHub (that might not be complete, so please feel free to add yours):
Expronicon.ADT
(my own): discourse postUnityper
used by Symbolics.jl: https://github.com/YingboMa/Unityper.jlSumTypes
(by @MasonProtter ): https://github.com/MasonProtter/SumTypes.jlMLStyle
(by @thautwarm ): https://thautwarm.github.io/MLStyle.jl/latest/preview.html#generalized-algebraic-data-typesenum
should also fall in this categoryThe reason why I'm requesting opinions on having this supported natively is that I find there are some limitations as a package implementation, mostly because of type-instability created by package implementation and not being able to actually manage the behaviour of allocation, I will refer to this feature as
ADT
in the following context:I'm not if this is the only solution, but to make it static, the
ADT
requires a special fixed memory layout, similar to a C union, that is in principle the maximum size of the variant it contains. CurrentlyExpronicon
andUnityper
implements this as a special structure, which has a size larger than theoretical because there is no access to control the memory layout from Julia (and we probably shouldn't do it...)On the other hand, supporting generic ADT can be problematic without support from Julia's type inference, taking the
Option<T>
type (similar cases likeResult
etc. are similar) from rust as an example,currently, if we try to make
None
andSome
share the same Julia struct (which also means the same size), theNone
fromOption{Float64}
will be different fromOption{Int}
, so it seems an unintuitive, I don't have a good solution to this if we allow something like this it will create type instability, but perhaps if this is a native type, then type inference can at least warn people.Better support of the ADT requires also pattern matching, but I think we can do that separately as long as reflections are implemented for the built-in ADT.
There probably gonna be more design details that the community wants to work on since I'm not the first one proposing this. But I want to open this post so people could push it forward.
Beta Was this translation helpful? Give feedback.
All reactions