Make write(IO, Char) actually return the amount of printed bytes instead of the attempted written bytes. #56980
Isn't this aligning the implementation with the documented behavior? So this should actually be a bugfix, no?
Yes, @topolarity and I found this while griping about how hard it is to know whether write truncated bytes when it takes in non-string-like things, and I got confused as to why writing to a full buffer always succeeded.
This should definitely get a regression test before it's merged, though. Perhaps something like:
io = IOBuffer(; maxsize=1)
write(io, 'a')
@test write(io, 'a') == 0
 while true
-    write(io, u % UInt8)
+    n += write(io, u % UInt8)
     (u >>= 8) == 0 && return n
This currently advances through the given character unconditionally, but what happens if the first write fails and the second succeeds? Now there's suddenly a torn write involved here, and even though you can theoretically know that not all of the given Char has been written (e.g. by getting a return value of 3 when a 4-byte Char is passed), you still wouldn't know which byte was dropped.
I think it would be good to return after the first failing write, so that it's at least knowable that a valid prefix has been written (if the return value is nonzero).
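A minimal sketch of the "return after the first failing write" idea, assuming a byte write signals failure by returning 0. The names write_char_prefix and CappedSink are hypothetical and exist only for illustration; this is not the method in Base or the code in this PR.

```julia
# Hypothetical sink that accepts at most `cap` bytes and then reports 0 bytes written.
struct CappedSink <: IO
    bytes::Vector{UInt8}
    cap::Int
end

function Base.write(s::CappedSink, b::UInt8)
    length(s.bytes) < s.cap || return 0   # full: nothing written
    push!(s.bytes, b)
    return 1
end

# Hypothetical helper (not Base.write): writes the UTF-8 bytes of `c` in order
# and stops at the first byte that is not accepted, so the bytes that did get
# written always form a prefix of the character's encoding.
function write_char_prefix(io::IO, c::Char)
    u = bswap(reinterpret(UInt32, c))     # first encoded byte is now in the low 8 bits
    n = 0
    while true
        w = write(io, u % UInt8)
        w == 0 && return n                # abort on the first failed byte
        n += w
        (u >>= 8) == 0 && return n
    end
end

sink = CappedSink(UInt8[], 2)
n = write_char_prefix(sink, '€')          # '€' needs 3 bytes in UTF-8; only 2 fit
```

With this shape, a return value smaller than the encoded length tells the caller exactly which bytes are still outstanding: everything after the first n.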
Is there any Julia IO type where writing a byte can fail, return zero, and then succeed, without some error being thrown?
For example, writing a byte with TranscodingStreams.jl will either return 1 or throw an error.
Sure, a non-blocking buffered IO whose buffer is temporarily full, for example. I don't know whether there currently is such a type in the ecosystem, but the point is that it could exist and would be a valid IO, as far as I can tell.
Here's a (slightly contrived) example:
julia> io = IOBuffer(; maxsize=1)
IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=1, ptr=1, mark=-1)
julia> write(io, 'a')
1
julia> write(io, 'a') # should be 0 with this PR, since the write doesn't succeed
1
julia> seekstart(io); # simulate a read-end on some other process, for example
julia> read(io, Char) # happens on the read-end
'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)
julia> write(io, 'a') # continue writing
1
It's a bit awkward to do this with an IOBuffer, but the principle is the same for some IO type that has an actual read end that's distinct from the write end. For arbitrary I/O, it's usually preferable to drop data on the write end and retry later once the buffer is ready to send again. With the current behavior, the writer wouldn't know what to retransmit over the I/O, since it's impossible to know which byte(s) of the Char was/were not transmitted correctly. Effectively, the number returned by write becomes irrelevant and only matters when it matches sizeof(Char), at which point we might as well only return true/false. If we instead abort as soon as any internal write fails, we know that at least a correct prefix of the Char (or any data, in the general case) was written, and we can retry with only the data that we haven't attempted to transmit at all yet.
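To make the retry argument concrete, here is a small sketch under the same assumption (a byte write returns 0 when nothing was accepted). SlotSink and write_prefix are hypothetical names invented for this example; setting `free` by hand stands in for a reader draining the buffer.

```julia
# Hypothetical sink with a limited number of free slots; a reader "draining"
# it is simulated by resetting `free`.
mutable struct SlotSink <: IO
    bytes::Vector{UInt8}
    free::Int
end

function Base.write(s::SlotSink, b::UInt8)
    s.free > 0 || return 0
    push!(s.bytes, b)
    s.free -= 1
    return 1
end

# Hypothetical helper: writes data[from:end] and stops at the first byte that
# is not accepted. The return value counts the bytes written in this call.
function write_prefix(io::IO, data::AbstractVector{UInt8}, from::Int=1)
    n = 0
    for i in from:length(data)
        write(io, data[i]) == 0 && break
        n += 1
    end
    return n
end

data = collect(codeunits("héllo"))        # 6 bytes; 'é' takes two of them
sink = SlotSink(UInt8[], 2)
sent = write_prefix(sink, data)           # the sink fills up mid-character
sink.free = 10                            # simulate the reader making room
sent += write_prefix(sink, data, sent + 1)  # resume exactly where we stopped
```

Because each call stops at the first failure, `sent` is always a valid resume offset, which is the property the torn-write behavior loses.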
I don't think write ever errors for us, given asyncio and other stuff?
It does, e.g. when trying to write to a read-only stream. write itself has a synchronous API, i.e. it is (task-)blocking.
zero is not a "valid" return value in that case.
There is interesting historical data suggesting that some implementations of libc write were indeed able to return 0: https://stackoverflow.com/a/41970485
For quite a bunch of kinds of files, the behavior is unspecified, so more or less anything goes either way 🤷
I don't think write ever errors for us, given asyncio and other stuff?
Right, and for a non-blocking buffered IO it would be incredibly awkward to throw actual errors just because it's full. That possibility would be incredibly detrimental in the common case of success. I admit having 0 signal that is quite a bad API though. I guess this is yet another case where something like a Result{Int, Err} sum type would be nice, to distinguish success from errors 🤔
Maybe let me put it another way: would this be a valid IO subtype (barring some other missing methods)?
struct FlakyIO <: IO
io::IO
end
Base.write(fio::FlakyIO, b::UInt8) = rand(Bool) ? write(fio.io, b) : 0
You could get very fancy and record which writes succeeded and which ones failed for introspection later on, or use some more complicated scheme for deciding when exactly it "fails" to write anything. This kind of type would be incredibly useful for fuzzing code that accidentally depends on writes to IO always succeeding (like the fallback method of write in Base does, for example).
One issue I see with just throwing an error for partial writes/write failures of parts of larger types is that the return value of write then becomes meaningless: either we always get a full write, or we get an error. There would be no more room for partial writes, which can happen in a bunch of cases.
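For fuzzing or tests, a deterministic variant of the FlakyIO idea that also records outcomes might look like the following. This is only a sketch: ScriptedFlakyIO and its fields are hypothetical, and the failure policy is passed in as a function of the attempt number so that runs are reproducible.

```julia
# Hypothetical wrapper: forwards byte writes to `io` unless the policy says
# this attempt should "fail", and logs the outcome of every attempt.
struct ScriptedFlakyIO{T<:IO} <: IO
    io::T
    should_fail::Function      # attempt number -> Bool
    outcomes::Vector{Bool}     # true where the byte was forwarded
end
ScriptedFlakyIO(io::IO, should_fail) = ScriptedFlakyIO(io, should_fail, Bool[])

function Base.write(fio::ScriptedFlakyIO, b::UInt8)
    attempt = length(fio.outcomes) + 1
    if fio.should_fail(attempt)
        push!(fio.outcomes, false)
        return 0               # report that nothing was written
    end
    push!(fio.outcomes, true)
    return write(fio.io, b)
end

buf = IOBuffer()
fio = ScriptedFlakyIO(buf, iseven)                    # every second attempt fails
results = [write(fio, b) for b in codeunits("abcd")]  # one byte write per code unit
```

Afterwards `fio.outcomes` shows which attempts went through, which makes it easy to check whether a caller noticed the dropped bytes.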
This conversation is worth continuing, but for the purposes of fixing this bug I think it's orthogonal.
Our AbstractArray write method can also suffer "torn" writes in the same way:
function unsafe_write(s::IO, p::Ptr{UInt8}, n::UInt)
written::Int = 0
for i = 1:n
written += write(s, unsafe_load(p, i))
end
return written
end
This is probably worth splitting into a separate issue and fixing across-the-board. The only thing I think this needs to merge @gbaraldi is a test.
Filed #57011 to continue discussion here
This might break some tests but I want to see which