Make write(IO, Char) actually return the amount of printed bytes instead of the attempted written bytes. #56980

gbaraldi · 2025-01-07T18:36:58Z

This might break some tests but I want to see which

…ead of the attempted written bytes.

Seelengrab · 2025-01-07T18:50:25Z

Isn't this aligning the implementation with the documented behavior? So this should actually be a bugfix, no?

write(io::IO, x)

[...] Return the number of bytes written into the stream.

gbaraldi · 2025-01-07T18:55:09Z

Yes, @topolarity and I found this while griping about how hard it is to know whether write truncated bytes when it is given non-string-like things. I got confused as to why writing to a full buffer always appeared to succeed.

Seelengrab · 2025-01-07T21:50:55Z

This should definitely get a regression test before it's merged though - perhaps something like

io = IOBuffer(;maxsize=1)
write(io, 'a')
@test write(io, 'a') == 0

?
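As a rough illustration of what such a test exercises, the sketch below compares a byte-level write with a Char write on a full buffer. The variable names are only for illustration. The byte-level results are what the test relies on; the result of the Char write depends on whether the running Julia version includes this change, so it is deliberately left without an expected value.

```julia
# A buffer with room for exactly one byte.
io = IOBuffer(; maxsize=1)

first_byte  = write(io, 0x61)  # fits: one byte stored
second_byte = write(io, 0x61)  # buffer is full: nothing stored

# Writing a Char goes through the byte-level method internally.
# Without this change the attempted byte count is reported,
# with it the count of bytes actually stored is reported.
char_result = write(io, 'a')

println((first_byte, second_byte, char_result))
```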

Seelengrab · 2025-01-07T21:53:37Z

base/io.jl

    while true
-        write(io, u % UInt8)
+        n += write(io, u % UInt8)
        (u >>= 8) == 0 && return n


This currently unconditionally advances through the bytes of the given character, but what happens if the first write fails and the second succeeds? Now there's suddenly a torn write involved, and even though you can theoretically know that not all of the given Char has been written (e.g. getting a return value of 3 when a 4-byte Char is passed), you still wouldn't know which byte was dropped.

I think it would be good to return after the first failing write, so that it's at least knowable that a valid prefix has been written (if the return value is nonzero).
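A minimal sketch of that suggestion, assuming a hypothetical helper named write_char_prefix (it is not part of Base and not what this PR merges). It follows the same byte loop as the diff above but returns as soon as one byte-level write reports 0, so a nonzero result always describes a leading prefix of the character's encoding.

```julia
# Hypothetical helper for illustration only; not Base's implementation.
function write_char_prefix(io::IO, c::Char)
    # After the byte swap, the first encoded byte sits in the low 8 bits.
    u = bswap(reinterpret(UInt32, c))
    n = 0
    while true
        w = write(io, u % UInt8)
        w == 0 && return n          # stop at the first failed byte
        n += w
        (u >>= 8) == 0 && return n  # no bytes left to write
    end
end

# '€' encodes to three bytes (0xE2 0x82 0xAC); the buffer only holds two.
io = IOBuffer(; maxsize=2)
n = write_char_prefix(io, '€')
stored = take!(io)
```

Here n is 2 and stored holds the first two bytes of the encoding, so a caller knows exactly which byte is still outstanding.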

Is there any Julia IO type where writing a byte can fail, return zero, and then succeed, without some error being thrown?

For example, writing a byte with TranscodingStreams.jl will either return 1 or throw an error.

Sure, a non-blocking buffered IO whose buffer is temporarily full, for example. I don't know whether there currently is such a type in the ecosystem, but the point is that it could exist and would be a valid IO, as far as I can tell.

Here's a (slightly contrived) example:

    julia> io = IOBuffer(; maxsize=1)
    IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=1, ptr=1, mark=-1)

    julia> write(io, 'a')
    1

    julia> write(io, 'a') # should be 0 with this PR, since the write doesn't succeed
    1

    julia> seekstart(io); # simulate a read-end on some other process, for example

    julia> read(io, Char) # happens on the read-end
    'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)

    julia> write(io, 'a') # continue writing
    1

It's a bit awkward to do this with an IOBuffer, but the principle is the same for some IO type that has an actual read-end that's distinct from the write end. For arbitrary I/O, it's usually preferable to drop data on the write end and retry later once the buffer is ready to send again. With the current behavior, the writer wouldn't know what to retransmit over the I/O, since it's impossible to know which byte(s) of the Char were not transmitted correctly. Effectively, the number returned by write becomes irrelevant and only matters when it matches sizeof(Char), at which point we might as well return only true/false. If we instead abort as soon as any internal write fails, we know that at least a correct prefix of the Char (or any data, in the general case) was written, and we can retry with only the data that we haven't attempted to transmit at all yet.
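The retry scheme described here can be sketched as follows. BoundedSink and write_prefix are hypothetical names invented for this illustration; the sink stands in for an IO whose buffer is temporarily full, and draining it stands in for a separate read end.

```julia
# Hypothetical sink that rejects bytes once it holds `capacity` of them.
struct BoundedSink <: IO
    buf::Vector{UInt8}
    capacity::Int
end

function Base.write(s::BoundedSink, b::UInt8)
    length(s.buf) < s.capacity || return 0  # full: report nothing written
    push!(s.buf, b)
    return 1
end

# Hypothetical helper: stop at the first byte that is not accepted,
# so the return value always counts a leading prefix of `bytes`.
function write_prefix(io::IO, bytes::AbstractVector{UInt8})
    n = 0
    for b in bytes
        write(io, b) == 0 && break
        n += 1
    end
    return n
end

sink = BoundedSink(UInt8[], 2)
payload = Vector{UInt8}(codeunits("€"))  # three bytes

sent = write_prefix(sink, payload)       # only a two-byte prefix fits

# Simulate the read end draining the sink.
received = copy(sink.buf)
empty!(sink.buf)

# Retry with exactly the bytes that were never attempted.
sent += write_prefix(sink, payload[sent+1:end])
append!(received, sink.buf)

roundtrip = String(copy(received))
```

Because the first call stops at the first rejected byte, the slice payload[sent+1:end] is known to be the untransmitted remainder, and the reader ends up with the complete character.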

I don't think write ever errors for us, given asyncio and other stuff?

It does, e.g. trying to write to a read-only stream. write itself has a synchronous API, i.e. it is (task-)blocking.

zero is not a "valid" return value in that case.

There is interesting historical data suggesting that some implementations of libc write were indeed able to return 0: https://stackoverflow.com/a/41970485

For quite a few kinds of files, the behavior is unspecified, so more or less anything goes either way 🤷

I don't think write ever errors for us, given asyncio and other stuff?

Right, and for a non-blocking buffered IO it would be incredibly awkward to throw actual errors just because it's full. That possibility would be incredibly detrimental in the common case of success. I admit that having 0 signal this is quite a bad API though. I guess this is yet another case where something like a Result{Int, Err} sum type would be nice, to distinguish success from errors 🤔

Maybe let me put it another way - would this be a valid IO subtype (barring some other missing methods)?

    struct FlakyIO <: IO
        io::IO
    end
    Base.write(fio::FlakyIO, b::UInt8) = rand(Bool) ? write(fio.io, b) : 0

You could get very fancy and record which writes succeeded & which ones failed for introspection later on, or do some more complicated scheme for deciding when exactly it "fails" to write anything. This kind of type would be incredibly useful for fuzzing stuff that accidentally depends on writes to IO always succeeding (like the fallback method of write in Base does, for example).
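A deterministic variant of that idea might look like the sketch below. ScriptedIO and write_all_bytes are hypothetical names for this illustration; the script decides which byte-level writes fail, and the helper accumulates per-byte results without stopping, in the same way as the loop in the diff above.

```julia
# Hypothetical IO whose byte writes succeed or fail according to a script.
struct ScriptedIO <: IO
    accepted::Vector{UInt8}  # bytes that were actually stored
    script::Vector{Bool}     # consumed front to back; empty means "succeed"
end

function Base.write(s::ScriptedIO, b::UInt8)
    ok = isempty(s.script) ? true : popfirst!(s.script)
    ok || return 0
    push!(s.accepted, b)
    return 1
end

# Sum the per-byte results and keep going after a failure.
function write_all_bytes(io::IO, bytes)
    n = 0
    for b in bytes
        n += write(io, b)
    end
    return n
end

# Fail the first byte of '€' (0xE2 0x82 0xAC), accept the other two.
io = ScriptedIO(UInt8[], [false, true, true])
n = write_all_bytes(io, codeunits("€"))
```

The count comes back as 2, yet the stored bytes are the last two of the encoding rather than the first two, which is the torn write the discussion is about: the return value alone cannot say which byte was dropped.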

One issue I see with just throwing an error for partial writes/write failures of parts of larger types is that then the return value of write becomes meaningless - either we always get a full write, or we get an error. There would be no more room for partial writes, which can happen in a bunch of cases.

This conversation is worth continuing, but for the purposes of fixing this bug I think it's orthogonal.

Our AbstractArray write method can also suffer "torn" writes in the same way:

    function unsafe_write(s::IO, p::Ptr{UInt8}, n::UInt)
        written::Int = 0
        for i = 1:n
            written += write(s, unsafe_load(p, i))
        end
        return written
    end

This is probably worth splitting into a separate issue and fixing across the board. The only thing I think this needs before merging, @gbaraldi, is a test.
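For comparison, one possible shape of a prefix-preserving variant of that loop is sketched below. The name unsafe_write_prefix is hypothetical, and this is not the fix tracked in the follow-up issue; it only shows what stopping at the first failed byte would look like for the pointer-based method.

```julia
# Hypothetical variant for illustration; Base's unsafe_write is unchanged.
function unsafe_write_prefix(s::IO, p::Ptr{UInt8}, n::UInt)
    written = 0
    for i in 1:n
        # Stop as soon as one byte is not accepted, so `written`
        # always counts a leading prefix of the input.
        write(s, unsafe_load(p, i)) == 0 && break
        written += 1
    end
    return written
end

data = UInt8[0x01, 0x02, 0x03, 0x04]
io = IOBuffer(; maxsize=3)

# Keep `data` alive while its raw pointer is in use.
written = GC.@preserve data unsafe_write_prefix(io, pointer(data), UInt(length(data)))
stored = take!(io)
```

With a three-byte buffer the call reports 3 and the buffer holds the first three input bytes, so the caller can resume from data[written+1].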

Filed #57011 to continue discussion here

Make write(IO, Char) actually return the amount of printed bytes inst…

7ae5839

…ead of the attempted written bytes.

gbaraldi added the io, bugfix, backport 1.10, and backport 1.11 labels Jan 7, 2025

JeffBezanson approved these changes Jan 7, 2025


Seelengrab reviewed Jan 7, 2025


JeffBezanson added the needs tests label Jan 7, 2025

Add test

4e340e4

topolarity added the merge me label and removed the needs tests label Jan 9, 2025

topolarity approved these changes Jan 9, 2025


IanButterworth merged commit 6ac351a into master Jan 10, 2025
8 of 9 checks passed

IanButterworth deleted the gb/writebytes branch January 10, 2025 02:24

topolarity mentioned this pull request Jan 10, 2025

write(...) output "torn" after a partial write #57011

Open

topolarity removed the merge me label Jan 10, 2025


gbaraldi commented Jan 7, 2025
Seelengrab commented Jan 7, 2025 (edited)
gbaraldi commented Jan 7, 2025
Seelengrab commented Jan 7, 2025
Seelengrab Jan 7, 2025 (edited)
nhz2 Jan 7, 2025
nhz2 Jan 7, 2025
Seelengrab Jan 8, 2025
Seelengrab Jan 8, 2025
gbaraldi Jan 9, 2025
JeffBezanson Jan 9, 2025 (edited)
Seelengrab Jan 9, 2025 (edited)
topolarity Jan 9, 2025
topolarity Jan 10, 2025 (edited)
