Problem Getting "Big" Values #27

Open
JRaspass opened this issue Oct 1, 2022 · 13 comments

JRaspass commented Oct 1, 2022

Migrated from rt.cpan.org #93140 (status was 'open')

From [email protected] on 2014-02-18 16:59:35:

Hi there,

I have been playing around with a few Memcached client libraries recently
and, ideally, would like to stick with Perl, but with decent performance.
Memcached::Client is simple enough to use, but its performance isn't close
to what the Python or C# libraries offer.

Thinking it was probably due to its pure Perl nature, I then came across
your library.

During the CPAN installation I ran into the following unit test issue:
"Failed test 'Fetch# at t/big_value.t line 40."

With the Memcached server I am running, it's possible to up the default
value-size limit from the default 1MB to 128MB. Doing so and setting the
"max_size" parameter in your library I am able to set very large values
without any problem.
The problem comes when I try to get them out: it just doesn't return
anything.
The key/value most definitely has been set, as I can retrieve it with a get
call from Memcached::Client.

I've placed a code sample here, rather simple but it might help:
http://pastebin.com/QtBxqMhB

My setup:

  • Windows 2012
  • Strawberry Perl 5.16 64bits
  • Cache::Memcached::Fast v0.21
  • Memcached v1.4.5

I've had no problem using my Memcached server build with:

  • python-memcached
  • EnyimMemcached (C#)
  • Memcached::Client
    but it's worth noting that Cache::Memcached throws up quite a few
    unit test failures and therefore hasn't been tested.

Let me know if this makes any sense and/or if you need more information.

Thanks,
Damien

JRaspass commented Oct 1, 2022

From [email protected] on 2014-02-18 18:00:46:

On Tue Feb 18 11:59:35 2014, [email protected] wrote:

During the CPAN installation I ran into the following unit test issue:
"Failed test 'Fetch# at t/big_value.t line 40."

This test was written before memcached acquired the -I option to raise the 1MB limit. The test assumes that values slightly below 1MB will be accepted by the server, but values slightly above will be rejected. My guess is you ran the tests with memcached -I 128m, and the >1MB value wasn't rejected by the server, hence the test failure.
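
A minimal sketch of that assumption (hypothetical; not the actual
t/big_value.t code):

use Test::More tests => 2;
use Cache::Memcached::Fast;
use constant THRESHOLD => 1024 * 1024;   # memcached's default 1MB item limit

my $memd = Cache::Memcached::Fast->new({ servers => ['localhost:11211'] });

ok(  $memd->set( below => 'x' x (THRESHOLD - 1024) ), 'below limit stored'   );
ok( !$memd->set( above => 'x' x (THRESHOLD + 1024) ), 'above limit rejected' );

# under memcached -I 128m the second set succeeds, so a test written this
# way fails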

With the Memcached server I am running, it's possible to up the default
value-size limit from the default 1MB to 128MB. Doing so and setting the
"max_size" parameter in your library I am able to set very large values
without any problem.
The problem comes when I try to get them out: it just doesn't return
anything.

Can't reproduce. I ran memcached -I 128m -vv (version 1.4.17), adjusted your code to use a 100000 * 1024 value, and everything worked on Linux (I don't have access to any Windows host so can't test there):

$ perl /tmp/a.pl
Setting keys with Fast..
done.

Getting keys with Fast..
data_short=1024
data_long=102400000

Getting keys with Client..
data_short=1024
data_long=102400000

Output from memcached:
...
slab class 64: chunk size 134217728 perslab 1
...
<30 new auto-negotiating client connection
30: Client using the ascii protocol
<30 set data_short 0 0 1024
>30 STORED
<30 set data_long 0 0 102400000
>30 STORED
<30 get data_short
>30 sending key data_short
>30 END
<30 get data_long
>30 sending key data_long
>30 END
<31 new auto-negotiating client connection
31: Client using the ascii protocol
<31 get data_short
>31 sending key data_short
>31 END
<31 get data_long
>31 sending key data_long
>31 END
<30 connection closed.
<31 connection closed.

Could you please provide the output of memcached -vv when C::M::F fails?

JRaspass commented Oct 1, 2022

From [email protected] on 2014-02-19 09:50:58:

On 18 February 2014 18:00, Tomash Brechko via RT <[email protected]> wrote:

<URL: https://rt.cpan.org/Ticket/Display.html?id=93140 >

On Tue Feb 18 11:59:35 2014, [email protected] wrote:

During the CPAN installation I ran into the following unit test issue:
"Failed test 'Fetch# at t/big_value.t line 40."

This test was written before memcached acquired the -I option to raise the
1MB limit. The test assumes that values slightly below 1MB will be accepted
by the server, but values slightly above will be rejected. My guess is you
ran the tests with memcached -I 128m, and the >1MB value wasn't rejected by
the server, hence the test failure.

As per the big_value.t file, it seems that line 40 isn't about setting
something larger than 1MB (that's on line 43) but about getting the regular
test value ($value = 'x' x THRESHOLD), which seems rather identical to the
problem I experience.

With the Memcached server I am running, it's possible to up the default
value-size limit from the default 1MB to 128MB. Doing so and setting the
"max_size" parameter in your library I am able to set very large values
without any problem.
The problem comes when I try to get them out: it just doesn't return
anything.

Can't reproduce. I ran memcached -I 128m -vv (version 1.4.17), adjusted
your code to use a 100000 * 1024 value, and everything worked on Linux (I
don't have access to any Windows host so can't test there):

$ perl /tmp/a.pl
Setting keys with Fast..
done.

Getting keys with Fast..
data_short=1024
data_long=102400000

Getting keys with Client..
data_short=1024
data_long=102400000

Output from memcached:
...
slab class 64: chunk size 134217728 perslab 1
...
<30 new auto-negotiating client connection
30: Client using the ascii protocol
<30 set data_short 0 0 1024
>30 STORED
<30 set data_long 0 0 102400000
>30 STORED
<30 get data_short
>30 sending key data_short
>30 END
<30 get data_long
>30 sending key data_long
>30 END
<31 new auto-negotiating client connection
31: Client using the ascii protocol
<31 get data_short
>31 sending key data_short
>31 END
<31 get data_long
>31 sending key data_long
>31 END
<30 connection closed.
<31 connection closed.

Could you please provide the output of memcached -vv when C::M::F fails?

Sure, please find below the output:

...
slab class 64: chunk size 134217728 perslab 1
...
<604 new auto-negotiating client connection
604: Client using the ascii protocol
<604 set data_short 0 0 1024
>604 STORED
<604 set data_long 0 0 10240000
>604 STORED
<604 get data_short
>604 sending key data_short
>604 END
<604 get data_long
>604 sending key data_long
>604 END
<612 new auto-negotiating client connection
612: Client using the ascii protocol
<612 get data_short
>612 sending key data_short
>612 END
<612 get data_long
>612 sending key data_long
>612 END
<612 connection closed.
>608 END
<608 connection closed.

Nothing seems out of the ordinary, except that the C::M::F get for
"data_long" didn't actually return anything.

Thanks,
Damien

JRaspass commented Oct 1, 2022

From [email protected] on 2014-02-19 12:05:41:

On Wed Feb 19 04:50:58 2014, [email protected] wrote:

As per the big_value.t file, it seems that line 40 isn't about setting
something larger than 1MB (that's on line 43) but about getting the regular
test value ($value = 'x' x THRESHOLD), which seems rather identical to the
problem I experience.

Yup, you are right, I didn't check against the code. My final wild guess is that your server is too busy and memcached doesn't respond quickly enough. Please set the io_timeout parameter to zero to disable the I/O timeout (the default is 1 second), i.e.

my $memd_fast = Cache::Memcached::Fast->new({
    servers    => ['localhost:11211'],
    max_size   => 10000 * 1024,
    io_timeout => 0,
});

If this doesn't solve the problem, then it would be interesting to know the minimum item size that reveals the problem, and whether this size is always the same or varies slightly. Beyond that I don't have any further ideas.

JRaspass commented Oct 1, 2022

From [email protected] on 2014-02-19 12:11:29:

On Wed Feb 19 07:05:41 2014, KROKI wrote:

My final wild guess
is that your server is too busy and memcached doesn't respond quite
fast. Please set io_timeout parameter to zero to disable I/O timeout

Actually, by default close_on_error is enabled, so if it were a timeout issue the connection would be closed. However, from your memcached output I don't see a close event for 604 at all, which is puzzling.
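
For reference, a sketch of the parameter being discussed; close_on_error is
a C::M::F constructor option (enabled by default, per the above), and the
values here are only illustrative:

my $memd = Cache::Memcached::Fast->new({
    servers        => ['localhost:11211'],
    close_on_error => 1,   # default: abandon the connection on any error
    io_timeout     => 0,   # 0 disables the I/O timeout
});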

JRaspass commented Oct 1, 2022

From [email protected] on 2014-02-19 12:31:05:

On 19 February 2014 12:11, Tomash Brechko via RT <[email protected]> wrote:

<URL: https://rt.cpan.org/Ticket/Display.html?id=93140 >

On Wed Feb 19 07:05:41 2014, KROKI wrote:

My final wild guess is that your server is too busy and memcached doesn't
respond quickly enough. Please set the io_timeout parameter to zero to
disable the I/O timeout

Actually, by default close_on_error is enabled, so if it were a timeout
issue the connection would be closed. However, from your memcached output I
don't see a close event for 604 at all, which is puzzling.

The server has beefy specs and is otherwise idling, and the same behaviour
was witnessed when the client and server were run on my local machine,
which again is beefy and not overloaded at all.

I've re-run the same script, but with only the data_short set and gets, and
the C::M::F get connection does close, but later in time:
<604 new auto-negotiating client connection
604: Client using the ascii protocol
<604 set data_short 0 0 1024
>604 STORED
<604 get data_short
>604 sending key data_short
>604 END
<612 new auto-negotiating client connection
612: Client using the ascii protocol
<612 get data_short
>612 sending key data_short
>612 END
<612 connection closed.
<604 connection closed.

Running the Memcached server in -vvv I can see rows and rows of:
604: going from conn_read to conn_waiting
604: going from conn_waiting to conn_read
604: going from conn_read to conn_waiting
604: going from conn_waiting to conn_read
604: going from conn_read to conn_waiting
604: going from conn_waiting to conn_read
604: going from conn_read to conn_waiting
604: going from conn_waiting to conn_read
604: going from conn_read to conn_waiting

Which probably explains why the data isn't returned and the C::M::F
connection isn't closed in a timely fashion?

Thanks,
Damien

JRaspass commented Oct 1, 2022

From [email protected] on 2014-02-19 13:35:18:

On Wed Feb 19 07:31:05 2014, [email protected] wrote:

I've re-run the same script but with only the data_short set and gets and
the C::M::F get connection does close but later in time:
...
<612 connection closed.
<604 connection closed.

I wonder how much later the close of 604 happens; simply being after 612 is OK, but any delay makes me think of io_timeout.

604: going from conn_waiting to conn_read
604: going from conn_read to conn_waiting

Which probably explains why the data isn't returned and the C::M::F
connection isn't closed in a timely fashion?

Those lines mean the memcached server is reading (or trying to read) from the client. If they are output during the STORE then this is expected; if after the GET then this is puzzling.

Please do one more try before giving up: set io_timeout to zero, do the gets from Memcached::Client before the gets from C::M::F, with C::M::F do get(data_long) before get(data_short), and finally paste the output of memcached -vv here. Upgrading memcached to the latest version may also help (yes, I know that only C::M::F triggers the problem, but still ;)).
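
A sketch of the requested test order (variable names are hypothetical; the
two handles are assumed to be constructed as earlier in this thread):

# gets from Memcached::Client first
print 'M::C    data_long=',  length($memd_client->get('data_long')  // ''), "\n";
print 'M::C    data_short=', length($memd_client->get('data_short') // ''), "\n";

# then C::M::F, data_long before data_short
print 'C::M::F data_long=',  length($memd_fast->get('data_long')  // ''), "\n";
print 'C::M::F data_short=', length($memd_fast->get('data_short') // ''), "\n";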

JRaspass commented Oct 1, 2022

From [email protected] on 2014-02-19 14:14:02:

On 19 February 2014 13:35, Tomash Brechko via RT <[email protected]> wrote:

<URL: https://rt.cpan.org/Ticket/Display.html?id=93140 >

On Wed Feb 19 07:31:05 2014, [email protected] wrote:

604: going from conn_waiting to conn_read
604: going from conn_read to conn_waiting

Which probably explains why the data isn't returned and the C::M::F
connection isn't closed in a timely fashion?

Those lines mean the memcached server is reading (or trying to read) from
the client. If they are output during the STORE then this is expected; if
after the GET then this is puzzling.

They are definitely happening at the GET stage, not at the STORE.
On the data_long C::M::F run it'll go on forever; I let it go for a few
minutes and it didn't stop.
To be fair, if it is trying to communicate with the client, it's much too
late at this point, as we exited a long time ago.

Please do one more try before giving up: set io_timeout to zero, do the
gets from Memcached::Client before the gets from C::M::F, with C::M::F do
get(data_long) before get(data_short), and finally paste the output of
memcached -vv here. Upgrading memcached to the latest version may also
help (yes, I know that only C::M::F triggers the problem, but still ;)).

I've done as requested (first two gets are C::M::F), behaviour is the same
as per below:
<608 new auto-negotiating client connection
608: Client using the ascii protocol
<608 set data_short 0 0 1024
>608 STORED
<608 set data_long 0 0 10240000
>608 STORED
<616 new auto-negotiating client connection
616: Client using the ascii protocol
<616 get data_long
>616 sending key data_long
>616 END
<616 get data_short
>616 sending key data_short
>616 END
<608 get data_long
>608 sending key data_long
>608 END
<620 new auto-negotiating client connection
620: Client using the ascii protocol
<620 get data_short
>620 sending key data_short
>620 END
<616 connection closed.
<620 connection closed.

There don't seem to be official Windows builds of Memcached any more, and
it's difficult to justify spending too much time trying to build from
source when there doesn't seem to be a known issue with this revision and
other clients aren't having the same problem.

I understand if you can't devote more time to finding this Windows-specific
problem, but could you perhaps point me to the places in the C code I
should look at first, should I want to troubleshoot myself?

Thanks,
Damien

JRaspass commented Oct 1, 2022

From [email protected] on 2014-02-19 14:19:43:

On 19 February 2014 14:13, Damien Chaumette <[email protected]> wrote:

I've done as requested (first two gets are C::M::F)[..]

Sorry I meant the first two gets are with M::C and the second batch with
C::M::F, as requested.

Thanks,
Damien

JRaspass commented Oct 1, 2022

From [email protected] on 2014-02-19 15:32:28:

Apologies for the spam. I've tried with a Couchbase Memcached bucket (they
seem to provide an up-to-date Windows Memcached) and get the same behaviour.

Thanks,
Damien

On 19 February 2014 14:19, Damien Chaumette <[email protected]> wrote:

On 19 February 2014 14:13, Damien Chaumette <[email protected]> wrote:

I've done as requested (first two gets are C::M::F)[..]

Sorry I meant the first two gets are with M::C and the second batch with
C::M::F, as requested.

Thanks,
Damien

JRaspass commented Oct 1, 2022

From [email protected] on 2014-02-19 16:17:14:

On Wed Feb 19 09:14:02 2014, [email protected] wrote:

They are definitely happening at the GET stage, not at the STORE.
On the data_long C::M::F run it'll go on forever; I let it go for a few
minutes and it didn't stop.
To be fair, if it is trying to communicate with the client, it's much too
late at this point, as we exited a long time ago.

The C::M::F client sends a get request, reads the reply, and sends nothing else after that. If I understood you correctly, the memcached server is then trying to read something from the client, and I don't understand why.
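
For reference, a plain get in the memcached ASCII protocol is a single
request line answered by the value and a terminator (sizes here match the
keys in this thread):

client: get data_long
server: VALUE data_long 0 10240000
server: ...10240000 bytes of data...
server: END

After END the client has no reason to send anything further on that
connection until its next command.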

I've done as requested (first two gets are C::M::F), behaviour is the same
as per below:

What I see from the trace:

<608 new auto-negotiating client connection
608: Client using the ascii protocol
<608 set data_short 0 0 1024
>608 STORED
<608 set data_long 0 0 10240000
>608 STORED

C::M::F stored two keys in memcached.

<616 new auto-negotiating client connection
616: Client using the ascii protocol
<616 get data_long
>616 sending key data_long
>616 END
<616 get data_short
>616 sending key data_short
>616 END

Other client got two keys.

<608 get data_long
>608 sending key data_long
>608 END

C::M::F requested data_long and got some error at this point, closed the connection, and requested data_short via another connection:

<620 new auto-negotiating client connection
620: Client using the ascii protocol
<620 get data_short
>620 sending key data_short
>620 END
<616 connection closed.
<620 connection closed.

Given that you disabled io_timeout, I can only guess what error could have happened, but C::M::F definitely got one, otherwise it wouldn't have opened another connection. So the question is why C::M::F thinks it got an error, and whether its perception is valid.

There don't seem to be official Windows builds of Memcached any more, and
it's difficult to justify spending too much time trying to build from
source when there doesn't seem to be a known issue with this revision and
other clients aren't having the same problem.

There's a tiny possibility that C::M::F specifically triggers some race in the memcached server that other clients don't, and race fixes happen in almost every memcached release (without a detailed description of what they might affect). But as per your last reply, the problem is reproducible with the latest memcached.

I understand if you can't devote more time to finding this Windows-specific
problem, but could you perhaps point me to the places in the C code I
should look at first, should I want to troubleshoot myself?

First of all, an easier way would be to use another module like Cache::Memcached::libmemcached (provided that it builds on Windows and works); it should be comparably fast. Debugging Perl modules in C is a pain even on Linux, and simply looking into the code won't reveal much, I think, because the problem somehow relates to your setup (I built memcached 1.4.5 here but couldn't reproduce). Value reading happens in src/client.c:read_value(). But if you really want to devote time to it, the first thing I would try is to capture network traffic to see the actual packet contents (data_long is big, but protocol commands will always be at packet beginnings), and to trace system calls (on Linux we have the strace utility, which shows system calls and their results; I don't know what Windows has).
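
For illustration, the tracing step on Linux might look like this (the
script name is hypothetical):

# capture the memcached traffic on the loopback interface for inspection
tcpdump -i lo -w memcached.pcap 'tcp port 11211' &

# log the network-related system calls made while the script runs
strace -f -e trace=network -o strace.log perl big_value_test.pl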

JRaspass commented Oct 1, 2022

From [email protected] on 2014-02-20 09:26:47:

I understand if you can't devote more time to finding this Windows-specific
problem, but could you perhaps point me to the places in the C code I
should look at first, should I want to troubleshoot myself?

First of all, an easier way would be to use another module like
Cache::Memcached::libmemcached (provided that it builds on Windows and
works); it should be comparably fast. Debugging Perl modules in C is a pain
even on Linux, and simply looking into the code won't reveal much, I think,
because the problem somehow relates to your setup (I built memcached 1.4.5
here but couldn't reproduce). Value reading happens in
src/client.c:read_value(). But if you really want to devote time to it, the
first thing I would try is to capture network traffic to see the actual
packet contents (data_long is big, but protocol commands will always be at
packet beginnings), and to trace system calls (on Linux we have the strace
utility, which shows system calls and their results; I don't know what
Windows has).

I've just had a look at Cache::Memcached::libmemcached on CPAN; sadly, it
doesn't compile out of the box on my Strawberry Perl.
There also seem to be a few Couchbase Perl modules on CPAN as well; with a
bit of luck one of them will be in C and compile on Windows.
I'll try to get C::M::libmemcached to work first, if not then a fast
Couchbase module; failing that, I may do some network sniffing and/or look
into Windows system calls.

Thanks,
Damien

JRaspass commented Oct 1, 2022

From [email protected] on 2022-06-18 03:32:19:

Hello,

I just noticed that src/client.c:read_value() returns 4294967295 where it should return -1.
4294967295 is 2^32 - 1.
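
That value is the classic signature of a signed -1 squeezed through an
unsigned 32-bit type. A minimal Perl illustration (not the module's actual
code, which is C):

# reinterpret the bytes of a signed 32-bit -1 as an unsigned integer
my $rc = unpack 'L', pack 'l', -1;
print "$rc\n";    # 4294967295, i.e. 2**32 - 1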

My setup:

  • Windows 8.1
  • Strawberry Perl 5.32.1 64bits
  • Cache::Memcached::Fast v0.27
  • Memcached v1.6.9

Thank you,

twata

JRaspass commented Oct 1, 2022

From [email protected] on 2022-06-18 03:44:29:

Sorry, it was src/client.c:readv_restart(), not src/client.c:read_value().
