Periodic redis errors #428

Open
ajsharp opened this issue Aug 25, 2019 · 20 comments
@ajsharp

ajsharp commented Aug 25, 2019

I'm using ngx_mruby to do dynamic Let's Encrypt SSL certificate resolution. I'm currently building from source against nginx 1.16.0 and ngx_mruby 2.1.5.

Here are the relevant parts of the configuration:

# /etc/nginx/nginx.conf
http {
  # ...
  include /etc/nginx/conf.d/*.conf;

  mruby_init_worker_code '
    userdata = Userdata.new
    redis_url = "redis://my.redis.url:6379"
    redis_host, redis_port = redis_url[/redis:\/\/(.+)/, 1].split(":")
    userdata.redis = Redis.new redis_host, redis_port.to_i
    userdata.redis.select 2
  ';
}

# /etc/nginx/conf.d/app.conf
# ...
server {
  listen 443 ssl;
  # ...
  mruby_ssl_handshake_handler_code '
    ssl = Nginx::SSL.new
    domain = ssl.servername

    redis = Userdata.new.redis
    ssl_certificate = redis["#{domain}.crt"]
    ssl_key = redis["#{domain}.key"]

    if ssl_certificate && ssl_certificate != "" && ssl_key && ssl_key != ""
      ssl.certificate_data = ssl_certificate
      ssl.certificate_key_data = ssl_key
    end
  ';

  # ...
}

After roughly 45 minutes of running ngx_mruby, I start to see these Redis connection failures:

2019/08/25 20:53:03 [error] 23164#0: *17960 ngx_mruby : mrb_run failed: return 500 HTTP status code to client: error: INLINE CODE:6: could not read reply (Redis::ConnectionError) while SSL handshaking, client: 122.36.17.229, server: 0.0.0.0:443

If I reload nginx, the errors stop. If I let the server run for about 45 minutes, they inevitably return. Right now I'm avoiding them by reloading nginx every 15 minutes from a cron job, but it seems like something in the embedded Ruby code eventually causes the Redis connection to stop working. Maybe the embedded Ruby code is leaving Redis connections hanging.

The server this is running on has an unlimited ulimit, and the number of open Redis connections has never exceeded a couple hundred.

Has anyone else seen these types of errors?

@matsumotory
Owner

Thank you for your report! I'll investigate the issue.

@ajsharp
Author

ajsharp commented Aug 26, 2019

Something else I've noticed is that the existing nginx workers take an unusually long amount of time to shut down when I run a reload. Like a good 30 seconds, and the request load is very light.

@ajsharp
Author

ajsharp commented Oct 14, 2019

@matsumotory Any ideas on this?

@matsumotory
Owner

matsumotory commented Dec 13, 2019

Sorry for my late response. I think the Redis client timed out. Could you implement reconnecting to Redis when you catch a timeout error such as Redis::ConnectionError?
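
One way to read this suggestion is a small wrapper that rebuilds the client and retries once when a read fails. The sketch below is only an illustration under stated assumptions: `ReconnectingRedis` and `FakeClient` are hypothetical names, the `Redis::ConnectionError` defined here is a stand-in so the snippet runs under plain Ruby, and whether a fresh connection actually clears the error inside ngx_mruby is not confirmed anywhere in this thread.

```ruby
# Stand-in for the mruby-redis error class so this sketch runs under plain Ruby.
# Inside ngx_mruby the real Redis::ConnectionError would be used instead.
unless defined?(Redis)
  class Redis
    class ConnectionError < StandardError; end
  end
end

# Hypothetical wrapper: keeps a factory block that builds a connected client
# and rebuilds the client once if a read raises Redis::ConnectionError.
class ReconnectingRedis
  def initialize(&factory)
    @factory = factory
    @client = factory.call
  end

  def [](key)
    attempts = 0
    begin
      @client[key]
    rescue Redis::ConnectionError
      attempts += 1
      raise if attempts > 1    # give up after one reconnect attempt
      @client = @factory.call  # open a fresh connection
      retry
    end
  end
end

# Hypothetical fake client, used only to exercise the wrapper.
class FakeClient
  def initialize(data, broken: false)
    @data = data
    @broken = broken
  end

  def [](key)
    raise Redis::ConnectionError, "could not read reply" if @broken
    @data[key]
  end
end

store = { "example.com.crt" => "CERT" }
builds = 0
redis = ReconnectingRedis.new do
  builds += 1
  FakeClient.new(store, broken: builds == 1) # only the first client is broken
end

puts redis["example.com.crt"]
```

In the handler, the factory block would presumably wrap the same `Redis.new redis_host, redis_port.to_i` and `select 2` calls used in the configuration at the top of this issue.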

@ajsharp
Author

ajsharp commented Dec 13, 2019

Yes, I can try. I can try wrapping the mruby_ssl_handshake_handler_code in a begin/rescue. I'm not sure how/why a reconnect would do anything there, but I'll try and report back.

@matsumotory
Owner

Yes, it's strange behavior. I guess some OS or middleware layer setting may be disconnecting the TCP connection.

@ajsharp
Author

ajsharp commented Dec 27, 2019

@matsumotory Is there a reconnect interface or method on the mruby-redis client?

@adz624

adz624 commented Feb 12, 2020

@ajsharp @matsumotory
I have the same issue. Has this problem been solved? Thank you all.

@ajsharp
Author

ajsharp commented Feb 14, 2020

> Yes, it's strange behavior. Maybe OS or middleware layer settings disconnects the TCP connection I guess.

@matsumotory We're using the same Redis server (AWS ElastiCache) in our main app from the same host machine without connection issues, so the problem is almost definitely in this library in some way. It's unclear whether the problem is the Redis client this project uses or some lower-level issue with the nginx extension. However, it's not an OS/middleware issue, because we're maintaining persistent connections to the same Redis server from the same host machine without any problems.

@afunction The only "fix" we've found is to restart every 8 minutes or so. It seems like there's some sort of Redis connection issue.

When I tried to diagnose this before, I started looking through the redis client this project bundles, but I don't have the expertise or the time to get into the C++ code. It would be great if it was possible to add additional logging instrumentation around the connection logic.

@yyamano
Collaborator

yyamano commented Feb 17, 2020

@ajsharp @afunction I'm wondering if you guys can move Redis.new into mruby_ssl_handshake_handler_code? I don't think redis[] (mrb_redis_get() in C) supports reconnecting.

It looks like _redisContextConnectTcp() in hiredis supports it, but it is only called via redisConnectWithTimeout() from the following methods.

  • initialize(): mrb_redis_connect() in C
  • connect_set_raw(): mrb_redis_connect_set_raw() in C
  • connect_set_udptr(): mrb_redis_connect_set_raw() in C
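
The idea of connecting inside the handler can be sketched as a helper that opens a connection per handshake and always closes it afterwards. This is a rough illustration, not the project's recommended pattern: `fetch_cert_pair` and `FakeConn` are hypothetical names, and it assumes the client object responds to `[]` and `close`. A fake connection stands in for mruby-redis so the snippet runs under plain Ruby.

```ruby
# Hypothetical helper: open a connection, look up the certificate pair for a
# domain, and close the connection whether or not the lookup succeeds.
# `connect` is any callable returning an object that responds to [] and close.
def fetch_cert_pair(domain, connect)
  redis = connect.call
  begin
    crt = redis["#{domain}.crt"]
    key = redis["#{domain}.key"]
    # Treat nil and empty strings as "no certificate stored".
    return nil if crt.nil? || crt.empty? || key.nil? || key.empty?
    [crt, key]
  ensure
    redis.close
  end
end

# Hypothetical fake connection, used only to exercise the helper.
class FakeConn
  attr_reader :closed

  def initialize(data)
    @data = data
    @closed = false
  end

  def [](key)
    @data[key]
  end

  def close
    @closed = true
  end
end

conn = FakeConn.new({ "example.com.crt" => "CERT", "example.com.key" => "KEY" })
pair = fetch_cert_pair("example.com", -> { conn })
p pair
```

In the real handler, the callable would presumably build the client with `Redis.new redis_host, redis_port.to_i` and `select 2` as in the original configuration. The trade-off is one new TCP connection per TLS handshake, which may matter under heavier load.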

Disclaimer: I have never used Redis. I just looked at the mruby-redis and hiredis code while waiting for a Windows update.

@matsumotory
Owner

@yyamano Thank you for your suggestion.

@ajsharp @afunction Could you try using Redis#enable_keepalive, as described in the doc?

@matsumotory
Owner

Or you can implement error handling like this example.

@ajsharp
Author

ajsharp commented Mar 3, 2020

@matsumotory I tried using #enable_keepalive and it didn't help -- still getting the errors. I will try the begin rescue you linked to and report back.

@ajsharp
Author

ajsharp commented Mar 3, 2020

I've put begin/rescue blocks in both the mruby_ssl_handshake_handler_code block and the mruby_init_worker_code block. Errors are only being generated from the handshake handler block. Here's my code:

  begin
    ssl = Nginx::SSL.new
    domain = ssl.servername

    redis = Userdata.new.redis
    ssl_certificate = redis["#{domain}.crt"]
    ssl_key = redis["#{domain}.key"]

    if ssl_certificate && ssl_certificate != "" && ssl_key && ssl_key != ""
      ssl.certificate_data = ssl_certificate
      ssl.certificate_key_data = ssl_key
    end
  rescue Redis::ReplyError => e
    STDERR.puts "HANDSHAKE REPLY ERROR: #{e}"
  rescue Redis::ConnectionError => e
    STDERR.puts "HANDSHAKE CONN ERROR: #{e}"
  end

I'm seeing lots of Redis::ConnectionError exceptions in the nginx error logs. Unfortunately, all the reply says is: "could not read reply"

@ajsharp
Author

ajsharp commented Mar 3, 2020

@matsumotory
Owner

@ajsharp Thank you for the details. I have implemented your suggestion in matsumotory/mruby-redis#102. Could you try it?

@yyamano
Collaborator

yyamano commented Aug 12, 2020

@ajsharp Do you still have the problem with the newer version of mruby-redis?

@ajsharp
Author

ajsharp commented Aug 13, 2020

@yyamano I will try an upgrade and let you know. Thanks for checking in on this.

@ajsharp
Author

ajsharp commented Aug 13, 2020

@yyamano Is the latest version of mruby-redis installed with the latest version of this library?

@yyamano
Collaborator

yyamano commented Aug 13, 2020

@ajsharp If you rebuild ngx_mruby, the latest version of mruby-redis, including the change, is installed.
You can check build_config.rb.lock to confirm that the latest version is installed. Mine is:

    https://github.com/matsumotory/mruby-redis.git:
      url: https://github.com/matsumotory/mruby-redis.git
      branch: HEAD
      commit: f1f98d9450783b8c281b5064554b12bfb9f0a65a
      version: 0.0.1
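
For anyone who prefers to script this check, here is a minimal sketch in host-side Ruby (not mruby) that prints the locked commit for mruby-redis. `find_locked_gems` is a hypothetical helper, and the overall nesting of the lock file is not assumed: any mapping with a `url` key that mentions the gem name is treated as an entry. The sample text is the entry quoted above; in practice it would come from reading build_config.rb.lock, whose location depends on your build tree.

```ruby
require "yaml"

# Walk the parsed lock file and collect every mapping whose "url" mentions `name`.
# No particular nesting is assumed, so the search recurses through hashes and arrays.
def find_locked_gems(node, name, found = [])
  case node
  when Hash
    found << node if node["url"].to_s.include?(name)
    node.each_value { |child| find_locked_gems(child, name, found) }
  when Array
    node.each { |child| find_locked_gems(child, name, found) }
  end
  found
end

# Sample copied from the lock entry quoted in this comment.
sample = <<~LOCK
  https://github.com/matsumotory/mruby-redis.git:
    url: https://github.com/matsumotory/mruby-redis.git
    branch: HEAD
    commit: f1f98d9450783b8c281b5064554b12bfb9f0a65a
    version: 0.0.1
LOCK

find_locked_gems(YAML.safe_load(sample), "mruby-redis").each do |entry|
  puts "#{entry["url"]} @ #{entry["commit"]}"
end
```

Comparing the printed commit against the mruby-redis history shows whether the rebuilt binary includes the change from matsumotory/mruby-redis#102.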
