Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce RDMA transport (experimental) #55

Merged
merged 4 commits into from
Aug 1, 2024

Conversation

pizhenwei
Copy link
Contributor

@pizhenwei pizhenwei commented Jul 23, 2024

Hi,

Valkey Over RDMA has been supported as experimental feature since Valkey 8.0. Support RDMA transport for the client side.

There are commits in this PR:

  • Fully abstract IO framework
  • Fix timing issue on reconnect
  • Introduce RDMA transport (experimental)
  • cmake related support

These commits are already in order, they have separate function with detailed commit message, so no need to squash them into one.

Copy link
Collaborator

@michael-grunder michael-grunder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome. I've never used RDMA before so I plan on getting this building locally and playing around with it locally.

include/valkey/valkey_rdma.h Outdated Show resolved Hide resolved
src/conn.c Outdated Show resolved Hide resolved
src/rdma.c Outdated Show resolved Hide resolved
src/rdma.c Outdated Show resolved Hide resolved
src/rdma.c Outdated Show resolved Hide resolved
src/rdma.c Outdated Show resolved Hide resolved
src/rdma.c Outdated Show resolved Hide resolved
src/rdma.c Show resolved Hide resolved
src/rdma.c Show resolved Hide resolved
src/rdma.c Outdated Show resolved Hide resolved
@pizhenwei pizhenwei force-pushed the feature-rdma branch 2 times, most recently from e519823 to aa40c41 Compare July 26, 2024 06:54
@pizhenwei
Copy link
Contributor Author

pizhenwei commented Jul 26, 2024

This is awesome. I've never used RDMA before so I plan on getting this building locally and playing around with it locally.

Hi,

If there is already a physical RDMA interface on your platform, build this branch and run command VALKEY_RDMA_MODULE=/path/to/valkey-rdma.so VALKEY_RDMA_ADDR=192.168.122.1 TEST_RDMA=1 ./test.sh.

There is also a benchmark guide in Redis style, I'll create valkey style once this repo gets ready. Enjoy the 2.5x performance!

It's also possible to setup a virtual RoCE device by command rdma link add rxe0 type rxe netdev $IFACENAME, or use this script to make it easy.

@michael-grunder
Copy link
Collaborator

cc @bjosv @zuiderkwast Does Anybody want to take a look before we merge. Unfortunately, I haven't been able to get SoftRoCE working at all, but have an RDMA-enabled NIC coming soon.

Copy link
Collaborator

@bjosv bjosv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was new to me, but seems like a good feature!
Found some nits, and then I'll guess we need to add something to CI.
Maybe the plan was do do that after there is a (pre-)release of valkey with the feature?

Makefile Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
CMakeLists.txt Show resolved Hide resolved
valkey_rdma-config.cmake.in Outdated Show resolved Hide resolved
src/conn.c Outdated Show resolved Hide resolved
include/valkey/valkey_rdma.h Outdated Show resolved Hide resolved
src/rdma.c Outdated Show resolved Hide resolved
- Hide valkeyNetRead & valkeyNetWrite
Move valkeyContextDefaultFuncs into net.c, then valkeyNetRead &
valkeyNetWrite could be declared as static functions.
async.h is no longer needed by valkey.c, that makes valkey.c
independent from networking IO specific operations.

- Introduce .connect for valkeyContextFuncs

Abstract .connect method, then new connection type has a chance to
implement its own connect function. Finally, TCP/Unix/Userfd related
functions have been removed from valkey.c.

- Introduce .set_timeout method, setsockopt is only available for
socket family. None-socket transport would fail on this. .set_timeout
allows transport specific timeout setting function.

Finally, the fully abstract IO framework is expected to support more
transport. In especial, RDMA like none-socket transport.

Signed-off-by: zhenwei pi <[email protected]>
If necessary private context gets freed first, NULL pointer would be
accessed in .close method. The right and safe timing should be
.close then .free_privctx.

Signed-off-by: zhenwei pi <[email protected]>
Valkey Over RDMA[1] has been supported as experimental feature since
Valkey 8.0. Support RDMA transport for the client side.

RDMA is not a builtin feature, supported as module only, so we have to
run test.sh with more argument @VALKEY_RDMA_MODULE and @VALKEY_RDMA_ADDR.

An example to run test.sh:
VALKEY_RDMA_MODULE=/path/to/valkey-rdma.so VALKEY_RDMA_ADDR=192.168.122.1 TEST_RDMA=1 ./test.sh

 ...
 Testing against RDMA connection (192.168.122.1:56379):
 valkey-io#138 Is able to deliver commands: PASSED
 valkey-io#139 Is a able to send commands verbatim: PASSED
 valkey-io#140 %s String interpolation works: PASSED
 valkey-io#141 %b String interpolation works: PASSED
 valkey-io#142 Binary reply length is correct: PASSED
 valkey-io#143 Can parse nil replies: PASSED
 valkey-io#144 Can parse integer replies: PASSED
 valkey-io#145 Can parse multi bulk replies: PASSED
 valkey-io#146 Can handle nested multi bulk replies: PASSED
 valkey-io#147 Send command by passing argc/argv: PASSED
 valkey-io#148 Can pass NULL to valkeyGetReply: PASSED
 valkey-io#149 RESP3 PUSH messages are handled out of band by default: PASSED
 valkey-io#150 We can set a custom RESP3 PUSH handler: PASSED
 valkey-io#151 We properly handle a NIL invalidation payload: PASSED
 valkey-io#152 With no handler, PUSH replies come in-band: PASSED
 valkey-io#153 With no PUSH handler, no replies are lost: PASSED
 valkey-io#154 We set a default RESP3 handler for valkeyContext: PASSED
 valkey-io#155 We don't set a default RESP3 push handler for valkeyAsyncContext: PASSED
 valkey-io#156 Our VALKEY_OPT_NO_PUSH_AUTOFREE flag works: PASSED
 valkey-io#157 We can use valkeyOptions to set a custom PUSH handler for valkeyContext: PASSED
 valkey-io#158 We can use valkeyOptions to set a custom PUSH handler for valkeyAsyncContext: PASSED
 valkey-io#159 We can use valkeyOptions to set privdata: PASSED
 valkey-io#160 Our privdata destructor fires when we free the context: PASSED
 valkey-io#161 Successfully completes a command when the timeout is not exceeded: PASSED
 valkey-io#162 Does not return a reply when the command times out: SKIPPED
 valkey-io#163 Reconnect properly reconnects after a timeout: PASSED
 #164 Reconnect properly uses owned parameters: PASSED
 #165 Returns I/O error when the connection is lost: PASSED
 #166 Returns I/O error on socket timeout: PASSED
 #167 Set error when an invalid timeout usec value is used during connect: PASSED
 #168 Set error when an invalid timeout sec value is used during connect: PASSED
 #169 Append format command: PASSED
 #170 Throughput:
	(1000x PING: 0.010s)
	(1000x LRANGE with 500 elements: 0.060s)
	(1000x INCRBY: 0.012s)
	(10000x PING (pipelined): 0.066s)
	(10000x LRANGE with 500 elements (pipelined): 0.523s)
	(10000x INCRBY (pipelined): 0.024s)
 ...

Thanks to Michael Grunder for lots of review suggestions!

Link[1]: valkey-io/valkey#477
Signed-off-by: zhenwei pi <[email protected]>
Signed-off-by: zhenwei pi <[email protected]>
@pizhenwei
Copy link
Contributor Author

pizhenwei commented Jul 30, 2024

This was new to me, but seems like a good feature! Found some nits, and then I'll guess we need to add something to CI. Maybe the plan was do do that after there is a (pre-)release of valkey with the feature?

Sure, RDMA CI is in the plan.

The RDMA support is a module (shared library) of valkey-8.0:

  • if the library is built with valkey together, valkey-8.0 docker would be fine
  • otherwise, we need clone valkey-8.0 source code in libvalkey CI, and build valkey-8.0 with RDMA support.

There is a simple RDMA client in valkey, and valkey runs RDMA CI test.

And all the suggestions are appled, thanks!

@pizhenwei
Copy link
Contributor Author

Hi @bjosv @michael-grunder

This PR requests no squash, if force squash is needed, please let me know.

@michael-grunder michael-grunder merged commit 35e0ec2 into valkey-io:main Aug 1, 2024
36 of 43 checks passed
@pizhenwei pizhenwei deleted the feature-rdma branch August 1, 2024 10:54
Makefile Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants