
RFE: Skeleton for DMA layer #306

Draft · wants to merge 6 commits into main
Conversation

@bhargavshah1988 (Contributor) commented Nov 12, 2024

}
}

pub fn map_dma_ranges(
Contributor:

I understand this is a draft ... could you add a doc comment so that folks know the intended contract and usage here?
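For example, something along these lines (a sketch only; the contract wording is based on the behavior discussed later in this thread and may not match the final design):

/// Maps the given guest memory ranges for DMA.
///
/// For each range, the client either pins the memory or copies it into a
/// bounce buffer (based on the client's pinning threshold and the per-call
/// options) and records the device-visible address. The returned
/// DmaTransactionHandler must be passed back to unmap_dma_ranges once the
/// I/O completes so that pins and bounce buffers are released.
pub fn map_dma_ranges(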

}

/// Adds a new client to the list and stores its pinning threshold
fn register_client(&self, client: &Arc<DmaClient>, threshold: usize) {
Contributor:

Is there a need to have per-client bounce buffers?

}

// Trait for the DMA interface
pub trait DmaInterface {
Contributor:

Do you envision that this would replace other uses of bounce buffering? (for example, copying from private memory into shared memory for isolated VMs, or when the block disk bounces for arm64 guests)?

Contributor (Author):

Yes, I do envision that.
A policy on GlobalDmaManager can control the behavior system-wide.
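For illustration, a hypothetical sketch of what such a system-wide policy knob on the manager could look like (these names are invented for this example, not part of the PR):

/// Hypothetical system-wide DMA policy on the manager (illustrative only).
pub enum DmaPolicy {
    /// Prefer pinning; bounce only when pinning is not possible.
    PreferPinning,
    /// Always use bounce buffers, e.g. for isolated VMs.
    AlwaysBounce,
}

pub struct GlobalDmaManager {
    policy: DmaPolicy,
    // bounce buffer pool, registered clients, ...
}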

Contributor:

Got it. How do you envision handling this case:

  1. You have a VTL0 VM where memory sometimes needs to be pinned.
  2. A particular transaction's memory must be placed in a bounce buffer, even if pinning would otherwise succeed?

(I'm thinking about the block device driver here, where it would never want to pin memory - the kernel doesn't know about the VTL0 addresses.)

Contributor:

We discussed this offline. map_dma_ranges will take additional per-transaction parameters. For example, some clients may want all transactions to be placed into the bounce buffer.
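For reference, a sketch of such per-transaction options, based on the DmaMapOptions / force_bounce_buffer usage that appears elsewhere in this diff (any additional fields would be speculative):

/// Per-transaction mapping options (sketch).
#[derive(Default)]
pub struct DmaMapOptions {
    /// Place this transaction in the bounce buffer even if pinning would
    /// otherwise succeed (e.g. for the block device driver case above).
    pub force_bounce_buffer: bool,
}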

static GLOBAL_DMA_MANAGER: OnceCell<Arc<GlobalDmaManager>> = OnceCell::new();

/// Global DMA Manager to handle resources and manage clients
pub struct GlobalDmaManager {
Contributor:

What settings will this manager have? Which of those do you expect to expose in Vtl2Settings?

let mut dma_transactions = Vec::new();
let force_bounce_buffer = options.map_or(false, |opts| opts.force_bounce_buffer);

let threshold = manager.get_client_threshold(self).ok_or(DmaError::InitializationFailed)?;
Contributor:

This line can be moved before defining dma_transactions to fail earlier.
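I.e., something like this (a sketch of the suggested reordering, using the lines from the draft):

// Look up the threshold first so that an unregistered client fails early.
let threshold = manager
    .get_client_threshold(self)
    .ok_or(DmaError::InitializationFailed)?;

let mut dma_transactions = Vec::new();
let force_bounce_buffer = options.map_or(false, |opts| opts.force_bounce_buffer);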


for range in ranges {
let use_bounce_buffer = force_bounce_buffer || range_size > threshold || !self.can_pin(range);

Contributor:

Extra empty line here.


for transaction in dma_transactions {
if transaction.is_bounce_buffer {
// Code to release bounce buffer
Contributor:

Do we need to copy out of the bounce buffer here?

Contributor:

The caller may not know whether it's a bounce buffer or not, so I think we need to handle it here, and we need to pass the memory range to copy the data out. Actually, I think we need to know the IO direction to decide which copy is needed (including the one in map_dma_ranges).
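One way to express that, as a sketch (not part of the current draft): pass an I/O direction when mapping, copy into the bounce buffer in map_dma_ranges for host-to-device transfers, and copy back out in unmap_dma_ranges for device-to-host transfers.

/// Hypothetical I/O direction hint (illustrative only).
pub enum DmaDirection {
    /// The device reads host memory: copy into the bounce buffer during
    /// map_dma_ranges.
    HostToDevice,
    /// The device writes host memory: copy out of the bounce buffer during
    /// unmap_dma_ranges.
    DeviceToHost,
    /// Both copies are needed.
    Bidirectional,
}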


/// Allocates a bounce buffer if available, otherwise returns an error
pub fn allocate_bounce_buffer(&self, size: usize) -> Result<usize, DmaError> {
Err(DmaError::BounceBufferFailed) // Placeholder
Contributor:

Do we need to ensure the bounce buffer is page-aligned?

Contributor (Author):

I envision that bounce buffer management will keep allocations page-aligned.

Contributor:

Also note: our current bounce buffer allocation function has an infinite-loop issue, which we want to avoid in the new implementation.
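A minimal sketch of a bounded, page-aligned allocation path under those two constraints (PAGE_SIZE and try_reserve are placeholders for whatever the real bounce buffer pool provides):

const PAGE_SIZE: usize = 4096; // placeholder

/// Sketch: round the request up to whole pages and fail immediately instead
/// of retrying forever when no bounce buffer space is available.
pub fn allocate_bounce_buffer(&self, size: usize) -> Result<usize, DmaError> {
    let aligned_size = size
        .checked_add(PAGE_SIZE - 1)
        .ok_or(DmaError::BounceBufferFailed)?
        & !(PAGE_SIZE - 1);
    self.try_reserve(aligned_size) // hypothetical pool helper returning Option<usize>
        .ok_or(DmaError::BounceBufferFailed)
}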

let result = self.issue_raw(command).await;

dma_client
.unmap_dma_ranges(&dma_transactions.transactions)
Contributor:

So we need to handle the same functionality as copy_to_guest_memory in unmap_dma_ranges when opcode.transfer_controller_to_host() is true.

let result = self.issue_raw(command).await;

dma_client
.unmap_dma_ranges(&dma_transactions.transactions)
Contributor:

To simplify usage, can we just pass dma_transactions to unmap_dma_ranges, so the caller needn't know the details of DmaTransactionHandler?
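E.g., as a sketch:

// Sketch: take the whole handler so the caller never inspects its internals.
pub fn unmap_dma_ranges(&self, handler: &DmaTransactionHandler) -> Result<(), DmaError> {
    for transaction in &handler.transactions {
        // Unpin or release the bounce buffer (and copy out if needed).
    }
    Ok(())
}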

pub original_addr: usize,
pub dma_addr: usize,
pub size: usize,
pub is_pinned: bool,
Contributor:

Can we have comments explaining these fields? For example, is_pinned and is_bounce_buffer cannot both be true, so why do we need to keep both?

Contributor (Author):

Yes, I will add it.

@mattkur (Contributor) Jan 10, 2025:

Agree with Juan. It seems you can go further and do something like:

pub enum MemoryBacking {
    Pinned { prepinned: bool },
    InBounceBuffer,
}

And rather than doing if x.is_pinned, instead do match x.backing { MemoryBacking::Pinned { .. } => ...
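A short usage sketch of that shape, assuming the enum above replaces the is_pinned/is_bounce_buffer booleans on DmaTransaction:

match transaction.backing {
    MemoryBacking::Pinned { prepinned } => {
        // Only unpin if this mapping call did the pinning.
        if !prepinned {
            // unpin ...
        }
    }
    MemoryBacking::InBounceBuffer => {
        // Copy out if the direction requires it, then release the buffer.
    }
}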

Member:

If DMA mapping options are disjoint (i.e. pinned or in a bounce buffer), then they should be represented with an enum like Matt suggested.

Contributor (Author):

Yes, I will change this to an enum.

let threshold = manager.get_client_threshold(self).ok_or(DmaError::InitializationFailed)?;

for range in ranges {
let use_bounce_buffer = force_bounce_buffer || range_size > threshold || !self.can_pin(range);
Contributor:

If can_pin returns true, does that guarantee that the later pin_memory will succeed?
I guess not. Then what's the strategy for handling pin/bounce-buffer failures?
I mean, should we fall back to the bounce buffer if pinning fails?
Should we try pinning if bounce buffer allocation fails?
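For example, one possible fallback strategy, purely as a sketch (pin_memory here is a hypothetical helper that, like allocate_bounce_buffer, returns the device-visible address):

// Sketch: try pinning first, fall back to the bounce buffer, and only fail
// the transaction if both paths fail.
let dma_addr = if use_bounce_buffer {
    self.allocate_bounce_buffer(range.len() as usize)?
} else {
    match self.pin_memory(range) {
        Ok(addr) => addr,
        // can_pin() is only a hint; pinning may still fail here.
        Err(_) => self.allocate_bounce_buffer(range.len() as usize)?,
    }
};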

Ok(DmaTransactionHandler { transactions })
}

pub fn unmap_dma_ranges(&self, dma_transactions: &[DmaTransaction]) -> Result<(), DmaError> {
Contributor:

Will this need to be an &mut reference to the dma_transactions?

use memory_range::MemoryRange;
use once_cell::sync::OnceCell;

pub use dma_client::{DmaClient, DmaInterface, DmaTransaction, DmaTransactionHandler};
Contributor:

I think either clippy or fmt will want you to split these out. Doesn't hurt to run cargo xtask fmt --fix on your code just to avoid folks noticing this kind of stuff.

MapFailed,
UnmapFailed,
PinFailed,
BounceBufferFailed,
Contributor:

Please use source attributes so that we don't lose error origination. E.g.

#[derive(Error, Debug)]
pub enum DmaError {
...
PinFailed(#[source] ... error type)
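For reference, a complete example of that pattern with thiserror (the inner error type is a placeholder; the real one depends on how pinning is implemented):

use thiserror::Error;

#[derive(Debug, Error)]
pub enum DmaError {
    #[error("failed to pin memory")]
    PinFailed(#[source] std::io::Error), // placeholder inner error type
    #[error("failed to allocate a bounce buffer")]
    BounceBufferFailed,
}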

@mattkur (Contributor) commented Jan 10, 2025

Please update this PR description with a high-level overview of what's going on (including any design choices you are making, and other options considered but not implemented, and why).

As we get past the draft stage, this code will also need tests. But I appreciate having the design dialog via code - thanks!

@mattkur (Contributor) commented Jan 10, 2025

High level question: this machinery needs to work across a save & restore (e.g. an nvme device can have outstanding IO across an openhcl servicing operation). Have you yet considered how this would plug in to that?

@bhargavshah1988 (Contributor, Author):

High level question: this machinery needs to work across a save & restore (e.g. an nvme device can have outstanding IO across an openhcl servicing operation). Have you yet considered how this would plug in to that?

@mattkur The DMA manager will save itself. However, in-flight transactions and clients need to be saved and restored by the consumer (NVMe/MANA).
Do you agree?

@mattkur (Contributor) commented Jan 13, 2025

High level question: this machinery needs to work across a save & restore (e.g. an nvme device can have outstanding IO across an openhcl servicing operation). Have you yet considered how this would plug in to that?

@mattkur The DMA manager will save itself. However, in-flight transactions and clients need to be saved and restored by the consumer (NVMe/MANA). Do you agree?

Sure. We will need some mechanism to:

  • save the DmaTransaction objects themselves (e.g. they need to have a stable save state defined), and/or
  • reconstruct the state. E.g. let's say there are dma buffers that need to be saved/restored ... what is the API by which the devices hook up the save state so that the right thing happens when IOs complete
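As a rough illustration of the first bullet, a hypothetical saved-state shape for a transaction (field names mirror the draft struct; the actual save/restore plumbing is not shown):

/// Hypothetical saved state for an in-flight DMA transaction (sketch only).
pub struct SavedDmaTransaction {
    pub original_addr: u64,
    pub dma_addr: u64,
    pub size: u64,
    pub backing: SavedMemoryBacking,
}

pub enum SavedMemoryBacking {
    Pinned,
    /// Offset into the persistent bounce buffer region.
    BounceBuffer { offset: u64 },
}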

@chris-oo reopened this Jan 13, 2025
// Trait for the DMA interface
pub trait DmaInterface {
fn map_dma_ranges(&self, ranges: &[MemoryRange], options: Option<&DmaMapOptions>,) -> Result<DmaTransactionHandler, DmaError>;
fn unmap_dma_ranges(&self, dma_transactions: &[DmaTransaction]) -> Result<(), DmaError>;
Member:

Not returning an opaque handle but asking the caller to provide some pub struct fields seems a bit odd to me, but we can always iterate on this API later since all users will be in-tree.

At least, it seems like map should perhaps return something opaque from which you can also get the associated info that was mapped. I don't quite remember - we'd expect us to unmap the whole map call, right? The way this is specified, a caller could just unmap a portion of the map call; is that what we want?
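One way to make that contract explicit, as a sketch of the alternative being suggested here: have map return an owned handle and have unmap consume it, so a caller can only unmap exactly what a single map call produced (the by-value unmap signature is the illustrative part; the names follow the draft):

pub trait DmaInterface {
    fn map_dma_ranges(
        &self,
        ranges: &[MemoryRange],
        options: Option<&DmaMapOptions>,
    ) -> Result<DmaTransactionHandler, DmaError>;

    // Consuming the handle makes "unmap exactly what you mapped" the only
    // possible usage.
    fn unmap_dma_ranges(&self, handler: DmaTransactionHandler) -> Result<(), DmaError>;
}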

Contributor (Author):

we'd expect us to unmap the whole map call right?

Yes.

The way this is specified, a caller could just unmap a portion of the map call, is that what we want?

The intent is that unmap can process multiple mapped transactions in one go.

@bhargavshah1988 (Contributor, Author):

High level question: this machinery needs to work across a save & restore (e.g. an nvme device can have outstanding IO across an openhcl servicing operation). Have you yet considered how this would plug in to that?

@mattkur The DMA manager will save itself. However, in-flight transactions and clients need to be saved and restored by the consumer (NVMe/MANA). Do you agree?

Sure. We will need some mechanism to:

  • save the DmaTransaction objects themselves (e.g. they need to have a stable save state defined), and/or
  • reconstruct the state. E.g. let's say there are dma buffers that need to be saved/restored ... what is the API by which the devices hook up the save state so that the right thing happens when IOs complete

The bounce buffer will be created from persistent memory that survives UH servicing. The command and completion queues will also be allocated from the persistent memory.
The drivers/DMA manager will save and restore pointers to these queues and buffers.

On restore, the NVMe driver will restore the transactions/completions the way it does today.

DmaTransaction needs to be saved by NVMe/MANA so that on the restore side it can reconnect and replenish the transactions.
