-
-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add advanced stem loading COs #13268
feat: add advanced stem loading COs #13268
Conversation
30fd666
to
2f2a22a
Compare
b4db8c8
to
fdd8810
Compare
a36c6a0
to
31becb9
Compare
31becb9
to
fab6880
Compare
Please note that UI changes (1st and 3rd commits) will be moved to #13515 once this PR is ready to be merged! |
2fc47ba
to
868a67e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some doubts if this is consistent with the other PRs and the overall user experience.
The users sees only the track in the library which has four stems. The use can now load a single stem form a file or load the whole stem file and mute all 3 of them.
In the first case they cannot add a second stem in the second case they suffer the penalty of stem processing. They need a brief understanding of the engine interna to make the right decisions during mixing which could be limiting.
Can we join both? How can we allow to seamlessly switch between these cases.
A typical use case is probably:
- Play/Cue a full track
- Notice unusable part
- Switch to stem mode an silence that part
- Instant double the track in a second deck
- Bring the silenced part in using an effect
How can we allow this using the new single stem mode introduced here behind the scenes?
The first scenario is the only way on non-primary decks (sampler and preview), while second is the default for primary deck. realistically, I only added the second option for primary deck for feature parity and enabled that advanced capability, or workaround for light platform which would struggle with more than one stem deck + rubberband (eg. raspberrypi)
By default, the By the sounds of it, the use case you are describing (clone switching between individually loaded stem vs full deck with muted tracks) are falling under the the |
Ah, OK I forget about the sampler use case. Than let's think about a way to gerearize this topic. Since it is handled by the same engine class anyway. How is the transition from single stereo to stem now done? Is there way to keep the transport going during the switch? A switch from the full track to a stem mix is probably the same as a switch from a stem mix to one of the other stems. Do we need to consider something like "load as karaoke" where we process three of the stems? I am not demanding to implement it here. I just like to brainstorm possible future solution to have the Co interface proved for that. Idea: can we use the mute controls for selecting the loaded stems? |
Currently there is no way transition "live" for a stem deck to a stereo deck with a track. With that current feature, can chose the load a track in an empty deck. If we think of a case where one would want to switch from a stem deck to a stereo deck, they would need to use an empty deck and handle the transport synch.
It would be fairly trivial to implement to some kind of
I think it would be great to even go further an allow loading a set of stem pre-mixed to stereo (eg. Note that in case not all the above feature wouldn't be ready for 2.6 and we would like to wait for completeness before releasing, the
This is already how we were planning to make the |
Works as expected now! Thank you! |
Friendly ping @daschuer, in case you've missed the notification :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have left some comments. Unfortunately a conflict has developed.
src/engine/engine.h
Outdated
// not. | ||
constexpr int kMaxSupportedStems = 4; | ||
#ifdef __STEM__ | ||
constexpr uint kNoStemSelected = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it is a part of a 1 based stem index. But It is a value of the stem mask, right? I think it would be the best to introduce a type for stem mask, may an enum or even a complex type with getteer an setter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added an enum + QFlag - this certianly makes the code cleaner!
src/library/librarycontrol.cpp
Outdated
#ifdef __STEM__ | ||
m_loadSelectedTrackStems = | ||
std::make_unique<ControlPushButton>(ConfigKey(group, "load_selected_track_stems")); | ||
connect(m_loadSelectedTrackStems.get(), | ||
&ControlObject::valueChanged, | ||
this, | ||
[this](double value) { | ||
if (value >= 0 && value <= 2 << mixxx::kMaxSupportedStems) { | ||
emit loadToGroup(m_group, static_cast<uint>(value), false); | ||
} | ||
}); | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still in doubt that this will ever get used. Let's remove this until we have a use case for this along with a mapping that actually makes use of it. We may set up a n experimental branch, that includes this CO and the other ideas we have discussed here, that we have a test build ready and not block mappers on this. What do you think?
#ifdef __STEM__ | |
m_loadSelectedTrackStems = | |
std::make_unique<ControlPushButton>(ConfigKey(group, "load_selected_track_stems")); | |
connect(m_loadSelectedTrackStems.get(), | |
&ControlObject::valueChanged, | |
this, | |
[this](double value) { | |
if (value >= 0 && value <= 2 << mixxx::kMaxSupportedStems) { | |
emit loadToGroup(m_group, static_cast<uint>(value), false); | |
} | |
}); | |
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is something I'm actually planning to use in the S4 mapping. Due to performance reasons, loading the whole stem track to just play a track accapella or instrumental is very expansive.
Often in my set, I would want to play a track instrumental, with the vocals of another track, this means having to load 8 stereo track when 2 would have been enough. This makes is particularly unusable as I will likely use a third, sometime a fourth deck.
As we are adding that feature on the UI, I think it is wrong to not allow that to be mapped - I certainly would find it a bummer if I have to pick my mouse and keyboard every time I want to go over that usecase.
What would be the downside to get this control in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes 2 of us using stems like that (using a cappella like that)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can very much acknowledge the use case. I am just In doubt that exactly this implementation is particularly useful.
But I might be wrong. So how about postponing this question to the mapping PR? Or do you want to outline some details here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This CO is the essence of this PR - and the use case is obviously there now!
I am therefore against removing this CO from this PR!
Since all the arguments have already been exchanged, I think it's time to start a vote of the Mixxx Core developers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added documentation of the CO in the description to help summarising the proposed behaviour
Third = 0x4, | ||
Fourth = 0x8, | ||
|
||
None = 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does None mean the stereo track?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like. At least it is zero if we have no stem file, right. Did you consider to implement #13700 right away.
Is there a use case for "None". Probably in future so we may want to shift Everything by 1 to free up 0x01 for stereo and model the Stem file layout?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Loading the premastered stereo track is clearly out of scope of this PR!
But it's possible to assign any unused value for it. The current implementation of 2^stemNo ensures simple understandable and unique stem masking accross the entire code base.
Therefore I think this is the best possible code design here!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does None mean the stereo track?
None
means no stem track selected, so the deck will be loaded as stem deck, except for samplers, as they cannot load stems. All
means load the stereo track
Did you consider to implement #13700 right away.
Please see this comment and this one. TL; DR: loading the pre-mastered stereo is not possible yet.
Probably in future so we may want to shift Everything by 1 to free up 0x01 for stereo and model the Stem file layout?
I don't think we need to - AFAIK, there is 16 usecases, all covered by the flag:
- 0: load as stem deck
- 1..14: load as stereo with only the selected stems
- 15: load as stereo (currently mixed manually to address DSP limitation, using the pre-mastered track when we have the DSP)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If "None" load as "stem deck" for a stem file and the stereo track in case of a normal file, we need to rename it to "Default" or such.
Loading the stereo track IS a valid use case.
Will it be used when all stems are selected or will we than just select all stems?
This question is uncritical if we remove the exposal to the co interface from this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If "None" load as "stem deck" for a stem file and the stereo track in case of a normal file, we need to rename it to "Default" or such.
Sounds fair, I will change that
Loading the stereo track IS a valid use case.
it is
Will it be used when all stems are selected or will we than just select all stems?
Currently, option 2 because no DSP support. In the future Option 1.
@@ -323,6 +323,12 @@ SoundSource::OpenResult SoundSourceSTEM::tryOpen( | |||
|
|||
AVStream* firstAudioStream = nullptr; | |||
int stemCount = 0; | |||
uint selectedStemMask = params.stemMask(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uint selectedStemMask = params.stemMask(); | |
mixxx::StemChannelSelection selectedStemMask = params.stemMask(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am unsure though. Maybe keep uint for now. At one point we need integer.
I think changing the meaning behind the scenes in future is not optimal, because both versions will sound different. In terms of the CO interface, we should be able to express both. We may do any of these in the GUI or the mapping. We may also add a "premastered" enum that is temporary implemented as the sum of all stems. |
No they shouldn't sound different, except if our DSP support is poor or incomplete, in which case it will mean we are not ready to support pre-mastered track anyway. If we really feel strongly about this, then we should add a new CO in the future (e.g |
But they can. This is out of our control, we play what found in the file.
Since we have not yet a DSP stage we are already at that point when playing all stems together. I consider the "pre-matered" track as the "truth" and the stems more or less good reproductions. But that depends on wether the stem file was created during mastering or later using AI.
I do, but not yet about the final CO interface. Here I am only concerned about the internal interface, that we may expose one by one as proposed here in a CO. My idea is to have an universal 'TrackLoad(track, mode)` command that is universal usable. I like by the way the idea to combine track and mode into a single structure very much. I have not yet proped it because I actually want to have this merged soon as a minimum step, but this would allow to add getter and setter to implement a clean API on top of an obscure encoding of the Stem mask internals. |
How about introducing a stem independent loading mode?
This would get rid of the -1 here: mixxx/src/sources/soundsourcestem.cpp Line 375 in f439645
And opens access ato any number of streams in a certain container format
In case of instant doubles from a StemDeck, Vocals is muted and the cloned deck is loaded as In case of a vocal sample you may load the sampler just with An alternative instant double feature from a StemDeck may use I have no demand for details of my proposals, I have just considered possible modes we need when loading a track to a player and how express them as a single double precision number (Limited to 32 bit in my propsal) . What do you think? For this PR we even continue with the currenty implemnetd proposal as long as it is not exposed as CO. |
But the spec says it should. Now I agree that it will never be byte perfect, but it sounds wrong to add CO behaviour to factor that in. Surely, we wouldn't want want mapping developer to have some way to decide if the STEM file is actually spec compliant or not? Looking at my controller, it would also feels weird to add that "load my track in stereo but don't use the pre-mastered stream" logic in here. It sounds like, when we support DSP thus pre-mastered track load, we could add a setting to allow to still use manually mixed stereo track instead of the pre-mastered, this way if someone using Steam Creator in a wrong way or another tool that doesn't quite match the spec, they can still find the "No DSP mitigating behaviour"
That's true - the replaygain tends to do a fairly good job on mitigating that issue tho, at least that was the case when testing with the NI Stem pack (non AI stem)
I think that makes sense, but I'm not sure why it would be invalidating the proposed CO?
It also includes a few edge case to handle (e.g I would just add with the existing CO, the terminology |
I agree that we need certein limitations on the mapping and GUI to make it reasonable. My point is that the load mode is used deep in the software stack where we have STEM files and other multi stream file formats. The API in that state should be cover all and not be limited by certain recommendedn but not asserted assumptions. Mixing form the All-Stem mode to the Premastered mode is a solution for the lossy format and CPU issue. So an underlying API should be able to deal with to be open for inuitive solution, which my proposal is not. Just a technical detail that allows to transport all info we may think of.
I have designed them to acheve the oposite, so maybe I was not clear enough. I think we only need to change <20 lines of code to partially support it.
The idea with passing and adjusting the |
Just noticed we have alread a mixxx/src/engine/channels/enginedeck.cpp Lines 205 to 213 in e779238
|
Sorry. but I am still not convinced. I think not loading the pre-mastered if From my perspective, this is an overly complicated approach, which is planing on exposing technical details for a future that we haven't yet confirmation we will reach (supporting DSL requires a limiter and compressor that support all the spec settings, and from what I can see, this is a significant amount of work) Appreciate we have now come in full circle, so I'm going to close this. I have open a PR on my fork for visibility. |
Maybe the discussion has shift the focus away form the essence: I consider this PR well done and ready to merge if you remove the "load_selected_track_stems" CO. This gives the freedom to adjust is exactly to your mapping, without a breaking API change. I am sorry if my posts read too demanding, this was not intended. I started this basically to collect all ideas and find the best solution for both of us. I am curious to se your mapping proposal. |
Is there a way to get a compromise? You both want the best and you both have good arguments. We can document that developers are in dubio about this part (and there is nothing wrong with it) and that these CO's are 'experimental'. |
Thank you. An extra |
5a19be4
to
f439645
Compare
The CI is failing due to the ongoing brownouts for MacOS 12 runners. I will kick the job again on Wednesday, once the brownout is over. @daschuer could you please discard your change request? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you.
Currently, this goes with a mix of extending the existing load signal/slot and creating a second one, called subsequently. Is there any preference? Should I change the implementation to use only one? (only extending existing signal or only subsequently emit)
Here is documentation, as per in the EPIC (#13116)
Pre-mixed stereo load
[ChannelX],load_selected_track_stems
can be used as followed:~1
(0b0001
) means only the first stem is selected for stereo pre-mixing.13
(0b1101
) means all stem but the second are selected for stereo pre-mixing.15
(0b1111
) means the entire track as stereo. Since it is impossible to load that track without invalidating the replaygain due to the missing DSP support, all the four stem are currently mixed upstream. In the future, the pre-mastered track will be used.Note that using
0
is not possible as this is the "idle state" of the CO.0
also meansdefault
(load as stem deck on primary decks, or load the pre-mastered track on sampler/preview), which is equivalent to[ChannelX],LoadSelectedTrack
Individual bit mask will be reflected in the
stem_Y_selected
CO.