[ENH] inspectable set-valued domains for distributions #244
Comments
FYI @VascoSch92 |
I will start working on a first version of a module for symbolic representation of sets. The idea is to extend from I still don't get 100% what the application would be in I will open a draft PR as soon as I have something interesting. That way we can discuss the code. |
Great! So you think it's better to inherit from If I may ask, what are your pros/cons and weighting? Just curious. |
Actually, I was playing a little bit with the set implementation of One could use that and extend it to implement the measure of a set and integral computations. However, adding Perhaps clarifying the exact API the project needs could lead to a decision. From what I understand, the main purpose of this module is to compute |
Basically, yes - that's the key requirement. Yes, the "weighty dependency" argument is convincing. I'd agree it outweighs the "do not reinvent the wheel" one, as it's going to be a small wheel (for now). |
Why don't we try to solve the problem directly using the integration provided by scipy? |
I don't think that works for mixed distributions, i.e., a mix of deltas and (absolutely) continuous parts. I'd still think you need some explicit representation of the discrete part. |
Should we try with |
yes but we cannot separate the two parts?
Yes |
What do you mean by that? Do you mean this as a suggestion, or a statement? |
Let's say you have a mixed distribution. Can we separate the two parts (dense and discrete), compute integrals on both and then put them together? |
yes, that is exactly my thinking. But for that, you'd need to represent pmf and pdf separately. For both, you'd need some representation of domain to set up the integration, which brings us to the topic of this issue. |
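The split described above can be sketched as follows. This is only an illustration under stated assumptions: the class `MixedDistribution`, its `prob` method, and the helper `simpson` are hypothetical names, not part of skpro or scipy, and the toy distribution (a point mass of 0.25 at 0 plus a density 1.5·x on [0, 1]) is made up for the example. The point is that each part carries its own domain: a finite set of atoms for the pmf and an interval for the pdf.

```python
def simpson(f, a, b, n=200):
    """Composite Simpson rule for the integral of f over [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


class MixedDistribution:
    """Hypothetical container: a pmf part (atoms) and a pdf part (density on an interval)."""

    def __init__(self, atoms, density, lower, upper):
        self.atoms = dict(atoms)   # discrete domain: point -> probability mass
        self.density = density     # sub-density of the continuous part
        self.lower = lower         # continuous domain is [lower, upper]
        self.upper = upper

    def prob(self, a, b):
        """P(a <= Y <= b): sum the atoms in [a, b], integrate the density, add both."""
        discrete = sum(mass for x, mass in self.atoms.items() if a <= x <= b)
        lo, hi = max(a, self.lower), min(b, self.upper)
        continuous = simpson(self.density, lo, hi) if lo < hi else 0.0
        return discrete + continuous


# Toy example (assumed): mass 0.25 at 0, remaining 0.75 spread with density 1.5 * x on [0, 1].
mixed = MixedDistribution({0.0: 0.25}, lambda x: 1.5 * x, 0.0, 1.0)
total_mass = mixed.prob(-1.0, 2.0)
half_mass = mixed.prob(0.0, 0.5)
```

Without an explicit domain for each part, `prob` would not know where to look for atoms or over which interval to integrate, which is the link back to the topic of this issue.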
Can you give a concrete example of a mixed distribution you would like to implement? |
Sure, here are two:
|
Sorry, but I still have trouble understanding the clipped normal example. We have the max of two continuous functions, so shouldn't we have a continuous support for the pdf and cdf? |
Yes, the full support is continuous, but the distribution is mixed, so by the Lebesgue decomposition theorem we can decompose it into a non-trivial absolutely continuous part and a pure point part; there's no singular part because those aren't distributions that we want to look at (😁). The clipped normal has two supports, therefore:
Some confusion may come from the word "continuous", which is overloaded, as it could be used as a property or qualifier for
|
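As a worked illustration of the decomposition, assume the clipped normal is $Y = \max(X, 0)$ with $X \sim \mathcal{N}(\mu, \sigma^2)$. This parametrization is an assumption made for the example, and the thread's own list is not reproduced here. With $\Phi$ and $\varphi$ the standard normal cdf and pdf, the law of $Y$ would split as:

```latex
\begin{align*}
P_Y &= \underbrace{\Phi\!\left(-\tfrac{\mu}{\sigma}\right)\delta_0}_{\text{pure point part, support } \{0\}}
     \;+\; \underbrace{f_{\mathrm{ac}}(y)\,\mathrm{d}y}_{\text{absolutely continuous part, support } [0,\infty)}, \\
f_{\mathrm{ac}}(y) &= \frac{1}{\sigma}\,\varphi\!\left(\frac{y-\mu}{\sigma}\right)\mathbf{1}_{(0,\infty)}(y), \\
F_Y(y) &= \begin{cases} 0, & y < 0, \\ \Phi\!\left(\dfrac{y-\mu}{\sigma}\right), & y \ge 0. \end{cases}
\end{align*}
```

The cdf jumps by $\Phi(-\mu/\sigma)$ at $0$, which is exactly the mass of the point part, and the two parts together have total mass $\Phi(-\mu/\sigma) + \bigl(1 - \Phi(-\mu/\sigma)\bigr) = 1$.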
Ah, OK, now it is much clearer. Sorry, I'm not an expert in probability theory :-( In practice, you want to compose the normal distribution and the mass measure at 0 to get the clipped normal. In the instance of the normal distribution you give the support, and the same for the mass measure, right? From that you can compute what you need. OK 👍 now it is clear... I can start coding something and see if it fits the needs of the package |
Yes, exactly.
What's your design, if I may ask? I'd go with modifying |
Basic idea: parent class Set which extends from Then we have the following questions:
We can expose in the I will try to open a draft/sketch PR for feedback and guidance as soon as possible :-) |
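To make the "parent class Set" idea concrete, here is a rough sketch. Everything in it is an assumption for discussion: the thread does not show which base class `Set` should extend, so a plain Python abstract base class is used instead, and the names `Interval`, `FiniteSet`, `contains` and `measure` are hypothetical rather than an existing API.

```python
import math
from abc import ABC, abstractmethod


class Set(ABC):
    """Hypothetical parent class for symbolic sets (the real base class is not assumed here)."""

    @abstractmethod
    def contains(self, x):
        """Return True if x is an element of the set."""

    @abstractmethod
    def measure(self):
        """Return the size of the set under its natural measure."""


class Interval(Set):
    """Closed interval [lower, upper]; bounds may be -inf / +inf."""

    def __init__(self, lower, upper):
        if lower > upper:
            raise ValueError("lower must not exceed upper")
        self.lower = lower
        self.upper = upper

    def contains(self, x):
        return self.lower <= x <= self.upper

    def measure(self):
        # Lebesgue measure (length); infinite for unbounded intervals.
        return self.upper - self.lower


class FiniteSet(Set):
    """Finite collection of points, e.g. the atoms of a discrete part."""

    def __init__(self, points):
        self.points = frozenset(points)

    def contains(self, x):
        return x in self.points

    def measure(self):
        # Counting measure (number of points).
        return len(self.points)


continuous_domain = Interval(0.0, math.inf)
discrete_domain = FiniteSet([0.0])
```

A continuous part would then hold an `Interval` and a discrete part a `FiniteSet`, which keeps the two domains separate and inspectable.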
Agreed - we may have to distinguish domains for the discrete and the continuous part as well.
Regarding requirements:
|
After a very first draft for the module We will work on the branch #326 until a stable API is found. After that, we will merge into To find a valid and stable API, domains should be introduced for at least one of the following distributions:
With domains, we also want to introduce 2 new methods:
Questions:
|
The API is already specified - it has been introduced since 2.2.2, after you branched off. If you update from
That's a good question. I was thinking about working with properties and attributes primarily, but now that you mention it, we may consider tags as well. I have no clear answer to this yet, input appreciated.
Will reply with a list in the next post. |
don't have that
no "atomic" distribution of this type currently, but you can construct one using |
Another question is whether we are interested in the domain of a distribution or in its support. I think the second one is more interesting, right? |
Yes, for the moment it is, given that all distributions - even discrete ones - have a support that embeds canonically into the reals. With a distinction on continuous and discrete part. |
It would be useful if distributions had some degree of inspectability with respect to their domains (sets). Use cases include:
- pdf and pmf in distributions, discrete, continuous, and mixed #229
Some discussion has already taken place here: VascoSch92/sequentium#46, also regarding possible ways to implement this.
Options discussed:
- scipy Sets, possibly also stats
- BaseObject
- sklearn.utils parameter checking
Some issues from the skpro architecture which may not be obvious how to cover:
- BaseObject supports parametric objects? Composites are supported by all three options above.