Clarify usage of initial values in selection intervals with dates #3643

Open
1 of 3 tasks
dsmedia opened this issue Oct 19, 2024 · 4 comments

Comments

@dsmedia
Contributor

dsmedia commented Oct 19, 2024

What is your suggestion?

When setting initial values for a selection interval based on date values, users must manually convert dates to timestamps, as discussed here. This requirement does not appear to be clearly documented or exemplified in this repository. Solutions could be one or more of the following:

Potential Solutions

  • Example Gallery: Update this example to include an initial value for the interval (see below)
  • API Reference: Briefly reference in the description of the value parameter in the API Reference here
  • Docs: Incorporate into the docs on selections

Questions:

  1. Which, if any, of these suggestions should be implemented?
  2. The linked stackoverflow discussion above highlights an important distinction between datetime.datetime and numpy.datetime64 in this context. Should this point be referenced in any documentation, and if so, how?

Sample diff for current example gallery item

import altair as alt
+import pandas as pd
from vega_datasets import data

source = data.sp500.url

+# Define an initial date range as timestamps (milliseconds since the Unix epoch)
+x_init = pd.to_datetime(['2005-01-01', '2009-01-01']).astype(int) / 1E6

+# Create a brush (interval) selection with initial range
+brush = alt.selection_interval(
+    encodings=['x'],
+    value={'x': list(x_init)}  # Initialize selection
+)

-brush = alt.selection_interval(encodings=['x'])

base = alt.Chart(source, width=600, height=200).mark_area().encode(
    x = 'date:T',
    y = 'price:Q'
)

upper = base.encode(
    alt.X('date:T').scale(domain=brush)
)

lower = base.properties(
    height=60
).add_params(brush)

upper & lower

Have you considered any alternative solutions?

I've proposed three possible solutions above.

@dangotbanned
Member

dangotbanned commented Oct 20, 2024

Thanks for raising this @dsmedia

When setting initial values for a selection interval based on date values, users must manually convert dates to timestamps, as discussed here. This requirement does not appear to be clearly documented or exemplified in this repository.

I agree with your assessment that this lacks documentation; I certainly didn't know this was possible!

However, I must say I really dislike the solution provided on stackoverflow
(this isn't a critique directed towards you @dsmedia).

Given that was written over 4 years ago, I'd be more interested in either:

  • Fixing the underlying limitation that prevents using date/datetime/timestamp ranges
  • Trying to find a solution that isn't pandas-only
    • Some links that could be helpful: 1, 2, 3
  • At the very least, solving w/ pandas in an easy to understand way
    • I'm not a pandas user, but writing pd.DatetimeIndex(...).astype(int) / 1e6 doesn't seem intuitive
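For readers puzzled by the division by 1e6: the sketch below reproduces the arithmetic with the standard library only. It assumes the pandas index stores nanoseconds since the Unix epoch (the long-standing default resolution; other resolutions would need a different divisor), so converting to an integer and dividing by one million yields milliseconds. The names `NS_PER_MS` and `ns_to_ms` are illustrative and not part of pandas or Altair.

```python
from datetime import datetime, timezone

NS_PER_MS = 1_000_000  # nanoseconds in one millisecond


def ns_to_ms(ns: int) -> float:
    """Convert nanoseconds since the epoch to milliseconds since the epoch."""
    return ns / NS_PER_MS


# Build the nanosecond value by hand for 2005-01-01 00:00 UTC.
epoch_s = int(datetime(2005, 1, 1, tzinfo=timezone.utc).timestamp())
epoch_ns = epoch_s * 1_000_000_000

start_ms = ns_to_ms(epoch_ns)
print(start_ms)  # 1104537600000.0
```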

Confirmed solutions

These are all the solutions I can confirm work so far:

Code block

from datetime import date, datetime

import altair as alt
import pandas as pd
import polars as pl
from vega_datasets import data

source = data.sp500.url

# Define an initial date range as timestamps
window_pd = pd.to_datetime(["2005-01-01", "2009-01-01"]).astype(int) / 1e6
window_pl = pl.Series([date(2005, 1, 1), date(2009, 1, 1)]).dt.timestamp("ms")
window_stdlib = (
    datetime(2005, 1, 1).timestamp() * 1e3,
    datetime(2009, 1, 1).timestamp() * 1e3,
)
window_alt = alt.DateTime(year=2005), alt.DateTime(year=2009)

# Create a brush (interval) selection with initial range.
# Each assignment below is an alternative; only the last one is used by the chart.
brush = alt.selection_interval(encodings=["x"], value={"x": window_pd})
brush = alt.selection_interval(encodings=["x"], value={"x": window_pl})
brush = alt.selection_interval(encodings=["x"], value={"x": window_stdlib})
brush = alt.selection_interval(encodings=["x"], value={"x": window_alt})

base = (
    alt.Chart(source, width=600, height=200).mark_area().encode(x="date:T", y="price:Q")
)
upper = base.encode(alt.X("date:T").scale(domain=brush))
lower = base.properties(height=60).add_params(brush)
chart = upper & lower
chart
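One caveat about the `window_stdlib` variant, sketched below under the assumption that a machine-independent value is wanted: Python interprets a naive `datetime` in the local timezone when `.timestamp()` is called, so the resulting milliseconds can differ between machines. Attaching `timezone.utc` makes the value deterministic. Whether UTC is the right reference for a given chart is a separate question that this sketch does not settle.

```python
from datetime import datetime, timezone

# Naive datetime: .timestamp() interprets it in the machine's local timezone.
naive = datetime(2005, 1, 1)
# Aware datetime: the same wall-clock time pinned to UTC.
aware = datetime(2005, 1, 1, tzinfo=timezone.utc)

naive_ms = naive.timestamp() * 1e3
aware_ms = aware.timestamp() * 1e3

# Difference in hours between the two interpretations (0 on a UTC machine).
offset_hours = (naive_ms - aware_ms) / 3.6e6
print(aware_ms, offset_hours)
```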

Since the alt.DateTime version is both the shortest and doesn't introduce a dependency, I would lean towards it personally:

window_alt = alt.DateTime(year=2005), alt.DateTime(year=2009)

  • Example Gallery: Update this example to include an initial value for the interval (see below)

I think it would make this functionality more discoverable if it were a separate example, with a more specific title like:

Interval Selection (fixed-width sliding window)

Or include some other time-series terminology, if that can provide a more accurate description?

Would be similar to this pair of examples:


  • API Reference: Briefly reference in the description of the value parameter in the API Reference here

AFAIK, the description there is copy/pasted from the generated docs - which traces back via these two:


  • Docs: Incorporate into the docs on selections

Probably a good idea, also maybe a mention in times_and_dates


  2. The linked stackoverflow discussion above highlights an important distinction between datetime.datetime and numpy.datetime64 in this context. Should this point be referenced in any documentation, and if so, how?

I want to say no, simply because we can't do this for every datetime type across all libraries.
Obviously numpy's popularity might be worth considering, but I think it would be more valuable to describe what properties vega-lite will be expecting.

It seems to me that it wants a POSIX timestamp with [ms] resolution.
Knowing that, a user can then search for how library X can convert to that representation.
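As an illustration of "convert to a POSIX timestamp in milliseconds", here is a minimal standard-library sketch. The helper name `to_epoch_ms` is hypothetical (it is not an Altair or Vega-Lite function), and treating naive values as UTC is an assumption made for reproducibility.

```python
from datetime import date, datetime, timezone


def to_epoch_ms(value):
    """Return milliseconds since the Unix epoch for an ISO string, date, or datetime.

    Naive values are treated as UTC (an assumption of this sketch).
    """
    if isinstance(value, str):
        value = datetime.fromisoformat(value)
    elif isinstance(value, date) and not isinstance(value, datetime):
        # Plain dates become midnight on that day.
        value = datetime(value.year, value.month, value.day)
    if value.tzinfo is None:
        value = value.replace(tzinfo=timezone.utc)
    return int(value.timestamp() * 1000)


window = [to_epoch_ms("2005-01-01"), to_epoch_ms(date(2009, 1, 1))]
print(window)  # [1104537600000, 1230768000000]
```

A list like `window` should be usable in the same place as `window_stdlib` above, i.e. `value={"x": window}`.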

@dangotbanned
Member

@dsmedia happy for you to move forward on this now that #3653 and #3662 have been merged

@dangotbanned
Member

@dsmedia how do you feel about this issue, following #3667?

You mentioned 3 potential solutions, I think some narrative docs might work well here:

cc @joelostblom as we recently updated interactions in #3544 and you have two other PRs open (#3629, #3628) that would add more content

@dsmedia
Contributor Author

dsmedia commented Jan 22, 2025

Sure. There was a section on this in the now-obsolete V4 Vega-Lite docs (https://vega.github.io/vega-lite-v4/docs/init.html#:~:text=To%20initialize%20a%20selection%2C%20set,initial%20value%20of%20the%20selection.). I don't see it spelled out as clearly in the current Vega-Lite docs, now that init is no longer a valid property.

My instinct would normally be to apply what's in the vega-lite docs to the Altair documentation. Do we feel this also would benefit from better documentation upstream?
