Heat maps #87

Closed
eaburns opened this issue Mar 15, 2015 · 14 comments

Comments

@eaburns
Member

eaburns commented Mar 15, 2015

Original issue 60 created by eaburns on 2012-08-15T13:38:34.000Z:

The key for the heat map can be a 'glyph' that is drawn past the maximum x value.

@eaburns
Member Author

eaburns commented Mar 15, 2015

Comment #1 originally posted by eaburns on 2012-08-24T20:49:36.000Z:

<empty>

@eaburns
Member Author

eaburns commented Mar 15, 2015

Comment #2 originally posted by eaburns on 2012-09-06T15:33:01.000Z:

<empty>

@eaburns
Member Author

eaburns commented Mar 15, 2015

Comment #3 originally posted by eaburns on 2013-02-03T14:00:18.000Z:

Peter A. Cejchan sent me this palette that should work well for heat maps.

@eaburns
Member Author

eaburns commented Mar 15, 2015

Comment #4 originally posted by eaburns on 2013-02-03T17:03:19.000Z:

Chris made this patch a while ago, but he said that it needs a lot of work.

@eaburns
Member Author

eaburns commented Mar 15, 2015

Comment #5 originally posted by eaburns on 2013-02-20T19:24:42.000Z:

<empty>

@eaburns
Member Author

eaburns commented Mar 15, 2015

Comment #6 originally posted by eaburns on 2013-04-24T11:58:32.000Z:

Here's the latest code from Peter A. Cejchan. Probably needs some tweaking. IIRC, it assumes that the data is pre-binned. We will want two Make functions for it: 1) pre-binned data and 2) non-binned data.

@ctessum
Contributor

ctessum commented Mar 18, 2015

I have an alternative implementation of this for continuous data: https://github.com/ctessum/carto/blob/master/colormap.go

It may (or may not) be worthwhile to combine them.

@kortschak
Member

kortschak commented Mar 18, 2015 via email

@ctessum
Contributor

ctessum commented Mar 19, 2015

My understanding is that the color palettes currently in plot are all for binned or categorical data, while mine is for continuous data. Mine will also draw a legend for the color map; I didn't look at the other one too closely but I didn't see that ability. So it looks like they're mainly complementary rather than competing.

@ctessum
Contributor

ctessum commented Mar 19, 2015

@kortschak
Member

It is not true that the current heat map implementation is for binned data; the qualitative nature of the palette and the actions taken by the HeatMap plotter are orthogonal.

The issue here then is whether the palettes are solely for binned data - again, this is not true (my use of heat maps - not created by this particular plotter, but by one that this takes its code from - is entirely continuous, being gene expression level data).

In the palette/brewer package, there are three classes of palettes, only one of which is for categorical data. While the others are for continuous data, they are binned into visually differentiable colours.

In the palette package itself, three palette creation functions are provided, all are intended for use with continuous data and allow the client to specify the colour resolution.

There are colour theory approaches to generating arbitrary colour resolution palettes arguably as effective as the Brewer palettes, but I didn't see the complexity of the code being worth the payoff. I can dig up some papers if you are interested. (Alternatively, it is trivial to take an arbitrary palette and interpolate colours between the palette-defined colours to achieve arbitrary colour resolution, but in practice arbitrary resolution is not wildly helpful. I'd recommend you read some of Cynthia Brewer's commentary on palette creation; I considered this approach when I was writing the palette package and decided against it.)

Whichever palette creation you use, we necessarily bin continuous data to some degree (we work on finite state machines, but fortunately have limited colour differentiation ability, particularly when figures are small and distances between points on a map are not great - I'd argue that in your figure above, more than 256 colours would be overkill, but we could do more without any problem). The HeatMap plotter gets around this by automatically binning continuous data to the level of colour resolution provided by the given palette.
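As a rough sketch of what binning to the palette's colour resolution means: a value in a known range is mapped to one of n colour indices, where n is the number of colours in the palette. The function name `binIndex` and its clamping behaviour are assumptions made for this example; this is not the HeatMap plotter's actual code.

```go
package main

import (
	"fmt"
	"math"
)

// binIndex maps v from the range [min, max] onto one of n equal-width
// colour bins and returns the bin index in [0, n-1]. Values outside the
// range are clamped to the first or last bin. It returns -1 when no
// sensible bin exists (n <= 0, an empty range, or NaN input).
// Hypothetical helper for illustration only.
func binIndex(v, min, max float64, n int) int {
	if n <= 0 || !(max > min) || math.IsNaN(v) {
		return -1
	}
	f := (v - min) / (max - min) // fractional position in the range
	if f <= 0 {
		return 0
	}
	if f >= 1 {
		return n - 1
	}
	i := int(f * float64(n))
	if i >= n { // guard against floating point rounding at the top edge
		i = n - 1
	}
	return i
}

func main() {
	// With a 4-colour palette, each bin covers a quarter of the range.
	for _, v := range []float64{0, 0.24, 0.25, 0.99, 1} {
		fmt.Println(v, binIndex(v, 0, 1, 4))
	}
}
```

With a 256-colour palette the same function gives 256 bins, which is the level beyond which finer resolution is argued above to be overkill.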

The design of colour palettes is an open issue, #20. However, I don't see a good consistent approach that improves the situation.

The absence of a legend is also an open issue, #17. You are welcome to pick that up if you want.

@ctessum
Contributor

ctessum commented Mar 19, 2015

While I don't disagree with anything you say, I think it is useful to distinguish between the treatment of categorical and continuous variables in the API, although I agree that it is not necessary in the underlying palette implementation. For instance, a legend for a color map of a categorical variable would have a label for each color and visual discontinuities between them, whereas a legend for a color map of a continuous variable should appear to the human eye as a continuous gradient (even though it would only need to use 256 or fewer colors), and only some of the colors would be labeled.

I think this figure is an example of why the distinction is important:

http://www.nrel.gov/gis/images/80m_wind/USwind300dpe4-11.jpg

For whatever reason (I think quite often for lack of good tools), people often use a categorical color scheme to represent continuous data, as is the case in the figure I link to here. In the case of this figure, 8.01 and 8.99 m/s are the same color, but 9.01 m/s is a different color. This particular figure isn't too egregious in this sense, but it's also difficult to read the legend with so many labels. Sometimes people will decrease the number of colors to make the legend easier to read, which ends up creating patterns where there aren't any. (This figure also creates patterns where there aren't any, but in my mind that is more because of their choice of colors than because of the binning issue.)

@kortschak
Member

> While I don't disagree with anything you say, I think it is useful to distinguish between the treatment of categorical and continuous variables in the API

Sure. How should we do this? Maybe have a method Labels() []string extending the palette.Palette interface to make palette.Qualitative. This would require that we have a function that associates a []string with the given palette. The possible extension of this is that we could define a palette.Continuous with a method FractionalColor(float64) color.Color (looking for a much better name here) which returns the fractional colour of the palette over the range [0, 1] (either interpolated or not - we leave this to the implementer).

These are probably worth adding to #20.

Addressing the rest: sure again. The issue is that many people are lousy at figures. We can't fix that. That colour scheme should have been picked up by reviewers if not by the authors.

@kortschak
Member

I think this can be closed. Reopen if you disagree.
