[docs] clean up window function guide
1 parent
4fcf9ac
commit bebc9a9
Showing 1 changed file with 41 additions and 25 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,46 +1,62 @@ | ||
# Perform operations on groups of rows | ||
--- | ||
sheet: Sheet | ||
--- | ||
# Create a window over consecutive rows | ||
|
||
The window function creates a new column where each row contains the current row's value plus the values of the rows before and/or after it in the source column. | |
Window functions enable computations that relate the current row to surrounding rows, like cumulative sums, rolling averages, or lead/lag computations. | |
|
||
Window functions enable computations that relate the current row to surrounding rows, for example: | |
- cumulative sum | ||
- rolling averages | ||
- lead/lag computations | ||
{help.commands.addcol-window} | ||
|
||
## Window functions operation on columns | ||
With large window sizes, use [:code]g'[/] (`freeze-sheet`) to calculate all cells and copy the entire sheet into a new source sheet, which conserves CPU. | |
|
||
Create a window for a column. The new column will contain the current row, and also any before or after rows specified when creating the window. | ||
## Examples | ||
|
||
- {help.command.addcol-window} | ||
date color price | ||
---------- ----- ----- | ||
2024-09-01 R 38 | |
2024-09-02 B 28 | ||
2024-09-03 R 100 | ||
2024-09-03 B 33 | ||
2024-09-03 B 99 | ||
|
||
To conserve memory and speed with large windows, one approach is to: | ||
1. Add any expressions that operate on the window column. | |
2. Freeze the sheet with [:keys]g'[/]. | |
|
||
## Examples | ||
1. [:keys]#[/] (`type-int`) on the **price** column to type as int. | ||
2. [:keys]w[/] (`addcol-window`) on the **price** column, followed by `1 2`, to create a window consisting of 4 rows: 1 row before the current row, and 2 rows after. | ||
3. To create a moving average of the values in the window, add a new column with a python expression: [:keys]=[/] (`addcol-expr`) | ||
followed by `sum(price_window)/len(price_window)` | ||
|
||
After creating a window, use a python expression to operate on it. | ||
date color price price_window sum(price_window)/len(price_window) | ||
---------- ----- ----- ------------------- ----------------------------------- | ||
2024-09-01 R 38 [4] ; 38; 28; 100 41.5 | ||
2024-09-02 B 28 [4] 38; 28; 100; 33 49.75 | ||
2024-09-03 R 100 [4] 28; 100; 33; 99 65.0 | ||
2024-09-03 B 33 [4] 100; 33; 99; 58.0 | ||
2024-09-03 B 99 [4] 33; 99; ; 33.0 | ||
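
The arithmetic behind the table above can be checked outside VisiData with a minimal Python sketch. This is not VisiData's implementation: the helper name `windows` and the `pad` parameter are hypothetical, and padding out-of-range positions with 0 is an assumption chosen because it is consistent with the averages shown (every window is divided by 4, including at the edges).

```python
def windows(values, before, after, pad=0):
    """Build one window per row: `before` earlier values, the current value,
    then `after` later values. Positions outside the data use `pad`
    (an assumption; it matches the averages in the example table)."""
    n = len(values)
    result = []
    for i in range(n):
        win = []
        for j in range(i - before, i + after + 1):
            win.append(values[j] if 0 <= j < n else pad)
        result.append(win)
    return result


prices = [38, 28, 100, 33, 99]  # the price column from the result table
price_window = windows(prices, before=1, after=2)

# Same expression as in step 3: sum(price_window)/len(price_window)
moving_avg = [sum(w) / len(w) for w in price_window]
print(moving_avg)  # [41.5, 49.75, 65.0, 58.0, 33.0]
```

Each window has 4 entries (1 before, the current row, 2 after), so the divisor stays 4 even for the first and last rows.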
|
||
For example, given a window column 'win', to create a moving average of the | |
values in the window, add a new column with a python expression. | ||
|
||
``` | ||
=sum(win)/len(win) | ||
``` | ||
## Workflows | ||
|
||
### Create a cumulative sum | ||
|
||
- set the before window size to >= the total number of rows in the table, and the after rows to 0. | ||
- add an expression of `sum(window)` where `window` is the name of the window function column. | |
1. Set the before window size to the total number of rows in the table, and the after rows to 0. In the above example that would be `w 5 0` (`addcol-window`). | ||
2. Add an expression ([:keys]=[/], `addcol-expr`) of `sum(window)` where `window` is the name of the window function column. | |
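
The two steps above can be sketched in plain Python to show why a "before" size equal to the row count yields a running total. This is an illustration only, not how VisiData computes the column; the function name `cumulative_sums` is hypothetical and the prices are the five values from the example table.

```python
def cumulative_sums(values):
    """Emulate a window of `before = len(values)` and `after = 0`,
    then take sum(window) for each row."""
    before = len(values)
    totals = []
    for i in range(len(values)):
        start = max(0, i - before)      # always 0 when before >= row count
        window = values[start:i + 1]    # every earlier row plus the current one
        totals.append(sum(window))
    return totals


prices = [38, 28, 100, 33, 99]
print(cumulative_sums(prices))  # [38, 66, 166, 199, 298]
```

Because the window always reaches back to the first row, `sum(window)` for row *i* is the total of rows 0 through *i*.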
|
||
### Compute rank | ||
|
||
https://github.com/saulpw/visidata/discussions/2280#discussioncomment-8314593 | ||
See https://github.com/saulpw/visidata/discussions/2280 for a discussion on how to use window functions to compute a rank column, where the rank restarts from 1 each time the value changes. For example: | |
|
||
value rank | ||
----- ---- | ||
A 1 | ||
A 2 | ||
B 1 | ||
C 1 | ||
C 2 | ||
C 3 | ||
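
As a point of comparison, the rank column in the table above can be reproduced with a short standalone Python loop. This is a sketch of the intended result only; it is not the window-based approach from the linked discussion, and the function name `restart_rank` is hypothetical.

```python
def restart_rank(values):
    """Rank that counts up while the value repeats and restarts at 1
    whenever the value differs from the previous row."""
    ranks = []
    previous = object()  # sentinel that never equals a real value
    rank = 0
    for value in values:
        rank = rank + 1 if value == previous else 1
        ranks.append(rank)
        previous = value
    return ranks


print(restart_rank(["A", "A", "B", "C", "C", "C"]))  # [1, 2, 1, 1, 2, 3]
```

The comparison only looks one row back, which is the same information a window of 1 before and 0 after would expose.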
|
||
### Compute the change between rows | ||
|
||
1. Create a window function of size 1 before and 0 after | ||
2. Add a python expression. Assume the window function column is 'win', and the current (integer) column is named seconds: | ||
1. Use `w 1 0` (`addcol-window`) to create a window of size 1 before and 0 after. | |
2. Add a python expression. Assume the window function column is 'win': | ||
`=win[1] - win[0] if len(win) > 1 else None` | ||
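
The expression above can be tried in plain Python. This sketch assumes that the first row's window holds only the current value (which is what the `len(win) > 1` guard suggests); the function name `row_changes` and the sample `seconds` values are made up for illustration.

```python
def row_changes(values):
    """For each row, build a window of 1 before and 0 after, then apply
    win[1] - win[0] if len(win) > 1 else None."""
    changes = []
    for i in range(len(values)):
        win = values[max(0, i - 1):i + 1]  # previous value (if any) and current
        changes.append(win[1] - win[0] if len(win) > 1 else None)
    return changes


seconds = [10, 12, 17, 17]  # hypothetical integer column
print(row_changes(seconds))  # [None, 2, 5, 0]
```

The first row has no predecessor, so its change is `None` rather than a number.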
|
||
|