Rework and clean up documentation
amotl committed Jan 16, 2025
1 parent 9df6702 commit a69a9ca
Showing 13 changed files with 268 additions and 194 deletions.
4 changes: 2 additions & 2 deletions doc/cfr/backlog.md → doc/backlog/cfr.md
@@ -1,9 +1,9 @@
# CrateDB CFR Backlog
# Backlog for `ctk cfr`

## Iteration +1
- Software tests
- Converge output into tar archive
- Combine with `ctk wtf info`
- Combine with `ctk info cluster`
- On sys-export, add it to the CFR package
- After sys-import, use it to access the imported data

9 changes: 9 additions & 0 deletions doc/backlog/index.md
@@ -0,0 +1,9 @@
# Backlogs

```{toctree}
:maxdepth: 1
main
info
cfr
```
8 changes: 4 additions & 4 deletions doc/wtf/backlog.md → doc/backlog/info.md
@@ -1,4 +1,4 @@
# CrateDB WTF Backlog
# Backlog for `ctk info`

## Iteration +1
- Display differences to the standard configuration
@@ -10,8 +10,8 @@
- Inform about [shard imbalances](https://community.cratedb.com/t/cratedb-database-logs-showing-shard-is-now-inactive-and-threads-are-getting-blocked/1617/16).

## Iteration +2
- Make `cratedb-wtf logs` also optionally consider `sys.` tables.
- cratedb-wtf explore table|shard|partition|node
- Make `ctk info logs` also optionally consider `sys.` tables.
- Make `ctk info ...` explore table|shard|partition|node
- High-level analysis, evaluating a set of threshold rules
- High-level summary reports with heuristics support
- Network diagnostics?
@@ -35,7 +35,7 @@
- Proper marshalling of timestamp values (ISO 8601)
- Expose collected data via HTTP API
```
cratedb-wtf serve
ctk info serve
```
- Provide `scrub` option also via HTTP
- Complete collected queries and code snippets
4 changes: 2 additions & 2 deletions doc/backlog.md → doc/backlog/main.md
@@ -1,4 +1,4 @@
# Backlog
# Main Backlog

## Iteration +1
- Table Loader: Refactor URL dispatcher, use fsspec
@@ -61,7 +61,7 @@
- Kafka:
- https://github.com/bakdata/streams-bootstrap
- https://pypi.org/project/kashpy/
- CFR/WTF
- CTK INFO/CFR
- https://github.com/peekjef72/sql_exporter
- Migrate / I/O adapter
- https://community.cratedb.com/t/migrating-from-postgresql-or-timescale-to-cratedb/620
96 changes: 5 additions & 91 deletions doc/cfr/index.md
@@ -1,99 +1,13 @@
(cfr)=
# CrateDB Cluster Flight Recorder (CFR)

CFR helps to collect information about CrateDB clusters for support requests
and self-service debugging.

## Install
```shell
pip install --upgrade 'cratedb-toolkit[cfr]'
```
Alternatively, use the Docker image at `ghcr.io/crate/cratedb-toolkit`.

## Synopsis

Define CrateDB database cluster address using the `CRATEDB_SQLALCHEMY_URL`
environment variable.
```shell
export CRATEDB_SQLALCHEMY_URL=crate://localhost/
```

Export system table information into timestamped files. By default, they are
written below the current working directory, into a directory following the
pattern `cfr/{clustername}/{timestamp}/sys`.
```shell
ctk cfr sys-export
```

Import system table information from given directory.
```shell
ctk cfr sys-import file://./cfr/crate/2024-04-18T01-13-41/sys
```


## Usage

### Target and source directories

The target directory on the export operation, and the source directory on the
import operation, can be specified using a single positional argument on the
command line.

Export system table information into given directory.
```shell
ctk cfr sys-export file:///var/ctk/cfr
```

Import system table information from given directory.
```shell
ctk cfr sys-import file:///var/ctk/cfr/crate/2024-04-18T01-13-41/sys
```

Alternatively, you can use the `CFR_TARGET` and `CFR_SOURCE` environment
variables.

### CrateDB database address

The CrateDB database address can be defined on the command line, using the
`--cratedb-sqlalchemy-url` option, or by using the `CRATEDB_SQLALCHEMY_URL`
environment variable.
```shell
ctk cfr --cratedb-sqlalchemy-url=crate://localhost/ sys-export
```


## OCI

If you don't want to or can't install the program, you can also use its OCI
container image, for example on Docker, Podman, Kubernetes, and friends.

Optionally, start a CrateDB single-node instance for testing purposes.
```shell
docker run --rm -it \
--name=cratedb --publish=4200:4200 --env=CRATE_HEAP_SIZE=4g \
crate/crate:nightly -Cdiscovery.type=single-node
```

Define the database URI address, and an alias to the `cfr` program.
```shell
echo "CRATEDB_SQLALCHEMY_URL=crate://localhost/" > .env
alias cfr="docker run --rm -it --network=host --volume=$(pwd)/cfr:/cfr --env-file=.env ghcr.io/crate/cratedb-toolkit:latest ctk cfr"
```

Export system table information.
```shell
cfr sys-export
```

Import system table information.
```shell
cfr sys-import cfr/crate/2024-04-18T01-13-41/sys
```

CrateDB Toolkit provides a few utilities for collecting and recording
diagnostics and metadata information, available through `ctk cfr`.

```{toctree}
:maxdepth: 1
:hidden:
backlog
info
jobstats
systable
```
10 changes: 10 additions & 0 deletions doc/cfr/info.md
@@ -0,0 +1,10 @@
(cfr-info)=
# Cluster information recorder

Record complete outcomes of `ctk info cluster` and `ctk info jobs`.
```shell
ctk cfr info record
```
:::{tip}
See also {ref}`cluster-info`.
:::
8 changes: 8 additions & 0 deletions doc/cfr/jobstats.md
@@ -0,0 +1,8 @@
(cfr-jobstats)=
# Job statistics collector

Collect and display job statistics.
```shell
ctk cfr jobstats collect
ctk cfr jobstats view
```
96 changes: 96 additions & 0 deletions doc/cfr/systable.md
@@ -0,0 +1,96 @@
(cfr-systable)=
# System table exporter

CFR's `sys-export` and `sys-import` commands support collecting and analyzing
information about CrateDB clusters for support requests and self-service
debugging.

## Install
```shell
pip install --upgrade 'cratedb-toolkit[cfr]'
```
:::{tip}
Alternatively, use the Docker image at `ghcr.io/crate/cratedb-toolkit`.
For more information about installing CrateDB Toolkit, see {ref}`install`.
:::


## Synopsis

Define CrateDB database cluster address using the `CRATEDB_SQLALCHEMY_URL`
environment variable.
```shell
export CRATEDB_SQLALCHEMY_URL=crate://localhost/
```

Export system table information into timestamped files. By default, they are
written below the current working directory, into a directory following the
pattern `cfr/{clustername}/{timestamp}/sys`.
```shell
ctk cfr sys-export
```
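
The directory pattern can be illustrated with a small shell sketch. It does
not call `ctk`. The timestamp format is inferred from the example path
`2024-04-18T01-13-41` used in this document, and `crate` is only an assumed
cluster name. Whether the tool uses local time or UTC is not stated here.
```shell
# Sketch only: compose a path following `cfr/{clustername}/{timestamp}/sys`.
# `crate` is an assumed cluster name; the timestamp format is inferred
# from the example path used in this document.
clustername="crate"
timestamp="$(date +%Y-%m-%dT%H-%M-%S)"
target="cfr/${clustername}/${timestamp}/sys"
echo "${target}"
```
Because this timestamp format sorts lexicographically, the most recent export
of a cluster appears last in a sorted directory listing.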

Import system table information from given directory.
```shell
ctk cfr sys-import file://./cfr/crate/2024-04-18T01-13-41/sys
```


## Usage

### Target and source directories

The target directory on the export operation, and the source directory on the
import operation, can be specified using a single positional argument on the
command line.

Export system table information into given directory.
```shell
ctk cfr sys-export file:///var/ctk/cfr
```

Import system table information from given directory.
```shell
ctk cfr sys-import file:///var/ctk/cfr/crate/2024-04-18T01-13-41/sys
```

Alternatively, you can use the `CFR_TARGET` and `CFR_SOURCE` environment
variables.
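
A minimal sketch of the environment variable variant, assuming `CFR_TARGET`
takes the same kind of value as the positional argument of `sys-export`. The
location is an example value, and `CRATEDB_SQLALCHEMY_URL` is expected to be
defined as described above.
```shell
# Assumed example location; adjust to your environment.
export CFR_TARGET="file:///var/ctk/cfr"

# Invoke the export only when `ctk` is installed.
if command -v ctk >/dev/null 2>&1; then
  ctk cfr sys-export
fi
```
Presumably, `CFR_SOURCE` works the same way for `sys-import`.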

### CrateDB database address

The CrateDB database address can be defined on the command line, using the
`--cratedb-sqlalchemy-url` option, or by using the `CRATEDB_SQLALCHEMY_URL`
environment variable.
```shell
ctk cfr --cratedb-sqlalchemy-url=crate://localhost/ sys-export
```


## OCI

If you don't want to or can't install the program, you can also use its OCI
container image, for example on Docker, Podman, Kubernetes, and friends.

Optionally, start a CrateDB single-node instance for testing purposes.
```shell
docker run --rm -it \
--name=cratedb --publish=4200:4200 --env=CRATE_HEAP_SIZE=4g \
crate/crate:nightly -Cdiscovery.type=single-node
```

Define the database URI address, and an alias to the `cfr` program.
```shell
echo "CRATEDB_SQLALCHEMY_URL=crate://localhost/" > .env
alias cfr="docker run --rm -it --network=host --volume=$(pwd)/cfr:/cfr --env-file=.env ghcr.io/crate/cratedb-toolkit:latest ctk cfr"
```

Export system table information.
```shell
cfr sys-export
```

Import system table information.
```shell
cfr sys-import cfr/crate/2024-04-18T01-13-41/sys
```
45 changes: 4 additions & 41 deletions doc/cmd/index.md
@@ -1,46 +1,9 @@
# Utility Commands

(tail)=
## ctk tail
A bundle of ad hoc inquiry utilities, for diagnostics and more.

`ctk tail` displays the most recent records of a database table.
It also provides special decoding options for the `sys.jobs_log` table.
```{toctree}
:maxdepth: 1
:::{rubric} Synopsis
:::
```shell
ctk tail -n 3 sys.summits
tail
```

:::{rubric} Options
:::
You can combine `ctk tail`'s JSON and YAML output with programs like `jq` and `yq`.
```shell
ctk tail -n 3 sys.summits --format=json | jq
ctk tail -n 3 sys.summits --format=yaml | yq
```
Optionally poll the table for new records by using the `--follow` option.
```shell
ctk tail -n 3 doc.mytable --follow
```

:::{rubric} Decoder for `sys.jobs_log`
:::
`ctk tail` provides a special decoder when processing records of the `sys.jobs_log`
table. The default output format `--format=log` prints records in a concise
single-line format.
```shell
ctk tail -n 3 sys.jobs_log
```
The `--format=log-pretty` option will format the SQL statements for optimal
copy/paste procedures. Together with the `--follow` option, this provides
optimal support for ad hoc tracing of SQL statements processed by CrateDB.
```shell
ctk tail -n 3 sys.jobs_log --follow --format=log-pretty
```

:::{warning}
Because `ctk tail` works by submitting SQL commands to CrateDB, using its `--follow`
option will spam the `sys.jobs_log` with additional entries. The default interval
is 0.1 seconds, and can be changed using the `--interval` option.
:::
44 changes: 44 additions & 0 deletions doc/cmd/tail.md
@@ -0,0 +1,44 @@
(tail)=
# ctk tail

`ctk tail` displays the most recent records of a database table.
It also provides special decoding options for the `sys.jobs_log` table.

:::{rubric} Synopsis
:::
```shell
ctk tail -n 3 sys.summits
```

:::{rubric} Options
:::
You can combine `ctk tail`'s JSON and YAML output with programs like `jq` and `yq`.
```shell
ctk tail -n 3 sys.summits --format=json | jq
ctk tail -n 3 sys.summits --format=yaml | yq
```
Optionally poll the table for new records by using the `--follow` option.
```shell
ctk tail -n 3 doc.mytable --follow
```

:::{rubric} Decoder for `sys.jobs_log`
:::
`ctk tail` provides a special decoder when processing records of the `sys.jobs_log`
table. The default output format `--format=log` prints records in a concise
single-line format.
```shell
ctk tail -n 3 sys.jobs_log
```
The `--format=log-pretty` option will format the SQL statements for optimal
copy/paste procedures. Together with the `--follow` option, this provides
optimal support for ad hoc tracing of SQL statements processed by CrateDB.
```shell
ctk tail -n 3 sys.jobs_log --follow --format=log-pretty
```

:::{warning}
Because `ctk tail` works by submitting SQL commands to CrateDB, using its `--follow`
option will spam the `sys.jobs_log` with additional entries. The default interval
is 0.1 seconds, and can be changed using the `--interval` option.
:::
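
To reduce the number of additional entries, a longer polling interval can be
chosen. A minimal sketch, assuming `--interval` accepts a value in seconds, as
its default of 0.1 seconds suggests. The value `1` is an arbitrary example.
```shell
# Assumed polling interval in seconds; the documented default is 0.1.
interval=1

# Invoke the command only when `ctk` is installed.
if command -v ctk >/dev/null 2>&1; then
  ctk tail -n 3 sys.jobs_log --follow --format=log-pretty --interval="${interval}"
fi
```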