diff --git a/README.md b/README.md
new file mode 100644
index 00000000..a47339b5
--- /dev/null
+++ b/README.md
@@ -0,0 +1,79 @@
+# Geneve
+
+Geneve is a data generation tool; its name stands for GENerate EVEnts.
+
+To better understand its basics, consider Elastic Security's
+[detection engine](https://www.elastic.co/guide/en/security/current/detection-engine-overview.html).
+It regularly searches one or more indices for suspicious events and, when
+a match is found, creates an alert. To do so it needs detection rules
+that define what a _suspicious event_ looks like.
+
+The original goal of Geneve can then be summarized as:
+
+> Given a detection rule, generate source events that would trigger the creation of an alert.
+
+It does so by analyzing the rule, building an abstract syntax tree of the
+enclosed query and translating it to an intermediate language that is used
+for generating documents (= events) over and over.
+
+What became obvious over time is that the query at the heart of each rule
+is actually a powerful way to drive document generation, one that goes
+well beyond triggering alerts.
+
+Additionally, generating garbage data that satisfies a rule is one thing;
+generating realistic data that can be analyzed with Kibana, which is an
+implicit goal of the tool, is quite another.
+
+The latter is a much harder nut to crack than the original goal and is
+currently under development.
+
+If you want to try it, read [Getting started](docs/getting_started.md).
+
+# Status
+
+## Data modeling
+
+Rule and query parsing, AST creation and IR generation are well developed
+and rigorously tested by the CI/CD pipelines. The generated events are
+good enough to trigger many of the expected alerts on various versions of
+the stack, from 8.2.0 to 8.6.0. The work is necessarily incomplete, but
+what is implemented aims to be as correct as possible.
+
+The detection rule set used for the tests is loaded separately into
+Geneve and is currently locked to version 8.2.0 (718 rules in total). The
+next step is to use the rules preloaded in the Kibana under test
+(https://github.com/elastic/geneve/issues/125).
+
+Kinds of issues observed in this area:
+
+1. skipped rules due to an unimplemented rule type (i.e. threshold) or
+ query language (i.e. lucene).
+ 73 rules.
+2. generation errors due to unimplemented query language features or
+ improvements needed in what is already implemented.
+ 80 rules.
+3. incorrect generation: the expected alerts are not actually created.
+ 5 rules.
+
+The first two points are detailed in the
+[Documents generation from detection rules](/tests/reports/documents_from_rules.md)
+test report; the last is covered in the
+[Alerts generation from detection rules](tests/reports/alerts_from_rules.md) one.
+
+Number of rules for which correct data is generated and alerts are created: 560.
+
+## Data realism
+
+Allowing the user to "click through" requires that the generated data
+exploits the relations that Kibana is made to observe. Relations imply
+the entities they connect, and these entities need to be consistent
+across the whole generation batch.
+
+The problem is being understood better and better; parts of its solution
+are already implemented, others are still only sketched.
+
+## User interface
+
+Geneve is composed of a Python module and a REST API server that exposes
+it. The Python API is quite simple and stable; the REST API, instead, has
+rough edges and needs proper simplification.
diff --git a/docs/data_model.md b/docs/data_model.md
new file mode 100644
index 00000000..baf17602
--- /dev/null
+++ b/docs/data_model.md
@@ -0,0 +1,105 @@
+# Data model
+
+The Geneve data model describes what data Geneve is expected to generate.
+It guides and constrains the data generation process so that the output
+satisfies your criteria.
+
+Think of it this way: data generation is a random process; at its root it
+just produces a long random string made of 0s and 1s. What you actually
+want is to shape the result and channel the randomness so that the
+generated data looks sensible in your context and, at the same time, is
+never quite the same.
+
+In essence, you tell Geneve what you are searching for and it returns a
+JSON document that is a plausible answer to your search; every time the
+answer is different. If this sounds like "queries" to you, you're right:
+Geneve's input is queries.
+
+## Queries
+
+You have to provide at least one query to Geneve. If you give it more
+than one, Geneve randomly chooses which query to generate the document
+for at each round.
+
+Suppose you have this query:
+
+```
+process.name: "*.exe"
+```
+
+What it tells Geneve is: you want the documents to have a field named
+`process.name` and its content needs to match the wildcard `*.exe`.
+
+Generated documents could be:
+
+```json
+{"process.name": "excel.exe"}
+```
+
+```json
+{"process.name": "winword.exe"}
+```
+
+but also, and more likely, names made of random letters such as
+
+```json
+{"process.name": "LDow.exe"}
+```
+
+or
+
+```json
+{"process.name": "OjiRlQMX.exe"}
+```
+
+If you really want to control the options, you can enumerate them:
+
+```
+process.name: ("excel.exe" or "winword.exe" or "regedit.exe")
+```
+
+The generated document can then only be one of the three possible; you
+have restricted the choices Geneve can make.
+
+Let's try another one:
+
+```
+process.name: "10.0.0.0/8"
+```
+
+you get
+
+```json
+{"process.name": "10.0.0.0/8"}
+```
+
+Surprising as it may be, this is the only answer Geneve can give unless
+you tell it to consider `process.name` to be of type `ip address`.
+
+This is where the schema comes into play: it defines the fields and their types. We'll assume
+[ECS](https://www.elastic.co/guide/en/ecs/current/ecs-field-reference.html)
+is in use but Geneve does not; if you want ECS you need to load it (see
+[Loading the schema](https://github.com/cavokz/geneve/blob/add-some-docs3/docs/getting_started.md#loading-the-schema)).
+If you use fields that are not in the schema, Geneve considers them of type `plain text` (`keyword`, actually).
+
+Now try again with a more appropriate field:
+
+```
+source.ip: "10.0.0.0/8"
+```
+
+you get, for example
+
+```json
+{"source.ip": "10.23.84.86"}
+```
+
+## Query languages
+
+All the queries in the examples above are expressed in the
+[Kibana Query Language](https://www.elastic.co/guide/en/kibana/current/kuery-query.html) (Kuery)
+but you can also use the
+[Event Query Language](https://www.elastic.co/guide/en/elasticsearch/reference/current/eql.html) (EQL).
+These are the only two languages supported at the moment, but others could well be added.
+
+Regardless of the query language used, the fields remain those defined by the schema.
diff --git a/docs/getting_started.md b/docs/getting_started.md
new file mode 100644
index 00000000..e64f4280
--- /dev/null
+++ b/docs/getting_started.md
@@ -0,0 +1,325 @@
+# Getting started
+
+## Data generation process
+
+The data generation process uses this analogy: generated data flows from source to sink.
+
+To generate data it is then necessary to define:
+
+* `source`: what data is generated, e.g. the data model
+* `sink`: where data is sent to, e.g. an ES index
+* `flow`: how data is transmitted, e.g. how fast or how much
+* `schema`: the field definitions, e.g. ECS 8.2.0
+
+Each of the above is handled by its own REST API endpoint. An arbitrary
+number of sources, sinks, flows and schemas can be defined on the same
+server.
+
+## Install
+
+Currently Geneve is packaged only for [Homebrew](https://brew.sh). You
+first need to install the Geneve tap
+
+```shell
+$ brew tap elastic/geneve
+```
+
+then the tool itself
+
+```shell
+$ brew install geneve
+```
+
+## REST API server
+
+Data is generated by the Geneve server, which you start with
+
+```shell
+$ geneve serve
+2023/01/31 16:40:23 Control: http://localhost:9256
+```
+
+The server keeps the terminal busy with its logs; to stop it, just press `^C`.
+The first line in the log shows where to reach it. This is the base URL of
+the server; all the API endpoints are reachable (but not browsable) under
+`api/`.
+
+For the rest of this document we'll assume that the following shell
+variables are set:
+
+* `$GENEVE` points to the Geneve server, URL `http://localhost:9256`
+* `$TARGET_ES` is the URL of the target Elasticsearch instance
+* `$TARGET_KIBANA` is the URL of the corresponding Kibana
+
+Now open a separate terminal to operate on the server with curl.
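+
+As a minimal sketch, the variables could be set as follows. Only the
+Geneve URL comes from the server log above; the Elasticsearch and Kibana
+URLs are assumptions based on the default local ports (9200 and 5601), so
+replace them with those of your own stack.
+
+```shell
+$ export GENEVE=http://localhost:9256          # base URL printed by `geneve serve`
+$ export TARGET_ES=http://localhost:9200       # assumed local Elasticsearch
+$ export TARGET_KIBANA=http://localhost:5601   # assumed local Kibana
+```
+
+The variables must be exported in the terminal where you run curl, not in
+the one kept busy by the server.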
+
+## Loading the schema
+
+The schema describes the fields that can be present in a generated
+document. At the moment it needs to be explicitly loaded into the server.
+
+Download the latest version (or any other, if you have a preference) from
+https://github.com/elastic/ecs/releases and look for the file `ecs_flat.yml`
+in the folder `ecs-X.Y.Z/generated/ecs/`.
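+
+For example, the release can be downloaded and unpacked in the current
+directory as sketched below. Version 8.2.0, the `ECS_VERSION` variable and
+the use of GitHub's tag archive URL are choices made here for
+illustration; any other way of obtaining the file works as well.
+
+```shell
+$ ECS_VERSION=8.2.0
+$ curl -sL "https://github.com/elastic/ecs/archive/refs/tags/v$ECS_VERSION.tar.gz" | tar xz
+$ export SCHEMA_YAML="$PWD/ecs-$ECS_VERSION/generated/ecs/ecs_flat.yml"
+$ ls "$SCHEMA_YAML"    # fails if the file is not where expected
+```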
+
+Assuming that the path of said file is in the shell variable `$SCHEMA_YAML`,
+you load it with
+
+```shell
+$ curl -s -XPUT -H "Content-Type: application/yaml" "$GENEVE/api/schema/ecs" --data-binary "@$SCHEMA_YAML"
+```
+
+The `ecs` in the endpoint `api/schema/ecs` is an arbitrary name; it's how
+the loaded schema is addressed by the server.
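+
+Since the name is arbitrary and the server accepts any number of schemas,
+a second schema could presumably be loaded next to the first under a
+different name. In this sketch both the name `myschema` and the variable
+`$OTHER_SCHEMA_YAML` are hypothetical, made up for the example.
+
+```shell
+$ curl -s -XPUT -H "Content-Type: application/yaml" "$GENEVE/api/schema/myschema" --data-binary "@$OTHER_SCHEMA_YAML"
+```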
+
+## Define the data model
+
+In the data model you describe the data that shall be generated. It can
+be as simple as a list of fields that need to be present, or more complex
+when it also defines the relations among them.
+
+How to write a data model is a separate subject (see [Data model](data_model.md));
+here we focus on how to configure one on the server. You use the `api/source` endpoint.
+
+```shell
+$ curl -s -XPUT -H "Content-Type: application/yaml" "$GENEVE/api/source/mydata" --data-binary @- </_mappings`
+endpoint returns the mappings of all the possible fields that can be
+encountered in the documents generated by that source.
+
+Use the Elasticsearch index API to create the index
+
+```shell
+$ curl -s -XPUT -H "Content-Type: application/json" $TARGET_ES/myindex --data @- <