
Commit

why: history section "done"
deobald committed Nov 13, 2023
1 parent 3ed3ef6 commit 222b727
Showing 1 changed file with 65 additions and 1 deletion.
66 changes: 65 additions & 1 deletion src/appendix/why.md
@@ -1,10 +1,74 @@
# Why?

Why did we build Endatabas at all?
Isn't one of the many ([many](https://www.dbdb.io)) existing databases good enough?

## What is Endatabas, anyway?

The tagline "SQL Document Database With Full History" says a lot, but it doesn't say everything.
Endatabas is, first and foremost, an _immutable database_.
That's the Full History part.
But storing all your data, forever, has clear implications.

We consider these implications to be the _pillars_ of Endatabas.
In 3D geometry, the legs of a tripod are mutually supportive; as long as all three feet are in contact with the ground, the tripod will not wobble or collapse.
So it is with the pillars.
Each supports and implies the others.
The pillars are as follows:

* Full History (requires: immutable data and erasure)
* Timeline (requires: time-traveling queries)
* Separation of Storage from Compute (requires: light and adaptive indexing)
* Documents (requires: schemaless tables, "schema-per-row", arbitrary joins)
* Analytics (requires: columnar storage and access)

At the top of this 5D tripod is SQL, the lingua franca of database queries.
A window of time has recently opened when all of this is finally possible.
But first, let's look back at some history to see how we got here.
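
To make the "Timeline" pillar concrete, here is a sketch of a time-traveling query.
It uses standard SQL:2011 system-time syntax purely for illustration; the table and timestamp are hypothetical, and Endatabas's own syntax may differ in its details.

```sql
-- Read the table as it existed at a point in the past,
-- rather than its current state.
SELECT *
FROM products FOR SYSTEM_TIME AS OF TIMESTAMP '2023-01-01 00:00:00'
WHERE name = 'Pendant Lamp';
```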

## History

None of the ideas in Endatabas are new.

George Copeland's [_What if mass storage were free?_](https://www.endatabas.com/references.html#10.1145/800083.802685)
asked, back in 1980, what an immutable database might look like.
His prescient vision for a database with full history enjoys the clarity of a researcher at the beginning of the database era.
People have occasionally asked of Endatabas, "why bother retaining all history?"
But this is the wrong question.
The real question is: "why bother destroying data?"
Copeland answers: "The deletion concept was invented to reuse expensive computer storage."
The software industry has grown so accustomed to the arbitrary deletion of historical data that we now take destroying data for granted.

Mass storage is not free yet -- but it is cheap.
Copeland himself addresses "a more realistic argument: if the cost of mass storage were low enough, then deletion would become undesirable."
Any system that exploits the separation of storage and compute can enjoy these low costs.

Jensen and Snodgrass have thoroughly researched time-related database queries.
Much of their work was published [in the 1990s](https://www.endatabas.com/bibliography.html#10.1109/69.755613)
and early 2000s.
Storing time, querying across time, time as a value ... these challenging subjects eventually grew to form
[SQL:2011](https://www.endatabas.com/bibliography.html#ISO/IEC-19075-2:2021).
Most SQL databases have struggled to implement SQL:2011 because incorporating _time_ as a core concept amplifies the existing complexity of databases that support destructive updates and deletes.

Document databases have a more convoluted story.
Attempts at "schemaless", semi-structured, document, and object databases stretch from
[Smalltalk in the 1980s](https://www.endatabas.com/bibliography.html#10.1145/971697.602300)
to [C++ in the 1990s](https://en.wikipedia.org/wiki/Object_database#Timeline)
to [Java](https://prevayler.org/)
and [graphs](https://en.wikipedia.org/wiki/Neo4j) in the 2000s
to [JSON in the 2010s](https://en.wikipedia.org/wiki/MongoDB).
Despite all this, the most successful semi-structured document store, as of 2023, is a JSON column in a Postgres database.
Database users desire flexible storage and querying -- but yesterday's weather says they desire SQL more.

Khoshafian and Copeland introduced the [Decomposition Storage Model (DSM)](https://www.endatabas.com/bibliography.html#10.1145/318898.318923)
in 1985.
The four decades that followed saw any number of approaches to analytical processing in databases.
Most of the time, however, these tended to demand labour-intensive data logistics:
data was piped, streamed, dumped, and copied into denormalized cubes and time-series databases.
As humanity grew out of the batch processing of the 1980s into the always-online society of the 2020s, analytics data became another form of operational data and parts of this pipeline were looped back to users and customers.
More recently, Hybrid Transactional/Analytical Processing (HTAP) has emerged as a promising, more natural successor to separate OLTP and OLAP systems.
For many businesses, the transactional/analytical divide is as arbitrary as deleting data because hard disks were expensive in 1986.

## Timing

outside => in:
