From 079562e1427f6f48d0ff2161ecf7a3b2ca67ad03 Mon Sep 17 00:00:00 2001
From: Ben Reinhart
Date: Tue, 16 Jul 2024 16:04:00 -0700
Subject: [PATCH] Use collapsible markdown widgets for cells in srcmd

---
 .../srcbook/examples/getting-started.srcmd    |  21 +-
 .../examples/langgraph-web-agent.srcmd        |  26 +-
 .../api/srcbook/examples/websockets.srcmd     |  21 +-
 packages/api/srcmd.mts                        | 401 +-----------------
 packages/api/srcmd/decoding.mts               | 353 +++++++++++++++
 packages/api/srcmd/encoding.mts               | 109 +++++
 packages/api/srcmd/types.mts                  |  14 +
 packages/api/test/srcmd.test.mts              |  37 +-
 .../{notebook.srcmd => srcbook.srcmd}         |  14 +-
 .../README.md                                 |  12 +-
 .../package.json                              |   0
 .../src/foo.mjs                               |   0
 12 files changed, 566 insertions(+), 442 deletions(-)
 create mode 100644 packages/api/srcmd/decoding.mts
 create mode 100644 packages/api/srcmd/encoding.mts
 create mode 100644 packages/api/srcmd/types.mts
 rename packages/api/test/srcmd_files/{notebook.srcmd => srcbook.srcmd} (74%)
 rename packages/api/test/srcmd_files/{mock_notebook_dir => srcbook_dir}/README.md (70%)
 rename packages/api/test/srcmd_files/{mock_notebook_dir => srcbook_dir}/package.json (100%)
 rename packages/api/test/srcmd_files/{mock_notebook_dir => srcbook_dir}/src/foo.mjs (100%)

diff --git a/packages/api/srcbook/examples/getting-started.srcmd b/packages/api/srcbook/examples/getting-started.srcmd
index 3cd039ed..95703f3c 100644
--- a/packages/api/srcbook/examples/getting-started.srcmd
+++ b/packages/api/srcbook/examples/getting-started.srcmd
@@ -2,7 +2,8 @@
 
 # Getting started
 
-###### package.json
+<details>
+  <summary>package.json</summary>
 
 ```json
 {
@@ -12,6 +13,7 @@
   }
 }
 ```
+</details>
## What are Srcbooks? @@ -24,13 +26,15 @@ A Srcbook is composed of **cells**. Currently, there are 4 types of cells: 3. **markdown cell**: what you're reading is a markdown cell. It allows you to easily express ideas with rich markup, rather than code comments, an idea called [literate programming](https://en.wikipedia.org/wiki/Literate_programming). 4. **code cell**: think of these as JS or TS files. You can run them or export objects to be used in other cells. -###### simple-code.js +
+ simple-code.js ```javascript // This is a trivial code cell. You can run me by // clicking 'Run' or using the shortcut `cmd` + `enter`. console.log("Hello, Srcbook!") ``` +
## Dependencies @@ -38,37 +42,44 @@ You can add any external node.js-compatible dependency from [npm](https://www.np You'll need to make sure you install dependencies, which you can do by running the `package.json` cell above. -###### generate-random-word.js +
+ generate-random-word.js ```javascript import {generate} from 'random-words'; console.log(generate()) ``` +
## Importing other cells Behind the scenes, cells are files of JavaScript or TypeScript code. They are ECMAScript 6 modules. Therefore you can export variables from one file and import them in another. -###### star-wars.js +
+ star-wars.js ```javascript export const func = (name) => `I am your father, ${name}` ``` +
-###### logger.js +
+ logger.js ```javascript import {func} from './star-wars.js'; console.log(func("Luke")); ``` +
## Using secrets For security purposes, you should avoid pasting secrets directly into Srcbooks. The mechanism you should leverage is [secrets](/secrets). These are stored securely and are accessed at runtime as environment variables. Secrets can then be imported in Srcbooks using `process.env.SECRET_NAME`: + ``` const API_KEY = process.env.SECRET_API_KEY; const token = auth(API_KEY); diff --git a/packages/api/srcbook/examples/langgraph-web-agent.srcmd b/packages/api/srcbook/examples/langgraph-web-agent.srcmd index 1c9a75ef..eb33e023 100644 --- a/packages/api/srcbook/examples/langgraph-web-agent.srcmd +++ b/packages/api/srcbook/examples/langgraph-web-agent.srcmd @@ -2,7 +2,8 @@ # LangGraph web agent -###### package.json +
+ package.json ```json { @@ -19,6 +20,7 @@ } } ``` +
## LangGraph tutorial @@ -28,7 +30,8 @@ We're going to build an agent that can search the web using the [Tavily Search A First, let's ensure we've setup the right env variables: -###### env-check.ts +
+ env-check.ts ```typescript import assert from 'node:assert'; @@ -36,12 +39,14 @@ import assert from 'node:assert'; assert.ok(process.env.OPENAI_API_KEY, 'You need to set OPENAI_API_KEY'); assert.ok(process.env.TAVILY_API_KEY, 'You need to set TAVILY_API_KEY'); ``` +
## Define the agent Now, let's define the Agent with LangGraph.js -###### agent.ts +
+ agent.ts ```typescript import { HumanMessage } from "@langchain/core/messages"; @@ -112,12 +117,13 @@ export const memory = SqliteSaver.fromConnString(DB_NAME); // This compiles it into a LangChain Runnable. // Note that we're (optionally) passing the memory when compiling the graph export const app = workflow.compile({ checkpointer: memory }); - ``` +
Now that we've built our app, let's invoke it to first get the weather in SF: -###### sf-weather.ts +
+ sf-weather.ts ```typescript import {app} from './agent.ts'; @@ -134,12 +140,14 @@ const finalState = await app.invoke( console.log(finalState.messages[finalState.messages.length - 1].content) ``` +
Now when we pass the same `thread_id`, in this case `"42"`, the conversation context is retained via the saved state that we've set in a local sqliteDB (i.e. stored list of messages). Also, in this next example, we demonstrate streaming output. -###### ny-weather.ts +
+ ny-weather.ts ```typescript import {app} from './agent.ts'; @@ -152,17 +160,19 @@ const nextState = await app.invoke( console.log(nextState.messages[nextState.messages.length - 1].content); ``` +
## Clear memory The memory was saved in the sqlite db `./langGraph.db`. If you want to clear it, run the following cell -###### clear.ts +
+ clear.ts ```typescript import {DB_NAME} from './agent.ts'; import fs from 'node:fs'; -// I can't find good documentation on the memory module, so let's apply the nuclear method fs.rmSync(DB_NAME); ``` +
\ No newline at end of file
diff --git a/packages/api/srcbook/examples/websockets.srcmd b/packages/api/srcbook/examples/websockets.srcmd
index 71e57440..d48a0fc8 100644
--- a/packages/api/srcbook/examples/websockets.srcmd
+++ b/packages/api/srcbook/examples/websockets.srcmd
@@ -2,7 +2,8 @@
 
 # Intro to WebSockets
 
-###### package.json
+<details>
+  <summary>package.json</summary>
 
 ```json
 {
@@ -12,6 +13,7 @@
   }
 }
 ```
+</details>
This Srcbook is a fun demonstration of building a simple WebSocket client and server in Node.js using the [ws library](https://github.com/websockets/ws). We'll explore the basic concepts of communicating over WebSockets and showcase Srcbook's ability to host long-running processes. @@ -31,7 +33,8 @@ One of the most popular libraries for WebSockets in Node.js is the [ws library]( Below we implement a simple WebSocket _server_ using `ws`. -###### simple-server.js +
+ simple-server.js ```javascript import { WebSocketServer } from 'ws'; @@ -46,12 +49,14 @@ wss.on('connection', (socket) => { console.log("New client connected") }); ``` +
This simple server does nothing more than wait for incoming connections and log the messages it receives. Next, we need a _client_ to connect and send messages to it. Note the client is running in a Node.js process, not in the browser. Backends communicate over WebSockets too! -###### simple-client.js +
+ simple-client.js ```javascript import WebSocket from 'ws'; @@ -63,8 +68,8 @@ ws.on('open', () => { ws.send('Hello from simple-client.js'); ws.close(); }); - ``` +
Our simple client establishes a connection with the server, sends one message, and closes the connection. To run this example, first run the server and then run the client. Output is logged under the simple-server.js cell above. @@ -72,7 +77,8 @@ Our simple client establishes a connection with the server, sends one message, a The example above is not terribly interesting. WebSockets become more useful when the server tracks open connections and sends messages to the client. -###### stateful-server.js +
+ stateful-server.js ```javascript import { WebSocketServer } from 'ws'; @@ -132,8 +138,10 @@ wss.on('connection', (socket) => { }); }); ``` +
-###### client.js +
+ client.js ```javascript import WebSocket from 'ws'; @@ -175,6 +183,7 @@ client2.close(); console.log("Shutting down"); ``` +
## Explanation diff --git a/packages/api/srcmd.mts b/packages/api/srcmd.mts index 1591186d..373ebcd6 100644 --- a/packages/api/srcmd.mts +++ b/packages/api/srcmd.mts @@ -1,116 +1,11 @@ -import { marked } from 'marked'; import fs from 'node:fs/promises'; -import { SrcbookMetadataSchema, type SrcbookMetadataType, randomid } from '@srcbook/shared'; -import type { Tokens, Token, TokensList } from 'marked'; -import type { - CellType, - CodeCellType, - MarkdownCellType, - PackageJsonCellType, - TitleCellType, -} from '@srcbook/shared'; -import { languageFromFilename } from '@srcbook/shared'; import { pathToCodeFile, pathToPackageJson, pathToReadme } from './srcbook/path.mjs'; -marked.use({ gfm: true }); +import { encode } from './srcmd/encoding.mjs'; +import { decode } from './srcmd/decoding.mjs'; +import { type DecodeResult } from './srcmd/types.mjs'; -export function encode( - cells: CellType[], - metadata: SrcbookMetadataType, - options: { inline: boolean }, -) { - const encodedCells = cells.map((cell) => { - switch (cell.type) { - case 'title': - return encodeTitleCell(cell); - case 'markdown': - return encodeMarkdownCell(cell); - case 'package.json': - return encodePackageJsonCell(cell, options); - case 'code': - return encodeCodeCell(cell, options); - } - }); - - const encoded = [``] - .concat(encodedCells) - .join('\n\n'); - - // End every file with exactly one newline. - return encoded.trimEnd() + '\n'; -} - -export function encodeTitleCell(cell: TitleCellType) { - return `# ${cell.text}`; -} - -export function encodeMarkdownCell(cell: MarkdownCellType) { - return cell.text.trim(); -} - -export function encodePackageJsonCell(cell: PackageJsonCellType, options: { inline: boolean }) { - const source = options.inline - ? 
['###### package.json\n', '```json', cell.source.trim(), '```'] - : ['###### package.json\n', '[package.json](./package.json)']; - - return source.join('\n'); -} - -export function encodeCodeCell(cell: CodeCellType, options: { inline: boolean }) { - const source = options.inline - ? [`###### ${cell.filename}\n`, `\`\`\`${cell.language}`, cell.source, '```'] - : [ - `###### ${cell.filename}\n`, - `[${cell.filename}](./src/${cell.filename}})`, // note we don't use Path.join here because this is for the markdown file. - ]; - - return source.join('\n'); -} - -export type DecodeErrorResult = { - error: true; - errors: string[]; -}; - -export type DecodeSuccessResult = { - error: false; - cells: CellType[]; - metadata: SrcbookMetadataType; -}; - -export type DecodeResult = DecodeErrorResult | DecodeSuccessResult; - -export function decode(contents: string): DecodeResult { - // First, decode the markdown text into tokens. - const tokens = marked.lexer(contents); - - // Second, pluck out srcbook metadata (ie ): - const { metadata, tokens: filteredTokens } = getSrcbookMetadata(tokens); - - // Third, group tokens by their function: - // - // 1. title - // 2. markdown - // 3. filename - // 4. code - // - const groups = groupTokens(filteredTokens); - - // Fourth, validate the token groups and return a list of errors. - // Example errors might be: - // - // 1. The document contains no title - // 2. There is a filename (h6) with no corresponding code block - // 3. There is more than one package.json defined - // 4. etc. - // - const errors = validateTokenGroups(groups); - - // Finally, return either the set of errors or the tokens converted to cells if no errors were found. - return errors.length > 0 - ? { error: true, errors: errors } - : { error: false, metadata, cells: convertToCells(groups) }; -} +export { encode, decode }; /** * Decode a compatible directory into a set of cells. 
@@ -163,291 +58,3 @@ export async function decodeDir(dir: string): Promise { return { error: true, errors: [error.message] }; } } - -const SRCBOOK_METADATA_RE = /^$/; - -function getSrcbookMetadata(tokens: TokensList) { - let match: RegExpMatchArray | null = null; - let srcbookMetdataToken: Token | null = null; - - for (const token of tokens) { - if (token.type !== 'html') { - continue; - } - - match = token.raw.trim().match(SRCBOOK_METADATA_RE); - - if (match) { - srcbookMetdataToken = token; - break; - } - } - - if (!match) { - throw new Error('Srcbook does not contain required metadata'); - } - - try { - const metadata = JSON.parse(match[1]); - return { - metadata: SrcbookMetadataSchema.parse(metadata), - tokens: tokens.filter((t) => t !== srcbookMetdataToken), - }; - } catch (e) { - throw new Error(`Unable to parse srcbook metadata: ${(e as Error).message}`); - } -} - -type TitleGroupType = { - type: 'title'; - token: Tokens.Heading; -}; - -type FilenameGroupType = { - type: 'filename'; - token: Tokens.Heading; -}; - -type CodeGroupType = { - type: 'code'; - token: Tokens.Code; -}; - -type LinkedCodeGroupType = { - type: 'code:linked'; - token: Tokens.Link; -}; - -type MarkdownGroupType = { - type: 'markdown'; - tokens: Token[]; -}; - -type GroupedTokensType = - | TitleGroupType - | FilenameGroupType - | CodeGroupType - | MarkdownGroupType - | LinkedCodeGroupType; - -/** - * Group tokens into an intermediate representation. - */ -export function groupTokens(tokens: Token[]) { - const grouped: GroupedTokensType[] = []; - - function pushMarkdownToken(token: Token) { - const group = grouped[grouped.length - 1]; - if (group && group.type === 'markdown') { - group.tokens.push(token); - } else { - grouped.push({ type: 'markdown', tokens: [token] }); - } - } - - function lastGroupType() { - const lastGroup = grouped[grouped.length - 1]; - return lastGroup ? 
lastGroup.type : null; - } - - function isLink(token: Tokens.Paragraph) { - return token.tokens.length === 1 && token.tokens[0].type === 'link'; - } - - let i = 0; - const len = tokens.length; - - while (i < len) { - const token = tokens[i]; - - switch (token.type) { - case 'heading': - if (token.depth === 1) { - grouped.push({ type: 'title', token: token as Tokens.Heading }); - } else if (token.depth === 6) { - grouped.push({ type: 'filename', token: token as Tokens.Heading }); - } else { - pushMarkdownToken(token); - } - break; - case 'code': - if (lastGroupType() === 'filename') { - grouped.push({ type: 'code', token: token as Tokens.Code }); - } else { - pushMarkdownToken(token); - } - break; - case 'paragraph': - if (lastGroupType() === 'filename' && token.tokens && isLink(token as Tokens.Paragraph)) { - const link = token.tokens[0] as Tokens.Link; - grouped.push({ type: 'code:linked', token: link }); - } else { - pushMarkdownToken(token); - } - break; - default: - pushMarkdownToken(token); - } - - i += 1; - } - - // Consider moving the package.json group to the first or second element if it exists. 
- return grouped; -} - -function validateTokenGroups(grouped: GroupedTokensType[]) { - const errors: string[] = []; - - const firstGroupIsTitle = grouped[0].type === 'title'; - const hasOnlyOneTitle = grouped.filter((group) => group.type === 'title').length === 1; - const invalidTitle = !(firstGroupIsTitle && hasOnlyOneTitle); - const hasAtMostOnePackageJson = - grouped.filter((group) => group.type === 'filename' && group.token.text === 'package.json') - .length <= 1; - - if (invalidTitle) { - errors.push('Document must contain exactly one h1 heading'); - } - - if (!hasAtMostOnePackageJson) { - errors.push('Document must contain at most one package.json'); - } - - let i = 0; - const len = grouped.length; - - while (i < len) { - const group = grouped[i]; - - if (group.type === 'filename') { - if (!['code', 'code:linked'].includes(grouped[i + 1].type)) { - const raw = group.token.raw.trimEnd(); - errors.push(`h6 is reserved for code cells, but no code block followed '${raw}'`); - } else { - i += 1; - } - } - - i += 1; - } - - return errors; -} - -function convertToCells(groups: GroupedTokensType[]): CellType[] { - const len = groups.length; - const cells: CellType[] = []; - - let i = 0; - - while (i < len) { - const group = groups[i]; - - if (group.type === 'title') { - cells.push(convertTitle(group.token)); - } else if (group.type === 'markdown') { - const hasNonSpaceTokens = group.tokens.some((token) => token.type !== 'space'); - // This shouldn't happen under most conditions, but if the file was edited or created manually, then there - // could be cases where there is excess whitespace, causing markdown blocks that were not intentional. Thus, - // we only create markdown cells when the markdown contains more than just space tokens. 
- if (hasNonSpaceTokens) { - cells.push(convertMarkdown(group.tokens)); - } - } else if (group.type === 'filename') { - i += 1; - switch (groups[i].type) { - case 'code': { - const codeToken = (groups[i] as CodeGroupType).token; - const filename = group.token.text; - const cell = - filename === 'package.json' - ? convertPackageJson(codeToken) - : convertCode(codeToken, filename); - cells.push(cell); - break; - } - case 'code:linked': { - const linkToken = (groups[i] as LinkedCodeGroupType).token; - const cell = convertLinkedCode(linkToken); - cells.push(cell); - break; - } - default: - throw new Error('Unexpected token type after a heading 6.'); - } - } - - i += 1; - } - - return cells; -} - -function convertTitle(token: Tokens.Heading): TitleCellType { - return { - id: randomid(), - type: 'title', - text: token.text, - }; -} - -function convertPackageJson(token: Tokens.Code): PackageJsonCellType { - return { - id: randomid(), - type: 'package.json', - source: token.text, - filename: 'package.json', - status: 'idle', - }; -} - -function convertCode(token: Tokens.Code, filename: string): CodeCellType { - return { - id: randomid(), - type: 'code', - source: token.text, - language: languageFromFilename(filename), - filename: filename, - status: 'idle', - }; -} - -// Convert a linked code token to the right cell: either a package.json file or a code cell. -// We assume that the link is in the format [filename](filePath). -// We don't populate the source field here, as we will read the file contents later. -function convertLinkedCode(token: Tokens.Link): CodeCellType | PackageJsonCellType { - return token.text === 'package.json' - ? 
{ - id: randomid(), - type: 'package.json', - source: '', - filename: 'package.json', - status: 'idle', - } - : { - id: randomid(), - type: 'code', - source: '', - language: languageFromFilename(token.text), - filename: token.text, - status: 'idle', - }; -} - -function convertMarkdown(tokens: Token[]): MarkdownCellType { - return { - id: randomid(), - type: 'markdown', - text: serializeMarkdownTokens(tokens), - }; -} - -function serializeMarkdownTokens(tokens: Token[]) { - return tokens - .map((token) => { - const md = token.raw; - return token.type === 'code' ? md : md.replace(/\n{3,}/g, '\n\n'); - }) - .join(''); -} diff --git a/packages/api/srcmd/decoding.mts b/packages/api/srcmd/decoding.mts new file mode 100644 index 00000000..3789b032 --- /dev/null +++ b/packages/api/srcmd/decoding.mts @@ -0,0 +1,353 @@ +import { marked } from 'marked'; +import type { Tokens, Token, TokensList } from 'marked'; +import { SrcbookMetadataSchema, randomid } from '@srcbook/shared'; +import type { + CellType, + CodeCellType, + MarkdownCellType, + PackageJsonCellType, + TitleCellType, +} from '@srcbook/shared'; +import { languageFromFilename } from '@srcbook/shared'; +import type { DecodeResult } from './types.mjs'; + +marked.use({ gfm: true }); + +export function decode(contents: string): DecodeResult { + // First, decode the markdown text into tokens. + const tokens = marked.lexer(contents); + + // Second, pluck out srcbook metadata (ie ): + const { metadata, tokens: filteredTokens } = getSrcbookMetadata(tokens); + + // Third, group tokens by their function: + // + // 1. title + // 2. markdown + // 3. filename + // 4. code + // + const groups = groupTokens(filteredTokens); + + // Fourth, validate the token groups and return a list of errors. + // Example errors might be: + // + // 1. The document contains no title + // 2. There is a filename (h6) with no corresponding code block + // 3. There is more than one package.json defined + // 4. etc. 
+  //
+  const errors = validateTokenGroups(groups);
+
+  // Finally, return either the set of errors or the tokens converted to cells if no errors were found.
+  return errors.length > 0
+    ? { error: true, errors: errors }
+    : { error: false, metadata, cells: convertToCells(groups) };
+}
+
+const SRCBOOK_METADATA_RE = /^<!--\s*srcbook:(.+)\s*-->$/;
+const DETAILS_OPEN_RE = /<details[^>]*>/;
+const DETAILS_CLOSE_RE = /<\/details>/;
+const SUMMARY_RE = /<summary>(.+)<\/summary>/;
+
+function getSrcbookMetadata(tokens: TokensList) {
+  let match: RegExpMatchArray | null = null;
+  let srcbookMetdataToken: Token | null = null;
+
+  for (const token of tokens) {
+    if (token.type !== 'html') {
+      continue;
+    }
+
+    match = token.raw.trim().match(SRCBOOK_METADATA_RE);
+
+    if (match) {
+      srcbookMetdataToken = token;
+      break;
+    }
+  }
+
+  if (!match) {
+    throw new Error('Srcbook does not contain required metadata');
+  }
+
+  try {
+    const metadata = JSON.parse(match[1]);
+    return {
+      metadata: SrcbookMetadataSchema.parse(metadata),
+      tokens: tokens.filter((t) => t !== srcbookMetdataToken),
+    };
+  } catch (e) {
+    throw new Error(`Unable to parse srcbook metadata: ${(e as Error).message}`);
+  }
+}
+
+type TitleGroupType = {
+  type: 'title';
+  token: Tokens.Heading;
+};
+
+type FilenameGroupType = {
+  type: 'filename';
+  value: string;
+};
+
+type CodeGroupType = {
+  type: 'code';
+  token: Tokens.Code;
+};
+
+type ExternalCodeGroupType = {
+  type: 'code:external';
+  token: Tokens.Link;
+};
+
+type MarkdownGroupType = {
+  type: 'markdown';
+  tokens: Token[];
+};
+
+type GroupedTokensType =
+  | TitleGroupType
+  | FilenameGroupType
+  | CodeGroupType
+  | MarkdownGroupType
+  | ExternalCodeGroupType;
+
+/**
+ * Group tokens into an intermediate representation.
+ */ +export function groupTokens(tokens: Token[]) { + const grouped: GroupedTokensType[] = []; + + function pushMarkdownToken(token: Token) { + const group = grouped[grouped.length - 1]; + if (group && group.type === 'markdown') { + group.tokens.push(token); + } else { + grouped.push({ type: 'markdown', tokens: [token] }); + } + } + + let i = 0; + const len = tokens.length; + + while (i < len) { + const token = tokens[i]; + + switch (token.type) { + case 'heading': + if (token.depth === 1) { + grouped.push({ type: 'title', token: token as Tokens.Heading }); + } else { + pushMarkdownToken(token); + } + i += 1; + break; + case 'html': + if (DETAILS_OPEN_RE.test(token.raw)) { + i = parseDetails(tokens, i, grouped); + } else { + pushMarkdownToken(token); + i += 1; + } + break; + default: + pushMarkdownToken(token); + i += 1; + } + } + + // Consider moving the package.json group to the first or second element if it exists. + return grouped; +} + +function parseDetails(tokens: Token[], i: number, grouped: GroupedTokensType[]) { + const token = tokens[i]; + + if (token.type !== 'html') { + throw new Error('Expected token to be of type html'); + } + + const match = token.raw.match(SUMMARY_RE); + + // TODO: Skip and treat as user markdown if no summary is found? + if (!match) { + throw new Error('Expected
<details> HTML to contain a <summary> tag');
+  }
+
+  grouped.push({ type: 'filename', value: match[1] });
+
+  i += 1;
+
+  i = advancePastWhitespace(tokens, i);
+
+  const nextToken = tokens[i];
+
+  if (nextToken.type === 'paragraph') {
+    const link = (nextToken.tokens ?? []).find((t) => t.type === 'link') as Tokens.Link;
+    grouped.push({ type: 'code:external', token: link });
+  } else if (nextToken.type === 'code') {
+    const code = nextToken as Tokens.Code;
+    grouped.push({ type: 'code', token: code });
+  }
+
+  i += 1;
+
+  i = advancePastWhitespace(tokens, i);
+
+  const closingTag = tokens[i];
+
+  if (closingTag.type !== 'html' || !DETAILS_CLOSE_RE.test(closingTag.raw)) {
+    throw new Error('Expected closing </details>
tag'); + } + + i += 1; + + return i; +} + +function advancePastWhitespace(tokens: Token[], i: number) { + while (i < tokens.length && tokens[i].type === 'space') { + i += 1; + } + return i; +} + +function validateTokenGroups(grouped: GroupedTokensType[]) { + const errors: string[] = []; + + const firstGroupIsTitle = grouped[0].type === 'title'; + const hasOnlyOneTitle = grouped.filter((group) => group.type === 'title').length === 1; + const invalidTitle = !(firstGroupIsTitle && hasOnlyOneTitle); + const hasOnePackageJson = + grouped.filter((g) => g.type === 'filename' && g.value === 'package.json').length === 1; + + if (invalidTitle) { + errors.push('Document must contain exactly one h1 heading'); + } + + if (!hasOnePackageJson) { + errors.push('Document must contain exactly one package.json'); + } + + return errors; +} + +function convertToCells(groups: GroupedTokensType[]): CellType[] { + const len = groups.length; + const cells: CellType[] = []; + + let i = 0; + + while (i < len) { + const group = groups[i]; + + if (group.type === 'title') { + cells.push(convertTitle(group.token)); + } else if (group.type === 'markdown') { + const hasNonSpaceTokens = group.tokens.some((token) => token.type !== 'space'); + // This shouldn't happen under most conditions, but if the file was edited or created manually, then there + // could be cases where there is excess whitespace, causing markdown blocks that were not intentional. Thus, + // we only create markdown cells when the markdown contains more than just space tokens. + if (hasNonSpaceTokens) { + cells.push(convertMarkdown(group.tokens)); + } + } else if (group.type === 'filename') { + i += 1; + switch (groups[i].type) { + case 'code': { + const codeToken = (groups[i] as CodeGroupType).token; + const filename = group.value; + const cell = + filename === 'package.json' + ? 
convertPackageJson(codeToken) + : convertCode(codeToken, filename); + cells.push(cell); + break; + } + case 'code:external': { + const linkToken = (groups[i] as ExternalCodeGroupType).token; + const cell = convertLinkedCode(linkToken); + cells.push(cell); + break; + } + default: + throw new Error('Unexpected token type after a heading 6.'); + } + } + + i += 1; + } + + return cells; +} + +function convertTitle(token: Tokens.Heading): TitleCellType { + return { + id: randomid(), + type: 'title', + text: token.text, + }; +} + +function convertPackageJson(token: Tokens.Code): PackageJsonCellType { + return { + id: randomid(), + type: 'package.json', + source: token.text, + filename: 'package.json', + status: 'idle', + }; +} + +function convertCode(token: Tokens.Code, filename: string): CodeCellType { + return { + id: randomid(), + type: 'code', + source: token.text, + language: languageFromFilename(filename), + filename: filename, + status: 'idle', + }; +} + +// Convert a linked code token to the right cell: either a package.json file or a code cell. +// We assume that the link is in the format [filename](filePath). +// We don't populate the source field here, as we will read the file contents later. +function convertLinkedCode(token: Tokens.Link): CodeCellType | PackageJsonCellType { + return token.text === 'package.json' + ? { + id: randomid(), + type: 'package.json', + source: '', + filename: 'package.json', + status: 'idle', + } + : { + id: randomid(), + type: 'code', + source: '', + language: languageFromFilename(token.text), + filename: token.text, + status: 'idle', + }; +} + +function convertMarkdown(tokens: Token[]): MarkdownCellType { + return { + id: randomid(), + type: 'markdown', + text: serializeMarkdownTokens(tokens), + }; +} + +function serializeMarkdownTokens(tokens: Token[]) { + return tokens + .map((token) => { + const md = token.raw; + return token.type === 'code' ? 
md : md.replace(/\n{3,}/g, '\n\n'); + }) + .join('') + .trim(); +} diff --git a/packages/api/srcmd/encoding.mts b/packages/api/srcmd/encoding.mts new file mode 100644 index 00000000..f3c0873d --- /dev/null +++ b/packages/api/srcmd/encoding.mts @@ -0,0 +1,109 @@ +import { marked } from 'marked'; +import type { + CellType, + CodeCellType, + MarkdownCellType, + PackageJsonCellType, + TitleCellType, + SrcbookMetadataType, +} from '@srcbook/shared'; + +marked.use({ gfm: true }); + +export function encode( + allCells: CellType[], + metadata: SrcbookMetadataType, + options: { inline: boolean }, +) { + const [firstCell, secondCell, ...remainingCells] = allCells; + const titleCell = firstCell as TitleCellType; + const packageJsonCell = secondCell as PackageJsonCellType; + const cells = remainingCells as (MarkdownCellType | CodeCellType)[]; + + const encoded = [ + ``, + encodeTitleCell(titleCell), + encodePackageJsonCell(packageJsonCell, options), + ...cells.map((cell) => { + switch (cell.type) { + case 'markdown': + return encodeMarkdownCell(cell); + case 'code': + return encodeCodeCell(cell, options); + } + }), + ]; + + // End every file with exactly one newline. + return encoded.join('\n\n').trimEnd() + '\n'; +} + +function encodeTitleCell(cell: TitleCellType) { + return `# ${cell.text}`; +} + +function encodeMarkdownCell(cell: MarkdownCellType) { + return cell.text.trim(); +} + +function encodePackageJsonCell(cell: PackageJsonCellType, options: { inline: boolean }) { + return options.inline + ? encodeCollapsibleFileInline({ + open: false, + filename: 'package.json', + language: 'json', + source: cell.source, + }) + : encodeCollapsibleFileExternal({ + open: false, + filename: 'package.json', + filepath: './package.json', + }); +} + +function encodeCodeCell(cell: CodeCellType, options: { inline: boolean }) { + return options.inline + ? 
encodeCollapsibleFileInline({ + open: true, + filename: cell.filename, + language: cell.language, + source: cell.source, + }) + : encodeCollapsibleFileExternal({ + open: true, + filename: cell.filename, + filepath: `./src/${cell.filename}`, + }); +} + +function encodeCollapsibleFileInline(options: { + open: boolean; + source: string; + filename: string; + language: string; +}) { + // Markdown code block containing the file's source. + const detailsBody = `\`\`\`${options.language}\n${options.source}\n\`\`\``; + return encodeCollapsibleFile(detailsBody, options.filename, options.open); +} + +function encodeCollapsibleFileExternal(options: { + open: boolean; + filename: string; + filepath: string; +}) { + // Markdown link linking to external file. + const detailsBody = `[${options.filename}](${options.filepath})`; + return encodeCollapsibleFile(detailsBody, options.filename, options.open); +} + +function encodeCollapsibleFile(fileContents: string, filename: string, open: boolean) { + // The HTML
<details> element is rendered as a collapsible section in GitHub UI.
+  //
+  // - https://gist.github.com/scmx/eca72d44afee0113ceb0349dd54a84a2
+  // - https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/organizing-information-with-collapsed-sections
+  //
+  // The <summary> element is the header of the collapsible section, which we use to display the filename.
+  // The <details> element is collapsed by default, but can be expanded by adding the 'open' attribute.
+  return `<details${open ? ' open' : ''}>\n  <summary>${filename}</summary>\n\n${fileContents}\n</details>
`; +} diff --git a/packages/api/srcmd/types.mts b/packages/api/srcmd/types.mts new file mode 100644 index 00000000..c6564a91 --- /dev/null +++ b/packages/api/srcmd/types.mts @@ -0,0 +1,14 @@ +import { type CellType, type SrcbookMetadataType } from '@srcbook/shared'; + +export type DecodeErrorResult = { + error: true; + errors: string[]; +}; + +export type DecodeSuccessResult = { + error: false; + cells: CellType[]; + metadata: SrcbookMetadataType; +}; + +export type DecodeResult = DecodeErrorResult | DecodeSuccessResult; diff --git a/packages/api/test/srcmd.test.mts b/packages/api/test/srcmd.test.mts index 835357b2..5dcf7228 100644 --- a/packages/api/test/srcmd.test.mts +++ b/packages/api/test/srcmd.test.mts @@ -1,48 +1,47 @@ import Path from 'path'; import { getRelativeFileContents } from './utils.mjs'; import { decode, encode, decodeDir } from '../srcmd.mjs'; -import type { DecodeErrorResult, DecodeSuccessResult } from '../srcmd.mjs'; +import type { DecodeErrorResult, DecodeSuccessResult } from '../srcmd/types.mjs'; describe('encoding and decoding srcmd files', () => { let srcmd: string; const languagePrefix = '\n\n'; beforeAll(async () => { - srcmd = await getRelativeFileContents('srcmd_files/notebook.srcmd'); + srcmd = await getRelativeFileContents('srcmd_files/srcbook.srcmd'); }); it('is an error when there is no title', () => { const result = decode( - languagePrefix + '## Heading 2\n\nFollowed by a paragraph', + languagePrefix + + '## Heading 2\n\n
<details>\n  <summary>package.json</summary>\n\n```json\n{}\n```\n</details>
\n\nFollowed by a paragraph', ) as DecodeErrorResult; expect(result.error).toBe(true); expect(result.errors).toEqual(['Document must contain exactly one h1 heading']); }); - it('is an error when there are multiple titles', () => { + it('is an error when there is no package.json', () => { const result = decode( - languagePrefix + '# Heading 1\n\nFollowed by a paragraph\n\n# Followed by another heading 1', + languagePrefix + '# Title\n\nFollowed by a paragraph', ) as DecodeErrorResult; expect(result.error).toBe(true); - expect(result.errors).toEqual(['Document must contain exactly one h1 heading']); + expect(result.errors).toEqual(['Document must contain exactly one package.json']); }); - it('is an error when there is a heading 6 without a corresponding code block', () => { + it('is an error when there are multiple titles', () => { const result = decode( languagePrefix + - '# Heading 1\n\n###### supposed_to_be_a_filename.mjs\n\nBut no code is found.', + '# Heading 1\n\n
<details>\n  <summary>package.json</summary>\n\n```json\n{}\n```\n</details>
\n\nFollowed by a paragraph\n\n# Followed by another heading 1', ) as DecodeErrorResult; expect(result.error).toBe(true); - expect(result.errors).toEqual([ - "h6 is reserved for code cells, but no code block followed '###### supposed_to_be_a_filename.mjs'", - ]); + expect(result.errors).toEqual(['Document must contain exactly one h1 heading']); }); it('can decode a well-formed file', () => { const result = decode(srcmd) as DecodeSuccessResult; expect(result.error).toBe(false); expect(result.cells).toEqual([ - { id: expect.any(String), type: 'title', text: 'Notebook title' }, + { id: expect.any(String), type: 'title', text: 'Srcbook title' }, { id: expect.any(String), type: 'package.json', @@ -53,7 +52,7 @@ describe('encoding and decoding srcmd files', () => { { id: expect.any(String), type: 'markdown', - text: `\n\nOpening paragraph here.\n\n## Section h2\n\nAnother paragraph.\n\nFollowed by:\n\n1. An\n2. Ordered\n3. List\n\n`, + text: `Opening paragraph here.\n\n## Section h2\n\nAnother paragraph.\n\nFollowed by:\n\n1. An\n2. Ordered\n3. List`, }, { id: expect.any(String), @@ -66,7 +65,7 @@ describe('encoding and decoding srcmd files', () => { { id: expect.any(String), type: 'markdown', - text: '\n\n## Another section\n\nDescription goes here. `inline code` works.\n\n```javascript\n// This will render as markdown, not a code cell.\nfoo() + bar()\n```\n\n', + text: '## Another section\n\nDescription goes here. 
`inline code` works.\n\n```javascript\n// This will render as markdown, not a code cell.\nfoo() + bar()\n```', }, { id: expect.any(String), @@ -79,7 +78,7 @@ describe('encoding and decoding srcmd files', () => { { id: expect.any(String), type: 'markdown', - text: '\n\nParagraph here.\n', + text: 'Paragraph here.', }, ]); }); @@ -93,11 +92,11 @@ describe('encoding and decoding srcmd files', () => { describe('it can decode from directories', () => { it('can decode a simple directory with README, package, and one file', async () => { - const dirPath = Path.resolve(__dirname, 'srcmd_files/mock_notebook_dir/'); + const dirPath = Path.resolve(__dirname, 'srcmd_files/srcbook_dir/'); const result = (await decodeDir(dirPath)) as DecodeSuccessResult; expect(result.error).toBe(false); expect(result.cells).toEqual([ - { id: expect.any(String), type: 'title', text: 'Notebook' }, + { id: expect.any(String), type: 'title', text: 'Srcbook' }, { id: expect.any(String), type: 'package.json', @@ -108,7 +107,7 @@ describe('it can decode from directories', () => { { id: expect.any(String), type: 'markdown', - text: '\n\nWith some words right behind it.\n\n## Markdown cell\n\nWith some **bold** text and some _italic_ text.\n\n> And a quote, why the f\\*\\*\\* not!\n\n', + text: 'With some words right behind it.\n\n## Markdown cell\n\nWith some **bold** text and some _italic_ text.\n\n> And a quote, why the f\\*\\*\\* not!', }, { id: expect.any(String), @@ -121,7 +120,7 @@ describe('it can decode from directories', () => { { id: expect.any(String), type: 'markdown', - text: '\n\n```json\n{ "simple": "codeblock" }\n```\n', + text: '```json\n{ "simple": "codeblock" }\n```', }, ]); }); diff --git a/packages/api/test/srcmd_files/notebook.srcmd b/packages/api/test/srcmd_files/srcbook.srcmd similarity index 74% rename from packages/api/test/srcmd_files/notebook.srcmd rename to packages/api/test/srcmd_files/srcbook.srcmd index f461eda7..a36fa589 100644 --- 
a/packages/api/test/srcmd_files/notebook.srcmd +++ b/packages/api/test/srcmd_files/srcbook.srcmd @@ -1,14 +1,16 @@ -# Notebook title +# Srcbook title -###### package.json +
<details> + <summary>package.json</summary> ```json { "dependencies": {} } ``` +</details>
Opening paragraph here. @@ -22,12 +24,14 @@ Followed by: 2. Ordered 3. List -###### index.mjs +
<details> + <summary>index.mjs</summary> ```javascript // A code snippet here. export function add(a, b) { return a + b } ``` +</details>
## Another section @@ -38,12 +42,14 @@ Description goes here. `inline code` works. foo() + bar() ``` -###### foo.mjs +
<details> + <summary>foo.mjs</summary> ```javascript import {add} from './index.mjs'; const res = add(2, 3); console.log(res); ``` +</details>
Paragraph here. diff --git a/packages/api/test/srcmd_files/mock_notebook_dir/README.md b/packages/api/test/srcmd_files/srcbook_dir/README.md similarity index 70% rename from packages/api/test/srcmd_files/mock_notebook_dir/README.md rename to packages/api/test/srcmd_files/srcbook_dir/README.md index b20ac1f7..6bc4f8ce 100644 --- a/packages/api/test/srcmd_files/mock_notebook_dir/README.md +++ b/packages/api/test/srcmd_files/srcbook_dir/README.md @@ -1,11 +1,14 @@ -# Notebook +# Srcbook -###### package.json +
<details> + <summary>package.json</summary> [package.json](./package.json) +</details>
+ With some words right behind it. ## Markdown cell @@ -14,10 +17,13 @@ With some **bold** text and some _italic_ text. > And a quote, why the f\*\*\* not! -###### foo.mjs +
<details> + <summary>foo.mjs</summary> [foo.mjs](./src/foo.mjs) +</details>
+ ```json { "simple": "codeblock" } ``` diff --git a/packages/api/test/srcmd_files/mock_notebook_dir/package.json b/packages/api/test/srcmd_files/srcbook_dir/package.json similarity index 100% rename from packages/api/test/srcmd_files/mock_notebook_dir/package.json rename to packages/api/test/srcmd_files/srcbook_dir/package.json diff --git a/packages/api/test/srcmd_files/mock_notebook_dir/src/foo.mjs b/packages/api/test/srcmd_files/srcbook_dir/src/foo.mjs similarity index 100% rename from packages/api/test/srcmd_files/mock_notebook_dir/src/foo.mjs rename to packages/api/test/srcmd_files/srcbook_dir/src/foo.mjs
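
Note on the encoded shape: the comments in `encodeCollapsibleFile` describe wrapping each file in a collapsible section whose summary carries the filename. The following is a minimal standalone sketch of that shape, not the code from this patch. It assumes the `open` flag maps to a bare `open` attribute on the opening tag and that the summary line is indented by two spaces; `readSummary` is a hypothetical helper added only to illustrate reading the filename back, and is not the decoder in `srcmd/decoding.mts`.

```typescript
// Sketch only: mirrors the signature of encodeCollapsibleFile from the patch.
// Assumption: `open` becomes a bare `open` attribute on the opening tag.
function encodeCollapsibleFile(fileContents: string, filename: string, open: boolean): string {
  const openAttr = open ? ' open' : '';
  return `<details${openAttr}>\n  <summary>${filename}</summary>\n\n${fileContents}\n</details>`;
}

// Hypothetical helper (not part of the patch): recover the filename from an
// encoded block by reading the text of its <summary> element.
function readSummary(block: string): string | null {
  const match = /<summary>([^<]*)<\/summary>/.exec(block);
  return match ? match[1].trim() : null;
}

// Build the markdown fence programmatically to keep this example readable.
const fence = '`'.repeat(3);

// Inline variant: the body is a fenced code block holding the file's source.
const encoded = encodeCollapsibleFile(`${fence}json\n{}\n${fence}`, 'package.json', false);

// External variant: the body is a markdown link to the file on disk.
const external = encodeCollapsibleFile('[foo.mjs](./src/foo.mjs)', 'foo.mjs', true);

console.log(encoded);
console.log(readSummary(external));
```

Under these assumptions the inline variant produces the same string the updated tests pass to `decode`, and the external variant matches the README form used by `decodeDir`.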