A tool to incrementally improve the corpus when riffling through #670

Open
jkomoros opened this issue Oct 6, 2023 · 1 comment

jkomoros commented Oct 6, 2023

One of the key things for a personal knowledge graph is encouraging you to riffle through it regularly. Zettelkasten does this naturally, because you riffle through the cards to figure out where to file a new one.

Riffling helps you remember and be inspired by old thoughts, and it also lets you incrementally improve the corpus ("oh, actually this should have this concept tag added").

Card Web doesn't currently reward riffling through the card base very often. In very large card bases, it can even be a little stressful, because you can see all of the duplication and the things that should be interlinked but aren't.

There should be a mode or screen where spending a few minutes in it will help you riffle through random cards and also clean up your card web just a bit, with little, easy-to-action and not-a-big-deal-if-they're-wrong suggestions

Tasks that could make sense in such a mode:

  • Add missing concept tags (e.g. ones that were added after the card was created)
  • Suggest concepts to create that are missing but are used multiple times in the corpus
  • Correct possible misspellings
  • Find duplicate-ish cards and have the user connect a dupe-of from the less-good version to the better version
  • Find similar cards and suggest adding an explicit see-also
  • Find duplicate-ish cards that are both marked as prioritized so the less-good one can be un-prioritized (this can happen from my workflow of put in cards, then distill out bits and bobs, then later put in the bits and bobs)
  • Mark cards that have lots of card-rank and have lots of similar cards to them as prioritized if they aren't.
  • “Particle collide” two cards together to suggest a new card (using LLMs)
  • Summarize very similar cards into a canonical form
  • Between two very similar cards, suggest that only one be prioritized and that the other be marked dupe-of
  • Presumably many others

One way to do this is to randomly pick a key card (perhaps with some ranking criteria like last-seen or card-rank) and then run a lot of task-suggestors on it. Another approach is to run task-suggestors to find the most relevant/important tasks and key around those.

In any case we'll want some kind of generalized system of recipes that can suggest possible edits (often in the form of "add a see-also link from abc3 to cdfe3"), some of which might get quite complex and even rely on AI. For some of the more complex/expensive ones you could even imagine running them as an offline job and proactively suggesting them. The UI would need to be able to show each of these kinds of tasks with enough information for the user to action them (e.g. possibly showing other similar cards to give the user some context).

Maybe each action has things like:

  • A key card, typically the one the edit actually happens to (this can be blank if the diff is 'create a card'). Otherwise this is a non-empty array, so there can be multiple primaries (might not need this complication yet)
  • A possibly empty list of supporting cards. For example, cards you are proposing linking to, or are proposing that the key card is a duplicate of.
  • A possibly empty list of context cards. These might be "cards that are similar to any of the key or supporting cards" or "other cards that could use that concept if you created it"
  • A list of one or more cardDiffs that propose the actions to take if the action is approved.
  • A list of one or more cardDiffs that propose the actions to take as an alternate (e.g. a seeAlso)
  • A possibly empty list of cardDiffs to apply if the action is actively rejected (e.g. add an ack reference from the card to the other instead of a see-also). (What would the rejection action be for "nah, don't create a concept about 'Prisoner's Dilemma'", or "nah, don't mark this card as prioritized"?)
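
A rough sketch of what that action shape could look like in TypeScript. All of the names here (`SuggestedAction`, `TargetedDiff`, the simplified `CardDiff`, the reference type strings) are placeholders for illustration, not the actual Card Web types.

```typescript
// Hypothetical shapes for illustration; the real Card Web types may differ.
type CardID = string;

// Simplified stand-in for a diff against a single card.
interface CardDiff {
  title?: string;
  body?: string;
  // References to add, keyed by reference type, e.g. {'see-also': ['cdfe3']}.
  addReferences?: Record<string, CardID[]>;
}

// A diff paired with the card it applies to.
// `card` is omitted when the diff is "create a card".
interface TargetedDiff {
  card?: CardID;
  diff: CardDiff;
}

interface SuggestedAction {
  keyCards: CardID[];             // the cards the edit actually happens to
  supportingCards: CardID[];      // e.g. the card being linked to
  contextCards: CardID[];         // extra cards shown to give the user context
  primaryDiffs: TargetedDiff[];   // applied if the action is approved
  alternateDiffs: TargetedDiff[]; // applied if the user picks the alternate
  rejectionDiffs: TargetedDiff[]; // applied if the action is actively rejected
}

// Example: propose that abc3 is a dupe-of cdfe3, with a see-also as the
// alternate and an ack reference on rejection.
const example: SuggestedAction = {
  keyCards: ['abc3'],
  supportingCards: ['cdfe3'],
  contextCards: [],
  primaryDiffs: [{card: 'abc3', diff: {addReferences: {'dupe-of': ['cdfe3']}}}],
  alternateDiffs: [{card: 'abc3', diff: {addReferences: {'see-also': ['cdfe3']}}}],
  rejectionDiffs: [{card: 'abc3', diff: {addReferences: {'ack': ['cdfe3']}}}]
};
```

Keeping the target card next to each diff means a single action can touch several cards with different diffs.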

Each action type has:

  • The action generator function. This is run as an async, general purpose function that produces an action or null.
  • An action invalidator. Given an action of this type, it checks whether the action is no longer valid. For example, the user may have already added a see-also between that card and the other.
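
Sketching the generator/invalidator pair for one action type might look roughly like this. `ActionType`, `SeeAlsoAction`, and the in-memory `Card` shape are assumptions for the example, and the generator uses a trivial "first unlinked card" rule as a stand-in for a real similarity lookup.

```typescript
type CardID = string;

// Minimal hypothetical card shape for this sketch.
interface Card {
  id: CardID;
  seeAlso: CardID[];
}

interface SeeAlsoAction {
  type: 'add-see-also';
  keyCard: CardID;
  supportingCard: CardID;
}

// Each action type pairs an async generator with an invalidator.
interface ActionType<A> {
  generate: (keyCard: Card, cards: Record<CardID, Card>) => Promise<A | null>;
  isInvalid: (action: A, cards: Record<CardID, Card>) => boolean;
}

const addSeeAlso: ActionType<SeeAlsoAction> = {
  async generate(keyCard, cards) {
    // Stand-in for a similarity lookup: propose the first other card
    // that the key card does not already link to.
    const candidate = Object.keys(cards)
      .map(id => cards[id])
      .find(c => c.id !== keyCard.id && keyCard.seeAlso.indexOf(c.id) < 0);
    if (!candidate) return null;
    return {type: 'add-see-also', keyCard: keyCard.id, supportingCard: candidate.id};
  },
  isInvalid(action, cards) {
    const key = cards[action.keyCard];
    const other = cards[action.supportingCard];
    // Invalid if either card is gone, or the link already exists.
    if (!key || !other) return true;
    return key.seeAlso.indexOf(other.id) >= 0;
  }
};
```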

Generating these tweaks might be a somewhat expensive process, so while the mode is active it should probably be chewing away at suggesting new things to add to the end of the suggestion queue so once you accept one there's another one immediately (although there's the unlikely event that one is invalidated based on the approval/rejection of an earlier one)
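
One possible shape for that queue, as a sketch: the background filler keeps appending, and `next()` silently drops anything that was invalidated by an earlier approval or rejection. `SuggestionQueue`, `QueuedSuggestion`, and `fillQueue` are hypothetical names.

```typescript
interface QueuedSuggestion {
  id: string;
  // Re-checked at dequeue time, since an earlier accept/reject may have
  // invalidated this suggestion while it sat in the queue.
  isStillValid: () => boolean;
}

class SuggestionQueue {
  private items: QueuedSuggestion[] = [];

  get length(): number {
    return this.items.length;
  }

  enqueue(suggestion: QueuedSuggestion): void {
    this.items.push(suggestion);
  }

  // Returns the next still-valid suggestion, discarding stale ones on the way.
  next(): QueuedSuggestion | null {
    while (this.items.length > 0) {
      const candidate = this.items.shift()!;
      if (candidate.isStillValid()) return candidate;
    }
    return null;
  }
}

// Keeps generating until the queue holds `target` items or the generator
// reports that it has nothing more to offer (by returning null).
async function fillQueue(
  queue: SuggestionQueue,
  generate: () => Promise<QueuedSuggestion | null>,
  target: number
): Promise<void> {
  while (queue.length < target) {
    const suggestion = await generate();
    if (!suggestion) return;
    queue.enqueue(suggestion);
  }
}
```

Calling `fillQueue` again after each accept or reject would keep the queue topped up while the mode is active.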

Take a key card (sampled by some ranking distribution) and suggest a few actions. Then throw them into the ranking pot. Suggest the highest ranked action as next riffle. (Passive mode)
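
A minimal sketch of passive mode, assuming each card already has a non-negative sampling weight (e.g. derived from last-seen or card-rank) and each action a score. `RankedCard`, `RankedAction`, `sampleKeyCard`, and `pickNextRiffle` are made-up names.

```typescript
interface RankedCard {
  id: string;
  weight: number; // assumed non-negative; higher means more likely to be sampled
}

interface RankedAction {
  description: string;
  score: number;
}

// Weighted random sampling. `random` is injectable so the behavior can be
// made deterministic in tests.
function sampleKeyCard(
  cards: RankedCard[],
  random: () => number = Math.random
): RankedCard | null {
  const total = cards.reduce((sum, c) => sum + c.weight, 0);
  if (total <= 0) return null;
  let threshold = random() * total;
  for (const card of cards) {
    threshold -= card.weight;
    if (threshold < 0) return card;
  }
  // Guards against floating point drift on the last item.
  return cards[cards.length - 1];
}

// The "ranking pot": the highest-scored action is the next riffle.
function pickNextRiffle(pot: RankedAction[]): RankedAction | null {
  if (pot.length === 0) return null;
  return pot.reduce((best, a) => (a.score > best.score ? a : best));
}
```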

And then also a mode of “show me all actions for this current key card” (active mode)

Suggestions should be able to be modified, perhaps significantly, so they run the gamut from "just accept this thing" to "here's a thought starter".

The UI should be a card-editor that only shows the controls in the current suggestion, but where the user can call up the full controls if they want.

Conceptually related issues include #507 and #280 and #651

Verify: For things like see-also, if one card adds it to another card (so it's inbound for another card) does it also show up in the "see also" section?

One of the suggestors should look for see-also/dupe cards. Look for cards that have high similarity score and also were created within a few weeks of each other. The one that is primary vs dupe is based on a weighted function of: 1) which one is prioritized vs not, 2) which one is longer vs not, 3) which one is more recent, 4) which one an LLM judges as more polished / complete
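
The weighted primary-vs-dupe decision could be sketched like this. The weights, the similarity threshold, and the "few weeks" window are arbitrary placeholder values, and `DupeCandidate` is a hypothetical shape; the LLM judgment is modeled as an optional precomputed score.

```typescript
interface DupeCandidate {
  id: string;
  prioritized: boolean;
  bodyLength: number;
  created: number;    // ms since epoch
  updated: number;    // ms since epoch
  llmPolish?: number; // optional 0..1 score from an LLM judge
}

// Placeholder values, chosen only for illustration.
const WEIGHTS = {prioritized: 3, longer: 1, moreRecent: 1, llmPolish: 2};
const SIMILARITY_THRESHOLD = 0.9;
const MAX_CREATED_GAP_MS = 3 * 7 * 24 * 60 * 60 * 1000; // three weeks

// A pair qualifies if it is highly similar and created close together.
function isDupeCandidatePair(similarity: number, a: DupeCandidate, b: DupeCandidate): boolean {
  return similarity >= SIMILARITY_THRESHOLD &&
    Math.abs(a.created - b.created) <= MAX_CREATED_GAP_MS;
}

// Sums the weights of every criterion on which `card` beats `other`.
function primaryScore(card: DupeCandidate, other: DupeCandidate): number {
  let score = 0;
  if (card.prioritized && !other.prioritized) score += WEIGHTS.prioritized;
  if (card.bodyLength > other.bodyLength) score += WEIGHTS.longer;
  if (card.updated > other.updated) score += WEIGHTS.moreRecent;
  if (card.llmPolish !== undefined && other.llmPolish !== undefined &&
      card.llmPolish > other.llmPolish) {
    score += WEIGHTS.llmPolish;
  }
  return score;
}

// Ties go to the first argument.
function choosePrimary(a: DupeCandidate, b: DupeCandidate): {primary: DupeCandidate; dupe: DupeCandidate} {
  return primaryScore(a, b) >= primaryScore(b, a)
    ? {primary: a, dupe: b}
    : {primary: b, dupe: a};
}
```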

Another one looks at cliques of see-also cards and then summarizes all of them into a new distilled card automatically.

What should the UI look like? Its primary purpose should be to riffle through a random card for spaced repetition style use cases. But it should also have the proposed actions on the side to be visible.

When you're viewing a proposed modification, you'll need a new UI anyway. It would have the primary card(s) at the top in a mostly-normal-size stage, then the secondary cards below that in a horizontal card-drawer, and then the context cards below that (or taking over the collection in the left rail?). It would also have an open-editor button that opens the full normal editor with that card-diff loaded up... and when you're done, you would come back to the riffle screen.

The default UI would be just a section at the bottom of the info panel (below comments?). But there should also be a way to navigate to next proposed tweak, and stay in the tweak viewer. Maybe by default it goes into a "Save and next", but there's also an exit.

When an action is accepted or discarded, we should keep track of that in the card update. modifyCardWithBatch should take an updateInfo which includes {substantive: bool, actionCreator: 'manual' | ActionCreatorType, actionModified:boolean}
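
As a sketch, the `updateInfo` shape described above might be typed like this. The members of `ActionCreatorType` and the helper `updateInfoForSuggestion` are hypothetical; only the three field names come from the description.

```typescript
// Hypothetical members; the real list would have one entry per suggestor.
type ActionCreatorType = 'add-see-also' | 'mark-dupe-of';

interface UpdateInfo {
  substantive: boolean;
  actionCreator: 'manual' | ActionCreatorType;
  actionModified: boolean;
}

// An ordinary hand edit, with no suggestion involved.
const MANUAL_EDIT: UpdateInfo = {
  substantive: true,
  actionCreator: 'manual',
  actionModified: false
};

// Builds the updateInfo for an accepted suggestion, recording whether the
// user changed the proposed diff before applying it.
function updateInfoForSuggestion(
  creator: ActionCreatorType,
  proposedDiff: object,
  appliedDiff: object,
  substantive: boolean = true
): UpdateInfo {
  return {
    substantive,
    actionCreator: creator,
    // Naive structural comparison: sensitive to key order, which is
    // acceptable for a sketch but not for production use.
    actionModified: JSON.stringify(proposedDiff) !== JSON.stringify(appliedDiff)
  };
}
```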

There needs to be a way for one of the action creators ("tweak creators"?) to ask for similar cards, and specify that it wants the real similarity from the server, not tfidf based.

Add a tweaks configurable filter, which includes cards that have tweaks proposed and ranks them by how high quality they are. Using that filter kicks off a process to suggest tweaks, and will keep on adding cards + tweaks to it. Each time you accept or reject a tweak it's removed from the set (which might remove the whole card if no more are left) and then the "next" action goes to the first card in the list. (This could get annoying if you've ignored a few tweaks; you want to be able to come back to them but also don't want to only see them).

Only generate tweaks if the tweaks filter is active and the user has general edit cards ability, or the user has editing ability on the current card, when it's suggesting tweaks for this card.
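
One reading of that rule, as a sketch: generate when the tweaks filter is active and the user can edit cards in general, or when suggesting for the current card and the user can edit that card. The `TweakContext` fields are hypothetical names, and the grouping of the conditions is an interpretation.

```typescript
interface TweakContext {
  tweaksFilterActive: boolean;
  userMayEditCardsInGeneral: boolean;
  userMayEditCurrentCard: boolean;
  suggestingForCurrentCard: boolean;
}

function shouldGenerateTweaks(ctx: TweakContext): boolean {
  // Case 1: the tweaks filter is on and the user can edit cards generally.
  const viaFilter = ctx.tweaksFilterActive && ctx.userMayEditCardsInGeneral;
  // Case 2: tweaks are being suggested for the current card specifically,
  // and the user can edit that card.
  const viaCurrentCard = ctx.suggestingForCurrentCard && ctx.userMayEditCurrentCard;
  return viaFilter || viaCurrentCard;
}
```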

jkomoros added a commit that referenced this issue Nov 10, 2023
Now there's fetchSimilarCardsIfEnabled that returns a boolean if you should expect a result. And a failed
fetch returns a sentinel {}, which the similar cards pipeline sees and ignores as an empty result.

This will allow us to implement a waitForFullFieldity pipeline.

part of #646. Part of #670.
jkomoros added a commit that referenced this issue Nov 10, 2023
…ject.

We're about to add a new partial result type, and it would get confusing to add yet another optional
positional argument.

Part of #646. Part of #670.
jkomoros added a commit that referenced this issue Nov 10, 2023
…ilarity results in similarity.

Part of #646. Part of #670.
jkomoros added a commit that referenced this issue Nov 10, 2023
…er in the collection returned preview:true.

Part of #646. Part of #670.
jkomoros added a commit that referenced this issue Nov 10, 2023
It takes a collection description and keeps on trying to generate a collection until preview:false.

Part of #646. Part of #670.
jkomoros added a commit that referenced this issue Nov 10, 2023
This requires changing the "cards that link here" to skip see-also inbound links, since they're already
shown above.

This makes see-also be a reciprocal type which is conceptually correct... but also means that it can be a
bit confusing, if you want to remove the connection it's hard to tell if you're looking at the primary or
reciprocal card... so you might hit edit and then be surprised to not see a see-also reference to remove.

Part of #670.

jkomoros commented Nov 10, 2023

  • Store suggestions
  • Add a selectSuggestionForActiveCard that returns the current suggestions for the card, fetching them if they don't exist.
  • Is there a missing reading-list/start/view when viewing a single orphaned card?
  • see-also suggestions go way overboard for person / work cards (or really any card type that has few items) (this appears to be the infinite loop problem)
  • When a card with mined_from exists for synthesize-cluster (with all of the same items) don't offer to do it again (it's already been done!)
  • Don't suggest see-also for cards that are already dupe-of (this may just be an artifact I noticed after adding a dupe-of, and immediately after the see-also wasn't cleared)
  • A suggestor to convert markdown to markup
  • Try using claude for synthesis (better matches tones)
  • Seriously, there should be a way to edit the proposed actions before they are executed. (Or cmd-click to open up the keyCard for editing but don't yet apply it)
  • Make it so SuggestionDiff.newCard can be a card or an array of cards to create one by one
  • There's an infinite loop in dupe-of or see-also for cards that are synthesizeCluster (maybe doing it twice in a row?) Seeing it just by visiting https://thecompendium.cards/c/everything/unpublished/working-notes/sort/recent/c-481-bbb619 and viewing suggestions (is this an outage related issue somehow?)
  • Plug in claude for synthesis / style keeping (experiment manually first to see if it's worth wiring in)
  • A suggestor for cards that should be quote cards (tracked in #465, "A type of card for quotes")
  • A suggestor to bold all-caps words. For multiple words in a row, it's a straightforward suggestion. For single words, it's less clear. (What if there were a way to store some state, like known-acronyms, for suggestions?)
  • Fetch ahead suggestions so when you advance in the collection they often don't have to load.
  • Make it so when updateCards is called, any suggestions that include any of those cards in keyCard or supportingCards are culled.
  • Don't suggest a see-also pointing to a card that is a dupe-of another card (automatically update to point to the dupe-of card? but you might get duplicates in the suggestion list...)
  • A weird case currently: a given card that should have a see-also already has a reference to the other card, so it's suppressed. Now the other card is edited, so there isn't a see-also... but the no-suggestion is cached, even though it no longer applies. The only way to see the suggestion is to refresh the webapp.
  • Hide flags from diff (or have a way to say which fields are affirmatively not important to show) (if doing that, make sure one that is only setting flags has a message).
  • Handle prioritized todo setting differently in cardDiff message (the meaning of specific values is idiosyncratic and weird)
  • For dupe-of take into account which one is more recent for which one should be primary.
  • For synthesize together, the prompt doesn't do much more than just literally concatenate. Maybe just concatenate together and not include an LLM?
  • Add a verbose mode for just certain suggestors.
  • A suggestion that narrows down to cards created in the same week as the other cards that have a lower similarity score threshold to be marked as see-also. (This already kind of happens due to embeddings having the date)
  • Remove invalid suggestions when a card is newly activated
  • Ask LLM for which card is better
  • Keyboard shortcuts to accept primary/secondary/rejection
  • A way to edit a suggested createCard, and then on save, to apply the rest of the action (this gets pretty funky pretty quickly)
  • When a suggestion is in the process of being applied, gray out everything.
  • Update tagInfos to have better titles
  • Render reject action
  • Allow keyboard navigation between suggestions for selected card (and if runs off end, go to next card with suggestions)
  • Use llm to decide which of two linked cards are better and which one should be marked the dupe of
  • A way to suggest making a new card in a diff
  • Render out keyCard and supportingCards
  • Render out the diffs
  • A suggestions-summary that takes the suggestions for the current card, and if the user can edit the card, shows a summary if anything exists.
  • Create a suggestions.ts
  • Create a Suggestion type
  • modifyCards should take a map of Record<CardID, CardDiff> and not just be the same diff on each card.
  • Add a way to cull suggestions that are no longer valid
  • configure the first suggestion
  • Show the list of suggestions in console
  • Show the list of suggestions for a card in the card-info-panel, if the user can edit
  • Persist a list of suggestions (removing items that no longer fit every so often)
  • Add an option for how intensely the suggestors should work
  • Allow suggestions on the editing card
  • Remove DISABLE_SUGGESTIONS guard
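
For the "cull suggestions when updateCards is called" item in the list above, a minimal sketch could look like the following. `StoredSuggestion` and `cullSuggestionsForUpdatedCards` are hypothetical names, and suggestions are assumed to carry their key and supporting card IDs.

```typescript
type CardID = string;

interface StoredSuggestion {
  id: string;
  keyCards: CardID[];
  supportingCards: CardID[];
}

// Drops every suggestion that mentions an updated card in keyCards or
// supportingCards, since the update may have made it stale.
function cullSuggestionsForUpdatedCards(
  suggestions: StoredSuggestion[],
  updatedCards: CardID[]
): StoredSuggestion[] {
  const touched = new Set<CardID>(updatedCards);
  return suggestions.filter(suggestion => {
    const involved = suggestion.keyCards.concat(suggestion.supportingCards);
    return !involved.some(id => touched.has(id));
  });
}
```

Culling is conservative here: it throws away suggestions that might still be valid, on the assumption that they can be regenerated. That would also clear a cached "no suggestion" result if empty results were stored the same way.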

jkomoros added a commit that referenced this issue Nov 11, 2023
There will likely be a lot of suggestors, and they'll likely need all kinds of things piped in, and it
will be easier to update it all if it's all just an args.

Part of #670.
jkomoros added a commit that referenced this issue Nov 12, 2023
Error message is only provided if success is false.

Part of #646. Part of #670.
jkomoros added a commit that referenced this issue Nov 12, 2023
If provided, then error of code `stale-embedding` will be returned if the embedding is older than the
card.updated.

This will allow the client to determine that the embedding isn't yet updated, and try again. This will
happen for example right after a card is saved, before it is re-embedded and stored.

Part of #670. Part of #646.
jkomoros added a commit that referenced this issue Nov 12, 2023
The similarCards.last_updated flow requires cards to have their point updated every time the card.updated
timestamp updates. That updates more often than the embedding changes.

This makes it so the last_updated filter should work.

Part of #646. Part of #670.
jkomoros added a commit that referenced this issue Nov 12, 2023
…oint's last_updated.

This meant we erroneously thought every embedding was stale.

Part of #646. Part of #670.
jkomoros added a commit that referenced this issue Nov 12, 2023
…s if the embedding is stale.

This means that right after a card is updated, before the embedding has been updated, it will get the new
similarity soon after it's available, no more than DELAY_FOR_STALE milliseconds after it's available.

Part of #646. Part of #670.
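
A sketch of the client-side retry those commit messages describe, under assumptions: the `SimilarResult` shape, the retry cap, and the value of `DELAY_FOR_STALE` are invented for illustration. Only the `stale-embedding` error code and the constant's name come from the messages above.

```typescript
type SimilarResult =
  | {ok: true; cards: string[]}
  | {ok: false; code: 'stale-embedding' | 'other'};

// Hypothetical value, in milliseconds.
const DELAY_FOR_STALE = 10;

// Retries while the server reports that the embedding is older than the
// card, waiting DELAY_FOR_STALE between attempts. `sleep` is injectable so
// tests do not have to wait on real timers.
async function fetchWithStaleRetry(
  fetchSimilar: () => Promise<SimilarResult>,
  maxAttempts: number = 5,
  sleep: (ms: number) => Promise<void> =
    ms => new Promise<void>(resolve => setTimeout(resolve, ms))
): Promise<string[] | null> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const result = await fetchSimilar();
    if (result.ok) return result.cards;
    // Any error other than a stale embedding is not worth retrying.
    if (result.code !== 'stale-embedding') return null;
    await sleep(DELAY_FOR_STALE);
  }
  return null;
}
```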
jkomoros added a commit that referenced this issue Dec 3, 2023
This starts working for multi line quotes.

Part of #465. Part of #670.
jkomoros added a commit that referenced this issue Dec 3, 2023
Part of #465. Part of #670.

This will allow passing overrides in createCard.
jkomoros added a commit that referenced this issue Dec 3, 2023
If provided, it controls vs CARD_TYPE_CONFIG.autoSlug.

Part of #465. Part of #670.
jkomoros added a commit that referenced this issue Dec 3, 2023
Now that it's possible to create multiple new cards, it's important to be able to distinguish.

Part of #670. Part of #465.
jkomoros added a commit that referenced this issue Dec 3, 2023
Before we just generated 5 and hoped that was good enough.

Part of #465. Part of #670.
jkomoros added a commit that referenced this issue Dec 4, 2023
Split it into two passes. This will make it a bit easier to figure out whether to create a work card or a person card.

Part of #670. Part of #465.
jkomoros added a commit that referenced this issue Dec 4, 2023
Doesn't do anything yet if it finds it.

Part of #670. Part of #465.
jkomoros added a commit that referenced this issue Dec 4, 2023
The logic of precisely when to do this is kind of finicky and honestly I'm sure I messed it up.

Part of #670. Part of #465.
jkomoros added a commit that referenced this issue Dec 4, 2023
…king-notes.

Cards with very few cards, like quote, person, work, will get very aggressive seeAlso to other cards of the same type since card_type is in the embedding content.

Part of #670.
jkomoros added a commit that referenced this issue Dec 4, 2023
The cardFinisher is not run when applying a card_type change.

Part of #465. Part of #670.
jkomoros added a commit that referenced this issue Dec 17, 2023
If there are semantic changes to be made by converting markdown, it suggests them.

Part of #670.