Parse an LLM's code block and create a git-style diff against your own code block. LLM output often abbreviates unchanged code, which is why this diff supports WILDCARDS too!
Improved diff: this tool produces a CHARACTER-based or a LINE-based diff, depending on the amount and percentage of modification.
NOTE: LLMs can create even better diffs when they use wildcards. So all in all, I suggest creating the extended version of the file from the LLM's diff and then running this script to get very nice diffs.
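To make the character-versus-line decision concrete, here is a minimal sketch of one way such a choice could be made. It is not DiffLib's implementation: the function name choose_granularity, the keyword max_changed_ratio, and the prefix/suffix heuristic are all assumptions made for illustration.

```julia
# Hypothetical helper, NOT part of DiffLib's API: pick a diff granularity
# from the fraction of a line that changed.
function choose_granularity(old::AbstractString, new::AbstractString;
                            max_changed_ratio::Float64 = 0.5)
    a, b = collect(old), collect(new)
    n = min(length(a), length(b))

    # Length of the common prefix.
    prefix = 0
    while prefix < n && a[prefix + 1] == b[prefix + 1]
        prefix += 1
    end

    # Length of the common suffix, not overlapping the prefix.
    suffix = 0
    while suffix < n - prefix && a[end - suffix] == b[end - suffix]
        suffix += 1
    end

    longest = max(length(a), length(b))
    longest == 0 && return :char   # two empty lines: nothing to replace

    # Everything outside the shared prefix and suffix counts as changed.
    changed_ratio = 1 - (prefix + suffix) / longest
    return changed_ratio <= max_changed_ratio ? :char : :line
end

small_edit = choose_granularity("const x = 1;", "const x = 2;")
rewrite    = choose_granularity("foo()", "return bar;")
```

A small edit keeps most of the line intact, so a character-based diff is easier to read; once most of the line differs, showing the whole line as replaced is usually clearer.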
using Pkg
Pkg.add(url="https://github.com/Cvikli/DiffLib.jl")
julia -e "using DiffLib; run_cli()" test_cases/case0.js test_cases/case0_changes.js -d -w "// ..."
Or get a diff like the one git diff --word-diff produces:
julia -e "using DiffLib; run_cli()" test_cases/case0.js test_cases/case0_changes.js -w "// ..."
Or in code:
using DiffLib
# Compare two files
diff_files("test_cases/case0.js", "test_cases/case0_changes.js", "// ... ")
# Compare content strings
diff_contents(original_content, changed_content, ["WILDCARD"])
- Diff between an LLM's code block output and the original code block
- Word-based and character-based diff
- Wildcard support for flexible matching
- CLI for easy file comparison from the terminal
- Customizable output formatting by setting the threshold for using a character- or line-based diff
- LLMs can generate abbreviations, and they can also be instructed to generate them for faster output:
// ... existing code ...
// ... existing imports ...
// ... rest of the component ...
// ... rest of the component remains the same
// ... rest of the existing styles ...
// ... rest of the existing code ...
// ... (rest of the code remains unchanged)
// ... other styled components remain the same
// ... (previous code remains unchanged)
// ... imports remain the same
// ... rest of the component (remove any font-size: 20px declarations) ...
// ... (keep other code unchanged)
// ... (keep other styled components and imports unchanged)
// ... existing JSX ...
// ... existing useEffect and functions ...
// ... (keep existing state variables)
// ... (keep existing values)
// ... (keep existing code)
// ... (keep existing dependencies)
// ... existing error handling ...
// ... rest of the component ...
// ... (previous dependencies)
// ... (previous code)
// ... (previous values)
// ... (rest of the file)
Parsing every one of these variants reliably sounds pretty much impossible, so I made the match prefix-based: the pattern is // ... and any line that begins with it counts as a wildcard. If only one string is defined, we use startswith(line, wildcard).
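The prefix match described above can be sketched as follows. This is an illustration only, not DiffLib's code: the helper name is_wildcard is hypothetical, and stripping leading whitespace before the check is an assumption so that indented abbreviations are also recognized.

```julia
# Hypothetical helper, NOT DiffLib's actual implementation.
# A line is a wildcard if, ignoring indentation (an assumption),
# it starts with any of the configured wildcard strings.
is_wildcard(line::AbstractString, wildcards::Vector{String}) =
    any(w -> startswith(lstrip(line), w), wildcards)

lines = ["import React from 'react';",
         "// ... existing imports ...",
         "    // ... (keep existing code)",
         "// a regular comment",
         "const x = 1;"]

# Indices of the lines that would be treated as wildcards.
wildcard_lines = [i for (i, l) in enumerate(lines) if is_wildcard(l, ["// ..."])]
```

Because only the beginning of the line is checked, every abbreviation listed above matches the single pattern // ... while an ordinary comment such as // a regular comment does not.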
- git diff often fails to find the right diff, and many other diff tools also fail on LLM output.
- Also, why don't we have a more granular diff, word-based or even character-based? Why should we scan a whole line to find the changes, right? We are humans with limited cognitive speed. :D
This project is licensed under the MIT License.
- File path handling
- File readall string handling
- ARGS handling
- Refactor to use indexes
- Type safety check
- Word based diff
- Even character based diff
- Finding the best match should keep the order to verify the match. It should probably also be whitespace-sensitive. LCS could be used here too, to check matching line by line.
- Create README.md
- Multi-wildcard handling in a type-safe manner ;)
- Modular output generation (maybe buffer-like mechanics)
- Grouping of the :equal, :insert, :deleted directives...
- Testing whether it handles consecutive diffs properly
- JS frontend for a merge tool
- Incremental diff handling... a way of handling streamed, chunked input in the changes...
- Testing testing testing...
- LCS + continuity optimization... So if it finds matches of 2, 1, 1 lines scattered in a large text, that is worse than finding 4 consecutive lines. (Btw... this should simply be found most of the time.)
- Speed measuring... in case it isn't fast enough...
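The continuity idea from the list above could be expressed as a score that rewards consecutive matches. This is only a sketch of one possible approach, not something DiffLib implements: the names run_lengths and continuity_score and the choice of summing squared run lengths are assumptions.

```julia
# Hypothetical scoring sketch; these names are NOT part of DiffLib.
# `matches` holds the line numbers (in the original file) that were matched,
# in ascending order.
function run_lengths(matches::Vector{Int})
    runs = Int[]
    isempty(matches) && return runs
    current = 1
    for i in 2:length(matches)
        if matches[i] == matches[i - 1] + 1
            current += 1          # still inside the same consecutive run
        else
            push!(runs, current)  # run ended, start a new one
            current = 1
        end
    end
    push!(runs, current)
    return runs
end

# Squaring each run length makes one long run beat several short ones,
# even when the total number of matched lines is the same.
continuity_score(matches::Vector{Int}) = sum(abs2, run_lengths(matches); init = 0)

scattered   = continuity_score([10, 11, 40, 90])   # runs of 2, 1, 1
consecutive = continuity_score([10, 11, 12, 13])   # one run of 4
```

Both candidates match four lines, but the consecutive one scores higher, which is the preference the item describes.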
Written by AI (60-80%, and this is just the beginning), using the tool AISH.jl.