A CLI tool designed to process Rust source code, creating a high-level context suitable for Large Language Models (LLMs). It eliminates non-essential information so that you can share large codebases with LLMs.
When working with LLMs on large codebases, it's crucial to balance providing enough context while staying within context window limits and optimizing for cost and performance. This tool processes Rust code to remove unnecessary implementation details while preserving the essential structure and interfaces.
- Context Window Management: By stripping down the code to its essential structure, the tool helps fit more relevant information within the LLM's context window, which is crucial for effective processing and understanding.
- Focus on Essentials: The tool preserves the module structure, type definitions, function signatures, and important comments, which are often sufficient for understanding the overall architecture and design of the project.
- Reduced Noise: Removing implementation details and test code reduces noise, allowing the LLM to focus on the high-level structure and relationships within the codebase.
- Scalability: This approach scales well with large projects, as it avoids overwhelming the LLM with unnecessary details, making it easier to handle and process large codebases.
- Incremental Sharing: The tool's approach of sharing small parts of the codebase as needed ensures that the LLM has access to detailed information when required, without overwhelming it with the entire codebase.
- Removes:
  - Test functions (#[test]) and test modules (#[cfg(test)])
  - Function bodies (with specific exceptions and when the --no-function-bodies option is used)
  - Doc comments and module-level documentation when the --no-comments option is used
  - Implementation details of derived traits
- Preserves:
  - Module structure and imports
  - Type definitions (structs, enums, traits)
  - Function signatures and interfaces
  - Non-test attributes (e.g., #[derive])
  - Doc comments and module-level documentation (unless the --no-comments option is specified)
  - Function bodies for:
    - String-like return types (String, &str, Cow<str>)
    - Result<T, E> where T is string-like
    - Option<T> where T is string-like
    - Custom Serialize trait implementations
- Special trait method annotations:
  - "/// This is a required method" for required trait methods
  - "/// There is a default implementation" for methods with default implementations
- File paths relative to the src directory with main.rs and lib.rs files, if the --single-file flag is used
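The string-like rule above can be made concrete with a small sketch. The helpers below are hypothetical: the names is_string_like, generic_args, and split_top_level are illustrative and not taken from the tool's source, and the check works on the textual form of a return type only. Accepting Cow with a lifetime argument (Cow<'a, str>) is an assumption that goes beyond the list above.

```rust
/// Hypothetical helper: does this return type (as source text) count as
/// "string-like" under the rules listed above?
fn is_string_like(ty: &str) -> bool {
    let ty = ty.trim();
    // The plain string types named in the list.
    if matches!(ty, "String" | "&str") {
        return true;
    }
    // Cow<str>; Cow<'a, str> is accepted too (assumption).
    if let Some(inner) = generic_args(ty, "Cow") {
        return split_top_level(inner).last().map(|s| s.trim()) == Some("str");
    }
    // Option<T> is string-like when T is.
    if let Some(inner) = generic_args(ty, "Option") {
        return is_string_like(inner);
    }
    // Result<T, E> is string-like when T is; E is ignored.
    if let Some(inner) = generic_args(ty, "Result") {
        return split_top_level(inner)
            .first()
            .map_or(false, |t| is_string_like(t));
    }
    false
}

/// Returns the text between `Name<` and the final `>`, if `ty` has that shape.
fn generic_args<'a>(ty: &'a str, name: &str) -> Option<&'a str> {
    ty.strip_prefix(name)?
        .trim_start()
        .strip_prefix('<')?
        .strip_suffix('>')
}

/// Splits generic arguments on commas that are not nested inside `<...>`.
fn split_top_level(args: &str) -> Vec<&str> {
    let mut parts = Vec::new();
    let (mut depth, mut start) = (0usize, 0usize);
    for (i, c) in args.char_indices() {
        match c {
            '<' => depth += 1,
            '>' => depth = depth.saturating_sub(1),
            ',' if depth == 0 => {
                parts.push(&args[start..i]);
                start = i + 1;
            }
            _ => {}
        }
    }
    parts.push(&args[start..]);
    parts
}
```

Under these assumptions, Option<String> and Result<&str, E> count as string-like, while Result<Vec<u8>, String> does not, because only the success type T is inspected.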
# Clone the repository
git clone https://github.com/yourusername/code-context.git
cd code-context
# Build the project
cargo build --release
# Basic usage
code-context <input_path>
# With options
code-context <input_path> --output-dir <suffix_for_output_dir_name> --no-comments --stats --dry-run --single-file
Options:
  -o, --output-dir <NAME>   Output directory name [default: code-context]
      --no-function-bodies  Remove function bodies (except for functions with string-like return types)
      --no-comments         Remove all comments (including doc comments)
      --no-stats            Show processing statistics
      --dry-run             Run without writing output files
      --single-file         Output all files into a single combined file
  -h, --help                Print help
  -V, --version             Print version
Generated output files can be found in the src-code-context and src-custom-suffix directories.
- The file src-code-context/code_context.rs.txt was generated by passing the path to the src directory of this repo with the --single-file and --no-function-bodies options.
- Files in the src-custom-suffix directory were generated by passing the path to the src directory with the --output-dir custom-suffix and --no-function-bodies options.
In both cases, the size reduction is 92.5% (from 76371 bytes to 5744 bytes).
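The reported figure follows directly from the two byte counts. As a quick check, here is a minimal sketch; size_reduction_percent is a hypothetical helper written for this illustration and is not part of the tool.

```rust
/// Percentage by which `after` is smaller than `before`.
/// Hypothetical helper for illustration only.
fn size_reduction_percent(before: u64, after: u64) -> f64 {
    (1.0 - after as f64 / before as f64) * 100.0
}
```

Calling size_reduction_percent(76371, 5744) gives roughly 92.48, which rounds to the 92.5% quoted above.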
Before:
/// Adds two numbers.
fn add(a: i32, b: i32) -> i32 {
    a + b
}
#[cfg(test)]
mod tests {
    use super::*;
    #[test]
    fn test_add() {
        assert_eq!(add(1, 2), 3);
    }
}
After using the tool with the --no-comments and --no-function-bodies options:
fn add(a: i32, b: i32) -> i32 {}
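To connect the feature list to more varied code, here is a hypothetical input file with comments noting which documented rule applies to each item. Every name in it (greeting, first_word, Shape, Square) is illustrative, and the comments only restate the rules from this README; they are not captured tool output.

```rust
/// Returns a non-string type, so its body is a candidate for removal
/// when --no-function-bodies is passed.
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

/// Returns String, one of the listed string-like types, so its body
/// falls under the documented exception.
pub fn greeting(name: &str) -> String {
    format!("Hello, {}!", name)
}

/// Option<T> with a string-like T is also listed as an exception.
pub fn first_word(text: &str) -> Option<&str> {
    text.split_whitespace().next()
}

pub trait Shape {
    // Required method: documented to receive the annotation
    // "/// This is a required method".
    fn area(&self) -> f64;

    // Method with a default body: documented to receive the annotation
    // "/// There is a default implementation".
    fn describe(&self) -> String {
        format!("shape with area {:.2}", self.area())
    }
}

pub struct Square(pub f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}
```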
Q: What types of files does this tool process?
A: The tool processes files with the .rs extension only. It does not process files with .toml, .json, or other extensions.
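As a sketch of what such an extension filter might look like, the function below uses only the standard library. The name is_rust_source is hypothetical, and the tool's actual file selection may differ.

```rust
use std::path::Path;

/// Returns true only for paths whose extension is exactly `rs`.
/// Hypothetical helper, not taken from the tool's source.
fn is_rust_source(path: &Path) -> bool {
    path.extension().and_then(|ext| ext.to_str()) == Some("rs")
}
```

With this check, src/main.rs is accepted, while Cargo.toml and a generated main.rs.txt are skipped, since the extension of the latter is txt.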
Q: Can I run the tool without writing output files?
A: Yes, use the --dry-run flag to run the tool without writing output files.
Q: Why do output files have the .rs.txt extension? Why not generate .rs files?
A: If the tool generated .rs files, rust-analyzer would report a lot of compilation errors for them. To avoid this, the tool generates .rs.txt files.
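The naming scheme can be sketched as appending .txt to the full input file name. The helper output_path_for is hypothetical and only illustrates the mapping; how the tool builds its output paths internally is not documented here.

```rust
use std::path::{Path, PathBuf};

/// Appends `.txt` to the whole file name, so `main.rs` becomes `main.rs.txt`.
/// Hypothetical helper, not taken from the tool's source.
fn output_path_for(input: &Path) -> PathBuf {
    let mut name = input.as_os_str().to_os_string();
    name.push(".txt");
    PathBuf::from(name)
}
```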
Contributions are welcome! Please feel free to submit a Pull Request.