Replace inline type IDs with global constants in LLVM IR #15485
Merged
+23
−4
Consider the following snippet:
We are going to compile this twice, first with a cold cache, then with `-Dbar`:

These times suggest that the addition of `Bar` completely invalidates the object cache, and indeed, if you also pass `--stats` to the second compilation, it would say `no previous .o files were reused`. How could this be the case when none of the `Foo`s depend on `Bar`?

This counterintuitive behavior arises from the way Crystal separates LLVM IR into LLVM modules. Each non-generic type or generic instance has its own LLVM module containing all of that type or instance's class and instance methods, and the rest goes to a special LLVM module called `_main`. The bitcode for `Foo0` can be disassembled back to LLVM IR using `llvm-dis`:

The line `store i32 7, ptr %1, align 4` is where `Foo0.allocate` stores `Foo0.crystal_type_id` to the newly allocated memory area; this `7` corresponds to the compile-time value of `Foo0.crystal_type_id`. If we drop `-Dbar` again, the same value becomes `6`.

The compiler component responsible for generating these type IDs is the `Crystal::LLVMId` class; it assigns numerical IDs in sequential order, with types defined later in the source code or the compiler receiving larger IDs than their sibling types. In particular, all structs have larger type IDs than every class. (You can see this information by setting the environment variable `CRYSTAL_DUMP_TYPE_ID` to 1 during compilation.) Hence, by defining `Bar` at the beginning of the file, we have incremented the type ID of every single `Foo` by 1, and because those IDs are inlined into each module, the cache breaks.

If we move `Bar` to the bottom of the file, then recompilations will be able to reuse the `Foo` object files, because their type IDs remain untouched. In practice, however, the splitting of source code into separate files renders this specific workaround nearly impossible to pull off, not to mention that other constructs like `typeof` and `is_a?` also inline type IDs, apart from `Reference.new`. In short, if your code tries to remove the `Nil` from an `Int32?`, its cache will get invalidated any time you add or remove a class.

This PR does not fight against the type ID assignment. It merely stops the inlining:
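The effect of that change can be illustrated with a toy Python model. This is a sketch under stated assumptions, not the compiler's code or the PR's diff: `assign_type_ids` is a hypothetical stand-in for sequential ID assignment, the starting ID of 6 is chosen only to mirror the values quoted above, the emitted text is merely IR-like, and the global name `@"Foo0:type_id"` is invented for illustration.

```python
import hashlib


def assign_type_ids(type_names, first_id=6):
    """Hand out sequential IDs in definition order (toy model, hypothetical)."""
    return {name: first_id + i for i, name in enumerate(type_names)}


def module_inlined(name, type_ids):
    """IR-like text for a type's module with its ID baked in as a literal."""
    return f"; module {name}\nstore i32 {type_ids[name]}, ptr %1, align 4\n"


def module_via_global(name):
    """IR-like text that loads the ID from a named global defined elsewhere.

    The symbol name is an assumption made for this sketch.
    """
    return (
        f"; module {name}\n"
        f"%2 = load i32, ptr @\"{name}:type_id\", align 4\n"
        "store i32 %2, ptr %1, align 4\n"
    )


def digest(text):
    """Stand-in for whatever the cache compares between compilations."""
    return hashlib.sha256(text.encode()).hexdigest()


without_bar = assign_type_ids(["Foo0", "Foo1"])
with_bar = assign_type_ids(["Bar", "Foo0", "Foo1"])

# Inlined literal: the module text depends on the numeric ID.
inlined_changed = (
    digest(module_inlined("Foo0", without_bar))
    != digest(module_inlined("Foo0", with_bar))
)
# Symbolic reference: the module text does not mention the number at all.
global_changed = digest(module_via_global("Foo0")) != digest(module_via_global("Foo0"))

print(without_bar["Foo0"], with_bar["Foo0"])  # 6 7
print(inlined_changed, global_changed)        # True False
```

In this model the module that refers to the ID by name is byte-for-byte identical whether or not `Bar` exists, which is the property the object cache needs.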
Global variables are required to be addressable in LLVM, so this creates an actual constant in the read-only section, hence the extra `load`. The actual compile-time value is now defined in `_main`:

With this simple trick, the object file cache is now working as intended:
As another example, we compile an empty file with the standard prelude, then add `class Foo; end; Foo.new` to it and recompile. These are the times:

For an even larger codebase, we try this modification in `src/compiler/crystal/codegen/codegen.cr` of the Crystal compiler itself:

The times are:
This will hopefully improve build times in certain scenarios, such as rapid prototyping, and IDE integrations that run the whole compiler.
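
To make the cache behavior described in this PR concrete, here is a small simulation in Python. It is a toy model only: the `build` and `reused` helpers, the module texts, and the symbol names are all hypothetical, and the counts it prints say nothing about real compile times. It assumes one module per type plus `_main`, sequential type IDs, and a cache that reuses a module only when its content is unchanged.

```python
import hashlib


def build(type_names, inline_ids, first_id=6):
    """Return {module name: content digest} for one toy compilation."""
    ids = {name: first_id + i for i, name in enumerate(type_names)}
    modules = {}
    for name in type_names:
        if inline_ids:
            # Old behavior: the numeric ID is part of the type's module.
            modules[name] = f"store i32 {ids[name]}"
        else:
            # New behavior: the module only refers to a named constant.
            modules[name] = f"store (load @{name}:type_id)"
    # _main holds everything else; without inlining it also defines the constants.
    main = "top-level code for " + ",".join(type_names)
    if not inline_ids:
        main += "".join(
            f"\n@{n}:type_id = constant i32 {ids[n]}" for n in type_names
        )
    modules["_main"] = main
    return {m: hashlib.sha256(t.encode()).hexdigest() for m, t in modules.items()}


def reused(first, second):
    """Count modules of the second build whose digest matches the first build."""
    return sum(1 for m, d in second.items() if first.get(m) == d)


foos = [f"Foo{i}" for i in range(100)]
results = {}
for inline_ids in (True, False):
    cold = build(foos, inline_ids)            # first compilation, no Bar
    warm = build(["Bar"] + foos, inline_ids)  # second compilation, Bar defined first
    results[inline_ids] = (reused(cold, warm), len(warm))
    print("inlined" if inline_ids else "global", results[inline_ids])
# inlined (0, 102)
# global (100, 102)
```

With inlined IDs nothing is reused, matching the "no previous .o files were reused" observation; with named constants only `_main` and the new `Bar` module need to be rebuilt in this model.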