[Comb] Crash in AndOp folder #8024
Seems like the AndOp canonicalizer is creating a zero-operand operation.
//===-------------------------------------------===//
Processing operation : 'comb.and'(0x4b3c980) {
%4 = "comb.and"(%18, %2) <{twoState}> : (i1, i1) -> i1
* Pattern : 'comb.and -> ()' {
Trying to match ""
** Insert : 'comb.and'(0x4b528c0)
** Replace : 'comb.and'(0x4b3c980)
** Modified: 'hw.output'(0x4b22000)
** Modified: 'hw.output'(0x4b22000)
** Erase : 'comb.and'(0x4b3c980)
"" result 1
} -> success : pattern applied successfully
// *** IR Dump After Pattern Application ***
'comb.and' op expected 1 or more operands, but found 0
mlir-asm-printer: 'hw.module' failed to verify and will be printed in generic form
"hw.module"() <{module_type = !hw.modty<input clk : i1, input rst : i1, input sel : !hw.typealias<@pycde::@MMIOSel, !hw.struct<write: i1, upper: i1>>, input sel_valid : i1, input data : i64, input data_valid : i1, input read_data_ready : i1, output sel_ready : i1, output data_ready : i1, output read_data : !hw.typealias<@pycde::@AXI_Lite_Read_Resp, !hw.struct<data: i32, resp: i2>>, output read_data_valid : i1>, parameters = [], result_locs = [loc("a.mlir":7:226), loc("a.mlir":7:246), loc("a.mlir":7:267), loc("a.mlir":7:360)], sym_name = "MMIOAxiReadWriteDemux"}> ({
^bb0(%arg0: i1, %arg1: i1, %arg2: !hw.typealias<@pycde::@MMIOSel, !hw.struct<write: i1, upper: i1>>, %arg3: i1, %arg4: i64, %arg5: i1, %arg6: i1):
%0 = "hw.constant"() <{value = true}> : () -> i1
%1 = "hw.constant"() <{value = 0 : i2}> : () -> i2
%2 = "comb.and"(%arg5, %arg3) <{twoState}> : (i1, i1) -> i1
%3 = "hw.struct_create"(%arg4, %arg2) : (i64, !hw.typealias<@pycde::@MMIOSel, !hw.struct<write: i1, upper: i1>>) -> !hw.struct<a: i64, b: !hw.typealias<@pycde::@MMIOSel, !hw.struct<write: i1, upper: i1>>>
%4 = "comb.and"() <{twoState}> : () -> i1
%5 = "hw.struct_extract"(%3) <{fieldIndex = 0 : i32}> : (!hw.struct<a: i64, b: !hw.typealias<@pycde::@MMIOSel, !hw.struct<write: i1, upper: i1>>>) -> i64
%6 = "hw.struct_extract"(%3) <{fieldIndex = 1 : i32}> : (!hw.struct<a: i64, b: !hw.typealias<@pycde::@MMIOSel, !hw.struct<write: i1, upper: i1>>>) -> !hw.typealias<@pycde::@MMIOSel, !hw.struct<write: i1, upper: i1>>
%7 = "hw.struct_extract"(%6) <{fieldIndex = 1 : i32}> : (!hw.typealias<@pycde::@MMIOSel, !hw.struct<write: i1, upper: i1>>) -> i1
%8 = "comb.extract"(%5) <{lowBit = 0 : i32}> : (i64) -> i32
%9 = "comb.extract"(%5) <{lowBit = 32 : i32}> : (i64) -> i32
%10 = "comb.mux"(%7, %9, %8) <{twoState}> {sv.namehint = "mux_None_in0_in1"} : (i1, i32, i32) -> i32
%11 = "hw.struct_create"(%10, %1) : (i32, i2) -> !hw.typealias<@pycde::@AXI_Lite_Read_Resp, !hw.struct<data: i32, resp: i2>>
%12 = "hw.struct_extract"(%6) <{fieldIndex = 0 : i32}> : (!hw.typealias<@pycde::@MMIOSel, !hw.struct<write: i1, upper: i1>>) -> i1
%13 = "comb.xor"(%12, %0) <{twoState}> : (i1, i1) -> i1
%14 = "comb.and"(%2, %13) <{twoState}> : (i1, i1) -> i1
%15 = "comb.and"(%14, %18) <{twoState}> : (i1, i1) -> i1
%16 = "comb.and"(%2, %12) <{twoState}> : (i1, i1) -> i1
%17 = "comb.and"(%16, %18) <{twoState}> : (i1, i1) -> i1
%18 = "comb.and"(%arg6, %17) <{twoState}> : (i1, i1) -> i1
"hw.output"(%4, %4, %11, %15) : (i1, i1, !hw.typealias<@pycde::@AXI_Lite_Read_Resp, !hw.struct<data: i32, resp: i2>>, i1) -> ()
}) {output_file = #hw.output_file<"MMIOAxiReadWriteDemux.sv", includeReplicatedOps>} : () -> () |
I was able to further reduce it. It seems to be a cycle issue. |
I think the problem is in |
I think I understand why |
Taking a look, thanks for the ping and reproducer. This looked familiar, I've seen the zero-operands IR from fuzzing. Anyway, while poking at this here's a related issue that we process indefinitely and never terminate (same hw.module @MMIOAxiReadWriteDemux() {
%0 = comb.and bin %0 : i1
} EDIT: The above example is in the folder, taking the |
My simple suggestion that I'm still thinking through is to just check if the computed uniqueInputs is empty and do nothing in that case. IIRC a number of things in our compiler fall over in the presence of these sorts of cycles, FWIW. A small variation that hw.module @MMIOAxiReadWriteDemux(in %data_valid : i1, out sel_ready : i1) {
%0 = comb.and bin %2, %data_valid : i1
%2 = comb.and bin %2 : i1
hw.output %0 : i1
} -> (and similarly: hw.module @MMIOAxiReadWriteDemux(in %x: i1, in %y: i1, out z : i1) {
%0 = comb.and bin %1, %x: i1
%1 = comb.and bin %1, %y: i1
hw.output %0 : i1
} Here, canonicalizeIdempotentInputs turns I'm trying to determine what this IR even means, and whether it might be invalid (and if so what might we most reasonably do about it that isn't crashing/looping!). Thoughts? |
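To make the "empty uniqueInputs" suggestion concrete, here is a minimal toy sketch in Python. It is not CIRCT's implementation: the dictionary-based IR and the names `unique_inputs` and `canonicalize` are hypothetical, and it assumes every op in the table is the same idempotent kind (e.g. `comb.and`). It only illustrates how collecting distinct leaf operands through a self-referencing op can come back empty, and what the proposed guard would do about it.

```python
# Toy model (an assumption for illustration, not CIRCT code):
# each op name maps to its operand names. Names that are not keys
# of the table are treated as module inputs (leaves).

def unique_inputs(ops, root):
    """Collect the distinct leaf operands of `root`, looking through nested ops."""
    seen_ops = {root}
    leaves = []
    worklist = list(ops[root])
    while worklist:
        operand = worklist.pop(0)
        if operand in ops:
            if operand not in seen_ops:
                seen_ops.add(operand)
                worklist.extend(ops[operand])
            # An op we already looked through (a duplicate or a cycle)
            # contributes nothing new.
        elif operand not in leaves:
            leaves.append(operand)
    return leaves


def canonicalize(ops, root):
    """Return a new operand list for `root`, or None to leave the op untouched."""
    leaves = unique_inputs(ops, root)
    if not leaves:
        return None  # proposed guard: never build a zero-operand op
    if leaves == ops[root]:
        return None  # already canonical
    return leaves


# The two-op variation from this thread: %2 only refers to itself.
ops = {"%0": ["%2", "%data_valid"], "%2": ["%2"]}
print(unique_inputs(ops, "%2"))   # no leaves at all
print(canonicalize(ops, "%2"))    # guard kicks in, op is left alone
print(canonicalize(ops, "%0"))    # the cyclic operand is silently dropped
```

In this toy model the guard avoids the zero-operand op for `%2`, but `%0` still loses its cyclic operand without any diagnostic, so the guard alone would not cover an op where only one of several inputs is part of a cycle.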
I'm not convinced that would cover all of the potential bugs. I could see a case where one of multiple inputs is part of a cycle.
This particular case was a bug in user code (PyCDE). I fixed it there, which avoids this particular crash.
Cycles in combinational graphs should be totally legal IMO. It should be possible to build ring oscillators in CIRCT! Now it's totally acceptable to just give up on any and all optimizations if a cycle is present. Whether or not frontends allow this behavior is another question.
In addition to my desire to have this be legal, this is totally unintuitive behavior especially in a canonicalizer. Since this was a bug in client code, I would at least like CIRCT to crap out with a decent error message. In nearly all cases, a comb cycle represents a bug, so we shouldn't be silently dropping it. I was thinking about writing a comb cycle analysis pass (I couldn't find one in the comb dialect) for PyCDE to optionally run to detect these sorts of bugs. |
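A rough sketch of what such a cycle check could look like, written in Python over a toy def-use table rather than real MLIR. Everything here is an assumption for illustration: the `ops` dictionary, the function names `find_comb_cycle` and `check_no_comb_cycles`, and the error text are hypothetical, and a real pass would walk `comb` ops inside an `hw.module` instead.

```python
def find_comb_cycle(ops):
    """Return one cycle as a list of op names (first == last), or None.

    `ops` maps each op name to the names of its operands. Operands that
    are not keys of `ops` (module inputs, constants) are ignored.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {name: WHITE for name in ops}
    for start in ops:
        if color[start] != WHITE:
            continue
        path = [start]
        iters = [iter(ops[start])]
        color[start] = GREY
        while iters:
            for operand in iters[-1]:
                if operand not in ops:
                    continue
                if color[operand] == GREY:
                    # Back edge: the cycle is the tail of the current path.
                    i = path.index(operand)
                    return path[i:] + [operand]
                if color[operand] == WHITE:
                    color[operand] = GREY
                    path.append(operand)
                    iters.append(iter(ops[operand]))
                    break
            else:
                # All operands explored: this op is not on a cycle via this path.
                color[path.pop()] = BLACK
                iters.pop()
    return None


def check_no_comb_cycles(ops):
    """Raise with a readable message instead of silently rewriting the cycle."""
    cycle = find_comb_cycle(ops)
    if cycle is not None:
        raise ValueError("combinational cycle: " + " -> ".join(cycle))


# The and-ops from the failing module above (operands only).
ops = {
    "%14": ["%2", "%13"],
    "%15": ["%14", "%18"],
    "%16": ["%2", "%12"],
    "%17": ["%16", "%18"],
    "%18": ["%arg6", "%17"],
}
print(find_comb_cycle(ops))
```

The path is reported in "op uses operand" direction, so for the reproducer it points at the `%18`/`%17` pair, which is the kind of message a frontend could surface to the user.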
Yep, I believe that's the
Looks like our folders, flattening, canonicalizeIdempotentInputs (probably others, I haven't looked) all are various sorts of wrong in the presence of combinatorial cycles for many comb operations -- ranging from dropping cyclic portions (above), crashing/invalid IR (this issue but many related across other comb ops), infinite loops, or dropping gates in the cycle as part of canonicalization which probably isn't desired (?) (see below, not gate cycle snippet).
I think something pervasive and perhaps fundamental in how our code (especially in canonicalization, probably in the export path as well) views the IR is at odds with what's desired when generating IR containing combinatorial cycles intentionally -- each gate is significant, vs the behavior it encodes functionally/"mathematically" (viewed as an expression, a perspective from which cycles are at least weird 😄).
Not gate cycle:
hw.module @foo(out y : i1) {
%one = hw.constant 1 : i1
%0 = comb.xor %10, %one : i1
%1 = comb.xor %0, %one : i1
%2 = comb.xor %1, %one : i1
%3 = comb.xor %2, %one : i1
%4 = comb.xor %3, %one : i1
%5 = comb.xor %4, %one : i1
%6 = comb.xor %5, %one : i1
%7 = comb.xor %6, %one : i1
%8 = comb.xor %7, %one : i1
%9 = comb.xor %8, %one : i1
%10 = comb.xor %9, %one : i1
hw.output %10 : i1
}
Which CIRCT canonicalizes down to a single not, since that appears "equivalent", but is it? (Change to an even number of gates for a canonicalizer infinite loop, FWIW.)
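One way to poke at the "is it equivalent?" question is a small Python experiment. This is only a sketch under an assumed synchronous unit-delay model, which is not something CIRCT defines; `stable_assignments` and `ring_period` are hypothetical helper names. It compares a ring of n inverters viewed as an expression (does any stable assignment exist?) with the same ring viewed as gates with delay (how long until the state repeats?).

```python
from itertools import product


def stable_assignments(n):
    """All assignments where every inverter output equals NOT of its predecessor."""
    return [
        bits
        for bits in product((0, 1), repeat=n)
        if all(bits[i] == 1 - bits[i - 1] for i in range(n))
    ]


def ring_period(n):
    """Steps until an n-inverter ring repeats, under a unit-delay model.

    Assumption: every gate updates once per step from its predecessor's
    previous output. The start state alternates 0, 1, 0, ... so that for
    odd n only gate 0 disagrees with its input.
    """
    start = tuple(i % 2 for i in range(n))
    state = start
    steps = 0
    while True:
        state = tuple(1 - state[i - 1] for i in range(n))
        steps += 1
        if state == start:
            return steps


for n in (1, 11):
    print(n, len(stable_assignments(n)), ring_period(n))
```

Under these assumptions both the 11-gate ring and the single not have no stable assignment, so they agree as expressions, but their periods differ in the delay model. That is the sense in which each gate can matter when a cycle is intentional.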
Thanks, that makes sense to me! Don't love disallowing things if we can avoid it 👍 . Disabling optimizations (for lack of a better plan at least in the short term) and carefully emitting IR containing combinatorial cycles seems like the way to go. Doing this automatically and selectively in our pipeline is an interesting question. What do various tools/flows do with this sort of thing? Are these put in modules / blocks that are specially annotated with directives, that sort of thing? (How is this modeled / conceptualized / managed elsewhere in other tools, languages, so on?)
I agree. And of course it shouldn't crash either!
A comb cycle analysis pass sounds great. I wonder how to craft this to be reasonable in presence of various complicated constructs or interactions with other (unknown?) dialects, precision re:bit-sensitivity, so on? Sounds like a great addition in whatever form! If our folders and canonicalizers need to know which operations are part of cycles, IMO having a pass detect these (and mark or error?) is a reasonable way to go (vs folders/canonicalizers needing to check this themselves, everywhere/repeatedly). And just to re-iterate: many folders/canonicalizers break in some form with combinatorial cycles (ignoring how to encode when preserving the gate count/structure is important). Let's get consensus on how we want to handle this and then "someone" can wade through and look for and address these appropriately (or file issues). WDYT? |
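As a sketch of the "detect once, then let rewrites skip marked ops" idea, the following Python uses Tarjan's strongly connected components algorithm on a toy operand table. The representation and the names `ops_in_cycles` and `rewritable_ops` are hypothetical, not an existing CIRCT API; a real analysis would also need to decide which ops count as combinational.

```python
def ops_in_cycles(ops):
    """Return the set of op names that lie on at least one cycle (Tarjan SCC)."""
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    result = set()
    counter = 0

    def visit(v):
        nonlocal counter
        index[v] = lowlink[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in ops[v]:
            if w not in ops:
                continue  # module input or other non-op value
            if w not in index:
                visit(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:
            # v is the root of a component: pop it off the stack.
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            # Cyclic if it has several ops, or a single op that uses itself.
            if len(component) > 1 or v in ops[v]:
                result.update(component)

    for v in ops:
        if v not in index:
            visit(v)
    return result


def rewritable_ops(ops):
    """Ops a cycle-unaware folder or canonicalizer could still touch."""
    cyclic = ops_in_cycles(ops)
    return [name for name in ops if name not in cyclic]


# The two-op example from earlier in the thread: %1 feeds itself.
ops = {"%0": ["%1", "%x"], "%1": ["%1", "%y"]}
print(ops_in_cycles(ops))
print(rewritable_ops(ops))
```

Note that `%0` is not marked here even though it reads from a cyclic op, so whether "uses a value from a cycle" should also block rewrites is a separate design choice this sketch leaves open.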
I think I agree with all of that. But I think the root of the problem is that we're misusing canonicalizers to do optimizations. Canonicalizers should be used to clean up IR to enable optimizations to run better. Anything which could break in the presence of cycles should be in an optional pass. This is a long-running issue which needs to be fixed.
I was just gonna do cycles in the comb dialect ops. I think this would cover 95% of cases I care about. But we could introduce a "combinational" trait to enable dialects to mark ops as combinational. In particular, some ops in the |
In a237db5:
Will try to reduce this test case further.