A reasoning engine infers logical consequences from a set of fixed axioms and observations. However, before it can make an inference, it must compile the axioms and observations which are given in a predefined format. Any attempt to test the correctness of a reasoning engine assumes that it compiles inputs correctly, but that may not be the case. In this work, we implement a mutated grammar fuzzer to automatically generate tests for the compilation stage of Assumption-based Truth Maintenance System (ATMS), a reasoning engine for model-based diagnosis. We also implement a recognizer as an oracle and automatically evaluate the correctness of compiler output. We automatically generate, execute, and evaluate more than a million tests in two weeks. We show that while tests generated from the true grammar of ATMS find no faults, tests generated from mutated grammars uncover an important fault in the compiler. We also show that mutated grammars achieve higher code coverage with fewer tests and the original grammar cannot cover any code that is not covered by mutated grammars. To the best of our knowledge, ours is the first work that provides a practical implementation and evaluation of a mutated grammar fuzzer. We make the implementation available online along with small examples, tests generated for this paper, and steps to reproduce our experiments.