Compiling for the .NET Common Language Runtime (CLR)

*      Click here for that pesky preface, and here  for a sample chapter.

*      Click here  for Prentice-Hall’s home page for the book, then search for gough.

*      More about the book 

*      Resources     

*      Errata

*      Chapter Notes

           

 

 


 

 

 

 

This page last updated on December 4, 2001


More about the Book

 

The idea for the book arose out of work that was done in Microsoft’s Project 7.

In that project a number of compiler teams created new language compilers for the Common Language Runtime.

John Gough and Paul Roe wrote a compiler for Component Pascal.  As it turns out, this is an interesting language to implement on the CLR, since it poses a number of challenging problems.  Finding solutions to these problems helps to clarify the capabilities of the runtime.

The Series

The book is the first in a series that Prentice-Hall will publish.  The series editor is Bertrand Meyer.

Use as a Textbook

Although the book was not primarily intended as a text, it may be of some interest to those universities and colleges that take a practical approach to the compiler subject.  Wayne Kelly has pioneered the use of the CLR as a target for an introductory compiler construction course.  This site will add some material in the resources section (below) that may be of interest to others wishing to go down that path.

 

 


Resources

This section has links to material on a chapter-by-chapter basis.

For each chapter there is a commentary, including any errata or late-breaking news.  There is also a link to download the example programs for that particular chapter.  These resources should all appear by mid-December 2001.

General Remarks, and late breaking news.

Using the PDC 2001 release candidate

 

People using the RTM version of the framework should be aware of a small number of issues that have arisen since completion of the book.

*  Installation of the framework may not set up your path so that ildasm and peverify can be found.  You may check for this by opening a command window and trying “ilasm”, and then “ildasm”.  If the command line processor can find one but not the other, edit your PATH variable to include the extra directory.  On my machine it was

C:\Program Files\Microsoft Visual Studio .NET\FrameworkSDK\Bin

*  The verifier now insists that methods with one or more local variables must set the “init” flag.  gpcp version 1.1.2 has been fixed to account for this.  Local declarations now appear as follows:

.locals init (int32 foo)

where the “init” keyword is the new part.
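The path check described in the first item above can also be scripted.  The following is only a minimal sketch in Python, and is not part of the book or of gpcp.  The function names are made up for this illustration, and the only assumption about the tools is that they are named ilasm, ildasm and peverify, as in the text.

```python
import shutil

# Tool names taken from the text above.  On Windows, shutil.which also tries
# the extensions listed in PATHEXT, so "ilasm" will match "ilasm.exe".
TOOLS = ("ilasm", "ildasm", "peverify")


def find_tools(names=TOOLS, path=None):
    """Map each tool name to the location found on the search path, or None."""
    return {name: shutil.which(name, path=path) for name in names}


def missing_tools(names=TOOLS, path=None):
    """Return the names that cannot be found on the search path."""
    found = find_tools(names, path)
    return [name for name in names if found[name] is None]


if __name__ == "__main__":
    for name, where in find_tools().items():
        print(f"{name:10} {where or 'NOT FOUND: add its directory to PATH'}")
```

If any tool is reported as not found, add the directory that holds it to PATH and run the check again.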

Chapter 1

Chapter 2

Chapter 3

Chapter 4

The complete listings for the test program ValCls.cp and ValCls.il shown in outline on page 127 are linked here.  You will note that RecTyp is implemented as a value class, and PtrTyp is implemented as the reference class Boxed_RecTyp. Beware of the typographical error on the following page (listed below).

Chapter 5

Chapter 6

Chapter 7

Chapter 8

Chapter 9

Chapter 10

Chapter 11

Chapter 12

Appendix A

Appendix B

 

 


 

 

Errata

In general errors fall into three categories: regular typographical errors, things that were correct when they were written but are now incorrect, and stuff that is just plain wrong.

Chapter 1

Chapter 2

Chapter 3

On page 57 it is stated that value types cannot have virtual methods defined on them. 

This is just plain wrong.

Value types may have both instance and virtual methods defined on them.  In both cases it is a boxed instance of the value type that is passed as the “this” reference, but the “this” inside the method is a managed pointer to the value.

The use of virtual methods on value types is rather limited.  Such types are necessarily sealed, so any new virtual method cannot be overridden. In such cases an ordinary instance method would do as well, and possibly be more efficient.  The main use of such methods is to override virtual methods defined on the base types System.ValueType or System.Object.

 

Page 73 mentions “transient pointers”.  Transient pointers have been struck from the final version of the CLR.  Instead, addresses on the evaluation stack are always managed pointers.  This change does not affect any of the described semantics (which is a good reason to delete the concept).

Chapter 4

On page 128 the first of the two call statements has a typographical error.  The class name part of the method designator is wrong.

The two lines should read

     call instance void ValCls.RecTyp::'Foo'()

     call instance void ValCls.Boxed_RecTyp::'Bar'()

See the resources section above for complete listings of the example program.

Chapter 5

Chapter 6

Chapter 7

Chapter 8

Chapter 9

Chapter 10

Chapter 11

Chapter 12

Appendix A

Appendix B

 

 


 

Chapter Notes

 

Chapter 1

Chapter 2

Chapter 3

Chapter 4

Chapter 5

Chapter 6

Chapter 7

On page 219 the use of the optional “init” marker for local variable declarations is described.  The use of this marker is now mandatory for code to be verified.  This change is motivated by security considerations.  I would speculate that there is a trade-off here between the amount of analysis the verifier would need to do to avoid this restriction and the cost of a possibly wasted initialization.

Chapter 8

Chapter 9

Chapter 10

One of the topics in this chapter is handling overloading in languages that do not themselves support it.

Since release 1.1 gpcp has taken this chapter’s advice, and directly implements overloading of method names in foreign (imported) libraries.  As suggested in the chapter, when there is no unambiguous binding of the name, it is an error.  This forces programmers to avoid implicit type-coercions in argument lists to overloaded methods.
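To make the “unambiguous binding” rule concrete, here is a small sketch in Python.  It is an illustration only and is not gpcp’s actual algorithm.  The widening table, the BindingError class and the bind function are all hypothetical, and the rule shown (an exact match wins, otherwise exactly one overload must be applicable) is an assumption about what “unambiguous” means.

```python
# Hypothetical table of implicit widening coercions: for each argument type,
# the set of parameter types it may be passed to.
WIDENS_TO = {
    "int32": {"int32", "int64", "float64"},
    "int64": {"int64", "float64"},
    "float64": {"float64"},
    "string": {"string"},
}


class BindingError(Exception):
    """Raised when a call cannot be bound to exactly one overload."""


def applicable(params, args):
    """True if every argument may be passed to the matching parameter."""
    return len(params) == len(args) and all(
        p in WIDENS_TO.get(a, {a}) for p, a in zip(params, args)
    )


def bind(name, overloads, args):
    """Pick the single overload that the call binds to, or raise."""
    args = tuple(args)
    exact = [sig for sig in overloads if tuple(sig) == args]
    if len(exact) == 1:
        return exact[0]
    candidates = [sig for sig in overloads if applicable(sig, args)]
    if len(candidates) == 1:
        return candidates[0]
    if not candidates:
        raise BindingError(f"no overload of {name} accepts {args}")
    # More than one overload is reachable by coercion: refuse to guess.
    raise BindingError(f"call of {name} with {args} is ambiguous")


if __name__ == "__main__":
    write = [("int64",), ("float64",), ("string",)]
    print(bind("Write", write, ["float64"]))   # exact match
    try:
        bind("Write", write, ["int32"])        # int64 or float64?
    except BindingError as err:
        print("error:", err)
```

In this sketch the int32 call is rejected rather than resolved by a “best match” rule, so the programmer must convert the argument explicitly to select an overload.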

Eiffel for .NET takes a rather different approach.  In that language, an explicit mapping of names must be created for every foreign library.  Because of the confusion that would result if every Eiffel programmer defined their own name-mapping, ISE is hosting a repository of agreed maps.  This is an elegant solution to the problem for those languages that have a close-knit user community where such a repository can work well.

Chapter 11

Chapter 12

(Warning: this section will make no sense until the relevant section of the book has been read and understood. Sorry.)

The final section on Accessing Non-local Variables has been simplified for purposes of exposition.  The gpcp compiler actually implements one significant optimization compared to the design described in this section of the book.  The new idea is that only activation records that actually contain uplevel data have XHR records.  The static link references still chain the XHR records together, but skip over activation records that have no such data.

In this optimized design, if a method at level N needs to access data at level M, then instead of following the chain of links (N-M) times the link must be followed d times, where d is the number of methods statically nested between level M and N that have non-empty uplevel addressed local variable sets.  This number is statically known at compile time.

In typical code, where the use of non-local data is sparse, this optimization not only saves on runtime code but also reduces the number of costly object creation steps.  Check the source of gpcp, and run some examples, to see how it works in practice.
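The difference in link counts can be seen with a small simulation.  This is a sketch in Python and not code from gpcp.  The Xhr class and both functions are invented for the illustration, and it assumes one reading of “between level M and N”: the levels counted are M up to, but not including, N.

```python
class Xhr:
    """Stand-in for an XHR record: holds a level number and a static link."""

    def __init__(self, level, link):
        self.level = level
        self.link = link


def build_static_chain(has_uplevel, n):
    """Return the static link received by a method at nesting level n.

    has_uplevel[k] says whether the method at level k has any uplevel
    addressed locals.  Only those levels get an XHR record.
    """
    link = None
    for level in range(n):
        if has_uplevel[level]:
            link = Xhr(level, link)
    return link


def hops_to(link, m):
    """Count how many links are followed to reach the XHR of level m."""
    hops = 1  # following the incoming static link is the first hop
    while link is not None and link.level != m:
        link = link.link
        hops += 1
    if link is None:
        raise LookupError(f"level {m} has no XHR record")
    return hops


if __name__ == "__main__":
    # Five enclosing levels, of which only levels 0 and 3 hold uplevel data.
    has_uplevel = [True, False, False, True, False]
    n, m = 5, 0
    link = build_static_chain(has_uplevel, n)
    print("simple design:   ", n - m, "hops")
    print("optimized design:", hops_to(link, m), "hops")
```

With this nesting the simple design follows five links and creates five records, while the optimized chain holds two records and reaches level 0 in two hops.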

Appendix A

Because of the new requirements for verification mentioned above (use of the “init” marker), it is now less critical to perform initialization analysis.  Nevertheless, in unverified contexts this is still an efficiency gain.  In any case, the arguments in favor of detecting additional erroneous programs at compile time still hold.
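For readers who have not met initialization analysis before, the following Python sketch shows the general idea on a toy statement form.  It is not the algorithm of the appendix or of gpcp.  The tuple encoding of statements and the function names are assumptions made for this illustration.

```python
def _check(uses, assigned, errors):
    """Record every variable that is read before it is definitely assigned."""
    for name in uses:
        if name not in assigned:
            errors.append(name)


def analyse(block, assigned, errors):
    """Return the set of variables definitely assigned after the block.

    Statement forms (hypothetical encoding):
      ("set", target, uses)                   assignment reading `uses`
      ("if", uses, then_block, else_block)    condition reads `uses`
      ("while", uses, body)                   condition reads `uses`
    """
    for stmt in block:
        kind = stmt[0]
        if kind == "set":
            _, target, uses = stmt
            _check(uses, assigned, errors)
            assigned = assigned | {target}
        elif kind == "if":
            _, uses, then_block, else_block = stmt
            _check(uses, assigned, errors)
            # Only variables assigned on both branches are definitely assigned.
            assigned = (analyse(then_block, assigned, errors)
                        & analyse(else_block, assigned, errors))
        elif kind == "while":
            _, uses, body = stmt
            _check(uses, assigned, errors)
            # The body may run zero times, so it adds nothing afterwards.
            analyse(body, assigned, errors)
        else:
            raise ValueError(f"unknown statement kind: {kind}")
    return assigned


if __name__ == "__main__":
    program = [
        ("set", "a", ()),
        ("if", ("a",), [("set", "b", ("a",))], []),
        ("set", "c", ("b",)),          # b is assigned on one branch only
        ("while", ("a",), [("set", "d", ())]),
        ("set", "e", ("d", "c")),      # d is assigned only inside the loop
    ]
    errors = []
    analyse(program, frozenset(), errors)
    print("possibly uninitialized uses:", errors)
```

The sample program reports b and d.  These are the cases where a compiler could either reject the program or fall back on the runtime’s zero initialization.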

Appendix B

 

 
