Source Language
C0
The labs use increasingly complete subsets of C0, the programming language designed for 15-122 Principles of Imperative Computation, the introductory computer science course at Carnegie Mellon University. C0 is a small, safe subset of the C programming language with unambiguously defined semantics. All students are strongly encouraged to learn C0, since it is the language they will write a compiler for in this course.
Programming Languages for Implementing Compilers
You may write your compiler in one of the supported programming languages (ML, Haskell, Java, or Scala) or, with caveats, in a different language. This course requires you to be familiar with the programming language that you choose. You should learn the language before the course starts so that you do not struggle with too many difficulties at once.
Learning New Programming Languages for 411
If you want to learn a new programming language for your lab, consider the following. Students are always encouraged to learn new things and new programming languages. Haskell, for example, is a particularly good language for the labs, but you may not have learned it in other courses yet. Because ML, Haskell, and Scala have built-in pattern matching, several transformations are easier to implement in them than in Java.
- Unfortunately, course staff cannot provide much assistance in learning new languages. You have to learn your programming language of choice from a language reference or tutorials. You are advised to be reasonably proficient in your programming language when the course starts. If you learn a language that is completely new to you, it is a good idea to start early.
- Since each lab builds on work done in the previous labs, you have to be completely committed to working with your programming language of choice. You cannot switch languages between labs without redoing the work from all previous labs.
- If you are already familiar with one functional and/or type-safe programming language, it is a lot easier to learn Haskell or ML in the given time frame.
- There is a tradeoff between the time invested in learning a new programming language at the beginning of the course and the potential extra effort spent implementing more advanced compiler features in more verbose languages at the end of the course.
- Compilers have been written successfully in a lot of different programming languages.
- You are free to write your compiler in a different programming language that we do not support. Keep in mind, however, that we cannot provide starter code for other programming languages. So if you want to write your compiler in a language other than ML, Haskell, Java, or Scala, you need to develop your own version of the starter code for Lab 1. You are advised to start with this very early on. We also recommend that you talk to the course staff for advice.
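To make the remark about pattern matching concrete, here is a minimal sketch of constant folding over a tiny expression tree in plain Java. The classes `Expr`, `Num`, `Var`, and `BinOp` and the methods `fold` and `show` are hypothetical names invented for this illustration; they are not taken from the course starter code. Each case needs an explicit `instanceof` test followed by casts, whereas in ML, Haskell, or Scala the same case would typically be a single pattern clause.

```java
// Hypothetical mini-AST for illustration only; not the course starter code.
public class ConstFold {
    // Expression nodes: integer literal, variable, and binary operation.
    static abstract class Expr {}

    static final class Num extends Expr {
        final int value;
        Num(int value) { this.value = value; }
    }

    static final class Var extends Expr {
        final String name;
        Var(String name) { this.name = name; }
    }

    static final class BinOp extends Expr {
        final char op;            // '+' or '*' in this sketch
        final Expr left, right;
        BinOp(char op, Expr left, Expr right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
    }

    // Fold constant subexpressions bottom-up.
    static Expr fold(Expr e) {
        if (!(e instanceof BinOp)) {
            return e;  // literals and variables are already in simplest form
        }
        BinOp b = (BinOp) e;
        Expr l = fold(b.left);
        Expr r = fold(b.right);
        // Without pattern matching, each case is a type test plus casts.
        if (l instanceof Num && r instanceof Num) {
            int x = ((Num) l).value;
            int y = ((Num) r).value;
            if (b.op == '+') return new Num(x + y);
            if (b.op == '*') return new Num(x * y);
        }
        return new BinOp(b.op, l, r);
    }

    // Render an expression as a fully parenthesized string.
    static String show(Expr e) {
        if (e instanceof Num) return Integer.toString(((Num) e).value);
        if (e instanceof Var) return ((Var) e).name;
        BinOp b = (BinOp) e;
        return "(" + show(b.left) + " " + b.op + " " + show(b.right) + ")";
    }

    public static void main(String[] args) {
        Expr e = new BinOp('+', new Var("x"),
                           new BinOp('*', new Num(3), new Num(4)));
        System.out.println(show(fold(e)));  // prints "(x + 12)"
    }
}
```

Only `+` and `*` are folded here. Folding division would also have to account for division by zero, which this sketch deliberately leaves out.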
If you want to take a peek at the starter code for lab 1 in the various programming languages before making up your mind, you can preview it as follows:
    svn co https://cvs.concert.cs.cmu.edu/15411-f12/public/lab1/
Keep in mind that you will later need to log in with your team account to get work done.
Standard ML
- Standard ML Implementations:
  - Standard ML of New Jersey (SML/NJ): default (v110.59) on the lab machines; invoke with sml. Recent versions are likely to be compatible with SML/NJ v110.59.
  - MLton: a highly optimizing, whole-program compiler, mostly compatible with SML/NJ.
  - Poly/ML: another high-quality compiler.
- The Standard ML Basis Library
- SML/NJ Libraries
- ML-Lex Manual
- ML-Yacc Manual
- ML-ANTLR in ml-lpt
- Programming in Standard ML by Robert Harper
Java
- Java API
- JLex: lexer generator
- JFlex: lexer generator
- CUP: LALR parser generator
- ANTLR: lexer and LL parser generator
- JavaCC: lexer and LL parser generator
Haskell
- Haskell books and tutorials
- Parsec: monadic parser combinator library for Haskell
- Happy: parser generator for Haskell
Scala
- Scala reference manuals and tutorials
- Scala API
- scala.util.parsing.combinator package
- scala.util.parsing.combinator.PackratParsers: packrat parsing by memoization
- ScalaBison: recent LR parser generator
Others
You are welcome to implement your compiler in other programming languages. If you choose to do so, however, we cannot provide starter code, so you have to write your own starter code for lab 1. There is no starter code for later labs, so this is a one-time setup cost, but you are advised to start early enough to finish lab 1 on time. We also cannot provide much advice about your favorite programming language. You should carefully read our advice about learning new programming languages for 411.
Target Languages
x86-64 Machine-Level Programming
The following documents will help you fathom the depths of machine-level programming on x86-64, a 64-bit extension of the Intel x86 instruction set.
- x86-64 Machine-Level Programming. This document supplements Chapter 3 of the textbook for 15-213 Computer Systems: A Programmer's Perspective by Randal E. Bryant and David R. O'Hallaron.
- GNU Assembler User Guide. This is the version (2.15) available on the lab machines. It contains i386-specific features with notes on the differences between x86 and x86-64.
- x86-64 Application Binary Interface (ABI). Specifies the rules for compilers and linkers.
- Official Intel Processor Manuals (not for the faint of heart) including the Instruction Set Reference in two volumes.
IA32 and Assembler Reference Material
The following are for the older Intel x86 architectures. See the newer references above for the x86-64 (also known as IA32-EM64T).
- Intel opcode table for IA32, two-page reference
- x86 asm in Intel format
LLVM
- LLVM Home Page
- On-line demo (for producing LLVM source from C)
- LLVM Notes for this class, on the lab machines
Java Virtual Machine (JVM)
Tools
GDB
- Updated gdb reference for x86-64 architecture
- Quick gdb reference for x86 architecture