Writing a pi-calculus compiler in F# (Part V): Parser redux and runtime stubs

Just joining us? We’re writing a compiler for the π-calculus—Part I explains. Continuing from last time? Need a refresher? We just finished implementing a parser in F# for our little π-calculus language.

 

Now let’s switch gears and write our compiler’s top-level loop. This in main.fs:

 

open Ast

open System.IO

open Printf

open Idioms

 

let _ =

      if Array.length Sys.argv <> 2 || not (File.Exists(Sys.argv.(1)))

      then begin

            printf "usage: pic <file>\n";

            exit 1

      end;

     

      let filename = Sys.argv.(1) in

      let proc = ref Inert in

      using (new StreamReader(filename)) (fun input ->

            proc :=

                  let lexbuf = Lexing.from_stream_reader input in

                  try Pars.start Lex.token lexbuf

                  with e ->

                        let pos = Lexing.lexbuf_curr_p lexbuf in

                        printf "%s (%d, %d): parse error: %s"

                              filename

                              (pos.pos_lnum+1)

                              (pos.pos_cnum - pos.pos_bol + 1)

                              (e.Message);

                        exit 1;

                        Inert

      );

      print_any !proc

     

At some level, this is all mechanics; bewildering if you’re new to F# and pedestrian if you’re not. All we’re doing here is:

 

  1. Checking we got passed a filename argument
  2. Opening the file
  3. Parsing it (the contents of the using block)
  4. Printing what we parsed (with print_any !proc)

 

Let’s compile what we have thus far:

 

$ fslex lex.fsl

compiling to dfas (can take a while...)

18 states

writing output

 

$ fsyacc pars.fsy

building tables

computing first function

building kernels

computing lookahead relations

building lookahead table

building action table

building goto table

29 states

4 nonterminals

14 terminals

10 productions

#rows in action table: 29

#unique rows in action table: 24

maximum #different actions per state: 6

average #different actions per state: 2

 

$ fsc -o pic.exe ast.fs pars.fsi lex.fs pars.fs main.fs

 

$ pic

usage: pic <file>

 

$ _

 

Because fslex and fsyacc translate our lexer and parser files into ordinary F# source files we have to re-run fslex and fsyacc if we make changes to lex.fsl and pars.fsy. Otherwise, as we keep going, we can just re-run the F# compiler, fsc, with that third-to-last line. Now we have the front half of our compiler, which I’ve called pic, just waiting for something to compile! Let’s try three different programs:

 

$ type test\inert.pi

0

$ pic test\inert.pi

Inert

$ type test\syntax.pi

*this is undoubtedly a syntax error...

$ pic test\syntax.pi

test\syntax.pi (1, 1): parse error: unrecognized input

$ type test\echo.pi

!e(k).(new f).k<f>.f(x).f(r).r<x>.0

$ pic test\echo.pi

Repeat

 Recv

  ("e","k",

   New ("f",Send ("k","f",Recv ("f","x",Recv ("f","r",Send ("r","x",Inert))))))

$ _

 

The parser seems to be working, converting text into our abstract syntax, which print_any is dumping to the console.

 

Now we need to do the very last part—generate a .NET executable. Let’s quickly think about what that executable will look like. To do that, we need to be familiar with the CLR, and IL—the low-level CLR byte code. I recommend you read the ECMA/ISO specifications, particularly ECMA-335 Partition I-III; play around with the ilasm and ildasm tools which come with the .NET Framework SDK; and perhaps read John Gough’s excellent book Compiling for the .NET Common Language Runtime.

 

Back to thinking about our stuff: There will be running processes; these shall be CLR threads. Running processes have some variables (either from new, or from receiving them from somewhere) and those variables will need to be stored somewhere. It’s likely that | means to spawn a thread and copy captured variables into a closure for the new thread to reference whereas messages received from a channel can probably live in a local variable. Sequential parts of programs should probably just generate sequential code.

 

Receiving a message looks awfully like a method call—it’s something which activates your thread and passes you a parameter—but which really spoils our idea of generating sequential code, because every time the programmer writes receive in a sequence, we have to break apart the sequence in order to have a new method there. Also, what if two processes try to receive a message on the same channel at once? We need one place where message sending and receiving rendezvous, contention is resolved, and the threads from different processes can rub up against each other to pass data, but not jumping into each other’s code. Let’s call this place a Channel. All ­pic programs will need to use channels, so let’s write a little pic runtime library in C#, and create a reusable definition there:

 

public class Channel

{

    public readonly string Name;

 

    public Channel(string name)

    {

        this.Name = name;

    }

 

    public void Send(Channel channel)

    {

        // ???

    }

 

    public Channel Receive()

    {

        throw new System.NotImplementedException();

    }

}

 

I can’t say I know what happens here yet, but I know that you can create named channels which you can send and receive with. When you send, you put one datum in (a channel—everything in our world is a channel); when you receive, you get one datum out. We’ll worry about filling in those method stubs later. While these names are fresh in our minds, let’s write some AbsIL to reference the runtime library:

 

open Microsoft.AbstractIL.IL

 

(* Scope of the pic runtime assembly *)

let runtime_scope =

      ScopeRef_assembly

            { assemRefName = "Pirt";

              assemRefHash = None;

              assemRefPublicKeyInfo = None;

              assemRefRetargetable = false;

              assemRefVersion = None;

              assemRefLocale = None }

 

(* the channel type in the runtime library *)

let channel_typ =

      mk_nongeneric_boxed_typ (mk_tref (runtime_scope, "Channel"))

 

(* the channel constructor *)

let channel_ctor_mspec =

      mk_ctor_mspec_for_typ (channel_typ, [typ_String])

 

(* the Send method *)

let send_mspec =

      mk_nongeneric_instance_mspec_in_typ (channel_typ, "Send", [channel_typ], Type_void)

     

(* the Receive method *)

let recv_mspec =

      mk_nongeneric_instance_mspec_in_typ (channel_typ, "Receive", [], channel_typ)

 

This says the runtime library is called Pirt—we’d better remember to get the C# compiler to output Pirt.dll—and has a type named Channel, a constructor, a Send method, and a Receive method.

 

Next time we’ll get to the meat of our compiler and use this metadata to generate some code!

Powered by Google App Engine
Custom Search