Monday, March 17, 2014

Rust namespaces by example


/* ---[ Cheat sheet for Rust namespaces ]--- */

Rust namespaces can be a little mind-bending. This brief blog post is meant to provide instruction by example on setting up an application or library of Rust code that uses multiple files in a hierarchical namespace.

Though the current docs-project-structure guide on the Rust wiki is pretty sparse, you should start there. Then read the section on Crates and Modules in the tutorial.

I used those references plus an examination of how files and namespaces in the libstd code tree are structured to come up with this example.

In this simple example, I want to have an application where things are namespaced under the top-level module abc. I want to have a couple of files (namespaces) under abc, some additional directories (namespaces) below abc, such as abc::xyz, and modules in the abc::xyz namespace. Furthermore, I want all those namespaces to be able to refer to each other - both down the chain and up the chain.

Here is a simple example that illustrates how to do it. I am using Rust-0.10pre (built 16-Mar-2014).

First, I have a project I called "space-manatee", under which I have a src directory and then my code hierarchy starts:

quux00:~/rustlang/space-manatee/src$ tree
.
├── abc
│   ├── mod.rs
│   ├── thing.rs
│   ├── wibble.rs
│   └── xyz
│       ├── mod.rs
│       └── waldo.rs
└── main.rs

2 directories, 6 files

To provide a namespace foo in Rust, you can either create a file called foo.rs or a dir/file combo of foo/mod.rs. The content of my abc/mod.rs is:

quux00:~/rustlang/space-manatee/src$ cat abc/mod.rs 
pub mod thing;
pub mod wibble;
pub mod xyz;

All this module does is export other modules in the same directory. It could have additional code in it - functions and data structures, but I elected not to do that.



xyz is a directory, and since I created the xyz/mod.rs dir/file combo, it is a namespace that can be used and exported.
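If you want to experiment with the shape of this hierarchy without creating any files, a rough single-file sketch is below. It assumes a current stable Rust toolchain (2018 edition or later) rather than the 0.10pre used in this post, and it uses inline mod blocks to stand in for the files and directories. The whoami functions are hypothetical placeholders of mine, not part of space-manatee.

```rust
// Each inline `mod name { ... }` plays the role of name.rs or name/mod.rs.
pub mod abc {
    // Stands in for abc/thing.rs
    pub mod thing {
        pub fn whoami() -> &'static str {
            "abc::thing"
        }
    }

    // Stands in for abc/wibble.rs
    pub mod wibble {
        pub fn whoami() -> &'static str {
            "abc::wibble"
        }
    }

    // Stands in for abc/xyz/mod.rs
    pub mod xyz {
        // Stands in for abc/xyz/waldo.rs
        pub mod waldo {
            pub fn whoami() -> &'static str {
                "abc::xyz::waldo"
            }
        }
    }
}
```

Calling abc::xyz::waldo::whoami() from the top of the file walks the same path that the directory layout spells out on disk.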

Let's look into the other files:

quux00:~/rustlang/space-manatee/src$ cat abc/thing.rs 
extern crate time;

use time::Tm;

pub struct Thing1 {
    name: ~str,
    when: time::Tm,
}

pub fn new_thing1(name: ~str) -> ~Thing1 {
    ~Thing1{name: name, when: time::now()}
}

thing.rs pulls in the rustlang time crate and then defines a struct and constructor for it. It doesn't reference any other space-manatee code.

quux00:~/rustlang/space-manatee/src$ cat abc/wibble.rs 
use abc::thing;
use abc::thing::Thing1;

pub struct Wibble {
    mything: ~Thing1
}

pub fn make_wibble() -> ~Wibble {
    ~Wibble{mything: thing::new_thing1(~"cthulu")}
}

wibble.rs, however, does reference other space-manatee modules, so it uses the fully qualified namespace from the top of the hierarchy, but it does not have to explicitly "import" anything. It can find the thing namespace without a mod declaration of its own because abc/mod.rs already declares thing, so the path abc::thing resolves from the top of the crate.
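For comparison, here is a hedged sketch of how the same sibling reference might look on current stable Rust (2018 edition or later), which is an assumption on my part since the post targets 0.10pre. In modern Rust the path from the top is written with a leading crate::, and super:: gives a relative alternative. I use String and Box in place of ~str and ~, and I left out the when field so the sketch does not need the external time crate.

```rust
pub mod abc {
    pub mod thing {
        pub struct Thing1 {
            pub name: String,
        }

        pub fn new_thing1(name: &str) -> Box<Thing1> {
            Box::new(Thing1 { name: name.to_string() })
        }
    }

    pub mod wibble {
        // Absolute path from the crate root, the modern spelling of
        // `use abc::thing;` in the post.
        use crate::abc::thing;
        // Relative path to the same sibling module, via the parent (abc).
        use super::thing::Thing1;

        pub struct Wibble {
            pub mything: Box<Thing1>,
        }

        pub fn make_wibble() -> Box<Wibble> {
            Box::new(Wibble { mything: thing::new_thing1("cthulu") })
        }
    }
}
```

Either path style finds the sibling; neither needs an extra mod declaration inside wibble.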



OK, let's look into the xyz directory now.

quux00:~/rustlang/space-manatee/src$ cat abc/xyz/mod.rs 
pub mod waldo;

That just exports the waldo namespace in the same directory. What's in waldo?

quux00:~/rustlang/space-manatee/src$ cat abc/xyz/waldo.rs 
use abc::wibble::Wibble;

pub struct Waldo {
    magic_number: int,
    w: ~Wibble
}

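The Waldo struct references the Wibble struct that is higher than it in the hierarchy. Notice there is no "import" via a mod statement - going up the hierarchy requires no import, because use paths are resolved from the top of the crate and abc::wibble has already been declared in abc/mod.rs.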
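Reaching up the hierarchy can also be written relative to the current module. The sketch below assumes current stable Rust rather than 0.10pre, and the Wibble here is a cut-down stand-in with a hypothetical label field, just enough to show the path.

```rust
pub mod abc {
    pub mod wibble {
        #[derive(Debug)]
        pub struct Wibble {
            pub label: String,
        }
    }

    pub mod xyz {
        pub mod waldo {
            // From abc::xyz::waldo, `super` is abc::xyz and
            // `super::super` is abc, so this names abc::wibble::Wibble.
            use super::super::wibble::Wibble;

            #[derive(Debug)]
            pub struct Waldo {
                pub magic_number: i32,
                pub w: Box<Wibble>,
            }
        }
    }
}
```

As in the post, waldo contains no mod statement for wibble; the declaration in the parent module is what makes the path resolvable.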

So that's the supporting cast. Let's see how the main.rs program uses them:

quux00:~/rustlang/space-manatee/src$ cat main.rs 
extern crate time;

use abc::{thing,wibble};
use abc::thing::Thing1;
use abc::wibble::Wibble;
use abc::xyz::waldo::Waldo;

pub mod abc;

fn main() {
    let tg: ~Thing1 = thing::new_thing1(~"fred");
    println!("{}", tg.name);

    let wb: ~Wibble = wibble::make_wibble();
    println!("{}", wb.mything.name);

    let wdo = Waldo{magic_number: 42,
                    w: wibble::make_wibble()};
    println!("{:?}", wdo);
}

The only mod "import" main.rs needs is the abc namespace - which is in the same directory as main.rs. In fact, that is all you can import. If you try mod abc::thing the compiler will tell you that you aren't doing it right.

By importing abc, you are importing abc/mod.rs. Go back up and look at what abc/mod.rs does - it imports other modules, which in turn import other modules, so they all end up being imported into main.rs as addressable entities.



Once all those import references are set up, nothing special has to be done to compile and run it:

quux00:~/rustlang/space-manatee/src$ rustc main.rs
quux00:~/rustlang/space-manatee/src$ ./main
fred
cthulu
abc::xyz::waldo::Waldo{
  magic_number: 42,
  w: ~abc::wibble::Wibble{
    mything: ~abc::thing::Thing1{
      name: ~"cthulu",
      when: time::Tm{tm_sec: 21i32,
        tm_min: 26i32, tm_hour: 21i32, tm_mday: 17i32, tm_mon: 2i32,
        tm_year: 114i32, tm_wday: 1i32, tm_yday: 75i32, tm_isdst: 1i32,
        tm_gmtoff: -14400i32, tm_zone: ~"EDT", tm_nsec: 663891679i32
      }
    }
  }
}

(I hand formatted the Waldo output for easier reading.)

Sunday, March 16, 2014

Select over multiple Rust Channels


/* ---[ Channels in Rust ]--- */

The channel paradigm for CSP-based concurrency has received a lot of attention lately, since it is the foundational concurrency paradigm in Go, and Clojure has embraced it with core.async. It turns out that Rust, the new language from Mozilla, also fully embraces channel-based message passing concurrency.

Both Go and Clojure's core.async have a select operation that allows your code to wait on multiple channels and respond to the first one that is ready. This is based, at least conceptually, on the Unix select system call that monitors multiple file descriptors.

Rust also has a select operation. And it has a select! macro to make using it easier. Here's an example:

use std::io::Timer;

fn use_select_macro() {
    let (ch, pt): (Sender<~str>, Receiver<~str>) = channel();

    let mut timer = Timer::new().unwrap();
    let timeout = timer.oneshot(1000);
    select! (
        s = pt.recv() => println!("{:s}", s),
        () = timeout.recv() => println!("timed out!")
    );
}

Channels and Ports are now called Senders and Receivers in Rust. As with select in Go, if the Receiver called pt has a message come in before the 1 second timer goes off, its code block will execute. Otherwise, the timer's Receiver will be read from and its code block executed, printing "timed out".

Note that the select! macro uses parens, like a function call, not curly braces like a code block.
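For readers on a current stable toolchain: as far as I know, the standard library no longer ships a select! macro, so this sketch uses Receiver::recv_timeout from std::sync::mpsc as the closest standard analog for the "message or timeout" case. It is not the API used in this post, and recv_or_timeout is a hypothetical helper name.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

// Wait up to `millis` milliseconds for a message on `pt`.
// Returns the message, or a short note describing why none arrived.
fn recv_or_timeout(pt: &Receiver<String>, millis: u64) -> String {
    match pt.recv_timeout(Duration::from_millis(millis)) {
        Ok(s) => s,
        // No message arrived before the deadline.
        Err(RecvTimeoutError::Timeout) => "timed out!".to_string(),
        // Every Sender was dropped, so no message can ever arrive.
        Err(RecvTimeoutError::Disconnected) => "sender dropped".to_string(),
    }
}
```

This covers only one Receiver plus a deadline; it does not wait on several channels at once the way select does.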

The select! macro is currently labelled experimental, since it has some limitations. One I hit this week is that it will fail (as in, not compile) if you embed the Receiver in a struct:


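// A bundles both ends of a channel in a struct.
struct A {
    c: Sender<~str>,
    p: Receiver<~str>
}

fn does_not_compile() {
    let (ch, pt): (Sender<~str>, Receiver<~str>) = channel();
    let a = A{c: ch, p: pt};

    let mut timer = Timer::new().unwrap();
    let timeout = timer.oneshot(1000);
    select! (
        // The field access a.p is what the macro rejects.
        s = a.p.recv() => println!("{:s}", s),
        () = timeout.recv() => println!("timed out!")
    );
}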

This fails with error: no rules expected the token '.'. I've filed an issue for it here: https://github.com/mozilla/rust/issues/12902#issuecomment-37714663


/* ---[ Using the Rust Channel Select API ]--- */

The workaround is to use the Select API directly. Here's how you do it:


use std::comm::Select;
use std::io::Timer;

fn select_from_struct() {
    let (ch, pt): (Sender<~str>, Receiver<~str>) = channel();    
    let mut timer = Timer::new().unwrap();
    let timeout = timer.oneshot(1000);

    let a = A{c: ch, p: pt};

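    // Build a select set and wrap each Receiver in a Handle.
    let sel = Select::new();
    let mut pt = sel.handle(&a.p);
    let mut timeout = sel.handle(&timeout);
    // add() registers each handle with the set; it must be called in an unsafe block.
    unsafe { pt.add(); timeout.add(); }
    // Block until one handle is ready; wait() returns that handle's id.
    let ret = sel.wait();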

    if ret == pt.id() {
        let s = pt.recv();
        println!("ss: {:?}", s);
    } else if ret == timeout.id() {
        let () = timeout.recv();
        println!("TIMEOUT!!");
    }
}

It's a little more code, but fairly straightforward to follow. You wrap your Receivers in a select Handle and then add them to the Receiver set via the add method (which must be wrapped in an unsafe block). Each handle gets an id so you can discover which one returned first.

Finally you wait. When the wait returns, you check the returned id and execute the appropriate block of code, which starts by calling recv on the Receiver to get the incoming value (if any).
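The "which handle fired" check above can be approximated on current stable Rust without a Select API. The sketch below is a different technique from the one in this post: it forwards each Receiver into one merged channel on its own thread and tags every message with an enum variant, which plays the role of the handle id. It assumes std::sync::mpsc and std::thread from modern Rust; Event and merge are hypothetical names.

```rust
use std::sync::mpsc::{channel, Receiver};
use std::thread;

// The variant tells the consumer which source produced the message,
// much like comparing the id returned by wait() against each handle.
#[derive(Debug, PartialEq)]
enum Event {
    Text(String),
    Number(i32),
}

// Forward two Receivers into a single Receiver of tagged events.
fn merge(texts: Receiver<String>, numbers: Receiver<i32>) -> Receiver<Event> {
    let (tx, rx) = channel();
    let tx_numbers = tx.clone();

    // One forwarding thread per source; each loop ends when its
    // source's Senders are all dropped or the merged Receiver is gone.
    thread::spawn(move || {
        for s in texts {
            if tx.send(Event::Text(s)).is_err() {
                break;
            }
        }
    });
    thread::spawn(move || {
        for n in numbers {
            if tx_numbers.send(Event::Number(n)).is_err() {
                break;
            }
        }
    });

    rx
}
```

A consumer then calls recv on the merged Receiver and matches on the Event variant to decide which block of code to run. The cost is one extra thread per source, which the Select API avoids.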