IPDL/Getting started
IPDL is a domain-specific language that allows programmers to define a message-passing "protocol" between "actors." These "actors" are thread contexts, which can execute either in separate address spaces or within the same address space (shared-nothing threads).
"Protocols" consist of two elements: declarations of messages that can be exchanged between two actors, and the definition of a state machine that describes when each message is allowed to be sent.
From an IPDL specification, several C++ headers are generated. These headers are meant to be opaque to the author of an IPDL specification; internally, they manage the tedious details of setting up and tearing down the underlying communication layer (sockets and pipes), constructing and sending messages, ensuring that all actors adhere to their specifications, and "correctly" handling errors.
This guide intends to introduce the basic concepts of IPDL through an increasingly complicated example. By the end of the guide, you should be able to write IPDL specs and the C++ implementations of message handlers. This guide does not attempt to cover how IPDL works under the covers.
Running example: Browser plugins
We will use the example of a web browser launching plugins in separate processes and then controlling them. A plugin here is a dynamically loaded code module, such as libflash.so. Once a plugin module has been loaded, the browser can ask the plugin module to create instances. A plugin instance is what lives inside an object frame on a particular web page; a single youtube video is a Flash plugin instance, for example. There can be any number of plugin instances per plugin module. (There can be any number of youtube videos open in a web browser.)
Plugin instances are scriptable, which means that they can access JavaScript objects in the browser, and the browser can access JavaScript objects created by the plugin instance. Using the youtube Flash video example again, the youtube instance can create a JavaScript object that represents the video player, and the browser can access that video player object and ask it to "Pause," "Play," etc. There can be any number of script objects per plugin instance.
Protocols and actors
Protocols define how two actors communicate. We'll introduce protocols and actors using the plugin modules described above. Recall that in this example, we have two processes: the browser process and the process in which the plugin module's code executes. There are two actors: the thread context in which the browser's plugin management code executes (in the browser process), and the thread on which the plugin's code executes (in the plugin process).
We have chosen to codify the concept of parent and child actors in IPDL; we use "parent actor" to refer to the "more trusted" actor, and "child actor" to refer to the "less trusted" actor. In the case of plugins, the browser actor is the parent, and the plugin actor is the child.
The parent and child actors communicate by sending messages to each other. The messages that can be exchanged are explicitly declared in IPDL. The following IPDL code defines a very basic interaction of browser and plugin actors:

    protocol Plugin {
      child:
        Init();
        Deinit();
    };

This code defines a Plugin protocol. On the next three lines, the code declares two messages, Init() and Deinit(). We will describe the messages in more detail in the next section. To finish the introduction of protocols and actors, note the child keyword used on the 2nd line. This keyword says that the child "accepts", or "implements", the messages declared under that "child" label. That is, the parent actor can send those messages to the child, but the child cannot send those messages to the parent.
Basic messages
It is very important to understand message semantics. It's tempting to think of protocol messages as C++ function calls, but that's not very useful. We'll delve into gory details of the above example to illustrate these semantics. To keep the example as concrete as possible, we first need to take a detour into how the above example is translated into something that C++ code can utilize.
The above specification will generate three headers: PluginProtocolParent.h, PluginProtocolChild.h, and PluginProtocol.h (we'll ignore the third completely; it's full of uninteresting implementation details). As you might guess, PluginProtocolParent.h defines a C++ class (PluginProtocolParent) that code on the parent side utilizes, the browser in this example. And similarly, code running in the plugin process will use the PluginProtocolChild class.
These Parent and Child classes are abstract. They will contain a "Send" function implementation for each message type in the protocol (which inheriting classes can call to initiate communication), and a pure virtual "Recv" function for the message receiver (which needs to be implemented by the inheriting class). For example, the PluginProtocolParent class will look something like the following in C++:

    class PluginProtocolParent
    {
    public:
        void SendInit()
        {
            // boilerplate generated code ...
        }

        void SendDeinit()
        {
            // boilerplate generated code ...
        }
    };

and the PluginProtocolChild will be something like:

    class PluginProtocolChild
    {
    protected:
        virtual void RecvInit() = 0;
        virtual void RecvDeinit() = 0;
    };
These Parent and Child abstract classes take care of all the "protocol layer" concerns; sending messages, checking protocol safety (we'll discuss that later), and so forth. However, these abstract classes can do nothing but send and receive messages; they don't actually do anything, like draw to the screen or write to files. It's up to implementing C++ code to actually do interesting things with the messages.
So these abstract Parent and Child classes are meant to be subclassed by C++ implementors. Below is a dirt simple example of how a browser implementor might utilize the PluginProtocolParent:

    class PluginParent : public PluginProtocolParent
    {
    public:
        PluginParent(String pluginDsoFile)
        {
            // launch child plugin process
        }
    };
This is a boring class. It simply launches the plugin child process. Note that since PluginParent inherits from PluginProtocolParent, the browser code can invoke the SendInit() and SendDeinit() methods on PluginParent objects. We'll show an example of this below.
Here's how the PluginProtocolChild might be used by a C++ implementor in the plugin process:
    class PluginChild : public PluginProtocolChild
    {
    protected:
        // implement the PluginProtocolChild "interface"
        void RecvInit()
        {
            printf("Init() message received\n");
            // initialize the plugin module
        }

        void RecvDeinit()
        {
            printf("Deinit() message received\n");
            // deinitialize the plugin module
        }

    public:
        PluginChild(String pluginDsoFile)
        {
            // load the plugin DSO
        }
    };
The PluginChild is more interesting: it implements the "message handlers" RecvInit() and RecvDeinit() that will be automatically invoked by the protocol code when the plugin process receives the Init() and Deinit() messages, respectively.
Launching the subprocess and hooking these protocol actors into our IPC "transport layer" is beyond the scope of this document. See IPDL/Five minute example for more details.
Let's run through an example of how this dirt simple protocol and its implementation could be used. In the table below, the first column shows C++ statements executing in the browser process, and the second shows C++ statements executing in the plugin process.
Browser process | Plugin process |
---|---|
BrowserMain.cc: PluginParent* pp = new PluginParent("libflash.so"); | |
(do browser stuff) | plugin process is created |
... | PluginMain.cc: PluginChild* pc = new PluginChild("libflash.so"); |
BrowserMain.cc: pp->SendInit(); | (idle) |
PluginProtocolParent.h: (construct Init() message, send it) | ... |
(idle) | PluginProtocolChild.h: (unpack Init() message, call RecvInit();) |
... | PluginChild.cc: RecvInit() { // do stuff } |
... | ... |
TODO: is it clear what is going on here?
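The sequence in the table can also be mimicked in a few lines of ordinary C++. The sketch below is not what IPDL generates: it replaces the socket with an in-memory queue, and the names Msg, Channel, ProcessPending and runAsyncDemo are made up for illustration. It only tries to show that SendInit() merely queues a message, and that RecvInit() runs later, when the child side gets around to unpacking it.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Hypothetical message tags; real IPDL generates its own message types.
enum class Msg { Init, Deinit };

// Stand-in for the transport: an in-memory queue instead of a socket or pipe.
using Channel = std::queue<Msg>;

// Mimics the generated parent class: Send* only enqueues and returns.
class PluginProtocolParent {
public:
    explicit PluginProtocolParent(Channel& channel) : mChannel(channel) {}
    void SendInit() { mChannel.push(Msg::Init); }
    void SendDeinit() { mChannel.push(Msg::Deinit); }

private:
    Channel& mChannel;
};

// Mimics the generated child class: unpacks messages and calls Recv*.
class PluginProtocolChild {
public:
    virtual ~PluginProtocolChild() = default;

    // Hypothetical dispatch entry point, driven by the child's event loop.
    void ProcessPending(Channel& channel) {
        while (!channel.empty()) {
            Msg msg = channel.front();
            channel.pop();
            if (msg == Msg::Init) {
                RecvInit();
            } else {
                RecvDeinit();
            }
        }
    }

protected:
    virtual void RecvInit() = 0;
    virtual void RecvDeinit() = 0;
};

// The implementor's subclass: records which handlers ran, in order.
class PluginChild : public PluginProtocolChild {
public:
    std::vector<std::string> log;

protected:
    void RecvInit() override { log.push_back("Init"); }
    void RecvDeinit() override { log.push_back("Deinit"); }
};

// Returns the order in which the child's handlers ran.
std::vector<std::string> runAsyncDemo() {
    Channel channel;
    PluginProtocolParent parent(channel);
    PluginChild child;

    parent.SendInit();    // returns right away; no child handler has run yet
    parent.SendDeinit();
    assert(child.log.empty());

    child.ProcessPending(channel);  // later, the child drains its queue
    return child.log;
}
```

Calling runAsyncDemo() yields the handlers in the order the messages were sent. The point to notice is the gap between "sent" and "handled", which a plain C++ function call does not have.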
Direction
Each message type includes a "direction." The message direction specifies whether the message can be sent from-parent-to-child, from-child-to-parent, or both ways. Three keywords serve as direction specifiers; child was introduced above. The second is parent, which means that the messages declared under the parent label can only be sent from-child-to-parent. The third is both, which means that the declared messages can be sent in both directions. The following artificial example shows how these specifiers are used and how these specifiers change the generated abstract actor classes.
    // this protocol ...
    protocol Direction {
      child:
        Foo();  // can be sent from-parent-to-child
      parent:
        Bar();  // can be sent from-child-to-parent
      both:
        Baz();  // can be sent both ways
    };

    // ... generates these C++ abstract classes ...

    // ----- DirectionProtocolParent.h -----
    class DirectionProtocolParent
    {
    protected:
        virtual void RecvBar() = 0;
        virtual void RecvBaz() = 0;

    public:
        void SendFoo() { /* boilerplate */ }
        void SendBaz() { /* boilerplate */ }
    };

    // ----- DirectionProtocolChild.h -----
    class DirectionProtocolChild
    {
    protected:
        virtual void RecvFoo() = 0;
        virtual void RecvBaz() = 0;

    public:
        void SendBar() { /* boilerplate */ }
        void SendBaz() { /* boilerplate */ }
    };
Syntax note: you can use the child, parent, and both specifiers multiple times in a protocol specification. They behave like public, protected, and private labels in C++.
Parameters
Message declarations allow any number of parameters, which are data serialized by the sender and deserialized by the receiver. The following snippet of IPDL and generated code shows how parameters can be used in message declarations.
    // protocol Blah { ...
      child:
        Foo(int parameter);

    // class BlahProtocolParent { ...
        void SendFoo(const int& parameter)
        {
            // boilerplate
        }

    // class BlahProtocolChild { ...
        virtual void RecvFoo(const int& parameter) = 0;
The IPDL compiler has a set of "builtin" types for which library code exists to serialize and deserialize parameters of that type. These builtin types include the C/C++ integer types (bool, char, int, ..., int8_t, uint16_t, ...), a string type (String in IPDL), and some array types (StringArray, for example). These builtin types are in flux and will certainly be expanded. The most up-to-date reference for builtins is the file ipc/glue/MessageTypes.h.
The builtin types are insufficient for all protocols. When you need to send data of type other than one built into IPDL, you can add a using declaration of the type in an IPDL specification, and in C++ define your own data serializer and deserializer. The details of this are beyond the scope of this document; see dom/plugins/NPAPI.ipdl and dom/plugins/PluginMessageUtils.h for examples of how this is done.
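To give a rough feel for what "define your own data serializer and deserializer" means, here is a minimal sketch. It does not use the real IPDL serialization hooks (see the files named above for those). WindowRect, Bytes, writeField, readField, serialize and deserialize are hypothetical names, and the byte layout assumes both processes run on the same machine with the same endianness.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical custom type standing in for something like a plugin window.
struct WindowRect {
    std::int32_t x;
    std::int32_t y;
    std::uint32_t width;
    std::uint32_t height;
};

using Bytes = std::vector<std::uint8_t>;

// Append the raw bytes of a fixed-size value to the buffer.
template <typename T>
void writeField(Bytes& out, const T& value) {
    const auto* p = reinterpret_cast<const std::uint8_t*>(&value);
    out.insert(out.end(), p, p + sizeof(T));
}

// Read a fixed-size value at `offset`; fails if the buffer is too short.
template <typename T>
bool readField(const Bytes& in, std::size_t& offset, T* value) {
    if (in.size() - offset < sizeof(T)) {
        return false;
    }
    std::memcpy(value, in.data() + offset, sizeof(T));
    offset += sizeof(T);
    return true;
}

// Sender side: flatten the struct field by field.
void serialize(Bytes& out, const WindowRect& r) {
    writeField(out, r.x);
    writeField(out, r.y);
    writeField(out, r.width);
    writeField(out, r.height);
}

// Receiver side: rebuild the struct, rejecting truncated input.
bool deserialize(const Bytes& in, WindowRect* r) {
    std::size_t offset = 0;
    return readField(in, offset, &r->x) &&
           readField(in, offset, &r->y) &&
           readField(in, offset, &r->width) &&
           readField(in, offset, &r->height);
}
```

The receiver must never trust the incoming bytes, which is why deserialize() reports failure instead of reading past the end of a short buffer.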
Semantics
Note that in all the IPDL message declarations above, the generated C++ methods corresponding to those messages always had the return type void. What if we wanted to return values from message handlers, in addition to sending parameters in the messages? If we thought of IPDL messages as C++ functions, this would be very natural. However, "returning" values from message handlers is much different from returning values from function calls. Please don't get them mixed up!
In the Plugin protocol example above, we saw the parent actor send the Init() message to the child actor. Under the covers, "sending the Init() message" encompasses code on the parent side creating some Init()-like message object in C++, serializing that message object into a sequence of bytes, and then sending those bytes over a socket to the child actor. (All this is hidden from the C++ implementor, PluginParent above. You don't have to worry about this except to better understand message semantics.) What happens in the parent-side code once it has written those bytes to the socket's file descriptor? It is this behavior that we broadly call message semantics.
In IPDL, the parent-side code has three options for what happens after those serialized bytes are written to the socket:
- Continue executing. We call this asynchronous semantics; the parent is not blocked.
- Wait until the child acknowledges that it received the message. We call this synchronous semantics, as the parent blocks until the child receives the message and sends back a reply.
- The third option is more complicated and will be introduced below, after another example.
In the Plugin protocol example above, we might want the Init() message to be synchronous. It may not make sense for the browser to continue executing until it knows whether the plugin was successfully initialized. (We will discuss which semantics, asynchronous or synchronous, is preferred in a later section.) So for this particular case, let's extend the Plugin protocol's Init() message to be synchronous and return an error code: 0 if the plugin was initialized successfully, non-zero if not. Here's a first attempt at that:

    protocol Plugin {
      child:
        sync Init() returns (int rv);
        Deinit();
    };
We added two new keywords to the Plugin protocol, sync and returns. sync marks a message as being sent synchronously; note that the Deinit() message has no specifier. The default semantics is asynchronous. The returns keyword marks the beginning of the list of values that are returned in the reply to the message. (It is a type error to add a returns block to an asynchronous message.)
Detour: the above protocol will fail the IPDL type checker. Why? IPDL protocols also have "semantics specifiers", just like messages. A protocol must be declared to have semantics at least as "strong" as its strongest message semantics. Synchronous semantics is called "stronger than" asynchronous. Like message declarations, the default protocol semantics is asynchronous; however, since the Plugin protocol declares a synchronous message, this type rule is violated. The fixed up Plugin protocol is shown below.
    sync protocol Plugin {
      child:
        sync Init() returns (int rv);
        Deinit();
    };
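The type rule from the detour above ("a protocol must be at least as strong as its strongest message") is easy to state in code. The sketch below is only an illustration of the rule, not the IPDL type checker. Semantics and protocolTypeChecks are hypothetical names, and ranking RPC above sync is an assumption, since this guide only states that sync is stronger than async.

```cpp
#include <cassert>
#include <vector>

// Ordered weakest to strongest. Placing Rpc above Sync is an assumption.
enum class Semantics { Async = 0, Sync = 1, Rpc = 2 };

// A protocol type-checks only if no message is stronger than the protocol.
bool protocolTypeChecks(Semantics protocol,
                        const std::vector<Semantics>& messages) {
    for (Semantics m : messages) {
        if (static_cast<int>(m) > static_cast<int>(protocol)) {
            return false;
        }
    }
    return true;
}
```

For the Plugin protocol, the message list is {Sync, Async} for Init() and Deinit(): the check fails for a protocol left at the default Async and passes once the protocol is declared Sync.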
This new sync message with returns values changes the generated PluginProtocolParent and PluginProtocolChild headers as follows:

    // class PluginProtocolParent { ...
        int SendInit() { /* boilerplate */ }

    // class PluginProtocolChild { ...
        virtual int RecvInit() = 0;
To the parent implementor, the new SendInit() method signature means that it receives an int return code back from the child. To the child implementor, the new RecvInit() method signature means that it must return an int back from the handler, signifying whether the plugin was initialized successfully.
Important: To reiterate, after the parent code calls SendInit(), it will block the parent actor's thread until the response to this message is received from the child actor. On the child side, the int returned by child implementor from RecvInit() is packed into the response to Init(), then this response is sent back to the parent. Once the parent reads the bytes of response message from its socket, it deserializes the int return value and unblocks the parent actor's thread, returning that deserialized int to the caller of SendInit(). It is very important to grok this sequence of events.
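The blocking behaviour described above can be imitated with two threads in one process. This is a sketch under loud assumptions: a real IPDL channel crosses a process boundary and serializes bytes, whereas here a std::promise carries the reply directly. InitRequest, Channel, SendInitSync and runSyncDemo are hypothetical names, not generated code.

```cpp
#include <cassert>
#include <condition_variable>
#include <future>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical in-process "message": carries the slot for the reply.
struct InitRequest {
    std::promise<int> reply;
};

// Stand-in for the socket: a thread-safe queue of pending requests.
class Channel {
public:
    void push(InitRequest req) {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mQueue.push(std::move(req));
        }
        mCond.notify_one();
    }

    // Blocks the calling (child) thread until a request arrives.
    InitRequest pop() {
        std::unique_lock<std::mutex> lock(mMutex);
        mCond.wait(lock, [this] { return !mQueue.empty(); });
        InitRequest req = std::move(mQueue.front());
        mQueue.pop();
        return req;
    }

private:
    std::mutex mMutex;
    std::condition_variable mCond;
    std::queue<InitRequest> mQueue;
};

// Parent side: send the request, then block until the reply is available.
int SendInitSync(Channel& channel) {
    InitRequest req;
    std::future<int> reply = req.reply.get_future();
    channel.push(std::move(req));
    return reply.get();  // the parent thread is blocked here
}

// Runs one round trip; childStatus is what the child's handler "returns".
int runSyncDemo(int childStatus) {
    Channel channel;
    std::thread child([&channel, childStatus] {
        InitRequest req = channel.pop();   // child receives Init()
        req.reply.set_value(childStatus);  // reply unblocks the parent
    });
    int rv = SendInitSync(channel);
    child.join();
    return rv;
}
```

The value handed back by runSyncDemo() is whatever the child placed in its reply, and the parent cannot observe it any earlier than that, which is exactly why sync messages can hurt responsiveness.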
Implementation detail: IPDL supports multiple returns values, such as in the following example message declaration:

    // sync protocol Blah {
      child:
        sync Foo(int param1, char param2) returns (long ret1, int64_t ret2);
C++ does not have syntax that allows returning multiple values from functions/methods. Additionally, IPDL needs to allow implementor code to signal error conditions, and IPDL itself needs to notify implementors of errors. To those ends, IPDL generates interface methods with nsresult return types and "outparams" for the IPDL returns values. So in reality, the C++ interface generated for the Blah example above would be:

    // class BlahProtocolParent { ...
        nsresult SendFoo(const int& param1, const char& param2,
                         long* ret1, int64_t* ret2)
        { /* boilerplate */ }

    // class BlahProtocolChild { ...
        virtual nsresult RecvFoo(const int& param1, const char& param2,
                                 long* ret1, int64_t* ret2) = 0;
And the actual interface generated for the Plugin protocol is:

    // class PluginProtocolParent { ...
        nsresult SendInit(int* rv) { /* boilerplate */ }

    // class PluginProtocolChild { ...
        virtual nsresult RecvInit(int* rv) = 0;
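With this shape, a caller has two things to check: the nsresult, and the value delivered through the outparam. The sketch below shows one plausible way calling code might separate the two. It assumes the nsresult reports whether the message round trip itself worked, while rv carries the plugin's own status. The nsresult typedef and constants are local stand-ins so the example compiles by itself, and FakePluginParent, InitOutcome and initPlugin are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-ins so the sketch is self-contained; real code would use
// Mozilla's own nsresult definitions instead.
using nsresult = std::uint32_t;
constexpr nsresult NS_OK = 0;
constexpr nsresult NS_ERROR_FAILURE = 0x80004005u;

// Hypothetical fake of the parent-side interface, for illustration only.
class FakePluginParent {
public:
    FakePluginParent(bool channelOpen, int childRv)
        : mChannelOpen(channelOpen), mChildRv(childRv) {}

    // Same shape as the generated SendInit(int* rv).
    nsresult SendInit(int* rv) {
        if (!mChannelOpen) {
            return NS_ERROR_FAILURE;  // *rv is left untouched on failure
        }
        *rv = mChildRv;               // value "returned" by the child
        return NS_OK;
    }

private:
    bool mChannelOpen;
    int mChildRv;
};

enum class InitOutcome { Ok, PluginRefused, ChannelError };

// Check the messaging result first, then the plugin's own return code.
InitOutcome initPlugin(FakePluginParent& parent) {
    int rv = 0;
    if (parent.SendInit(&rv) != NS_OK) {
        return InitOutcome::ChannelError;
    }
    return rv == 0 ? InitOutcome::Ok : InitOutcome::PluginRefused;
}
```

Reading the outparam only after the nsresult check matters, because on failure nothing meaningful has been written to it.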
RPC semantics
"RPC" stands for "remote procedure call," and this third semantics models procedure call semantics. A quick summary of the difference between RPC and sync semantics is that RPC allows "re-entrant" message handlers --- that is, while an actor is blocked waiting for an "answer" to an RPC "call", it can be unblocked to handle a new, incoming RPC call.
In the example protocol below, the child actor offers a "CallMeCallYou()" RPC interface, and the parent offers a "CallYou()" RPC interface. The rpc qualifiers mean that if the parent calls "CallMeCallYou()" on the child actor, then the child actor, while servicing this call, is allowed to call back into the parent actor's "CallYou()" message.
    rpc protocol Example {
      child:
        rpc CallMeCallYou() returns (int rv);

      parent:
        rpc CallYou() returns (int rv);
    };
If this were instead a sync protocol, the child actor would not be allowed to call the parent actor's "CallYou()" method while servicing the "CallMeCallYou()" message. (The child actor would be terminated with extreme prejudice.)
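The nesting can be modelled in a single thread if we let direct function calls stand in for messages. This is a deliberately crude sketch: ParentActor, ChildActor, the Call*/Answer* method names and runNestedCall are all hypothetical, and throwing an exception is only a stand-in for whatever the real error handling does when a sync protocol is violated.

```cpp
#include <cassert>
#include <stdexcept>

class ParentActor;

// Child side: answering CallMeCallYou() requires calling back into the parent.
class ChildActor {
public:
    explicit ChildActor(ParentActor& parent) : mParent(parent) {}
    int AnswerCallMeCallYou();

private:
    ParentActor& mParent;
};

class ParentActor {
public:
    explicit ParentActor(bool reentrancyAllowed)
        : mReentrancyAllowed(reentrancyAllowed) {}

    void attach(ChildActor* child) { mChild = child; }

    // The parent is "blocked" for the duration of this call.
    int CallCallMeCallYou() {
        ++mOutstandingCalls;
        int rv = mChild->AnswerCallMeCallYou();
        --mOutstandingCalls;
        return rv;
    }

    // Incoming call from the child; may arrive while the parent is blocked.
    int AnswerCallYou() {
        if (mOutstandingCalls > 0 && !mReentrancyAllowed) {
            throw std::runtime_error("re-entrant call not allowed");
        }
        return 42;
    }

private:
    bool mReentrancyAllowed;
    int mOutstandingCalls = 0;
    ChildActor* mChild = nullptr;
};

// While servicing the parent's call, the child calls back into the parent.
inline int ChildActor::AnswerCallMeCallYou() {
    return mParent.AnswerCallYou() + 1;
}

// rpcProtocol == true models the rpc protocol; false models a sync one.
int runNestedCall(bool rpcProtocol) {
    ParentActor parent(rpcProtocol);
    ChildActor child(parent);
    parent.attach(&child);
    return parent.CallCallMeCallYou();
}
```

With re-entrancy allowed, the parent's AnswerCallYou() runs while its own CallCallMeCallYou() is still outstanding. That is precisely the situation in which the parent's state must already be consistent.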
TODO sequence diagram if this explanation is unclear
Preferred semantics
Use async semantics whenever possible. Asynchronous messaging in C++ has a bad rap because of the complexity and verbosity of implementing asynchronous "protocols" in C++ with event loops and massive "switch" statements. IPDL is specifically designed to make this easier and more concise.
Blocking on replies to messages is discouraged. If you absolutely need to block on a reply, use sync semantics very carefully. It is possible to get into trouble with careless uses of synchronous messages; while IPDL can (or will eventually) check and/or guarantee that your code does not deadlock, it is easy to cause nasty performance problems by blocking.
Please don't use RPC semantics. RPC inherits (nearly) all the problems of sync messages, while adding more of its own. Every time your code makes an RPC call, it must ensure that its state is "consistent" enough to handle every re-entrant (nested) call allowed by the IPDL state machine. This is hard to get right. (Why then is RPC semantics supported? Basically because of NPAPI plugins, where we can't change their existing code.)
Checkpoint 1
By this point in the guide, you should understand
- basically what an IPDL protocol is
- what an IPDL actor is, basically
- what an IPDL message is, basically
- message directions
- message semantics
- the relationship between IPDL and its generated interface; i.e., what your C++ code must implement
Protocol management
So far we've seen a protocol that Plugin actors use to communicate. Plugins are "singletons"; there's only one copy of libflash.so open at any time. However, there are many plugin instances. (This is a very general pattern.) IPDL needs to somehow support these "instances", and it does so through "managed" protocols, a.k.a. sub-protocols (the terms will be used interchangeably). A sub-protocol is bound to a "manager" protocol, and actors created speaking that sub-protocol are bound to the lifetime of the "manager" protocol actor that created them. The managing protocol acts like a "factory" for actors of the sub-protocol. In general, IPDL supports hierarchies of protocols, descending from a single top-level protocol.
The following example extends the Plugin protocol to manage a PluginInstance protocol. The Plugin protocol is top-level in this example.
    // ----- file Plugin.ipdl

    include protocol "PluginInstance.ipdl";

    sync protocol Plugin {
        manages PluginInstance;

      child:
        sync Init() returns (int rv);
        Deinit();

        sync PluginInstance(String type, StringArray args) returns (int rv);
        ~PluginInstance();
    };
We added four new chunks of syntax. The first is a "protocol include," which is different from a C++ include: IPDL implicitly defines "include guards" for protocol specifications, and it actually reads and parses included protocol files. IPDL also allows C++ headers to be included, but these are completely transparent to IPDL (see PluginInstance.ipdl below).
The second new statement is a "manages statement." Here we declare that the Plugin protocol manages the PluginInstance protocol (which is defined in the included "PluginInstance.ipdl" specification). This means that the Plugin protocol must provide a "constructor" and "destructor" for PluginInstance actors; these are akin to factory methods. The "manages" statement also means that PluginInstance actors are tied to the lifetime of the Plugin actor that creates them --- that is, once the Plugin actor dies, all the PluginInstances become invalid.
The third and fourth new statements are "constructor" and "destructor" message declarations. The syntax was borrowed from C++, but the semantics are very different. First of all, constructors and destructors have a direction, like other IPDL messages. They also can have arbitrary parameters and return values, and can additionally have any IPDL messaging semantics. For now, you'll have to take our word that this is useful. The reason for constructors and destructors behaving like other messages will be explained below.
Constructors and destructors turn into C++ interface methods that are almost the same as other IPDL interface methods generated for message declarations, with one notable exception: the implementing C++ class must provide actual methods for allocating and deallocating the sub-protocol actors. This is shown below.
    //class PluginParent { ...
        // NOTE: these two methods are basically the same for PluginParent and PluginChild
        virtual PluginInstanceParent* PluginInstanceConstructor(
            const String& type,
            const StringArray& args,
            int* rv) = 0;
        virtual nsresult PluginInstanceDestructor(PluginInstanceParent* __a) = 0;

        // NOTE: PluginChild does not have "Recv*" analogues of these methods
        PluginInstanceParent* SendPluginInstanceConstructor(
            const String& type,
            const StringArray& args,
            int* rv)
        { /* boilerplate code */ }
        nsresult SendPluginInstanceDestructor(PluginInstanceParent* __a)
        { /* boilerplate code */ }

    //class PluginChild {...
        // NOTE: these two methods are basically the same for PluginParent and PluginChild
        virtual PluginInstanceChild* PluginInstanceConstructor(
            const String& type,
            const StringArray& args,
            int* rv) = 0;
        virtual nsresult PluginInstanceDestructor(PluginInstanceChild* __a) = 0;
Let's break apart this example quickly. Both the PluginParent and PluginChild have to be able to allocate and deallocate raw PluginInstances; IPDL requires your C++ code to do this by implementing the PluginInstanceConstructor/PluginInstanceDestructor interface. (Your code must do this, because IPDL has no idea what concrete classes you'll have implementing the PluginInstance abstract classes, and it's best that IPDL does not know this.)
Different from this raw C++ allocation/deallocation is the protocol-level allocation/deallocation, done by actors sending each other constructor/destructor messages. This is the second component of the C++ interface above; note that the interface generated for ctor/dtor messages is basically identical to that generated for other messages. (There's one exception: the PluginChild does not need to implement "RecvPluginInstanceConstructor()" and "RecvPluginInstanceDestructor()" handlers. IPDL knows what this code should do and thus generates it for you.) The real "secret sauce" of IPDL ctors/dtors is that the generated code is smart enough to know when to invoke the C++-level allocation/deallocation "factory methods." Your code need not worry about this.
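The split between "your code allocates" and "generated code decides when" can be sketched as follows. This is a simplified model, not the generated interface: InstanceActor, ManagerActor, AllocInstance, MyInstance, MyManager and liveAfterManagerDies are hypothetical names, there is no messaging at all, and std::unique_ptr takes the place of the separate deallocation factory method to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Mimics a generated sub-protocol base class.
class InstanceActor {
public:
    virtual ~InstanceActor() = default;
};

// Mimics a generated manager: it decides when to call the factory method
// and owns every managed actor it created.
class ManagerActor {
public:
    virtual ~ManagerActor() = default;

    // Protocol-level construction; calls the implementor's factory method.
    InstanceActor* SendInstanceConstructor() {
        mManaged.push_back(std::unique_ptr<InstanceActor>(AllocInstance()));
        return mManaged.back().get();
    }

    // Protocol-level destruction of one managed actor.
    void SendInstanceDestructor(InstanceActor* actor) {
        for (auto it = mManaged.begin(); it != mManaged.end(); ++it) {
            if (it->get() == actor) {
                mManaged.erase(it);
                return;
            }
        }
    }

    std::size_t managedCount() const { return mManaged.size(); }

protected:
    // The implementor supplies the concrete class; the manager never names it.
    virtual InstanceActor* AllocInstance() = 0;

private:
    std::vector<std::unique_ptr<InstanceActor>> mManaged;
};

int gLiveInstances = 0;  // demo-only counter of live concrete instances

class MyInstance : public InstanceActor {
public:
    MyInstance() { ++gLiveInstances; }
    ~MyInstance() override { --gLiveInstances; }
};

class MyManager : public ManagerActor {
protected:
    InstanceActor* AllocInstance() override { return new MyInstance(); }
};

// Creates two instances, destroys one explicitly, then lets the manager die.
int liveAfterManagerDies() {
    {
        MyManager manager;
        InstanceActor* first = manager.SendInstanceConstructor();
        manager.SendInstanceConstructor();
        manager.SendInstanceDestructor(first);  // explicit dtor message
        assert(gLiveInstances == 1);
    }  // the manager dies here and takes the remaining instance with it
    return gLiveInstances;
}
```

The second instance is never destroyed explicitly, yet it does not outlive its manager, mirroring the rule that managed actors are bound to the lifetime of the actor that created them.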
Next, let's take a look at the PluginInstance sub-protocol.
    // ----- file PluginInstance.ipdl

    include "mozilla/plugins/PluginTypes.h"
    using mozilla::plugins::PluginWindow;

    sync protocol PluginInstance {
        manager Plugin;

      child:
        SetWindow(PluginWindow window);
        Paint();

      parent:
        sync GetBrowserValue(String key) returns (String value);
    };

This protocol doesn't introduce any wildly new features, just three new bits of syntax. First, PluginInstance.ipdl contains a "C++ include." This is transparent to IPDL, and is merely passed through to generated C++ code (i.e., IPDL doesn't even care if the file exists). The second bit of new syntax is the "using statement," which pulls a non-builtin C++ type into IPDL's type system. This statement is modeled after C++'s using statement and should hopefully be familiar.
The last bit of new syntax is the "manager statement." This tells IPDL that the PluginInstance protocol is managed by the Plugin protocol. (Yes, "manager" is redundant. It exists for a variety of uninteresting reasons.)
At this point, it's worth noting two more facts about protocol management: first, to reiterate, the PluginInstance protocol can manage its own protocols. For example, in real life, PluginInstances manage "PluginScriptObject" and "PluginStream" protocols. The second bit of trivia is an implementation detail: in the current IPDL implementation, there is exactly one top-level protocol per underlying socket. However, this might change.
Protocol state machines
The astute reader might question why IPDL includes the word "protocol" when all that has been introduced so far are unstructured grab-bags of messages. IPDL allows protocol authors to define the order and structure of how messages may be sent/received by defining protocol state machines (finite state machines).
IPDL parent and child actors follow the same state machine, and keep states that are the "same" (though the parent and child states may be momentarily out of sync while messages cross the wire). IPDL arbitrarily requires state machines to be defined from the perspective of the parent side of the protocol. For example, when you see the send Msg syntax, it means "when the parent actor sends Msg."
The following example shows one such state machine for the Plugin protocol.
    include protocol "PluginInstance.ipdl";

    sync protocol Plugin {
        manages PluginInstance;

      child:
        sync Init() returns (int rv);
        Deinit();

        sync PluginInstance(String type, StringArray args) returns (int rv);
        ~PluginInstance();

        // NOTE: state machine follows
      state START:
        send Init goto IDLE;

      state IDLE:
        send PluginInstance goto ACTIVE;

      state ACTIVE:
        send PluginInstance goto ACTIVE;
        send ~PluginInstance goto ACTIVE;
        send Deinit goto DYING;

      state DYING:
        send ~PluginInstance goto DYING;
    };
The new syntax here is threefold. First are "state declarations" --- state FOO: declares a state "FOO". (States are capitalized by convention, not because of syntactic rules.) The first state to be declared is the protocol's "start state"; when an actor is created, its initial state is the "start state."
The second new syntax is send MsgDecl, which defines a trigger for a state transition; in this case, the trigger is sending the async or sync message "MsgDecl." The other triggers available are (i) recving an async or sync message; (ii) calling an RPC; and (iii) answering an RPC.
Aside: this is why actor ctors/dtors act like normal messages, with directions etc.: this allows them to be checked against the protocol state machine like any other message.
The third new syntax is goto NEXT_STATE, a state transition. When the trigger preceding this transition occurs, the protocol actor's internal state is changed to, in this case, "NEXT_STATE."
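To see how triggers and transitions fit together, the Plugin state machine above can be transcribed by hand into a small transition function. IPDL generates and runs the real checks for you. State, Trigger, next and run are hypothetical names, and the extra Error state is an invention of this sketch that stands for "the message violated the state machine."

```cpp
#include <cassert>
#include <initializer_list>

// States of the Plugin protocol, plus a sketch-only Error state.
enum class State { Start, Idle, Active, Dying, Error };

// Triggers, all from the parent's perspective ("send Msg").
enum class Trigger { SendInit, SendDeinit, SendCtor, SendDtor };

// One step of the state machine: returns the next state, or Error if the
// trigger is not allowed in the current state.
State next(State s, Trigger t) {
    switch (s) {
    case State::Start:
        if (t == Trigger::SendInit) return State::Idle;
        break;
    case State::Idle:
        if (t == Trigger::SendCtor) return State::Active;
        break;
    case State::Active:
        if (t == Trigger::SendCtor || t == Trigger::SendDtor) {
            return State::Active;
        }
        if (t == Trigger::SendDeinit) return State::Dying;
        break;
    case State::Dying:
        if (t == Trigger::SendDtor) return State::Dying;
        break;
    case State::Error:
        break;
    }
    return State::Error;
}

// Feed a whole sequence of triggers through the machine from the start state.
State run(std::initializer_list<Trigger> triggers) {
    State s = State::Start;
    for (Trigger t : triggers) {
        s = next(s, t);
    }
    return s;
}
```

Running Init, PluginInstance, ~PluginInstance, Deinit through run() ends in Dying, while sending the PluginInstance constructor before Init lands in Error. Note that, as written, the machine also rejects Deinit in the IDLE state, since no such transition is declared there.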
Another example state machine, for PluginInstance, follows.
    sync protocol PluginInstance {
        manager Plugin;

      child:
        SetWindow(PluginWindow window);
        Paint();

      parent:
        sync GetBrowserValue(String key) returns (String value);

      state START:
        send SetWindow goto SENT_WINDOW;
        recv GetBrowserValue goto START;

      state SENT_WINDOW:
        send SetWindow goto SENT_WINDOW;
        send Paint goto SENT_WINDOW;
        recv GetBrowserValue goto SENT_WINDOW;
    };
A few additional notes:
- Protocol state machines are optional, but strongly encouraged. Simple state machines are useful too!
- All actor states, trigger matching, and transitions are managed by IPDL-generated code. Your C++ code never sees this.
- All messages sent and received are checked against the protocol's state machine. If a message violates the state machine, generic error handling code is invoked; this will probably mean that the child process containing the child actor is terminated with extreme prejudice, and all parent actors are made invalid.
- Lots of syntactic sugar is possible for state machine definitions. Ping the Electrolysis team if you have good proposals.