IPDL/Getting started
IPDL is a domain-specific language that allows programmers to define a message-passing "protocol" between "actors." These "actors" are thread contexts, which can execute either in separate address spaces or within the same address space (shared-nothing threads).
"Protocols" consist of two elements: declarations of messages that can be exchanged between two actors, and the definition of a state machine that describes when each message is allowed to be sent.
From an IPDL specification, several C++ headers are generated. These headers are meant to be opaque to the author of an IPDL specification; internally, they manage the tedious details of setting up and tearing down the underlying communication layer (sockets and pipes), constructing and sending messages, ensuring that all actors adhere to their specifications, and "correctly" handling errors.
This guide intends to introduce the basic concepts of IPDL through an increasingly complicated example. By the end of the guide, you should be able to write IPDL specs and the C++ implementations of message handlers. This guide does not attempt to cover how IPDL works under the covers.
Running example: Browser plugins
We will use the example of a web browser launching plugins in separate processes and then controlling them. A plugin here is a dynamically loaded code module, such as libflash.so. Once a plugin module has been loaded, the browser can ask the plugin module to create instances. A plugin instance is what lives inside an object frame on a particular web page; a single youtube video is a Flash plugin instance, for example. There can be any number of plugin instances per plugin module. (There can be any number of youtube videos open in a web browser.)
Plugin instances are scriptable, which means that they can access JavaScript objects in the browser, and the browser can access JavaScript objects created by the plugin instance. Using the youtube Flash video example again, the youtube instance can create a JavaScript object that represents the video player, and the browser can access that video player object and ask it to "Pause," "Play," etc. There can be any number of script objects per plugin instance.
Protocols and actors
Protocols define how two actors communicate. We'll introduce protocols and actors using the plugin modules described above. Recall that in this example, we have two processes: the browser process and the process in which the plugin module's code executes. There are two actors: the thread context in which the browser's plugin management code executes (in the browser process), and the thread on which the plugin's code executes (in the plugin process).
We have chosen to codify the concept of parent and child actors in IPDL; we use "parent actor" to refer to the "more trusted" actor, and "child actor" to refer to the "less trusted" actor. In the case of plugins, the browser actor is the parent, and the plugin actor is the child.
The parent and child actors communicate by sending messages to each other. The messages that can be exchanged are explicitly declared in IPDL. The following IPDL code defines a very basic interaction between the browser and plugin actors:
    protocol Plugin {
    child:
        Init();
        Deinit();
    };
This code defines a Plugin protocol. On the next three lines, the code declares two messages, Init() and Deinit(). We will describe the messages in more detail in the next section. To finish the introduction of protocols and actors, note the child keyword used on the 2nd line. This keyword says that child "accepts", or "implements", the messages declared under that "child" label. That is, the parent actor can send those messages to the child, but the child cannot send those messages to the parent.
Basic messages
It is very important to understand message semantics. It's tempting to think of protocol messages as C++ function calls, but that's not very useful. We'll delve into gory details of the above example to illustrate these semantics. To keep the example as concrete as possible, we first need to take a detour into how the above example is translated into something that C++ code can utilize.
The above specification will generate three headers: PluginProtocolParent.h, PluginProtocolChild.h, and PluginProtocol.h (we'll ignore the third completely; it's full of uninteresting implementation details). As you might guess, PluginProtocolParent.h defines a C++ class (PluginProtocolParent) for use by code on the parent side, which is the browser in this example. Similarly, code running in the plugin process will use the PluginProtocolChild class.
These Parent and Child classes are abstract. They will contain a "Send" function implementation for each message type in the protocol (which inheriting classes can call to initiate communication), and a pure virtual "Recv" function for the message receiver (which needs to be implemented by the inheriting class). For example, the PluginProtocolParent class will look something like the following in C++:
    class PluginProtocolParent {
    public:
        void SendInit() {
            // boilerplate generated code ...
        }

        void SendDeinit() {
            // boilerplate generated code ...
        }
    };
and the PluginProtocolChild will be something like:
    class PluginProtocolChild {
    protected:
        virtual void RecvInit() = 0;
        virtual void RecvDeinit() = 0;
    };
These Parent and Child abstract classes take care of all the "protocol layer" concerns; sending messages, checking protocol safety (we'll discuss that later), and so forth. However, these abstract classes can do nothing but send and receive messages; they don't actually do anything, like draw to the screen or write to files. It's up to implementing C++ code to actually do interesting things with the messages.
So these abstract Parent and Child classes are meant to be subclassed by C++ implementors. Below is a dirt simple example of how a browser implementor might utilize the PluginProtocolParent:
    class PluginParent : public PluginProtocolParent {
    public:
        PluginParent(string pluginDsoFile) {
            // launch child plugin process
        }
    };
This is a boring class. It simply launches the plugin child process. Note that since PluginParent inherits from PluginProtocolParent, the browser code can invoke the SendInit() and SendDeinit() methods on PluginParent objects. We'll show an example of this below.
Here's how the PluginProtocolChild might be used by a C++ implementor in the plugin process:
    class PluginChild : public PluginProtocolChild {
    protected:
        // implement the PluginProtocolChild "interface"
        void RecvInit() {
            printf("Init() message received\n");
            // initialize the plugin module
        }

        void RecvDeinit() {
            printf("Deinit() message received\n");
            // deinitialize the plugin module
        }

    public:
        PluginChild(string pluginDsoFile) {
            // load the plugin DSO
        }
    };
The PluginChild is more interesting: it implements the "message handlers" RecvInit() and RecvDeinit() that will be automatically invoked by the protocol code when the plugin process receives the Init() and Deinit() messages, respectively.
Let's run through an example of how this dirt simple protocol and its implementation could be used. In the table below, the first column shows C++ statements executing in the browser process, and the second shows C++ statements executing in the plugin process.
Browser process | Plugin process |
---|---|
BrowserMain.cc: PluginParent* pp = new PluginParent("libflash.so"); | |
(do browser stuff) | plugin process is created |
... | PluginMain.cc: PluginChild* pc = new PluginChild("libflash.so"); |
BrowserMain.cc: pp->SendInit(); | (spin event loop) |
PluginProtocolParent.h: (construct Init() message, send it) | ... |
(spin event loop) | PluginProtocolChild.h: (unpack Init() message, call RecvInit();) |
... | PluginChild.cc: RecvInit() { // do stuff } |
... | ... |
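
The walkthrough in the table can also be simulated inside a single process. The sketch below is not the generated code: it assumes a plain in-memory queue (the hypothetical `Channel`) in place of the real socket or pipe, and a hypothetical `ProcessNextMessage()` method standing in for one turn of the plugin's event loop. What it is meant to show is that `SendInit()` returns as soon as the message is queued, and that `RecvInit()` only runs later, when the child gets around to processing the message.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Hypothetical stand-in for the real transport (a socket or pipe).
using Channel = std::queue<std::string>;

// Simplified picture of the generated parent class: Send* only enqueues.
class PluginProtocolParent {
public:
    explicit PluginProtocolParent(Channel& channel) : mChannel(channel) {}
    void SendInit()   { mChannel.push("Init"); }
    void SendDeinit() { mChannel.push("Deinit"); }

private:
    Channel& mChannel;
};

// Simplified picture of the generated child class: unpack, then dispatch.
class PluginProtocolChild {
public:
    explicit PluginProtocolChild(Channel& channel) : mChannel(channel) {}
    virtual ~PluginProtocolChild() = default;

    // One turn of the child's event loop. Returns false if nothing is pending.
    bool ProcessNextMessage() {
        if (mChannel.empty()) {
            return false;
        }
        std::string msg = mChannel.front();
        mChannel.pop();
        if (msg == "Init") {
            RecvInit();
        } else if (msg == "Deinit") {
            RecvDeinit();
        } else {
            assert(false && "message not declared in the protocol");
        }
        return true;
    }

protected:
    virtual void RecvInit() = 0;
    virtual void RecvDeinit() = 0;

private:
    Channel& mChannel;
};

// Implementor-side subclass that records which handlers have run.
class PluginChild : public PluginProtocolChild {
public:
    using PluginProtocolChild::PluginProtocolChild;
    std::vector<std::string> log;

protected:
    void RecvInit() override   { log.push_back("RecvInit"); }
    void RecvDeinit() override { log.push_back("RecvDeinit"); }
};
```

To use it, construct one `Channel`, hand it to both a `PluginProtocolParent` and a `PluginChild`, call `SendInit()` on the parent, and then call `ProcessNextMessage()` on the child. Until that last call, the child's `log` stays empty, which is the asynchronous behavior the table illustrates.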
Direction
Each message type includes a "direction." The message direction specifies whether the message can be sent from-parent-to-child, from-child-to-parent, or both ways. Three keywords serve as direction specifiers; child was introduced above. The second is parent, which means that the messages declared under the parent label can only be sent from-child-to-parent. The third is both, which means that the declared messages can be sent in both directions. The following artificial example shows how these specifiers are used and how these specifiers change the generated abstract actor classes.
    // this protocol ...
    protocol Direction {
    child:
        Foo();  // can be sent from-parent-to-child
    parent:
        Bar();  // can be sent from-child-to-parent
    both:
        Baz();  // can be sent both ways
    };

    // ... generates these C++ abstract classes ...

    // ----- DirectionProtocolParent.h -----
    class DirectionProtocolParent {
    protected:
        virtual void RecvBar() = 0;
        virtual void RecvBaz() = 0;

    public:
        void SendFoo() { /* boilerplate */ }
        void SendBaz() { /* boilerplate */ }
    };

    // ----- DirectionProtocolChild.h -----
    class DirectionProtocolChild {
    protected:
        virtual void RecvFoo() = 0;
        virtual void RecvBaz() = 0;

    public:
        void SendBar() { /* boilerplate */ }
        void SendBaz() { /* boilerplate */ }
    };
Syntax note: you can use the child, parent, and both specifiers multiple times in a protocol specification. They behave like public, protected, and private labels in C++.
Parameters
Message declarations allow any number of parameters, which are data serialized by the sender and deserialized by the receiver. The following snippet of IPDL and generated code shows how parameters can be used in message declarations.
    // protocol Blah { ...
    child:
        Foo(int parameter);

    // class BlahProtocolParent { ...
        void SendFoo(const int& parameter) {
            // boilerplate
        }

    // class BlahProtocolChild { ...
        virtual void RecvFoo(const int& parameter) = 0;
The IPDL compiler has a set of "builtin" types for which library code exists to serialize and deserialize parameters of that type. These builtin types include the C/C++ integer types (bool, char, int, ..., int8_t, uint16_t, ...), a string type (String in IPDL), and some array types (StringArray, for example). These builtin types are in flux and will certainly be expanded. The most up-to-date reference for builtins is the file ipc/glue/MessageTypes.h.
The builtin types are not sufficient for every protocol. When you need to send data of a type that is not built into IPDL, you can add a using declaration of the type in an IPDL specification, and in C++ define your own data serializer and deserializer. The details of this are beyond the scope of this document; see dom/plugins/NPAPI.ipdl and dom/plugins/PluginMessageUtils.h for examples of how this is done.
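
To give a rough feel for what "your own data serializer and deserializer" involves, here is a minimal, self-contained sketch. It does not use the real IPDL serialization hooks, which are defined in the files named above. Every name in it (`PluginRect`, `ByteBuffer`, `WriteInt`, `ReadInt`, `Serialize`, `Deserialize`) is made up for illustration. It also assumes that both actors run on the same machine, so host byte order is acceptable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical user-defined type that is not an IPDL builtin.
struct PluginRect {
    int32_t x;
    int32_t y;
    uint32_t width;
    uint32_t height;
};

using ByteBuffer = std::vector<uint8_t>;

// Append the bytes of a fixed-size integer (host byte order) to the buffer.
template <typename T>
void WriteInt(ByteBuffer& out, T value) {
    uint8_t bytes[sizeof(T)];
    std::memcpy(bytes, &value, sizeof(T));
    out.insert(out.end(), bytes, bytes + sizeof(T));
}

// Read a fixed-size integer at *offset and advance it; false on short input.
template <typename T>
bool ReadInt(const ByteBuffer& in, std::size_t* offset, T* value) {
    assert(offset && value);
    if (*offset > in.size() || in.size() - *offset < sizeof(T)) {
        return false;
    }
    std::memcpy(value, in.data() + *offset, sizeof(T));
    *offset += sizeof(T);
    return true;
}

// Sender side: write the fields in a fixed order.
void Serialize(ByteBuffer& out, const PluginRect& rect) {
    WriteInt(out, rect.x);
    WriteInt(out, rect.y);
    WriteInt(out, rect.width);
    WriteInt(out, rect.height);
}

// Receiver side: read the fields back in the same order.
// Returns false if the buffer ends before all four fields were read.
bool Deserialize(const ByteBuffer& in, std::size_t* offset, PluginRect* rect) {
    assert(rect);
    return ReadInt(in, offset, &rect->x) &&
           ReadInt(in, offset, &rect->y) &&
           ReadInt(in, offset, &rect->width) &&
           ReadInt(in, offset, &rect->height);
}
```

The deserializer returns a bool rather than assuming well-formed input because the receiver should not trust bytes that arrive from another process, especially a less trusted one.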
Semantics
Note that in all the IPDL message declarations above, the generated C++ methods corresponding to those messages always had the return type void. What if we wanted to return values from message handlers, in addition to sending parameters in the messages? If we thought of IPDL messages as C++ functions, this would be very natural. However, "returning" values from message handlers is much different from returning values from function calls. Please don't get them mixed up!
In the Plugin protocol example above, we saw the parent actor send the Init() message to the child actor. Under the covers, "sending the Init() message" encompasses code on the parent side creating some Init()-like message object in C++, serializing that message object into a sequence of bytes, and then sending those bytes over a socket to the child actor. (All this is hidden from the C++ implementor, PluginParent above. You don't have to worry about this except to better understand message semantics.) What happens in the parent-side code once it has written those bytes to the socket's file descriptor? It is this behavior that we broadly call message semantics.
In IPDL, the parent-side code has three options for what happens after those serialized bytes are written to the socket:
- Continue executing. We call this asynchronous semantics; the parent is not blocked.
- Wait until the child acknowledges that it received the message. We call this synchronous semantics, as the parent blocks until the child receives the message and sends back a reply.
- The third option is more complicated and will be introduced below, after another example.
In the Plugin protocol example above, we might want the Init() message to be synchronous. It may not make sense for the browser to continue executing until it knows whether the plugin was successfully initialized. (We will discuss which semantics, asynchronous or synchronous, is preferred in a later section.) So for this particular case, let's extend the Plugin protocol's Init() message to be synchronous and return an error code: 0 if the plugin was initialized successfully, non-zero if not. Here's a first attempt at that:
    protocol Plugin {
    child:
        sync Init() returns (int rv);
        Deinit();
    };
We added two new keywords to the Plugin protocol, sync and returns. sync marks a message as being sent synchronously; note that the Deinit() message has no specifier. The default semantics is asynchronous. The returns keyword marks the beginning of the list of values that are returned in the reply to the message. (It is a type error to add a returns block to an asynchronous message.)
Detour: the above protocol will fail the IPDL type checker. Why? IPDL protocols also have "semantics specifiers", just like messages. A protocol must be declared to have semantics at least as "strong" as its strongest message semantics. Synchronous semantics is called "stronger than" asynchronous. Like message declarations, the default protocol semantics is asynchronous; however, since the Plugin protocol declares a synchronous message, this type rule is violated. The fixed up Plugin protocol is shown below.
    sync protocol Plugin {
    child:
        sync Init() returns (int rv);
        Deinit();
    };
This new sync message with returns values changes the generated PluginProtocolParent and PluginProtocolChild headers as follows:
    // class PluginProtocolParent { ...
        int SendInit() { /* boilerplate */ }

    // class PluginProtocolChild { ...
        virtual int RecvInit() = 0;
To the parent implementor, the new SendInit() method signature means that it receives an int return code back from the child. To the child implementor, the new RecvInit() method signature means that it must return an int back from the handler, signifying whether the plugin was initialized successfully.
Important: To reiterate, after the parent code calls SendInit(), it will block the parent actor's thread until the response to this message is received from the child actor. On the child side, the int returned by child implementor from RecvInit() is packed into the response to Init(), then this response is sent back to the parent. Once the parent reads the bytes of response message from its socket, it deserializes the int return value and unblocks the parent actor's thread, returning that deserialized int to the caller of SendInit(). It is very important to grok this sequence of events.
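
The sequence just described can be traced with a small single-threaded simulation. This is only a sketch under loud assumptions: `Transport`, `SimChild`, and `SimParent` are hypothetical names, two in-memory queues stand in for the socket, and "blocking" is modeled by letting the child run inside `SendInit()` until a reply is available. A real parent thread would simply sleep while a separate plugin process does the work.

```cpp
#include <cassert>
#include <cstring>
#include <queue>
#include <string>
#include <vector>

using Bytes = std::vector<unsigned char>;

// Two one-way queues stand in for the socket between the two processes.
struct Transport {
    std::queue<std::string> toChild;  // request names
    std::queue<Bytes> toParent;       // serialized replies
    std::vector<std::string> trace;   // records the order of events
};

class SimChild {
public:
    explicit SimChild(Transport& transport) : mTransport(transport) {}

    // One event-loop turn: unpack a request, run the handler, send the reply.
    void ProcessNextMessage() {
        assert(!mTransport.toChild.empty());
        std::string msg = mTransport.toChild.front();
        mTransport.toChild.pop();
        assert(msg == "Init");
        int rv = RecvInit();
        Bytes reply(sizeof(int));
        std::memcpy(reply.data(), &rv, sizeof(int));  // pack rv into the reply
        mTransport.trace.push_back("child: reply sent");
        mTransport.toParent.push(reply);
    }

private:
    int RecvInit() {
        mTransport.trace.push_back("child: RecvInit");
        return 0;  // 0 means the plugin initialized successfully
    }

    Transport& mTransport;
};

class SimParent {
public:
    SimParent(Transport& transport, SimChild& child)
        : mTransport(transport), mChild(child) {}

    int SendInit() {
        mTransport.trace.push_back("parent: Init sent");
        mTransport.toChild.push("Init");
        // A real parent thread would block here. In this simulation,
        // "blocking" means running the child until its reply arrives.
        while (mTransport.toParent.empty()) {
            mChild.ProcessNextMessage();
        }
        Bytes reply = mTransport.toParent.front();
        mTransport.toParent.pop();
        int rv = 0;
        std::memcpy(&rv, reply.data(), sizeof(int));  // unpack rv
        mTransport.trace.push_back("parent: SendInit returns");
        return rv;
    }

private:
    Transport& mTransport;
    SimChild& mChild;
};
```

After one call to `SendInit()`, the `trace` vector holds the four events in the order the paragraph above describes: the parent sends, the child handles, the child replies, and only then does the parent's call return.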
Implementation detail: IPDL supports multiple returns values, such as in the following example message declaration:
    // sync protocol Blah {
    child:
        sync Foo(int param1, char param2) returns (long ret1, int64_t ret2);
C++ does not have syntax that allows returning multiple values from functions/methods. Additionally, IPDL needs to allow implementor code to signal error conditions, and IPDL itself needs to notify implementors of errors. To those ends, IPDL generates interface methods with nsresult return types and "outparams" for the IPDL returns values. So in reality, the C++ interface generated for the Blah example above would be:
    // class BlahProtocolParent { ...
        nsresult SendFoo(const int& param1, const char& param2,
                         long* ret1, int64_t* ret2) { /* boilerplate */ }

    // class BlahProtocolChild { ...
        virtual nsresult RecvFoo(const int& param1, const char& param2,
                                 long* ret1, int64_t* ret2) = 0;
And the actual interface generated for the Plugin protocol is:
    // class PluginProtocolParent { ...
        nsresult SendInit(int* rv) { /* boilerplate */ }

    // class PluginProtocolChild { ...
        virtual nsresult RecvInit(int* rv) = 0;
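
As a sketch of how an implementor might work with this outparam-style interface, the example below separates the two kinds of failure: the nsresult reports whether the handler ran at all, and the outparam carries the protocol-level return value. The `nsresult`, `NS_OK`, and `NS_ERROR_FAILURE` definitions here are local stand-ins so that the example compiles on its own. `DispatchInit()`, `InitSucceeded()`, and the `initResult` constructor argument are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Local stand-ins for the Mozilla definitions, so the sketch is self-contained.
using nsresult = uint32_t;
constexpr nsresult NS_OK = 0;
constexpr nsresult NS_ERROR_FAILURE = 0x80004005;

// Shape of the generated child interface, as shown above.
class PluginProtocolChild {
public:
    virtual ~PluginProtocolChild() = default;

    // Hypothetical entry point standing in for the generated dispatch code.
    nsresult DispatchInit(int* rv) {
        assert(rv && "the dispatcher always supplies storage for the outparam");
        return RecvInit(rv);
    }

protected:
    virtual nsresult RecvInit(int* rv) = 0;
};

class PluginChild : public PluginProtocolChild {
public:
    PluginChild(std::string pluginDsoFile, int initResult)
        : mDsoFile(std::move(pluginDsoFile)), mInitResult(initResult) {}

protected:
    nsresult RecvInit(int* rv) override {
        if (mDsoFile.empty()) {
            return NS_ERROR_FAILURE;  // the handler itself could not run
        }
        *rv = mInitResult;  // the value declared in returns (int rv)
        return NS_OK;       // the message was handled
    }

private:
    std::string mDsoFile;
    int mInitResult;  // hypothetical: what the plugin's own init code reported
};

// Caller-side pattern: inspect the nsresult before trusting the outparam.
bool InitSucceeded(nsresult status, int rv) {
    return status == NS_OK && rv == 0;
}
```

A caller on the parent side would follow the same order of checks after `SendInit(&rv)`: if the nsresult signals failure, the value in `rv` should be ignored.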
RPC semantics
"RPC" stands for "remote procedure call," and this third semantics models procedure call semantics. A quick summary of the difference between RPC and sync semantics is that RPC allows "re-entrant" message handlers --- that is, while an actor is blocked waiting for an "answer" to an RPC "call", it can be unblocked to handle a new, incoming RPC call.
In the example protocol below, the child actor offers a "CallMeCallYou()" RPC interface, and the parent offers a "CallYou()" RPC interface. The rpc qualifiers mean that if the parent calls "CallMeCallYou()" on the child actor, then the child actor, while servicing this call, is allowed to call back into the parent actor's "CallYou()" message.
    rpc protocol Example {
    child:
        rpc CallMeCallYou() returns (int rv);
    parent:
        rpc CallYou() returns (int rv);
    };
If this were instead a sync protocol, the child actor would not be allowed to call the parent actor's "CallYou()" method while servicing the "CallMeCallYou()" message. (The child actor would be terminated with extreme prejudice.)
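
The difference between the two semantics can be modeled with ordinary nested function calls. The sketch below is a toy model, not the generated interface: the class and method names (`Parent`, `Child`, `CallMeCallYouOn`, `OnCallYou`, `OnCallMeCallYou`) and the `Semantics` enum are all hypothetical, and a rejected nested call returns -1 where real IPDL would terminate the child.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Semantics { Sync, Rpc };

class Child;

class Parent {
public:
    explicit Parent(Semantics semantics) : mSemantics(semantics) {}

    // Parent calls CallMeCallYou() on the child and waits for the answer.
    int CallMeCallYouOn(Child& child);

    // Handler for CallYou(). Returns false if the call is not allowed now.
    bool OnCallYou(int* rv);

    std::vector<std::string> trace;

private:
    Semantics mSemantics;
    bool mWaitingForAnswer = false;
};

class Child {
public:
    // While servicing CallMeCallYou(), the child calls back into the parent.
    int OnCallMeCallYou(Parent& parent) {
        int fromParent = 0;
        if (!parent.OnCallYou(&fromParent)) {
            return -1;  // stand-in for the child being terminated
        }
        return fromParent + 1;
    }
};

int Parent::CallMeCallYouOn(Child& child) {
    trace.push_back("parent: call CallMeCallYou");
    mWaitingForAnswer = true;
    int rv = child.OnCallMeCallYou(*this);
    mWaitingForAnswer = false;
    trace.push_back("parent: CallMeCallYou answered");
    return rv;
}

bool Parent::OnCallYou(int* rv) {
    assert(rv);
    if (mWaitingForAnswer && mSemantics == Semantics::Sync) {
        // A parent blocked on a sync message cannot service a nested call.
        trace.push_back("parent: nested CallYou rejected");
        return false;
    }
    // Under RPC semantics the blocked parent is re-entered here.
    trace.push_back("parent: handling nested CallYou");
    *rv = 41;
    return true;
}
```

With `Semantics::Rpc`, the nested handler runs while the outer call is still in progress, which is exactly the re-entrancy the implementor has to be prepared for. With `Semantics::Sync`, the same nested call is refused.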
Preferred semantics
Use async semantics whenever possible. Asynchronous messaging in C++ has a bad rap because of the complexity and verbosity of implementing asynchronous "protocols" in C++ with event loops and massive "switch" statements. IPDL is specifically designed to make this easier and more concise.
Blocking on replies to messages is discouraged. If you absolutely need to block on a reply, use sync semantics very carefully. It is possible to get into trouble with careless uses of synchronous messages; while IPDL can (or will eventually) check and/or guarantee that your code does not deadlock, it is easy to cause nasty performance problems by blocking.
Please don't use RPC semantics. RPC inherits (nearly) all the problems of sync messages, while adding more of its own. Every time your code makes an RPC call, it must ensure that its state is "consistent" enough to handle every re-entrant (nested) call allowed by the IPDL state machine. This is hard to get right. (Why was RPC semantics added? Basically for NPAPI plugins, for which we can't change any existing plugin code.)
Checkpoint 1
By this point in the guide, you should understand:
- basically what an IPDL protocol is
- what an IPDL actor is, basically
- what an IPDL message is, basically
- message directions
- message semantics
- the relationship between IPDL and its generated interface; i.e., what your C++ code must implement