Loop/Load Handling
To avoid overwhelming the various servers involved in the Loop service infrastructure, Firefox 34 will include a set of three load management mechanisms. This page describes these mechanisms.
Service Soft Start
See bug 1055319.
This mechanism allows Mozilla to gradually ramp up system load after the feature makes it into release. In a nutshell:
By default, release (and late Beta) clients will have two prefs set to "true": loop.enabled and loop.throttled. Upon first startup, each client will select a random number in the range of 1 to 2^24-2 and write it into the "loop.soft_start_ticket_number" pref. Then, upon this and every subsequent startup, each client will check the value of the "loop.throttled" pref. If it is set to true, the client looks up a DNS A record (tentatively "soft-start.loop.services.mozilla.com" -- see bug 1060809), whose value is required to be in the range 127.0.0.0 - 127.255.255.255. If the address is outside this range, or if there is an error retrieving the A record, then the client does not activate the Loop feature.
If the record is successfully retrieved, then the low 24 bits of the address are treated as a "now serving" number, and compared to the value stored in "loop.soft_start_ticket_number". If the "now serving" number is strictly greater than the client's ticket number, then the feature is activated, and the "loop.throttled" pref is set to false (which will bypass this procedure for all subsequent startups).
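The check described above can be sketched in a few lines of Python. This is only an illustration of the comparison logic, not the client's actual implementation: the function names `pick_ticket` and `should_activate` are hypothetical, and the DNS lookup itself is assumed to happen elsewhere, with its result passed in as a dotted-quad string (or `None` on failure).

```python
import ipaddress
import random

TICKET_MAX = (1 << 24) - 2  # upper end of the ticket range, 2^24-2


def pick_ticket(rng=random):
    """Draw the one-time ticket number in the range 1 to 2^24-2."""
    return rng.randint(1, TICKET_MAX)


def should_activate(a_record, ticket):
    """Decide whether a throttled client may activate the feature.

    a_record: dotted-quad string from the A record lookup, or None if
              the lookup failed (hypothetical calling convention).
    ticket:   the value stored in loop.soft_start_ticket_number.
    """
    if a_record is None:
        return False  # lookup error: stay inactive
    try:
        value = int(ipaddress.IPv4Address(a_record))
    except ValueError:
        return False  # not a parseable IPv4 address
    if value >> 24 != 127:
        return False  # outside 127.0.0.0 - 127.255.255.255
    now_serving = value & 0xFFFFFF  # low 24 bits of the address
    return now_serving > ticket  # strictly greater, per the rule above
```

With this range of tickets, a record of 127.0.0.0 admits no clients (no ticket is below 1) and 127.255.255.255 admits every client (the highest ticket is 2^24-2).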
This allows us to increase load on the system very gradually after launch. The recommended handling of this number is as follows:
- Ensure that the TTL for the DNS record is set to a relatively short value, so that changes propagate through the system rapidly. The recommended value is in the range of 600 to 3600 seconds (10 minutes to an hour).
- When initially launched, set the load level to 10%. Leave it at that level for at least 24 hours and observe server load.
- If server utilization is sufficiently low, increase the load level incrementally, waiting at least 24 hours between each change to ensure that server load can settle.
- Once server load is ramped all the way to 100%, file a bug to remove the throttling logic from the Loop feature.
For easy reference, the following table calculates the IP address values for loads from 0% to 100%, in 5% increments:
Load (%) | Load (24-bit integer) | IP Address |
---|---|---|
0% | 0 | 127.0.0.0 |
5% | 838860 | 127.12.204.204 |
10% | 1677721 | 127.25.153.153 |
15% | 2516582 | 127.38.102.102 |
20% | 3355443 | 127.51.51.51 |
25% | 4194303 | 127.63.255.255 |
30% | 5033164 | 127.76.204.204 |
35% | 5872025 | 127.89.153.153 |
40% | 6710886 | 127.102.102.102 |
45% | 7549746 | 127.115.51.50 |
50% | 8388607 | 127.127.255.255 |
55% | 9227468 | 127.140.204.204 |
60% | 10066329 | 127.153.153.153 |
65% | 10905189 | 127.166.102.101 |
70% | 11744050 | 127.179.51.50 |
75% | 12582911 | 127.191.255.255 |
80% | 13421772 | 127.204.204.204 |
85% | 14260632 | 127.217.153.152 |
90% | 15099493 | 127.230.102.101 |
95% | 15938354 | 127.243.51.50 |
100% | 16777215 | 127.255.255.255 |
I'm a crusty old perl programmer, so the suggestion I have for generating the IP address for an arbitrary load value looks like this; I'm sure there are more elegant solutions in other languages, perl being what it is. Simply replace the "50" at the end with the load level you'd like to get an IP address for:
perl -e 'print join(".",unpack("C*",pack("N",127<<24|int(((1<<24)-1)*(shift)/100))))."\n"' 50
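For anyone who prefers Python, the same calculation can be sketched as follows. The function name `load_to_address` is hypothetical; the arithmetic mirrors the perl one-liner above (scale the percentage to a 24-bit integer, truncate, and place it in the low 24 bits of a 127.x.x.x address).

```python
import ipaddress


def load_to_address(percent):
    """Return the A record value for a load level given in percent."""
    if not 0 <= percent <= 100:
        raise ValueError("load level must be between 0 and 100")
    # Scale to the 24-bit range and truncate, as the perl version does.
    level = int(((1 << 24) - 1) * percent / 100)
    # Put 127 in the top octet and the load level in the low 24 bits.
    return str(ipaddress.IPv4Address((127 << 24) | level))


if __name__ == "__main__":
    for pct in range(0, 101, 5):
        print(f"{pct}% | {load_to_address(pct)}")
```

Running the script prints one line per 5% step, which can be compared against the table above.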
Simple Push Server Load Distribution
See bug 1055139 and bug 1055143.
Currently, the push servers have been measured to handle up to 5 million users per cluster. With potentially several hundred million users ultimately using the Loop service, we need the ability to scale to use multiple simple push clusters.
To support this behavior, the Loop client will query the Loop server for the address of the Simple Push server to use, rather than using the Simple Push server configured in the client prefs. The Loop server will maintain a list of Simple Push clusters, and hand out a random entry from this list each time a client requests the address of a Simple Push server. In this way, we can bring additional Simple Push clusters online to accommodate the load required to support the Loop service.
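The server-side selection can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the class `PushClusterDirectory`, its methods, and the example URLs are hypothetical and do not reflect the Loop server's real code or endpoints.

```python
import random


class PushClusterDirectory:
    """Holds the list of Simple Push clusters and hands out random entries."""

    def __init__(self, clusters):
        self._clusters = list(clusters)
        if not self._clusters:
            raise ValueError("at least one cluster is required")

    @property
    def clusters(self):
        return tuple(self._clusters)

    def add_cluster(self, url):
        """Bring an additional cluster online."""
        self._clusters.append(url)

    def pick(self, rng=random):
        """Return a uniformly random cluster for one client request."""
        return rng.choice(self._clusters)


# Hypothetical cluster addresses, for illustration only.
directory = PushClusterDirectory([
    "wss://push-a.example.invalid/",
    "wss://push-b.example.invalid/",
])
directory.add_cluster("wss://push-c.example.invalid/")
print(directory.pick())
```

Because each request draws independently, a newly added cluster starts receiving roughly its proportional share of new clients right away; clients that already hold an address are unaffected.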
Simple Push Server Load Reduction
See bug 1060610.
The final approach to reduce the load on the Simple Push infrastructure, especially as curious users investigate the meaning of a new button that has just appeared in their browser, is to defer setting up a Simple Push connection until the user performs an action that might result in receiving a call. This can happen if the user copies a link from the Loop panel (either with the "copy" button or by highlighting the link and using OS-specific means of copying to the clipboard), or if the user selects the "email link" button.
Users can also receive calls once they are logged in to an FxA account for Loop, so any logged-in user will maintain a Simple Push subscription as well.
This prevents unnecessary Simple Push connections from being established by clients that aren't actually capable of receiving a call at any given moment.
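The deferral can be sketched as a small wrapper that opens the connection only on the first qualifying action. This is an illustration under stated assumptions, not the client's code: the class `DeferredPushRegistration`, its event-handler names, and the `connect` callback are all hypothetical.

```python
class DeferredPushRegistration:
    """Opens the Simple Push connection lazily, at most once."""

    def __init__(self, connect):
        self._connect = connect  # hypothetical callback that opens the connection
        self._connected = False

    def _ensure_connected(self):
        if not self._connected:
            self._connect()
            self._connected = True

    # Actions that might result in receiving a call.
    def on_link_copied(self):
        self._ensure_connected()

    def on_email_link_selected(self):
        self._ensure_connected()

    def on_fxa_login(self):
        self._ensure_connected()

    # Merely opening the panel out of curiosity does not connect.
    def on_panel_opened(self):
        pass


calls = []
registration = DeferredPushRegistration(lambda: calls.append("connect"))
registration.on_panel_opened()   # no connection yet
registration.on_link_copied()    # first qualifying action connects
registration.on_fxa_login()      # already connected, nothing more happens
print(calls)
```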