(remove bogus distribution section) |
(add gcc 4.1.2 and 4.3 notes) |
=Compilers=
Notes from dwitte on gcc 4.3 vs. 4.1.2. [https://bugzilla.mozilla.org/show_bug.cgi?id=409803#c17] Also see the [https://bugzilla.mozilla.org/show_bug.cgi?id=409803#c0 original post] about possible ways to make gcc 4.1.2 faster as well by using -Os and -finline-limit.
===gcc 4.1.2 notes===
<pre>
it turns out that gcc 4.1.2 on linux, at our default optimization setting "-Os
-freorder-blocks -fno-reorder-functions", avoids inlining even trivial
functions (where the cost of doing so is less than even the fncall overhead).
this is bad news for things like nsTArray, nsCOMPtr etc, which can result in
many layers of wrapper calls if not inlined sensibly.

gcc has an option to control inlining, "-finline-limit=n", which will (roughly)
inline functions up to length n pseudo-instructions. to give some sense for
numbers, the default value of n at -O2 is 600. i ran some tests and found that
with our current settings and -finline-limit=50 on a 32-bit linux build, which
is enough to inline trivial (one or two line) wrapper methods but no more, we
can get a codesize saving of 225kb (2%), a Ts win of 3%, a Txul win of 18%, and
a Tp2 win of about 25% (!).

i also compared this to plain -O2: Txul is unchanged, Ts improves 3%, and Tp2
improves about 4%. however, codesize jumps 2,414kb (19%). maybe we can increase
the inline limit at -Os to get back a bit of this perf, without exploding
codesize. (we originally moved from -O2 to -Os on gcc 3.x, because it gave a
huge codesize win and also a perf win of a few percent on Ts, Txul, and Tp. so,
it seems gcc4.x behaves quite differently.)
</pre>
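To make the "layers of wrapper calls" point concrete, here is a minimal sketch of the kind of code the note is describing. <code>TinyArray</code> and <code>Sum</code> are hypothetical names invented for illustration; they are not Mozilla's nsTArray or nsCOMPtr, only a stand-in with the same shape: several one-line accessors that each forward to another.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a thin container wrapper (NOT Mozilla's nsTArray).
// Every accessor is a one-line method that forwards to another one, so a
// single element read passes through several call layers unless the compiler
// inlines them.
template <typename T, std::size_t N>
class TinyArray {
public:
    TinyArray() : mLength(0) {}

    std::size_t Length() const { return mLength; }
    bool IsEmpty() const { return Length() == 0; }

    const T* Elements() const { return mData; }
    const T& ElementAt(std::size_t i) const { return Elements()[i]; }
    const T& operator[](std::size_t i) const { return ElementAt(i); }

    // Appends v if there is room; silently ignores the call when full.
    void AppendElement(const T& v) {
        if (mLength < N) {
            mData[mLength++] = v;
        }
    }

private:
    T mData[N];
    std::size_t mLength;
};

// Each loop iteration goes through operator[] -> ElementAt -> Elements,
// plus a Length() call for the loop condition.
template <std::size_t N>
int Sum(const TinyArray<int, N>& a) {
    int total = 0;
    for (std::size_t i = 0; i < a.Length(); ++i) {
        total += a[i];
    }
    return total;
}
```

One way to see the effect described in the note is to compile a file like this to assembly twice, once with <code>g++ -Os -S</code> and once with <code>g++ -Os -finline-limit=50 -S</code>, and compare how many calls remain in <code>Sum</code>. The outcome depends on the gcc version, so treat this as an experiment to run rather than a guaranteed result.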
===gcc 4.3 notes===
<pre> | <pre> |