Optimization in GCC
Jan 26, 2005 By M. Tim Jones
Architecture Specification
The default architecture is i386. Code compiled for this default runs on all later i386/x86 processors, but it can result in degraded performance on more recent ones. If you're concerned about the portability of an image, you should compile it with the default. If you're more interested in performance, pick the architecture that matches your own.
1 of 5 12/28/2010 4:44 PM
Optimization in GCC | Linux Journal http://www.linuxjournal.com/article/7269?page=0,1
real 0m1.036s
user 0m1.030s
sys 0m0.000s
[mtj@camus]$ gcc -o sort sort.c -O2 -march=pentium2
[mtj@camus]$ time ./sort
real 0m0.799s
user 0m0.790s
sys 0m0.010s
[mtj@camus]$
By specifying the architecture, in this case a 633MHz Celeron, the compiler can
generate instructions for the particular target as well as enable other optimizations
available only to that target. As shown in Listing 1, by specifying the architecture we
see a time benefit of 237ms (23% improvement).
Although Listing 1 shows an improvement in speed, the drawback is that the image is
slightly larger. Using the size command (Listing 2), we can identify the sizes of the
various sections of the image.
From Listing 2, we can see that the instruction size (text section) of the image
increased by 28 bytes. But in this example, it's a small price to pay for the speed
benefit.
Option  Description
387     Standard 387 Floating Point Coprocessor
sse     Streaming SIMD Extensions (Pentium III, Athlon 4/XP/MP)
sse2    Streaming SIMD Extensions II (Pentium 4)
______________________
Comments
hi, I followed "Listing 3. Simple Example of gprof", but when I compile with -O or -O2 the profile comes out flat. How do I resolve this?
My steps were:
1: gcc -o test_optimization test_optimization.c -pg -march=i386
2: ./test_optimization
3: gprof --no-graph -b ./test_optimization gmon.out
4: the result is:
Flat profile:
Each sample counts as 0.01 seconds.
no time accumulated
% cumulative self self total
time seconds seconds calls Ts/call Ts/call name
0.00 0.00 0.00 1 0.00 0.00 factorial
If I add -O2, the result is:
Flat profile:
Each sample counts as 0.01 seconds.
no time accumulated
Any optimization can be enabled outside of any level simply by specifying its name with the -f prefix, as:
gcc -fdefer-pop -o test test.c
hello:
I just compiled some code under gcc (Cygwin) and Visual C++ 2005 on a laptop with a dual-core Intel processor.
The debuggable gcc build was about 2x faster than the debuggable Visual C++ build;
however, the situation reversed when I used -O3 in gcc and the "release" configuration in Visual C++.
Apparently, MSVC uses a few insecure optimizations, counting on the developer having written secure code. That's probably why its debug build is slower.
I've seen lots of situations where gcc code gives an error right away, promptly showing me the bug, while MSVC happily executes the code until it finally stumbles on a non-static field of a class and only then gives an error. For me, this is simply misleading, and that's why I prefer gcc.
Someone should write some C code and a few scripts that enable/disable every compiler option and then print out which options worked best for _your_ particular system.
A benchmark that specifically tests each option (as opposed to using a single, huge benchmark) could be written. Benchmarks need to have a 'large' effect on the option being switched.
This could be run overnight (or on multiple machines, each doing part of the testing) and the results provided on a web page somewhere.
Experts could put in their two cents, and a wiki of snippets could be fed into a code compilator (not compiler, just a bunch of scripts) that would compilate all the snippets and produce a final program to be compiled on many different machines.
This way we could figure out, for such-and-such a system, how often (what percentage of the time) we would simply be better off using a particular option, and when it is more likely to help based on the TYPE of program we are running (word processor vs. multimedia app).
E.g., if you have a Pentium it is ALWAYS (or should be, if gcc is correct) best to use the -march=pentium option - BUT - it is NOT always best to use "-fcrossjumping" (though it _could_ be for certain applications).
The output of all this could simply be a half dozen command-line choices for each processor - including a "general-purpose 'best'" setting and a "quick compile with great optimization" setting (for intermediate builds).
This is something that a few dozen people need to work on to get the ball rolling, and then the rest of us need to pitch in and compile the resulting test scripts to check for errors. With everyone's help, we should have the so-called answer(s) to "which compilation options should I use for machine X when compiling application category Y?"
Where can I get a readable copy of Table 1? The copy here is too small to read, and can't be enlarged.
Try clicking on it.