
IMPROVING THE MODULARITY OF OPENOFFICE.ORG
Mathias Bauer > Project Lead OpenOffice.org Writer, Sun Microsystems Inc.

A popular myth (I)

OpenOffice.org is monolithic and loads everything into memory when it is started.


Everyone and Everything Participating on the Network

The facts
Some data (OpenOffice.org 2.0.4 on Windows)
> Full installation contains 309 libs, 102 MB
> Startup without application uses 68 libs, 28 MB
> Adding Writer loads 21 libs more, 17 MB
> http://wiki.services.openoffice.org/wiki/Architecture/Libraries

Many libraries are loaded on demand (most of them being UNO services)

Though OOo does not load everything, it loads a lot
> splitting up of libraries necessary
> needs code refactoring
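The figures above can be put in proportion with a few lines of arithmetic. This is a minimal sketch, assuming the library counts and sizes quoted on the slide simply add up; the variable and function names are illustrative only.

```python
# Figures quoted on the slide (OpenOffice.org 2.0.4 on Windows).
TOTAL_LIBS, TOTAL_MB = 309, 102
STARTUP_LIBS, STARTUP_MB = 68, 28
WRITER_EXTRA_LIBS, WRITER_EXTRA_MB = 21, 17


def share(part, whole):
    """Return part/whole as a percentage, rounded to one decimal."""
    return round(100 * part / whole, 1)


# Assumption: what Writer needs is startup plus the Writer-specific extra.
writer_libs = STARTUP_LIBS + WRITER_EXTRA_LIBS
writer_mb = STARTUP_MB + WRITER_EXTRA_MB

libs_share = share(writer_libs, TOTAL_LIBS)
mb_share = share(writer_mb, TOTAL_MB)

print(f"Writer session: {writer_libs} of {TOTAL_LIBS} libs ({libs_share}%)")
print(f"Writer session: {writer_mb} of {TOTAL_MB} MB ({mb_share}%)")
```

Under that assumption a Writer session touches well under a third of the libraries but a noticeably larger share of the total size, which fits the point that OOo does not load everything but still loads a lot.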

Installation requirements

I want to install only Writer.


User wants to save disk space
User doesn't want to see elements of unwanted applications in the user interface

Installation of single applications


OpenOffice.org always supported the installation of single applications from the complete installation set
Due to the high code reuse in OpenOffice.org a lot of code is shared between the applications
> Installation set in total is smaller
> Application specific part is very small

Installation size (OpenOffice.org 2.3, Windows), without PyUNO
> Complete: 279 MB
> Writer only: 246 MB
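How small the application specific part is can be read off the two sizes. A short sketch using only the numbers on the slide; the names are illustrative.

```python
# Installation sizes quoted on the slide (OpenOffice.org 2.3, Windows, without PyUNO).
COMPLETE_MB = 279
WRITER_ONLY_MB = 246

# Disk space saved by installing Writer alone.
saved_mb = COMPLETE_MB - WRITER_ONLY_MB
saved_percent = round(100 * saved_mb / COMPLETE_MB, 1)

# Share of the complete installation that a Writer-only installation still needs.
writer_only_percent = round(100 * WRITER_ONLY_MB / COMPLETE_MB, 1)

print(f"Writer only saves {saved_mb} MB ({saved_percent}% of the complete installation)")
print(f"A Writer-only installation is {writer_only_percent}% of the complete one")
```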

Reducing the necessary


Make features optional
> Should be restricted to larger blocks of functionality
  > many small libraries will make the complete size bigger (disk size and run time memory consumption)
  > too many options will make the setup confusing
> New optional functionality should be implemented as UNO components or even extensions

Split big libraries to avoid the "taking one takes all" effect
Needs package restructuring and code refactoring

Download requirements

I want to download only Writer.


Modularity already present in the installation set
User wants to save bandwidth

Modular installation sets


As shown for installations, the gain is currently low
> winning a few tens of MB per installation set does not justify multiplying the number of download sets by 3 or 4
> extensions currently are the best approach

Improvements need better packaging
> avoids explosion of testing and build matrix
> needs ability to download and install individual packages
> needs redesign of packages
> needs code refactoring

Development requirements
Faster and easier builds
> currently possible only using solver tarballs
> build and install only packages that have been changed

Code refactoring will also be useful for
> better understanding the code
> fewer dependencies between the parts
> fewer side effects in code changes
> smaller regression risk

Creation of development packages
Build language packs separately

Packaging requirement

There should be more packages with fewer and clearer dependencies.

Testing of packages
Improve build performance
Ease introduction of new functionality

Steps for an improved packaging


Separate URE (Stephan Bergmann)
Separate packages for non-code parts
> help, templates, fonts, gallery content etc.
> additional effort ongoing for branding stuff
> better platform and localization packages

Separate packages for application specific parts
> done for the known parts, looking for more

Separate packages for features (components and well separated libraries)
> not done to the possible extent, more to find
> try to get more components and well separated libraries

Problems already found


Risk for more effort in QA or RE
Risk for more effort or confusion on download page
Brand specific packages shouldn't contain code
Language packs shouldn't contain fonts and code
Many of our libraries can't be separated reasonably
> libraries not designed with package idea in mind
> many cross dependencies
> library design and code refactoring needed

Conclusion

Whatever meaning of modularity you use, improving it will not be possible without a package and library redesign and a refactoring of a large part of the code base.

A popular myth (II)

The maintenance problems of OpenOffice.org can be seen from the intermodule dependencies.

[Diagram: intermodule dependency graph. Modules shown: smoketestoo_native instsetoo_native postprocess dictionaries dbaccess sc desktop extensions chart2 lingucomponent scripting binfilter linguistic forms svx avmedia sfx2 helpcontent2 connectivity xmlhelp uui framework crashrep shell basic sj2 svtools toolkit vcl scp2 readlicense_oo idl officecfg unoxml fileaccess UnoControls sysui xmlscript sdk_oo testtools bean xmerge pyuno remotebridges unodevtools rdbmaker bridges ucb javaunohelper jvmfwk odk cli_ure stoc cpputools jvmaccess jurt unoil ridljar codemaker psprint unotools tools i18npool regexp comphelper ucbhelper cppuhelper cppu offuh offapi udkapi idlc registry berkeleydb libtextcat libxslt neon vos libxml2 libegg np_sdk rhino zlib sandbox beanshell unixODBC epm fondu MathMLDTD python sane xalan twain freetype hsqldb boost x11_extensions setup_native icu external salhelper sal xml2cmp soltools stlport vigra solenv moz sndfile extras nas portaudio afms bitstream_vera_fonts o3tl store udm expat agg curl cosv jpeg psprint_config libwpd libxmlsec autodoc testshl2 msfontextract oovbaapi ure package basebmp basegfx i18nutil embedserv io sax eventattacher hwpfilter jut qadevOOo configmgr animations sot transex3 wizards rsc dtrans padmin scaddins automation so3 xmloff accessibility slideshow cppcanvas canvas goodies embeddedobj fpicker writerperfect xmlsecurity sw sd filter basctl starmath]

Critical review
Dependencies show up at build time, not at run time
> dependencies to build tools are irrelevant
> non-code parts should be removed from the picture

Experience shows that language binding related modules are unproblematic (-> URE)
Quality of dependencies is not visible
> stable interfaces are less problematic
> only a few libraries at the top of the diagram are responsible for most of the maintenance problems
> large libraries also create large problems with inner dependencies that are not visible in the diagram
> only the code tells the whole truth

[Diagram: module dependency graph. Modules shown: starmath basctl desktop sc filter forms lingucomponent chart2 svx avmedia sfx2 extensions sd xmlsecurity sw dbaccess scripting linguistic slideshow cppcanvas canvas sj2 binfilter so3 writerperfect sane twain oovbaapi scaddins xmlhelp uui xmloff goodies basic framework svtools connectivity embeddedobj fpicker libwpd libegg sysui toolkit vcl shell hsqldb unixODBC ucb unoxml psprint xmlscript sot unotools tools i18npool fileaccess rsc dtrans UnoControls embedserv berkeleydb agg animations basebmp basegfx sax hwpfilter sndfile freetype portaudio nas XmlSearch eventattacher package configmgr regexp comphelper cpputools stoc i18nutil ucbhelper cppuhelper cppu vigra neon store libxslt curl rhino registry setup_native libxml2 zlib icu external salhelper sal xml2cmp soltools o3tl xt expat vos libxmlsec moz beanshell np_sdk jpeg sandbox xmerge xalan stlport boost x11_extensions unoil]

Writer dependencies
[Diagram: dependency graph of the Writer module. Modules shown: sw svx linguistic sfx2 basic uui connectivity framework sj2 svtools shell toolkit vcl fileaccess xmlscript psprint unotools tools i18npool configmgr basebmp sax stoc regexp comphelper ucbhelper cppuhelper cppu setup_native icu jpeg expat external salhelper sal xml2cmp soltools x11_extensions boost stlport sandbox zlib libwpd o3tl vos basegfx i18nutil cpputools rsc libegg goodies avmedia xmloff so3 writerperfect unixODBC hsqldb unoil moz sot nas sndfile freetype portaudio vigra]
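A per-module dependency picture like the one for Writer can be derived from direct dependencies by walking the graph. The sketch below shows the idea. The edges in `DEPENDS_ON` are hypothetical and chosen only for illustration (the real ones live in the build files), and the function names are not taken from any OpenOffice.org tool.

```python
from collections import deque

# HYPOTHETICAL direct dependencies, for illustration only.
DEPENDS_ON = {
    "sw": ["svx", "sfx2"],
    "svx": ["sfx2", "vcl"],
    "sfx2": ["framework", "vcl"],
    "framework": ["vcl"],
    "vcl": ["tools", "sal"],
    "tools": ["sal"],
    "sal": [],
}


def transitive_dependencies(module, graph):
    """Breadth-first walk: every module reachable from `module`."""
    seen = set()
    queue = deque(graph.get(module, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        queue.extend(graph.get(current, []))
    return seen


def dependents(module, graph):
    """Modules that directly or indirectly depend on `module`."""
    return {m for m in graph if module in transitive_dependencies(m, graph)}


print(sorted(transitive_dependencies("sw", DEPENDS_ON)))
print(sorted(dependents("vcl", DEPENDS_ON)))
```

The second function gives the reverse view: which modules are affected when a module low in the diagram changes. Neither view says anything about the quality of a dependency, which is the limitation raised in the critical review.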

StarOffice development in 1994


Platform independent development: StarView
> Windows, Mac, OS/2 (later: Linux, HPUX et al.)

Low resources
> Intel 486
> 4 MB RAM (remember the times of Soft RAM?)

Windows 3.x (16 Bit system)
> OLE technology
  > need to run 2 or more applications in parallel
  > needed own replacement for non-Windows platforms
> Shared libraries with common data segment
  > static data must be put into executables

Resulting legacies
Platform specific code only in dedicated libraries
Minimize memory consumption for OLE
> maximize code reuse in shared class libraries
> create application framework on top of StarView (SFX) following the Template pattern
> build persistence model based on OLE storage

Dominant influence of StarView and SFX
> all global objects referenced through Application class
> framework provided base classes for nearly everything (Application, documents, views, dialogs etc.)
> extensive use of (multiple) implementation inheritance
> hierarchical class library organisation
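The "Template pattern" is read here as the Template Method pattern; that reading is an assumption. A minimal sketch of how such a framework base class ties application code to itself follows. `DocumentShell` and `TextDocumentShell` are hypothetical names and do not reproduce the real SFX classes.

```python
from abc import ABC, abstractmethod


class DocumentShell(ABC):
    """Hypothetical framework base class; not the real SFX API."""

    def load(self, path):
        # Template method: the framework fixes the sequence of steps,
        # subclasses only fill in the individual steps.
        steps = [f"open {path}"]
        steps.append(self.read_content(path))
        steps.append(self.create_view())
        return steps

    @abstractmethod
    def read_content(self, path):
        """Application specific step that every subclass must provide."""

    def create_view(self):
        # Default step that subclasses may override.
        return "create generic view"


class TextDocumentShell(DocumentShell):
    """Hypothetical application class that inherits the framework's flow."""

    def read_content(self, path):
        return "parse text document"

    def create_view(self):
        return "create text view"


steps = TextDocumentShell().load("letter.odt")
print(steps)
```

The application class cannot exist without the framework base class, which is the kind of coupling through implementation inheritance listed among the legacies.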

Framework Architecture
[Diagram: framework architecture. Labels: CUI, GUI, UNO, C++, AutoSave/Recovery, Storage Mgmnt., Load Environment, Filter Mgmnt., Document Mgmnt., UCB, Type Detection, Window Mgmnt., VCL, Embedding, Generic UI, Config]

Framework refactoring: strategy


Reimplementation outside of SFX library
> SFX became collection of service wrappers for application code
> not part of global infrastructure any more
> SFX shall not be loaded on startup

Implement as UNO services as much as possible
> Exchangeable components
> Extendable (even by non-C++ components)
> Move optional parts into own libraries (e.g. Embedding)
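What "exchangeable" and "not loaded on startup" mean for a service can be sketched with a toy registry. This is not the UNO API: `ServiceRegistry` and the service name `example.Embedding` are hypothetical stand-ins, and a real service manager does considerably more.

```python
class ServiceRegistry:
    """Toy stand-in for a service manager; NOT the UNO API."""

    def __init__(self):
        self._factories = {}
        self._instances = {}

    def register(self, name, factory):
        # Registering a name again swaps the implementation ("exchangeable").
        self._factories[name] = factory
        self._instances.pop(name, None)

    def get(self, name):
        # The factory runs on first request only ("loaded on demand").
        if name not in self._instances:
            self._instances[name] = self._factories[name]()
        return self._instances[name]

    def is_loaded(self, name):
        return name in self._instances


registry = ServiceRegistry()
registry.register("example.Embedding", lambda: "built-in embedding")
loaded_before_use = registry.is_loaded("example.Embedding")

first = registry.get("example.Embedding")

# A different component takes over the same service name.
registry.register("example.Embedding", lambda: "replacement embedding")
second = registry.get("example.Embedding")

print(loaded_before_use, first, second)
```

Client code asks for a service by name and never links against the implementing library, so an optional part can live in its own library and be left out of an installation.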

Framework refactoring: done


Type and filter configuration
Load environment
Autosave/Recovery
Embedding (OLE2 in and out, OOo)
Document and Window management
Storage access
Menubar, toolbars, statusbar and controls
Dialog factories (move code of dialogs into own libraries for common, sw, sc, sd)

Framework refactoring: to do
Move last UNO services from SFX
> GlobalEventBroadcaster
> GlobalAppDispatcher
> Generic FrameLoader

Move dialog configuration from SFX
Reimplement docking windows as a UNO service
Library redesign

Application Environment

So what do we need?
Better separation of model, view and controller (UI)
> at least on build level
> perhaps even on library or package level

More UNO components and services
Library redesign
> library design should follow the architecture
> group classes more meaningfully
> avoid too big and too small libraries

Smaller interfaces
> fewer exported symbols
> much less use of implementation inheritance
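The separation of model, view and controller asked for above can be sketched in a few lines. All class names are hypothetical; the point is only that the model carries no reference to any view or UI class, so it could be built and packaged on its own.

```python
class TextModel:
    """Holds content and notifies listeners; knows nothing about views."""

    def __init__(self):
        self._paragraphs = []
        self._listeners = []

    def add_listener(self, callback):
        self._listeners.append(callback)

    def append_paragraph(self, text):
        self._paragraphs.append(text)
        for notify in self._listeners:
            notify(self)

    @property
    def paragraphs(self):
        return tuple(self._paragraphs)


class PlainTextView:
    """Renders the model; depends on the model, never the other way round."""

    def __init__(self, model):
        self.rendered = ""
        model.add_listener(self.refresh)

    def refresh(self, model):
        self.rendered = "\n".join(model.paragraphs)


class Controller:
    """Translates user input into model operations."""

    def __init__(self, model):
        self._model = model

    def type_paragraph(self, text):
        self._model.append_paragraph(text.strip())


model = TextModel()
view = PlainTextView(model)
controller = Controller(model)
controller.type_paragraph("  Hello ")
controller.type_paragraph("World")
print(view.rendered)
```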

Vision of the Writer architecture


[Diagram: vision of the Writer architecture. Labels: GUI, Writer, Writer View, root, page, paragraph, Model, Filters (HTML, SXW, DOC, ODT, ...), UNO, Drawing Layer, Config, VCL, ODF, FWK]

Modularity problems in Writer


Model representation based on SFX
> not separated from persistence code
> not separated from API implementation
> coupled with other parts represented in SFX
> needs preliminary work in SFX before

No clear separation between core, layout and UI, not even on build level
Not all filters are separated from the core
Monolithic Drawing Layer
Huge C++ class interfaces

What are we doing?


Filter work
> legacy filters moved to binfilter module, will become a separate package (done already)
> only living filters stay in the module (html, text, Word)
> new filters are developed as UNO components
> Word import filter will be converted to UNO component

Started refactoring in SFX
> separate API and GUI code for storing documents
> separate model API implementation from SFX code
> make DocumentInfo a real UNO service

Library redesign (svx, sfx, svtools, etc.)
> http://wiki.services.openoffice.org/wiki/Global_Library_Redesign

Library redesign: svx


Real all purpose library
> accessibility
> customshapes
> edit engine / outliner
> forms support
> gallery
> items
> toolbar, menu and statusbar controllers
> code for binary ms filters
> some items, dialogs etc.
> drawing layer
> etc. etc.

Refactoring in SFX
DocumentInfo
> real UNO service, can be used autonomously > component also usable for filters (e.g. doc/docx import)

Separate model API implementation from SFX code


> remove all SFX code from code that implements API
> move code into helper classes with low dependencies
> encapsulate GUI code

Continue in Writer code

Ongoing Writer refactoring


From fat classes to smaller interfaces
> Andreas Martens
> first step for class SwDoc
  > 15 interfaces identified
  > Includes of doc.hxx from 408 down to 368
> to be continued
> http://wiki.services.openoffice.org/wiki/Writer/ToDo/Writer_Refactoring/Writer_Refactoring
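The step from a fat class to smaller interfaces can be sketched as follows. The interface names `SettingsAccess` and `ContentOperations` and the `Document` class are hypothetical; they are not the 15 interfaces identified for SwDoc.

```python
from typing import Protocol


class SettingsAccess(Protocol):
    """Narrow interface: read-only access to document settings."""

    def get_setting(self, key: str) -> bool: ...


class ContentOperations(Protocol):
    """Narrow interface: operations that change the content."""

    def insert_text(self, text: str) -> None: ...


class Document:
    """Stand-in for a fat document class; it satisfies both interfaces."""

    def __init__(self):
        self._settings = {"browse_mode": False}
        self._text = []

    def get_setting(self, key: str) -> bool:
        return self._settings[key]

    def insert_text(self, text: str) -> None:
        self._text.append(text)

    def text(self) -> str:
        return "".join(self._text)


def paste(target: ContentOperations, clip: str) -> None:
    # Client code names only the narrow interface it actually needs.
    target.insert_text(clip)


class RecordingContent:
    """Small replacement that satisfies ContentOperations without Document."""

    def __init__(self):
        self.calls = []

    def insert_text(self, text: str) -> None:
        self.calls.append(text)


doc = Document()
paste(doc, "abc")
fake = RecordingContent()
paste(fake, "x")
print(doc.text(), fake.calls)
```

In C++ the corresponding effect would be that a client includes only the small interface header instead of the header of the whole class, which is presumably how the number of doc.hxx includes goes down.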

Remove access to layout and view from Writer core
> big problem: Drawing Layer

Separate core/view from controller
> builds upon SFX refactoring

Drawing Layer
[Diagram: layered architecture. Labels: GUI, CUI, ODF, UNO, XML, Top Layer, Writer, Calc, Impress, Base, IDE, Math, Wizards, Mid Layer, i18n, Common GUI, Drawing Layer, BASIC, Utilities, Framework, VCL, Help, FWK, System Integration]

Drawing Layer work


Problems wrt. modularity
> Model and view basically in one object
> deep inheritance and usage of concrete instances, with app framework, control layer, and VCL

Ongoing work
> Thorsten Behrens and Armin Le Grand
> See "Moving OOo to XCanvas" on http://marketing.openoffice.org/ooocon2006/schedule/wednesday.html
> Biggest problem: Drawing Layer has many clients

Q&A

Improving the modularity of OpenOffice.org


Mathias Bauer > Mathias.Bauer@sun.com
