Writer Application Code
The exact history from before Sept. 18th, 2000 was lost, but old source code comments show that Writer core dates back to at least November 1990.
Module Contents
- inc: headers available to all source files inside the module
- qa: unit, slow and subsequent tests
- sdi
- source: see below
- uiconfig: user interface configuration
- util: UNO passive registration config
Source Contents
- core: Writer core (document model, layout, UNO API implementation)
- filter: Writer internal filters
  - ascii: plain text filter
  - basflt
  - docx: wrapper for the UNO DOCX import filter (in writerfilter) for autotext purposes
  - html: HTML filter
  - inc: include files for filters
  - rtf: thin copy&paste helper around the UNO RTF import filter (in writerfilter)
  - writer
  - ww8: DOC import, DOC/DOCX/RTF export
  - xml: ODF import/export, subclassed from xmloff (where most of the work is done)
- uibase: user interface (those parts that are linked into sw & always loaded)
- ui: user interface (optional parts that are loaded on demand (swui))
Core
There is good overview documentation of the basic architecture of Writer core in the OOo wiki:
- https://wiki.openoffice.org/wiki/Writer/Core_And_Layout
- https://wiki.openoffice.org/wiki/Writer/Text_Formatting
Writer-specific WhichIds are defined in sw/inc/hintids.hxx.
The sections below mainly cover details that are missing from the wiki pages.
SwDoc
The central class is SwDoc, which represents a document.
A lot of the functionality is split out into separate Manager classes,
each of which implements some IDocument* interface; there are
SwDoc::getIDocument*() methods to retrieve the managers.
However, there are still too many members and methods in this class, many of which could be moved to some Manager or other.
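The manager split described above can be pictured with a minimal sketch. All names here (IDocumentExample, DocumentExampleManager, Doc) are hypothetical stand-ins and not taken from the Writer sources; the sketch only illustrates the pattern of a document class handing out an interface that is implemented by a separate manager object.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical interface, in the spirit of the IDocument* interfaces.
class IDocumentExample {
public:
    virtual ~IDocumentExample() = default;
    virtual void addEntry(const std::string& entry) = 0;
    virtual std::size_t entryCount() const = 0;
};

// Hypothetical manager that implements the interface and holds the state.
class DocumentExampleManager final : public IDocumentExample {
public:
    void addEntry(const std::string& entry) override { m_entries.push_back(entry); }
    std::size_t entryCount() const override { return m_entries.size(); }

private:
    std::vector<std::string> m_entries;
};

// Hypothetical document class: it owns the manager but exposes only the interface.
class Doc {
public:
    Doc() : m_pExampleManager(new DocumentExampleManager) {}

    IDocumentExample& getIDocumentExample() { return *m_pExampleManager; }
    const IDocumentExample& getIDocumentExample() const { return *m_pExampleManager; }

private:
    const std::unique_ptr<DocumentExampleManager> m_pExampleManager;
};
```

Callers depend only on the interface, so functionality can be moved out of the document class without touching them.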
SwNodes
Basically a (fancy) array of SwNode pointers. There are special subclasses of
SwNode (SwStartNode and SwEndNode) which are used to encode a nested tree
structure into the flat array; the range of nodes from SwStartNode to its
corresponding SwEndNode is sometimes called a "section" (but is not necessarily
what the high-level document model calls a "Section"; that is just one of the
possibilities).
The SwNodes contains the following top-level sections:
- Empty
- Footnote content
- Frame / Header / Footer content
- Deleted Change Tracking content
- Body content
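The flat-array encoding of a nested tree can be illustrated with a small sketch. The types below (NodeKind, Node) and both functions are hypothetical simplifications, not the real SwNode classes; the scan for the matching end node is only meant to show how nesting is recoverable from the array and is not a claim about how SwNodes itself finds it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified node kinds; the real SwNode hierarchy is richer.
enum class NodeKind { Start, End, Text };

struct Node {
    NodeKind kind;
};

// Returns the index of the End node that closes the Start node at `start`.
// Nested sections are skipped by tracking the nesting depth.
std::size_t findMatchingEnd(const std::vector<Node>& nodes, std::size_t start)
{
    assert(start < nodes.size() && nodes[start].kind == NodeKind::Start);
    std::size_t depth = 0;
    for (std::size_t i = start; i < nodes.size(); ++i) {
        if (nodes[i].kind == NodeKind::Start)
            ++depth;
        else if (nodes[i].kind == NodeKind::End && --depth == 0)
            return i;
    }
    return nodes.size(); // unbalanced array: no matching End found
}

// Counts the top-level sections of a well-formed array, i.e. one in which
// every top-level entry begins with a Start node.
std::size_t countTopLevelSections(const std::vector<Node>& nodes)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < nodes.size(); i = findMatchingEnd(nodes, i) + 1)
        ++count;
    return count;
}
```

For an array holding an empty section followed by a section with one nested section, countTopLevelSections reports two top-level sections.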
Undo
The Undo/Redo information is stored in a sw::UndoManager member of SwDoc,
which implements the IDocumentUndoRedo interface.
Its members include a SwNodes array containing the document content that
is currently not in the actual document but required for Undo/Redo, and
a stack of SwUndo actions, each of which represents one user-visible
Undo/Redo step.
There are also ListActions which internally contain several individual SwUndo
actions; these are created by the StartUndo/EndUndo wrapper methods.
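The relationship between single actions and grouped list actions can be sketched as follows. Action and UndoStack are hypothetical types, not the real SwUndo or sw::UndoManager; discarding empty groups is a choice made for this sketch only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one undoable action (not the real SwUndo).
struct Action {
    std::string name;
    std::vector<Action> children; // non-empty only for a grouped "list action"
};

class UndoStack {
public:
    // Opens a group: actions added until the matching endUndo() become one step.
    void startUndo(std::string name) { m_open.push_back(Action{std::move(name), {}}); }

    // Closes the innermost group and records it as a single action.
    void endUndo()
    {
        assert(!m_open.empty());
        Action group = std::move(m_open.back());
        m_open.pop_back();
        if (!group.children.empty()) // sketch-only choice: drop empty groups
            add(std::move(group));
    }

    // Records an action, either as its own step or inside the open group.
    void add(Action action)
    {
        if (m_open.empty())
            m_steps.push_back(std::move(action));
        else
            m_open.back().children.push_back(std::move(action));
    }

    std::size_t userVisibleSteps() const { return m_steps.size(); }

    const Action& top() const
    {
        assert(!m_steps.empty());
        return m_steps.back();
    }

private:
    std::vector<Action> m_steps; // one entry per user-visible Undo step
    std::vector<Action> m_open;  // groups currently being recorded (may nest)
};
```

Wrapping several add() calls between startUndo() and endUndo() yields one user-visible step that contains the individual actions as children.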
Text Attributes
The sub-structure of paragraphs is stored in the SwpHintsArray member
SwTextNode::m_pSwpHints. There is a base class SwTextAttr with numerous
subclasses; the SwTextAttr has a start and end index and a SfxPoolItem
to store the actual formatting attribute.
There are several sub-categories of SwTextAttr:
- formatting attributes: Character Styles (SwTextCharFormat, RES_TXTATR_CHARFMT) and Automatic Styles (no special class, RES_TXTATR_AUTOFMT): these are handled by SwpHintsArray::BuildPortions and MergePortions, which create non-overlapping portions of formatting attributes.
- nesting attributes: Hyperlinks (SwTextINetFormat, RES_TXTATR_INETFMT), Ruby (SwTextRuby, RES_TXTATR_CJK_RUBY) and Meta/MetaField (SwTextMeta, RES_TXTATR_META/RES_TXTATR_METAFIELD): these maintain a properly nested tree structure. The Meta/Metafield are "special" because they have both start/end and a dummy character at the start.
- misc. attributes: Reference Marks, ToX Marks
- attributes without end: Fields, Footnotes, Flys (AS_CHAR). These all have a corresponding dummy character in the paragraph text, which is a placeholder for the "expansion" of the attribute, e.g. field content.
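To illustrate what "non-overlapping portions" means for formatting attributes, here is a minimal sketch. It is not the actual BuildPortions algorithm; Attr, Portion and buildPortions are hypothetical, and the approach (splitting at every start and end index) is just one simple way to obtain such portions.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical simplified attribute: a half-open range [start, end) plus a label.
struct Attr {
    int start;
    int end;
    std::string label;
};

struct Portion {
    int start;
    int end;
    std::vector<std::string> labels; // all attributes covering this portion
};

// Splits overlapping attributes at every start/end boundary, so that the
// resulting portions never overlap each other.
std::vector<Portion> buildPortions(const std::vector<Attr>& attrs)
{
    std::set<int> boundaries;
    for (const Attr& a : attrs) {
        assert(a.start <= a.end);
        boundaries.insert(a.start);
        boundaries.insert(a.end);
    }

    std::vector<Portion> portions;
    if (boundaries.empty())
        return portions;

    auto it = boundaries.begin();
    int prev = *it;
    for (++it; it != boundaries.end(); ++it) {
        Portion p{prev, *it, {}};
        for (const Attr& a : attrs)
            if (a.start <= p.start && p.end <= a.end)
                p.labels.push_back(a.label);
        if (!p.labels.empty()) // skip gaps that no attribute covers
            portions.push_back(p);
        prev = *it;
    }
    return portions;
}
```

Two attributes covering [0, 10) and [5, 15) produce three portions: [0, 5) with the first label, [5, 10) with both, and [10, 15) with the second.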
Fields
There are multiple model classes involved for fields:
- enum SwFieldIds enumerates the different types of fields.
- SwFieldType contains some shared stuff for all fields of a type. There are many subclasses of SwFieldType, one for each different type of field. For most types of fields there is one shared instance of this per type, which is created in DocumentFieldsManager::InitFieldTypes(), but for some there are more than one, and they are dynamically created, see DocumentFieldsManager::InsertFieldType(). An example for the latter are variable fields (SwFieldIds::GetExp/SwFieldIds::SetExp), with one SwFieldType per variable.
- SwXFieldMaster is the UNO wrapper of a field type. It is a SwClient registered at the SwFieldType. Its life-cycle is determined by UNO clients outside of sw; it will get disposed when the SwFieldType dies.
- SwFormatField is the SfxPoolItem of a field. The SwFormatField is a SwClient registered at its SwFieldType. The SwFormatField owns the SwField of the field.
- SwField contains the core logic of a field. The SwField is owned by the SwFormatField of the field. There are many subclasses of SwField, one for each different type of field. Note that there are not many places that can Expand the field to its correct value, since for example page number fields require a View with an up to date layout; therefore the correct expansion is cached.
- SwTextField is the text attribute of a field. It owns the SwFormatField of the field (like all text attributes).
- SwXTextField is the UNO wrapper object of a field. It is a SwClient registered at the SwFormatField. Its life-cycle is determined by UNO clients outside of sw; it will get disposed when the SwFormatField dies.
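The difference between field types with one shared instance and field types created per variable can be sketched like this. FieldId, FieldType and FieldTypeRegistry are hypothetical simplifications and do not reproduce DocumentFieldsManager; in particular, returning an existing instance for a repeated id and name is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, much reduced set of field ids.
enum class FieldId { PageNumber, DateTime, SetExp };

struct FieldType {
    FieldId id;
    std::string name; // only meaningful for per-variable types
};

class FieldTypeRegistry {
public:
    FieldTypeRegistry()
    {
        // Fixed types: exactly one shared instance each, created up front.
        insert(FieldId::PageNumber, "");
        insert(FieldId::DateTime, "");
    }

    // Returns the existing type with this id and name, or creates it on demand
    // (sketch assumption, see the lead-in).
    FieldType& insert(FieldId id, const std::string& name)
    {
        for (const auto& type : m_types)
            if (type->id == id && type->name == name)
                return *type;
        m_types.push_back(std::unique_ptr<FieldType>(new FieldType{id, name}));
        return *m_types.back();
    }

    std::size_t size() const { return m_types.size(); }

private:
    // Heap allocation keeps addresses stable for objects that refer to a type.
    std::vector<std::unique_ptr<FieldType>> m_types;
};
```

Inserting SetExp types named "x" and "y" yields two distinct instances, while asking for the page number type again returns the shared one.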
Lists
- SwNumFormat (subclass of SvxNumFormat) determines the formatting of a single numbering level.
- SwNumRule (NOT a subclass of SvxNumRule) is a list style, containing one SwNumFormat per list level. SwNumRule::maTextNodeList is the list of SwTextNode that have this list style applied.
- SwNumberTreeNode is a base class that represents an abstract node in a hierarchical tree of numbered nodes.
- SwNodeNum is the subclass of SwNumberTreeNode that connects it with an actual SwTextNode and also with a SwNumRule; SwTextNode::mpNodeNum points back in the other direction.
- SwList represents a list, which is (mostly) a vector of SwNodeNum trees, one per SwNodes top-level section (why that?).
- IDocumentListsAccess, sw::DocumentListsManager owns all SwList instances, and maintains mappings:
  - from list-id to SwList
  - from list style name to SwList (the "default" SwList for that list style)
- IDocumentListItems, sw::DocumentListItemsManager contains a set of all SwNodeNum instances, ordered by SwNode index.
- the special Outline numbering rule: SwDoc::mpOutlineRule
- IDocumentOutlineNodes, sw::DocumentOutlineNodesManager maintain a list (which is actually stored in SwNodes::m_pOutlineNodes) of SwTextNodes that either have the Outline numrule applied, or have the RES_PARATR_OUTLINELEVEL item set (note that in the latter case, the SwTextNode does not have a SwNodeNum and is not associated with the SwDoc::mpOutlineRule).
- SwTextNodes and paragraph styles have items/properties:
  - RES_PARATR_OUTLINELEVEL/"OutlineLevel" to specify an outline level without necessarily having the outline SwNumRule assigned
  - RES_PARATR_NUMRULE/"NumberingStyleName" the list style to apply; may be empty "" which means no list style (to override inherited value)
- Only SwTextNode has these items:
  - RES_PARATR_LIST_ID/"ListId" determines the SwList to which the node is added
  - RES_PARATR_LIST_LEVEL/"NumberingLevel" the level at which the SwTextNode will appear in the list
  - RES_PARATR_LIST_ISRESTART/"ParaIsNumberingRestart" restart numbering sequence at this SwTextNode
  - RES_PARATR_LIST_RESTARTVALUE/"NumberingStartValue" restart numbering sequence at this SwTextNode with this value
  - RES_PARATR_LIST_ISCOUNTED/"NumberingIsNumber" determines if the node is actually counted in the numbering sequence; these are different from "phantoms" because there's still a SwTextNode.
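How the per-node list properties could interact is shown in the following sketch. ListItem and numberItems are hypothetical; the fields only loosely mirror the level, restart, restart value and counted items listed above, and the counting rules (deeper counters reset when returning to a shallower level, uncounted items get no label) are assumptions of this sketch rather than a description of SwNodeNum.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-paragraph list properties.
struct ListItem {
    std::size_t level; // cf. "NumberingLevel"
    bool isCounted;    // cf. "NumberingIsNumber"
    bool isRestart;    // cf. "ParaIsNumberingRestart"
    int restartValue;  // cf. "NumberingStartValue", used only if isRestart
};

// Returns one label per item, e.g. "1.2"; uncounted items get an empty label.
std::vector<std::string> numberItems(const std::vector<ListItem>& items)
{
    std::vector<int> counters; // counters[l] = last number used at level l
    std::vector<std::string> labels;
    for (const ListItem& item : items) {
        if (!item.isCounted) {
            labels.emplace_back();
            continue;
        }
        // Shrinking drops the counters of deeper levels; growing adds zeros.
        counters.resize(item.level + 1, 0);
        if (item.isRestart)
            counters[item.level] = item.restartValue;
        else
            ++counters[item.level];

        std::string label;
        for (std::size_t l = 0; l <= item.level; ++l) {
            if (l > 0)
                label += '.';
            label += std::to_string(counters[l]);
        }
        labels.push_back(label);
    }
    return labels;
}
```

An uncounted item keeps its place in the sequence of paragraphs but does not advance the counter, so the next counted item at the same level continues the numbering.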
Note that there is no UNO service to represent a list.
Layout
The layout is a tree of SwFrame subclasses. The following relationships are
possible between frames:
- You can visit the tree by following the upper, lower, next and previous pointers.
- The functionality of flowing of a frame across multiple parents (e.g. pages)
  is implemented in SwFlowFrame, which is not an SwFrame subclass. The logical
  chain of such frames can be visited using the follow and precede pointers.
  ("Leaf" is a term that refers to such a relationship.)
- In case a frame is split into multiple parts, then the first one is called
  master, while the others are called follows.
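The master and follow relationship can be pictured as a doubly linked chain. FlowPart and the helper functions below are hypothetical and far simpler than SwFlowFrame; they only show how follow and precede pointers let code walk from any part to the master or count the follows.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for one part of a frame that was split across parents.
struct FlowPart {
    FlowPart* precede = nullptr; // previous part in the logical chain
    FlowPart* follow = nullptr;  // next part in the logical chain
};

// Appends `next` as the follow of `part`.
void linkFollow(FlowPart& part, FlowPart& next)
{
    assert(part.follow == nullptr && next.precede == nullptr);
    part.follow = &next;
    next.precede = &part;
}

// Walks the precede pointers back to the first part, the master.
const FlowPart* findMaster(const FlowPart* part)
{
    assert(part != nullptr);
    while (part->precede != nullptr)
        part = part->precede;
    return part;
}

// Counts the follows that come after the given part.
std::size_t countFollows(const FlowPart* part)
{
    assert(part != nullptr);
    std::size_t n = 0;
    for (const FlowPart* p = part->follow; p != nullptr; p = p->follow)
        ++n;
    return n;
}
```

For a paragraph split over three pages, findMaster returns the first part from any of the three, and countFollows on the master returns two.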