Modular Infrastructure for Inclusive Housing Tran Thien Toan Ngo · PhD Dissertation

Purpose and Interface Scope

The complete evaluation results for Chapter 5 (the standardisation schema) are documented here, and the supplementary data package deposited with this thesis is described in the final section. The evaluation addressed three substantive questions: whether the schema resolves clause-level ambiguity in the NDIS SDA Design Standard 2019, whether resolution concentrates differentially across clause classes in a pattern consistent with the schema’s design intent, and whether the two-channel architecture is warranted by measurable complementarity between the text and figures channels. All reported metrics are grounded in artefacts held in the data package described in the final section; evidence registers mapping reported results to specific package files are listed there.


Text-Channel Evaluation Corpus and Integrity

The text-channel evaluation corpus comprises 611 serialised clauses drawn from the NDIS SDA Design Standard 2019, covering 25 design categories. Clauses were serialised using the schema documented in Appendix: Standards Serialisation Schema, and each was assigned one of four clause-class labels (design_requirement, rationale, applicable_to, other). Corpus integrity baselines (structural completeness checks, identity-stability analysis, and cross-layer coverage measures) are reported in Appendix: SDA Corpus Integrity Metrics.


Text-Channel Ambiguity Resolution Results

The central evaluation measure is the per-clause ambiguity-delta: the reduction in measurable ambiguity achieved by the schema relative to an unstructured baseline. Across the 611-clause corpus, the mean per-clause ambiguity-delta is 0.2514. At the aggregate level, the full schema achieves an ambiguity-delta of 0.382, compared with 0.007 for the null model (no schema applied). The minimal-schema variant yields 0.315; clause-class annotation contributes +0.301 of the total gain and deontic annotation contributes a further +0.067.

Table A5.1: Ambiguity-delta by schema variant

| Schema variant | Ambiguity-delta | Contribution above null |
|---|---|---|
| Null model | 0.007 | |
| Minimal schema | 0.315 | +0.308 |
| Full schema | 0.382 | +0.375 |
| Clause-class gain | | +0.301 |
| Deontic gain | | +0.067 |
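
The contributions above null in Table A5.1 can be checked by simple subtraction. The following is a minimal sketch in Python that uses only the values reported in the table; the constant and function names are illustrative and are not drawn from the data package.

```python
# Ambiguity-delta values as reported in Table A5.1.
NULL_MODEL = 0.007
MINIMAL_SCHEMA = 0.315
FULL_SCHEMA = 0.382


def gain(upper: float, lower: float) -> float:
    """Difference between two ambiguity-delta values, rounded to 3 d.p."""
    return round(upper - lower, 3)


minimal_above_null = gain(MINIMAL_SCHEMA, NULL_MODEL)  # minimal schema vs null
full_above_null = gain(FULL_SCHEMA, NULL_MODEL)        # full schema vs null
deontic_gain = gain(FULL_SCHEMA, MINIMAL_SCHEMA)       # full vs minimal schema

print(minimal_above_null, full_above_null, deontic_gain)
```

On this reading, the deontic gain is the difference between the full and minimal variants. The table reports the clause-class gain (+0.301) separately from the minimal-schema contribution above null (+0.308), and this sketch does not attempt to reproduce it.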

Resolution is concentrated in design_requirement clauses, the class carrying direct normative force, with an ambiguity-delta of 0.570. The rationale class achieves 0.475 and applicable_to achieves 0.023; other achieves 0.000 by design, because the schema excludes residual clauses from resolution logic. The schema therefore resolves ambiguity where its consequences are highest rather than indiscriminately across the corpus. This clause-class-differentiated profile is the principal evidence that the schema’s design intent, to concentrate resolution power where normative stakes are highest, is realised in practice. The next section reports the deontic force analysis, which extends this picture to the modal dimension of the corpus.

Table A5.2: Ambiguity-delta by clause class

| Clause class | Ambiguity-delta | Structural role |
|---|---|---|
| design_requirement | 0.570 | Direct normative force; highest-consequence class |
| rationale | 0.475 | Explanatory support and justification |
| applicable_to | 0.023 | Scope or applicability condition |
| other | 0.000 | Residual class; excluded from schema resolution logic |

Structural resolution and deontic resolution rates are 90.6% and 90.9% respectively across the 611-clause corpus, indicating that clause-role and modality profile are preserved through the schema in the large majority of cases.


Deontic Force Analysis

The corpus contains 164 modal clauses, and their deontic outcomes require a three-way decomposition. In 75 cases, ontology coverage is insufficient to assess force; these coverage-gap cases represent a transparent design boundary rather than force loss. Of the remaining 89 cases, 61 represent actual force collapse and 28 represent force-preserved resolution. This decomposition separates ontology under-coverage from genuine modality loss, a distinction that a single aggregate collapse rate obscures.
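
The effect of the decomposition on the apparent collapse rate can be illustrated numerically. The sketch below uses only the three counts reported above; the derived rates are simple arithmetic on those counts, and the variable names are illustrative rather than taken from the data package.

```python
from collections import Counter

# Outcome counts for the 164 modal clauses, as reported in the text.
outcomes = Counter(coverage_gap=75, force_collapse=61, force_preserved=28)

total = sum(outcomes.values())                 # all modal clauses
assessable = total - outcomes["coverage_gap"]  # clauses whose force can be assessed

# Collapse rate over every modal clause.
collapse_rate_all = outcomes["force_collapse"] / total
# Collapse rate over assessable clauses only.
collapse_rate_assessable = outcomes["force_collapse"] / assessable
# Rate obtained if coverage gaps were conflated with genuine collapse.
collapse_rate_conflated = (
    outcomes["force_collapse"] + outcomes["coverage_gap"]
) / total

for label, value in [
    ("collapse / all modal clauses", collapse_rate_all),
    ("collapse / assessable clauses", collapse_rate_assessable),
    ("collapse + coverage gap / all", collapse_rate_conflated),
]:
    print(f"{label}: {value:.1%}")
```

The three rates differ widely depending on the denominator and on whether coverage gaps are counted as loss, which is why the three-way breakdown is reported instead of a single figure.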

Lexical ambiguity over the corpus is a census property, reported here as a plain proportion rather than estimated from a sample. Of the 204 lemmas that meet the frequency threshold of at least three appearances, 121 (59.3 per cent) are polysemous as the corpus stands. The polysemous set includes built-environment fundamentals (door, wall, height, basin) and spatial-category terms such as space and area; because these are also among the most frequent lemmas, their polysemy is the single largest source of the interpretive burden the schema must address explicitly. The burden concentrates where it matters most for compliance: the 54 fallback terms the schema retains account for 107 appearances within the 164 modal clauses. Full per-term polysemy counts are in Appendix: Polysemy Metrics and Confidence. Taken together, the text-channel ambiguity and deontic results show the schema reducing interpretive uncertainty where normative stakes are highest while leaving the residual burden visible rather than hidden. The figures-channel results in the next section, and the cross-channel complementarity they reveal, are what warrant the two-channel architecture.
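
The polysemy proportion reported above is a direct count over lemmas that meet the frequency threshold. The following sketch shows one way such a census could be computed. The input structure (a mapping from lemma to frequency and sense count), the function name, and the toy values are all assumptions made for illustration; they do not reflect the format of the deposited polysemy files.

```python
def polysemy_census(
    lemmas: dict[str, tuple[int, int]], min_freq: int = 3
) -> tuple[int, int, float]:
    """Count eligible and polysemous lemmas.

    `lemmas` maps each lemma to (corpus frequency, number of distinct senses).
    A lemma is eligible if its frequency is at least `min_freq`, and
    polysemous if it has more than one sense.
    """
    eligible = {k: v for k, v in lemmas.items() if v[0] >= min_freq}
    polysemous = [k for k, (_, senses) in eligible.items() if senses > 1]
    share = len(polysemous) / len(eligible) if eligible else 0.0
    return len(eligible), len(polysemous), share


# Toy data, invented for illustration only (not values from the corpus).
toy = {"door": (12, 3), "wall": (9, 2), "luminaire": (4, 1), "soffit": (1, 2)}
print(polysemy_census(toy))

# The proportion reported in the text: 121 polysemous of 204 eligible lemmas.
reported_share = 121 / 204
print(f"{reported_share:.1%}")
```

Because the threshold is a parameter, the same function can be re-run at adjacent cut-offs to examine sensitivity to the minimum frequency.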


Figures-Channel Results and Cross-Channel Complementarity

The figures channel provides 189 design requirements extracted from SDA Design Standard figures, complementing the 611-clause text corpus. Of the 189 figure-derived requirements, 75.7% have no equivalent in the text channel. Conversely, 86.4% of text-channel design requirements have no figure equivalent. The figures-channel ambiguity-delta is 0.9148, substantially higher than the text-channel mean of 0.2514. Together, these results indicate that the text and figures channels carry largely disjoint normative content and differ substantially in their amenability to structured resolution, which warrants the two-channel architecture over single-channel extraction from either source alone. The figures-channel results and the cross-channel complementarity data are the central evidentiary basis for the schema architecture decision documented in Chapter 5. The next section presents the stability and sensitivity checks that examine how consistently these results hold across corpus partitions.
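
The complementarity percentages above are shares of requirements in one channel that have no equivalent in the other. A minimal sketch of that calculation follows. The identifiers, the toy match list, and the function name are hypothetical; the deposited cross-validation file may represent matches differently.

```python
def unmatched_share(item_ids: set[str], matched_ids: set[str]) -> float:
    """Share of items in one channel with no equivalent in the other."""
    if not item_ids:
        return 0.0
    return len(item_ids - matched_ids) / len(item_ids)


# Toy identifiers, invented for illustration only.
figure_reqs = {"F1", "F2", "F3", "F4"}
text_reqs = {"T1", "T2", "T3", "T4", "T5"}
matches = [("F1", "T2")]  # hypothetical figure-to-text equivalences

figure_matched = {f for f, _ in matches}
text_matched = {t for _, t in matches}

figure_only = unmatched_share(figure_reqs, figure_matched)
text_only = unmatched_share(text_reqs, text_matched)
print(f"figure-only: {figure_only:.1%}, text-only: {text_only:.1%}")

# Approximate count implied by the reported 75.7% of 189 figure requirements.
implied_figure_only = round(0.757 * 189)
print(implied_figure_only)
```

The two shares are computed against different denominators, so they need not sum to any fixed value; each describes how much of one channel would be lost if only the other were extracted.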


Stability and Sensitivity Checks

Because the corpus is a complete census rather than a probability sample, stability is reported as descriptive structure across partitions of the corpus, with no interval estimate or significance test attached. When the 611 clauses are divided into ten equal partitions, on average 94.96 per cent of the terms in any one partition also occur in the others, so the schema vocabulary is shared across the corpus rather than confined to any single region of the standard. Structural coupling between primitives is reported descriptively rather than tested against a null: the co-occurrence counts for every primitive pair are deposited in the data package, and they show coupling concentrated among a minority of pairs rather than spread evenly. The one design parameter that could have driven the results, the minimum-frequency threshold, is shown to be non-load-bearing by reporting the analysis at each adjacent cut-off from one to ten appearances — no finding in this appendix changes across that range. The next section situates these results against established building and housing ontologies.
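
The partition-overlap measure described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure used for the reported figure: clauses are assumed to be already reduced to sets of terms, partitions are assumed to be contiguous and near-equal in size (611 clauses cannot be split into ten exactly equal parts), and all function names are hypothetical.

```python
def partition(items: list, k: int) -> list[list]:
    """Split `items` into k contiguous partitions of near-equal size."""
    base, extra = divmod(len(items), k)
    parts, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        parts.append(items[start:start + size])
        start += size
    return parts


def mean_shared_vocabulary(clause_terms: list[set[str]], k: int = 10) -> float:
    """Mean share of each partition's terms that also occur in other partitions."""
    vocabularies = [set().union(*part) for part in partition(clause_terms, k)]
    shares = []
    for i, vocab in enumerate(vocabularies):
        if not vocab:
            continue  # an empty partition contributes nothing
        others = set().union(*(v for j, v in enumerate(vocabularies) if j != i))
        shares.append(len(vocab & others) / len(vocab))
    return sum(shares) / len(shares) if shares else 0.0


# Toy clauses, invented for illustration only.
toy_clauses = [{"door", "wall"}, {"door"}, {"wall", "basin"}, {"door"}]
print(mean_shared_vocabulary(toy_clauses, k=2))

# Partition sizes that result for a 611-clause corpus split ten ways.
print([len(p) for p in partition(list(range(611)), 10)])
```

A high mean share indicates that most terms recur outside the partition in which they are observed, which is the descriptive property the stability check reports.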


External Benchmarking

Comparison against established building and housing ontologies supports the schema-first approach. The SDA Design Standard serialisation achieves 100.0% clause coverage of the corpus by construction. Applied to the same clause set, IFC4 achieves 32.1% coverage and BOT achieves 20.8%, leaving at least two-thirds of the normative content unrepresented under an ontology-extension strategy (67.9% under IFC4 and 79.2% under BOT). The systematic literature review underpinning Chapter 5 identified 15 of 134 floor-plan representation papers (11.2%) as addressing accessibility concepts, which indicates that accessibility representation is an under-served area in the field. The external benchmarking evidence therefore shows that the schema-first approach achieves normative coverage that established ontology-extension strategies do not match for this instrument, and it provides the comparative justification for developing a purpose-built serialisation schema rather than extending an existing ontology.


Supplementary Data Package

The Chapter 5 data package is deposited with this thesis in two forms: a single compressed archive and an expanded directory for direct file access. Its deposit locations are recorded in the Data Package Manifest below; an examiner can locate every component of the package from that manifest. All file references elsewhere in this appendix are given relative to the root of the deposited package (the ch5-artefact-bundle entry in the manifest).

Data Package Manifest

| Component | Location within the deposited thesis | Notes |
|---|---|---|
| Package root | publish-thesis/publish-data/appendix-data/ch5-artefact-bundle/ | The root of the deposited Chapter 5 artefact bundle; all relative references below resolve against it |
| Compressed archive | ch5-artefact-bundle/sda-data-package-v1.zip | Single-file archive of the complete package |
| Expanded package | ch5-artefact-bundle/data-package/ | Direct file access; canonical artefacts listed in Table A5.3 below |
| Interactive explorer | ch5-artefact-bundle/explorer/sda-standards-explorer.html | Self-contained HTML; opens in any web browser, no server required |
| Raw evidence files | ch5-artefact-bundle/evidence/ | Maps reported results to specific package artefacts for researchers accessing the package directly |

The interactive explorer, opened in any web browser, provides entity-browsing, triple inspection, and design-category filtering across the full serialised dataset without requiring a server. Within the expanded package, Table A5.3 lists the canonical artefact files in the canonical/ and text-corpus/ subdirectories.

Table A5.3: Canonical data package artefacts

| File | Reference | Contents |
|---|---|---|
| canonical/graph-ready-triples.json | F01 | Entity-resolved SVO triples for knowledge-graph ingestion |
| canonical/figures-triples.json | F02 | Normative triples extracted from standard figures |
| canonical/deontic-force-figures.json | F03 | Deontic-force classification per figure triple |
| canonical/ambiguity-delta-figures.json | F04 | Text-versus-figure ambiguity-delta scores |
| canonical/text-figure-cross-validation.json | F05 | Cross-validation results, text and figures channels |
| canonical/polysemy-figures-analysis.json | F06 | Polysemy analysis scoped to figure-extracted terms |
| canonical/sda_polysemy_analysis.json | F07 | Full corpus polysemy analysis, 1,236 terms |
| canonical/sda_ontological_clusters.json | F08 | Ontological cluster assignments |
| text-corpus/serialised-text-requirements.json | F10 | Serialised plain-text requirement clauses, 611 records |
| derived/unified-explorer-data-v1.json | | Integrated dataset: 56 entities, 189 triples |

Four flat-file exports in the package root support spreadsheet inspection: entities.csv, triples.csv, polysemy.csv, and requirements-by-category.csv. Semantic exports for graph-tool and semantic-web reuse are at semantic/sda-knowledge-graph.graphml (NetworkX DiGraph in GraphML format) and semantic/sda-ontology.ttl (RDF Turtle, planimetric.org namespace).

Field-level documentation for every field in every file is in data-package/DATA-DICTIONARY.md. Regeneration scripts are included in the package root — gen_csv_exports.py, gen_graphml.py, gen_rdf_turtle.py — all self-contained Python 3.12 programs that can be run against the canonical JSON artefacts to reproduce the derived outputs.
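
A reader working with the expanded package may wish to confirm that a file holds the number of records stated in Table A5.3. The sketch below shows one way to do so. It assumes, without confirmation from the source, that the file is a JSON document whose top level is an array of records; DATA-DICTIONARY.md should be consulted for the actual structure. The function names are hypothetical and are not part of the deposited regeneration scripts.

```python
import json
from pathlib import Path


def count_records(path: Path) -> int:
    """Return the number of records in a JSON file.

    Assumption: the file's top level is a JSON array of records.
    """
    with path.open(encoding="utf-8") as handle:
        data = json.load(handle)
    if not isinstance(data, list):
        raise ValueError(f"{path} does not contain a top-level JSON array")
    return len(data)


def matches_expected(path: Path, expected: int) -> bool:
    """Check a file's record count against the count stated in the appendix."""
    return count_records(path) == expected
```

For example, `matches_expected(path, 611)` could be called with `path` pointing at text-corpus/serialised-text-requirements.json inside the expanded package, since Table A5.3 states that this file holds 611 records.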


Provenance and Versioning

| Item | Detail |
|---|---|
| Source standard | NDIS SDA Design Standard 2019 (Australian Government) |
| Package version | 1.0 |
| Generation date | 2026-03-28 |
| Entity count | 56 |
| Triple count | 189 |
| Polysemy terms analysed | 1,236 |
| Licence | Creative Commons Attribution 4.0 International (CC BY 4.0) |

Evidence Appendices

The detailed findings for each evaluation dimension are presented in dedicated examiner-facing appendices:

Raw evidence files mapping reported results to specific data package artefacts are also deposited in the package’s evidence/ subdirectory (its location within the deposited thesis is recorded in the Data Package Manifest above), for reference by researchers accessing the data package directly.