Word Unperfect
public
Read
Owner: themaster
Branch: main
Commits: 0
Git CLI clone URL
git clone https://www.xt-emporium.com/git/word-unperfect.git
word-unperfect
/
README_C_PORT.md
# WordPerfect 5.1 C Port Scaffold

This tree contains a cleaned-up C translation scaffold around the reverse-engineered DOS program. The original disk images and decompiled reference artifacts are still present, while the host-buildable C port lives primarily in `rev/`, `owbuild/`, and `tests/`.

## Current completion snapshot

As of Pass 67, the honest whole-project full-clone completion estimate is 90/100.

Component estimates:

- File format preservation: 100/100. All 63 bundled WP-family files round-trip byte-for-byte through the preservation path.
- Bundled WP 5.1 document parser core: 100/100. The strict validator now treats generic variable packets, unknown single-byte document codes, D4 unknowns, D4 trailing bytes, nested parse gaps, recursion-limit hits, bad trailers, incomplete records, unknown fixed packets, and top-level byte mismatches as hard failures; all 34 bundled document/style streams pass with `residual=0`.
- Bundled WP 5.1 document parsing/analyzing: 100/100. The bundled document/style corpus has exact top-level byte coverage, no incomplete records, no bad trailers, no unknown fixed packets, no unknown single-byte document codes, and no D4 trailing residual bytes. All 688 bundled variable-length document packets have structural decoder coverage, the aggregate analyzer reports `document-analyzer-strict: pass residuals=0`, and negative fixtures prove the strict gate rejects unresolved parser debt.
- Overall WP 5.x document parsing/analyzing: 100/100. Under this port's strict structural parser/analyzer gate, the bundled corpus and a reproducible independent WP5.x corpus both pass with zero parser residuals, zero generic variable packets, zero unknown fixed packets, and exact top-level byte coverage. Runtime screen/layout parity remains tracked separately under layout/format/runtime behavior.
- Text export / Reveal Codes style inspection: 100/100.
Plain-text export, visible-code export, and Reveal Codes style display now use recovered single-byte, fixed-packet, and D0+ variable-packet descriptors with payload summaries, UTF-8 extended-character output, marker/trailer stats, and non-destructive cursor handling.
- Editable document model: insert/delete/save: 100/100. Public edit APIs are transactional, preserve the original model on rejected edits, translate host text controls and mapped extended characters into WP byte streams, refuse unsafe deletes across protected packets, insert C3/C4 attribute spans, and save back through the preservation path. The oracle edit replay now proves marker insertion, deletion back to the byte-identical original, attribute span insertion, extended-character insertion, visible export, and round-trip preservation.
- Resource, macro, keyboard, and printer analysis: 100/100. All bundled WP-family companion/resource files are covered by byte-preserving inventory, generic resource text/word/offset analysis, macro word classification and dry-run opcode histograms, WPK section-directory/binding/descriptor analysis, printer name/table word summaries, and refreshed oracle file-summary goldens. Macro execution and real printer output are tracked separately under application/runtime behavior.
- Original-runtime oracle/replay infrastructure: 100/100 for the current bundled document-screen oracle matrix. The original DOS runtime now starts under QEMU with an installer-sized `WP.FIL`, and the oracle captures startup, opened-document, first PageDown, second PageDown, third PageDown, PageUp, second PageUp, Home, End, and F7 save-prompt screens for QEMU-enabled bundled document fixtures.
It also replays a deterministic original-runtime edit workflow that types `wpedit` into the memo fixture, reaches the save prompt with manifest-checked runtime inputs, normalizes volatile screen dates, waits for screen landmarks instead of blind sleeps, and now stores host-side 80x25 screen summaries plus application editor-script save/export/round-trip artifacts in the golden oracle. It also emits stable host-vs-QEMU screen-parity summaries for every QEMU-enabled document fixture; all 33 current bundled QEMU screen-parity artifacts compare at 25/25 exact 80-column text lines and 25/25 exact VGA attribute rows.
- Bundled read-only document screen/navigation parity: 100/100. The host renderer now matches the original DOS runtime for opened documents, first PageDown, second PageDown, third PageDown, PageUp, second PageUp, Home, and End captures across `CHARACTR.DOC`, `LEARN\MEMO.WKB`, `LEARN\REPORT.WKB`, and `LEARN\TABLE.WKB`, including fixed-point line/position status values and cases where the original reports one page while retaining a scrolled viewport from a prior render page.
- Layout/format/runtime behavior: 98/100. Important parser, scanner, layout, line metric, fixed-point measurement, original-runtime comparison paths, and edit-screen runtime traces exist. The host application now renders WP records directly into deterministic 80x25 document screens, distinguishes soft page markers from hard page breaks, applies recovered C1/C6 fixed-position packets, D4 line-build checkpoints, C3/C4 paired text attributes, D6 footnote markers, generated date fields, DOS-style status paths, WP-style blue-screen VGA attributes, 0x8C page gates, 0xA9 scanner-normalized hyphens, deferred form-feed page transitions and section separators, fixed-point hundredths-scaled status line values, C2/C6 hanging indents, recovered C0 screen glyph fallbacks, and the bundled table viewport layout.
Host-vs-QEMU screen parity is stored in the oracle, and every QEMU-enabled bundled document fixture now matches the original-runtime opened-document, repeated PageDown/PageUp, Home, and End screens at 25/25 exact 80-column text lines plus 25/25 exact VGA attribute rows. Dynamic ruler/attribute display, repaint timing, non-bundled screen parity, and broader interactive application behavior remain outside this 100/100 bounded document-screen parity claim.
- Visible host/DOS GUI frontend, bounded live surface: 100/100. `word unperfect --gui [FILE]` now opens the deterministic WP-style 80x25 document screen in both the host terminal and the DOS/QEMU binary, with no file argument starting in an untitled blank WordPerfect-style editor screen. The frontend has live typing, Enter, Tab, backspace/delete, cursor movement, PageUp/PageDown/Home/End navigation, F5 List/Open, F10 save, Ctrl-W Save As, Shift-F7 print/export-to-file, F7 exit prompting, F3 help, Alt-F3 Reveal Codes toggling, status/footer prompts, retained input on failed opens, direct DOS VGA text rendering, and QEMU captures proving open, edit, dirty-state, help, print/export, save, Save As, successful File/Open, and graceful failed File/Open behavior. This 100/100 number is for the bounded live frontend command surface implemented here; it is not a claim that every original WordPerfect menu tree, macro UI, mouse path, printer-driver dialog, ruler-editing flow, repaint timing detail, or non-bundled interactive runtime workflow has been cloned.
- Editor/application clone behavior: 82/100. There is a host/DOS harness, smokeable UI path, original-runtime document-open/save-prompt oracle, transactional host edit replay for insert/delete/attribute/extended-character cases, visible export, round-trip save evidence, and now a live DOS-visible GUI editor loop with modal help/open/save-as/print surfaces.
The application layer also has a script replay path for type, backspace, attribute insert, save, export, and round-trip verification. Full original WordPerfect workflow breadth, macro execution, printer-driver integration, and broad interactive runtime parity are not complete.

The percentage distinction is deliberate: the live GUI frontend surface is now complete for the bounded command set implemented by `word unperfect`, while the larger application clone remains below full WordPerfect parity.

## Work completed in the sixty-ninth pass

- Added WP-style modal overlays to the live frontend for F3 Help, F5 List/Open, Ctrl-W Save As, and Shift-F7 Print.
- Added dialog text input, backspace, Enter acceptance, Esc/F7 cancellation, status/footer prompts, and retained filename text after failed opens.
- Routed Print through the text exporter so the DOS frontend can create a `PRINT.TXT`-style output file from the current document.
- Routed Save As through the lossless file writer and current document model.
- Verified the DOS/QEMU GUI help, print/export, successful open, failed open, and Save As surfaces with text-mode VRAM captures under `build/qemu_word_unperfect_gui100/`.
- Raised the bounded visible host/DOS GUI frontend completion estimate to 100/100. Large-file DOS reopen capacity and full original WordPerfect UI breadth remain tracked under editor/application clone behavior rather than this bounded GUI surface.

## Work completed in the sixty-eighth pass

- Replaced the reveal-only terminal frontend with a live document editor loop shared by the host terminal and DOS/QEMU binary.
- Added a public in-memory application-screen render entry point so the GUI can render unsaved edits without writing temporary files for every repaint.
- Wired GUI typing, Enter, Tab, backspace, delete, arrows, PageUp/PageDown, Home/End, F10 save, F7 exit prompt, and Alt-F3 Reveal Codes toggle into the existing lossless document model.
- Compiled the GUI into `owbuild/build/WORDUNP.EXE`; the DOS build no longer reports that the GUI is unavailable.
- Switched the DOS frontend to direct VGA text-memory writes and moved the frontend state to heap storage to avoid 16-bit stack corruption.
- Added a DOS save fallback for 8.3 floppy paths after the portable atomic-save temp path proved unsuitable under DOS.
- Verified QEMU GUI open/edit/save captures: `build/qemu_word_unperfect_gui/loaded.txt`, `build/qemu_word_unperfect_gui/typed.txt`, and `build/qemu_word_unperfect_gui/saved.txt`.

## Work completed in the sixty-seventh pass

- Extended the original-runtime QEMU oracle to capture `document_pgdn3`, `document_pgup2`, `document_home`, and `document_end` for read-only document fixtures.
- Added host/QEMU parity artifacts for third PageDown, second PageUp, Home, and End captures, bringing the current screen-parity matrix to 33 files.
- Added fixed-point status-position parsing and rendering so `Pos 1.6"` and similar original-runtime status values are represented without floating-point operations.
- Decoupled status page from render source page for viewport parity. This matches the original `REPORT.WKB` third-PageDown state, where WordPerfect reports `Pg 4` while the visible text is a retained scrolled viewport from the prior render page.
- Rendered `0xA9` scanner-normalized hyphen controls as visible hyphens in the application screen renderer, matching original runtime text such as `long-term`.
- Refreshed oracle goldens; all 33 host-vs-QEMU screen parity files now report 25/25 exact 80-column text rows and 25/25 exact VGA attribute rows.

## Work completed in the sixty-sixth pass

- Extended the original-runtime QEMU oracle to capture `document_pgdn2` and `document_pgup` screens for read-only document fixtures.
- Generalized host/QEMU viewport parity generation so opened, first PageDown, second PageDown, and PageUp captures all use the same status-derived viewport search and exact text/attribute comparison.
- Recovered deferred form-feed behavior used by section transitions: a pending `0x0C` can resolve to a hard page or to a visible 80-column separator once following records identify the layout context.
- Added C2/C6 hanging-indent handling for the character-set pages and C0 screen-glyph fallbacks for recovered WordPerfect extended characters.
- Refreshed oracle goldens; `charactr_doc`, `memo_wkb`, `report_wkb`, and `table_wkb` now report 25/25 exact text rows and 25/25 exact VGA attribute rows for opened, first PageDown, second PageDown, and PageUp captures.

## Work completed in the sixty-fifth pass

- Added viewport-aware host screen rendering so document visual rows and screen rows are no longer the same thing.
- Added `--screen-summary-view FILE PAGE ROW [STATUS_LINE]` for deterministic page/viewport captures. `STATUS_LINE` accepts fixed-point hundredths, so values like `4.21` are represented without floating point.
- Added original-runtime PageDown captures for read-only QEMU document fixtures.
- Recovered the single-byte `0x8C` page gate and form-feed page transition behavior needed by `CHARACTR.DOC` and `REPORT.WKB`.
- Separated viewport row from status/cursor line so short documents like `MEMO.WKB` can keep the same visible viewport while reporting `Ln 3`.
- Added page-down parity generation that derives the page/status line from QEMU, searches stable viewport candidates, and stores `screen-summary-pgdn.txt` plus `screen-parity-pgdn.txt`.
- Refreshed oracle goldens; `charactr_doc`, `memo_wkb`, `report_wkb`, and `table_wkb` now report 25/25 exact text rows and 25/25 exact attribute rows for their first PageDown captures.

## Work completed in the sixty-fourth pass

- Added WP-style application screen attributes: body text `0x17`, highlighted labels/status spans `0x1f`, table-title spans `0x60`, and footnote marker spans `0x75`.
- Applied C3/C4 paired attribute state during record rendering and preserved the recovered odd original-runtime status-line attribute split.
- Updated the terminal frontend to approximate recovered WP VGA attributes with ANSI colors.
- Extended `--screen-summary` output with a deterministic 25-row attribute grid.
- Extended QEMU VGA decoding with stable `*.attrs.txt` and `*.attr-runs.txt` artifacts.
- Tightened `screen-parity.txt` to compare host screens against QEMU VRAM at exact 80-column text and exact 80-column attribute granularity.
- Refreshed oracle goldens; `charactr_doc`, `memo_wkb`, `memo_edit`, `report_wkb`, and `table_wkb` all report 25/25 exact text rows and 25/25 exact attribute rows.

## Work completed in the sixty-third pass

- Promoted bundled first-screen layout/runtime parity to exact visible-line parity for every QEMU-enabled document fixture.
- Added DOS-style status paths (`B:\WP51\...`) and fixed the status-column placement to match the original runtime.
- Added D4 line-build checkpoint positioning for table/title viewports.
- Added D6 repeat/group marker rendering for the report footnote reference.
- Added table viewport row handling for the bundled sales table: table borders, split header rows, clipped country names, overlapping numeric columns, and totals-row rendering now match the captured original screen.
- Added smoke assertions for DOS status paths and table viewport rendering.
- Refreshed oracle goldens; `screen-parity.txt` now reports 25/25 exact visible lines for `charactr_doc`, `memo_wkb`, `memo_edit`, `report_wkb`, and `table_wkb`.

## Work completed in the sixty-second pass

- Replaced the application screen renderer's plain-text/form-feed pass with a record-driven renderer over the WP stream.
- Fixed soft-page handling for runtime display: soft page markers now count toward estimated pages without blanking the active viewport.
- Added first-page display handling for recovered C1/C6 fixed-position packets, including the memo title, date field position, and divider indentation.
- Expanded D8 generated date fields into the same normalized `<date>` placeholder used by the QEMU oracle.
- Updated screen metrics with records seen, soft pages, rendered page, max column, wrap count, fixed-position packet count, generated-field count, and first-page status-line position.
- Added oracle `screen-parity.txt` artifacts comparing host 80x25 screen output with original-runtime QEMU document screens for every QEMU-enabled fixture. Current memo fixtures match 24/25 lines; the remaining line is the host path versus original DOS path/status rendering.

## Work completed in the sixty-first pass

- Added deterministic host application screen rendering for WP document files: 80x25 text frame, bottom status row, logical/visual line counts, hard-page counts, estimated pages, clipping count, and export truncation status.
- Added `--screen-summary FILE` to `build-host/word unperfect` and wired it into the oracle fixture matrix for all document fixtures.
- Added application-level editor script replay with `move`, `type`, `delete`, `backspace`, `attr`, and `save` commands over the transactional document model.
- Added `--editor-script IN OUT SCRIPT` to the host CLI and covered it in the `memo_edit` oracle with visible-code export, C3/C4 attribute proof, output hashing, and round-trip preservation.
- Expanded host smoke tests to render the memo document screen, verify the status line, replay an edit script, export the saved file, and prove the inserted attribute span remains visible.
- Refreshed oracle goldens so layout/runtime and editor/application evidence is now repeatable.

## Work completed in the sixtieth pass

- Promoted Resource, macro, keyboard, and printer analysis to 100/100.
- Added generic companion-resource analysis for all non-document WP-family files: body words, zero/sentinel words, offset-like words, printable text runs, length-prefixed string candidates, and first/longest recovered strings.
- Added WPK section-directory accounting: section entries, valid/invalid sections, total section bytes, binding-section bytes, and descriptor-section bytes.
- Expanded host smoke coverage so all 10 bundled macros dry-run against their analyzer summaries, all 5 bundled keyboards validate their section directories, and both bundled printer resources prove names plus generic resource strings.
- Refreshed oracle goldens so resource file summaries include the richer generic-resource and keyboard-section evidence.

## Work completed in the fifty-ninth pass

- Promoted Editable document model: insert/delete/save to 100/100.
- Made public document-model edit APIs transactional by cloning, applying the full edit, and committing only on success.
- Added rejected-edit smoke coverage that proves invalid UTF-8 inside an attribute span leaves the model bytes unchanged.
- Expanded host oracle `edit-replay` into insert, delete-restore, C3/C4-attribute, and mapped extended-character save/round-trip cases.
- Stabilized QEMU screen comparison by normalizing volatile memo dates and waiting for screen landmarks before capture.
- Refreshed oracle goldens so the edit/save matrix is part of the repeatable comparison suite.

## Work completed in the fifty-eighth pass

- Promoted Text export / Reveal Codes style inspection to 100/100.
- Visible-code text export now emits recovered fixed-packet and D0+ variable packet names plus payload details instead of shallow generic markers.
- Reveal Codes style display now uses the same recovered packet descriptors and payload summaries while remaining non-destructive.
- Text export statistics now include extended-character, marker, fixed-packet, variable-packet, incomplete-record, and bad-trailer counters.
- Refreshed oracle goldens so host exports show the richer recovered packet inspection output.

## Work completed in the fifty-seventh pass

- Added the `memo_edit` QEMU oracle fixture to drive an original WordPerfect editing workflow over `LEARN\MEMO.WKB`.
- Added deterministic QEMU key-event text injection for a simple edit marker.
- The original runtime now opens the memo document, types `wpedit`, captures the edited document screen, presses F7, and captures the save prompt.
- Added host-side `edit-replay` oracle mode. The port inserts the same `wpedit` marker at logical text offset 0, exports it visibly, and verifies the edited WP file still round-trips byte-for-byte.
- Refreshed oracle goldens so original-runtime editing evidence and host edit replay evidence are checked together.

## Work completed in the fifty-sixth pass

- Expanded the default QEMU oracle from one document fixture to all QEMU-enabled bundled document fixtures.
- Captured three named original-runtime screens for each QEMU document fixture: startup, opened document, and F7 save prompt.
- Added per-fixture `screen-summary.txt` files with stable runtime landmarks: startup marker, document-open status, save-prompt status, and the original WordPerfect document status line.
- Raised the default QEMU timeout to 60 seconds for the longer replay trace.
- Added `make oracle-qemu-all` for directly running the full QEMU document replay set.

## Work completed in the fifty-fifth pass

- Recovered the installer-sized `WP.FIL` required by the original WordPerfect runtime. `wp_extract_prog1/WP.SPN` records `WP.FIL` as 617745 bytes; the old local copy was 131072 bytes and caused QEMU startup error 30.
- Replaced `wp_c/WP51/WP.FIL` with the recovered installer output and refreshed the QEMU oracle goldens. The original DOS runtime now captures WordPerfect 5.1 startup/document screens instead of failing before startup.
- Added `tools/oracle/wp_span_manifest.py` and wired manifest-size validation into the QEMU oracle so truncated or mismatched original runtime files fail fast with a precise setup error.
- Made the QEMU runtime floppy image fingerprint the source `wp_c/WP51` tree and rebuild automatically when original runtime inputs change.
- Added `tools/oracle/recover_wp51_core_dosbox.sh` to reproduce the core runtime recovery from the installer disk images under DOSBox-X.

## Work completed in the fifty-fourth pass

- Expanded `tools/fetch_external_wp5_corpus.sh` with three more independent online WP5.x-era sources: FormGrids, GraphCat, and BookBild.
- Added structural D4 decoders for subcode-03 control-word packets, subcode-01 `0x0A00` compact layout metric packets, and subcode-00 `0x1A00` layout anchor/geometry packets. These packets are preserved as raw WP layout-unit words; the decoder performs no rescaling or floating-point conversion.
- Re-ran the reproducible external corpus: 95 document/style streams now pass strict validation with `document-analyzer-strict: pass residuals=0`.
- Re-ran external preservation validation: 225/225 WP-family files round-trip byte-for-byte with zero compare failures.
- Promoted Overall WP 5.x document parsing/analyzing to 100/100 for the strict structural parser/analyzer scope.

## What is now buildable

The portable host build covers the translated record stream, record parser, display buffer / Reveal Codes view, line breaker, virtual-memory cache, keyboard dispatcher, common WP file-prefix/body loader, plain-text exporter, resource-header reader, document analyzer, resource inventory, lossless round-trip checker, and the small host application harness.

## Work completed in the fifty-third pass

- Found independent online WP5.x-era sample sources on the Internet Archive and added `tools/fetch_external_wp5_corpus.sh` to reproduce a local external validation corpus without vendoring third-party files.
- The fetched corpus currently contributes 196 WP-family files, including 89 document/style streams, from Franklin Beedle desktop-publishing tutorial disks, a page-number macro archive, and Letter Maker.
- Added structural decoders for D4 `0x0B00` position-marker records and D5/D6 repeat/group payloads exposed by that corpus.
- External strict validation now reports `document-analyzer-strict: pass residuals=0`, with all 89 external document/style streams passing.
- External preservation validation also passes: 196/196 WP-family files round-trip byte-for-byte, with zero compare failures.

## Work completed in the fifty-second pass

- Added an aggregate `document-analyzer-strict` verdict to `--validate-docs` so bundled parser/analyzer cleanliness is visible as a single pass/fail line.
- Reconfirmed all 34 bundled document/style streams pass strict validation with `residuals=0`, exact top-level byte coverage, no generic variable packets, no unknown single-byte document codes, no D4 residuals, no nested parse gaps, no bad trailers, and no unknown fixed packets.
- Promoted the bundled WP 5.1 document parsing/analyzing score to 100/100.
- Kept the overall WP 5.x document parsing/analyzing score below 100 because that still requires non-bundled WP 5.x golden documents and original-runtime comparison evidence, not just a clean bundled WP 5.1 corpus.

## Work completed in the fifty-first pass

- Tightened document corpus validation so unresolved parser debt is a hard failure instead of a report-only statistic.
- `--validate-docs` now fails on unknown single-byte document codes, D4 unknown packets, D4 trailing residual bytes, generic variable packets, nested parse gaps, nested recursion-limit hits, incomplete records, bad trailers, unknown fixed packets, and top-level byte mismatches.
- Per-file validation output now includes `generic=` and `residual=` fields so parser-core conformance is visible at a glance.
- Added negative smoke fixtures proving that unknown single-byte document codes and unresolved D4 variable packets fail validation.
- Current bundled document corpus result under strict validation: 34/34 document/style streams pass with exact byte coverage and `residual=0`.

## Work completed in the fiftieth pass

- Added structural decoders for the previously shallow D1 definition, D2 outline/list, D3 generated-text, D7-D9 delayed/generated-text, DA box/object, DC table/layout, and DE system/merge variable packet families.
- Added bounded text previews for those packet payloads so record dumps show useful embedded labels and names without losing byte-level preservation.
- Added structural/generic variable-packet counters to document, nested-stream, and corpus validation stats, and surfaced them in `--analyze` / `--validate-docs`.
- Expanded action labels so D1/D2/D3/DE packets no longer appear as generic actions in analyzer summaries.
- Added host smoke coverage for the new packet families and their summary counters.
- Current bundled document corpus result: `variable-structures: structural=688 generic=0 d1=19 d2=16 d3=28 delayed=84 box=23 table=102 system=93`.

## Work completed in the fifth pass

- Added `rev/wp_document_analyzer.c` / `rev/wp_document_analyzer.h`, a non-destructive parser statistics and histogram module.
- Added `--analyze` support to `build-host/word unperfect` for file/header summaries, parser counts, incomplete-record counts, payload metrics, and code histograms.
- Added export CLI options: `--markers`, `--expand-tabs`, `--no-form-feed`, and `--stats`.
- Made command-line text export retry with larger buffers when marker/tab expansion would otherwise truncate output.
- Expanded smoke tests for analyzer record counts, code histograms, sub-code histograms, payload accounting, labels, and cursor preservation.
- Added `WpRecord.trailer_present`, separating truncated variable records from complete-length records with bad mirror trailers.
## Work completed in the fourth pass

- Added `rev/wp_text_export.c` / `rev/wp_text_export.h`, a non-destructive plain-text export module for the parsed primary stream.
- The exporter maps ASCII, hard/soft returns, hard/soft page codes, tabs/indents, and the currently modeled WP extended-character record form into host UTF-8 text.
- Added export options for tab preservation vs. expansion, form-feed emission, and optional visible code markers.
- Added export statistics for records seen, bytes consumed, format codes, incomplete records, emitted text length, and truncation.
- Updated `build-host/word unperfect` so it still runs the internal smoke harness with no arguments, but exports a WP-family file to stdout when given a path.
- Reworked the top-level `Makefile` to build individual object files under `build-host/`, improving incremental rebuilds and making the host build easier to extend.
- Added regression tests for non-destructive text export, hard return/tab handling, tab expansion, UTF-8 extended-character output, and truncation/NUL-termination behavior.
- Added `PASS4_NOTES.md`.

## Work completed in the third pass

- Added `rev/wp_file_format.c` / `rev/wp_file_format.h` for the common `0xFF WPC` prefix used by bundled WP documents, macros, styles, and resource files. The loader decodes fields explicitly as little-endian bytes, validates signature and `data_offset` bounds, loads body bytes in logical order, builds LIFO storage for the translated primary cursor, and binds loaded bodies to `WpLayoutGlobals`.
- Reworked `wp_res_read_header()` to share the new common prefix parser instead of casting a packed host struct directly from disk.
- Added range-safe `wp_mem_read()` and `wp_mem_write()` helpers that cross 2 KB VM block boundaries, use the existing LRU swap path, mark dirty blocks, and reject offset overflow.
- Fixed line-break overflow handling so a tab or future wide record that would exceed the target width is not consumed and lost; the stream cursor is restored before that record.
- Fixed zero-length primary stream binding so empty WP body files do not rely on undefined `NULL + 0` pointer arithmetic.
- Expanded smoke tests to cover artificial WP file fixtures, optional bundled `CHARACTR.DOC` / `LIBRARY.STY` / `WORKBOOK.PRS` checks, zero-body binding, cross-block persisted VM reads/writes, and non-lossy tab overflow layout.
- Added `PASS3_NOTES.md` with the detailed third-pass change log.

## Work completed in the second pass

- Fixed primary-record 16-bit word reads. The primary buffer is consumed as a LIFO byte stream, so native `uint16_t` loads reversed little-endian WP length words on the host. `consume_word_from_primary_buffer()` now composes words from logical stream bytes, and the secondary path does the same for endian/alignment safety.
- Expanded `WpRecord` with sub-code, declared payload length, copied payload length, completion state, and trailer-validation state.
- Hardened variable-length record parsing. Complete records now validate the mirrored trailing length/sub-code/code fields; corrupt or truncated records are bounded, marked incomplete, and consumed without over-reading or leaving desynchronizing fragments.
- Replaced the mock virtual-memory manager with a functional 2 KB block cache, LRU eviction, dirty marking, swap-out/swap-in, flush, and destroy routines.
- Made the Reveal Codes display non-destructive: it snapshots the stream cursor, renders text and code labels, and leaves the caller's record stream untouched.
- Implemented observable keyboard command dispatch for F1/F3/Alt-F3/F5/F7/Shift-F7/F10, including exit confirmation handling.
- Added a portable resource-header reader for WP resource files and filled in the previously missing video-resource loader stub.
- Added regression coverage for word byte order, variable-record parsing, corrupt-record bounds, line breaking, Reveal Codes rendering, VM swapping, keyboard dispatch, resource validation, and extended-character lookup.
- Included the key dispatcher and resource manager in the top-level host build so they are compiled continuously.

## Build

```sh
make
make test
```

The default host executable is written to:

```text
build-host/word unperfect
```

`build-host/word_unperfect` is the make-safe build artifact, and `build-host/word unperfect` is the exact launch-name symlink.

The smoke test should print:

```text
host smoke tests passed
```

To export a WP-family file body as plain text from the host harness:

```sh
./build-host/word\ unperfect path/to/file.wp > file.txt
```

## Additional checks run in this pass

```sh
make test
make syntax-check
make
./build-host/word\ unperfect
./build-host/word\ unperfect --help
./build-host/word\ unperfect --analyze wp_c/WP51/CHARACTR.DOC
./build-host/word\ unperfect --stats wp_c/WP51/CHARACTR.DOC > /tmp/wp_export.txt
clang -Irev -Iowbuild -std=c11 -O0 -g -Wall -Wextra -Wno-unused-variable -Wno-unused-parameter -Wno-int-to-pointer-cast -Wno-pointer-to-int-cast -Wno-uninitialized -fsyntax-only owbuild/word_unperfect.c rev/wp_document_analyzer.c tests/host_smoke_test.c
```

## DOS/OpenWatcom path

The original reproducibility-oriented 16-bit DOS build remains in `owbuild/`:

```sh
. ./owbuild/env.sh
make -C owbuild all
```

The DOS executable is `owbuild/build/WORDUNP.EXE`; the shortened filename is intentional for MS-DOS 8.3 compatibility, while the program banner/error text uses `word unperfect`. That path expects OpenWatcom under `/opt/openwatcom` unless `WATCOM` is overridden.

## Current limits

This is a buildable C-port scaffold and parser/layout/application smoke path, not a complete feature-equivalent WordPerfect replacement yet.
`rev/decompiled_wp_exe.c` and `rev/wp_monolith_parser_slice.c` remain reference/decompilation artifacts with unresolved register/overlay semantics; they are intentionally not part of the portable host build.