Bible Text Repositories Guide
Introduction
This guide covers how to handle Bible text repositories in Door43, including both original language texts and gateway language translations. These repositories contain complete Bibles in USFM format with different levels of alignment and complexity.
Repository Types Covered:
- Original Language Texts: Hebrew Bible (UHB), Greek New Testament (UGNT)
- Gateway Language Translations: Literal Text (ULT), Simplified Text (UST)
Original Language Bible Repositories
Examples: hbo_uhb, el-x-koine_ugnt
Key Characteristics:
- Subject: "Bible"
- Container type: "bundle"
- Content: Original Hebrew/Greek text
- Format: USFM 3.0
- Tokenization: Words marked for alignment
- Scope: One testament per repository (UHB: 39 Old Testament books; UGNT: 27 New Testament books)
Gateway Language Bible Repositories
Examples: en_ult, en_ust
Key Characteristics:
- Subject: "Aligned Bible"
- Container type: "bundle"
- Content: Gateway language translation
- Format: USFM 3.0 with word alignment
- Alignment: Word-level connections to original languages
- Scope: Complete Bible (66 books)
How to Identify Bible Text Repositories
Step 1: Check the Manifest Subject
- Look for the dublin_core.subject field in manifest.yaml
- Original texts have subject "Bible"
- Gateway translations have subject "Aligned Bible"
Step 2: Verify Container Type
- Check the dublin_core.type field
- Should be "bundle" for Bible text repositories
Step 3: Confirm File Structure
- Look for numbered USFM files (01-GEN.usfm, 02-EXO.usfm, etc.)
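- Should have files for most or all books in scope: 66 for a complete Bible, or 39 (Old Testament) or 27 (New Testament) for a single-testament repository
- Files follow the pattern: {NN}-{BOOK}.usfm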
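The three identification steps above can be sketched as a single check. This is a minimal illustration, not an official Door43 routine: it assumes manifest.yaml has already been parsed into a dictionary (for example with a YAML library), and the function name `classify_repository` and the file-name regex are hypothetical.

```python
import re

# Subjects named in this guide; other Door43 subjects are out of scope here.
BIBLE_SUBJECTS = {"Bible": "original", "Aligned Bible": "gateway"}

# Matches names such as 01-GEN.usfm or A0-FRT.usfm (assumed pattern).
BOOK_FILE = re.compile(r"^[0-9A-Z]{2}-[0-9A-Z]{3}\.usfm$")


def classify_repository(manifest, filenames):
    """Return 'original', 'gateway', or None for a parsed manifest.

    manifest:  manifest.yaml content already loaded into a dict
    filenames: names of the files at the top level of the repository
    """
    core = manifest.get("dublin_core", {})
    # Step 1: subject must be "Bible" or "Aligned Bible".
    kind = BIBLE_SUBJECTS.get(core.get("subject"))
    # Step 2: container type must be "bundle".
    if kind is None or core.get("type") != "bundle":
        return None
    # Step 3: at least one file must follow the {NN}-{BOOK}.usfm pattern.
    if not any(BOOK_FILE.match(name) for name in filenames):
        return None
    return kind
```

A stricter check could also compare the number of matching files against the expected book count for the repository's scope.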
Original Language Bible Manifest
dublin_core:
  identifier: 'uhb' # Resource identifier
  language:
    identifier: 'hbo' # Biblical Hebrew language code
    direction: 'rtl' # Right-to-left text direction
  subject: 'Bible' # Original language text
  type: 'bundle' # Complete collection
  version: '2.1.30' # Resource version
  format: 'text/usfm3' # USFM 3.0 format
projects: # One entry per book (39 Old Testament books)
  - identifier: 'gen' # Book identifier
    title: 'Genesis' # Human-readable title
    path: './01-GEN.usfm' # File path
    sort: 1 # Display order
    versification: 'original' # Versification system
    categories: ['bible-ot'] # Old Testament
Gateway Language Bible Manifest
dublin_core:
  identifier: 'ult' # Resource identifier
  language:
    identifier: 'en' # English language code
    direction: 'ltr' # Left-to-right text direction
  subject: 'Aligned Bible' # Gateway language with alignment
  type: 'bundle' # Complete collection
  version: '86' # Resource version
  format: 'text/usfm3' # USFM 3.0 format
  relation: # Dependencies
    - 'en/tw' # Translation Words
    - 'en/tn' # Translation Notes
    - 'hbo/uhb?v=2.1.30' # Hebrew Bible source
    - 'el-x-koine/ugnt?v=0.34' # Greek NT source
projects: # All 66 books
  - identifier: 'gen' # Book identifier
    title: 'Genesis' # Human-readable title
    path: './01-GEN.usfm' # File path
    sort: 1 # Display order
    versification: 'ufw' # Versification system
    categories: ['bible-ot'] # Old Testament
Original Language Repository Structure
hbo_uhb/
├── manifest.yaml # Resource Container manifest
├── LICENSE.md # CC BY-SA 4.0 license
├── 01-GEN.usfm # Genesis
├── 02-EXO.usfm # Exodus
├── 03-LEV.usfm # Leviticus
├── ... # All Old Testament books
├── 39-MAL.usfm # Malachi (last OT book)
└── README.md # Repository documentation
Gateway Language Repository Structure
en_ult/
├── manifest.yaml # Resource Container manifest
├── LICENSE.md # CC BY-SA 4.0 license
├── A0-FRT.usfm # Front matter
├── 01-GEN.usfm # Genesis with alignment
├── 02-EXO.usfm # Exodus with alignment
├── ... # Remaining Old Testament books
├── 41-MAT.usfm # Matthew (first NT book)
├── ... # Remaining New Testament books
└── 67-REV.usfm # Revelation
Original Language USFM Content
Hebrew Bible Sample (01-GEN.usfm):
\id GEN unfoldingWord® Hebrew Bible
\usfm 3.0
\ide UTF-8
\h בראשית
\toc1 בראשית
\toc2 בראשית
\toc3 בר
\mt בראשית
\c 1
\p
\v 1 בְּרֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃
Key Features:
- Hebrew text in right-to-left direction
- Standard USFM markers
- No alignment data (this is the source)
- Tokenized for alignment purposes
Gateway Language USFM Content
English ULT Sample (01-GEN.usfm):
\id GEN unfoldingWord® Literal Text
\usfm 3.0
\ide UTF-8
\h Genesis
\toc1 The Book of Genesis
\toc2 Genesis
\toc3 Gen
\mt Genesis
\c 1
\p
\v 1 \zaln-s |x-strong="H07225" x-lemma="רֵאשִׁית" x-content="בְּרֵאשִׁית"\*\w In|x-occurrence="1"\w* \w the|x-occurrence="1"\w* \w beginning|x-occurrence="1"\w*\zaln-e\* \zaln-s |x-strong="H0430" x-lemma="אֱלֹהִים" x-content="אֱלֹהִים"\*\w God|x-occurrence="1"\w*\zaln-e\* \zaln-s |x-strong="H01254" x-lemma="בָּרָא" x-content="בָּרָא"\*\w created|x-occurrence="1"\w*\zaln-e\* \zaln-s |x-strong="H0853" x-lemma="אֵת" x-content="אֵת"\*\zaln-e\* \zaln-s |x-strong="H08064" x-lemma="שָׁמַיִם" x-content="הַשָּׁמַ֖יִם"\*\w the|x-occurrence="2"\w* \w heavens|x-occurrence="1"\w*\zaln-e\* \zaln-s |x-strong="H0853" x-lemma="אֵת" x-content="וְאֵת"\*\w and|x-occurrence="1"\w*\zaln-e\* \zaln-s |x-strong="H0776" x-lemma="אֶרֶץ" x-content="הָאָֽרֶץ"\*\w the|x-occurrence="3"\w* \w earth|x-occurrence="1"\w*\zaln-e\*.
Key Features:
- English translation text
- Extensive word alignment markers (\zaln-s, \zaln-e, \w)
- Strong's concordance numbers
- Hebrew lemma and morphology data
- Occurrence tracking for precise alignment
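To make the marker layout concrete, here is a rough sketch that pulls source-to-target pairs out of one aligned verse using regular expressions. It assumes flat (non-nested) \zaln-s ... \zaln-e\* milestones shaped like the sample above; the function name `extract_alignments` is hypothetical, and production code would more likely rely on an alignment-aware USFM parser.

```python
import re

# One alignment milestone: \zaln-s |attrs\* ...words... \zaln-e\*
ZALN = re.compile(
    r"\\zaln-s\s*\|(?P<attrs>.*?)\\\*(?P<body>.*?)\\zaln-e\\\*", re.DOTALL
)
# Attributes such as x-strong="H0430"
ATTR = re.compile(r'(x-[\w-]+)="([^"]*)"')
# Target-language words such as \w God|x-occurrence="1"\w*
WORD = re.compile(r"\\w\s+([^|\\]+)\|[^\\]*\\w\*")


def extract_alignments(verse_text):
    """Return one dict per milestone: Strong's number, lemma, aligned words."""
    pairs = []
    for match in ZALN.finditer(verse_text):
        attrs = dict(ATTR.findall(match.group("attrs")))
        words = [w.strip() for w in WORD.findall(match.group("body"))]
        pairs.append({
            "strong": attrs.get("x-strong"),
            "lemma": attrs.get("x-lemma"),
            "words": words,  # empty when the source word is left untranslated
        })
    return pairs
```

Applied to the Genesis 1:1 sample, the first entry would carry the three words "In", "the", "beginning" under one Strong's number. Nested milestones, where several source words share one group of target words, are not handled by this sketch.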
How to Process Bible Text Repositories
Step 1: Identify Repository Type
Check Repository Characteristics:
- Verify the manifest subject is "Bible" or "Aligned Bible"
- Confirm container type is "bundle"
- Look for numbered USFM files covering multiple books
Determine Alignment Level:
- Original language texts: No alignment data
- Gateway language texts: Extensive alignment markers
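When the manifest is not at hand, the alignment level can also be guessed from the file content itself. The snippet below is a small heuristic sketch: it treats the presence of a \zaln-s milestone as the sign of an aligned text, and the function name `alignment_level` is hypothetical.

```python
def alignment_level(usfm_text):
    """Classify USFM text by the presence of alignment milestones."""
    # \zaln-s opens an alignment milestone in gateway language files.
    if "\\zaln-s" in usfm_text:
        return "aligned"
    return "unaligned"
```

The manifest subject remains the primary signal; this check is only a fallback or a sanity test.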
Step 2: Extract Bible Structure Information
From Manifest:
- Get the complete book list from the projects[] array
- Note the versification system used
- Check for book categories (bible-ot, bible-nt, bible-frt)
Expected Book Count:
- Complete Bible: 66 books (plus a front matter entry, if present)
- Old Testament only: 39 books
- New Testament only: 27 books
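The book counts above can be derived from the manifest alone. The following sketch assumes the manifest is already parsed into a dictionary and that each project carries the categories shown in the manifest examples; `summarize_books` is a hypothetical helper name.

```python
from collections import Counter


def summarize_books(manifest):
    """Count projects per category and infer the repository scope."""
    projects = manifest.get("projects", [])
    counts = Counter()
    for project in projects:
        for category in project.get("categories") or ["uncategorized"]:
            counts[category] += 1

    ot = counts.get("bible-ot", 0)
    nt = counts.get("bible-nt", 0)
    if ot == 39 and nt == 27:
        scope = "complete Bible"
    elif ot == 39 and nt == 0:
        scope = "Old Testament only"
    elif ot == 0 and nt == 27:
        scope = "New Testament only"
    else:
        scope = "partial"

    # Collect the distinct versification systems named by the projects.
    versification = sorted(
        {p["versification"] for p in projects if p.get("versification")}
    )
    return {
        "books": ot + nt,  # front matter is not counted as a book
        "scope": scope,
        "categories": dict(counts),
        "versification": versification,
    }
```

Reporting "partial" rather than raising an error lets an app still list whatever books a work-in-progress repository contains.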
Step 3: Process File Organization
File Naming Pattern:
- Books numbered by canonical order: 01-GEN.usfm, 02-EXO.usfm
- New Testament starts at 41: 41-MAT.usfm through 67-REV.usfm (the standard USFM book numbering skips 40)
- Front matter: A0-FRT.usfm (if present)
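A file name following this pattern can be split into its numeric prefix and book code with a short helper. This is an illustrative sketch; the name `parse_book_filename` and the exact regex are assumptions rather than part of any Door43 tooling.

```python
import re

# Two-character prefix, hyphen, three-character book code, .usfm extension.
BOOK_FILE = re.compile(r"^(?P<prefix>[0-9A-Z]{2})-(?P<book>[0-9A-Z]{3})\.usfm$")


def parse_book_filename(name):
    """Split '01-GEN.usfm' into prefix, number, and book code.

    Returns None when the name does not follow the pattern. Non-numeric
    prefixes such as 'A0' (front matter) yield number=None.
    """
    match = BOOK_FILE.match(name)
    if match is None:
        return None
    prefix = match.group("prefix")
    number = int(prefix) if prefix.isdigit() else None
    return {"prefix": prefix, "number": number, "book": match.group("book")}
```

The manifest's path and sort fields stay the authoritative source for ordering; parsing file names is mainly useful for validating a directory listing against the manifest.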
File Size Expectations:
- Small books (Philemon, 2-3 John): 5-15 KB
- Medium books (Ephesians, Philippians): 20-50 KB
- Large books (Genesis, Psalms): 100-500 KB
- Very large books (1 Chronicles): 500+ KB
Step 4: Handle Content Access
For Original Language Texts:
- Use raw URLs for direct access to Hebrew/Greek text
- Content is ready for parsing with standard USFM parser
- No alignment processing needed
For Gateway Language Texts:
- Use raw URLs for content access
- Content requires alignment-aware USFM parser
- Process alignment markers for word-level features
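A raw URL for a book file might be assembled as below. Everything about the URL layout here is an assumption to verify against the main Door43 API Developer Guide: the host https://git.door43.org, the /raw/branch/ path segment, the default branch name, and the example owner are illustrative, and `raw_file_url` is a hypothetical helper.

```python
from urllib.parse import quote


def raw_file_url(owner, repo, path, branch="master",
                 host="https://git.door43.org"):
    """Build a raw-content URL from a manifest project path.

    The URL layout is assumed, not taken from this guide.
    """
    # Manifest paths look like './01-GEN.usfm'; drop the leading './'.
    clean = path[2:] if path.startswith("./") else path
    return (
        f"{host}/{quote(owner)}/{quote(repo)}"
        f"/raw/branch/{quote(branch)}/{quote(clean)}"
    )


# Example with an assumed owner name:
url = raw_file_url("unfoldingWord", "en_ult", "./01-GEN.usfm")
```

Building the URL from the manifest's path field, rather than hard-coding file names, keeps the code working when a repository numbers its files differently.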
Step 5: Process Dependencies
Original Language Texts:
- Usually have minimal dependencies
- May reference gateway language translations
Gateway Language Texts:
- Always reference original language sources
- Reference support resources (TN, TW, TA, TQ)
- May reference parallel gateway translations
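Dependency entries in the manifest's relation list, such as 'hbo/uhb?v=2.1.30', can be broken into language, resource, and version. The sketch below assumes the 'language/resource?v=version' shape seen in the gateway manifest example; `parse_relation` and `original_language_sources` are hypothetical names.

```python
def parse_relation(entry):
    """Split 'hbo/uhb?v=2.1.30' into language, resource, and version."""
    ref, _, query = entry.partition("?")
    language, _, resource = ref.partition("/")
    version = None
    for part in query.split("&"):
        key, _, value = part.partition("=")
        if key == "v":
            version = value
    return {"language": language, "resource": resource, "version": version}


def original_language_sources(relations):
    """Keep only the original language texts named in this guide."""
    parsed = [parse_relation(entry) for entry in relations]
    return [rel for rel in parsed if rel["resource"] in {"uhb", "ugnt"}]
```

Entries without a version, such as 'en/tw', come back with version set to None, so callers can decide whether to fetch the latest release.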
How to Display Bible Text Repositories in Preview Apps
Step 1: Present Repository Information
- Show the Bible name and language clearly
- Indicate if it's original language or gateway language
- Display the translation approach (literal vs simplified for gateway languages)
Step 2: Organize Book Navigation
- Group books by testament (Old Testament, New Testament)
- Show each book by both its identifier and its full title
- Include book categories if available
Step 3: Handle Alignment Features
- For gateway language texts, indicate that word alignment is available
- Show related resources that work with this Bible
- Provide access to alignment-dependent features
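A preview that does not need word-level features may simply want readable text. As a rough sketch, alignment milestones and \w wrappers can be stripped with regular expressions; this assumes markers shaped like the gateway language sample, and `strip_alignment` is a hypothetical helper rather than a library function.

```python
import re

ZALN_START = re.compile(r"\\zaln-s\s*\|.*?\\\*", re.DOTALL)
ZALN_END = re.compile(r"\\zaln-e\\\*")
# \w word|attributes\w*  ->  word
WORD = re.compile(r"\\w\s+([^|\\]+?)\s*(?:\|[^\\]*)?\\w\*")


def strip_alignment(usfm_text):
    """Remove alignment markup, keeping the translated words and other markers."""
    text = ZALN_START.sub("", usfm_text)
    text = ZALN_END.sub("", text)
    return WORD.sub(r"\1", text)
```

Other USFM markers such as \v and \p are left in place, so the result can still be passed to a standard USFM renderer.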
Step 4: Show Content Statistics
- Display total number of books available
- Show versification system used
- Indicate checking level if available
How to Use Bible Text Repositories in Editing Apps
Step 1: Set Up Bible Access
- Configure access to all book files based on projects array
- Set up navigation between books and chapters
- Handle versification system appropriately
Step 2: Configure Alignment Processing (for gateway languages)
- Set up alignment marker parsing
- Enable word-level highlighting features
- Connect to original language sources for alignment data
Step 3: Enable Related Resources
- Configure access to Translation Notes for verse guidance
- Set up Translation Words for term definitions
- Enable Translation Questions for quality checking
Step 4: Handle Large File Sizes
- Implement efficient loading for large books
- Consider chapter-by-chapter loading for very large books
- Cache frequently accessed books locally
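Chapter-by-chapter loading can start from a simple split on \c markers. The sketch below assumes each chapter marker sits on its own line, as in the samples in this guide; `split_chapters` is a hypothetical name.

```python
import re

# A chapter marker on its own line, e.g. "\c 1"
CHAPTER = re.compile(r"^\\c[ \t]+(\d+)[ \t\r]*$", re.MULTILINE)


def split_chapters(usfm_text):
    """Return (header, {chapter_number: chapter_text}) for one book."""
    matches = list(CHAPTER.finditer(usfm_text))
    if not matches:
        return usfm_text, {}
    # Everything before the first \c is book-level header material.
    header = usfm_text[:matches[0].start()]
    chapters = {}
    for index, match in enumerate(matches):
        if index + 1 < len(matches):
            end = matches[index + 1].start()
        else:
            end = len(usfm_text)
        chapters[int(match.group(1))] = usfm_text[match.start():end]
    return header, chapters
```

An editor could parse the header once and then parse only the chapter the user has open, which keeps alignment processing off the critical path for the rest of the book.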
Differences Between Bible Repository Types
| Aspect | Original Language | Gateway Language |
|---|---|---|
| Subject | "Bible" | "Aligned Bible" |
| Alignment | None | Extensive word alignment |
| Dependencies | Minimal | Many (TN, TW, TA, etc.) |
| Complexity | Standard USFM | USFM + alignment markers |
| File Size | Standard | Larger due to alignment |
| Use Case | Source reference | Translation base |
| Target Users | Advanced translators | All translators |
Best Practices
1. File Access Strategy
- Use raw URLs for better performance with large files
- Cache frequently accessed books locally
- Handle file size variations appropriately
2. Alignment Processing
- For gateway languages, always process alignment markers
- Maintain connection to original language sources
- Enable word-level features only for aligned texts
3. Navigation and Display
- Provide clear testament and book organization
- Show translation approach (literal vs simplified)
- Indicate alignment availability to users
4. Performance Optimization
- Load books on demand rather than all at once
- Cache manifest data for quick book enumeration
- Use efficient USFM parsing for large files
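On-demand loading with a bounded cache can be sketched with the standard library. The example assumes the caller supplies its own `fetch` function (for instance one that downloads a raw URL); `make_book_loader` and the default cache size are illustrative choices.

```python
from functools import lru_cache


def make_book_loader(fetch, max_books=8):
    """Wrap a fetch(path) -> text function with a small in-memory cache.

    fetch:     caller-supplied function that retrieves one book file
    max_books: how many recently used books to keep in memory
    """
    @lru_cache(maxsize=max_books)
    def load_book(path):
        # Only runs on a cache miss; repeat requests reuse the stored text.
        return fetch(path)

    return load_book
```

Keeping the cache small matters because aligned books are large; evicting the least recently used book bounds memory while still making back-and-forth navigation fast.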
Common Issues and Solutions
Issue 1: Large File Sizes
Problem: Some books (like Psalms) can be very large with alignment data
Solution: Implement progressive loading or chapter-by-chapter access
Issue 2: Alignment Marker Complexity
Problem: Gateway language texts have complex alignment syntax
Solution: Use specialized USFM parsers that handle alignment markers
Issue 3: Versification Differences
Problem: Different Bible traditions use different verse numbering
Solution: Always check the versification field in the manifest and map verse references accordingly
This guide is based on analysis of Door43 Bible text repositories and should be used alongside the main Door43 API Developer Guide.