Tyndale STEPBible Data development for machine analysis and computational linguistics
Abstract
The development of STEPBible.org by Tyndale House, Cambridge, which aimed to provide study tools for the disadvantaged world, has produced datasets that are useful for many other purposes. In particular, computational linguistics and other forms of machine analysis can benefit from the stringent and varied ways in which the data has been refined and presented. Much of this data resulted from a project, supported by ETEN, to automatically tag Bibles with the underlying Greek and Hebrew. A public repository is gradually being populated with the results of this work.