What do you do when you can’t find the answers? Do you turn to God? Do you turn to scripture for inspiration? Well, that’s almost exactly what a team of researchers from Dartmouth College seems to have done, according to a recent report by Science Daily. More accurately, the researchers have begun studying the Bible to find better ways of improving translations done by computers. This is primarily because the Bible has already been translated into numerous languages.
The researchers have turned to the Holy Book because they believe it contains “a large, previously untapped data set of aligned parallel text”, that is, many translations of the same passages. Each version of the Bible reportedly contains over 31,000 verses, which the researchers used to produce more than 1.5 million unique pairs of source and translated texts. These pairs can then be used as training data for machine learning systems.
“The English-language Bible comes in many different written styles, making it the perfect source text to work with for style translation,” commented Keith Carlson, a PhD student at Dartmouth. To make matters easier still, the Bible is already thoroughly indexed: it makes consistent use of book, chapter, and verse numbers, which reduces the potential for errors when passages are matched and translated automatically.
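To illustrate why that indexing helps, here is a minimal sketch of how verses from different versions could be paired by their (book, chapter, verse) key. This is not the researchers' code: the `aligned_pairs` function, the version names "plain" and "formal", and the sample sentences are all made up for illustration, and dropping identical texts is an assumption about what "unique pairs" might mean.

```python
from itertools import permutations


def aligned_pairs(versions):
    """Yield (source, target, key, source_text, target_text) tuples.

    `versions` maps a version name to a dict keyed by (book, chapter, verse).
    Only keys present in both versions are paired, so a missing verse simply
    produces no pair instead of a misaligned one.
    """
    for src, tgt in permutations(versions, 2):
        shared_keys = versions[src].keys() & versions[tgt].keys()
        for key in sorted(shared_keys):
            src_text = versions[src][key]
            tgt_text = versions[tgt][key]
            # Assumption: identical wording teaches nothing about style,
            # so such pairs are skipped.
            if src_text != tgt_text:
                yield src, tgt, key, src_text, tgt_text


# Hypothetical toy data, not real Bible text.
versions = {
    "plain": {
        ("Genesis", 1, 1): "God made the sky and the land at the start.",
        ("Genesis", 1, 2): "The land was empty.",
        ("Genesis", 1, 3): "Let there be light.",
    },
    "formal": {
        ("Genesis", 1, 1): "In the beginning the heavens and the earth were created.",
        ("Genesis", 1, 3): "Let there be light.",
    },
}

pairs = list(aligned_pairs(versions))
for src, tgt, key, src_text, tgt_text in pairs:
    print(f"{src} -> {tgt} {key}: {src_text!r} => {tgt_text!r}")
```

Because the key is shared across versions, no fuzzy sentence matching is needed. Each ordered pair gives one training example for rewriting text from the source style into the target style.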
The team from Dartmouth College used 34 versions of the Bible that varied in linguistic complexity and style, reports LiveMint. They ranged from easy to hard, including the “Bible in Basic English” and the “King James Version”. The texts were reportedly fed into two different systems: a statistical machine translation system named “Moses” and a commonly used neural network framework named “Seq2Seq”, a type of model loosely inspired by the human brain.