Re: Machine readable texts of NT manuscripts



Mr. Finney,

Although I have never done this type of work, I have investigated a similar
problem.  I am doing most of the following from memory as my notes are in Canada
and I am in the UK.

 
> I am collating the uncial manuscripts of the Epistle to the Hebrews.
> For your information, these are P12, P13, P17, P46, P79, P89, 01, 02,
> 03, 04, 06, 015, 016, 018, 020, 025, 044, 048, 049, 075, 0121b, 0122,
> 0227, 0228, and 0252.
> I have three questions:
> 1) Do Machine readable texts of these manuscripts exist?

I have no real knowledge of whether these are in machine readable format,
but it is doubtful.  Getting the standard texts into this format takes
long enough.  If they are in such a form, it will be because someone has
done it for her/himself or for a research group.

> 2) Is it possible to use a Kurzweil (or the like) scanner to produce
> a machine readable text file from a reasonable quality reproduction
> of a manuscript? 

The answer is yes if you are working from a typescript copy of the
uncials or a fairly consistently printed MS.  When doing such work it is
absolutely necessary that you use a scanner or scanner software with
_Intelligent Character Recognition_ (ICR).  This allows you to 'teach' the
software how to recognize different letters and their variant forms.
Kurzweil machines have such software built in, but it may be possible to
purchase such software separately and use a much less expensive scanner.

>If so, what would the cost and error rate of such a
> process be?

Purchasing a Kurzweil is very expensive (tens of thousands of dollars).
Outsourcing (i.e., using a firm that does such things) will be expensive
too, not just because you will have to pay for the service, but because
you will have to spend much time proofing the work, as the firm will not
do this for you.  If you do it yourself with teachable software, and if
you can set the software to prompt you each time it cannot recognize a
letter, it is possible to have a very low error rate.
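
The "prompt on every unrecognized letter" workflow can be sketched roughly
as follows.  This is only an illustration in Python, not the behaviour of
any particular scanner package: the (character, confidence) pairs, the
0.90 threshold, and the name flag_uncertain are all assumptions made for
the example.

```python
from typing import List, Tuple


def flag_uncertain(
    results: List[Tuple[str, float]], threshold: float = 0.90
) -> Tuple[str, List[int]]:
    """Mark low-confidence characters for manual review.

    `results` is a hypothetical recognizer output: one (character,
    confidence) pair per letter, with confidence between 0 and 1.
    Characters below `threshold` are replaced by '?' and their positions
    are returned so that a proofreader can be prompted for each one.
    """
    text = []
    positions = []
    for index, (char, confidence) in enumerate(results):
        if confidence < threshold:
            text.append("?")
            positions.append(index)
        else:
            text.append(char)
    return "".join(text), positions


# Made-up sample data: the third letter was recognized poorly.
sample = [("P", 0.99), ("O", 0.97), ("L", 0.62), ("U", 0.95)]
draft, to_review = flag_uncertain(sample)
```

The point of returning the positions is that only the flagged letters need
a human decision, which is where most of the saving in proofing time would
come from.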

If you want to look into it, I believe that CCAT at the Univ. of Penn. does
this type of work.

> 3) Is there a word processing program that could handle variants in
> the following way:
> 
> THIS IS THE | NORMAL   | TEXT BUT SOMETIMES YOU GET THESE | VARIANT 
>             | USUAL    |                                  | OPTIONAL
>             | STANDARD |                                  | ALTERNATIVE
> 
> READINGS | WHICH SOMETIMES MAKE SENSE AND SOMETIMES DON'T. OMISSIONS, 
> PHRASES  |
> WORDING  |
> 
> ADDITIONS AND TRANSPOSITIONS ALSO OCCUR.
> 

The software COLLATE (from Oxford?) will probably do this for you,
exporting the result as an ASCII file.
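
For a plain ASCII file, the column layout in the question can also be
produced with a few lines of code.  The following Python sketch is not
how COLLATE works; the function name render_variants and the input format
(a list whose items are either a string of common text or a list of
alternative readings) are assumptions made for illustration.  It handles
a single block and does no line wrapping.

```python
from typing import List, Union

Segment = Union[str, List[str]]


def render_variants(segments: List[Segment]) -> str:
    """Lay out common text and variant readings in aligned columns.

    A string segment is text shared by all witnesses; a list segment
    holds the alternative readings at one point of variation.  The first
    reading sits on the top line and the others are stacked beneath it.
    """
    # Number of output lines = the largest number of readings anywhere.
    height = max((len(s) for s in segments if isinstance(s, list)), default=1)
    rows: List[List[str]] = [[] for _ in range(height)]
    for seg in segments:
        if isinstance(seg, str):
            # Common text appears once; lower lines get matching blanks.
            cells = [seg] + [" " * len(seg)] * (height - 1)
        else:
            # Pad every reading to the width of the longest one.
            width = max(len(reading) for reading in seg)
            readings = seg + [""] * (height - len(seg))
            cells = ["| " + r.ljust(width) + " |" for r in readings]
        for row, cell in zip(rows, cells):
            row.append(cell)
    return "\n".join(" ".join(row).rstrip() for row in rows)


example = render_variants(
    ["THIS IS THE", ["NORMAL", "USUAL", "STANDARD"], "TEXT"]
)
print(example)
```

Omissions could be represented in this scheme by an empty reading, but
transpositions would need a richer input format than the one assumed here.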


I hope this helps.

Glenn Wooden