I just happen to have a copy of the full 12-volume version of Sir James Frazer's "Golden Bough" in PDF format. As far as I know, this version is not available on the web (yet), and I'm currently trying to get Volume 1 converted to text for publishing. Anyone want to help?
To make it easier, a certain Italian friend created a script that automatically converts all the PDFs into PNG images and then runs the OCR software on each PNG to create a txt file. There are a few places where it doesn't work properly, but these are usually due to the scanning process (I didn't always get the book square, so the automatic process gets confused). All that is needed is proofreading the txt file against the PDF.
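For anyone curious what such a script might look like, here is a rough sketch of the PDF to PNG to txt steps. This is not my friend's actual script. It assumes the `pdftoppm` (from poppler-utils) and `tesseract` command-line tools are installed, and the function names, the 300 dpi setting and the file names are all made up for illustration.

```python
import subprocess
from pathlib import Path


def pdf_to_png_cmd(pdf: Path, out_dir: Path, dpi: int = 300) -> list[str]:
    """Build the pdftoppm command that renders every PDF page to a PNG."""
    prefix = out_dir / pdf.stem  # pages come out as <prefix>-<page number>.png
    return ["pdftoppm", "-png", "-r", str(dpi), str(pdf), str(prefix)]


def ocr_cmd(png: Path) -> list[str]:
    """Build the tesseract command; tesseract adds .txt to the output base."""
    return ["tesseract", str(png), str(png.with_suffix("")), "-l", "eng"]


def convert_volume(pdf: Path, out_dir: Path) -> Path:
    """Render one volume to PNGs, OCR each page, and join the page texts."""
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(pdf_to_png_cmd(pdf, out_dir), check=True)

    # pdftoppm zero-pads page numbers, so a plain sort keeps page order.
    pages = sorted(out_dir.glob(f"{pdf.stem}-*.png"))
    for png in pages:
        subprocess.run(ocr_cmd(png), check=True)

    combined = out_dir / f"{pdf.stem}.txt"
    with combined.open("w", encoding="utf-8") as out:
        for png in pages:
            out.write(png.with_suffix(".txt").read_text(encoding="utf-8"))
            out.write("\n")
    return combined
```

Calling something like `convert_volume(Path("volume1.pdf"), Path("out"))` would leave one PNG and one txt per page in `out`, plus a combined `volume1.txt` to proofread against the PDF. Keeping the per-page files around makes it easier to re-run just the pages where a crooked scan confused the OCR.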
Before anyone asks, it appears that the 12-volume version is out of copyright, though in the worst case it is in copyright for another 3 years. However, I think the 12 volumes would probably take me on the order of 3 years to complete anyway. Not only is there the proofreading of the main text, but there is also cross-referencing the not inconsiderable index and making the links to the various pages in the text. Then I have to decide how to deal with the enormous number of footnotes. As you can imagine, it is going to be a mammoth task.
I get the feeling that I am going to be busy.