July 1, 2007, 02:18 PM
|
Cricket Legend
|
|
Join Date: June 20, 2002
Location: BanglaCricket.com
Posts: 6,069
|
|
Quote:
Originally Posted by Zunaid
And it makes fascinating reading. Thanks for keeping me captive most of the morning. Now, can anyone elucidate the comment about 'mussalmans' being mostly of the 'Shaikh' caste?
|
Did they use "mussalmans"? I think the word predominantly used in this text to denote Muslims is "Mohammedans".
What I get from these so far is that the East Bengal Muslims were mostly mass converts from the lowest Hindu castes. That is, most of our great-grandparents were basically low-caste Hindus.
However, after their conversion to Islam, the Bengali Muslims filled the vacuum left by the Hindu caste system by adhering to an Arab-like "tribal" caste system, or "bongsho". The "shaikh"s and the "syed"s got their appellations from Arabia, the "khan"s remind you of the Mongols, etc. Very identity hungry.
|
July 1, 2007, 04:02 PM
|
|
Moderator
|
|
Join Date: May 17, 2005
Location: Melbourne
Posts: 6,496
|
|
Just a suggestion, Arnab Bhai... after you're done putting up all the pages as images, see if you can convert them into Microsoft Document Imaging (TIF format), which (in my experience) has character recognition (at least for scanned documents). After that's done, you could perhaps compile it into a PDF. PDFs would be handier than reading off images... if quoting etc. is required, it's much easier to organise in Acrobat Professional.
|
July 1, 2007, 04:18 PM
|
Cricket Legend
|
|
Join Date: June 20, 2002
Location: BanglaCricket.com
Posts: 6,069
|
|
Quote:
Originally Posted by ammark
Just a suggestion, Arnab Bhai... after you're done putting up all the pages as images, see if you can convert them into Microsoft Document Imaging (TIF format), which (in my experience) has character recognition (at least for scanned documents). After that's done, you could perhaps compile it into a PDF. PDFs would be handier than reading off images... if quoting etc. is required, it's much easier to organise in Acrobat Professional.
|
Yeah, that's a little too much work. The formatting of paragraphs, subheadings and tables will be totally messed up. The originality of the doc will be lost. Linking to images is much easier. Think of this thread as a read-only thing. For now.
|
July 1, 2007, 04:23 PM
|
|
Moderator
|
|
Join Date: May 17, 2005
Location: Melbourne
Posts: 6,496
|
|
Quote:
Originally Posted by Arnab
Yeah, that's a little too much work. The formatting of paragraphs, subheadings and tables will be totally messed up. The originality of the doc will be lost. Linking to images is much easier. Think of this thread as a read-only thing. For now.
|
I might give it a shot later tonight then. I don't think it'll be very hard.
|
July 1, 2007, 04:29 PM
|
Cricket Legend
|
|
Join Date: June 20, 2002
Location: BanglaCricket.com
Posts: 6,069
|
|
Go for it. The OCR will not be 100% accurate, I should warn you. The DSAL website has unclean OCR versions, too.
|
July 1, 2007, 04:42 PM
|
|
Cricket Guru
|
|
Join Date: September 3, 2006
Location: Mississauga, Ontario
Favorite Player: Sakib - the real Tiger
Posts: 11,194
|
|
Quote:
Originally Posted by ammark
see if you can convert them into Microsoft Document Imaging (TIF format), which (in my experience) has character recognition (at least for scanned documents). After that's done, you could perhaps compile it into a PDF. PDFs would be handier than reading off images... if quoting etc. is required, it's much easier to organise in Acrobat Professional.
|
You can have images appear in PDFs too. The only purpose the TIF format would serve in this case is accessibility. Also, TIF files tend to be massive. And text quoting in Acrobat isn't error-free. It removes any formatting.
Anyway, I've compiled a Word version of all the images, so that you can download and read from it. I don't have Acrobat Professional on my home computer, so can't convert it before I get to work on Tuesday.
Arnab bhai, if you wish, here's the Word version.
Bangladesh_1911.doc [11.8 MB]
__________________
cricket is a PROCESS, not an EVENT or two. -- Sohel_NR
Fans need to stop DUI (Dreaming Under Influence)!
|
July 1, 2007, 04:47 PM
|
Cricket Legend
|
|
Join Date: June 20, 2002
Location: BanglaCricket.com
Posts: 6,069
|
|
Thanks. I have Acrobat and will make a PDF version after I hunt down all the BD-related entries. Maybe then we can add a link to the PDF in the first post of the thread. Will take time.
|
July 1, 2007, 04:52 PM
|
|
Cricket Guru
|
|
Join Date: September 3, 2006
Location: Mississauga, Ontario
Favorite Player: Sakib - the real Tiger
Posts: 11,194
|
|
Quote:
Originally Posted by Arnab
Thanks. I have Acrobat and will make a PDF version after I hunt down all the BD-related entries. Maybe then we can add a link to the PDF in the first post of the thread. Will take time.
|
No probbies.
Btw, I forgot to change the page setup after I was done. Can you change the left & right margins? Or rather, select everything (Ctrl+A) and make it all centered.
Also, the chapters could be separated with a chapter title page. Like "Bogra", "Dacca", etc.
|
July 1, 2007, 05:14 PM
|
|
Moderator
|
|
Join Date: May 17, 2005
Location: Melbourne
Posts: 6,496
|
|
Quote:
Originally Posted by Kabir
You can have images appear in PDFs too. The only purpose the TIF format would serve in this case is accessibility. Also, TIF files tend to be massive. And text quoting in Acrobat isn't error-free. It removes any formatting.
Anyway, I've compiled a Word version of all the images, so that you can download and read from it. I don't have Acrobat Professional on my home computer, so can't convert it before I get to work on Tuesday.
Arnab bhai, if you wish, here's the Word version.
Bangladesh_1911.doc [11.8 MB]
|
Nicely done. I'm not sure if Acrobat Professional has character and word recognition too. If it does, then it'll be great... the process will have one less step. But if not, I'll try and give it a shot with the TIFF imaging so that the word recognition feature shows up... and then PDF that as text.
|
July 1, 2007, 05:25 PM
|
|
Cricket Guru
|
|
Join Date: September 3, 2006
Location: Mississauga, Ontario
Favorite Player: Sakib - the real Tiger
Posts: 11,194
|
|
Quote:
Originally Posted by ammark
Nicely done. I'm not sure if Acrobat Professional has character and word recognition too. If it does, then it'll be great... the process will have one less step. But if not, I'll try and give it a shot with the TIFF imaging so that the word recognition feature shows up... and then PDF that as text.
|
Nope. A PDF will only have selectable text if you create it from a text document. If it's made from images, it won't do anything.
If you're at UofT, you can also go to RCAT. Print out all these pages and put them through the scanner there, and it will convert all the image contents into text equivalents. It also does foreign languages, such as Bangla. You gotta make a booking to use it though... but it's free.
If you're using TIF, and creating PDF with it, I'm not sure if PDF will enable text selection. I doubt it. But I may be wrong here.
|
July 1, 2007, 05:34 PM
|
|
Moderator
|
|
Join Date: May 17, 2005
Location: Melbourne
Posts: 6,496
|
|
Quote:
Originally Posted by Kabir
If you're at UofT, you can also go to RCAT. Print out all these pages and put them through the scanner there, and it will convert all the image contents into text equivalents. It also does foreign languages, such as Bangla. You gotta make a booking to use it though... but it's free.
|
Well, I do live next to Robarts... but the working hours are hideous in summer, and now it's a long weekend. I'll try and start in 20 min with the document imaging at home... and see how it goes.
|
July 1, 2007, 10:33 PM
|
|
Moderator
|
|
Join Date: May 17, 2005
Location: Melbourne
Posts: 6,496
|
|
I managed to do OCR in Microsoft Document Imaging... and then I converted that to PDF. It is possible to use Acrobat's OCR tool to render the contents of this PDF document to text; however, as Arnab Bhai said, there are considerable errors, and the font and texture all become distorted. http://www.megaupload.com/?d=3ODS01Y9
Otherwise, I have placed indents in Kabir's Word document and have made that into another PDF as well... which, however, cannot be transcribed by OCR. http://www.megaupload.com/?d=JY51KCQ5
|
All times are GMT -5.