Help:Tools/Non-Unicode to Unicode Converter (old)/en

OR-TTsarala2 Unicode Converter is open-source software that converts the non-standard ASCII-equivalent codes used by Odia fonts into Odia Unicode characters. It was designed and published by Manoj Kumar Sahukar.

This tool is written in Game Maker Language (GML). This is a beta release. It can be used to convert text typed in the OR-TTsarala Odia font to Unicode.

  1. Go to http://sourceforge.net/p/odiaconverter and download the software.
  2. After downloading, move the OR-TTsaralaUnicodeConverter.exe file to the desktop and double-click its icon.
  3. A small application window will open. Click the "Load" icon (as shown in the picture). An HTML file (TransWikOdia-TWO.html) should now have been created on the desktop.
  4. Open any file typed in the OR-TTsarala font (see the picture), for example a PDF file. Select the text and copy it.
  5. Click the "Load" icon in the application.
  6. Find the TransWikOdia-TWO.html file on your desktop and open it in Google Chrome. When the page reloads, the copied text appears in Unicode Odia. Copy this text and use it anywhere.
  7. Use Backspace to remove any gap between a vowel (accent) sign and the Odia letter it belongs to (e.g. ବୁଝ ିବା, ଜାଣ ିବା). This happens if the source page does not contain the text in proper form.
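
The manual Backspace clean-up described in the steps could also be automated. The following is a minimal Python sketch and is not part of the tool itself; the helper name `remove_gap_before_signs` and the chosen code-point range are assumptions made for illustration. It deletes spaces or tabs that sit directly before an Odia dependent vowel sign.

```python
import re

# Spaces/tabs followed by an Odia dependent vowel sign (U+0B3E..U+0B4C).
# The range is an assumption for illustration; extend it if other
# combining marks show the same gap.
_GAP_BEFORE_SIGN = re.compile(r"[ \t]+(?=[\u0B3E-\u0B4C])")


def remove_gap_before_signs(text: str) -> str:
    """Hypothetical helper: drop the gap between a base letter and its sign."""
    return _GAP_BEFORE_SIGN.sub("", text)


if __name__ == "__main__":
    # The first example from the step above, with a stray space before the sign.
    print(remove_gap_before_signs("\u0B2C\u0B41\u0B1D \u0B3F\u0B2C\u0B3E"))
```

Ordinary spaces between words are left alone, because a word never begins with a dependent vowel sign.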

Technical description

Generally speaking, a font gives shape and size to a set of characters, numbers, or other symbols, each of which is a specific piece of data (a code) to the computer. Nothing forces a font to display a given code in its standard shape: the shape (typeface) can be made different irrespective of the data it is bound to.
That is what happens with these non-Unicode fonts. Because the ASCII standard has no specific codes for Odia characters, these fonts reuse the available ASCII codes to display Odia characters, which worked well in word-processing software and PDF documents. They served the purpose of printing documents, and some people still prefer them today because the document is smaller than a Unicode document with the same content. However, they limit the sharing of data on the Internet, where Unicode is the universally accepted standard.
Thus the simple logic behind this software is to take a run of data such as "bÊaÒ_hèe ... etc." and replace it with "ଭୁବନେଶ୍ଵର ... etc.". Each character of "bÊaÒ_hèe" corresponds to one of "ଭ ୁ ବ ନ େ ଶ୍ଵ ର", though in some cases not in the same sequence. For example, "େ" comes after "ନ" in Unicode, but in the non-Unicode font it is placed before it, because the sign is drawn in front of "ନ". There are similar issues with "ୈ", "ୋ", "ୌ", and "ରେଫ୍‍" (reph), and some complex cases arise with multiple "ମାତ୍ରା" (vowel signs). The data therefore has to be manipulated (for example, swapped forward or backward once or twice, or deleted) according to a certain logic performed by the software. After this manipulation, the updated version replaces the ASCII data with the corresponding Unicode characters according to the font. The interface is made user-friendly by taking input straight from the clipboard and placing the converted text directly back on the clipboard.
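
The replace-and-reorder idea can be sketched in a few lines of Python. This is only an illustration, not the tool's GML code: the glyph table is inferred from the single "bÊaÒ_hèe" example above and is not the real OR-TTsarala table, the names `GLYPH_TO_UNICODE`, `PRE_BASE_SIGNS`, and `convert` are hypothetical, and only the simplest swap (moving "େ" behind the next glyph) is handled.

```python
# Glyph table inferred from the one example in the text.
# It is NOT the real OR-TTsarala table, which is far larger.
GLYPH_TO_UNICODE = {
    "b": "\u0B2D",             # ଭ
    "\u00CA": "\u0B41",        # Ê -> ୁ
    "a": "\u0B2C",             # ବ
    "\u00D2": "\u0B47",        # Ò -> େ (typed before its consonant)
    "_": "\u0B28",             # ନ
    "h": "\u0B36",             # ଶ
    "\u00E8": "\u0B4D\u0B35",  # è -> ୍ଵ
    "e": "\u0B30",             # ର
}

# Signs the font places before the consonant but Unicode stores after it.
PRE_BASE_SIGNS = {"\u0B47"}


def convert(text: str) -> str:
    """Map each glyph to Unicode and move a pre-base sign behind the next glyph."""
    out = []
    pending = ""  # pre-base sign waiting for its consonant
    for ch in text:
        mapped = GLYPH_TO_UNICODE.get(ch, ch)  # unknown characters pass through
        if mapped in PRE_BASE_SIGNS:
            out.append(pending)  # flush an earlier sign that found no consonant
            pending = mapped
            continue
        out.append(mapped)
        if pending:
            out.append(pending)  # swap: the sign now follows its consonant
            pending = ""
    out.append(pending)
    return "".join(out)


if __name__ == "__main__":
    print(convert("b\u00CAa\u00D2_h\u00E8e"))  # input is "bÊaÒ_hèe"
```

A real converter would also need rules for conjuncts, two-part vowel signs, and reph, which this sketch deliberately leaves out.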
The previous version replaced the non-Unicode data with HTML entities and relied on a web browser to turn them into Unicode characters, which made the tool much less user-friendly.
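
To illustrate why the older approach needed a browser, here is a small Python sketch. It assumes decimal numeric character references (such as `&#2861;`); the exact entity form the old version emitted is not stated here, and `to_html_entities` is a hypothetical name. The standard-library function `html.unescape` stands in for the decoding a browser performs when it renders the page.

```python
import html


def to_html_entities(text: str) -> str:
    """Encode every non-ASCII character as a decimal numeric character reference."""
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


if __name__ == "__main__":
    encoded = to_html_entities("\u0B2D\u0B41")  # the syllable ଭୁ
    print(encoded)                 # ASCII-only text that a browser renders as Odia
    print(html.unescape(encoded))  # the same decoding done without a browser
```

Writing Unicode characters directly, as the updated version does, removes this extra round trip through an HTML file.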

Target pdf resources