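There are roughly two approaches to OCR in Android apps, and Tesseract is still the most widely used engine. Here is how I worked through the problem over the last couple of days; if you spot any flaws, corrections are welcome.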
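The two approaches are: use a Tesseract cloud service, which sends the image to a server and gets the analysis back; or analyze the image locally on the device, with no network connection. I will cover the second one here. For the first, I give a link at the end (it is a very good article).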
First, download the Tesseract native Android library. There are two sources, and either one works:
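a. svn checkout http://tesseract-android-tools.googlecode.com/svn/trunk/ tesseract-android-tools (if the checkout fails, just download it from the project page: http://code.google.com/p/tesseract-android-tools/)
b. Some people run into problems building the first one, for example the jpeg library cannot be found and the build fails. That is why this project exists: git clone git://github.com/rmtheis/tess-two.git (this package bundles a lot, but it saves you from downloading all those libraries separately)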
Starting with the first source: once the download succeeds, open the README file and make the following change:
git clone git://android.git.kernel.org/platform/external/jpeg.git libjpeg
Change it to:
git clone https://android.googlesource.com/platform/external/jpeg libjpeg
ndk-build // run this build from inside the jni folder
The second source has no README file, so the commands are as follows:
cd <project-directory>/tess-two
export TESSERACT_PATH=${PWD}/external/tesseract-3.01
export LEPTONICA_PATH=${PWD}/external/leptonica-1.68
export LIBJPEG_PATH=${PWD}/external/libjpeg
ndk-build
android update project --path .
ant release
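Either way, you end up with what you need: the .so files under libs and the wrapper classes for those .so files under src. These are what we develop against.
Then create a new project with the following code: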
import android.app.Activity;
import android.graphics.Bitmap;
import android.graphics.BitmapFactory;
import android.graphics.Matrix;
import android.media.ExifInterface;
import android.os.Bundle;
import android.os.Handler;
import android.util.Log;
import android.view.View;
import android.widget.ImageView;
import android.widget.TextView;

import com.googlecode.tesseract.android.TessBaseAPI;

import java.io.IOException;

public class MainActivity extends Activity {
    private static final String TAG = "MainActivity ...";
    private static final String TESSBASE_PATH = "/mnt/sdcard/tesseract/";
    private static final String DEFAULT_LANGUAGE = "eng";
    private static final String IMAGE_PATH = "/mnt/sdcard/test1.jpg";
    private static final String EXPECTED_FILE = TESSBASE_PATH + "tessdata/" + DEFAULT_LANGUAGE + ".traineddata";

    private TessBaseAPI service;

    // Created on the main thread, so posted Runnables also run on the main thread.
    private final Handler mHandler = new Handler();

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        testOcr();
    }

    public void testOcr() {
        mHandler.post(new Runnable() {
            @Override
            public void run() {
                Log.d(TAG, "begin>>>>>>>");
                ocr();
                //test();
            }
        });
    }

    public void test() {
        // First, make sure the eng.traineddata file exists.
        /*assertTrue("Make sure that you've copied " + DEFAULT_LANGUAGE + ".traineddata to " + EXPECTED_FILE, new File(EXPECTED_FILE).exists());*/
        final TessBaseAPI baseApi = new TessBaseAPI();
        baseApi.init(TESSBASE_PATH, DEFAULT_LANGUAGE);
        // digits is a .jpg image I found in one of the issues here.
        final Bitmap bmp = BitmapFactory.decodeResource(getResources(), R.drawable.test);
        ImageView img = (ImageView) findViewById(R.id.image);
        // I can see the ImageView, so the bitmap should work when passed to setImage().
        img.setImageBitmap(bmp);
        baseApi.setImage(bmp);
        // This statement is never reached. Furthermore, after adding more Log.v calls
        // inside setImage(), I found that the native function nativeSetImagePix is never called.
        Log.v("Kishore", "Kishore:Working");
        String outputText = baseApi.getUTF8Text();
        Log.v("Kishore", "Kishore:" + outputText);
        baseApi.end();
        bmp.recycle();
    }

    protected void ocr() {
        BitmapFactory.Options options = new BitmapFactory.Options();
        options.inSampleSize = 2;
        Bitmap bitmap = BitmapFactory.decodeFile(IMAGE_PATH, options);
        if (bitmap == null) {
            // decodeFile() returns null when the file is missing or cannot be decoded.
            Log.e(TAG, "Could not decode image: " + IMAGE_PATH);
            return;
        }
        try {
            ExifInterface exif = new ExifInterface(IMAGE_PATH);
            int exifOrientation = exif.getAttributeInt(ExifInterface.TAG_ORIENTATION, ExifInterface.ORIENTATION_NORMAL);
            Log.v(TAG, "Orient: " + exifOrientation);
            int rotate = 0;
            switch (exifOrientation) {
                case ExifInterface.ORIENTATION_ROTATE_90:
                    rotate = 90;
                    break;
                case ExifInterface.ORIENTATION_ROTATE_180:
                    rotate = 180;
                    break;
                case ExifInterface.ORIENTATION_ROTATE_270:
                    rotate = 270;
                    break;
            }
            Log.v(TAG, "Rotation: " + rotate);
            if (rotate != 0) {
                // Getting width & height of the given image.
                int w = bitmap.getWidth();
                int h = bitmap.getHeight();
                // Setting pre rotate
                Matrix mtx = new Matrix();
                mtx.preRotate(rotate);
                // Rotating Bitmap
                bitmap = Bitmap.createBitmap(bitmap, 0, 0, w, h, mtx, false);
            }
        } catch (IOException e) {
            Log.e(TAG, "Rotate or conversion failed: " + e.toString());
        }
        // Tesseract requires ARGB_8888, so convert whether or not the image was rotated.
        bitmap = bitmap.copy(Bitmap.Config.ARGB_8888, true);

        ImageView iv = (ImageView) findViewById(R.id.image);
        iv.setImageBitmap(bitmap);
        iv.setVisibility(View.VISIBLE);

        Log.v(TAG, "Before baseApi");
        TessBaseAPI baseApi = new TessBaseAPI();
        baseApi.setDebug(true);
        baseApi.init(TESSBASE_PATH, DEFAULT_LANGUAGE);
        baseApi.setImage(bitmap);
        String recognizedText = baseApi.getUTF8Text();
        baseApi.end();
        Log.v(TAG, "OCR Result: " + recognizedText);

        // clean up and show
        if (DEFAULT_LANGUAGE.equalsIgnoreCase("eng")) {
            recognizedText = recognizedText.replaceAll("[^a-zA-Z0-9]+", " ");
        }
        if (recognizedText.length() != 0) {
            ((TextView) findViewById(R.id.field)).setText(recognizedText.trim());
        }
    }
}
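The orientation handling inside ocr() is easier to check when it is pulled out on its own. The following is only a sketch in plain Java, so it runs without an Android device: the class name ExifRotation and the method degreesFor are made up for illustration, and the constants are copies of the standard EXIF orientation codes that the ExifInterface.ORIENTATION_* constants use.

```java
// Sketch only: ExifRotation and degreesFor are hypothetical names, not part of
// Android or Tesseract. The numeric values are the standard EXIF orientation codes.
final class ExifRotation {
    static final int ORIENTATION_NORMAL = 1;
    static final int ORIENTATION_ROTATE_180 = 3;
    static final int ORIENTATION_ROTATE_90 = 6;
    static final int ORIENTATION_ROTATE_270 = 8;

    private ExifRotation() {}

    /** Maps an EXIF orientation code to the rotation (in degrees) applied before OCR. */
    static int degreesFor(int exifOrientation) {
        switch (exifOrientation) {
            case ORIENTATION_ROTATE_90:
                return 90;
            case ORIENTATION_ROTATE_180:
                return 180;
            case ORIENTATION_ROTATE_270:
                return 270;
            default:
                // Normal, undefined, or mirrored orientations: leave the image as it is.
                return 0;
        }
    }

    public static void main(String[] args) {
        System.out.println(degreesFor(ORIENTATION_ROTATE_90)); // prints 90
    }
}
```

In the activity, the value returned by degreesFor would be passed to Matrix.preRotate, exactly as the rotate variable is used in ocr().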
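Just when you happily run the program, you find that things are not as simple as you imagined. The program needs a language pack; otherwise, what would it match the characters against? It makes sense when you think about it: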
adb shell mkdir /mnt/sdcard/tesseract
adb shell mkdir /mnt/sdcard/tesseract/tessdata
adb push eng.traineddata /mnt/sdcard/tesseract/tessdata/eng.traineddata
adb shell ls -l /mnt/sdcard/tesseract/tessdata
ls -l bin/tesseract-android-tools-test.apk
adb install -r -s bin/tesseract-android-tools-test.apk
adb shell am instrument -w -e class com.googlecode.tesseract.android.test.TessBaseAPITest com.googlecode.tesseract.android.test/android.test.InstrumentationTestRunner
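As for the eng.traineddata file used above, you can search for it; it is available online. (Sorry, I have not figured out how to upload attachments here.)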
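Since a missing language file is the first thing that goes wrong, it may help to check for it before calling init(). Below is a minimal sketch in plain Java, assuming the layout used in this post (a tessdata folder under the base path, holding a file named after the language with a .traineddata extension). TessDataCheck and its methods are hypothetical helper names, not part of the Tesseract wrapper.

```java
import java.io.File;

// Sketch only: TessDataCheck is a hypothetical helper, not a Tesseract API.
final class TessDataCheck {
    private TessDataCheck() {}

    /** Builds the expected location: <basePath>/tessdata/<language>.traineddata */
    static File trainedDataFile(String basePath, String language) {
        return new File(new File(basePath, "tessdata"), language + ".traineddata");
    }

    /** Returns true if the language data file exists as a regular file. */
    static boolean isLanguageInstalled(String basePath, String language) {
        return trainedDataFile(basePath, language).isFile();
    }

    public static void main(String[] args) {
        String base = "/mnt/sdcard/tesseract/";
        System.out.println("Expected file: " + trainedDataFile(base, "eng").getPath());
        System.out.println("Installed: " + isLanguageInstalled(base, "eng"));
    }
}
```

In the activity, one could call isLanguageInstalled(TESSBASE_PATH, DEFAULT_LANGUAGE) at the top of ocr() and log a clear message instead of proceeding when it returns false.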
The final result is shown in the screenshot (in fact, the recognized text was: 44m><9. It simply could not recognize that character).
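To see what the cleanup step at the end of ocr() does to output like this, here is the same regular expression as a standalone sketch. OcrTextCleanup and clean are names invented for this example; the regex and the "eng" check are taken from the code above.

```java
// Sketch only: OcrTextCleanup is a hypothetical helper wrapping the cleanup from ocr().
final class OcrTextCleanup {
    private OcrTextCleanup() {}

    /** For English, replaces each run of non-alphanumeric characters with one space, then trims. */
    static String clean(String recognizedText, String language) {
        String text = recognizedText;
        if ("eng".equalsIgnoreCase(language)) {
            text = text.replaceAll("[^a-zA-Z0-9]+", " ");
        }
        return text.trim();
    }

    public static void main(String[] args) {
        System.out.println(clean("44m><9", "eng")); // prints "44m 9"
    }
}
```

Note that this filter also removes punctuation and any non-ASCII letters, so it only makes sense for plain English alphanumeric text.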
References:
http://wolfpaulus.com/journal/android-and-ocr
http://labs.makemachine.net/2010/03/simple-android-photo-capture/