Thursday, January 6, 2011

CellWriter - risujin.org

CellWriter

CellWriter is a grid-entry natural handwriting input panel. As you write characters into the cells, your writing is instantly recognized at the character level. When you press Enter on the panel, your input is sent to the currently focused application as if typed on the keyboard.

Overview

CellWriter was developed under a University of Minnesota UROP grant. If you're interested in the underlying algorithms, have a look at my thesis.

Writer-dependent

CellWriter relies solely on training samples of your characters for recognition. After a brief training period, the recognition engine is tailored to your unique way of writing. While this means that in general, other writers may not be able to use CellWriter with your training data, the recognition rate for your writing is very reliable.

Corrective preprocessing

CellWriter includes multiple levels of preprocessing algorithms that correct for input aberrations. Preprocessing smooths out digitizer noise and matches input to training symbols drawn with a different stroke order, direction, or number of strokes.

On-screen keyboard

For the times you simply need a specific keystroke, CellWriter features both a convenient mini-keyboard overlay and a full QWERTY on-screen keyboard mode.

Multilingual support

Want to write in your native language? CellWriter can be trained to generate any Unicode character. Right-to-left languages are also supported.

Manual

Most buttons and options in the setup window have informative tooltips. For additional information, hold the pointer over a control and read its tooltip.

Program dependencies

In order to run CellWriter you will need the following packages: libxtst6, libgtk2.0-0, and libgnome2-0. You may need to update these packages and their dependencies to the latest version before the Debian package will install. If you want to compile CellWriter, you will also need the development versions of these libraries: libxtst-dev, libgtk2.0-dev, and libgnome2-dev (or configure with --without-gnome).

Training characters

CellWriter does not come with any training samples. Before you can use it, you must train CellWriter with samples of your handwriting. Press Train on the main window to enter training mode. Entering training mode will not clear any input you have entered.

Draw each character in its cell. When you have finished a character, move the pointer outside of the cell to finish that sample. The more samples you train this way, the darker the character will appear until it is black and fully trained. Characters that appear inactive (green by default) do not yet have any training samples.

If you wish to train a different Unicode block, select it from the Combo Box of enabled blocks. If the desired block is not enabled, go to the Recognition tab in the setup window, find it in the list of blocks and make sure it is checked.

Inputting and editing text

To input text, draw in the cells from left to right. Any cells you skip over will be automatically turned into spaces. Inactive cells are not used when sending keystrokes. When you are finished, press Enter to send your input to the currently focused program. You can clear your input with Clear or use the mini-keyboard buttons to edit your text in the currently focused application.

Context menu

Many functions can be accessed by right-clicking on a cell and bringing up the context menu. However, some Tablet PCs or PDAs may not have a pen button or any other convenient way to right-click. The alternative gesture to bring up the context menu is the hold-click. Press with the pen without moving for one second and the context menu will show up. If you start drawing ink, you have moved the pen too far.

Erasing characters

There are several ways to delete a character. If your pen has an eraser end, you can simply press on a cell with it to clear the cell. If you are using a mouse, middle-click will clear a cell. Otherwise, you can delete through the context menu or use a cross-out gesture. To cross out one cell, simply scribble an unrecognizable character inside and the recognition engine will reject it, clearing the cell. To cross out multiple cells, start drawing in the first cell and drag the pointer across the cells you want to erase. The pointer will be in eraser mode as long as the pen is pressed down. Note that scribbling out a single cell will not work in training mode!

Inserting a space

To insert a space, point the mouse cursor at the insertion hotspot at either the bottom or the top of the dividing line between cells. When you are pointing at the hotspot, arrows will appear at the top and bottom of the dividing line; click to insert a space.

Correcting recognized text

No recognition system can read your mind. If a symbol is drawn sloppily or otherwise varies from previously trained samples, it may not be recognized correctly. Input characters that have been recognized with a low degree of confidence will appear highlighted. You may either redraw the character or open the context menu for that cell and select the correct character from the list of top choices. If you have not disabled training on input (on by default), all training samples that rated higher than the correct choice will be deleted and the input will be entered as a new training sample.

Recognition results

If you want to know more about what the recognition engine is doing under the hood, start CellWriter from a console and it will print various detailed information to standard out. Here is a sample of what that output looks like:

Recognized -- 71/87 (81%) disqualified, 21ms (1ms/symbol), 37% strong
'k' ( 100 [30587], 100 [32722], 69 [21491], 33 [ 17]) 79% [000---012]
'K' ( 58 [29191], 38 [32672], 100 [26191], 0 [ 0]) 42% [012---000]
'M' ( 77 [29845], 4 [32645], -2 [10154], 0 [ 0]) 16% [000R--012]
'P' ( 15 [27757], -12 [32631], 52 [18831], 0 [ 0]) 8% [011--R001]
'd' ( 45 [28759], -8 [32634], -23 [ 6823], 0 [ 0]) 2% [000RR-102]

The top stat shows how many samples were disqualified before detailed recognition. Next is the total time of recognition in milliseconds. Strength is defined as the match strength of the first result minus the second. For each letter, the ratings of the four recognition engines are displayed in normalized and raw form (in brackets). From left to right are the preprocessor, average distance, average angle, and word context engines. After the engine ratings is the post-penalty strength. Lastly is the mapping transformation in brackets.
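The layout described above can be picked apart with a couple of regular expressions. The following is only a sketch based on the sample output shown here; the function names, dictionary keys, and patterns are illustrative assumptions and are not part of CellWriter, and the real output may differ between versions.

```python
import re

# Engine order, left to right, as described in the text above.
ENGINES = ("preprocessor", "average_distance", "average_angle", "word_context")

# Patterns inferred from the sample output only (an assumption, not a spec).
HEADER_RE = re.compile(
    r"Recognized\s+--\s+(\d+)/(\d+)\s+\((\d+)%\)\s+disqualified,\s+"
    r"(\d+)ms\s+\((\d+)ms/symbol\),\s+(-?\d+)%\s+strong"
)
RESULT_RE = re.compile(
    r"^'(?P<char>.)'\s*\((?P<ratings>.*)\)\s*(?P<strength>-?\d+)%\s*"
    r"\[(?P<mapping>[^\]]*)\]$"
)
PAIR_RE = re.compile(r"(-?\d+)\s*\[\s*(-?\d+)\s*\]")


def parse_header(line):
    """Parse the 'Recognized -- ...' summary line, or return None."""
    m = HEADER_RE.search(line)
    if m is None:
        return None
    disq, total, _pct, elapsed, _per_symbol, strong = map(int, m.groups())
    return {"disqualified": disq, "total": total,
            "elapsed_ms": elapsed, "strong": strong}


def parse_result_line(line):
    """Parse one per-character result line, or return None."""
    m = RESULT_RE.match(line.strip())
    if m is None:
        return None
    pairs = PAIR_RE.findall(m.group("ratings"))
    if len(pairs) != len(ENGINES):
        return None
    # Each engine has a normalized rating followed by the raw value in brackets.
    ratings = {name: {"normalized": int(norm), "raw": int(raw)}
               for name, (norm, raw) in zip(ENGINES, pairs)}
    return {"char": m.group("char"), "ratings": ratings,
            "strength": int(m.group("strength")),
            "mapping": m.group("mapping")}
```

Parsing the first two result lines of the sample and subtracting their post-penalty strengths gives 79 - 42 = 37, which agrees with the "37% strong" figure in the summary line.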

The mapping transformation describes how the preprocessor mapped a symbol with more strokes onto a symbol with fewer strokes. The first set of columns describes which stroke on the larger symbol was mapped to which stroke on the smaller symbol. The next set of columns indicates whether any stroke was reversed ('R'). If two or more strokes on the larger symbol were mapped to the same stroke on the smaller symbol, the column set on the end indicates the order in which the strokes were glued together.
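To make the three column sets concrete, here is a small sketch that splits a mapping string such as 000---012 into its parts. It assumes that the three column sets always have equal width and that each stroke index is a single digit, which holds for every line of the sample output but is not confirmed beyond it; the function name and keys are hypothetical.

```python
def split_mapping(mapping):
    """Split a mapping string into stroke map, reversal flags, and glue order.

    Assumption: three equal-width column sets, one character per stroke.
    """
    if not mapping or len(mapping) % 3 != 0:
        raise ValueError("expected three equal-width column sets")
    width = len(mapping) // 3
    stroke_map = mapping[:width]
    reversal = mapping[width:2 * width]
    glue = mapping[2 * width:]
    return {
        # Which stroke of the smaller symbol each larger-symbol stroke maps to.
        "stroke_map": [int(c) for c in stroke_map],
        # True where the stroke was reversed ('R'), False otherwise.
        "reversed": [c == "R" for c in reversal],
        # Order in which strokes sharing a target were glued together.
        "glue_order": [int(c) for c in glue],
    }
```

Under these assumptions, the mapping 000---012 from the 'k' line reads as: all three strokes mapped to stroke 0, none reversed, glued in the order 0, 1, 2.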

In the example above, a sample for the character 'k' was rated highest by the preprocessor, average distance, and word context engines, but only second-highest by the average angle engine. After penalties, it had a strength of 79%, 37% above 'K', and was constructed by gluing together the three input strokes in order without reversal.

Support

CellWriter is a very new program. I have tried very hard to track down and fix as many bugs as I can, but there is always more work to do. If you find a bug in the program or have a great new idea, please send me an email! If you are reporting a bug, please include the version number, relevant console output or screenshots, and whether you compiled from source or are using the Debian package.

Resetting your profile

If you are updating CellWriter and find that the program hangs or crashes on startup, or if you would like to reset all of the program settings and training samples for any other reason, delete the .cellwriter directory in your home directory. Note that this also discards all of your training samples.
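The reset can also be scripted. This is a minimal sketch, not a tool shipped with CellWriter: the function name and its optional argument are illustrative, and the deletion is permanent, so close CellWriter first and keep a copy of the directory if you might want your training samples back.

```python
import shutil
from pathlib import Path


def reset_profile(profile_dir=None):
    """Delete the CellWriter profile directory.

    Defaults to ~/.cellwriter as named in the text above. Returns True if a
    directory was removed and False if there was nothing to remove.
    """
    target = Path(profile_dir) if profile_dir is not None else Path.home() / ".cellwriter"
    if not target.is_dir():
        return False
    shutil.rmtree(target)  # removes settings and training samples for good
    return True
```

Calling reset_profile() with no argument targets the default location; passing a path is useful for trying the function out on a scratch directory first.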

Pen and cursor issues

There have been a number of issues reported that are caused by extended input events. If you can't draw in a cell or if the ink position does not match your cursor, try disabling extended input events in Setup. Note, however, that this will disable the pen eraser end.

There is a bug in the LinuxWacom driver that will screw up Xinput applications when the screen is rotated with xrandr and xsetwacom. Disabling extended input events will resolve this issue. You can also simply restart the affected applications.

Poor recognition rate

Almost all recognition problems with CellWriter arise from bad samples. If you find that certain characters are frequently recognized incorrectly, see if a bad sample is at fault. Open the character cell's context menu and select Show Ink. The red ink belongs to the matched training sample, and the black is your input. If the red ink does not match the recognized letter, the sample is bad. Go into training mode and reset all training samples for the bad symbol (through the context menu or by using the pen eraser on the cell) and re-train it.

By default, CellWriter trains on your input characters when you press Enter. If your writing becomes sloppy or you do not correct badly recognized characters, bad samples may be generated. If you have adequately trained CellWriter, you can disable training on input in the Recognition tab of the Setup window.

If you are not using CellWriter to input English text, the word context engine may be harming the recognition rate. English word context can be disabled under the Recognition tab of the Setup window.

Missing characters

If you cannot find a certain character in any of the Unicode blocks, it is possible that you do not have a font installed to support it. Please install all available fonts for your language to ensure that you get the best font rendering possible.

Unicode input problems

No keyboard has every Unicode character on it, so CellWriter must resort to a fairly complicated method to generate fake Unicode keystrokes. If you find that CellWriter does not properly send Unicode characters, or that characters are skipped, repeated, or in any way mangled on the way, please send me an email along with the full console print-out. As a workaround, try limiting the number of characters you input at a time, as a backlog can cause these kinds of problems.

Random key strokes

An issue that can happen with Unicode input and the on-screen keyboards is the generation of seemingly random key-strokes that attempt to eject the CD-ROM drive, bring up the GNOME Search window, or activate other hotkeys. CellWriter overwrites blank KeyCodes in order to rewire them to send specific characters that are otherwise unsupported on the keyboard. Some of these KeyCodes are not actually usable for this purpose although CellWriter has no way of detecting this. Any time CellWriter overwrites a KeyCode for use as a different character, a message will be printed to standard out:

Overwrote KeyCode 92 for Num_Lock
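If you save the console output to a file, messages of this form can be collected with a short script, which makes it easier to see which KeyCodes were rewired during a session. This is only a sketch: the pattern is inferred from the single sample message above, and the function name is hypothetical.

```python
import re

# Pattern inferred from the one sample message shown above (an assumption).
OVERWRITE_RE = re.compile(r"Overwrote KeyCode (\d+) for (\S+)")


def overwritten_keycodes(console_output):
    """Return (keycode, key name) pairs for every overwrite message found."""
    return [(int(code), name)
            for code, name in OVERWRITE_RE.findall(console_output)]
```

A KeyCode that shows up in this list shortly before a stray hotkey fires is a reasonable first suspect.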

CellWriter comes with a blacklist of known "bad" KeyCodes that are not used for generating key events. If you notice that CellWriter will consistently cause problems when it overwrites a specific KeyCode, you can add that KeyCode to the CellWriter blacklist:

  1. Make sure you close CellWriter first
  2. Open ~/.cellwriter/profile
  3. Find the line that says: bad_keycodes 108 111 115 ... 245 248
  4. Add the KeyCode number to the end of that list
  5. Save the profile and start CellWriter

If this procedure solves the problem, please report the new bad KeyCode to me via email.
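The edit in steps 3 and 4 can be sketched as a small text transformation. The function below works on the profile contents as a string and assumes only what the steps state: that the profile contains a line beginning with bad_keycodes followed by space-separated numbers. The function name is hypothetical and the rest of the profile format is not assumed.

```python
def add_bad_keycode(profile_text, keycode):
    """Return profile_text with keycode appended to the bad_keycodes line.

    Leaves every other line unchanged and does nothing if the keycode is
    already listed. Raises ValueError if no bad_keycodes line exists.
    """
    lines = profile_text.splitlines()
    for i, line in enumerate(lines):
        fields = line.split()
        if fields and fields[0] == "bad_keycodes":
            if str(keycode) not in fields[1:]:
                lines[i] = line.rstrip() + " " + str(keycode)
            break
    else:
        raise ValueError("no bad_keycodes line found")
    return "\n".join(lines) + "\n"
```

To apply it, close CellWriter, read ~/.cellwriter/profile into a string, pass it through add_bad_keycode, and write the result back before starting CellWriter again.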
