\documentclass[a4paper]{scrartcl}
\usepackage{amssymb, amsmath} % needed for math
\usepackage[utf8]{inputenc} % input encoding, needed for umlauts
\usepackage[english]{babel} % English hyphenation and language-specific strings
\usepackage[T1]{fontenc} % needed for correct output of umlauts in the PDF
\usepackage[margin=2.5cm]{geometry} % layout
\usepackage{hyperref} % links in the text
\usepackage{color}
\usepackage{framed}
\usepackage{enumerate} % for advanced numbering of lists
\usepackage{csquotes}
\usepackage{ifxetex,ifluatex}
\usepackage{etoolbox}
\usepackage[svgnames]{xcolor}
\usepackage{tikz}
\usepackage{parskip}
\usepackage{cite}
\usepackage{fancyref}
\usepackage{mystyle}
\clubpenalty = 10000 % prevent orphans (first line of a paragraph alone at the bottom of a page)
\widowpenalty = 10000 % prevent widows (last line of a paragraph alone at the top of a page)
\hypersetup{
  pdfauthor = {Martin Thoma},
  pdfkeywords = {Bachelor proposal, LaTeX, handwriting recognition},
  pdftitle = {Proposal for a Bachelor of Science Thesis: Interactive on-line handwriting recognition of mathematical formulae}
}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{document}
\title{Proposal for a Bachelor of Science Thesis:\\Interactive on-line handwriting recognition of mathematical formulae}
\author{Martin Thoma}
\maketitle
\section{The problem background}
There are people who do not know how to write even
simple mathematical formulae with \LaTeX{}, such as
\[\pi\alpha=\sum_{n=-\infty}^\infty \frac{\sin^2 (c+n)\alpha}{(c+n)^2}=\int_{-\infty}^\infty \frac{\sin^2 (c+n)\alpha}{(c+n)^2}\, \text{d}n\]
or who need a lot of time to do so. Currently, there are several online
services, programs and apps that help with writing mathematical
formulae, but all programs I know of have serious disadvantages:
\begin{itemize}
    \item \href{http://detexify.kirelabs.org/classify.html}{detexify.kirelabs.org}
          recognizes \textbf{only single symbols}, not complete formulae.
    \item The formula editor of LibreOffice Writer 3.6
          (see \fref{fig:libre-office-3.6}) offers some
          guidance by grouping common operations while showing
          a WYSIWYG editor, but it has \textbf{no handwriting recognition}.
          Another drawback is that it is \textbf{not available
          as an online service}, so you have to install LibreOffice,
          which might not be possible on all devices.
    \item The \enquote{Daum Equation Editor} (see \fref{fig:daum-editor}) is available online
          and offers guidance through the creation of equations,
          but it does not offer handwriting recognition. Although
          it might be OpenSource, the \textbf{source code is difficult to
          find}, which makes it practically impossible to improve
          the editor yourself. It also relies on Adobe Flash,
          which is not available on many smartphones and tablet
          computers.
    \item Maple seems to offer handwritten symbol recognition (\href{http://www.maplesoft.com/products/maple/features/handwritten.aspx}{source}),
          but I was not able to test it because
          it is \textbf{not available for free}. In addition, you
          have to install additional software, it does not seem to be
          available for tablet computers, and it only recognizes
          single symbols.
    \item Wolfram Mathematica seems to be able to recognize complete
          formulae, at least simple ones (\href{http://reference.wolfram.com/mathematica/tutorial/HandwrittenMathRecognition.html}{source}),
          by using Microsoft's \href{http://windows.microsoft.com/en-ph/windows7/use-math-input-panel-to-write-and-correct-math-equations}{Math Input Panel},
          but this is neither OpenSource nor available as an
          online service. Additionally, it is not
          available for Linux systems, so I cannot test it.
\end{itemize}

A more comprehensive list can be found at \href{https://en.wikipedia.org/wiki/Formula_editor}{https://en.wikipedia.org/wiki/Formula\_editor}.
Some of the projects presented there
require the client to execute Java applets, which is a security
risk.
\begin{figure}[h]
    \centering
    \includegraphics*[width=5cm, keepaspectratio]{figures/libreoffice-writer.png}
    \caption{LibreOffice Writer 3.6 -- Formula Editor}
    \label{fig:libre-office-3.6}
\end{figure}

\begin{figure}[h]
    \centering
    \includegraphics*[width=15cm, keepaspectratio]{figures/daum-editor.png}
    \caption{Daum Equation Editor}
    \label{fig:daum-editor}
\end{figure}
\break
\section{The problem statement}
What I would like to have is an interactive on-line handwriting
recognition system that is available as a web service and makes
use of touchscreens. Additionally, it should be free of charge and
OpenSource, and its source code should be easy to find and well documented.
This means:
\begin{itemize}
    \item \textbf{Service}: The program can be accessed over the web, so
          the user only needs a modern browser.
          As a consequence, the software could be used with any
          device that has a touchscreen.
    \item \textbf{On-line handwriting recognition}: The service
          starts recognizing while the user is still entering a formula.
    \item \textbf{Interactive}: The service offers symbols and constructs
          to the user before the user starts writing. These suggestions
          might change depending on what the user has entered before.
    \item \textbf{OpenSource}: The software is released under any license in this list: \href{http://opensource.org/licenses}{http://opensource.org/licenses}
    \item \textbf{Easy to find}: Ideally, the project should have
          its own domain that contains the source code, the service
          and the documentation. But it might be enough to provide
          the email address of a developer at the top
          of the source code of the delivered HTML document.
\end{itemize}

The service should also use \enquote{Gamification} techniques
to encourage users to provide as much
meta information about their formulae as possible:
\begin{itemize}
    \item Which problem domain does the formula belong to, e.~g. \enquote{Euclidean geometry}, \enquote{analysis} or \enquote{calculus}?
    \item Does the formula itself have a name, e.~g. \enquote{Pythagorean theorem}, \enquote{Fibonacci numbers} or \enquote{geometric series}?
\end{itemize}

This information should be used to create a formula database.
\section{Significance}
For me as a Linux user, there is no software that I can test and that
offers on-line, interactive math handwriting recognition, although the
need for such software exists.

But there are more reasons why this bachelor's thesis matters:
Projects like \LaTeX{}, Linux, Apache or Firefox have shown that
OpenSource software can enrich the development in specific areas. The
\enquote{Browser Wars} might be the most famous result of an active
OpenSource community. Internet Explorer 6 had
a market share of over 80\% in 2003. Predecessors of Firefox and the Mozilla
Foundation already existed, but Firefox 1.0 was not released until
November 2004. After that, Firefox and other open browsers added many
features that Internet Explorer had to compete with, like tabbed browsing,
HTML4 standard conformance, support of the \texttt{<canvas>} tag and
speed of HTML rendering and JavaScript execution.\footnote{\href{http://www.evolutionoftheweb.com/}{www.evolutionoftheweb.com} offers a graphical overview, although support for standards like HTML4 or CSS~2 does not arrive with a single version but is rather an incremental process.} Some of these
topics are also interesting for research, such as many problems related
to layout and just-in-time (JIT) compilation. With OpenSource software
that makes it easy to find its source and offers good documentation,
researchers can simply try out their ideas without first having to
struggle to get access to the source code.

Additionally, such a project might give researchers more time to
concentrate on the tasks they really want to do rather than spending
hours learning \LaTeX{}.

One last reason why this thesis matters is the formula database that
gets created by the users. This database might be used in follow-up work,
e.~g. a formula spotter for presentations or a math detector for speech.
\section{Time schedule}
\begin{itemize}
    \item[70h] Literature research about on-line handwriting recognition
               techniques and Gamification.
    \item[5h]  Defining the browsers and devices that should be supported
               and the required client-side technologies like HTML5, CSS 3
               and ECMAScript (better known as JavaScript). The
               required input methods, like touchscreens and styluses,
               should also be mentioned.
    \item[20h] Writing use cases. This includes writing example
               formulae that the user should enter and the system should
               be able to recognize, as well as finding people with different
               levels of \LaTeX{} knowledge and from different fields who
               want to participate in user tests.
    \item[60h] Implementing the core of the application: Handwriting
               recognition of digits and symbols by using only
               HTML, CSS and JavaScript on the client side. This includes implementing
               a way for the user to enter new symbols and to correct the
               symbol that was suggested by the recognition system.
    \item[20h] Introduce testers who already know \LaTeX{} to the
               current system. At this point, the system only does
               symbol recognition. The testers should train it by
               entering symbols like $a$--$z$, $A$--$Z$, $0$--$9$, $\alpha$--$\omega$, $A$--$\Omega$, $\cdot$, $\circ$, \dots
    \item[10h] Get feedback from the users. This feedback will not be included
               in the thesis, but the resulting improvements will be documented.
    \item[60h] Finding structures and ways to enter them. Examples
               of structures that can be nested are sums:
               \begin{verbatim}\sum_{<some structure>}^{<another structure>} <a third structure>\end{verbatim}
               Implement the recognition of those structures.
    \item[30h] Observe \enquote{fresh} testers while they try to use
               the system.
    \item[70h] Improving the software to fix problems that were found
               in the user tests.
    \item[50h] Fix bugs, improve code quality and readability as well
               as documentation.
    \item[45h] Usability testing: Try hallway testing. The results
               of these tests will be documented and will be part of the
               bachelor's thesis. If possible, I would like
               to let the testers use their own devices.
    \item[10h] Mentioning open questions and ideas for how they could be
               analyzed with the service that was created.
\end{itemize}
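The schedule above mentions structures that can be nested. As a concrete
instance (an illustrative example chosen here, not one taken from the
planned use cases), the summand of a sum can itself contain another sum:
\begin{verbatim}
\sum_{i=1}^{n} \frac{1}{\sum_{j=1}^{i} j}
\end{verbatim}
which is typeset as
\[\sum_{i=1}^{n} \frac{1}{\sum_{j=1}^{i} j}\]

As a cross-check of the schedule, adding up the twelve estimates listed
above gives the total planned effort:
\[70 + 5 + 20 + 60 + 20 + 10 + 60 + 30 + 70 + 50 + 45 + 10 = 450\,\text{h}\]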

\section{Outline}
The time schedule describes the steps in which I would like to write the software,
but almost all of them also include writing the corresponding part of the bachelor's thesis document.
A first draft of the outline could look like this:

\begin{enumerate}
    \item Introduction
    \item Definitions
    \begin{enumerate}
        \item Hardware: What is available and what is the distribution?
        \item Software: What is available and what is the distribution?
        \item Support of standards like HTML, CSS, ECMAScript, Flash, Cookies, \dots
        \item Choice of hardware, software and standards that should be supported as well as the choice of libraries and the required server-side software
        \item Application to the domain of math recognition
    \end{enumerate}
    \item On-line handwriting techniques
    \begin{enumerate}
        \item Description of techniques in general
        \item Application to the domain of math recognition
    \end{enumerate}
    \item Gamification techniques
    \begin{enumerate}
        \item Description of techniques in general
        \item Application to the domain of math recognition in the web
    \end{enumerate}
    \item Software Project
    \begin{enumerate}
        \item Structure of the code
        \item Availability of documentation
        \item Availability of the service
    \end{enumerate}
    \item Summary
    \begin{enumerate}
        \item Future Work
    \end{enumerate}
\end{enumerate}
\break

\renewcommand\refname{Related Literature}
\nocite{*}
\bibliographystyle{itmalpha}
\bibliography{literatur}

This literature list only contains sources that seem relevant to me
at this point. As I proceed, I might find more useful sources for the different
topics, so I might add elements to this list, but also remove some.
Especially for Gamification, I might read documents from
\href{http://gamification-research.org/}{gamification-research.org}.
\end{document}