\documentclass[a4paper]{scrartcl}
\usepackage{amssymb, amsmath} % needed for math
\usepackage[utf8]{inputenc} % needed for umlauts in the source
\usepackage[english]{babel} % English hyphenation and captions
\usepackage[T1]{fontenc} % needed for correct output of umlauts in the PDF
\usepackage[margin=2.5cm]{geometry} % layout
\usepackage{hyperref} % links in the text
\usepackage{color}
\usepackage{framed}
\usepackage{enumerate} % for advanced numbering of lists
\usepackage{csquotes}
\usepackage{ifxetex,ifluatex}
\usepackage{etoolbox}
\usepackage[svgnames]{xcolor}
\usepackage{tikz}
\usepackage{parskip}
\usepackage{cite}
\usepackage{mystyle}
\clubpenalty = 10000 % prevent orphans
\widowpenalty = 10000 % prevent widows

\hypersetup{
  pdfauthor   = {Martin Thoma},
  pdfkeywords = {Bachelor proposal: },
  pdftitle    = {Bachelor proposal}
}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\begin{document}
\title{Proposal for a Bachelor of Science Thesis:\\Recognition of Mathematical Formulae in the Context of Lecture Translation}
\author{Martin Thoma}
\maketitle
\section{The problem background}
The KIT Lecture Translator, CMUSphinx, Android voice typing and
many other speech recognition systems have proven that it is possible to
recognize speech. At the moment, however, there seems to be no
system that recognizes mathematics spoken in natural language.
For example, a term like
\[\sum_{n=1}^\infty \frac{1}{n} \rightarrow \infty \]
would naturally be spoken as

\begin{shadequote}[l]{}
The sum of one divided by n for n from one to infinity diverges to infinity.
\end{shadequote}

Today, speech recognition systems only
recognize the words spoken. They do not recognize that the words form a
mathematical term which could and should be expressed with symbols.

One way to extend an existing speech recognition system $A$ would be
the following steps:
\begin{enumerate}
\item $A$ recognizes speech and returns a text $T$. This text
has to contain annotations that indicate at which time
in the original recording the various parts of speech
were detected.
\item A math detector parses $T$ and returns the time intervals $I$
in which math was detected.
\item A math parser tries to parse the speech in $I$. This parser
can make use of a language model dedicated to math. It
returns weighted hypotheses about which terms might have
been spoken.
\item Finally, a program compares the hypotheses with math
in a formula database. Many formulas might already have been
written in \TeX{}, e.g. on Wikipedia, math.stackexchange.com
or in freely available \LaTeX{} / \TeX{} files.
\end{enumerate}
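
As a rough sketch only, the four steps can be read as a chain of mappings.
Only $A$, $T$ and $I$ are taken from the list above. The recording $s$,
the detector $\operatorname{detect}$, the parser $\operatorname{parse}$,
the hypotheses $h_k$ with weights $w_k$, the formula database $F$ and the
function $\operatorname{match}$ are names introduced here for illustration:
\begin{align*}
T &= A(s) && \text{(annotated transcript)}\\
I &= \operatorname{detect}(T) && \text{(time intervals that contain math)}\\
H &= \operatorname{parse}(s, I) = \{(h_k, w_k)\}_k && \text{(weighted hypotheses)}\\
\hat{h} &= \operatorname{match}(H, F) && \text{(comparison with the formula database)}
\end{align*}
How $\operatorname{match}$ should combine the weights $w_k$ with the database
entries is deliberately left open in this sketch.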
\break

\section{The problem statement}
The bachelor's thesis at KIT is worth 15 ECTS. It should be
completed within 4 months and take at most 450 hours.

The aim of this bachelor's thesis is to answer the following
questions:
\begin{itemize}
\item \textbf{Representation of math:} How can math be expressed
for speech recognition in a textual way?
In particular:
\begin{itemize}
\item What reasons are there to use \TeX{}, and what
reasons are there to use MathML?
\item Are there alternatives?
\end{itemize}
\item \textbf{Detection:} How can parts of speech that contain math
be detected?
\begin{itemize}
\item Which keywords indicate mathematics?
\item Is a keyword-density based approach sufficient?
\end{itemize}
\item \textbf{Evaluation of math recognition strength:}
\begin{itemize}
\item How can speech recognition systems be evaluated
for their strength in math recognition?
\item Is the \textbf{W}ord \textbf{E}rror \textbf{R}ate
a suitable measure of how well the recognition worked?
\end{itemize}
\item \textbf{Literature research:}
\begin{itemize}
\item Can \TeX{} be used as a grammar to recognize math speech?
\item Can MathML be used as a grammar to recognize math speech?
\end{itemize}
\end{itemize}
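
For reference on the evaluation question above: the word error rate is
commonly defined through the word-level edit distance between a reference
transcript and the recognizer output,
\[
\mathrm{WER} = \frac{N_{\mathrm{sub}} + N_{\mathrm{del}} + N_{\mathrm{ins}}}{N_{\mathrm{ref}}},
\]
where $N_{\mathrm{sub}}$, $N_{\mathrm{del}}$ and $N_{\mathrm{ins}}$ are the
numbers of substituted, deleted and inserted words and $N_{\mathrm{ref}}$ is
the number of words in the reference. As a made-up example, suppose the
reference is ``the sum of one divided by n squared'' (8 words) and a
recognizer returns ``the some of one divided by and squared''. Two words are
substituted and none are deleted or inserted, so
\[
\mathrm{WER} = \frac{2 + 0 + 0}{8} = 0.25 .
\]
In this invented case both errors fall on words that carry the mathematical
content, which hints at why the suitability of this measure is worth asking about.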

Follow-up tasks that will not be part of this bachelor's thesis
include:
\begin{itemize}
\item \textbf{Other languages}: This thesis will focus on math
recognition for the English language. Follow-up work might
try to deal with math independent of the language.
\item \textbf{Implementation}: The aim of this thesis is not
to create a working math recognition system.
\end{itemize}

\section{Significance}
This thesis will create a basis for follow-up work on the recognition
of speech that contains mathematical content. It will enable people to evaluate
various speech2math recognition ideas. It will also give an overview
of the current state of the art in math speech recognition and of the
questions that need to be tackled in the future.

\section{Time schedule}
\begin{itemize}
\item[10h] Research ways to represent math
\item[20h] Research how \TeX{} deals with math
\item[20h] Research how MathML deals with math
\item[50h] Record math lectures
\item[100h] Annotate math lectures; write the best
representation for the mathematical terms contained in
these lectures
\item[10h] Find keywords that indicate mathematical formulas
\item[5h] Test the keyword approach with the annotated lectures
\end{itemize}

\renewcommand\refname{Related Literature}
\nocite{*}
\bibliographystyle{itmalpha}
\bibliography{literatur}

\section{Hypotheses}
I think that MathML will be the best way to represent math, because
it was designed to do this. MathML~3.0, the most recent version,
has been a W3C Recommendation since October 2010.

\TeX{}, in contrast, is great at rendering mathematical equations,
but it grew over time. It existed even before the web was invented.

Another reason why I think MathML might be favorable for internal
representation is that it was created to be parsed and written by
machines. It is an XML standard, and as such XML tools
and libraries can be applied to parse it. \TeX{}, on the other hand, was created
to be written by humans.
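
To make the contrast concrete, here is a small fraction written in both
notations. This is a minimal sketch: the fraction is chosen only for
illustration, and only Presentation MathML is shown. MathML also defines
Content MathML, which encodes meaning rather than layout.
\begin{verbatim}
% TeX
\frac{1}{n^2}

<!-- Presentation MathML -->
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <mfrac>
    <mn>1</mn>
    <msup><mi>n</mi><mn>2</mn></msup>
  </mfrac>
</math>
\end{verbatim}
The \TeX{} form is short enough to type by hand, while the MathML form
is verbose but can be read with any XML parser.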

I'm pretty sure that it is hopeless to create a grammar for math
in its general form. But for some areas like Boolean logic, arithmetic
or analysis it might work pretty well.

\end{document}