\documentclass[a4paper, oneside, 11pt]{report}
\usepackage{epsfig,pifont,float,multirow,amsmath,amssymb}
\newcommand{\mc}{\multicolumn{1}{c|}}
\newcommand{\mb}{\mathbf}
\newcommand{\mi}{\mathit}
\newcommand{\oa}{\overrightarrow}
\newcommand{\bs}{\boldsymbol}
\newcommand{\ra}{\rightarrow}
\newcommand{\la}{\leftarrow}
\usepackage{algorithm}
\usepackage{algorithmic}
\usepackage{amsmath}
\usepackage{mathtools}
\usepackage{natbib}

\usepackage{graphicx}
\graphicspath{ {./images/} }

\topmargin = 0pt
\voffset = -80pt
\oddsidemargin = 15pt
\textwidth = 425pt
\textheight = 750pt

\begin{document}

\begin{titlepage}
\begin{center}
\rule{12cm}{1mm} \\
\vspace{1cm}
{\large  CMP-6048A / CMP-7009A Advanced Programming Concepts and Techniques}
\vspace{7.5cm}
\\{\Large Project Report - 15 December 2021}
\vspace{1.5cm}
\\{\LARGE Esoteric Compiler / Interpreter Project}
\vspace{1.0cm}
\\{\Large Group members: \\ Eden Attenborough, Alfie Eagleton, Aiden Rushbrooke, Chris Sutcliffe}
\vspace{10.0cm}
\\{\large School of Computing Sciences, University of East Anglia}
\\ \rule{12cm}{0.5mm}
\\ \hspace{8.5cm} {\large Version 1.0}
\end{center}
\end{titlepage}


\setcounter{page}{1}
%\pagenumbering{roman}
%\newpage

\chapter{Introduction}
\label{chap:intro}
\section{Project Statement}
This report will detail work done on creating a compiler for an esoteric programming language, or Esolang. Both the language and the compiler will be designed and created as part of this project, with the final result being a working compiler for a Turing-complete language. The plan for the project is to first research other compiler implementations and existing esoteric languages to learn the basic structure and fundamentals. We will then take this knowledge and develop the compiler using advanced programming concepts, before running a series of designed experiments to test the compiler's capabilities and effectiveness.

\section{Technology}
To develop the compiler we plan on using the programming language Java for the majority of the implementation. We plan on either compiling directly into Java bytecode, or into a secondary language such as C.

\section{MoSCoW Analysis}
Before implementation we decided on a series of requirements, prioritised using the MoSCoW analysis method.

Must have:
\begin{itemize}
	\item A working compiler without any crashes or unexpected behaviour
	\item A Turing complete language with features including expressions, conditionals and loops
	\item Basic error handling and reporting
	\item Reading and executing single lines of code from command line
	\item Output to command line
	\item Basic numerical variable handling
\end{itemize}
Should have:
\begin{itemize}
	\item Reading and executing from source files
	\item Additional features including strings and logical expressions
	\item Improved error handling, providing useful information to the user
	\item Optimization steps during compilation and before runtime
\end{itemize}
Might have:
\begin{itemize}
	\item Further features such as functions and recursion
	\item Array handling
	\item Further string handling such as indexing
	\item Memory allocation and disposal
\end{itemize}
Won't have:
\begin{itemize}
	\item User interface
	\item Object oriented programming features
	\item IDE features
\end{itemize}

\section{Report structure}
This report will first cover our research into esoteric languages in Chapter 2. Then, we will describe the overall design and structure of a compiler based on our research in Chapter 3. Chapter 4 will focus on the design of our own esoteric language, including grammar tables. Chapter 5 will describe the implementation and documentation of the compiler, explaining how it was implemented using agile methods. Chapter 6 will show evidence of testing the compiler, with both unit tests and overall test harnesses for each section. Finally, Chapter 7 will be the conclusion, where we will compare the final result to our initial requirements, and outline any further work we would do given more time.

\chapter{Research of Esoteric Languages}

\section{Esoteric Languages (Esolangs)}
Esolangs are programming languages designed as jokes or proof-of-concept languages, rather than languages to be used for actual programming tasks. Typically they are created for fun rather than out of any serious desire to solve a particular programming problem, and so most such languages aim to be funny, to be as difficult to use as possible, or to have as few features as possible while still being Turing complete.
In other words, it could be said that they are created to explore the possibilities and boundaries of programming language design \citep{mateas2008weird}.
Some esoteric languages are also designed to adapt existing languages into simpler or more usable forms.

\section{Examples of Esoteric Languages}
There are many different examples of esoteric languages.
One such language is the Shakespeare Programming Language, in which code is written so that it reads like a play script; for example, \texttt{[Enter Juliet]}, where \texttt{Juliet} may be a variable name.
Another example is Befunge, one of the most popular and well-known esoteric languages, which is intentionally designed to be hard to compile.
Pyth is yet another esoteric language, designed to be similar to Python but with every instruction a single character. When run, Pyth code is first compiled into Python before being executed.

\chapter{Compiler Design and Methodology}
This chapter will discuss how compilers work, and the design behind them. The goal of a compiler is to convert the source code into a form which the CPU can execute. A compiler works by converting the entire source code in one go before executing it, unlike an interpreter, which executes one line at a time. However, before the code can be executed, a number of steps have to be taken.
\section{The Lexer (Lexical Analysis)}
The first step for a compiler is to convert the initial source code into groups of characters which the computer can understand, generally known as tokens. This process is also known as tokenisation, particularly in the field of natural language processing (NLP). When implementing a programming language, the task of lexical analysis or tokenisation is performed by a lexer or scanner. The lexer looks through each character and creates equivalent tokens based on the character values. For example, the character '+' may be converted into the token "plus", which in a programming language would likely represent addition. Some tokens may also be multiple characters long, such as strings, identifiers and keywords like "if" or "int". This stage can also detect some syntax errors, namely any unrecognised characters. The end result of lexical analysis is a series of tokens representing the original source code.
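The character-by-character loop described above can be sketched as follows. This is a minimal illustration in Java under our own assumptions; the class and token names are hypothetical stand-ins, not our actual scanner:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal lexer sketch: maps characters to token names.
// Token names (NUMBER, PLUS, EOF) are illustrative only.
class LexerSketch {
    static List<String> scan(String source) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < source.length()) {
            char c = source.charAt(i);
            if (c == '+') { tokens.add("PLUS"); i++; }
            else if (Character.isDigit(c)) {
                // A number token may span multiple characters.
                int start = i;
                while (i < source.length() && Character.isDigit(source.charAt(i))) i++;
                tokens.add("NUMBER " + source.substring(start, i));
            }
            else if (Character.isWhitespace(c)) { i++; }
            else throw new RuntimeException("Unexpected character: " + c);
        }
        tokens.add("EOF");
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scan("1+12"));  // [NUMBER 1, PLUS, NUMBER 12, EOF]
    }
}
```

Note how the unrecognised-character case is exactly where the syntax errors mentioned above are detected.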
\section{The Parser}
The second stage of the compiler is to work out how the tokens link together to form a program. The parser takes the list of tokens produced by the lexer and compares it to the language grammar. This allows it to understand how the tokens are linked, and it can then use this information to create an abstract syntax tree (AST), which represents the code in a logical fashion. The parser can also detect any syntax errors in the initial source code when comparing it against the language grammar.

The specific parser we have chosen is a recursive descent parser, a top-down parser where each rule in the grammar maps to a specific function in the parser implementation. A recursive descent parser works by starting at the first rule in the grammar, then working downward, finding each statement or expression. We chose this parser design as it is simple to implement and gives much more control over how the parser functions and reports errors than parsers created through a parser generator and grammar table. Many compiler implementations also use recursive descent parsers, such as GCC \cite{GCCParser} and Clang.
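The rule-to-function mapping can be illustrated with a small sketch, assuming simplified versions of the \texttt{term} and \texttt{factor} expression rules. The sketch evaluates as it parses for brevity (a real parser would build an AST), expects no whitespace, and all names are illustrative:

```java
// Recursive descent sketch: each grammar rule becomes one method.
// <term>   ::= <factor> (("+"|"-") <factor>)*
// <factor> ::= <number> (("*"|"/") <number>)*
class DescentSketch {
    private final String[] tokens;
    private int pos = 0;

    // Crude tokenisation: split the expression around +, -, * and /.
    DescentSketch(String expr) { this.tokens = expr.split("(?<=[-+*/])|(?=[-+*/])"); }

    int term() {
        int value = factor();
        while (pos < tokens.length && (tokens[pos].equals("+") || tokens[pos].equals("-"))) {
            String op = tokens[pos++];
            int rhs = factor();
            value = op.equals("+") ? value + rhs : value - rhs;
        }
        return value;
    }

    int factor() {
        int value = Integer.parseInt(tokens[pos++]);
        while (pos < tokens.length && (tokens[pos].equals("*") || tokens[pos].equals("/"))) {
            String op = tokens[pos++];
            int rhs = Integer.parseInt(tokens[pos++]);
            value = op.equals("*") ? value * rhs : value / rhs;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(new DescentSketch("5*3+1").term());  // 16
    }
}
```

Because \texttt{term} calls \texttt{factor}, operator precedence falls directly out of the rule ordering, which is one reason this parser style is easy to reason about.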
\section{Code Generation}
At this stage, the code is converted into a form which can be executed by the CPU, such as machine code. However, due to the complexity of converting directly to machine code, an alternative is commonly used: the compiler writes to an intermediate form designed to run on a virtual machine. This form is known as bytecode. Java is a well-known example of this method, where Java programs are compiled to run on the Java Virtual Machine.

Another method is to compile into an existing programming language; a compiler that does this is known as a source-to-source compiler. Here, instead of writing machine code or bytecode, the compiler creates a string of valid source code for the target language, based on the original source code given to the main compiler. This can then be executed using pre-existing compilers. One common use of source-to-source compilers is for web browsers, where many languages compile to JavaScript.

\chapter{Our Esoteric Language}
\section{EsotericFortran}
For our esoteric language we decided to adapt an existing language into a modern form. Specifically, we chose the language Fortran, as it contained all the features we were interested in implementing while also being sufficiently different from most currently popular languages.

Of course, we have decided to make some changes from Fortran. Most importantly, we decided to reserve keywords in our language, unlike Fortran, as we thought that allowing variables to share names with keywords adds unnecessary confusion. We have also made other, smaller changes from Fortran, either due to time constraints or to simplify the language into a more modern form.
\section{Language Grammar}
Before we can begin implementation, we need to design the grammar of our language. The grammar not only gives us an understanding of how our language will work and allows us to spot any issues, but our chosen parser, a recursive descent parser, also relies heavily on the language grammar for its implementation.

To define the language we are using formal grammars. Formal grammars are a way to represent the language as a set of rules \citep{davie1982recursive}. In order to fully represent the language we are implementing we have used a context-free grammar, a type 2 grammar in the Chomsky hierarchy. Context-free grammars are a type of formal grammar in the form
A $\rightarrow$ t
where A is a nonterminal, and t a series of terminals and nonterminals. Nonterminals are references to other rules in the language, and a terminal is an end point in the language. They are context-free as they do not look at the context of the nonterminal.

In order to define the language in a readable way, we have used a slightly adjusted form of the Backus-Naur form (BNF).
\begin{verbatim}<program> ::= <statement>* 
\end{verbatim}
This example means that a program is defined as some number of statements. Here, we have used the '*' symbol to mean one or more and surrounded nonterminals with '<' and '>'. Terminals, such as keywords and the literals defining identifiers or numbers, are surrounded with '"'. If a nonterminal has multiple possible rules we have separated them using '|', meaning "or".
\newpage
\subsection{Statements Grammar}
\begin{table}[!htb]
	\begin{center}
		\begin{tabular}{|l|r|}
			\hline
			\textbf{Abbreviation} & \textbf{Term}\\
			\hline
			$<$program$>$ ::= & $<$function-call$>$* \\
			\hline
			$<$function-call$>$ ::= & $<$main-function$> |$ \\
								   & $<$function$> |$ \\
								   & $<$subroutine$>$\\
			\hline
			$<$main-function$>$ ::= & "program" $<$identifier$>$ $<$block$>$ "program "$<$identifier$>$\\
			\hline
			$<$function$>$::=&$<$type$>$$<$identifier$>$"("$<$expression$>$","*")"$<$block$>$\\
			\hline
			$<$subroutine$>$::=&$<$identifier$>$"("$<$expression$>$","*")"$<$block$>$\\
			\hline
			$<$block$>$::=& $<$statement$>$* "end" $|$\\
						 & $<$statement$>$* $<$return$>$ "end" $|$\\
			\hline
			$<$statement$>$ ::= & $<$declaration$>$ $|$ \\
			& $<$expr-Statement$>$ $|$\\
			& $<$print-Statement$>$ $|$ \\
			& $<$if-Statement$>$ $|$\\
			& $<$do-Statement$>$ $|$\\
			& $<$do-while-Statement$>$ \\
			\hline
			$<$declaration$>$ ::= & "character (len = "$<$number$>$")::"$<$identifier$>$ $|$ \\
			& "int::"$<$identifier$>$ $|$\\
			& "real::"$<$identifier$>$ \\
			& $<$type$>$ "dimension("$<$expression$>$*")"::$<$identifier$>$\\
			\hline
			$<$print-Statement$>$ ::= & "print *" (","$<$expression$>$)* \\
			\hline
			$<$if-Statement$>$ ::= & "if ("$<$expression$>$") then" $<$block$>$ \\
			& ("else" $<$block$>$)?\\
			& if"\\
			\hline
			$<$do-Statement$>$ ::= & "do" $<$identifier$>$ "=" $<$number$>$","$<$number$>$(","$<$number$>$)?\\
			&$<$block$>$ \\
			& "do"\\
			\hline
			$<$do-while-Statement$>$ ::= & "do while ("$<$expression$>$")"\\
			& $<$block$>$ \\
			& "do"\\
			\hline
			$<$return$>$ ::= & "return" $<$expression$>$\\
			\hline
			$<$expr-statement$>$ ::= & $<$expression$>$\\
			\hline
			\end{tabular}
		\label{tab:table1}
	\caption{Grammar table for EsotericFortran Statements}
\end{center}
\end{table}
\newpage
\subsection{Expression Grammar}
\begin{table}[!htb]
	\begin{center}
		\begin{tabular}{|l|r|}
			\hline
			\textbf{Abbreviation} & \textbf{Term}\\
			\hline
			\hline
			$<$expression$>$ ::= & $<$assignment$>$\\
			\hline
			$<$assignment$>$ ::= & $<$identifier$>$"="$<$equality$>$$|$\\
			& $<$equality$>$$|$\\
			\hline
			$<$equality$>$ ::= & $<$logical$>$("=="$|$"/=")$<$logical$>$$|$\\
			& $<$logical$>$$|$\\
			\hline
			$<$logical$>$ ::= & $<$comparison$>$(".and."$|$".or.")$<$comparison$>$$|$\\
			& $<$".not."$>$$<$comparison$>$$|$\\
			& $<$comparison$>$$|$\\
			\hline
			$<$comparison$>$ ::= & $<$term$>$("$>$"$|$"$<$"$|$"$>=$"$|$"$<=$")$<$term$>$$|$\\
			& $<$term$>$$|$\\
			\hline
			$<$term$>$ ::= & $<$factor$>$("+"$|$"-")$<$factor$>$$|$\\
			& $<$factor$>$$|$\\
			\hline
			$<$factor$>$ ::= & $<$exponent$>$("*"$|$"/")$<$exponent$>$$|$\\
			& $<$exponent$>$$|$\\
			\hline
			$<$exponent$>$ ::= & $<$primary$>$"**"$<$primary$>$$|$\\
			& $<$primary$>$$|$\\
			\hline
			$<$primary$>$ ::= & $<$number$>$$|$\\
			& $<$identifier$>$$|$\\
			& $<$string$>$$|$\\
			& $<$identifier$>$"("$<$expression$>$","*")"\\
			
			\hline
			$<$number$>$ ::= & $<$digit$>$*("."$<$digit$>$*)?\\
			\hline
			
			$<$identifier$>$ ::= & $<$alpha$>$*\\
			\hline
			$<$string$>$ ::= & """$<$alpha$>$*"""\\
			\hline
			
			$<$digit$>$::=&0\textbar1\textbar2\textbar3\textbar4\textbar5\textbar6\textbar7\textbar8\textbar9 \\
			\hline
			
			$<$alpha$>$::=&"a"..."z"$|$"A"..."Z"\\
			\hline
			$<$type$>$::=&"int"$|$"real"$|$"string"\\
			\hline
		\end{tabular}
		\label{tab:table2}
		\caption{Grammar table for EsotericFortran Expressions}
	\end{center}
\end{table}

\chapter{Implementation}\label{MethLab}

We decided it would be a good idea to document our application stage by stage while developing our Esolang, so that we may refer back to previous parts for reference. It will also help us track our progress throughout the project and ensure we stay on track.

\section{Language.java}
The very first thing required when it comes to translating any kind of code into execution is the ability to read it, either from the command line or from a file. In this case, we use \texttt{Language.java} to divide the source text line by line, essentially a while loop that reads from the file until EOF (end-of-file). Our program keeps dividing the source text, first into lines and then into characters, before building these back up into tokens, expressions and strings until it can recognise commands. \newline
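The EOF-terminated read loop can be sketched as below; \texttt{readSource} is a hypothetical stand-in for the entry point in \texttt{Language.java}, which would then feed each line onward to the scanner:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Sketch of the line-by-line read loop: BufferedReader.readLine()
// returns null at end-of-file, which terminates the while loop.
class LanguageSketch {
    static List<String> readSource(Reader source) {
        BufferedReader reader = new BufferedReader(source);
        List<String> lines = new ArrayList<>();
        try {
            String line;
            while ((line = reader.readLine()) != null) {  // loop until EOF
                lines.add(line);                          // hand each line to the scanner here
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(readSource(new StringReader("int::a\na=5\nprint a")));
    }
}
```

The same loop works whether the \texttt{Reader} wraps a file or standard input, which is what allows both file and command-line execution.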

\begin{figure}[h!]
	\begin{center}
	\includegraphics{Language.JPG}
	\caption{A series of steps to examine source text by line using \texttt{while}}
	\end{center}
\end{figure}

\section{TokenScanner.java \& TokenType.java}

This begins the lexical analysis of our program, examining each string character by character and identifying what type of symbol each one is, whether it is text or something further such as an operator. When the program reads a particular character, such as ``+'', it needs to know that it is, firstly, an operator and, secondly, the plus operator. There are examples beyond the addition operator: we have also prepared for ``-'' (subtraction), and implemented a lookahead function, since ``=='' is conveyed differently to ``=''. All of these are converted to tokens one by one before being sent to our parser algorithms.

\begin{figure}[h!]
	\begin{center}
		\includegraphics{TokenScanner.JPG}
		\caption{An example of using \texttt{switch} to identify operands, including looking ahead for multiple character tokens}
	\end{center}
\end{figure}

In the image above we have identified a means of using switch statements to recognise strings and turn them into fixed tokens. This is also why we use \texttt{TokenType.java}, which uses a fixed number of enums so the program knows exactly what it has read. In this case, for example, case `*' is equal to \texttt{TokenType.STAR}, and so on. As mentioned, some tokens are multiple characters long and so we have further cases to compensate for these. They usually consist of double equals ==, or comparison operators such as $<$= and $>$=.
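The switch-with-lookahead idea can be sketched as follows; the enum values mirror the spirit of \texttt{TokenType.java}, but this cut-down set and the method name are illustrative only:

```java
// Sketch of switch-based token recognition with one-character lookahead,
// distinguishing "==" from "=" and "<=" from "<".
class ScannerSketch {
    enum TokenType { STAR, EQUAL, EQUALITY, LESS, LESS_EQUAL }

    // Returns the type of the token starting at position i.
    static TokenType scanAt(String src, int i) {
        char c = src.charAt(i);
        // Lookahead: peek at the next character without consuming it.
        boolean nextIsEquals = i + 1 < src.length() && src.charAt(i + 1) == '=';
        switch (c) {
            case '*': return TokenType.STAR;
            case '=': return nextIsEquals ? TokenType.EQUALITY : TokenType.EQUAL;   // "==" vs "="
            case '<': return nextIsEquals ? TokenType.LESS_EQUAL : TokenType.LESS;  // "<=" vs "<"
            default:  throw new RuntimeException("Unexpected character: " + c);
        }
    }

    public static void main(String[] args) {
        System.out.println(scanAt("a==b", 1));  // EQUALITY
    }
}
```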

\section{Parser.java}

This class is the primary code of our program. It contains everything we need to start recognising the code given and actually \emph{do} something with it. It takes the list of tokens from our analyser and goes through our list of defined grammar rules, attempting to work out what each token translates to, essentially breaking down each line to find out what it is. We have many conditions in this class and each one is tested until one of them is met, denoting the result. Each possible grammar rule must match one of the conditions in the parser; otherwise the code does not make sense and will fail to compile. We have supported this via exceptions and default-case error messages. \newline

Conditions could include ``have they written \texttt{int} before this keyword'' or ``Is this a single colon or a double colon?'' or ``Is there a string after said colons?'' If all of these conditions are met, we can identify that a variable can be declared, for example.

\begin{figure}[h!]
	\begin{center}
		\includegraphics{ParserStatement.jpg}
		\caption{If the next line is a statement, this function identifies which statement it is using the enums from \texttt{TokenType.java}}
	\end{center}
\end{figure}

The above code represents our decision making when it comes to identifying statements.

\section{Expression.java \& Statement.java}
Programs are generally made up of expressions and statements. This is why we made the abstract classes Expression.java and Statement.java; different types of expressions and statements then inherit from these. An example of an expression class is a binary expression, made up of a left and a right section with an operator in the middle. Our statements are lists of blocks which each have a certain type, whether they are prints, conditionals, or even iterables such as whiles. An expression can also be a statement, so we also accounted for this with an ExpressionStatement class. These classes are used for identification when our translator writes to C.
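A minimal sketch of this hierarchy is shown below. The class and field names are illustrative, and a toy \texttt{evaluate} method stands in for whatever the subclasses actually expose (in our project the nodes are instead consumed by the translator):

```java
// Sketch of an AST class hierarchy: an abstract base class with
// concrete subclasses for each kind of expression node.
abstract class Expr {
    abstract int evaluate();
}

// A literal holds a single value.
class Literal extends Expr {
    final int value;
    Literal(int value) { this.value = value; }
    int evaluate() { return value; }
}

// A binary expression holds left and right subexpressions and a middle operator.
class Binary extends Expr {
    final Expr left, right;
    final char operator;
    Binary(Expr left, char operator, Expr right) {
        this.left = left; this.operator = operator; this.right = right;
    }
    int evaluate() {
        int l = left.evaluate(), r = right.evaluate();
        return operator == '+' ? l + r : l - r;
    }
}

class ExprDemo {
    public static void main(String[] args) {
        Expr tree = new Binary(new Literal(2), '+', new Literal(3));
        System.out.println(tree.evaluate());  // 5
    }
}
```

Because \texttt{Binary} holds \texttt{Expr} children rather than concrete types, expressions nest arbitrarily deep, which is what lets the parser build whole trees from these few classes.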

\section{Translator.java \& ExecuteC.java}
Our translator takes the list of statements previously analysed by our chain process, having gone through our lexer and parser, and converts them one by one into C code which can then be executed as normal to produce an output. We store these statements in an ArrayList and simply iterate over them using functions such as compileToC(), where each statement is evaluated and added to a text file, the output of which is eventually printed to the UI.
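The iterate-and-emit step can be sketched as follows. For brevity each "statement" here is already a translated C fragment, and \texttt{compileToC} merely wraps the list in a minimal C program; the real \texttt{Translator.java} would perform the per-statement translation itself:

```java
import java.util.List;

// Sketch of emitting a C source string from a list of translated statements.
class TranslatorSketch {
    static String compileToC(List<String> statements) {
        StringBuilder c = new StringBuilder("#include <stdio.h>\nint main(void) {\n");
        for (String stmt : statements) {            // iterate over the statement list
            c.append("    ").append(stmt).append("\n");
        }
        return c.append("    return 0;\n}\n").toString();
    }

    public static void main(String[] args) {
        System.out.print(compileToC(List.of("int a = 5;", "printf(\"%d\\n\", a);")));
    }
}
```

The resulting string would then be written to a .c file and handed to an existing C compiler, which is what makes this a source-to-source design.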

\section{IDE (written in Python)}
While not part of the original scope of the project, a simple integrated development environment (IDE) was also created in Python to showcase our software.
\begin{figure}[h!]
	\begin{center}
		\includegraphics[scale=0.22]{images/ide.png}
		\caption{Our implemented IDE software}
	\end{center}
\end{figure}

The IDE features three panes, which allow the user to write FORTRAN code, see the C code produced by the compiler, and view the output of the compiled code, all in one window.
The IDE was written using the Python tkinter library, and interfaces with the Java portion of the project through the command-line stdout interface.
The IDE has file reading and writing capability, can reload files from disk with the F3 key, and can compile and run FORTRAN code.
The IDE is cross-platform, so it can run on either Windows or Unix-based systems.



\section{Agile \& Project Management}
We decided to use the Agile method since it would lead not only to increased team communication, but also to better organisation when it came to our code. It also meant our development was flexible; we could choose what we wanted to work on at a given time with the use of sprints. \newline

During our allocated project time, we did four major sprints. Our initial priority was creating the main framework for the compiler of our esoteric language. We then handled variables and print statements, then implemented conditionals and loops, and finally tackled arrays and functions. Once we had tied up these loose ends, we could write a full Sieve of Eratosthenes algorithm to show the capability of the program. \\

\subsection{Sprint 1: Creating the framework for the compiler}
Focused on creating the framework for the compiler, and added simple expression handling.
\begin{itemize}
\item Language.java
\item Token.java
\item TokenScanner.java
\item TokenType.java
\item Expression.java
\item Parser.java
\end{itemize}

\subsection{Sprint 2: Variables and Print statements}
Implemented basic statements for variables and prints. Also implemented error detection and reporting.
\begin{itemize}
\item Parser.java
\item Statement.java
\end{itemize}
\subsection{Sprint 3: Conditionals and loops}
Added support for conditional statements and loops. In this sprint the classes to write to C and execute were also added.

\begin{itemize}
\item Parser.java
\item Environment.java
\item ExecuteC.java
\item Translator.java
\item Statement.java
\item Utils.java
\end{itemize}
\subsection{Sprint 4: Arrays \& functions}
Added functions, subroutines and arrays to the language, as well as additional utilities and a simple user interface.
\begin{itemize}
\item Parser.java
\item Statement.java
\item Expression.java
\item Utils.java

\end{itemize}

\chapter{Testing}

The testing for this project was done in two main parts.
Once each sprint was completed, tests on the features that had been implemented were run to ensure that the compiler performed as expected.
These tests can be seen in tables 6.1, 6.2 and 6.3.
In addition to this, test Fortran source files were created that made use of all implemented features, and the output was checked to establish that features were behaving correctly.

\section{Sprint testing}
The testing that occurred after each sprint was done in two main stages. 
Firstly, the lexical scanner was unit tested independently from the rest of the code, then once it was established that the scanner was working correctly, we tested the scanner and parser together. 
Due to the nature of the parser output, we utilised a simple tree-walk interpreter for the first two sprints, then used the full compiler for the later sprints.
The results of each test are shown in the tables below, giving the reason for the test, the input and output, and whether the test was successful.
Extensive testing was done for each sprint, ensuring that each stage successfully built upon the logic and code that was created in the sprint before it. 


\subsection{Sprint 1 testing}
\begin{table}[h]
	\caption[]{Sprint 1 Tests and their results}
	\begin{center}
		\begin{tabular}{|l|l|l|l|}\hline
			Reason & Input & Output & Success \\ \hline
			Basic expression scanning & 1+1 & \parbox{2.5cm}{NUMBER 1 \\ PLUS + \\ NUMBER 1 \\EOF} & True \\ \hline
			Decimal number scanning & 1.5 & \parbox{2.5cm}{NUMBER 1.5\\EOF} & True \\ \hline
			Multiple character token scanning & 4==6 & \parbox{3.5cm}{NUMBER 4\\EQUALITY ==\\NUMBER 6\\EOF} & True \\ \hline
			Invalid token character scanning & \# & \parbox{3cm}{Unexpected Character Error} & True \\ \hline
			Literal expression parsing & 5 & 5 & True \\ \hline
			Binary expression parsing & 1+1 & 2 & True \\ \hline
			Bracketed expression parsing & (2*3) & 6 & True \\ \hline
			Comparison parsing& 4$>$3 & true & True \\ \hline
			Invalid syntax checking& 5+(3+2 & Error: Expected ')' & True \\ \hline
			Invalid expression checking& 5++1 & Error: Syntax Error & True \\ \hline
			Full expression test& 5*(3+1)-9/3 & 17 & True \\ \hline
		\end{tabular}
		\label{sprint1Test}
	\end{center}
\end{table}

Testing in the first sprint was almost entirely limited to parsing mathematical expressions and error checking related to this.
Equalities were tested, as well as ensuring that syntax errors were appropriately caught and output to the user.


\newpage
\subsection{Sprint 2 testing}
\begin{table}[h]
	\caption[]{Sprint 2 Tests and their results}
	\begin{center}
		\begin{tabular}{|l|l|l|l|}\hline
			Reason & Input & Output & Success \\ \hline
			Keyword scanning & print 5 & \parbox{2.5cm}{PRINT print \\ NUMBER 5 \\ EOF} & True \\ \hline
			Identifier scanning & test & \parbox{3.5cm}{IDENTIFIER test\\EOF} & True \\ \hline
			Print Expression test & print 5 & \parbox{3cm}{5} & True \\ \hline
			Variable test & \parbox{3cm}{var::a\\a=5\\print a} & \parbox{3cm}{5} & True \\ \hline
			Undefined variable test & \parbox{3cm}{a=5\\print a} & Error: Undefined Variable & True \\ \hline
			Invalid syntax error & \parbox{3cm}{a=5\\prit 5} & Error: Syntax Error & True \\ \hline
		\end{tabular}
		\label{sprint2Test}
	\end{center}
\end{table}
The second sprint tested the implementation of various different functionalities such as keyword scanning, as well as variable assignment and printing functionality. 


\subsection{Sprint 3 testing}

\begin{table}[!htb]
	\caption[]{Sprint 3 Tests and their results}
	\begin{center}
		\begin{tabular}{|l|l|l|l|}\hline
			Reason & Input & Output & Success \\ \hline
			If statement scanning & \parbox{3.5cm}{if 5==5 then\\endif} & \parbox{4cm}{IF \\ NUMBER 5 \\ EQUALS == \\ NUMBER 5\\ THEN\\ ENDIF \\EOF} & True \\ \hline
			Print statement scanning & print*,5 endprint & \parbox{3.5cm}{PRINT\\STAR *\\COMMA ,\\NUMBER 5\\ENDPRINT\\EOF} & True \\ \hline
			String scanning test & "test" & \parbox{3cm}{STRING test} & True \\ \hline
			If statement parsing & \parbox{3cm}{int::a\\if 5==5 then\\a=3\\endif\\print*, a endprint} & \parbox{3cm}{3} & True \\ \hline
			String parsing & \parbox{3cm}{print*,"hello world" endprint} & \parbox{3cm}{hello world} & True \\ \hline
			Multiple print values & \parbox{3cm}{print*,5,"hello",6} & \parbox{3cm}{5hello6} & True \\ \hline
		\end{tabular}
		\label{sprint3Test}
	\end{center}
\end{table}
Testing during this sprint focused on more advanced program functionality, including conditional testing, and the concatenation of multiple strings for a print statement. 
These tests ensured that the logical and conditional parts of our compiler worked as expected. 
\newpage
\section{Fortran Source File Testing}
Once the compiler was fully implemented, testing was moved to custom harnesses that were written in Fortran. 
Each test file was run and the output carefully checked to ensure that the language and compiler were working as expected.
In addition, example source code files were written to demonstrate the capabilities of the implemented Fortran language. 
The use of actual source files in testing meant that compiler and program behaviour, as well as that of the implemented IDE, could be checked in tandem, thus ensuring that the software worked properly.
Logical testing using test harnesses was particularly thorough, ensuring that the parser was able to handle comparing integers, as well as real numbers. 
In addition, extensive logical testing for the AND and OR operators was carried out to ensure that there were no bugs from the position of particular variables being evaluated. 
In other words, the software was tested so that the order of variables in a logical test did not matter, for example A AND B is equivalent to B AND A (see appendix for examples of this).
Below are some examples of output testing harnesses, run through the IDE program.

\begin{figure}[!htb]
	\centering
	\includegraphics[scale=0.5]{images/test_harness_1.png}
	\caption{Mathematical testing}
	\label{test_harness_one}	
\end{figure}

\begin{figure}[!htb]
	\centering
	\includegraphics[scale=0.5]{images/test_harness_2.png}
	\caption{Logical testing}
	\label{test_harness_two}	
\end{figure}

As can be seen from the above screenshots, various different mathematical, conditional and logical tests were used to test the functionality of the implemented code. 
These tests, combined with example code, allowed our compiler system to be checked in a real world scenario, and thus solidify the testing done in earlier sprints.


\chapter{Discussion, conclusion and future work}
Overall we are happy with the end product produced during this project. We have created a working, capable compiler for our modified version of Fortran. From our initial MoSCoW analysis in Chapter 1 we have achieved all the "must have" requirements, as the language is Turing complete and can detect a wide variety of both syntax and runtime errors. From the "should have" section, we have functionality to read from files, and have all of the additional features. However, since we decided to compile to C, we excluded optimisation methods, as the C compiler performs these anyway. From the "might have" section we implemented the additional features described, such as functions, but decided against direct memory allocation as our language is still reasonably simple, so the memory handling from C suffices. Finally, even though we said the end product would not have a GUI, we decided to implement a simple interface to test the compiler and more clearly show the final product.

Future work would include expanding the language to cover further Fortran features. Most obviously, we would develop string handling further to include string indexing and slicing. We would also implement additional input and output features, including reading from files. Other features would include modules, pointers and some built-in functions. Finally, we would further improve the error handling, detecting more errors before the code is compiled to C and displaying additional error information.

\bibliographystyle{unsrt}
\bibliography{References}

\chapter*{Contributions}

State here the \% contribution to the project of each individual member of the group and describe in brief what each member has done (if this corresponds to particular sections in the report then please specify these).

\chapter*{Appendix A}

\begin{figure}[h!]
	\begin{center}
		\includegraphics[scale=0.5]{UMLsprint1.pdf}
		\caption{UML diagram for Sprint 1}
	\end{center}
\end{figure}
\begin{figure}[h!]
	\begin{center}
		\includegraphics[scale=0.35]{UMLsprint2.pdf}
		\caption{UML diagram for Sprint 2}
	\end{center}
\end{figure}
\begin{figure}[h!]
	\begin{center}
		\includegraphics[scale=0.35]{UMLsprint3.pdf}
		\caption{UML diagram for Sprint 3}
	\end{center}
\end{figure}
\begin{figure}[h!]
	\begin{center}
		\includegraphics[scale=0.35]{UMLsprint4.pdf}
		\caption{UML diagram for Sprint 4, equivalent to the final compiler design}
	\end{center}
\end{figure}


\begin{figure}[h!]
	\begin{center}
		\includegraphics[scale=0.55]{images/test_harness_code_1.png}
		\caption{Excerpt of test harness showing mathematical expression testing}
	\end{center}
\end{figure}

\begin{figure}[h!]
	\begin{center}
		\includegraphics[scale=0.55]{images/test_harness_code_2.png}
		\caption{Excerpt from test harness showing array and string testing}
	\end{center}
\end{figure}


\begin{figure}[h!]
	\begin{center}
		\includegraphics[scale=0.55]{images/test_harness_code_3.png}
		\caption{Excerpt of test harness showing conditional and logical testing}
	\end{center}
\end{figure}

\end{document}