software-testing/project_task_sheets/phase_05/project_phase05_tasks/Solution_Phase05_MichaelChen.tex
2022-07-14 13:47:30 +02:00

380 lines
27 KiB
TeX

\documentclass[a4paper]{scrreprt}
\usepackage[left=4cm,bottom=3cm,top=3cm,right=4cm,nohead,nofoot]{geometry}
\usepackage{graphicx}
\usepackage{tabularx}
\usepackage{listings}
\usepackage{enumitem}
\usepackage{subcaption}
\usepackage[acronym]{glossaries}
\newacronym{bnf}{BNF}{Backus-Naur form}
\newacronym{fsm}{FSM}{finite state machine}
\newacronym{tsc}{TSC}{terminal symbol coverage}
\newacronym{pdc}{PDC}{production coverage}
\newacronym{dc}{DC}{derivation coverage}
\newacronym{moc}{MOC}{mutation operator coverage}
\newacronym{mpc}{MPC}{mutation production coverage}
\newacronym{wmc}{WMC}{weak mutant coverage}
\newacronym{smc}{SMC}{strong mutant coverage}
\usepackage{pgf}
\usepackage{tikz}
\usetikzlibrary{arrows,automata}
\usepackage{xparse}
\usepackage{multirow}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\setlength{\textfloatsep}{16pt}
\renewcommand{\labelenumi}{\alph{enumi})}
\renewcommand{\labelenumii}{\arabic{enumii}) }
\newcommand{\baseinfo}[5]{
\begin{center}
\begin{tabular}{p{15cm}r}
\vspace{-4.5pt}{ \Large \bfseries #1} & \multirow{2}{*}{} \\[0.4cm]
#2 & \\[0.5cm]
\end{tabular}
\end{center}
\vspace{-18pt}\hrule\vspace{6pt}
\begin{tabular}{ll}
\textbf{Name:} & #4\\
\textbf{Group:} & #5\\
\end{tabular}
\vspace{4pt}\hrule\vspace{2pt}
\footnotesize \textbf{Software Testing} \hfil - \hfil Summer 2022 \hfil - \hfil #3 \hfil - \hfil Sibylle Schupp / Sascha Lehmann \hfil \\
}
\newcounter{question}
\NewDocumentEnvironment{question}{m o}{%
\addtocounter{question}{1}%
\paragraph{\textcolor{red}{Task~\arabic{question}} - #1\hfill\IfNoValueTF{#2}{}{[#2 P]}}
\leavevmode\\%
}{%
\vskip 1em%
}
\NewDocumentEnvironment{answer}{}{%
\vspace{6pt}
\leavevmode\\
\textit{Answer:}\\[-0.25cm]
{\color{red}\rule{\textwidth}{0.4mm}}
}{%
\leavevmode\\
{\color{red}\rule{\textwidth}{0.4mm}}
}
\newcommand{\projectinfo}[5]{
\baseinfo{Project Phase #1 - Submission Sheet}{#2}{#3}{#4}{#5}
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\def\name{Michael Chen}
\def\group{Group 01 (fastjson)}
\begin{document}
\projectinfo{5}{Software Testing - Syntax Coverage\small}{\today}{\name}{\group}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% Task 1 %%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{question}{Answer basic questions on Syntax Coverage}[3]
\begin{enumerate}[topsep=0pt, leftmargin=*]
\item Define the following terms in your own words:
\begin{enumerate}
\item \textit{Production Rule}, \textit{Generator}, and \textit{Terminal}
\begin{answer}
A Grammar is a set of \textit{words} or, as called here, \textit{strings}, that it accepts. Grammars can be described in different ways such as set notation, regular expressions, or in \gls{bnf}. \Gls{bnf} is set of production rules, also called rewrite rules, that map a set of symbols to a single symbol which is the name of the production rule. Symbols can be either non-terminals which represent non-trivial sytactic constructs such as loops or declarations in programming, while terminal symbols represent tokens, such as integer or string literals in programming languages. A generator is a \gls{fsm} that, given a grammar in \gls{bnf}, can generate a word of the grammar by recursively applying productions on the starting symbol until the syntax tree's leafs are only terminal symbols.
\end{answer}
\item \textit{Mutant}, \textit{Ground String}, and \textit{Mutation Operator}
\begin{answer}
A ground string is a string that is accepted by a grammar. In other terms it is a string where we already know that a proof exists, that shows the string is in the grammar (ground truth). We can now change this string a little bit by applying specific rules, called mutation operators, that generate new strings that may or may not be accepted by the grammar, called mutants.
\end{answer}
\item \textit{RIP Model}
\begin{answer}
The RIP model declares three requirements that are fulfilled when a failure occurrs. If all three of these are true, only then a failure can be detected. Firstly, the location of the fault in the code must be reached (\textbf{Reachability}), then, the program's internal state must be corrupted, i.e. an error occurred (\textbf{Infection}) and, finally, the error must have propagated to the program's output where the failure can then be observed (\textbf{Propagation}).
\end{answer}
\end{enumerate}
\item Name and describe \textit{2 syntax-based coverage criteria}. Does one of these two criteria subsume the other? Explain why, or provide a counterexample.
\begin{answer}
Two criteria on mutation operators are \gls{moc} and \gls{mpc}. \Gls{moc} introduces test requirements for every mutation operator, where the test is applied to the mutant created by the operator. This is trivially subsumed by the \gls{mpc}, which in addition to that, requires that for every production in the grammar, which the operator is applicable to, a test requirement is added.
\end{answer}
\item What does it mean to \textit{"kill a mutant"}? Explain the concept with an own code example.
\begin{answer}
A test (strongly) kills a mutant if all three conditions of the RIP model are satisfied for it's test execution. Consider the following mutated \texttt{max} program:
\begin{lstlisting}[language=C]
int max(int a, int b) {
// return a > b ? a : b; // original
return a < b ? a : b; // mutant
}
\end{lstlisting}
A test that kills this mutant could be the following:
\begin{lstlisting}[language=C]
void kill_max_mutation() {
assert(max(1, 3) == 3);
}
\end{lstlisting}
\begin{enumerate}
\item This test satisfies the reachability requirement trivially.
\item The program is infected because the changed operator causes the program to select the wrong branch of the ternary expression.
\item And finally the error propagated to the function output, in this case also trivially because the incorrect value is directly returned.
\end{enumerate}
Because the RIP model is satisfied the mutation is successfully killed by the test.
\end{answer}
\item Name and describe the \textit{4 different classified types} of mutants in the context of mutant killing.
\begin{answer}
\begin{itemize}
\item \textbf{Dead mutants} have been successfully killed by a test case.
\item \textbf{Trivial mutants} are killed by any any test case in that the variation is so severe, that the program will almost certainly result in a failure.
\item \textbf{Stillborn mutants} cannot be killed by any test case because they are syntactically illegal, i.e. cannot be executed.
\item \textbf{Equivalent mutants} can also not be killed, because the mutation does not cause the program to behave differently.
\end{itemize}
\end{answer}
\end{enumerate}
\end{question}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% Task 2 %%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{question}{Apply Syntax Coverage criteria to a sample program}[5]
\begin{enumerate}[topsep=0pt, leftmargin=*]
\item Select \textit{one} mutation operator from Table~\ref{table:operators} (the operators are taken from \textit{Introduction to Software Testing} by Ammann and Offutt), and apply it \textit{once} to every occurrence of a target expression, statement or instruction inside the given \texttt{toString()} method (e.g., choose the \textit{SOR} operator, and for each \verb|<<|, \verb|>>|, or \verb|>>>| that occurs in the method, replace it with \textit{one} other shift operator). The operator has to be applicable to at least two parts of the code. For each mutation, document the line, original epression, and mutated expression.
\begin{answer}
Here are the applications of the mutation operator ROR. All the mutation rules you have provided in Table~\ref{table:operators} differ from the definitions in \textit{Introduction to Software Testing} by Ammann and Offutt. The textbook definition requires all operators to be replaced by all other operators (and \textit{trueOp} and \textit{falseOp}), whereas your formal definition only requires the \verb|>| to be replaced (see left side of the production). Even though your definition is not precisely correct i have listed the missing mutations to the textbook definition in the lower part of the table.
\begin{center}
\begin{tabular}{r|c|c}
Line & Ground String & Mutant \\ \hline \hline
32 & \verb|buf.length() > 2| & \verb|buf.length() != 2| \\
35 & \verb|nanos > 0| & \verb|nanos < 0| \\
44 & \verb|nanos > 0| & \verb|true| \\ \hline
18 & \verb|this == ZERO| & \verb|this != ZERO| \\
26 & \verb|hours != 0| & \verb|hours == 0| \\
29 & \verb|minutes != 0| & \verb|minutes < 0| \\
32 & \verb|secs == 0| & \verb|false| \\
32 & \verb|nanos == 0| & \verb|nanos > 0| \\
35 & \verb|secs < 0| & \verb|secs == 0| \\
36 & \verb|secs == -1| & \verb|secs <= -1| \\
46 & \verb|secs < 0| & \verb|secs >= 0| \\
51 & \verb|lastChar(buf) == '0'| & \verb|lastChar(buf) != '0'|
\end{tabular}
\end{center}
\end{answer}
\item Select \textit{one} other mutation operator, and apply all of its concrete mutations to \textit{one} target expression, statement or instruction inside the given \texttt{toString()} method (e.g., choose the \textit{SOR} operator, and replace one occurring \verb|<<| with the other shift operators \verb|>>| and \verb|>>>|). Document the line, original epression, and mutated expressions.
\begin{answer}
The predicate \verb|secs < 0 && nanos > 0| in line~35 results in the following mutations when applying COR:
\begin{enumerate}
\item \verb+secs < 0 || nanos > 0+
\item \verb+secs < 0 & nanos > 0+
\item \verb+secs < 0 | nanos > 0+
\item \verb+secs < 0 ^ nanos > 0+
\item \verb+false+
\item \verb+true+
\item \verb+secs < 0+
\item \verb+nanos > 0+
\end{enumerate}
\end{answer}
\item Pick one set of method mutations (either the mutations of task 2a or 2b), and provide a minimum set of tests that kill all introduced mutants \textit{strongly} (\textit{Strong Mutation Coverage (SMC)}). If that is not possible for a concrete mutant, explain why, and kill that mutant \textit{weakly} (\textit{Weak Mutation Coverage (WMC)}) instead. If neither of that is possible, explain it as well.
\begin{answer}
The following test case kills all mutations from the COR set from task~2b except the second mutation, because for boolean values the \verb|&| operator is similar to the \verb|&&| operator, thus we have an equivalent mutant.
\lstinputlisting[language=Java,firstline=5,lastline=16,belowskip=-0.8\baselineskip]{DurationTest.java}
\end{answer}
\item For the \textit{other} set of method mutations, provide a minimum set of tests that kill the mutants only \textit{weakly}, but \textbf{not} \textit{strongly}. If that is not possible for a concrete mutant, explain why, and kill the mutants \textit{strongly} instead. If that is not possible as well, explain it.
\begin{answer}
The mutant in line 32 \verb|buf.length() != 2| cannot be killed because the effect of the unequal operation in the context of buffer length is equal to the strictly greater than, because the buffer is always prefixed with \texttt{PT}. Of the remaining mutants, the following cannot be weakly killed (note, that all of those were optional as per your ROR definition):
\begin{center}
\begin{tabular}{r|c|l}
Line & Ground String & Mutant \\ \hline \hline
26 & \verb|hours != 0| & \verb|hours == 0| \\
29 & \verb|minutes != 0| & \verb|minutes < 0| \\
36 & \verb|secs == -1| & \verb|secs <= -1| \\
46 & \verb|secs < 0| & \verb|secs >= 0| \\
51 & \verb|lastChar(buf) == '0'| & \verb|lastChar(buf) != '0'|
\end{tabular}
\end{center}
Most of them are not weakly killable for the same reason: if the line is reached and infection is observed during execution, it will \textbf{always} cause the control flow to change into another branch and infect the string builder buffer. Except for the mutant in line~51, where in addition an infection will likely cause IndexOutOfRangeExceptions for the following decimal point insertion in line~54.
\lstinputlisting[language=Java,firstline=18,lastline=44,belowskip=-0.8\baselineskip]{DurationTest.java}
\end{answer}
\item For both applied operators, select one concrete mutation and describe the conditions that need to hold for the \textit{Reachability}, \textit{Infection}, and \textit{Propagation} of the induced change.
\begin{answer}
\begin{itemize}
\item \verb+secs < 0 || nanos > 0+ mutation in line~35. In order for the control flow to reach this line the guard in line~32 must be false, which means, either the seconds (not total seconds) or nanoseconds are not zero and still have to be appended, or no hours and minutes have already been appended so we have to append seconds, regardless if they are zero or not. To infect this line we have to create a wrong output in the mutation:
\begin{center}
\begin{tabular}{c|c||c|c|c}
\verb+secs < 0+ & \verb+nanos > 0+ & \verb+&&+ & \verb+||+ & Infected \\ \hline \hline
F & F & F & F & F \\
F & T & F & T & T \\
T & F & F & T & T \\
T & T & T & T & F
\end{tabular}
\end{center}
So, in order to infect the line we have have to make sure exactly one of the conditions from the above table is true. In this case the error will always propagate because in case of infection the buffer will be changed \textbf{or} returned immediately.
The conditions for the RIP model are as follows:
\begin{itemize}
\item \textbf{Reachability:} $\texttt{secs} \neq 0 \lor \texttt{nanos} \neq 0 \lor (\texttt{hours} = 0 \land \texttt{minutes} = 0)$
\item \textbf{Infection:} $\texttt{secs} < 0 \oplus \texttt{nanos} > 0$
\item \textbf{Propagation:} $\top$
\end{itemize}
\item \verb+nanos < 0+ in line~35. Reachability is similar to the above (here I am conforming to what we discussed in a previous exercise, where I was told to ignore boolean short circuiting on high-level language view and I will only consider predicate logic). Reachability is similar to the above example, because the source code line is the same.
\begin{center}
\begin{tabular}{c||c|c|c}
\texttt{nanos} & \verb+nanos < 0+ & \verb+nanos > 0+ & Infected \\ \hline \hline
0 & F & F & F \\
positive & F & T & T \\
negative & T & F & T
\end{tabular}
\end{center}
There are only three possible combinations of truth values for the infection matrix, because it is not possible for both to be true at the same time. So infection will always happen if the \texttt{nanos} are non-zero.
The only possible propagation method for this predicate is the branch in line~35 that is indirectly decided by the mutated statement. If the seconds are negative the branch is directly decided by the infection. If the seconds are non-negative the infection does not change the outcome of the branch predicate and will not propagate.
The conditions for the RIP model are as follows:
\begin{itemize}
\item \textbf{Reachability:} $\texttt{secs} \neq 0 \lor \texttt{nanos} \neq 0 \lor (\texttt{hours} = 0 \land \texttt{minutes} = 0)$
\item \textbf{Infection:} $\texttt{nanos} \neq 0$
\item \textbf{Propagation:} $\texttt{secs} < 0$
\end{itemize}
\end{itemize}
\end{answer}
\begin{table}[t]
\footnotesize
\hspace*{-1cm}\begin{tabular}{|c|c|l|}
\hline
\textbf{Abbr.} & \textbf{Mutation Operator Name} & \textbf{Mutation (\textit{a} is a constant or expression)} \\ \cline{1-3}
ABS & Absolute Value Insertion & $a \rightarrow abs(a), -abs(a), failOnZero(a)$ \\ \cline{1-3}
UOI & Unary Operator Insertion & $a \rightarrow \{!,-,+,\sim\}a$\\ \cline{1-3}
UOD & Unary Operator Deletion & $\{!,-,+,\sim\}a \rightarrow a$\\ \cline{1-3}
AOR & Arithmetic Operator Replacement & $a+b \rightarrow a \{-,*,/,**,\%\} b, a, b$\\ \cline{1-3}
ROR & Relational Operator Replacement & $a>b \rightarrow a \{>=,<,<=,==,!=\} b, true, false$\\ \cline{1-3}
COR & Conditional Operator Replacement & $a\&\&b \rightarrow a \{||,\&,|,\hat{}\} b, true, false, a, b$\\ \cline{1-3}
SOR & Shift Operator Replacement & $a<<b \rightarrow a \{>>,>>>\} b, a$\\ \cline{1-3}
LOR & Logical Operator Replacement & $a\&b \rightarrow a \{|,\hat{}\} b, a, b$\\ \cline{1-3}
ASR & Assignment Operator Replacement & $a+=b \rightarrow a \{-=,*=,/=,\%=,\&=,|=,\hat{}=,<<=,>>=,>>>=\} b$\\ \cline{1-3}
SVR & Scalar Variable Replacement & $a=b*c \rightarrow a=b*b,a=b*a,a=a*c,a=c*c,b=b*c,c=b*c$\\ \cline{1-3}
BSR & Bomb Statement Replacement & instruction $\rightarrow$ Bomb() (throws exception when reached)\\
\hline
\end{tabular}
\caption{Exemplary mutation operators with their introduced changes for an expression / instruction}
\label{table:operators}
\end{table}
\end{enumerate}
\end{question}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% Task 3 %%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{question}{Application to software project}[8]
\begin{enumerate}[topsep=0pt, leftmargin=*]
\item As a group, decide on \textbf{one test suite} (original + own tests) that you want to use for this task
\begin{answer}
The current head of the \texttt{develop} branch will be used by our group to test our project.
\end{answer}
\item In consultation with your group, pick a unique mutation tool for Java code (i.e., no two members of one group should use the same tool).
\begin{answer}
I picked the Major mutation tool for testing our project. Note that, because the mutation testing tool does not integrate well with Junit we have to create an additional test driver class that manually runs the Junit tests. Because of this manual effort I restrict the test to the \texttt{JSONScanner} class and mutation testing to its \texttt{checkTime} function that I have been developing tests for in the previous exercise.
\end{answer}
\item Depending on the capabilities of the selected mutation testing tool, determine the amount of \textit{killed mutants} for at least \textbf{3 different mutation operators} (handle the results individually as long as your tool returns the results for each individual operator, or supports the application of single selected operators).
\begin{answer}
For the method under test the major tool generated a mutated class with 6239 mutations, 112 of those in the \texttt{checkTime} function. For this method the tool managed to create 28 COR mutations, 11 LVR mutations, 63 ROR mutations and 10 STD mutations. I did not manage to analyze the mutations that were killed. This is probably due to incompatible build tools (Major uses Java 1.7 while fastjson uses Java 1.5). For this reason I took the mutations from the generated log (attached as file \texttt{mutants.txt}) and for all 4 applicable mutation operators in the method under test manually edited the Java file and ran the test suite. The result can be seen in the below table:
\begin{center}
\begin{tabular}{c|c|c|c|c}
Operator & Line & Original & Mutation & Killed \\ \hline \hline
ROR & 729 & \verb+h0 == '0'+ & \verb+h0 <= '0'+ & live \\ % ID: 2275
ROR & 729 & \verb+h0 == '0'+ & \verb+h0 >= '0'+ & killed \\ % ID: 2276
ROR & 729 & \verb+h0 == '0'+ & \verb+false+ & killed \\ % ID: 2277
ROR & 730 & \verb+h1 < '0'+ & \verb+h1 != '0'+ & killed \\ % ID: 2278
ROR & 730 & \verb+h1 < '0'+ & \verb+h1 <= '0'+ & live \\ % ID: 2279
COR & 730 & \verb+h1 < '0' || h1 > '9'+ & \verb+h1 < '0' != h1 > '9'+ & live \\ % ID: 2284
COR & 730 & \verb+h1 < '0' || h1 > '9'+ & \verb+h1 < '0'+ & killed \\ % ID: 2285
COR & 730 & \verb+h1 < '0' || h1 > '9'+ & \verb+h1 > '9'+ & killed \\ % ID: 2286
COR & 730 & \verb+h1 < '0' || h1 > '9'+ & \verb+true+ & killed \\ % ID: 2287
STD & 731 & \verb+return false;+ & \verb+<NO-OP>+ & killed \\ % ID: 2289
STD & 735 & \verb+return false;+ & \verb+<NO-OP>+ & live \\ % ID: 2304
STD & 739 & \verb+return false;+ & \verb+<NO-OP>+ & killed \\ % ID: 2319
LVR & 763 & \verb+false+ & \verb+true+ & killed \\ % ID: 2382
LVR & 766 & \verb+false+ & \verb+true+ & killed \\ % ID: 2384
LVR & 769 & \verb+true+ & \verb+false+ & killed % ID: 2386
\end{tabular}
\end{center}
\end{answer}
\item For each of the applied mutation operators, pick a distinct method of your project that was affected by that operator, and highlight / describe the instructions that are changed. Explain how the test suite does or does not satisfy the criteria of \textit{Reachability}, \textit{Infection}, and \textit{Propagation} for that part.
\begin{answer}
\begin{center}
\begin{tabular}{l|c|c|c|c|c}
Function & Operator & Line & Original & Mutation & Killed \\ \hline \hline
\texttt{scanISO8601DateIfMatch} & EVR & 442 & \verb#c0# & \verb#0# & live \\ % ID: 1075
\texttt{checkTime} & COR & 730 & \verb+h1 < '0' || h1 > '9'+ & \verb+h1 < '0'+ & killed \\ % ID: 2285
\texttt{subString} & AOR & 161 & \verb#offset + count# & \verb#offset % count# & killed \\ % ID: 195
\end{tabular}
\end{center}
For the EVR the test suite fails does not cover the line. If \verb+c0+ was non-zero, infection would be guaranteed, but since the mutation is not reachable by the test suite it is a live mutation.
The COR has been killed strongly, because the line of code was reached by my extended test suite from the last exercise, and infection has been observed, because I had added a test case that has \verb+h1 > '9'+ which is not caught by the mutated code. The infection propagates directly because the condition directly branches to the return statement.
The AOR in the \texttt{subString} function is also killed by three of my \texttt{inputTokenTest} methods from a previous exercise and also two of Mark's tests. This method is at the core of the scanner as it is used to stringify the buffer at the current location to pass it on to specific parser methods. It is thus easily reached by the test suite and will most certainly infect any valid inputs. The propagation is also rather trivially visible as the infected method will return an incorrect string length and thus wrong buffer contents.
\end{answer}
\item Pick one suitable method (restricted to int values) or class from your project, and derive an automaton expressing its behaviour. For the creation of your automaton, use the \textit{Uppaal} model checker environment.
\begin{answer}
For this exercise I selected a helper method from the \texttt{TypeUtils} class, namely the \texttt{num(char, char, char, char)} method that takes 4 digit characters and converts them to the corresponding 4 digit number. If the digits are invalid, i.e. non-digit characters, the output is $-1$.
\end{answer}
\item Formulate at least 2 formal requirements that your model / method needs to satisfy, create corresponding \textit{queries} in Uppaal, and execute them to assure that your base model satisfies them. Enable \texttt{Options $\rightarrow$ Diagnostic Trace $\rightarrow$ Some} to get a counterexample trace in case that a queried formula is not satisfied. This trace can be inspected in the \texttt{Simulator} tab.
\begin{answer}
\begin{enumerate}
\item \texttt{A[] not deadlock || NUM.End} assures that the model will never deadlock except for the end location, meaning that the function always returns. This property is easily satisfied as there are no loops or complex behaviours in the model.
\item \texttt{A[] NUM.End imply (NUM.c >= -1 \&\& NUM.c < 10000)} assures that the return value of the function is always a 4 digit number or negative one. This property is also satisfied by the model.
\item \texttt{A[] (NUM.End imply (NUM.c1 == 48 \&\& NUM.c1 == 49 \&\& NUM.c2 == 50 \&\& NUM.c3 == 51 imply NUM.c == 1234))} is a simple test case for the assertion \texttt{num('1', '2', '3', '4') == 1234}.
\item \texttt{A[] (NUM.End \&\& NUM\_MUT.End) imply (NUM.c == NUM\_MUT.c)} verifies that for all executions of the function and the mutated function, i.e. when the functions each have reached their end state, have equal return values. This indicates that the mutation does not influence the function behaviour. In our example this property does not hold, because the mutation removes a boundary check. UPPAAL returns as a counterexample trace the inputs \texttt{num('9', '1', '6', ':')} for which the ground string function correctly returns $-1$ while the mutation missing the boundary check for the last character outputs $9170$.
\end{enumerate}
\end{answer}
\item Apply \textit{one} mutation operator from Table~\ref{table:operators} to \textit{one} single expression in your model, and re-evaluate your queries. Explain why the verification results of the model changed or remained the same. If a counterexample is created, note the test vector that kills the mutant.
\begin{answer}
Here are the results for each of the properties from the previous subtask.
\begin{enumerate}
\item Since only one branch property changed and not the model structure the function still does not deadlock so the property still holds and cannot be killed by this property.
\item This property strongly killes the mutation because an edge case is now missing for the last character. The validator gives the counterexample trace input \texttt{num('9', '9', '9', ':')}, which now incorrectly outputs $10000$. The result should be $-1$ because \texttt{':'} isn't a digit character.
\item The last property still holds because the simple test case does not catch the mutated edge case.
\item This property already compares the function to its mutation so the task does not apply here.
\end{enumerate}
\end{answer}
\item Reroute one transition in your model (i.e., connect the edge to a different destination node), and re-evaluate your queries. Explain why the verification results of the model changed or remained the same. If a counterexample is created, note the test vector that kills the mutant.
\begin{answer}
With the branch mutation (see model file), most properties still hold except for the deadlock property. This is because the properties are written as \texttt{A[] NUM.End imply ...}, but since the mutated version never reaches the end this property is always true. The deadlock property however is violated because now the model can be in a deadlock state without reaching the end state (note, that for modelling a deadlock here we require another guard to avoid a livelock situation here).
\end{answer}
\end{enumerate}
\end{question}
\end{document}