
\documentclass[a4paper]{scrreprt}
\usepackage[left=4cm,bottom=3cm,top=3cm,right=4cm,nohead,nofoot]{geometry}
\usepackage{graphicx}
\usepackage{tabularx}
\usepackage{listings}
\usepackage{enumitem}
\usepackage{subcaption}
\usepackage{amsmath}
\usepackage{float}
\usepackage{fancyvrb} % for "\Verb" macro
\usepackage{hyperref}

\usepackage{pgf}
\usepackage{tikz}
\usetikzlibrary{arrows,automata}

\usepackage{xparse}
\usepackage{multirow}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\setlength{\textfloatsep}{16pt}

\renewcommand{\labelenumi}{\alph{enumi})}
\renewcommand{\labelenumii}{\arabic{enumii}) }

\newcommand{\baseinfo}[5]{
  \begin{center}
    \begin{tabular}{p{15cm}r}
      \vspace{-4.5pt}{ \Large \bfseries #1} & \multirow{2}{*}{} \\[0.4cm]
      #2 & \\[0.5cm]
    \end{tabular}
  \end{center}
  \vspace{-18pt}\hrule\vspace{6pt}
  \begin{tabular}{ll}
    \textbf{Name:}  & #4\\
    \textbf{Group:} & #5\\
  \end{tabular}
  \vspace{4pt}\hrule\vspace{2pt}
  \footnotesize \textbf{Software Testing} \hfil - \hfil Summer 2022 \hfil - \hfil #3 \hfil - \hfil Sibylle Schupp / Sascha Lehmann \hfil \\
}

\newcounter{question}
\NewDocumentEnvironment{question}{m o}{%
  \addtocounter{question}{1}%
  \paragraph{\textcolor{red}{Task~\arabic{question}} - #1\hfill\IfNoValueTF{#2}{}{[#2 P]}}
  \leavevmode\\%
}{%
  \vskip 1em%
}

\NewDocumentEnvironment{answer}{}{%
  \vspace{6pt}
  \leavevmode\\
  \textit{Answer:}\\[-0.25cm]
  {\color{red}\rule{\textwidth}{0.4mm}}
}{%
  \leavevmode\\
  {\color{red}\rule{\textwidth}{0.4mm}}
}

\newcommand{\projectinfo}[5]{
  \baseinfo{Project Phase #1 - Submission Sheet}{#2}{#3}{#4}{#5}
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\def\name{Michael Chen}
\def\group{Group 01 (fastjson)}

\begin{document}
\projectinfo{2}{Software Testing - Input Space Partitioning\small}{\today}{\name}{\group}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% Task 1 %%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{question}{Answer basic questions on ISP-Testing}[3]
\begin{enumerate}[topsep=0pt, leftmargin=*]
  \item Define the following terms - related to ISP - in your own words:
  \begin{enumerate}
      \item Input Domain
      \begin{answer}
      The input domain is the set of all possible combinations of input values a program accepts. For instance, a function \texttt{int test(bool x, int y)} has the input domain:
      \begin{equation}
        \left\{ \left(\texttt{true},0\right), \left(\texttt{false},0\right), \left(\texttt{true},1\right), \left(\texttt{false},1\right), \left(\texttt{true},-1\right), \left(\texttt{false},-1\right), \dots \right\}
      \end{equation}
      Note, however, that the inputs to a function, especially in low-level programming, also include implicit inputs such as object state, global variables, persistent data, user input, and many others inherited from the environment. This makes pure functions or lambda functions with explicit captures easier to model.
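
      As a rough size estimate, assuming \texttt{int} is a 32-bit two's-complement type (the signature above does not fix the width, so this is an assumption), the explicit part of this domain is the Cartesian product
      \begin{equation}
        D = \{\texttt{true},\texttt{false}\} \times \{-2^{31},\dots,2^{31}-1\}, \qquad |D| = 2 \cdot 2^{32} = 2^{33},
      \end{equation}
      which is roughly $8.6 \cdot 10^{9}$ inputs before any implicit state is counted.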
      \end{answer}

      \item Domain Partition (include the two criteria a partition needs to fulfill)
      \begin{answer}
      A domain partition divides the domain into blocks that are pairwise disjoint, meaning every element in the domain is contained in at most one block (\textit{disjointness}), and whose union equals the entire domain, meaning every element in the domain is contained in at least one block (\textit{completeness}). It follows that every element in the domain is contained in exactly one block of the partition.
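
      Written formally for a domain $D$ and blocks $b_1,\dots,b_n$ (the symbols are my own notation, not taken from the task sheet), the two criteria read:
      \begin{equation}
        b_i \cap b_j = \emptyset \;\; \text{for all } i \neq j, \qquad \bigcup_{i=1}^{n} b_i = D .
      \end{equation}
      For example, if $D$ is the set of all integers, the blocks $b_1 = \{x \mid x < 0\}$, $b_2 = \{0\}$ and $b_3 = \{x \mid x > 0\}$ satisfy both criteria, whereas $\{x \mid x \leq 0\}$ and $\{x \mid x \geq 0\}$ do not, because $0$ lies in both.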
      \end{answer}

      \item Characteristics
      \begin{answer}
      A characteristic is a property of a domain that can be used to generate a corresponding partition that might be useful for testing. Characteristics usually capture interesting values or relations between different input values.
      \end{answer}

      \item Equivalence Class
      \begin{answer}
      An equivalence class is a set of values of a domain that are equivalent to one another under an application-specific equivalence relation. For example, an equivalence relation on area groups shapes of arbitrary form into classes of equal area. An equivalence relation that corresponds to a well-formed characteristic partitions the domain, and its equivalence classes are the blocks.
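
      As a small worked example of my own (not taken from the task sheet), take the relation $x \sim y$ on the integers that holds exactly when $x - y$ is divisible by $3$. It has three equivalence classes:
      \begin{equation}
        [0] = \{\dots,-3,0,3,\dots\}, \quad [1] = \{\dots,-2,1,4,\dots\}, \quad [2] = \{\dots,-1,2,5,\dots\}
      \end{equation}
      These classes are pairwise disjoint and together cover all integers, so they form a partition.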
      \end{answer}

  \end{enumerate}

  \item Which testing situations are suitable for the ISP approach?
  \begin{answer}
  ISP is suitable for complex functions that require extensive testing but whose input domain is too large to cover even in substantial parts. In such scenarios ISP provides a systematic way to select inputs from functionally different categories. Instead of picking inputs blindly, ISP helps to pick specific values that ideally trigger every relevant error scenario. It also indicates when enough tests have been executed, because the partitions are complete (provided that the characteristics chosen for partitioning are themselves adequate).
  \end{answer}

  \item Name and briefly explain the two main strategies to model the input domain. What are their significant pros and cons?
  \begin{answer}
  \textbf{Interface-Based:} We extract characteristics from the individual input parameters, without considering relations between them. This allows for a rather easy and automatable identification of characteristics. However, it does not incorporate any knowledge of the semantics of the test subject and might thus miss important cases where the functionality depends on a combination of inputs.

  \textbf{Functionality-Based:} Here we create characteristics from the intended behaviour of the program or from its output values. This approach requires domain knowledge as well as knowledge of the semantics of the program, which makes it noticeably harder to develop characteristics. Sometimes program specifications directly correspond to possible characteristics by specifying the wanted behaviour, which \textit{can} make it easy to model the input domain and even to develop the IDM in early development stages, because specifications are written before the program is implemented. This method is harder to automate because it requires domain knowledge, but it might yield more effective tests for the same reason (No-free-lunch theorem).
  \end{answer}

  \item Using the standard values that may be provided in the specifications is one way to select the actual test values from each partition block. Which other strategies can you consider to derive representative values from a given partition? Name and describe two of them.
  \begin{answer}
  \textbf{Pair-wise:} With this strategy a value from each block of each characteristic has to be combined with a value from every block of every other characteristic. This ensures that all pairs of blocks are covered without having to test the full Cartesian product of test values.

  \textbf{Base-choice:} This method again incorporates domain knowledge to select a base block for each characteristic. The base test combines the values of all base blocks. Further tests are then derived by keeping all but one characteristic at its base block and substituting each non-base block of the remaining characteristic in turn. This massively decreases the number of test cases while increasing the selection effort (No-free-lunch theorem, similar to functionality-based IDM).
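
  To give a feeling for the sizes involved, consider a hypothetical model with four characteristics that have $B_1 = 3$, $B_2 = 3$, $B_3 = 2$ and $B_4 = 2$ blocks (numbers chosen for illustration only). The usual counting formulas for these criteria then give:
  \begin{align}
    \text{all combinations:} \quad & \prod_{i=1}^{4} B_i = 3 \cdot 3 \cdot 2 \cdot 2 = 36, \\
    \text{pair-wise (lower bound):} \quad & \max_{i \neq j} B_i B_j = 3 \cdot 3 = 9, \\
    \text{base-choice:} \quad & 1 + \sum_{i=1}^{4} (B_i - 1) = 1 + 2 + 2 + 1 + 1 = 7.
  \end{align}
  The pair-wise figure is only a lower bound; the number of tests a tool actually produces can be higher.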
  \end{answer}

\end{enumerate}
\end{question}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% Task 2 %%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{question}{Apply ISP-Testing to a sample component}[5]
\begin{enumerate}[topsep=0pt, leftmargin=*]
  \item Gather the results of the following sub-tasks as table:
  \begin{enumerate}
    \item The distinction between the minimum, maximum and remaining \texttt{int} values is a possible interface-based partition for the input parameters \textit{a} and \textit{b}. Come up with another interface-based characteristic and the corresponding partition.
    \item The distinction between valid and invalid divisions / modulo operations is a possible functionality-based partition for the input parameters \textit{a} and \textit{b}. Come up with another functionality-based characteristic and the corresponding partition.
  \end{enumerate}
  \begin{answer}
  See table~\ref{tab:characteristics}.
  \begin{table}
    \centering
    \begin{tabular}{llll}
    \hline
    Characteristic & Eq-Class 1 & Eq-Class 2 & Eq-Class 3 \\
    \hline
    $q_1 = \text{"\texttt{int} values"}$ & $\{a,b=\texttt{int.Max}\}$ & $\{a,b=\texttt{int.Min}\}$ & remaining \\
    $q_2 = \text{"signedness"}$ & $\{a,b>0\}$ & $\{a,b<0\}$ & $\{a,b=0\}$ \\
    $q_3 = \text{"valid \texttt{mod}"}$ & $\{b\neq{}0\}$ & $\{b=0\}$ \\
    $q_4 = \text{"divisor"}$ & $\{(a \bmod b) = 0\}$ & $\{(a \bmod b) \neq 0\}$ \\
    \hline
    \end{tabular}
    \caption{Equivalence classes for different characteristics}
    \label{tab:characteristics}
  \end{table}
  \end{answer}

  \item Can you identify any non-valid combinations of blocks from your characteristics? Develop the required constraints that prevent these combinations.
  \begin{answer}
  A simple invalid block combination arises between the valid modulo operation ($q_3$) and the signedness of $b$ ($q_2$): the block $\{b\neq 0\}$ cannot occur together with the zero block for $b$. We can prevent that by disallowing $b$ to be zero whenever the valid \texttt{mod} block is selected. Also, when $a$ or $b$ is fixed at the maximum or minimum integer value, we cannot combine this with all $q_4$ blocks.
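
  One possible way to write such constraints down, as a sketch in my own notation rather than the exact content of the ACTS system file, is:
  \begin{align}
    \text{$q_3$ block is } \{b \neq 0\} \;&\Longleftrightarrow\; \text{$q_2$ block of $b$ is positive or negative}, \\
    a = b \,\wedge\, b \neq 0 \;&\Longrightarrow\; (a \bmod b) = 0, \\
    a = 0 \,\wedge\, b \neq 0 \;&\Longrightarrow\; (a \bmod b) = 0.
  \end{align}
  The last two lines show why, for instance, $a = b = \texttt{int.Max}$ or a zero $a$ cannot be combined with the block $\{(a \bmod b) \neq 0\}$ of $q_4$.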
  \end{answer}

  \item Derive one representative value for each of the blocks of the exemplary and your own characteristics. Which approach did you choose to select the values?
  \begin{answer}
  See table~\ref{tab:testvalues}.
  \begin{table}
    \centering
    \begin{tabular}{llll}
    \hline
    Characteristic & Val-Class 1 & Val-Class 2 & Val-Class 3 \\
    \hline
    $q_1 = \text{"\texttt{int} values"}$ & $(\texttt{int.Max},b)$ & $(\texttt{int.Min},b)$ & $(1,b)$ \\
    $q_2 = \text{"signedness"}$ & $(1,b)$ & $(-1,b)$ & $(0,b)$ \\
    $q_3 = \text{"valid \texttt{mod}"}$ & $(5,1)$ & $(5,0)$ \\
    $q_4 = \text{"divisor"}$ & $(20,5)$ & $(21,5)$ \\
    \hline
    \end{tabular}
    \caption{Exemplary test values for different characteristics}
    \label{tab:testvalues}
  \end{table}
  \end{answer}

  \item Read the \texttt{acts\_user\_guide} to get a basic understanding of how to create a system for ACTS and how to generate test vector sets of this system.
  \begin{answer}
  See \texttt{Phase02\_Task2\_ACTS\_System.txt} file.
  \end{answer}

  \item Create a system named \textit{Phase02\_Task2\_ACTS\_System.txt} based on your characteristics and their representative block values. You can use the \texttt{Enum} type with meaningful names for any characteristics not representable as \texttt{Boolean}, \texttt{Number} or \texttt{Range}. Use ACTS in command line mode to generate the set of test vectors without constraints.
  \begin{answer}
  ACTS generated $81$ tests covering $411$ tuples in $0.6$ seconds.
  \end{answer}

  \item In the \texttt{Constraint} section of your system file, add the constraints you identified for valid combinations of blocks, and generate a new set of test vectors. Compare the number of test vectors of the unconstrained and constrained system. Check if the new set of test vectors still contains non-valid combinations, and adapt your constraints if necessary.
  \begin{answer}
  Now ACTS generated $83$ tests covering $312$ tuples in $12.9$ seconds and due to the constraints more than $50$ tuples were forbidden.
  \end{answer}

\end{enumerate}
\end{question}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% Task 3 %%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{question}{Apply ISP-Testing to your software project}[8]
\begin{enumerate}[topsep=0pt, leftmargin=*]
    \item Explain why the selected method is suitable for the input domain model approach (1-2 sentences)
    \begin{answer}
    I chose the \texttt{JSONScanner.token()} function as a test subject. This method is particularly interesting because it does not have any direct function parameters but rather depends solely on the state of an object as its input. This does not make the method particularly suitable for the IDM approach, but it is still important to test it extensively because the entire project builds on it.
    \end{answer}

    \item Identify the input domain of your method (2-3 sentences)
    \begin{answer}
    The \texttt{JSONScanner} is quite a complex object and not all of its fields are of interest for this specific method, so I will only name the input parameters that are relevant for this test subject. Most significantly, the text field of the scanner (called $t$) is of importance because it provides the scanner with the raw character data it has to tokenize. Next, the current token position and the token end position are important to keep track of the current state of the lexer. Finally, the input text length is of importance as a boundary for the position parameters. It also influences the EOF state of the object, which signals whether the input buffer is exhausted.
    \end{answer}

    \item Similar to task 2, gather the results of the following sub-tasks as shown in Table~\ref{table:example}:
    \begin{enumerate}
        \item Identify reasonable characteristics of the possible input data
        \item Derive sets of equivalence classes from these characteristics (Note: At least one of your partitions must consist of a number of blocks $\geq 3$. Also make sure that you have enough characteristics and corresponding blocks to derive $12$ distinct tests from them later on.)
    \end{enumerate}
    \begin{answer}
    See table~\ref{tab:scannercharacteristics}. Note: the predicates valid and invalid on $t$ correspond to the input containing a well-formed sequence of tokens or not, respectively.
    \begin{table}
      \centering
      \begin{tabular}{llll}
      \hline
      Characteristic & Eq-Class 1 & Eq-Class 2 & Eq-Class 3 \\
      \hline
      $q_1 = \text{"input token"}$ & others & \multicolumn{2}{c}{$\forall T : \{ t = "T" \}$} \\
      $q_2 = \text{"position"}$ & $\{0<\text{pos}<\text{len}-1\}$ & $\{\text{pos}=0\}$ & $\{\text{pos}=\text{len}-1\}$ \\
      $q_3 = \text{"EOF state"}$ & $\{\text{eof} = \texttt{true}\}$ & $\{\text{eof} = \texttt{false}\}$ \\
      $q_4 = \text{"input format"}$ & $\{t\text{ is valid}\}$ & $\{t\text{ is invalid}\}$ \\
      \hline
      \end{tabular}
      \caption{Characteristics for the \texttt{JSONScanner.token()} function}
      \label{tab:scannercharacteristics}
    \end{table}
    \end{answer}

    \item Can you identify any non-valid combinations of blocks from your characteristics? (1-2 sentences)
    \begin{answer}
    The EOF flag and the position are tightly coupled: if the token at the current position consumes the rest of the input buffer, the object will signal an EOF soon, so not every combination of $q_2$ and $q_3$ blocks is reachable. Another invalid combination exists between characteristics $q_1$ and $q_4$: trivially, every single input token taken as a string is also a well-formed sequence of tokens, thus the second class of $q_4$ is not achievable with any of the token strings.
    \end{answer}

    \item Derive (at least) one representative value for each equivalence class. Which strategy did you use for the value selection?
    \begin{answer}
    See table~\ref{tab:scannervalues}.
    \begin{table}
      \centering
      \begin{tabular}{llll}
      \hline
      Characteristic & Val-Class 1 & Val-Class 2 & Val-Class 3 \\
      \hline
      $q_1 = \text{"input token"}$ & \texttt{"5,:"} & \multicolumn{2}{c}{\texttt{""text""}, \texttt{"5"}, \texttt{":"}, $\dots$} \\
      $q_2 = \text{"position"}$ & $1$ & $0$ & $\text{len}-1$ \\
      $q_3 = \text{"EOF state"}$ & \texttt{true} & \texttt{false} \\
      $q_4 = \text{"input format"}$ & \multicolumn{2}{c}{\Verb+"\{"Name":"Michael","Last":"Chen"\}"+} & \texttt{"53fds"} \\
      \hline
      \end{tabular}
      \caption{Representative test values for the \texttt{JSONScanner.token()} function}
      \label{tab:scannervalues}
    \end{table}
    \end{answer}

    \item Combine the selected values from the partitions of each input argument to complete test vectors. You can decide whether you want to approach this sub-task manually or using \textit{ACTS} (therefore, only the resulting vectors should be documented).
    \begin{answer}
    See test suite in file \texttt{JSONScannerTest.java}.
    \end{answer}

    \item Create JUnit tests for your method, which trigger test runs with the selected test vectors, and assert each individual output. For the submission, merge all these tests into one \texttt{*.java} file.
    \begin{answer}
    See test suite in file \texttt{JSONScannerTest.java}.
    \end{answer}

\end{enumerate}
\end{question}

\end{document}