\documentclass[a4paper]{scrreprt}
\usepackage[left=4cm,bottom=3cm,top=3cm,right=4cm,nohead,nofoot]{geometry}
\usepackage{graphicx}
\usepackage{tabularx}
\usepackage{listings}
\usepackage{enumitem}
\usepackage{subcaption}
\usepackage{amsmath}
\usepackage{float}
\usepackage{fancyvrb} % for "\Verb" macro
\usepackage{hyperref}
\usepackage{csquotes}
\usepackage[acronym]{glossaries}
\usepackage{pgf}
\usepackage{tikz}
\usetikzlibrary{arrows,automata}
\newacronym{svm}{SVM}{support-vector machine}
\newacronym{nb}{NB}{naive Bayes}
\newacronym{roc}{ROC}{receiver operating characteristic}
\usepackage{xparse}
\usepackage{multirow}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\setlength{\textfloatsep}{16pt}
\renewcommand{\labelenumi}{\alph{enumi})}
\renewcommand{\labelenumii}{\arabic{enumii}) }
\newcommand{\baseinfo}[5]{
\begin{center}
\begin{tabular}{p{15cm}r}
\vspace{-4.5pt}{ \Large \bfseries #1} & \multirow{2}{*}{} \\[0.4cm]
#2 & \\[0.5cm]
\end{tabular}
\end{center}
\vspace{-18pt}\hrule\vspace{6pt}
\begin{tabular}{ll}
\textbf{Name:} & #4\\
\textbf{Group:} & #5\\
\end{tabular}
\vspace{4pt}\hrule\vspace{2pt}
\footnotesize \textbf{Software Testing} \hfil - \hfil Summer 2022 \hfil - \hfil #3 \hfil - \hfil Sibylle Schupp / Sascha Lehmann \hfil \\
}
\newcounter{question}
\NewDocumentEnvironment{question}{m o}{%
\addtocounter{question}{1}%
\paragraph{\textcolor{red}{Task~\arabic{question}} - #1\hfill\IfNoValueTF{#2}{}{[#2 P]}}
\leavevmode\\%
}{%
\vskip 1em%
}
\NewDocumentEnvironment{answer}{}{%
\vspace{6pt}
\leavevmode\\
\textit{Answer:}\\[-0.25cm]
{\color{red}\rule{\textwidth}{0.4mm}}
}{%
\leavevmode\\
{\color{red}\rule{\textwidth}{0.4mm}}
}
\newcommand{\projectinfo}[5]{
\baseinfo{Special Session #1 - Submission Sheet}{#2}{#3}{#4}{#5}
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\def\name{Michael Chen}
\def\group{Group 01 (fastjson)}
\begin{document}
\projectinfo{1}{Software Testing - Introduction Write-Up\small}{\today}{\name}{\group}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% Task 1 %%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{question}{Analysis Task (for the given introduction text)}[0]
\begin{enumerate}[topsep=0pt, leftmargin=*]
\item Read the provided \textit{Introduction} carefully.
\item Summarize the main contributions of the paper as depicted in / conceived from the \textit{Introduction}.
\begin{answer}
\begin{enumerate}
\item Introduction 1:
Email spam and malware are a major problem; the study therefore compares the performance of three supervised machine-learning-based spam classification algorithms. The main contribution of the paper is a review of the performance of those algorithms using six different performance metrics. The paper identifies \gls{svm} models as the most reliable means of email filtering.
\item Introduction 2:
This paper introduces a sorting algorithm with a runtime and space complexity of $\mathcal{O}(n)$.
\end{enumerate}
\end{answer}
\item Come up with two title suggestions that appropriately describe the (conceived) topic of the paper.
\begin{answer}
\begin{enumerate}
\item Introduction 1:
\enquote{Comprehensive Study on Supervised Machine Learning Models for Email Spam Filtering}
\item Introduction 2:
\enquote{Linear Complexity Integer Sorting Algorithm}
\end{enumerate}
\end{answer}
\item When you are done with a) - c), ask the supervisor for the actual title and topic of the paper.
\begin{answer}
\begin{enumerate}
\item Introduction 1:
The actual title is \enquote{Analysis and result of classification algorithm on email classification}.
\item Introduction 2:
The actual title is \enquote{Darksort: A New Linear Sorting Algorithm}.
\end{enumerate}
\end{answer}
\item What are the main shortcomings of the \textit{Introduction} in its given form?
\begin{answer}
\begin{enumerate}
\item Introduction 1:
The introduction of this paper is not as bad as I expected it to be, given the task question. It outlines the environment and use case of email communication well and establishes why the paper is important right now and why it should be of interest to researchers. The introduction also mentions prior research on several \gls{nb} models that will be compared with the new research. The methodology is outlined as well (the different performance metrics that are applied). The main shortcomings that I could identify are the following:
\begin{enumerate}
\item The introduction should end with a statement that summarizes the main idea of the entire research.
\item The introduction is also missing an outline of how the paper is structured. After reading the introduction the reader is left without a guide on how to navigate the paper.
\item Full URLs should not appear as citations in the introduction, or anywhere else in the running text of the paper; they belong in the bibliography and should be referenced via a citation.
\item Several acronyms are never expanded, namely those of the different \gls{nb} models and the \gls{roc} curve.
\item Some phrases are informal, such as \enquote{waste of time} and \enquote{such huge spam}.
\item The word \enquote{email} and some others are spelled very inconsistently. Having introduced alternative spellings of a word does not justify switching between them this often.
\item Many consecutive sentences start with the same word, specifically \enquote{it} and \enquote{email}.
\item There are some spacing and formatting issues: \enquote{email} with wrong spacing (\enquote{e-~mail}), missing Oxford commas as in \enquote{NB, NBT, BN and DTNB}, malformed citations such as \enquote{(Nilam et al.,2017~)}, the term \enquote{f~measure}, which should be \enquote{F-score}, and multiple occurrences of inconsistent whitespace.
\end{enumerate}
\item Introduction 2:
\begin{enumerate}
\item The research context is missing entirely. The introduction should mention existing sorting algorithms.
\item The importance of sorting algorithms must be highlighted, as a better complexity class for sorting algorithms could help immensely in scaling some performance-critical applications.
\item There is no need to introduce \enquote{Darksort} in upper and lower case.
\item Sentences should mostly be self-contained and not refer back to something introduced three sentences earlier: \enquote{It is an integer sorting algorithm}.
\item The outline of the paper structure is missing entirely.
\item The first sentence is informal and should rather refer to the authors in the first person plural: \enquote{we propose a new sorting algorithm}.
\end{enumerate}
\end{enumerate}
\end{answer}
\item Make suggestions on how to improve the \textit{Introduction}. Which improvements would you prioritize, and why?
\begin{answer}
\begin{enumerate}
\item Introduction 1:
The highest priority should be fixing \textit{all} formatting and spacing issues. The second highest priority should be fixing the informal language and the repetitive sentence openings. Finally, I would add the missing outline of the paper layout, as this helps any reader understand and navigate the research paper more quickly and effectively.
\item Introduction 2:
First, I would write an actual introduction to the research context. While most readers will know what a sorting algorithm is, an introduction should not omit this context entirely, even if its treatment is very brief. Breakthroughs in sorting complexity can have a great impact on other research in many different fields of study, and this should also be highlighted. After this, I would add an outline of the paper structure and a description of the methods used in the paper to analyze the new algorithm.
\end{enumerate}
\end{answer}
\end{enumerate}
\end{question}
\end{document}