{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Lecture 3: Basic algorithms"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Random generation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Random numbers can be generated with the `random` module."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import random\n",
    "number = random.randint(1, 100)\n",
    "print(number)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Generate 10 numbers between 1 and 100:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "numbers = []\n",
    "for i in range(10):\n",
    "    numbers.append(random.randint(1, 100))\n",
    "print(numbers)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Same with using Python's list generator expressions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "numbers = [random.randint(1, 100) for _ in range(10)]\n",
    "print(numbers)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*Help:*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "numbers = [10, 20, 30, 40, 50]\n",
    "print(numbers)\n",
    "\n",
    "# iteration\n",
    "halves = []\n",
    "for x in numbers:\n",
    "    halves.append(x / 2)\n",
    "print(halves)\n",
    "\n",
    "# list generation\n",
    "halves = [x / 2 for x in numbers]\n",
    "print(halves)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pseudo-random numbers"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There is no real randomicity in computer science (unless you build a device which measures e.g. cosmic radiation). These numbers generated are so called pseudo-random numbers. They are generated by a deterministic mathematical algorithm along a uniform distribution.\n",
    "\n",
    "The following algorithm will always generate the same number, because we seeded the algorithm a fix initial value:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "random.seed(42) # To reproduce the same results\n",
    "data = [random.randint(1, 100) for _ in range(10)]\n",
    "print(data)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "By default the seed value is related to the milliseconds passed since the start of the computer, hence it will look like real \"random\" numbers."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Summation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let $f: [m..n] \\rightarrow H$ be a function. Let the addition operator $+$ be defined over the elements of $H$, which is an associative operation with a left identity element $0$. Our task is to summarize the values of $f$ function over the interval.\n",
    "Formally:\n",
    "$$s = \\sum_{i=m}^{n} f(i)$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Counting"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Given the $[m..n]$ interval, count the number of items inside it.\n",
    "Formally:\n",
    "$$s = \\sum_{\\substack{i=m}}^{n} 1$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Maximum search"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let $f: [m..n] \\rightarrow H$ be a function, $m \\le n$. Over the elements of $H$ let a total ordering relation be defined (reflexivity, antisymmetry, tranzitivity and connexcity), with a symbol $\\le$, for the strict version $<$.\n",
    "Our task is to determine the greatest value in the interval. Also determine an element of the interval, where function $f$ evalutes to this greatest value.\n",
    "Formally:\n",
    "$$max = f(ind) \\wedge \\forall i \\in [m..n]: f(i) \\le f(ind)$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Linear search"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let $\\beta: [m..n] \\rightarrow \\mathbb{L}$ condition be defined. Determine the first element of the interval which fulfills the condition (if any).\n",
    "Formally:\n",
    "\\begin{equation*}\n",
    "\t\\begin{aligned}\n",
    "\t\tl = (\\exists i \\in [m..n]:\\beta(i)) \\\\\n",
    "\t\tl \\rightarrow (ind \\in [m..n] \\wedge \\beta(ind) \\wedge \\forall i \\in [m..ind-1]: \\neg\\beta(i))\n",
    "\t\\end{aligned}\n",
    "\\end{equation*}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Beta condition"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Introduce an `is_even` function which determine whether a number is even or not:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def is_even(number):\n",
    "    return number % 2 == 0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Solution"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Conditional summation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The algorithm of summation can be further generalized when a $\\beta: [m..n] \\rightarrow \\mathbb{L}$ condition is defined to restrict the set of elements.\n",
    "\n",
    "Let $f: [m..n] \\rightarrow H$ be a function and $\\beta: [m..n] \\rightarrow \\mathbb{L}$ a condition. Let the addition operator $+$ be defined over the elements of $H$, which is an associative operation with a left identity element $0$. Our task is to summarize the values of $f$ function over the interval where the $\\beta$ condition is fulfilled.\n",
    "Formally:\n",
    "$$s = \\sum_{\\substack{i=m\\\\ \\beta(i)}}^{n} f(i)$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Conditional counting"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let $\\beta: [m..n] \\rightarrow \\mathbb{L}$ be a condition. Count how many items of the interval fulfills the condition!\n",
    "Formally:\n",
    "$$s = \\sum_{\\substack{i=m\\\\ \\beta(i)}}^{n} 1$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Conditional maximum search"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The algorithm of maximum search can be further generalized with combining the $\\beta: [m..n] \\rightarrow \\mathbb{L}$ condition used in *linear search* as a restriction. Note that now the existence of a maximum value is not guaranteed.\n",
    "\n",
    "Let $f: [m..n] \\rightarrow H$ be a function and $\\beta: [m..n] \\rightarrow \\mathbb{L}$ a condition. Over the elements of $H$ let a total ordering relation be defined (reflexivity, antisymmetry, tranzitivity and connexcity), with a symbol $\\le$, for the strict version $<$.\n",
    "Our task is to determine the greatest value in the interval which fulfills the $\\beta$ condition. Also determine an element of the interval, where function $f$ evalutes to this greatest value.\n",
    "Formally:\n",
    "\\begin{equation*}\n",
    "\t\\begin{aligned}\n",
    "\t\tl = (\\exists i \\in [m..n]:\\beta(i)) \\\\\n",
    "\t\tl \\rightarrow (\\beta(ind) \\wedge max = f(ind) \\wedge \\forall i \\in [m..n]: \\beta(i) \\rightarrow f(i) \\le f(ind))\n",
    "\t\\end{aligned}\n",
    "\\end{equation*}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Task:** read the area and population data from `02_neighbours.csv`. Calculate the population density for each neighbouring country. Determine which country has the highest population density."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Approach 1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "By manually iterating through all the row with the `iterrows()` function. Example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "neighbours_df = pd.read_csv(\"02_neighbours.csv\")\n",
    "for index, row in neighbours_df.iterrows():\n",
    "    print(\"Index: %d, Name: %s\" % (index, row[\"Country\"]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now calculate the maximum density:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Approach 2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "By adding a new column to the *Pandas DataFrames*:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "def population_density(row):\n",
    "    return row['Population'] / row['Area']\n",
    "\n",
    "neighbours_df = pd.read_csv(\"02_neighbours.csv\")\n",
    "neighbours_df['Density'] = neighbours_df.apply(population_density, axis = 1)\n",
    "\n",
    "display(neighbours_df) # display is a special Jupyter Notebook function to provide a pretty display of complex data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then use the `max()` and `idxmax()` function of the *Pandas DataFrames*:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Logarithmic search (optional)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*Also called binary search.*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let $f: [m..n] \\rightarrow H$ be a *monotonically increasing* function. Over the elements of $H$ let a total ordering relation be defined (reflexivity, antisymmetry, tranzitivity and connexcity), with a symbol $\\le$, for the strict version $<$.\n",
    "Determine whether function $f$ evaluates to a given value $h \\in H$ at any location. If yes, specify such a location.\n",
    "Formally:\n",
    "$$l = (\\exists i \\in [m..n]: f(i) = h) \\wedge l \\rightarrow f(ind) = h$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}