\documentclass[12pt]{article}
% Author: Sergey Kopeliovich
\usepackage[T2A]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{hyperref}
\usepackage{datetime}
\usepackage{cmap}
\usepackage{enumerate}
\usepackage{hologo}
\usepackage{minted}
\usepackage{lastpage}
\usepackage{amsmath}
\usepackage{amssymb}
\sloppy
\voffset=-20mm
\textheight=220mm
\hoffset=-25mm
\textwidth=180mm
\def\EPS{\varepsilon}
\def\SO{\Rightarrow}
\def\EQ{\Leftrightarrow}
\def\t{\texttt}
\def\O{\mathcal{O}}
\newcommand{\q}[1]{\langle #1 \rangle} %
\newcommand\URL[1]{{\footnotesize{\url{#1}}}}
\def\myindent{\hspace*{\parindent}}
\def\up{\vspace*{-\baselineskip}}
\def\NO{\t{\#}}
\newcommand{\ITEM}[1]{{\bf \underline{#1}}\hspace{0.5em}}
\makeatletter
\renewcommand{\@oddfoot}{
\parbox{\textwidth}{
%\hrule
%\vspace{6pt}
\sffamily
{{\hfil}\thepage/\pageref*{LastPage} \hfil}%
}
}
\newenvironment{MyList}{
\begin{enumerate}
\setlength{\parskip}{-5pt}
\setlength{\itemsep}{5pt}
}{
\end{enumerate}
}
\begin{document}
\begin{center}
{\Large \bf MIPT, camp 2014, day \NO{3}} \\
\vspace{0.5em}
{\Large \bf Theme: data structures \\ \vspace{0.2em} sqrt-decomposition, semiplanes intersection} \\
\vspace{0.5em}
{\large November 14, Sergey Kopeliovich} \\
\end{center}
\vspace{-1em}
\noindent \underline{\hbox to 1\textwidth{{ } \hfil{ } \hfil{ } }}
\section{Sqrt decomposition}
\myindent{}Consider searching substrings $s_1, s_2, \dots, s_k$ of equal length $l$ in the string $t$.
For each $i$ we are interested, if $s_i$ is a substring of $t$.
We can solve this problem in linear time, in $\O(lk+|t|)$ time, using hash table of polynomial hashes of the strings $s$.
Here you may ask, why don't we use Aho-Corasic? Let imagine, we just do not know this algorithm,
but already heard about hashes and hash tables.
Let we have strings of arbitrary lengths. How to upgrade our solution to previous problem?
The idea of sqrt decomposition helps. Let denote summary length of all strings $s_i$ as $L$ then
we may iterate lengths less than $\sqrt{L}$ in time $\O(L + |t|\sqrt{L})$.
Moreover we have at most $\O(\sqrt{L})$ strings of length at least $\sqrt{L}$, so
summary work time of solution "group string by length and proceed each length in linear time"
is also $\O(L + |t|\sqrt{L})$.
You may use this approach not only string problems. For example, consider problems about trees.
Let we have tree of $n$ vertices. It may contain a lot of vertices of degree less than $\sqrt{n}$,
but there are at most $\sqrt{n}$ vertices of degree $\sqrt{n}$.
\section{Sqrt decomposition on array}
\myindent{}Consider an abstract problem "we have an array $a$ and want to proceed many different hard queries on its subsegments".
Let start with the simplest problem. Query \NO{1}: sum on the segment. Query \NO{2}: change $a[i]$.
Let denote data structure which can proceed the first type queries in time $f(n)$ and can proceed queries of the second type in time $g(n)$ as $[f(n), g(n)]$.
For example, segment tree as well as Fenwick tree is $[\O(\log n), \O(\log n)]$ data structure.
Below we'll describe $[\O(1), \O(\sqrt{n})]$ and $[\O(\sqrt{n}), \O(1)]$ data structures.
Denote $k = \lfloor\sqrt{n}\rfloor$.
\vspace{0.5em}
Solution \NO{0}. Note, we may calculate partial sums of initial array \t{sum[i+1]=sum[i]+a[i]},
sum on the segment $[l,r]$ is equal to \t{sum[r+1]-sum[l]}, and during changing any $a[i]$ we may simply recalculate all the array \t{sum[]}.
Another way is do not precompute anything, we may calculate sum on the segment in linear time.
These are solutions in $[\O(n), \O(1)]$ and $[\O(1), \O(n)]$.
\vspace{0.5em}
Solution \NO{1} in $[\O(k), \O(1)]$. Let maintain array $s$, $s[i]$ is equal to sum of $a[j]$ for $j \in [ki..k(i+1))$.
Query "sum on the segment": the segment can be viewed as head + body + tail, where body consist of big parts whose sums we already know.
Head and tail are not longer than $k$.
Query "change $a[i]$": \t{set(i,x) \{ s[i/k] += x-a[i]; a[i] = x; \}}
\vspace{0.5em}
Solution \NO{2} in $[\O(1), \O(k)]$. Let maintain the same array $s$, and partial sums on it.
Moreover let maintain partial sums for each segment $[ki..k(i+1))$.
To change $a[i]$ we have to fully recalculate two arrays of partial sums (sums of small part, sums of $s$).
To get sum on the segment, we may view it as head + body + tail. For each of three parts we may get the sum in $\O(1)$ time.
\vspace{0.5em}
Solution \NO{3} in $[\O(k), \O(k)]$. Let maintain partial sums
\t{sum[i+1]=sum[i]+a[i]} and array of changes, which have been applied to array since partial sums were calculated. Denote with array as $Changes$.
One change is pair $\q{i,x}$. It denotes operation \t{a[i]+=x}.
Let maintain property $|Changes| \le k$.
Query "sum on the segment": sum on the segment $[l,r]$ in initial array was equal to \t{sum[r+1]-sum[l]},
in $\O(|Changes|)$ time we may calculate, how much it was changed.
Query "change $a[i]$": to set $a[i] := x$, let add to $Changes$ the pair $\q{i,x-a[i]}$, and put $x$ into $a[i]$.
If now $|Changes| > k$ then build partial sums of current version of the array in time $\O(n)$, and clear the list $Changes$.
Let denote this operation $Rebuild$. Note we'll call $Rebuild$ not so often. One time per $k$ queries. So
amortized time of proceeding one query is $\O(\frac{n}{k}) = \O(\sqrt{n})$.
\vspace{0.5em}
Last approach we'll call {\it{delayed operations}}.
This approach has no advantages solving this task. But it will be useful later.
\section{Sqrt decomposition on array: split \& rebuild}
Let solve more complicated task: we have four operations with the array
\begin{MyList}
\item {\it{Insert(i, x)}} -- insert $x$ on $i$-th position.
\item {\it{Erase(i)}} -- erase $i$-th element of the array.
\item {\it{Sum(l,r,x)}} -- calculate sum of elements greater than $x$ on the segment $[l,r]$.
\item {\it{Reverse(l,r)}} -- reverse the segment $[l,r]$.
\end{MyList}
To solve this problem, at first, let answer to query {\it{Sum(l,r,x)}} (the only query with some answer),
applied to all the array, ie query {\it{Sum(0,n-1,x)}}.
Let we have no other types of queries, only the queries of type {\it{Sum}}.
So we can easily maintain sorted array and partial sums on it.
To get answer to query let do binary search and get sum on the suffix.
The time to build the structure is $\O(n \log n)$, the time to answer one query is $\O(\log n)$.
Let solve full version of the problem.
We have the array $a[0..n)$. We will maintain some partition of the array into segments $T = [A_1, A_2, \dots, A_m]$.
For each segment $A_i$ we store two versions -- initial and sorted with partial sums.
We will maintain two properties: $\forall i \colon |A_i| \le \sqrt{n}$ and $m < 3\sqrt{n}$.
Initially let split the array into $k = \sqrt{n}$ segments of length $\sqrt{n}$.
For each of $k$ segments we'll call operation {\it Build} which sorts the segments and calculates partial sums.
The time spent for each segment is $\sqrt{n}\log n$
Now let describe the main operation {\it Split(i)},
which returns such $j$, that $i$-the element is the beginning of $j$-th segment.
If $i$ is not any beginning, find $A_j = [l,r)$, such that $l < i < r$,
and split it into two segments $B = [l,i)$ and $C = [i,r)$. For segments $B$ and $C$ call {\it Build},
we've got new partition of the array into segments: $T' = [A_1, A_2, \dots, A_{j-1}, B, C, A_{j+1}, \dots, A_m]$.
Time to proceed one {\it Split(i)} is $\O(k) + \O(\t{build}(\frac{n}{k}))$, means if $k = \sqrt{n}$ time is $\sqrt{n}\log n$.
Here we assume $T$ stores not the segments, but links to it, so to "copy" segment we need only $\O(1)$ of time.
We have {\it Split(i)}. Now let express all other operations in terms of {\it Split(i)}.
\begin{minted}{c}
vector T;
int Split( int i ) { ... }
void Insert(i, x) {
a[n++] = x;
int j = Split(i);
T.insert(T.begin() + j, new Segment(n-1, n));
}
void Erase(i) {
int j = Split(i);
split(i + 1);
T.erase(T.begin() + j);
}
int Sum(l, r, x) { // [l, r]
l = split(l), r = split(r + 1); // [l, r)
int res = 0;
while (l <= r)
res += T[l++].get(x); // binary search and use partial sums
return res;
}
\end{minted}
To proceed queries of type \t{Reverse}, we need to store additional flag "is the segment reversed",
implementation of {\it Split(i)} becomes bit more complicated, all other parts stay the same.
\begin{minted}{c}
void Reverse(l, r) {
l = split(l), r = split(r + 1);
reverse(T + l, T + r)
while (l <= r)
T[l++].reversed ^= 1;
}
\end{minted}
We have the solution, which starts with $k = \sqrt{n}$ segments, proceeding of each query may increase $k$ by at most $2$.
After $k$ queries it may happens, that $k \ge 3\sqrt{n}$ (three times bigger), at this moment let rebuild all the structure in $\O(n \log n)$ time.
Amortized time of one rebuild is $\frac{n \log n}{k} = \sqrt{n}\log n$.
In total our solution can proceed any query in amortized time $\O(\sqrt{n}\log n)$.
We may speed up it. Note, {\it Split(i)} may be done in linear time, more precisely $\O(k+\frac{n}{k})$,
rebuild may be also done in liner time, in $\O(n)$.
So let number of segments $k$ is equal to $\sqrt{n /\log n}$,
amortized time to proceed one query of type {\it Sum} is $\O(S + G + R)$, where
$S$ -- time of {\it Split(i)}, is equal to $\O(\frac{n}{k}+k)$,
$G$ -- time of inner for-loop of function {\it Sum}, is equal to $\O(k\log n)$,
$B$ -- amortized time per one rebuild, is equal to $\O(\frac{n}{k})$.
In total we get $\O(\sqrt{nlogn})$ per query.
\section{Semiplanes intersection}
\myindent{}Let solve one complicated problem \href{http://acm.timus.ru/problem.aspx?space=1&num=1390}{acm.timus.ru:1390}.
The statement is: we stay on the plane at point $(0,0)$. From time to time walls appear on the plane.
The walls are segments do not containing $(0,0)$. The walls may intersect each other.
From time to time we launch the bullet. The bullet flies in a straight line, while does not touch a wall.
Answer to launch-bullet-query (type = "bullet") is distance, which buller has flied (possibly infinity).
In total we'll learn to proceed $n$ queries of arbitrary type in time $\O(n \log^2n)$.
But at first let try to understand how to approach this problem.
Problem is complicated. Queries of different types in arbitrary order, walls may intersect...
Let's try to simplify the task. Let there are some walls, but no new walls will appear. Moreover
let all the walls are straight infinite lines.
{\bf Now the task is following:}
we have straight lines, which cut out convex polygon, containing point $(0,0)$,
we need to proceed queries kind of by given ray from $(0,0)$ find intersection point of the ray and the border of the polygon.
Solution to the task: let intersect all semiplanes in $\O(n \log n)$, we'll get an convex polygon.
Then each query can made by binary search in $\O(\log n)$ time.
Describe the solution more precisely. Let start from binary search.
Take angle of any polygon's vertex and denote it by $\alpha$.
For each other vertex take angle from $[\alpha, \alpha+2\pi)$, write down this numbers in ascending order,
we'll get an array of angles, denote it $a$. To know, where the bullet flying into $(x,y)$ will stop, let calculate its angle $\beta \in [\alpha, \alpha+2\pi)$
and find (using binary search) such minimal $i \colon a[i] \le \beta < a[i + 1]$. Finally we simply intersect the segment, formed by points $i$, $i+1$
with the ray (trace of the bullet). Time is $\O(\log n)$.
Now let talk about semiplanes intersection.
At first, let add semiplanes $-M \le x, x \le M, -M \le y, y \le M$ for some big $M$. Now we are sure,
intersection is finite convex polygon.
Sort all lines by its angle (angle is from $0$ to $2\pi$). If several lines have the same angle, let leave only one among them -- the nearest to $(0,0)$.
Then we'll add lines in such order into the stack. Before addition of new line, some top lines in stack may be popped out.
Let two top lines on the stack are $l_1$ and $l_2$.
The top line should be popped out, if point $p = intersect(l_1, l_2)$ lies in opposite side from new line than $(0,0)$.
In the stack we obtain the answer, the polygon. To obtain whole polygon, let add all the lines
twice in the same order.
Now contents of the stack is head + the answer + tail. To cut off the tails consider
points of intersection of adjacent lines on the stack and take any point, which meets twice.
Between these two positions the answer lies.
The time to build the structure is $\O(sort) + \O(n)$.
{\bf Complicate the problem:} let some new walls (lines) appear.
We may use approach of "delayed operations" and proceed each query of type "bullet" in time $O(\sqrt{n})$.
If number of delayed operation becomes greater than $\sqrt{n}$, we call the function rebuild, which
works in $\O(n)$, because old walls (lines) are already sorted, and new $\sqrt{n}$ items we may add
in time $\O(\sqrt{n}\log n + n)$. In total time to proceed $n$ queries is $\O(n\sqrt{n})$.
We may do it in other way. Let store vertices of the polygon in balanced search tree (before we stored it in sorted array).
To add new line with direction vector $\vec{v}$ and normal $\vec{n}$ (direction of normal is out of $(0,0)$), we need
find the vertex $\vec{p}$ of the polygon such that scalar product $\q{\vec{n}, \vec{p}}$ is maximal possible.
It can be done by binary search by sorted array or, if we have BST (and we have), by descent throw BST tree from root to leaves
If the point $\vec{p}$ lies at the same side from the new line, as point $(0, 0)$ does, we should do nothing.
Else we have to erase point $\vec{p}$ and, possibly, some other adjacent points. Finally we have to add two new points instead of all deleted data.
Let delete vertices in loop, until ordinary point is at the same side from the line as point $(0, 0)$ does.
Finally we add intersection of the line and the segment (process of deletion stops, it gives us two points -- the last deleted, the first not deleted).
So the time to add new line is $\O(\log n + k \log n)$, where $k$ -- number of already deleted points.
Each point will be deleted only once, so in total we can proceed $n$ queries in $\O(n \log n)$.
In terms of speed this solution is much better than previous one. In terms of simplicity of implementation
the last one is much shorter and simply.
{\bf Complicate the problem one more time:}
now we have not lines, but segments. At fist solve the task in offline (all queries are known in advance).
Then we know set of all possible angles of all ends of all segments.
Sort it and build on the obtained array segment tree.
Each vertex corresponds to a certain range of angles.
When we add new wall, the new wall is splitted into $\O(\log n)$ parts by our segment tree.
In each of these vertices the wall lies exactly from left border two right border, means behaves like a line.
In total: in each vertex of the segment tree we'll maintain semiplanes intersection.
Note, to build semiplanes intersection inside one vertex of the segment tree, we
need to build a polyline, not a polygon.
This polyline passes from left ray to right one (we may assume, difference of angles is not greater than $\frac{\pi}{4}$).
So procedure of building of semiplanes intersect can be much simplified in this case --
it's enough to add each line only once. Also we do not need to give off the cycle.
So. We have the segment tree. In each vertex of the segment tree semiplanes intersection lies.
Query of type "bullet": angle of the bullet lies in $\log n$ ranges (vertices of the segment tree).
In each of these vertices we'll ask inner structure in time $\O(\log n)$.
Summary time is $\O(\log^2 n)$.
Query of time "add new wall": split the wall into $\log n$ vertices of the segment tree.
Add the "line" into each of the vertices. Summary working time is $\O(\log^2 n)$.
Amount of used memory is $O(n \log n)$.
Now to obtain online solution, we make from our segment tree "dynamic segment tree"
Initially segment tree consist of the only vertex -- its root.
Then during the query to segment tree, which is proceeding downwards, if we go
into nonexistent vertex, we just create it.
The process of downning stops, when length of the segment is less than $\EPS$.
We may also use "delayed operations" here. Then time to proceed query would be $\O(k + \log^2n)$.
Where $k$ -- number of delayed operations. The structure can be built in time $\O(n \log n)$: sort all the walls one time,
then in each vertex of the segment tree intersect semiplanes in linear time.
Optimal $k$ is equal to $\sqrt{n \log n}$. In total summary time to proceed $n$ queries is $(n \log n)^{3/2}$.
\end{document}