Description of the problem
Assume that we have an array A = {E1, E2, ···, EN } where N = 109 or 1 billion elements. Each
element Ei is composed of integers denoted as Lj where each integer Lj is between 0 and 9, and
3 ≤|Ei|≤20.
Now also assume that you have an array B = {F1, F2, ···, FN } where N = 109 or 1 billion
elements. Each element Fy is composed of integers denoted as Km where each integer Km is
between 0 and 9 and 1000 ≤|Fy|≤2000.
Problem Statement: The problem is to determine which elements in array A are present in
elements in array B; and if they are present which elements it is mapped to i.e what is the location
and index where it is found. Therefore two metrics needs to calculated for each element in A: if it
is present in B or not; and if it is present what is the element number and the starting index of the
element. An example is illustrated in Fig. 1.
Since the length of the elements in array A are small as compared to the elements in array B there
are multiple scenarios that are possible:
Case 1: One distinct Element Ei in array A maps to one distinct element Fy in array B.
Case 2: One distinct Element Ei in array A maps to multiple elements Fy in array B.
Case 3: Multiple elements Ei in array A maps to one distinct element Fy in array B.
For the purposes of the examination you should assume a 64-bit architecture. Your solutions to the
problem should cater for all of the cases listed and any other dimension that you can think of
the problem. All the information required is either listed in the problem and in the subsequent
problems. In order to solve the problems you can make any number of realistic assumptions.
However, any such assumptions should be clearly stated in your solutions.
Figure 1: Example illustrating the problem that needs to be solved. Notice the the solution of the problem
should spill out if an element from A is present in B; and if yes, what is the location of the element. The
location is composed of two things: the element number, and the index where the element starts as illustrated
in this example
1. Write down the pseudo code for a Parallel algorithm to compare elements in array A with elements
in array B. You can assume any realistic high performance architecture that you wish to use.
2. Prove the correctness of your stated algorithm (i.e. you arguments to tell why your algorithm should
correctly work)?
3. Give the big-O notation of your parallel algorithm, both communication and computations costs.
(Hint: this is a good place to check if your parallel algorithm is scalable. For a parallel algorithm to
be scalable your communication costs should be much smaller than the computation costs).