andreamad8 / algo2problems


Python 1.23% TeX 47.59% C++ 0.40% Lua 1.76% HTML 48.05% Scala 0.97%

algo2problems's Introduction

Algo2Problems

This repo collects solutions to the exercises of the Algorithm course. Please push your solution, both code and any kind of text file (e.g. LaTeX, Markdown, Word, plain text), to the folder corresponding to the right exercise. We will try to keep the best solution in the PDF named "ProblemX_SOLUTION.pdf". This file may contain more than one solution.

CURRENT PROBLEMS

EX1: Range updates. SOLVED
EX2: Depth of a node in a random search tree. SOLVED
EX3: Karp-Rabin fingerprinting on strings. SOLVED
EX4: Hashing sets. SOLVED
EX5: Family of uniform hash functions. SOLVED
EX6: Deterministic data streaming. SOLVED
EX7: Special case of most frequent item in a stream. SOLVED
EX8: Count-min sketch: extension to negative counters. SOLVED
EX9: Count-min sketch: range queries. SOLVED
EX10: Space-efficient perfect hash. SOLVED
EX11: Bloom filters vs. space-efficient perfect hash. SOLVED
EX12: MinHash sketches. SOLVED
EX13: Randomized min-cut algorithm. SOLVED
EX14: External memory implicit hashing. SOLVED
EX15: Implicit navigation in vEB layout. SOLVED
EX16: 1-D range query. SOLVED
EX17: External memory mergesort. SOLVED
EX18: External memory (EM) permuting. SOLVED
EX19: Suffix sorting in EM. SOLVED
EX20: Wrong greedy for minimum vertex cover. SOLVED
EX21: Greedy 2-approximation for MAX-CUT on weighted graphs. SOLVED

POLICY

If you believe that the proposed solutions are wrong, or that there are mistakes, or you have a better solution, please do not hesitate to review the existing file or upload your own solution; uploads are very welcome. Last but not least, if you modify a file or find a mistake, please create an issue to let everyone know about it.

algo2problems's Issues

EX 14

The solution seems correct to me; the only thing I think needs fixing is the position of the padding with infinity. It is probably better to pad, when needed, with the next element in the sorted array, and to use infinity (or a sequence of infinities) only when there is no such element. In the example, the last leaf then becomes (22,23), its parent (21,24), and all the other nodes on the right have their values increased by 1. This way, infinity should always be at the rightmost position in the tree.

EDIT: I wrote that the leaf becomes (21,23), my fault, it's (22,23).

ex17

I don't get the part: "Instead, if we use an MinHeap
(priority queue), we pay O(log k) to insert an element in the Heap and O(1) to retrieve the min.
To do so we should modifed a bit an implementation detail, since each time we insert the head
(the currently min element of the block) of each block B. To do so, we insert in the min heap a
pair < key;#block >, where key is the value which the heap keep sorted, and #block keep track
which to the position of the element in the block. To keep updated the latter, we set it to B when
we upload a new block, and we decrease it each time when we insert an element of the block to
heap. This allowed us also to know when we need a new block from the run."

It is not clear to me what the role of #block is. It seems to me that #block actually counts the number of elements that are still in the block. Since we know the block size B, can't we just keep a counter associated with each block and check whether counter == B to know when it's time to fetch a new block? It seems like overkill to me to include this information in the heap itself, since in the end we just need the sorted values; we don't care about their starting blocks.
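To make the question concrete, here is a minimal sketch of a k-way merge with a min-heap, under my own assumptions and not taken from the solution PDF. The names `k_way_merge`, `buffers`, `pos` and `next_block` are hypothetical, and the runs are plain Python lists standing in for data on disk. In this formulation the heap entry carries the index of the run the key came from, because that is what tells us which run must supply the next element; the position inside the current block is kept in a separate per-run counter, much as suggested above.

```python
import heapq

def k_way_merge(runs, B):
    """Merge sorted runs, reading each run one block of B items at a time."""
    # buffers[r]: block of run r currently in memory (simulated block read).
    buffers = [run[:B] for run in runs]
    next_block = [B] * len(runs)   # offset in run r of the next block to load
    pos = [0] * len(runs)          # next unread position inside buffers[r]

    # The heap holds one pair (key, run index) per non-empty run.
    heap = [(buf[0], r) for r, buf in enumerate(buffers) if buf]
    heapq.heapify(heap)

    out = []
    while heap:
        key, r = heapq.heappop(heap)   # O(log k) per extracted element
        out.append(key)
        pos[r] += 1
        if pos[r] == len(buffers[r]):
            # Block exhausted: load the next block of the same run.
            buffers[r] = runs[r][next_block[r]:next_block[r] + B]
            next_block[r] += B
            pos[r] = 0
        if buffers[r]:
            heapq.heappush(heap, (buffers[r][pos[r]], r))
    return out
```

With this layout the heap never needs the in-block position, only the run index, so the two pieces of bookkeeping stay separate.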

VeB tree EX15

There are 3 possible solutions; it would be great if we could vote for the one that is easiest to understand. We will still keep all of them in the folder, but it would be better to propose a more compact one as PROBLEM15_SOLUTION.

Thanks again

Andrea

EX18 new text

I proposed a solution for EX18 with the updated text; take a look, but it should be good.

Ex6 new solution

I added a new solution based on the same idea as the first one but with a different proof (based on the alternative solution that was proposed but never finished, which is in the directory EX6).

I probably made it too formal, but the idea behind it is very simple and, in my opinion, simpler than the first one.
Even if you prefer a more informal approach, you can take something from each of the two solutions.

Let me know if it is correct.

ex14 IO complexity analysis

In the last paragraphs:
"The first iteration is somehow non-standard": I don't get this part. Don't we do exactly the same things as in the other iterations (i.e. scanning A and then building the corresponding nodes)?
It also says that "the total cost is two times the sum of the length of A divided by the size of the block, for each level", thus 2|A| / B, but in the analysis it is reported as 2|A| / k. I know this doesn't change the overall complexity, since B and k differ only by a constant, but it's better to address this little inconsistency.
Anyway, very good job; the solution is very elegant imho.

little inconsistency in ex11

In the first paragraph m = 1/f'. Some lines below, m = ceil(1/f) (NB: the first time it is f prime and the second time only f). Which is the correct one?

Ex10: Some problem that I will fix

I think there are some problems with the solution of ex10.

In particular:

In the construction of T it's not clear that we use n_j^2 and not n_j (it becomes clear only in the last part of the solution);
In the calculation of the space occupied we can't use O(n), since o(n) is required;
There are some errors in the calculations (log(2n^2) is not < 2log(n); log(2n^2) = log(2) + log(n^2) = 1 + 2log(n));
We also need to store a and b for the first-level hash function (cost < log(2n) each);
Since each hash function h_(a,b) comes from a possibly different family, we also need to store p (a, b, p for each bucket);
Since the hash function also needs n_j^2, maybe it is better to explain how to obtain it from what we already have, without storing it explicitly (I know it's trivial, but it probably makes things clearer).

I think I already have the solutions to these problems, so please concentrate on the next exercises; I wrote this only to announce that the correction is arriving today or tomorrow.
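For reference while reviewing, here is a minimal sketch of the classical two-level perfect hashing scheme, which may differ from the space-efficient variant the exercise asks for. It only illustrates two of the points above: each bucket j gets a table of size n_j^2, and each second-level function needs its own stored parameters. The names `random_hash`, `apply_hash`, `build` and `contains`, the single shared prime `P`, and the threshold 4n are my assumptions, not part of the solution; keys are assumed to be distinct non-negative integers smaller than `P`, and the key set non-empty.

```python
import random

P = 2_147_483_647  # assumed prime, larger than every key

def random_hash(m, rng):
    """Draw h(x) = ((a*x + b) mod P) mod m; the triple (a, b, m) is what gets stored."""
    return (rng.randrange(1, P), rng.randrange(P), m)

def apply_hash(h, x):
    a, b, m = h
    return ((a * x + b) % P) % m

def build(keys, seed=0):
    rng = random.Random(seed)
    n = len(keys)
    # First level: retry until the sum of n_j^2 is linear in n.
    while True:
        top = random_hash(n, rng)
        buckets = [[] for _ in range(n)]
        for k in keys:
            buckets[apply_hash(top, k)].append(k)
        if sum(len(b) ** 2 for b in buckets) <= 4 * n:
            break
    # Second level: bucket j gets a table of size n_j^2 (not n_j),
    # and we retry its hash function until it has no collisions.
    tables = []
    for bucket in buckets:
        size = len(bucket) ** 2
        while True:
            h = random_hash(size, rng) if size else None
            slots = [None] * size
            ok = True
            for k in bucket:
                i = apply_hash(h, k)
                if slots[i] is not None:
                    ok = False
                    break
                slots[i] = k
            if ok:
                break
        tables.append((h, slots))
    return top, tables

def contains(structure, x):
    top, tables = structure
    h, slots = tables[apply_hash(top, x)]
    return bool(slots) and slots[apply_hash(h, x)] == x
```

Note that the table size n_j^2 is recoverable here as the length of each bucket's slot array, which is one way of not storing it explicitly.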

DC3 EM

Anyone who can upload a decent solution for this exercise is very welcome.

If you check the solutions in the 2013 PDF, there is a full, well-explained example, but the solution itself is quite vague. However, it seems a good starting point.

Thanks a lot to everyone

EX9

I looked at the proof and I think it mostly checks out. My only issue is when you state E[X_i] < eps/e*||F||_1: when you're at level i you don't have F~ but rather an F~_i with n/2^i "buckets", each of which represents 2^i of the original elements. The above inequality holds, but only because the frequencies of the buckets are the sum of the frequencies of the original elements, which summed over the whole vector give again the 1-norm of F. This should be pointed out, imho.
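A small sketch of the observation, assuming non-negative frequencies and a vector whose length is a power of two (the function name `dyadic_levels` is hypothetical): level i has n/2^i buckets, each the sum of 2^i original entries, so every level has the same 1-norm as the original vector.

```python
def dyadic_levels(freq):
    """Return the list of levels; level 0 is freq, level i+1 merges adjacent pairs."""
    levels = [list(freq)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        # Each bucket at the next level is the sum of two adjacent buckets.
        levels.append([prev[2 * j] + prev[2 * j + 1] for j in range(len(prev) // 2)])
    return levels

levels = dyadic_levels([5, 0, 2, 1, 0, 0, 7, 3])
# With non-negative entries the 1-norm is just the sum, identical at every level.
norms = [sum(level) for level in levels]
```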

Ex04 proposal for modification

I updated the solution of ex4 with a modification in the proof of the first point.
The problems with the previous solution were (in my opinion):

we didn't stress the fact that the random choice is over h and not over k,
since k is not chosen at random, we must state the probability of error for each possible k, and then we must say that the probability is 0 if k is in S,
the probability given is the probability of error for an element which is not in S,
the probability of error is the probability that B_s[h(k)] = 1, that is, the probability that
\exists j\in S : h(k)=h(j)
and not the probability that
\exists j\in S : h(k)=h(j) but j == k
as we said in the previous solution (look at point b, where we use the probability exactly as I describe it here!)
I think collision is not the right term to express that an element j in S has the same hash value as k
(i.e. h(k) = h(j)), since it is a collision only if k is also in S (and in that case we don't care about any other j).

I left the previous version in the tex file; you can restore it, take the new version, or merge the two. Let me know what you think.
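To illustrate the first points, here is a rough sketch in which k is fixed and only h is random. The hash family ((a*x + b) mod P) mod m, the prime `P`, and the names `make_hash`, `build_bitvector` and `reported_present_rate` are my assumptions and are not taken from the solution; keys are assumed to be integers in [0, P).

```python
import random

P = 2_147_483_647  # assumed prime modulus

def make_hash(m, rng):
    """Draw h(x) = ((a*x + b) mod P) mod m at random (hypothetical family)."""
    a = rng.randrange(1, P)
    b = rng.randrange(P)
    return lambda x: ((a * x + b) % P) % m

def build_bitvector(S, m, h):
    """B_S[i] = 1 iff some j in S has h(j) = i."""
    B = [0] * m
    for j in S:
        B[h(j)] = 1
    return B

def reported_present_rate(S, k, m, trials=2000, seed=0):
    """Fraction of random choices of h for which the fixed k has B_S[h(k)] = 1."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        h = make_hash(m, rng)          # the randomness is over h only
        hits += build_bitvector(S, m, h)[h(k)]
    return hits / trials
```

For k in S the rate is always 1, so there is never an error on members. For k not in S, the rate estimates the probability that some j in S has h(j) = h(k), which for a universal family is at most |S|/m by the union bound.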

Ex3: Only one of the two can be correct (probably)

I just modified the solution of the third exercise; the second solution I propose was meant as a correction of the first one.
I left the first solution because the results I arrive at are very different from those of the previous solution, and probably only one of the two can be correct.
Please have a look and check whether there are any mistakes.

What do you mean by "those two states are in the same place in two different moment."? Do you mean that Ai and Aj are actually the SAME state, but at two different moments in time?
I think that "with loss of generality" should be "without loss of generality", right?

Exercise 7

"if the element is different, we decrease the counter by 1, but if the counter become zero then we substitute the counter item with the new element, and we set the counter to zero. "

There is a small problem.

"if the counter become zero ... and we set the counter to zero" is redundant;
if you decrease the counter when it is already 0, the counter may become negative.

Take as an example: a = [3,2,3,3,3,4,4,4];

the result will be 2 with c = -6

Using this code:

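// JavaScript version of the flawed procedure: c is decremented
// even when it is already 0, so it can go negative.
const a = [3, 2, 3, 3, 3, 4, 4, 4];
let v = a[0];   // current candidate
let c = 1;      // counter
for (let i = 1; i < a.length; i++) {
    if (v !== a[i]) {
        c--;
        if (c === 0)
            v = a[i];
    } else {
        c++;
    }
}
// at the end v is 2 and c is -6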

The solution is to decrement c if and only if it is greater than zero.

So I propose to change the solution text to:

"if the element is different:

we decrease the counter by 1 if the counter is greater than zero,
if the counter becomes zero then we substitute the counter item with the new element"
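For comparison, here is a minimal sketch of the standard Boyer-Moore majority vote, which is a slightly different formulation from the text proposed above (the function name `majority_candidate` is mine). The counter never goes negative, and when a new candidate is adopted the counter is set to 1 rather than left at 0, so the adopted element is itself counted.

```python
def majority_candidate(items):
    """Return (candidate, counter) after one pass of the majority vote."""
    candidate, count = None, 0
    for x in items:
        if count == 0:
            # No live candidate: adopt x and count this occurrence.
            candidate, count = x, 1
        elif x == candidate:
            count += 1
        else:
            # A different element cancels one occurrence of the candidate.
            count -= 1
    return candidate, count
```

On the example array [3,2,3,3,3,4,4,4] this ends with candidate 3 and counter 0. The candidate is guaranteed to be the majority element only when some element occurs more than n/2 times, which is not the case for that array. It may be worth checking the proposed wording against an input such as [3,1,3,3,2], where 3 is a strict majority: as far as I can tell, leaving the counter at zero after a substitution can end on a different candidate there.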