Starting from:

$25

CS69011-Assignment 1 Solved

Consider an online meeting app or friendship recommender system which recommends meetings to people based on their geographical convenience, e.g. they walk / bike / travel on common routes within the city. They have a road network in the form of a graph 𝐺(𝑉, 𝐸), where 𝑉 = {1, … , 𝑁} is the set of vertices denoting endpoints of road segments, and 𝐸 ⊆ 𝑉 × π‘‰ is the set edges denoting the road segments. The company also maintains a set of paths 𝐿 = {𝑝1, … , 𝑝2}, one corresponding to each of the 𝑀 users. Each path is a sequence of adjacent vertices, and of arbitrary length. A path denotes the actual road segments which the user travels by: 𝑝4 = (𝑣1, … , 𝑣67). Note that vertices can be repeated in the paths.

 

For recommendation, the company tries to find the maximal common subpath, between the paths 𝑝4 and 𝑝8 taken by two users 𝑖 and 𝑗, and recommends users sharing a higher length of path to each other for meeting. Given a path 𝑝 = (𝑣1, … , 𝑣6), a subpath is another path π‘ž = (𝑀1, … , 𝑀=), π‘š ≤ 𝑛, such that 𝑣4 = 𝑀1, … , 𝑣4A=B1 = 𝑀= for some 𝑖. A maximal common subpath π‘Ÿ between two paths 𝑝 and π‘ž, satisfies:

•        π‘Ÿ is a subpath of both 𝑝 and π‘ž.

•        There is no other common subpath of 𝑝 and π‘ž with length strictly larger than π‘Ÿ.

 

Task 1: 

 

The first task is to write a program which takes as input two such paths in the following format, read from an input file:

<User No. 1: <vertex id 1, <vertex id 2, … , <vertex id m 

<User No. 2: <vertex id 1, <vertex id 2, … , <vertex id n 

Where, m and n are the lengths of each path.

The program outputs a maximal common subpath.

 

 

Algorithm 1:                                                                                                                                                                  

 

This problem is similar to longest common substring problem

(see: https://en.wikipedia.org/wiki/Longest_common_substring_problem)

 

One way solve it is using a dynamic programming based algorithm, which solves the problem by calculating the length of longest common subpath between prefixes of the two paths. Hence, 𝐿(𝑖, 𝑗) is the length of longest common subpath between the prefixes of paths 𝑒1, … , 𝑒4 and 𝑣1, … , 𝑣8.   

One can see that if 𝑒4 = 𝑣8, then 𝐿(𝑖, 𝑗) = 𝐿(𝑖 − 1, 𝑗 − 1), else 𝐿(𝑖, 𝑗) = 0. For more details, see section 15.4 in the book by Cormen et al.

 

 

Algorithm 2:                                                                                                                                                                     

 

The second way of solving this problem is using a rolling string hashing function, similar to the one used in Rabin and Karp algorithm. The overall scheme is:

1.     Pick a length 𝑙, and hash all 𝑙-length subpaths from both paths (𝑒 and 𝑣) into a hashtable and find the pairs of subpaths which collide. This can be done in Θ(π‘š + 𝑛) time. Check that these subpaths are actually same and return a match.

2.     Repeat the above step for all lengths 𝑙, and return a subpath of highest length.

 

Hashing algorithm: Rolling hash for a string π‘₯1, … , π‘₯6, can be computed as: ∑64M1 π‘₯4𝑑4 π‘šπ‘œπ‘‘     𝑝, where p is a prime and d is the alphabet size (in this case the total number of vertices).

 

 

Task 2:                                                                                                                                                             

 In the second task, you are given a set of such paths. You have to find a maximal common subpath, which is common to all the input paths. The input format is:

<Number of Paths 

<User No. 1: <vertex id 1, <vertex id 2, … , <vertex id n1 <User No. 2: <vertex id 1, <vertex id 2, … , <vertex id n2 



<User No. H: <vertex id 1, <vertex id 2, … , <vertex id nH 

Where, the number of paths is 𝐻, and each path is of length n1, n2, …, nH.

 

 

Algorithm: 

 

Use the hashing-based approach described in Algorithm 2 of Task 1. However, in Step 1 return a match only if sub-paths from 𝐻 paths have been matched in a hash bucket.

 

More products