Documentation
Overview
This C++ program implements the PageRank algorithm using an Adjacency List data structure. PageRank is an algorithm used by search engines to rank webpages in
their search results. It measures the importance of webpages based on the structure of the web graph.
Program Structure
AdjacencyList Class
The AdjacencyList class represents the web graph using an adjacency list. It has the following private member variables:
Member Functions
This function adds an edge between two webpages in the graph. It takes two parameters: from (the source webpage) and to (the destination webpage), and creates a directed edge from the source to the destination.
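As a minimal sketch of the edge insertion described above: the type name GraphSketch and the member name graph are hypothetical, and a std::map of URL to outgoing links is only one possible layout for the adjacency list. Registering the destination as a node is also an assumption, made so that pages without outgoing links are not lost.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical minimal graph type; the real class may store its data differently.
struct GraphSketch {
    // Maps each source URL to the list of URLs it links to.
    std::map<std::string, std::vector<std::string>> graph;

    // Adds a directed edge from -> to.
    void addEdge(const std::string& from, const std::string& to) {
        graph[from].push_back(to);
        // Make sure the destination exists as a node even if it has no
        // outgoing links of its own (emplace does nothing if it is present).
        graph.emplace(to, std::vector<std::string>{});
    }
};
```

With this layout, calling addEdge("google.com", "gmail.com") leaves two nodes in the map, and only google.com has an outgoing link.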
void mapUrlsToIds()
This function maps URLs to unique IDs and initializes the PageRank vector. It iterates through the graph's nodes and assigns a unique ID to each webpage. These mappings are stored in idMapping, allowing efficient access to webpage IDs. The function also initializes the pageRank vector uniformly, so that each of the |V| webpages starts with a PageRank of 1/|V|.
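The mapping step might look roughly like the following sketch. The free function mapUrlsToIdsSketch and its parameters are hypothetical stand-ins for the class members, and it assumes every page already appears as a key in the graph.

```cpp
#include <map>
#include <string>
#include <vector>

// Assigns consecutive IDs (0, 1, 2, ...) to the URLs in `graph` and fills
// `pageRank` with the uniform starting value 1/|V|.
// Assumption: every page, including pages without outgoing links, is a key.
std::map<std::string, int> mapUrlsToIdsSketch(
    const std::map<std::string, std::vector<std::string>>& graph,
    std::vector<double>& pageRank) {
    std::map<std::string, int> idMapping;
    int nextId = 0;
    for (const auto& entry : graph) {
        idMapping[entry.first] = nextId++;  // std::map iterates in sorted key order
    }
    const double initial =
        idMapping.empty() ? 0.0 : 1.0 / static_cast<double>(idMapping.size());
    pageRank.assign(idMapping.size(), initial);
    return idMapping;
}
```

Because std::map iterates in sorted key order, the IDs in this sketch follow alphabetical URL order; an unordered container would hand them out in an unspecified order.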
This function computes the PageRank of webpages after a given number of power iterations. In each iteration, the function recomputes every webpage's PageRank from the PageRank of the pages that link to it. The powerIterations parameter determines how many iterations are performed. More iterations generally bring the values closer to a stable distribution, but a fixed iteration count does not by itself guarantee convergence.
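The iteration can be sketched as below. Two points are assumptions inferred from the example later in this document, not stated behavior: no damping factor is applied, and the uniform starting vector counts as the first power iteration, so powerIterations = 2 performs a single update. The function name pageRankSketch and its edge-list parameter are hypothetical.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Returns URL -> PageRank after `powerIterations` power iterations.
// Assumptions: no damping factor; iteration 1 is the uniform start vector.
std::map<std::string, double> pageRankSketch(
    const std::vector<std::pair<std::string, std::string>>& edges,
    int powerIterations) {
    // Build the adjacency list; every page becomes a key.
    std::map<std::string, std::vector<std::string>> outLinks;
    for (const auto& e : edges) {
        outLinks[e.first].push_back(e.second);
        outLinks.emplace(e.second, std::vector<std::string>{});
    }

    // Uniform start: each page gets 1/|V|.
    std::map<std::string, double> rank;
    for (const auto& node : outLinks) {
        rank[node.first] = 1.0 / static_cast<double>(outLinks.size());
    }

    // Each update splits a page's rank equally among its outgoing links.
    for (int step = 1; step < powerIterations; ++step) {
        std::map<std::string, double> next;
        for (const auto& node : outLinks) next[node.first] = 0.0;
        for (const auto& node : outLinks) {
            if (node.second.empty()) continue;  // no out-links: rank is dropped in this sketch
            const double share =
                rank[node.first] / static_cast<double>(node.second.size());
            for (const auto& target : node.second) next[target] += share;
        }
        rank = next;
    }
    return rank;
}
```

Called with the seven edges from the Example section and powerIterations = 2, this sketch yields the values listed there (for instance 0.30 for maps.com and 0.10 for google.com).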
void printPageRank()
This function prints the PageRank of all webpages in ascending alphabetical order. It first constructs a vector of pairs, where each pair contains a webpage URL and
its corresponding PageRank. This vector is sorted alphabetically based on the URLs. Then, the function iterates through the sorted vector and prints each webpage's
URL and PageRank, formatted to two decimal places.
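A sketch of the sort-and-format step follows. It returns the text as a string instead of printing it, purely to keep the example easy to check; formatPageRankSketch is a hypothetical name.

```cpp
#include <algorithm>
#include <iomanip>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Sorts (URL, PageRank) pairs alphabetically by URL and renders one
// "url rank" line per page, with the rank fixed to two decimal places.
std::string formatPageRankSketch(
    std::vector<std::pair<std::string, double>> ranks) {
    std::sort(ranks.begin(), ranks.end(),
              [](const std::pair<std::string, double>& a,
                 const std::pair<std::string, double>& b) {
                  return a.first < b.first;
              });
    std::ostringstream out;
    out << std::fixed << std::setprecision(2);
    for (const auto& entry : ranks) {
        out << entry.first << ' ' << entry.second << '\n';
    }
    return out.str();
}
```

Writing the returned string to std::cout gives the same layout as the Output section of the example.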
These member functions collectively allow the AdjacencyList class to manage the graph structure, compute PageRank using the specified algorithm, and display
the final PageRank values in a readable format.
Main Function
The main() function reads the number of edge lines (numLines) and the number of power iterations (powerIterations). It then reads numLines edges between webpages and calculates the PageRank of every webpage using the AdjacencyList class.
Usage
To run the program, input the number of edge lines, the number of power iterations, and the webpage connections when prompted. The program will output the PageRank values for each webpage.
Input Format
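Line 1: Number of edge lines (numLines) and number of power iterations (powerIterations). In the example below, numLines is 7 while the graph has only 5 unique webpages.
Lines 2 to numLines + 1: Pairs of webpage URLs separated by a space. Each pair is a directed edge from the first URL to the second.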
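Reading this format might be sketched as follows. The first integer is treated as the number of edge lines that follow, which is what the example suggests (7 lines, 5 unique pages). ParsedInputSketch and readInputSketch are hypothetical names, and the sketch performs no validation against the constraints.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, handy for trying the function out
#include <string>
#include <utility>
#include <vector>

// Holds what the input provides: the iteration count and the directed edges.
struct ParsedInputSketch {
    int powerIterations = 0;
    std::vector<std::pair<std::string, std::string>> edges;
};

// Reads "numLines powerIterations" followed by numLines "from to" pairs.
ParsedInputSketch readInputSketch(std::istream& in) {
    ParsedInputSketch parsed;
    int numLines = 0;
    in >> numLines >> parsed.powerIterations;
    std::string from, to;
    // Stop early if the stream runs out before numLines pairs were read.
    for (int i = 0; i < numLines && (in >> from >> to); ++i) {
        parsed.edges.emplace_back(from, to);
    }
    return parsed;
}
```

Passing std::cin reads from standard input; a std::istringstream holding the example text works the same way.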
Output Format
The program outputs the PageRank of webpages after the specified power iterations. Each line contains a webpage URL and its corresponding PageRank value, formatted to two decimal places, with webpages listed in ascending alphabetical order.
Constraints
1 <= powerIterations <= 10,000
1 <= numLines <= 10,000
1 <= Number of unique webpages ( |V| ) <= 10,000
Example
Input
7 2
google.com gmail.com
google.com maps.com
facebook.com ufl.edu
ufl.edu google.com
ufl.edu gmail.com
maps.com facebook.com
gmail.com maps.com
Output
facebook.com 0.20
gmail.com 0.20
google.com 0.10
maps.com 0.30
ufl.edu 0.20
Output Explanation
Each line represents a webpage URL followed by its PageRank value after 2 power iterations. The PageRank values are calculated based on the input connections
between webpages and the specified power iterations. Webpages are listed in ascending alphabetical order.
Connections: gmail.com links to maps.com. It receives backlinks from google.com and ufl.edu.
After 2 power iterations, its PageRank is computed as 0.20.
Connections: maps.com links to facebook.com. It receives backlinks from google.com and gmail.com.
After 2 power iterations, its PageRank is computed as 0.30.
These PageRank values are the result of applying the PageRank algorithm for 2 iterations on the given web graph. Each webpage's PageRank represents its
importance in the network, calculated based on the links between webpages.
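The gmail.com and maps.com values can be reproduced by hand under two assumptions that the text does not state but that match the example output: no damping factor is applied, and the uniform starting vector counts as the first power iteration, so 2 power iterations amount to a single update. With five unique webpages, every page starts at PR_0 = 1/5 = 0.20 and passes its rank in equal shares along its outgoing links (google.com and ufl.edu each have two outgoing links, gmail.com has one):

```latex
\begin{align*}
PR(\text{gmail.com}) &= \frac{PR_0(\text{google.com})}{2} + \frac{PR_0(\text{ufl.edu})}{2}
                      = \frac{0.20}{2} + \frac{0.20}{2} = 0.20 \\
PR(\text{maps.com})  &= \frac{PR_0(\text{google.com})}{2} + \frac{PR_0(\text{gmail.com})}{1}
                      = \frac{0.20}{2} + \frac{0.20}{1} = 0.30
\end{align*}
```

The same rule gives 0.10 for google.com, whose only backlink is one of ufl.edu's two outgoing links.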