From: Tamas Nepusz
Subject: Re: [igraph] personalized pagerank computation issue
Date: Tue, 28 Jan 2020 15:34:46 +0100
Thank you very much for tracking the code! Unfortunately, that doesn't work either. I am also fairly certain that allowing the node to stay at the same spot would give that node an unwarranted boost in PageRank, so it is probably undesirable.
I do have an interesting result though: when I use "arpack" instead of the default "prpack", I get exactly the same results as my custom-written function. In other words, when there are nodes that have no outgoing edges, "prpack" and "arpack" do the computation differently.
My problem seems to be solved (as long as there is no reason why "arpack" is wrong) but this difference between the two algorithms might be of interest to you.
Thank you very much.
Omer
From: Tamas Nepusz <address@hidden>
Sent: Monday, January 27, 2020 4:15 PM
To: Yalcin, Omer Faruk <address@hidden>
Cc: Help for igraph users <address@hidden>
Subject: Re: [igraph] personalized pagerank computation issue
That being said, after your question, I set the probability of navigating to other nodes from a node that has no outbound links to the personalization vector. That doesn't reproduce the igraph result either.

There's also a third option: if there are no outbound links, stay at the same node with probability equal to 1-damping, _or_ navigate to a randomly picked node according to the personalization vector with probability equal to damping.

Sorry for not being too precise here; the thing is that igraph is using an external library (PRPACK) to calculate personalized PageRank scores, and I only managed to track the code to a point where I am convinced that we are passing down two vectors to PRPACK: one is a uniform vector, and the other one is the personalization vector submitted by the user. Based on this, I would assume that PRPACK uses the personalization vector when the random walk is reset, and the uniform vector for a random teleport (after all, why would PRPACK need two vectors if it used the personalization vector for both cases?). I did not manage to track it down further because PRPACK contains at least six different solvers, optimized for different use-cases, and I could not figure out which one it would use in your particular case.

But I'm pretty sure that the discrepancy between your results and ours is due to some corner case in the handling of sink nodes.
T.
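
The sink-handling options discussed in this thread can be compared on a toy graph. What follows is only a minimal pure-Python sketch under stated assumptions, not the code igraph, PRPACK or ARPACK actually runs: the function name, the three policy names ("reset", "uniform", "stay") and the three-node graph are all made up for illustration.

```python
def personalized_pagerank(out_edges, reset, damping=0.85,
                          sink_policy="reset", iters=100):
    """Power iteration for personalized PageRank (illustrative sketch only).

    out_edges[i] lists the targets of node i; reset is the personalization
    vector and must sum to 1. sink_policy (hypothetical names) controls what
    a node without outgoing edges does:
      "reset"   - jump according to the personalization vector
      "uniform" - jump to a uniformly random node
      "stay"    - stay put with probability 1 - damping, otherwise jump
                  according to the personalization vector
    """
    if sink_policy not in ("reset", "uniform", "stay"):
        raise ValueError("unknown sink_policy: %r" % (sink_policy,))
    n = len(out_edges)
    uniform = [1.0 / n] * n
    rank = list(reset)
    for _ in range(iters):  # fixed number of iterations, no convergence test
        new = [0.0] * n
        for i, targets in enumerate(out_edges):
            mass = rank[i]
            if targets:
                # Regular node: reset with prob. 1 - damping,
                # otherwise follow a random outgoing edge.
                for j in range(n):
                    new[j] += (1.0 - damping) * mass * reset[j]
                for t in targets:
                    new[t] += damping * mass / len(targets)
            elif sink_policy == "stay":
                # Sink: stay with prob. 1 - damping, else jump per reset vector.
                new[i] += (1.0 - damping) * mass
                for j in range(n):
                    new[j] += damping * mass * reset[j]
            else:
                # Sink: reset as usual; the damped part of the mass jumps
                # either per the reset vector or uniformly.
                jump = reset if sink_policy == "reset" else uniform
                for j in range(n):
                    new[j] += (1.0 - damping) * mass * reset[j]
                    new[j] += damping * mass * jump[j]
        rank = new
    return rank


# Toy graph: 0 -> 1, 0 -> 2, 1 -> 2; node 2 is a sink.
graph = [[1, 2], [2], []]
reset = [1.0, 0.0, 0.0]  # all personalization weight on node 0

results = {policy: personalized_pagerank(graph, reset, sink_policy=policy)
           for policy in ("reset", "uniform", "stay")}
for policy, scores in results.items():
    print(policy, [round(s, 4) for s in scores])
```

On this toy graph the score of node 0 comes out at roughly 0.45, 0.28 and 0.43 for the three policies, so the choice of sink handling alone is enough to produce results that do not match each other. Which of these (if any) corresponds to what "prpack" or "arpack" does is not established in this thread.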