Title: | Statistical Rank Aggregation: Inference, Evaluation, and Visualization |
---|---|
Description: | A set of methods to implement Generalized Method of Moments and Maximum Likelihood methods for Random Utility Models. These methods provide inference on rank comparison data. They accept full, partial, and pairwise rankings, and provide methods to break down full or partial rankings into their pairwise components. Please see "Generalized Method-of-Moments for Rank Aggregation" (NIPS 2013) for a description of some of our methods. |
Authors: | Hossein Azari Soufiani, William Chen |
Maintainer: | Hossein Azari Soufiani <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.0.6 |
Built: | 2025-01-19 03:30:00 UTC |
Source: | https://github.com/cran/StatRank |
Given full or partial orderings, this function generates pairwise comparisons. Options:
1. full - All available pairwise comparisons. This is used for partial rank data where the ranked objects are a random subset of all objects.
2. adjacent - Only adjacent pairwise breakings.
3. top - Also takes in k; breaks within the top k and also generates pairwise comparisons comparing the top k with the rest of the data.
4. top.partial - This is used for partial rank data where the ranked alternatives are preferred over the non-ranked alternatives.
Breaking(Data, method, k = NULL)
Data |
data in either full or partial ranking format |
method |
- can be full, adjacent, top or top.partial |
k |
This applies to the top method, choose which top k to focus on |
Pairwise breakings, where the three columns are winner, loser and rank distance (latter used for Zemel)
data(Data.Test)
Data.Test.pairs <- Breaking(Data.Test, "full")
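To make the "full" breaking concrete, here is a minimal base-R sketch. It assumes each row of the data is one agent's ordering listed from most to least preferred, and it does not handle partial rankings. `break_full` is a hypothetical helper written for illustration, not the package's implementation.

```r
# Hypothetical helper: break full rankings into all pairwise comparisons.
# Assumption: each row of `rankings` lists alternatives from best to worst.
break_full <- function(rankings) {
  rankings <- as.matrix(rankings)
  m <- ncol(rankings)
  idx <- t(combn(m, 2))  # all position pairs (i, j) with i < j
  do.call(rbind, lapply(seq_len(nrow(rankings)), function(r) {
    cbind(winner = rankings[r, idx[, 1]],
          loser = rankings[r, idx[, 2]],
          rank.distance = idx[, 2] - idx[, 1])
  }))
}

rankings <- matrix(c(1, 2, 3,
                     3, 2, 1), ncol = 3, byrow = TRUE)
pairs <- break_full(rankings)
print(pairs)
```

Each agent with m ranked alternatives contributes choose(m, 2) rows, so the two agents above yield six comparisons.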
As named, this function takes a vector where each element is a mean and returns a list in which each item holds the corresponding mean
convert.vector.to.list(Parameters, name = "Mean")
Parameters |
a vector of parameters |
name |
Name of the parameter |
a list, where each element represents an alternative and has a Mean value
This is a public election dataset collected by Nicolaus Tideman where the voters provided partial orders on candidates. A partial order includes comparisons among a subset of alternatives, and the non-mentioned alternatives in the partial order are considered to be ranked lower than the lowest ranked alternative among mentioned alternatives.
data(Data.Election1)
Nicolaus Tideman
This is a public election dataset collected by Nicolaus Tideman where the voters provided partial orders on candidates. A partial order includes comparisons among a subset of alternatives, and the non-mentioned alternatives in the partial order are considered to be ranked lower than the lowest ranked alternative among mentioned alternatives.
data(Data.Election6)
Nicolaus Tideman
This is a public election dataset collected by Nicolaus Tideman where the voters provided partial orders on candidates. A partial order includes comparisons among a subset of alternatives, and the non-mentioned alternatives in the partial order are considered to be ranked lower than the lowest ranked alternative among mentioned alternatives.
data(Data.Election9)
Nicolaus Tideman
Nascar data that only keeps racers that are represented in between 20 - 30 of total races
data(Data.NascarTrimmed)
This is a randomly generated tiny ranks file that we can use to test our methods
data(Data.Test)
This function supports RUMs 1) Normal with fixed variance (fixed at 1)
Estimation.GRUM.MLE(Data, X, Z, iter, dist, din, Bin)
Data |
data in either partial or full rankings |
X |
user characteristics |
Z |
alternative characteristics |
iter |
number of iterations to run algorithm |
dist |
choice of distribution |
din |
initialization of delta vector |
Bin |
initialization of B matrix |
results from the inference
#data(Data.Test)
#Data.X= matrix( runif(15),5,3)
#Data.Z= matrix(runif(10),2,5)
#Estimation.GRUM.MLE(Data.Test, Data.X, Data.Z, iter = 3, dist = "norm",
#din=runif(5), Bin=matrix(runif(6),3,2))
GMM Method for Estimating Random Utility Model with Normal distributions
Estimation.Normal.GMM(Data.pairs, m, iter = 1000, Var = FALSE, prior = 0)
Data.pairs |
data broken up into pairs |
m |
number of alternatives |
iter |
number of iterations to run |
Var |
indicator for different variances (default is FALSE) |
prior |
magnitude of fake observations input into the model |
Estimated mean parameters for distribution of underlying normal (variance is fixed at 1)
data(Data.Test)
Data.Test.pairs <- Breaking(Data.Test, "full")
Estimation.Normal.GMM(Data.Test.pairs, 5)
GMM Method for estimating Plackett-Luce model parameters
Estimation.PL.GMM(Data.pairs, m, prior = 0, weighted = FALSE)
Data.pairs |
data broken up into pairs |
m |
number of alternatives |
prior |
magnitude of fake observations input into the model |
weighted |
if this is true, then the third column of Data.pairs is used as a weight for that data point |
Estimated mean parameters for distribution of underlying exponential
data(Data.Test)
Data.Test.pairs <- Breaking(Data.Test, "full")
Estimation.PL.GMM(Data.Test.pairs, 5)
Performs parameter estimation for the Plackett-Luce model using a Minorize-Maximize (MM) algorithm
Estimation.PL.MLE(Data, iter = 10)
Data |
data in either partial or full rankings (Partial rank case works for settings like car racing) |
iter |
number of MM iterations to run |
list of estimated means (Gamma) and the log likelihoods
data(Data.Test)
Estimation.PL.MLE(Data.Test)
This function supports RUMs 1) Normal 2) Normal with fixed variance (fixed at 1) 3) Exponential (top k setting like Election)
Estimation.RUM.MLE(Data, iter = 10, dist, race = FALSE)
Data |
data in either partial or full rankings |
iter |
number of EM iterations to run |
dist |
underlying distribution. Can be "norm", "norm.fixedvariance", "exp" |
race |
indicator that each agent chose a random subset of alternatives to compare |
parameters of the latent RUM distributions
Data.Tiny <- matrix(c(1, 2, 3, 3, 2, 1, 1, 2, 3), ncol = 3, byrow = TRUE)
Estimation.RUM.MLE(Data.Tiny, iter = 2, dist="norm")
This function supports RUMs 1) Normal 2) Normal with fixed variance (fixed at 1) 3) Exponential
Estimation.RUM.MultiType.MLE(Data, K = 2, iter = 10, dist, ratio = 0.2, race = FALSE)
Data |
data in either partial or full rankings |
K |
number of components in mixture distribution |
iter |
number of EM iterations to run |
dist |
underlying distribution. Can be "norm", "norm.fixedvariance", "exp" |
ratio |
parameter in the algorithm that controls the difference of the starting points, the bigger the ratio the more the distance |
race |
TRUE if data is sub partial, FALSE (default) if not |
results from the inference
Data.Tiny <- matrix(c(1, 2, 3, 3, 2, 1, 1, 2, 3), ncol = 3, byrow = TRUE)
Estimation.RUM.MultiType.MLE(Data.Tiny, K=2, iter = 3, dist= "norm.fixedvariance")
Given rank data (full, top partial, or sub partial), this function returns an inference object that fits nonparametric latent utilities on the rank data.
Estimation.RUM.Nonparametric(Data, m, iter = 10, bw = 0.025, utilities.per.agent = 20, race = FALSE)
Data |
full, top partial, or sub partial rank data |
m |
number of alternatives |
iter |
number of EM iterations to run |
bw |
bandwidth, or smoothing parameter for KDE |
utilities.per.agent |
Number of utility vector samples that we get per agent. More generally gives a more accurate estimate |
race |
TRUE if data is sub partial, FALSE (default) if not |
data(Data.Test)
Estimation.RUM.Nonparametric(Data.Test, m = 5, iter = 3)
This function takes in data broken into pairs, and estimates the parameters of the Zemel model via Gradient Descent
Estimation.Zemel.MLE(Data.pairs, m, threshold = 1e-04, learning.rate = 1/30000)
Data.pairs |
data broken up into pairwise comparisons |
m |
how many alternatives |
threshold |
tuning parameter for gradient descent |
learning.rate |
tuning parameter for gradient descent |
a set of scores for the alternatives, normalized such that the sum of the log scores is 0
scores <- Generate.Zemel.Parameters(10)$Score
pairs <- Generate.Zemel.Ranks.Pairs(scores, 10, 10)
Estimation.Zemel.MLE(pairs, 10, threshold = .1)
Calculates the Average Precision
Evaluation.AveragePrecision(EstimatedRank, RelevanceLevel)
EstimatedRank |
estimated ranking |
RelevanceLevel |
score for the document |
The AP for this estimation and relevance level
EstimatedRank <- scramble(1:10)
RelevanceLevel <- runif(10)
Evaluation.AveragePrecision(EstimatedRank, RelevanceLevel)
Calculates the Kendall Tau correlation between two ranks
Evaluation.KendallTau(rank1, rank2)
rank1 |
two rankings. Order does not matter |
rank2 |
two rankings. Order does not matter |
The Kendall Tau correlation
rank1 <- scramble(1:10)
rank2 <- scramble(1:10)
Evaluation.KendallTau(rank1, rank2)
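For reference, the Kendall tau statistic can be sketched in base R by counting concordant and discordant pairs. This sketch assumes the inputs are rank vectors without ties (entry i is the rank of item i); whether the package expects rank vectors or orderings is not stated here, and `kendall_tau` is a hypothetical helper.

```r
# Hypothetical helper: Kendall tau from concordant and discordant pairs.
# Assumption: inputs are rank vectors of equal length with no ties.
kendall_tau <- function(rank1, rank2) {
  n <- length(rank1)
  idx <- combn(n, 2)  # every pair of items
  s1 <- sign(rank1[idx[1, ]] - rank1[idx[2, ]])
  s2 <- sign(rank2[idx[1, ]] - rank2[idx[2, ]])
  # +1 for each concordant pair, -1 for each discordant pair
  sum(s1 * s2) / choose(n, 2)
}

set.seed(1)
rank1 <- sample(1:10)
rank2 <- sample(1:10)
tau <- kendall_tau(rank1, rank2)
print(tau)
```

Without ties, this agrees with base R's `cor(rank1, rank2, method = "kendall")`, and swapping the two arguments leaves the value unchanged.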
Calculates KL divergence between empirical pairwise preferences and modeled pairwise preferences
Evaluation.KL(Data.pairs, m, Estimate, pairwise.prob = NA, prior = 0, nonparametric = FALSE, ...)
Data.pairs |
data broken up into pairs using Breaking function |
m |
number of alternatives |
Estimate |
estimation object from an Estimate function |
pairwise.prob |
Function that, given two alternatives from the Parameters argument, returns a model probability that one is larger than the other |
prior |
prior weight to put in pairwise frequency matrix |
nonparametric |
indicator that model is nonparametric (default FALSE) |
... |
additional arguments passed to generateC.model |
the KL divergence between modeled and empirical pairwise preferences, thinking of the probabilities as a probability distribution over the (n choose 2) pairs
data(Data.Test)
Data.Test.pairs <- Breaking(Data.Test, "full")
m <- 5
Estimate <- Estimation.PL.GMM(Data.Test.pairs, m)
Evaluation.KL(Data.Test.pairs, m, Estimate, PL.Pairwise.Prob)
Calculates the location of the True winner in the estimated ranking
Evaluation.LocationofWinner(EstimatedRank, TrueRank)
EstimatedRank |
estimated ranking |
TrueRank |
true ranking |
The location of the true best in the estimated rank
rank1 <- scramble(1:10)
rank2 <- scramble(1:10)
Evaluation.LocationofWinner(rank1, rank2)
Calculates MSE between empirical pairwise preferences and modeled pairwise preferences
Evaluation.MSE(Data.pairs, m, Estimate, pairwise.prob = NA, prior = 0, nonparametric = FALSE, ...)
Data.pairs |
data broken up into pairs using Breaking function |
m |
number of alternatives |
Estimate |
estimation object from an Estimate function |
pairwise.prob |
Function that, given two alternatives from the Parameters argument, returns a model probability that one is larger than the other |
prior |
prior weight to put in pairwise frequency matrix |
nonparametric |
indicator that model is nonparametric (default FALSE) |
... |
additional parameters passed into generateC.model |
the MSE between modeled and empirical pairwise preferences, thinking of the probabilities as a probability distribution over the (n choose 2) pairs
data(Data.Test)
Data.Test.pairs <- Breaking(Data.Test, "full")
m <- 5
Estimate <- Estimation.PL.GMM(Data.Test.pairs, m)
Evaluation.MSE(Data.Test.pairs, m, Estimate, PL.Pairwise.Prob)
Calculates the Normalized Discounted Cumulative Gain
Evaluation.NDCG(EstimatedRank, RelevanceLevel)
EstimatedRank |
estimated ranking |
RelevanceLevel |
score for the document |
The NDCG for this estimation and relevance level
EstimatedRank <- scramble(1:10)
RelevanceLevel <- runif(10)
Evaluation.NDCG(EstimatedRank, RelevanceLevel)
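NDCG has several variants, so the following base-R sketch shows only one common form, with gain equal to the relevance and a log2 position discount. It assumes the estimated ranking is an ordering of item indices (first entry is the top item) and that `relevance[i]` is the relevance of item i. `ndcg` is a hypothetical helper and may not match the package's exact formula.

```r
# Hypothetical helper: one common NDCG variant (linear gain, log2 discount).
# Assumptions: `estimated_order` lists item indices from top to bottom and
# has the same length as `relevance`.
ndcg <- function(estimated_order, relevance) {
  discount <- 1 / log2(seq_along(estimated_order) + 1)
  dcg <- sum(relevance[estimated_order] * discount)
  ideal <- sum(sort(relevance, decreasing = TRUE) * discount)
  dcg / ideal
}

relevance <- c(3, 1, 2)
best <- ndcg(c(1, 3, 2), relevance)   # items sorted by relevance
worst <- ndcg(c(2, 3, 1), relevance)  # items sorted in reverse
print(c(best, worst))
```

An ordering that sorts items by decreasing relevance attains the maximum value of 1; any other ordering of distinct relevances scores lower.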
Calculates the Average Precision at k
Evaluation.Precision.at.k(EstimatedRank, RelevanceLevel, k)
EstimatedRank |
estimated ranking |
RelevanceLevel |
score for the document |
k |
position that we want to run this algorithm for |
The AP at k for this estimation and relevance level
EstimatedRank <- scramble(1:10)
RelevanceLevel <- runif(10)
Evaluation.Precision.at.k(EstimatedRank, RelevanceLevel, 5)
Calculates TVD between empirical pairwise preferences and modeled pairwise preferences
Evaluation.TVD(Data.pairs, m, Estimate, pairwise.prob = NA, prior = 0, nonparametric = FALSE, ...)
Data.pairs |
data broken up into pairs using Breaking function |
m |
number of alternatives |
Estimate |
estimation object from an Estimate function |
pairwise.prob |
Function that, given two alternatives from the Parameters argument, returns a model probability that one is larger than the other |
prior |
prior weight to put in pairwise frequency matrix |
nonparametric |
indicator that model is nonparametric (default FALSE) |
... |
additional arguments passed to generateC.model |
the TVD between modeled and empirical pairwise preferences, thinking of the probabilities as a probability distribution over the (n choose 2) pairs
data(Data.Test)
Data.Test.pairs <- Breaking(Data.Test, "full")
m <- 5
Estimate <- Estimation.PL.GMM(Data.Test.pairs, m)
Evaluation.TVD(Data.Test.pairs, m, Estimate, PL.Pairwise.Prob)
Given alternatives a and b (both items from the inference object) what is the probability that a beats b?
Expo.MultiType.Pairwise.Prob(a, b)
a |
list containing parameters for a |
b |
list containing parameters for b |
probability that a beats b
This is useful for performing inference tasks for NPRUM
Generate.NPRUM.Data(Estimate, n, bw = 0.1)
Estimate |
fitted NPRUM object |
n |
number of agents that we want in our sample |
bw |
smoothing parameter to use when sampling data |
Data.Tiny <- matrix(c(1, 2, 3, 3, 2, 1, 1, 2, 3), ncol = 3, byrow = TRUE)
Estimate <- Estimation.RUM.Nonparametric(Data.Tiny, m = 3, iter = 3)
Generate.NPRUM.Data(Estimate, 3, bw = 0.1)
Given a list of parameters (generated via the Generate RUM Parameters function), generate random utilities from these models and then return their ranks
Generate.RUM.Data(Params, m, n, distribution)
Params |
inference object from an Estimation function, or parameters object from a generate function |
m |
number of alternatives |
n |
number of agents |
distribution |
can be either 'normal' or 'exponential' |
a matrix of observed rankings
Params = Generate.RUM.Parameters(10, "normal")
Generate.RUM.Data(Params,m=10,n=5,"normal")
Params = Generate.RUM.Parameters(10, "exponential")
Generate.RUM.Data(Params,m=10,n=5,"exponential")
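The idea of drawing random utilities and returning their ranks can be sketched in base R for the normal case. This sketch assumes that higher utility is preferred and that each output row is one agent's ordering of alternative indices from best to worst. `simulate_normal_rum` and its argument names are hypothetical, not the package's internals.

```r
# Hypothetical sketch: draw normal utilities and convert them to rankings.
# Assumptions: higher utility is preferred; each row of the result is one
# agent's ordering of alternative indices from best to worst.
simulate_normal_rum <- function(means, sds, n) {
  m <- length(means)
  t(replicate(n, {
    utilities <- rnorm(m, mean = means, sd = sds)  # one draw per alternative
    order(utilities, decreasing = TRUE)
  }))
}

set.seed(42)
ranks <- simulate_normal_rum(means = c(0, 1, 2), sds = c(1, 1, 1), n = 5)
print(ranks)
```

Every row is a permutation of 1..m, and alternatives with larger means tend to appear earlier in each row.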
For exponential models, mean parameters are drawn from a uniform distribution. For normal models, mean and standard deviation parameters are drawn from a standard uniform distribution
Generate.RUM.Parameters(m, distribution)
m |
number of sets of parameters to be drawn |
distribution |
either 'normal' or 'exponential' |
a list of RUM parameters
Generate.RUM.Parameters(10, "normal")
Generate.RUM.Parameters(10, "exponential")
Generates possible scores for a Zemel model
Generate.Zemel.Parameters(m)
m |
Number of alternatives |
a set of scores, all whose logs sum to 1
Generate.Zemel.Parameters(10)
Generates pairwise ranks from a Zemel model given a set of scores
Generate.Zemel.Ranks.Pairs(scores, m, n)
scores |
a vector of scores |
m |
Number of alternatives |
n |
Number of pairwise alternatives to generate |
simulated pairwise comparison data
scores <- Generate.Zemel.Parameters(10)$Score
Generate.Zemel.Ranks.Pairs(scores, 10, 10)
This function takes in data that has been broken up into pair format. It returns a matrix C, where element C[i, j] is, if normalized is FALSE, exactly how many times alternative i has beaten alternative j, or, if normalized is TRUE, the observed probability that alternative i beats alternative j
generateC(Data.pairs, m, weighted = FALSE, prior = 0, normalized = TRUE)
Data.pairs |
the data broken up into pairs |
m |
the total number of alternatives |
weighted |
whether or not this generateC should use the third column of Data.pairs as the weights |
prior |
the initial "fake data" that you want to include in C. A prior of 1 would mean that you initially "observe" that all alternatives beat all other alternatives exactly once. |
normalized |
if TRUE, then normalizes entries to probabilities |
a Count matrix of how many times alternative i has beat alternative j
data(Data.Test)
Data.Test.pairs <- Breaking(Data.Test, "full")
generateC(Data.Test.pairs, 5)
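The count matrix described above can be sketched in base R as follows. The sketch assumes the first two columns of the pairs data are winner and loser, ignores weights, and uses one plausible normalization (the share of i-versus-j comparisons won by i). `count_wins` is a hypothetical helper, and the package may normalize differently.

```r
# Hypothetical helper: count how often alternative i beats alternative j.
# Assumption: column 1 of `pairs` is the winner, column 2 is the loser.
count_wins <- function(pairs, m, prior = 0) {
  C <- matrix(prior, nrow = m, ncol = m)  # "fake" prior observations
  diag(C) <- 0
  for (r in seq_len(nrow(pairs))) {
    w <- pairs[r, 1]
    l <- pairs[r, 2]
    C[w, l] <- C[w, l] + 1
  }
  C
}

pairs <- matrix(c(1, 2,
                  1, 3,
                  2, 3,
                  1, 2), ncol = 2, byrow = TRUE)
C <- count_wins(pairs, m = 3)

# One plausible normalization (an assumption): fraction of i-vs-j
# comparisons won by i. Pairs never compared would give NaN here.
P <- C / (C + t(C))
diag(P) <- 0
print(C)
print(P)
```

With a positive `prior`, every off-diagonal cell starts at that value, which keeps the normalized matrix free of divisions by zero for pairs that were never compared.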
For parametric models, plug in a pairwise function for get.pairwise.prob. For nonparametric models, set nonparametric = TRUE
generateC.model(Estimate, get.pairwise.prob = NA, nonparametric = FALSE, ...)
Estimate |
inference object with a Parameter element, with a list of parameters for each alternative |
get.pairwise.prob |
(use this if it is a parametric model) function that takes in two lists of parameters and computes the probability that the first is ranked higher than the second |
nonparametric |
set this flag to TRUE if this is a non-parametric model |
... |
additional arguments passed to generateC.model.Nonparametric (bandwidth) |
data(Data.Test)
Data.Test.pairs <- Breaking(Data.Test, "full")
Estimate <- Estimation.Normal.GMM(Data.Test.pairs, 5)
generateC.model(Estimate, Normal.Pairwise.Prob)
Generates a matrix where entry i, j is the estimated probability that alternative i beats alternative j
generateC.model.Nonparametric(Estimate, bw = 0.1)
Estimate |
fitted NPRUM object |
bw |
bandwidth used for generating the pairwise probabilities |
data(Data.Test)
Estimate <- Estimation.RUM.Nonparametric(Data.Test, m = 5, iter = 3)
generateC.model.Nonparametric(Estimate)
Calculates KL Divergence between non-diagonal entries of two matrices
KL(A, B)
A |
first matrix, this is the "true" distribution |
B |
second matrix, this is the "estimated" distribution |
KL divergence
KL(matrix(runif(25), nrow=5), matrix(runif(25), nrow=5))
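As an illustration of a KL divergence restricted to off-diagonal entries, here is a minimal base-R sketch. It assumes both matrices are rescaled so that their off-diagonal entries sum to 1 and that all of those entries are strictly positive. `kl_offdiag` is a hypothetical helper, and the package's own normalization may differ.

```r
# Hypothetical helper: KL divergence over off-diagonal entries.
# Assumptions: off-diagonal entries are strictly positive and are rescaled
# to sum to 1 in each matrix before comparison.
kl_offdiag <- function(A, B) {
  off <- row(A) != col(A)        # mask that drops the diagonal
  p <- A[off] / sum(A[off])      # "true" distribution
  q <- B[off] / sum(B[off])      # "estimated" distribution
  sum(p * log(p / q))
}

set.seed(1)
A <- matrix(runif(25), nrow = 5)
B <- matrix(runif(25), nrow = 5)
print(kl_offdiag(A, B))
print(kl_offdiag(A, A))
```

The divergence is zero when the two matrices coincide and is not symmetric in its arguments, which is why the first matrix is treated as the "true" distribution.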
Computes likelihood in the case that we assume no correlation structure
Likelihood.Nonparametric(Data, Estimate, race = FALSE)
Data |
full, top partial, or subpartial data |
Estimate |
fitted NPRUM object |
race |
indicator that the data is from subpartial data |
data(Data.Test)
Estimate <- Estimation.RUM.Nonparametric(Data.Test, m = 5, iter = 3)
Likelihood.Nonparametric(Data.Test, Estimate)
A faster Likelihood for Plackett-Luce Model
Likelihood.PL(Data, parameter)
Data |
ranking data |
parameter |
Mean of Exponential Distribution |
log likelihood
data(Data.Test)
parameter = Generate.RUM.Parameters(5, "exponential")
Likelihood.PL(Data.Test, parameter)
Likelihood for general Random Utility Models
Likelihood.RUM(Data, parameter, dist = "exp", range = NA, res = NA, race = FALSE)
Data |
ranking data |
parameter |
Mean of Exponential Distribution |
dist |
exp or norm |
range |
range |
res |
res |
race |
TRUE if data is sub partial, FALSE (default) if not |
log likelihood
data(Data.Test)
parameter = Generate.RUM.Parameters(5, "normal")
Likelihood.RUM(Data.Test,parameter, "norm")
Likelihood for Multitype Random Utility Models
Likelihood.RUM.Multitype(Data, Estimate, dist, race = FALSE)
Data |
n by m table of rankings |
Estimate |
Inference object from Estimation function |
dist |
Distribution of noise (exp or norm) |
race |
TRUE if data is sub partial, FALSE (default) if not |
log likelihood
Data.Tiny <- matrix(c(1, 2, 3, 3, 2, 1, 1, 2, 3), ncol = 3, byrow = TRUE)
Estimate <- Estimation.RUM.MultiType.MLE(Data.Tiny, K=2, iter = 1, dist= "norm")
Likelihood.RUM.Multitype(Data.Tiny, Estimate, dist = "norm")
Calculates the log-likelihood in the pairwise Zemel model
Likelihood.Zemel(Data.pairs, Estimate)
Data.pairs |
data broken up into pairwise comparisons |
Estimate |
Inference object from Estimate function |
a log-likelihood of the data under the Zemel model
Estimate <- Generate.Zemel.Parameters(10)
pairs <- Generate.Zemel.Ranks.Pairs(Estimate$Score, 10, 10)
Likelihood.Zemel(pairs, Estimate)
Calculates MSE between non-diagonal entries of two matrices if the diagonal elements are 0s
MSE(A, B)
A |
first matrix |
B |
second matrix |
MSE divergence
MSE(matrix(runif(25), nrow=5), matrix(runif(25), nrow=5))
Given alternatives a and b (both items from the inference object) what is the probability that a beats b?
Normal.MultiType.Pairwise.Prob(a, b)
a |
list containing parameters for a |
b |
list containing parameters for b |
probability that a beats b
Given alternatives a and b (both items from the inference object) what is the probability that a beats b?
Normal.Pairwise.Prob(a, b)
a |
list containing parameters for a |
b |
list containing parameters for b |
probability that a beats b
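For independent normal utilities, the probability that a beats b has a closed form, since the difference of the two utilities is itself normal. The sketch below assumes each alternative is a list with a `Mean` field and an `SD` field (the `SD` name is an illustrative assumption) and that higher utility wins. `normal_pairwise_prob` is a hypothetical helper, not the package function.

```r
# Hypothetical helper: P(a beats b) for independent normal utilities.
# Assumptions: each alternative is a list with fields Mean and SD (SD is an
# illustrative field name); the alternative with higher utility wins.
normal_pairwise_prob <- function(a, b) {
  # U_a - U_b ~ Normal(Mean_a - Mean_b, SD_a^2 + SD_b^2)
  pnorm((a$Mean - b$Mean) / sqrt(a$SD^2 + b$SD^2))
}

a <- list(Mean = 1, SD = 1)
b <- list(Mean = 0, SD = 1)
p_ab <- normal_pairwise_prob(a, b)
p_ba <- normal_pairwise_prob(b, a)
print(c(p_ab, p_ba))
```

The two directions sum to 1, and two alternatives with equal means are tied at 0.5 regardless of their standard deviations.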
Given alternatives a and b (both items from the inference object) what is the probability that a beats b?
PL.Pairwise.Prob(a, b)
a |
list containing parameters for a |
b |
list containing parameters for b |
probability that a beats b
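The example for Visualization.Pairwise.Probabilities in this manual writes the Plackett-Luce pairwise probability as `a$Mean / (a$Mean + b$Mean)`. A small base-R illustration of that formula follows; `pl_pairwise_prob` is a hypothetical stand-in name, and the parameter lists are made-up values.

```r
# Stand-in for the formula shown in the Visualization.Pairwise.Probabilities
# example; the name pl_pairwise_prob is hypothetical.
pl_pairwise_prob <- function(a, b) a$Mean / (a$Mean + b$Mean)

a <- list(Mean = 3)  # made-up parameter values
b <- list(Mean = 1)
p_ab <- pl_pairwise_prob(a, b)  # 3 / (3 + 1) = 0.75
p_ba <- pl_pairwise_prob(b, a)  # 1 / (1 + 3) = 0.25
print(c(p_ab, p_ba))
```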
takes in vector of scores (with the largest score being the one most preferred) and returns back a vector of WINNER, SECOND PLACE, ... LAST PLACE
scores.to.order(scores)
scores |
the scores (e.g. means) of a set of alternatives |
an ordering of the index of the winner, second place, etc.
scores <- Generate.RUM.Parameters(10, "exponential")$Mean
scores.to.order(scores)
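The conversion from scores to an ordering can be illustrated with base R's `order()`. This is only a sketch of the idea with made-up scores; how the package breaks ties is not stated here.

```r
# Illustration with made-up scores: the largest score is the most preferred.
toy_scores <- c(0.2, 0.9, 0.5)
ord <- order(toy_scores, decreasing = TRUE)
print(ord)  # 2 3 1: alternative 2 wins, then 3, then 1
```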
This function takes a vector and returns it in a random order
scramble(x)
x |
a vector |
a vector, now in random order
scramble(1:10)
takes a matrix and returns a data frame with the columns being row, column, entry
turn_matrix_into_table(A, uppertriangle = FALSE)
A |
matrix to be converted |
uppertriangle |
if true, then will only convert the upper right triangle of matrix |
a table with the entries being the row, column, and matrix entry
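A base-R sketch of this matrix-to-table conversion is shown below. It assumes the upper triangle excludes the diagonal, which the description above does not specify. `matrix_to_table` is a hypothetical helper, not the package function.

```r
# Hypothetical helper: long-format table of (row, column, entry).
# Assumption: the upper triangle excludes the diagonal.
matrix_to_table <- function(A, uppertriangle = FALSE) {
  keep <- if (uppertriangle) upper.tri(A) else matrix(TRUE, nrow(A), ncol(A))
  data.frame(row = row(A)[keep], column = col(A)[keep], entry = A[keep])
}

A <- matrix(1:4, nrow = 2)
tab <- matrix_to_table(A)
tab_upper <- matrix_to_table(A, uppertriangle = TRUE)
print(tab)
print(tab_upper)
```

A table in this long format is convenient for plotting pairwise matrices, for example as heatmap tiles.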
Calculates TVD between two matrices
TVD(A, B)
A |
first matrix |
B |
second matrix |
Total variation distance
TVD(matrix(runif(25), nrow=5), matrix(runif(25), nrow=5))
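For orientation, the textbook total variation distance between two discrete distributions is half the sum of absolute differences. The base-R sketch below applies that definition to all matrix entries; the package may restrict it to off-diagonal entries or scale it differently, so `tvd` is a hypothetical helper only.

```r
# Hypothetical helper: textbook total variation distance,
# i.e. half the L1 distance between the entries of two matrices.
tvd <- function(A, B) 0.5 * sum(abs(A - B))

A <- matrix(c(0, 0.7, 0.3, 0), nrow = 2)
B <- matrix(c(0, 0.4, 0.6, 0), nrow = 2)
d <- tvd(A, B)
print(d)
```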
Creates histograms of the empirical rank position distribution for each alternative in rank data
Visualization.Empirical(Data, ymax, ncol = 5, names = NA)
Data |
full, top partial, or sub partial data |
ymax |
maximum value of density to show on graph |
ncol |
number of columns visualization is displayed in |
names |
names of alternatives |
library(ggplot2)
library(gridExtra)
data(Data.Test)
Visualization.Empirical(Data.Test, 0.5)
Multitype Random Utility visualizer
Visualization.MultiType(multitype.output, min, max, names, ncol)
multitype.output |
output from a multitype fitter |
min |
left boundary of graphed x-axis |
max |
right boundary of graphed x-axis |
names |
names of alternatives |
ncol |
number of columns in final output |
none
library(ggplot2)
library(gridExtra)
Data.Tiny <- matrix(c(1, 2, 3, 3, 2, 1, 1, 2, 3), ncol = 3, byrow = TRUE)
multitype.output <- Estimation.RUM.MultiType.MLE(Data.Tiny, iter = 1, dist = "norm", ratio = .5)
names <- 1:3
#run the following code to make plots
#plots <- Visualization.MultiType(multitype.output, -2, 2, names, 3)
Creates pairwise matrices to compare inference results with the empirical pairwise probabilities
Visualization.Pairwise.Probabilities(Data.pairs, Parameters, get.pairwise.prob, name.of.method)
Data.pairs |
data broken into pairs |
Parameters |
The Parameter element of a result from an Estimation function |
get.pairwise.prob |
function that we use to generate the pairwise probability of beating |
name.of.method |
names of the alternatives |
none
library(ggplot2)
library(gridExtra)
data(Data.Test)
Data.Test.pairs <- Breaking(Data.Test, "full")
Parameters <- Estimation.PL.GMM(Data.Test.pairs, 5)$Parameters
PL.Pairwise.Prob <- function(a, b) a$Mean / (a$Mean + b$Mean)
Visualization.Pairwise.Probabilities(Data.Test.pairs, Parameters, PL.Pairwise.Prob, "PL")
Creates marginal random utility density plots for each alternative given an Estimation object for a PL or Nonparametric model
Visualization.RUMplots(RUM = "Exponential", Estimate = NA, min = -5, max = 5, ncol = 5, names = NA)
RUM |
choice of Exponential, Gumbel, or Nonparametric |
Estimate |
fitted RUM object |
min |
minimum x value to display |
max |
maximum x value to display |
ncol |
number of columns in the visualization |
names |
names of alternatives |
library(ggplot2)
library(gridExtra)
Data.Tiny <- matrix(c(1, 2, 3, 3, 2, 1, 1, 2, 3), ncol = 3, byrow = TRUE)
Estimate <- Estimation.PL.GMM(Breaking(Data.Tiny, method = "full"), m = 3)
Visualization.RUMplots("Exponential", Estimate, names = 1:3)
Given alternatives a and b (both items from the inference object) what is the probability that a beats b?
Zemel.Pairwise.Prob(a, b)
a |
list containing parameters for a |
b |
list containing parameters for b |
probability that a beats b