Have you ever played the board game "Guess Who?"? For those who have not had a childhood (which might be the only excuse for not knowing this board game), it consists in trying to guess which character, from a board of faces, the opposing player is thinking of - we will call the one he chooses the "chosen character". The characters have several characteristics, such as their gender, having brown hair or wearing glasses. To find out, you are only allowed to ask questions expecting a yes-no answer.



This game has been taken to a whole new level of complexity by the funny and impressive website Akinator, whose software tries to guess whom we are thinking of by asking yes-no questions.

A friend/colleague/classmate of mine, Pierre Cordier, and I wondered how it worked and what the fastest way to find the answer would be. In particular, is it better to ask about a very balanced characteristic, such as "Is it a female?", or about a very unbalanced one, such as "Is it an alien?"? The first question is very likely to eliminate about fifty percent of the population, while the second is very likely to eliminate only a small part of it; on the other hand, if we are lucky enough to get the answer "Yes, it is an alien", we immediately know who the "chosen character" is. In other words, the mantra of the first strategy is "low risk, low reward", while the second strategy's is "high risk, high reward".
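Before simulating anything, a quick back-of-the-envelope calculation (my own addition here, not part of the simulation below) already hints at the answer. If a proportion p of the remaining candidates share the characteristic we ask about, then with probability p the answer is "yes" and a fraction p of the candidates survives, while with probability 1 - p a fraction 1 - p survives. The expected surviving fraction is therefore p^2 + (1 - p)^2, which is smallest at p = 0.5:

# Expected fraction of candidates left after asking about a characteristic
# shared by a proportion p of them: p * p + (1 - p) * (1 - p)
expectedRemaining = function(p) p^2 + (1 - p)^2

expectedRemaining(0.5)  # 0.50: on average half of the candidates survive
expectedRemaining(0.9)  # 0.82: on average most of the candidates survive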

To answer this mysterious question, we took a network approach. More specifically, we used a bipartite network, also known as an affiliation network. Our reference is the book by David Easley and Jon Kleinberg, Networks, Crowds, and Markets: Reasoning About a Highly Connected World, introduced earlier in Spatial segregation in cities - An explanation by a neural network model. The idea is that every character either belongs or does not belong to various institutions/groups such as "wearing glasses". Mathematically, we can represent this network as a matrix where every column is an institution and every row is an individual. If individual i belongs to institution j, then the entry at row i and column j is 1; otherwise it is 0.
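As a small illustration (a made-up toy example, not the matrix used in the simulation), four characters and three institutions could be encoded like this:

# Toy membership matrix: rows are individuals, columns are institutions.
# A 1 at position (i, j) means that individual i belongs to institution j.
toyMatrix = matrix(c(1, 0, 1,
                     0, 0, 1,
                     1, 1, 0,
                     0, 1, 1),
                   nrow = 4, byrow = TRUE,
                   dimnames = list(paste0("character", 1:4),
                                   c("female", "glasses", "brownHair")))
toyMatrix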

We first generated this matrix at random and then implemented two strategies. The first one asks about the characteristic with the most balanced distribution of "yes" and "no" among the remaining candidates. The second one asks about the characteristic whose distribution is closest to 90% "yes" for 10% "no", or 10% "yes" for 90% "no".
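The selection rule behind both strategies can be sketched as follows (an illustrative helper of my own, not the function used in the simulation; the full code is given below). For each characteristic we measure how far its proportion of "yes" is from the target p, or from 1 - p, and ask about the closest one:

# Pick the column whose proportion of 1s is closest to the target p or to 1 - p.
# target = 0.5 gives the "50/50" strategy, target = 0.9 (or 0.1) the "90/10" one.
pickQuestion = function(membership, target){
  props = colMeans(membership)
  d = pmin(abs(props - target), abs(props - (1 - target)))
  which.min(d)
}

# pickQuestion(toyMatrix, 0.5)   # most balanced characteristic
# pickQuestion(toyMatrix, 0.9)   # most unbalanced characteristic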

We ran this simulation many times (with 100 individuals and 10 institutions) to estimate the number of questions needed to find the "chosen character". We made two observations. On average, the fifty/fifty strategy is faster than the ninety/ten strategy. Moreover, the number of questions needed varies less for the first one. The first strategy is therefore more efficient on average and more reliable (since its variance is lower).
The dark blue line stands for the "90/10" strategy, the other one for the "50/50" strategy. The horizontal lines represent the mean for each strategy. The "50/50" strategy almost always beats the "90/10" strategy.

After 50 simulations, the probability distributions of the number of questions give a good insight into the gap between the two strategies.

Probability distributions
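Once the memory list has been filled by the code below, the two observations (lower mean and lower variance for the "50/50" strategy) can be checked directly:

# Compare the two strategies after running the simulation below.
sapply(memory[c("fifty", "ninety")], mean)  # average number of questions
sapply(memory[c("fifty", "ninety")], var)   # variability of that number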

The code (R). It also implements, as a baseline not discussed above, a "random" strategy that asks about the characteristics in a random order:

# Guess Who?

# install.packages("shape")

# functions
# creation of the membership matrix: rows are individuals, columns are institutions,
# each entry is 1 with probability 0.5
createMatrix = function(myNumberPeople, myNumberInstitution){
  mat = matrix(rbinom(myNumberPeople * myNumberInstitution, 1, prob = 0.5),
               nrow = myNumberPeople, ncol = myNumberInstitution)
  return(mat)
}

# proportion of individuals belonging to each institution (column means)
proportion = function(myMatrix){
  return(colMeans(myMatrix))
}

# strategy based on proportion: among the remaining institutions (indices in a),
# pick the one whose proportion of "yes" is closest to p or to 1 - p
closest = function(myMatrix, a, p){
  vect = proportion(myMatrix)
  for(i in 1:length(vect)){
    vect[i] = min(abs(p - vect[i]), abs((1 - p) - vect[i]))
  }
  return(a[which(vect[a] == min(vect[a]))[1]])
}

# does the chosen character belong to the institution selected by closest()?
belong = function(myMatrix, myGuy, a, p){
  return(myGuy[closest(myMatrix, a, p)] == 1)
}

# keep only the candidates who gave the same answer as the chosen character
elimination = function(myMatrix, myGuy, a, p){
  myMatrix = myMatrix[(belong(myMatrix, myGuy, a, p) == myMatrix[, closest(myMatrix, a, p)]), ]
  return(myMatrix)
}

# play the game with the proportion-based strategy:
# p = 0.5 gives the "50/50" strategy, p = 0.1 the "90/10" one
strategy = function(myMatrix, myGuy, p){
  k = 0
  a = 1:ncol(myMatrix)
  # keep asking while there are questions left and more than one distinct candidate remains
  while(length(a) > 0 && !is.vector(myMatrix) && nrow(unique(myMatrix)) > 1){
    b = a[-which(a == closest(myMatrix, a, p))]
    myMatrix = elimination(as.matrix(myMatrix), myGuy, a, p)
    a = b
    k = k + 1
  }
  res = list(NumberSteps = k, Candidates = myMatrix)
  return(res)
}

# random strategy: ask about the institutions in a random order
createOrder = function(){
  return(sample(1:numberInstitution, numberInstitution))
}

# does the chosen character belong to institution myVariable?
belongRandom = function(myMatrix, myGuy, myVariable){
  return(myGuy[myVariable] == 1)
}

# keep only the candidates who gave the same answer as the chosen character
eliminationRandom = function(myMatrix, myGuy, a){
  myMatrix = myMatrix[(belongRandom(myMatrix, myGuy, a[1]) == myMatrix[, a[1]]), ]
  return(myMatrix)
}

strategyRandom = function(myMatrix, myGuy){
  k = 0
  a = createOrder()
  # keep asking while there are questions left and more than one distinct candidate remains
  while(length(a) > 0 && !is.vector(myMatrix) && nrow(unique(myMatrix)) > 1){
    myMatrix = eliminationRandom(as.matrix(myMatrix), myGuy, a)
    a = a[-1]
    k = k + 1
  }
  res = list(NumberSteps = k, Candidates = myMatrix)
  return(res)
}

application = function(myMatrix, myGuy, myStrategy){
  # myStrategy is one of "50", "random", "90"
  if(myStrategy == "random"){return(strategyRandom(myMatrix, myGuy))}
  if(myStrategy == "50"){return(strategy(myMatrix, myGuy, 0.5))}
  if(myStrategy == "90"){return(strategy(myMatrix, myGuy, 0.1))}
}

# initialization: 50 simulations with 100 individuals and 10 institutions
numberPeople = 100
numberInstitution = 10
memory = list("fifty" = c(), "random" = c(), "ninety" = c())
for(i in 1:50){
  mat = createMatrix(numberPeople, numberInstitution)
  # draw the "chosen character" at random among the individuals
  guy = mat[sample(1:numberPeople, 1), ]
  # record the number of questions needed by each strategy on the same population
  memory$ninety = c(memory$ninety, strategy(mat, guy, 0.1)$NumberSteps)
  memory$random = c(memory$random, application(mat, guy, "random")$NumberSteps)
  memory$fifty = c(memory$fifty, strategy(mat, guy, 0.5)$NumberSteps)
}


#plotting
library(shape)
png(filename="~/Chr10/post9_figure1.png", bg="white")
col <- shadepalette(9, "cyan", "blue")
plot(NULL, ylim = c(4.2, 10), xlim = c(3, 48), xlab = "Simulations", ylab = "Number of questions")
polygon(c(1:50,50:1), c(memory$fifty,rev(memory$ninety)), col="#EFDECD")
lines(memory$ninety, type = 'l', col=col[3], lwd = 2)
abline(h=mean(memory$fifty), col=col[7], lwd =2)
abline(h=mean(memory$ninety), col=col[3], lwd = 2)
lines(memory$fifty, col = col[7], lwd = 2)
title("Difference between the two strategies")
dev.off()

png(filename="~/Chr10/post9_figure2.png", bg="white")
plot(table(memory$ninety)/50, ylim=c(0,0.7), type="p",col = col[3], lwd = 4,
     main="Distribution of the number of questions
for both strategies",
     xlab="Number of questions",
     ylab="Probability")
lines(table(memory$fifty)/50, type="p",col = col[7], lwd = 4)
legend(x=4, y= 0.7,legend =c("50/50", "90/10"), col=c(col[7],col[3]), pch=  1, pt.lwd = 4)
dev.off()
