Have you ever played the board game "Guess who?". For those who have not experienced childhood (because it might be the only reason to ignore this board game), this is a game consisting in trying to guess who the opponent player is thinking of among a list of characters - we will call the one he chooses the "chosen character". These characters have several characteristics such as gender, having brown hair or wearing glasses. To find out, you are only allowed to ask questions expecting a yes-no answer.



This game has been expanded to a further complexity through the funny and impressive website: Akinator. The software tries to guess who we are thinking of by asking yes-no questions.

With a friend/colleague/classmate of mine, Pierre Cordier, we wondered how it worked and what would be the fastest way to find the answer. In particular, is it better to ask about a very balanced characteristic, such as "Is it a female?", or a very unbalanced characteristic such as "Is it an alien?". The first question is very likely to remove fifty percent of the population, while the second question is very likely to eliminate a very small part of the population; on the other hand, if we are lucky, we can have the answer "Yes, it is an alien" and, in this case, we will directly know who is the "chosen character". In other words, the mantra of the first strategy is "No risk, no reward", while the second strategy's one is "High risk, high reward".

To answer this mysterious question, we had a neural network approach. More especially, we had a bipartite neural network approach, also known as institutional neural network. We use as a reference the book by  David Easley and  Jon Kleinberg, Networks, crowds and markets: reasoning about a highly connected world, introduced earlier in Spatial segregation in cities - An explanation by a neural network model. The idea is that every character belongs, or not, to different institutions/groups such as "wearing glasses". In term of mathematical approach, we can represent this neural network as a matrix where every column is an institution and each row is an individual. If the individual i belongs to the institution j, then the intersection of the row i and the column j would be 1, otherwise it would be 0.

We have first randomly generated this matrix, and then computed two strategies. The first one checks for the characteristic with the most balanced distribution of "yes" and "no". The second strategy finds the characteristic with the nearest distribution to 90% of "yes" for 10% of "no", or 10% of "yes" for 90% of "no".

We have done this simulation many times (considering 100 individuals for 10 institutions) to have an estimation of the number of questions needed to find the "chosen character". We have made two observations. In average, the fifty/fifty strategy is faster than the ninety/ten strategy. Besides, the number of questions needed varies less for the first one. Therefore, the first strategy is more efficient in average, and more reliable (as its variance is lower).
The dark blue line stands for the "90/10" strategy, and the other one for the "50/50". The horizontal lines represent the mean for each strategy. The "50/50" strategy is almost always beating the "90/10" strategy.

After 50 simulations, the probability distributions of the number of questions give a good insight of the spread between the two strategies.

Probability distributions

The code (R):

# Who s who

# install.packages("shape")

#functions
# creation of matrix
createMatrix = function(myNumberPeople, myNumberInstitution){
  mat = matrix(0, nrow = myNumberPeople, ncol = myNumberInstitution)
  for (i in 1:(myNumberPeople * myNumberInstitution)){
    mat[i]= rbinom(1, 1, prob = 0.5)
  }
  return(mat)
}

#calculation of proportion
proportion = function(myMatrix){
  result = matrix(0, 1,length(myMatrix[1,]))
  for(i in 1:length(result)){
    result[i] = sum(myMatrix[,i])
  }
  return(result/nrow(myMatrix))
 
}

#strategy based on proportion
closest = function(myMatrix, a, p){
  vect = proportion(myMatrix)
  for(i in 1:length(vect)){
    vect[i] = min(abs(p - vect[i]),abs((1-p) - vect[i]))
  }
  return(a[which(vect[a] == min(vect[a]))[1]])
}

belong = function(myMatrix,myGuy,a , p){
  return(myGuy[closest(myMatrix, a, p)]==1)
}

elimination = function(myMatrix,myGuy, a, p){
  myMatrix = myMatrix[(belong(myMatrix,myGuy, a, p)==myMatrix[,closest(myMatrix, a, p)]),]
  return(myMatrix)
}

strategy = function(myMatrix,myGuy, p){
  k = 0
  a = 1:numberInstitution
  while(length(a) > 0 & if(is.vector(myMatrix) != 1){length(myMatrix[!duplicated(myMatrix),])/numberInstitution != 1}else{FALSE}){
    b = a[-which(a==closest(myMatrix, a, p))]
    myMatrix = elimination(as.matrix(myMatrix), myGuy, a, p)
    a= b
    k = k+1
    #     print(dim(myMatrix))
  }
  res = list(NumberSteps=k, Candidates=myMatrix)
  return(res)
}

#random strategy
createOrder = function(){
  return(sample(1:numberInstitution, numberInstitution))
}

belongRandom = function(myMatrix,myGuy, myVariable){
  return(myGuy[myVariable]==1)
}

eliminationRandom = function(myMatrix,myGuy,a){
  myMatrix = myMatrix[(belongRandom(myMatrix,myGuy, a[1])==myMatrix[,a[1]]),]
  return(myMatrix)
}

strategyRandom = function(myMatrix,myGuy){
  k = 0
  a = createOrder()
  while(length(a) > 0 & (if(is.vector(myMatrix) != 1){length(myMatrix[!duplicated(myMatrix),])/numberInstitution != 1}else{FALSE})){
    myMatrix = eliminationRandom(as.matrix(myMatrix), myGuy,a)
    a = a[-1]
    k = k+1
  }
  res = list(NumberSteps=k, Candidates=myMatrix)
  return(res)
}

application = function(myMatrix, myGuy, myStrategy){
  #   myStrategy is in {50, random, 90}
  if(myStrategy == "random"){return(strategyRandom(myMatrix, myGuy))}
}

# initialization
numberPeople = 100
numberInstitution = 10
memory = list("fifty" = c(), "random" = c(), "ninety" = c())
for(i in 1:50){
  mat = createMatrix(numberPeople, numberInstitution)
  copyMat = mat
  guy = copyMat[sample(1:length(copyMat[,1]),1),]
  #   copy2 = mat
  #   copy3 = mat
  #   guy = mat[sample(1:numberPeople,1),]
  #   memory$fifty = c(memory$fifty, application(copy1, guy, "50")$NumberSteps)
  #   memory$random = c(memory$random, application(copy2, guy, "random")$NumberSteps)
  memory$ninety = c(memory$ninety, strategy(copyMat, guy, 0.1)$NumberSteps)
  memory$random = c(memory$random, application(copyMat, guy, "random")$NumberSteps)
  memory$fifty = c(memory$fifty, strategy(copyMat, guy, 0.5)$NumberSteps)
}


#plotting
library(shape)
png(filename="~/Chr10/post9_figure1.png", bg="white")
col <- shadepalette(9, "cyan", "blue")
plot(c(),ylim=c(4.2,10), xlim = c(3, 48), xlab ="Simulationss", ylab ="Number of questions")
polygon(c(1:50,50:1), c(memory$fifty,rev(memory$ninety)), col="#EFDECD")
a <- c(mean(memory$fifty)-var(memory$fifty),mean(memory$fifty)+var(memory$fifty))
b <- c(mean(memory$ninety)-var(memory$ninety),mean(memory$ninety)+var(memory$ninety))
lines(memory$ninety, type = 'l', col=col[3], lwd = 2)
abline(h=mean(memory$fifty), col=col[7], lwd =2)
abline(h=mean(memory$ninety), col=col[3], lwd = 2)
lines(memory$fifty, col = col[7], lwd = 2)
title("Difference between the two strategies")
dev.off()

png(filename="~/Chr10/post9_figure2.png", bg="white")
plot(table(memory$ninety)/50, ylim=c(0,0.7), type="p",col = col[3], lwd = 4,
     main="Distribution of the number of questions
for both strategies",
     xlab="Number of questions",
     ylab="Probability")
lines(table(memory$fifty)/50, type="p",col = col[7], lwd = 4)
legend(x=4, y= 0.7,legend =c("50/50", "90/10"), col=c(col[7],col[3]), pch=  1, pt.lwd = 4)
dev.off()

0

Add a comment

The financial market is not only made of stock options. Other financial products enable market actors to target specific aims. For example, an oil buyer like a flight company may want to cover the risk of increase in the price of oil. In this case it is possible to buy on the financial market what is known as a "Call" or a "Call Option".

A Call Option is a contract between two counterparties (the flight company and a financial actor). The buyer of the Call has the opportunity but not the obligation to buy a certain  quantity of a certain product (called the underlying) at a certain date (the maturity) for a certain price (the strike).

I found a golden website. The blog of Esteban Moro. He uses R to work on networks. In particular he has done a really nice code to make some great videos of networks. This post is purely a copy of his code. I just changed a few arguments to change colors and to do my own network.

To create the network, I used the  Barabási-Albert algorithm that you can find at the end of the post on the different algorithms for networks. Igraph is the library which has been used.
3

As you have certainly seen now, I like working on artificial neural networks. I have written a few posts about models with neural networks (Models to generate networks, Want to win to Guess Who and Study of spatial segregation).

Unfortunately, I missed so far a nice and pleasant aspect of networks : its graphical approach. Indeed, plots of neural networks are often really nice and really useful to understand the network.

Sometimes such a graph can point out some characteristics of the network.
1

I already talked about networks a few times in this blog. In particular, I had this approach to explain spatial segregation in a city or to solve the Guess Who? problem. However, one of the question is how to generate a good network. Indeed, I aim to study strategy to split a network, but I need first to work with a realistic neural network. I could have downloaded data of a network, but I'd rather study the different models proposed to generate neural networks.

The function apply() is certainly one of the most useful function. I was scared of it during a while and refused to use it. But it makes the code so much faster to write and so efficient that we can't afford not using it. If you are like me, that you refuse to use apply because it is scary, read the following lines, it will help you. You want to know how to use apply() in general, with a home-made function or with several parameters ? Then, go to see the following examples.
1

Have you ever played the board game "Guess who?". For those who have not experienced childhood (because it might be the only reason to ignore this board game), this is a game consisting in trying to guess who the opponent player is thinking of among a list of characters - we will call the one he chooses the "chosen character". These characters have several characteristics such as gender, having brown hair or wearing glasses.

If you want to choose randomly your next holidays destination, you are likely to process in a way which is certainly biased. Especially if you choose randomly the latitude and the longitude. A bit like they do in this lovely advertising (For those of you who do not speak French, this is about a couple who have won the national gamble prize and have to decide their next travel. The husband randomly picks Australia and the wife is complaining : "Not again!").
4

My previous post is about a method to simulate a Brownian motion. A friend of mine emailed me yesterday to tell me that this is useless if we do not know how to simulate a normally distributed variable.

My first remark is: use the rnorm() function if the quality of your simulation is not too important (Later, I'll try to explain you why the R "default random generation" functions are not perfect). However, it may be fun to generate a normal distribution from a simple uniform distribution.

The Brownian motion is certainly the most famous stochastic process (a random variable evolving in the time). It has been the first way to model a stock option price (Louis Bachelier's thesis in 1900).

The reason why is easy to understand, a Brownian motion is graphically very similar to the historical price of a stock option.
1

The merge of two insurance companies enables to curb the probability of ruin by sharing the risk and the capital of the two companies.

For example, we can consider two insurance companies, A and B. A is a well known insurance company with a big capital and is dealing with a risk with a low variance. We will assume that the global risk of all its customers follow a chi-square distribution with one degree of freedom.
Blog Archive
Translate
Translate
Loading