Have you ever played the board game "Guess Who?"? For those who have not experienced childhood (which might be the only valid reason to have missed this board game), it is a game in which you try to guess which character the opposing player is thinking of from a list of characters - we will call the one he chooses the "chosen character". These characters have several characteristics, such as gender, brown hair or glasses. To find out, you are only allowed to ask questions expecting a yes-no answer.



This game has been taken to another level of complexity by the funny and impressive website Akinator. The software tries to guess who we are thinking of by asking yes-no questions.

A friend/colleague/classmate of mine, Pierre Cordier, and I wondered how it worked and what the fastest way to find the answer would be. In particular, is it better to ask about a very balanced characteristic, such as "Is it a female?", or about a very unbalanced one, such as "Is it an alien?"? The first question is very likely to remove about fifty percent of the population, while the second is very likely to eliminate only a very small part of it; on the other hand, if we are lucky enough to get the answer "Yes, it is an alien", we immediately know who the "chosen character" is. In other words, the mantra of the first strategy is "No risk, no reward", while the second strategy's is "High risk, high reward".

To answer this mysterious question, we took a network approach. More precisely, we used a bipartite network, also known as an affiliation network, linking individuals to institutions. Our reference is the book by David Easley and Jon Kleinberg, Networks, Crowds, and Markets: Reasoning About a Highly Connected World, introduced earlier in Spatial segregation in cities - An explanation by a neural network model. The idea is that every character either belongs or does not belong to different institutions/groups such as "wearing glasses". Mathematically, we can represent this network as a matrix where every column is an institution and every row is an individual. If individual i belongs to institution j, then the entry at row i and column j is 1; otherwise it is 0.
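
For instance, a board with three characters and three characteristics could be encoded as follows (a toy illustration only; the names and characteristics are invented, and the simulation below generates this matrix at random):

# A tiny hand-made example of the individual x institution matrix (illustration only)
characters = matrix(c(1, 0, 1,   # Alice : female, no glasses, brown hair
                      0, 1, 0,   # Bob   : not female, glasses, no brown hair
                      0, 0, 1),  # Carl  : not female, no glasses, brown hair
                    nrow = 3, byrow = TRUE,
                    dimnames = list(c("Alice", "Bob", "Carl"),
                                    c("female", "glasses", "brownHair")))
characters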

We first generated this matrix randomly and then implemented two strategies. The first one picks the characteristic with the most balanced distribution of "yes" and "no" answers. The second one picks the characteristic whose distribution is closest to 90% of "yes" for 10% of "no" (or 10% of "yes" for 90% of "no").

We ran this simulation many times (with 100 individuals and 10 institutions) to estimate the number of questions needed to find the "chosen character". We made two observations. On average, the fifty/fifty strategy is faster than the ninety/ten strategy. Besides, the number of questions needed varies less for the first one. Therefore, the first strategy is more efficient on average and more reliable (its variance is lower).
The dark blue line stands for the "90/10" strategy, the other one for the "50/50" strategy. The horizontal lines represent the mean of each strategy. The "50/50" strategy almost always beats the "90/10" strategy.

After 50 simulations, the probability distributions of the number of questions give a good idea of the gap between the two strategies.

Probability distributions
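
A quick back-of-the-envelope calculation, which is not part of the simulation, already points in that direction: if a characteristic is shared by a fraction p of the remaining candidates, the chosen character falls in that group with probability p, so the expected fraction of candidates left after the question is p^2 + (1-p)^2. This is minimal (0.5) for p = 0.5 and rises to 0.82 for p = 0.9 or p = 0.1.

# Expected fraction of candidates remaining after asking about a characteristic
# shared by a fraction p of the current candidates (illustration only)
expectedRemaining = function(p){p^2 + (1 - p)^2}
expectedRemaining(0.5)   # 0.5
expectedRemaining(0.9)   # 0.82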

The code (R):

# Who's who

# install.packages("shape")

#functions
# creation of matrix
createMatrix = function(myNumberPeople, myNumberInstitution){
  mat = matrix(0, nrow = myNumberPeople, ncol = myNumberInstitution)
  for (i in 1:(myNumberPeople * myNumberInstitution)){
    mat[i]= rbinom(1, 1, prob = 0.5)
  }
  return(mat)
}

#calculation of proportion
proportion = function(myMatrix){
  result = matrix(0, 1,length(myMatrix[1,]))
  for(i in 1:length(result)){
    result[i] = sum(myMatrix[,i])
  }
  return(result/nrow(myMatrix))
 
}

#strategy based on proportion
closest = function(myMatrix, a, p){
  vect = proportion(myMatrix)
  for(i in 1:length(vect)){
    vect[i] = min(abs(p - vect[i]),abs((1-p) - vect[i]))
  }
  return(a[which(vect[a] == min(vect[a]))[1]])
}

belong = function(myMatrix,myGuy,a , p){
  return(myGuy[closest(myMatrix, a, p)]==1)
}

elimination = function(myMatrix,myGuy, a, p){
  myMatrix = myMatrix[(belong(myMatrix,myGuy, a, p)==myMatrix[,closest(myMatrix, a, p)]),]
  return(myMatrix)
}

strategy = function(myMatrix,myGuy, p){
  k = 0
  a = 1:numberInstitution
  # keep asking while there are institutions left and more than one distinct candidate remains
  while(length(a) > 0 & if(is.vector(myMatrix) != 1){length(myMatrix[!duplicated(myMatrix),])/numberInstitution != 1}else{FALSE}){
    b = a[-which(a==closest(myMatrix, a, p))]
    myMatrix = elimination(as.matrix(myMatrix), myGuy, a, p)
    a= b
    k = k+1
    #     print(dim(myMatrix))
  }
  res = list(NumberSteps=k, Candidates=myMatrix)
  return(res)
}

#random strategy
createOrder = function(){
  return(sample(1:numberInstitution, numberInstitution))
}

belongRandom = function(myMatrix,myGuy, myVariable){
  return(myGuy[myVariable]==1)
}

eliminationRandom = function(myMatrix,myGuy,a){
  myMatrix = myMatrix[(belongRandom(myMatrix,myGuy, a[1])==myMatrix[,a[1]]),]
  return(myMatrix)
}

strategyRandom = function(myMatrix,myGuy){
  k = 0
  a = createOrder()
  while(length(a) > 0 & (if(is.vector(myMatrix) != 1){length(myMatrix[!duplicated(myMatrix),])/numberInstitution != 1}else{FALSE})){
    myMatrix = eliminationRandom(as.matrix(myMatrix), myGuy,a)
    a = a[-1]
    k = k+1
  }
  res = list(NumberSteps=k, Candidates=myMatrix)
  return(res)
}

application = function(myMatrix, myGuy, myStrategy){
  # only the random strategy goes through this wrapper; the 50/50 and 90/10
  # strategies are called directly via strategy(myMatrix, myGuy, p)
  if(myStrategy == "random"){return(strategyRandom(myMatrix, myGuy))}
}

# initialization
numberPeople = 100
numberInstitution = 10
memory = list("fifty" = c(), "random" = c(), "ninety" = c())
for(i in 1:50){
  mat = createMatrix(numberPeople, numberInstitution)
  copyMat = mat
  guy = copyMat[sample(1:length(copyMat[,1]),1),]
  #   copy2 = mat
  #   copy3 = mat
  #   guy = mat[sample(1:numberPeople,1),]
  #   memory$fifty = c(memory$fifty, application(copy1, guy, "50")$NumberSteps)
  #   memory$random = c(memory$random, application(copy2, guy, "random")$NumberSteps)
  memory$ninety = c(memory$ninety, strategy(copyMat, guy, 0.1)$NumberSteps)
  memory$random = c(memory$random, application(copyMat, guy, "random")$NumberSteps)
  memory$fifty = c(memory$fifty, strategy(copyMat, guy, 0.5)$NumberSteps)
}


#plotting
library(shape)
png(filename="~/Chr10/post9_figure1.png", bg="white")
col <- shadepalette(9, "cyan", "blue")
plot(c(),ylim=c(4.2,10), xlim = c(3, 48), xlab ="Simulations", ylab ="Number of questions")
polygon(c(1:50,50:1), c(memory$fifty,rev(memory$ninety)), col="#EFDECD")
a <- c(mean(memory$fifty)-var(memory$fifty),mean(memory$fifty)+var(memory$fifty))
b <- c(mean(memory$ninety)-var(memory$ninety),mean(memory$ninety)+var(memory$ninety))
lines(memory$ninety, type = 'l', col=col[3], lwd = 2)
abline(h=mean(memory$fifty), col=col[7], lwd =2)
abline(h=mean(memory$ninety), col=col[3], lwd = 2)
lines(memory$fifty, col = col[7], lwd = 2)
title("Difference between the two strategies")
dev.off()

png(filename="~/Chr10/post9_figure2.png", bg="white")
plot(table(memory$ninety)/50, ylim=c(0,0.7), type="p",col = col[3], lwd = 4,
     main="Distribution of the number of questions
for both strategies",
     xlab="Number of questions",
     ylab="Probability")
lines(table(memory$fifty)/50, type="p",col = col[7], lwd = 4)
legend(x=4, y= 0.7,legend =c("50/50", "90/10"), col=c(col[7],col[3]), pch=  1, pt.lwd = 4)
dev.off()


  1. The financial market is not only made of stocks. Other financial products enable market actors to target specific aims. For example, an oil buyer such as an airline may want to cover the risk of an increase in the price of oil. In this case it is possible to buy on the financial market what is known as a "Call" or a "Call Option".

    A Call Option is a contract between two counterparties (the airline and a financial actor). The buyer of the Call has the right, but not the obligation, to buy a certain quantity of a certain product (the underlying) at a certain date (the maturity) for a certain price (the strike).

    If the airline does not want to suffer from the risk of an increase in the price of oil over the next year, it may buy a Call Option of maturity 1 year with a certain strike K. In this case, if the oil price on the market is above the strike, the airline will use the Call Option to buy oil at the strike price. If the price of oil is below the strike, the airline will not use the option and will buy oil on the market.

    As you can see, this contract is not symmetric. The airline has an option (to buy or not to buy) while the second counterparty just follows the airline's decision. Therefore, the airline will have to pay something to the second counterparty. One of the big questions is how much it should be charged.

    We will therefore use a pricing algorithm to estimate the price of this Call Option. We first build an algorithm to simulate the value of an asset in the Black-Scholes model. Then we simulate many paths of this asset in order to simulate the final value of the Call Option.

    Indeed, if we know the value of the asset at maturity, A(T), we directly get the value of the Call Option of strike K at maturity: max(A(T) - K, 0).

    Our asset is randomly distributed as:
    A(t+dt) = A(t) (1 + r dt + v (B(t+dt) - B(t)))

    where r is the interest rate, v the volatility of the asset and B a Brownian motion (for more detail you can look at the post on Brownian motion).

    After repeating the process many times, we average the discounted payoff of the Call Option for different strikes in order to estimate its price.


    As we can see, the price of the Call decreases with the strike and converges towards 0. Indeed, the higher the strike is, the lower the chance that the asset ends up above it.

    What is done here can be done for many, many, many other financial products in order to price them by Monte Carlo.



    The code (R):


    sample.size <- 365
    mu <- 0.1
    sigma <- 0.2

    a0 <- 1
    Asset <- function(sample.size = 365, mu = 0.1, sigma = 0.2, a0 = 1){
      dt <- 1/sample.size
      sdt =sigma*sqrt(dt)
      gauss <- rnorm(sample.size)
      asset <- NULL
      asset[1] <- a0 * (1 + dt*mu + sdt * gauss[1])
      test.default <- FALSE
      for(i in 2:sample.size){
        if (!test.default){
          asset[i] <- asset[i-1] * (1 + dt*mu + sdt * gauss[i])
        }
        else{
          asset[i] <- 0
        }
        if (asset[i] <= 0){
          asset[i] <- 0
          test.default <- TRUE
        }
      }
      return(asset)
    }


    PriceEstimation <- function(t, tf = 365, r = 0.1, strike = 1, n = 1000, mu = 0.1, sigma = 0.2, a0 = 1){
      mean <- 0
      for(i1 in 1:n){
        # simulate one path and take the discounted payoff of the Call at maturity
        path <- Asset(tf, mu, sigma, a0)
        mean <- mean + exp(-r*(tf-t)/tf) * max(0, path[tf] - strike)
      }
      mean <- mean/n
      return(mean)
    }

    res <- NULL
    for(i in 1:length(seq(0, 3, 0.05))){
      res[i] <- PriceEstimation(t = 0, tf = 365, r = 0.1, strike = seq(0, 3, 0.05)[i], n = 1000, mu = 0.1, sigma = 0.2, a0 = 1)
    }

    plot(seq(0,3, 0.05), res,type = 'l', xlab = "Strike Value",  ylab = "Price of the Call")
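
    As a possible sanity check (not in the original post), and since the drift mu used in the simulation equals the discount rate r, the Monte Carlo prices can be compared with the Black-Scholes closed-form price of a call with maturity 1 year:

    # Black-Scholes closed-form price of a European call (sketch; assumes mu = r)
    blackScholesCall <- function(s0, strike, r, sigma, maturity = 1){
      d1 <- (log(s0 / strike) + (r + sigma^2 / 2) * maturity) / (sigma * sqrt(maturity))
      d2 <- d1 - sigma * sqrt(maturity)
      s0 * pnorm(d1) - strike * exp(-r * maturity) * pnorm(d2)
    }
    # overlay on the Monte Carlo curve plotted above
    lines(seq(0, 3, 0.05), blackScholesCall(1, seq(0, 3, 0.05), r = 0.1, sigma = 0.2), col = "red")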




  2. I found a golden website: the blog of Esteban Moro. He uses R to work on networks. In particular, he has written some really nice code to make great videos of growing networks. This post is essentially a copy of his code; I just changed a few arguments to change the colours and to build my own network.

    To create the network, I used the Barabási-Albert algorithm, which you can find at the end of the post on the different algorithms to generate networks. igraph is the library used here.

    In order to make a video from the .png files, I used a piece of software called FFmpeg. It took me a bit of time to get used to it, but you can find tutorials on the Internet.
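
    For reference, a typical FFmpeg command to stitch the frames example0001.png, example0002.png, ... produced by the code below into a video looks like the line that follows (my exact settings may have differed; adjust the frame rate and output name to your needs). It can be run from R with system():

    # hypothetical example: assemble the png frames into an mp4
    system("ffmpeg -framerate 25 -i example%04d.png -c:v libx264 -pix_fmt yuv420p network.mp4")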

    Here is the kind of result you can expect:

    The code (R):


    n <- 300
    data <- matrix(0, ncol = 3, nrow = n-1)
    data[1,2] <- 1
    data[1:(n-1),1] <- 2:n
    data[, 3] <- 1:(n-1)
    weight <- NULL
    weight[1] <- 1
    weight[2] <- 1
    for(i1 in 2:(n-1)){
      link = sample(c(1:(i1)), size = 1, prob = weight)
      data[i1, 2] <- link
      weight[i1+1] <- 1
      weight[link] <- weight[link] + 1
    }

    install.packages("igraph")
    library(igraph)

    #generate the full graph
    g <- graph.edgelist(as.matrix(data[,c(1,2)]),directed=F)
    E(g)$time <- data[,3]

    #generate a cool palette for the graph
    YlOrBr <- c(hsv(0.925, 0.20, 0.7), hsv(0.925, 0.40, 0.7), hsv(0.925, 0.60, 0.7), hsv(0.925, 0.80, 0.7), hsv(0.925,1, 0.7))
    YlOrBr.Lab <- colorRampPalette(YlOrBr, space = "Lab")
    #colors for the nodes are chosen from the very beginning
    vcolor <- rev(YlOrBr.Lab(vcount(g)))

    #time in the edges goes from 1 to 300. We kick off at time 3
    ti <- 3
    #weights of edges formed up to time ti is 1. Future edges are weighted 0
    E(g)$weight <- ifelse(E(g)$time < ti,1,0)
    #generate first layout using weights.
    layout.old <- layout.fruchterman.reingold(g,params=list(weights=E(g)$weight))


    #total time of the dynamics
    total_time <- max(E(g)$time)
    #This is the time interval for the animation. In this case is taken to be 1/10
    #of the time (i.e. 10 snapshots) between adding two consecutive nodes
    dt <- 0.1
    #Output for each frame will be a png with HD size 1600x900 :)
    png(file="example%04d.png", width=1600,height=900)
    nsteps <- max(E(g)$time)
    #Time loop starts
    for(ti in seq(3,total_time,dt)){
      #define weight for edges present up to time ti.
      E(g)$weight <- ifelse(E(g)$time < ti,1,0)
      #Edges with non-zero weight are in gray. The rest are transparent
      E(g)$color <- ifelse(E(g)$time < ti,"black",rgb(0,0,0,0))
      #Nodes with at least a non-zero weighted edge are in color. The rest are transparent
      V(g)$color <- ifelse(graph.strength(g)==0,rgb(0,0,0,0),vcolor)
      #given the new weights, we update the layout a little bit
      layout.new <- layout.fruchterman.reingold(g,params=list(niter=10,start=layout.old,weights=E(g)$weight,maxdelta=1))
      #plot the new graph
      plot(g,layout=layout.new,vertex.label="",vertex.size=1+2*log(graph.strength(g)),vertex.color=V(g)$color,edge.width=1.5,asp=9/16,margin=-0.15)
      #use the new layout in the next round
      layout.old <- layout.new
    }
    dev.off()

  3. As you have certainly seen by now, I like working on networks. I have written a few posts about models based on networks (Models to generate networks, Want to win to Guess Who and Study of spatial segregation).

    Unfortunately, so far I have missed a nice and pleasant aspect of networks: their graphical side. Indeed, plots of networks are often really nice and really useful for understanding the network.

    Sometimes such a graph can point out some characteristics of the network. For example, we see in the graph below that there is one "central" node linked with many other nodes. It would not be so obvious if we were looking at a simple print() of a network matrix!

    So if you like this representation, let me introduce you to the package network.

    Let's first download the package and use a home-made function to generate networks randomly (if you want to see more details about this function, see here): 

    install.packages('network')
    library(network) 

    generateBA = function(n = 100, n0 = 2){
      mat = matrix(0, nrow= n, ncol = n)
      for(i in 1:n0){
        for(j in 1:n0){
          if(i != j){
            mat[i,j] = 1
            mat[j,i] = 1
          }
        }
      }
      for(i in n0:n){
        list = c()
        for(k in 1:(i-1)){
          list = c(list, sum(mat[,k]))
        }
        link = sample(c(1:(i-1)), size = 1, prob = list)
        mat[link,i] = 1
        mat[i,link] = 1
      }
      return(mat)
    }


    artificialNet= generateBA(200)

    To create a network object from a matrix, the package provides the function network():

    a = network(artificialNet, directed = FALSE)
    # The parameter directed is specified as FALSE because in our case artificialNet is a symmetric matrix.
    # It is really convenient to plot a nice network
    plot(a)
    #plot.network() is the function to use if we want to change some parameters





    There are many other useful functions in this package. Let's start with the functions for studying the properties of the network. We can compute the density of the network ("The density of a network is defined as the ratio of extant edges to potential edges."), the number of nodes and the number of edges.

    network.density(a)
    network.size(a)
    network.edgecount(a)


    Besides, this package offers the possibility to create and manipulate network objects directly. The function network.initialize(n) creates a network with n nodes and no edges.

    net = network.initialize(10)
    plot(net)

    Adding an edge (and therefore defining our whole network) is then really simple:

    net[1,2] = 1
    plot(net)

    But the real power of this package lies in the definition of operators. Here are the operators (source: the package documentation):

    + : An (i, j) edge is created in the return graph for every (i, j) edge in each of the input graphs.

    - : An (i, j) edge is created in the return graph for every (i, j) edge in the first input that is not matched by an (i, j) edge in the second input; if the second input has more (i, j) edges than the first, no (i, j) edges are created in the return graph.

    * : An (i, j) edge is created for every pairing of (i, j) edges in the respective input graphs.

    %c% : An (i, j) edge is created in the return graph for every edge pair (i, k), (k, j) with the first edge in the first input and the second edge in the second input.

    ! : An (i, j) edge is created in the return graph for every (i, j) in the input not having an edge.

    | : An (i, j) edge is created in the return graph if either input contains an (i, j) edge.

    & : An (i, j) edge is created in the return graph if both inputs contain an (i, j) edge.



    These operators enable some really convenient operations between two networks, as in the small example below.
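
    As a small illustration (not taken from the original post), here is what the union operator does on two tiny networks:

    # two 3-node networks with one edge each; "|" returns their union
    net1 = network.initialize(3)
    net1[1,2] = 1
    net2 = network.initialize(3)
    net2[2,3] = 1
    both = net1 | net2
    network.edgecount(both)   # 2 : the edges 1-2 and 2-3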

    If you want more information about this wonderful package, you can read its reference manual (PDF).

  4. I have already talked about networks a few times in this blog. In particular, I took this approach to explain spatial segregation in a city and to solve the Guess Who? problem. However, one of the questions is how to generate a good network. Indeed, I aim to study strategies to split a network, but I first need a realistic network to work with. I could have downloaded data from a real network, but I would rather study the different models proposed to generate networks.



    I will explain and generate the three most famous models of random networks:
    - The Erdős-Rényi model;
    - The Watts-Strogatz model (small-world model);
    - The Barabási-Albert preferential attachment model.

    We represent each model by an acquaintance (adjacency) matrix. The entry at column i and row j is 1 if and only if node i and node j know each other. Since we simulate reciprocal networks (i.e. if i knows j then j knows i), we can work with an upper-triangular matrix and not worry about the lower triangle of our matrices. Here, I use the R function image() to represent these matrices: the 0s are in red and the 1s in white.

    The Erdős-Rényi model.

    This model is certainly the simplest of the three. Only two parameters are required: N, the number of nodes we consider, and p, the probability that any given pair of nodes is linked by an edge.

    This model assumes that the existence of a link between two nodes is independent of the other links of the graph. According to Daniel A. Spielman, this model was not created to represent any realistic graph. However, it has some very interesting properties. The average path length is of order log(N), which is relatively short.
    Besides, the clustering coefficient of this model is simply p (two neighbours of a node are themselves linked with probability p, independently of everything else), so it tends to 0 when p shrinks with N, for instance when the average degree is kept fixed. The clustering coefficient of a node is, in simple words, the ratio of the existing edges between the neighbours of this node to all the possible edges between these neighbours.

    In this figure, the clustering coefficient of A is 1/3: there are 3 possible edges between the neighbours of A (X-Y, Y-Z, Z-X) and only one of them (Z-Y) is present.
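
    This can be checked numerically (a small sketch, not part of the original post): for an Erdős-Rényi graph the global clustering coefficient stays close to p.

    # global clustering coefficient of an Erdős-Rényi graph with N = 500 and p = 0.02
    n = 500
    p = 0.02
    A = matrix(rbinom(n*n, 1, p), n, n)
    A[lower.tri(A, diag = TRUE)] = 0
    A = A + t(A)                          # symmetric adjacency matrix, no self-loops
    deg = rowSums(A)
    triangles = sum(diag(A %*% A %*% A)) / 6
    triples = sum(deg * (deg - 1)) / 2    # connected triples (paths of length 2)
    3 * triangles / triples               # should be close to p = 0.02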




    The Watts-Strogatz model (small-world model).

    This model is really interesting: it assumes that you know a certain number of people (k) and that you are more likely to know your closest neighbours. The algorithm, though more complicated than the Erdős-Rényi model's, is simple. We have 3 parameters: the size of the population (N), the number of close neighbours (k) and a probability p. Each node starts linked to its k closest neighbours; each of these links is kept with probability (1-p), and every link that is not kept is rewired to a node chosen at random among the more distant neighbours.


    Because this model generates clusters of people who know each other, it is really easy to be linked indirectly (and within a very small number of steps) with anyone in the network. This is why we call this kind of model a small-world model. Of the three models described here, it is the closest to a realistic social network of friendships.

    The Barabási-Albert preferential attachment model.

    This model is computed with a recursive algorithm. Two parameters are needed: the initial number of nodes (n0) and the total number of nodes (N). At the beginning, every initial node (the first n0 nodes) knows all the other initial nodes; then we create the remaining nodes one by one. At its creation, a new node is linked randomly to an already existing node, with probability proportional to the number of edges that node already has. In other words, the more links you have, the more likely new nodes are to be linked to you.

    This model is really interesting: it is the model for any network following the idea of "the rich get richer". The more friends a node has, the more likely the new nodes are to be friends with it. This kind of model is relevant for the web. Indeed, the more famous a website is, the more likely it is to be linked to by other websites. For example, Google is very likely to be connected with many websites, while it is very unlikely that my little unknown blog is connected to many websites.


    The code (R):

    ###############################################################
    # ER model
    ###############################################################

    generateER = function(n = 100, p = 0.5){
      map = diag(rep(1, n))
      link = rbinom(n*(n-1)/2, 1,p)
      t = 1
      for(j in 2:n){
        for(i in 1:(j-1)){
          map[i,j] = link[t]
          t = t + 1
        }
      }
      return(map)
    }


    ###############################################################
    # WS model
    ###############################################################
    f = function(j, mat){
      return(c(mat[1:j, j], mat[j,(j+1):length(mat[1,])]))
    }

    g = function(j, mat){
      k = length(mat[1,])
      a = matrix(0, nrow = 2, ncol = k)
      if(j>1){
        for(i in 1:(j-1)){
          a[1,i] = i
          a[2,i] = j
        }
      }
      if(j<k){
        for(i in (j+1):k){
          a[1,i] = j
          a[2,i] = i
        }
      }
      a = a[,-j]
      return(a)
    }
    # g(1, map)   # (example call; requires a matrix called map to exist)
    callDiag = function(j, mat){
      return(c(diag(mat[g(j,mat)[1, 1:(length(mat[1,])-1)], g(j,mat)[2, 1:(length(mat[1,])-1)]])))
    }

    # which(callDiag(4, matrix(runif(20*20), 20, 20)) < 0.1)   # (example call to test callDiag)

    generateWS = function(n = 100, k = 4 , p = 0.5){
      map = matrix(0,n,n)
      down = floor(k/2)
      up = ceiling(k/2)
      for(j in 1:n){
          map[(((j-down):(j+up))%%n)[-(down + 1)],j] = 1
      }
      map = map|t(map)*1
     
      for(j in 2:n){
        list1 = which(map[(((j-down):(j))%%n),j]==1)
        listBusy = which(map[(((j-down):(j))%%n),j]==1)
        for(i in 1:(j-1)){
          if((j-i<=floor(k/2))|(j-i>= n-1-up)){
            if(rbinom(1,1,p)){
              map[i,j] = 0
              samp = sample(which(callDiag(j, map) == 0), 1)
              map[g(j, map)[1, samp], g(j, map)[2, samp]] = 1
            }
          }
        }
      }
     
      return(map*1)
    }


    ###############################################################
    # BA model
    ###############################################################

    generateBA = function(n = 100, n0 = 2){
      mat = matrix(0, nrow= n, ncol = n)
      for(i in 1:n0){
        for(j in 1:n0){
          if(i != j){
            mat[i,j] = 1
            mat[j,i] = 1
          }
        }
      }
      for(i in n0:n){
        list = c()
        for(k in 1:(i-1)){
          list = c(list, sum(mat[,k]))
        }
        link = sample(c(1:(i-1)), size = 1, prob = list)
        mat[link,i] = 1
        mat[i,link] = 1
      }
      return(mat)
    }


    ###############################################################
    # Graphs
    ###############################################################

    image(generateER(500))
    image(generateWS(500))
    image(generateBA(500))
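
    As an extra check (not in the original post), the degree distribution of the Barabási-Albert graph has a heavy tail, with a few highly connected hubs, unlike the other two models:

    # degrees of the BA graph: a few hubs and many low-degree nodes
    mat = generateBA(500)
    hist(colSums(mat), breaks = 50, main = "Degree distribution (BA model)", xlab = "degree")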

  5. The function apply() is certainly one of the most useful functions in R. I was scared of it for a while and refused to use it. But it makes the code so much faster to write and so efficient that we can't afford not to use it. If you are like me and refuse to use apply() because it is scary, read the following lines, they will help you. Do you want to know how to use apply() in general, with a home-made function, or with several parameters? Then have a look at the following examples.




    The function apply() is known to be extremely useful for improving the speed of our code when we want to apply a function over a matrix or a vector. I used it for a long time without really knowing how it works.

    What is it?

    apply() is an R function which enables quick operations on matrices, vectors or arrays. The operations can be done over the rows, the columns or both.

    How does it work?

    The pattern is really simple: apply(variable, margin, function).
    - variable is the variable you want to apply the function to.
    - margin specifies whether you want to apply the function by row (margin = 1), by column (margin = 2), or to each element (margin = 1:2). Margin can even be greater than 2 if we work with variables of dimension greater than two.
    - function is the function you want to apply to the elements of your variable.

    Because I think examples are clearer than anything else, here are the most important examples for understanding the function apply().


    Introduction:

    #the matrix we will work on:
    a = matrix(c(1:15), nrow = 5 , ncol = 3)

    #will apply the function mean to all the elements of each row
    apply(a, 1, mean)
    # [1]  6  7  8  9 10

    #will apply the function mean to all the elements of each column
    apply(a, 2, mean)
    # [1]  3  8 13

    #will apply the function mean to all the elements of each column and of each row, ie each element
    apply(a, 1:2, mean)
    #     [,1] [,2] [,3]
    # [1,]    1    6   11
    # [2,]    2    7   12
    # [3,]    3    8   13
    # [4,]    4    9   14
    # [5,]    5   10   15

    We have just worked on the different margins to show the basic possibilities. But as I said we can also work on other variables such as an array of dimension 3:

    #apply() for array of other dimension :
    a = array(1:8, dim = c(2,2,2))
    apply(a, 3, sum)

    # [1] 10 26   (the sums of the two 2x2 slices: 1+2+3+4 and 5+6+7+8)


    Use a home-made function:

    We can also use our own function. For example, we reproduce the function sum (absolutely useless but let's keep it simple!).

    f1 = function(x){
      return(sum(x))
    }

    # back to the 5x3 matrix from the introduction
    a = matrix(c(1:15), nrow = 5 , ncol = 3)
    apply(a, 1, f1)
    # [1] 18 21 24 27 30

    Several parameters:
    A function with several parameters. Here is the main reason why I never used apply(): I did not know what to do when my function had several parameters. It is actually quite simple: we just have to specify, as an extra named argument, the parameter which is held constant.


    f2 = function(x1,x2){
      x1-x2
    }
    b = 2

    #with the second parameter as an option
    apply(a,1,f2,x2=b)

    #       [,1] [,2] [,3] [,4] [,5]
    # [1,]   -1    0    1    2    3
    # [2,]    4    5    6    7    8
    # [3,]    9   10   11   12   13

    #with the first parameter as an option
    apply(a,1,f2,x1=b)

    #      [,1] [,2] [,3] [,4] [,5]
    # [1,]    1    0   -1   -2   -3
    # [2,]   -4   -5   -6   -7   -8
    # [3,]   -9  -10  -11  -12  -13

    I hope these examples will help you. Personally, I have been using the function apply() ever since I went through these few examples. Of course there are many other details which can be useful, but these first examples cover the main possibilities of apply().


  6. If you want to choose your next holiday destination randomly, you are likely to proceed in a way which is biased, especially if you pick the latitude and the longitude uniformly at random. A bit like they do in this lovely advert (for those of you who do not speak French, it is about a couple who have won the national lottery and have to decide on their next trip; the husband randomly picks Australia and the wife complains: "Not again!"). So let me help you choose your next destination!


    If we are able to generate uniformly distributed variables on [0 ; 1], we can easily generate variables on other spaces such as [0 ; 1] x [0 ; 1], the square of side 1. It can simply be done by using two independent variables X1 and X2, uniformly distributed on [0 ; 1], and by considering X = (X1 , X2). Then X is a random variable uniformly distributed on the square [0 ; 1] x [0 ; 1].

    However, this generation may be a bit more complicated if we work with more complicated shapes, such as a sphere. Indeed, for a sphere of a given radius, say R = 1, we would like to map the variables of a square onto the sphere.

    The first method I thought of is wrong. I was tempted to generate two variables Y1 and Y2, where Y1 follows a uniform distribution on [0 ; Pi] and Y2 a uniform distribution on [-Pi ; Pi]. In other words:
    - Y1 = Pi * X1
    - Y2 = 2 Pi * (X2 - 0.5)
    Then I wanted to consider (Y1, Y2) as the spherical coordinates of a point on the sphere.

    However, this method does NOT generate a uniform distribution on the sphere. Indeed, it tends to over-generate points near the North and South Poles. The reason is simple: this method generates, on average, the same number of points at every latitude, but the latitude bands near the poles have a much smaller area than those near the Equator. The density of points is therefore higher near the poles and lower near the Equator.

    On the following graph, we observe a high density of points in the "middle" of the sphere: this is the North Pole. It shows that this method does not give a uniform distribution. By the way, the graph is computed with the library rgl, which provides a display device for 3-dimensional objects. A little code provided in the rgl help then rotates the device automatically and takes a snapshot of each view. You can finally turn the snapshots into a GIF on gifmake (like for the map in Spatial segregation in cities - An explanation by a neural network model).
    A wrong method to generate uniformly distributed variables on the sphere: the North Pole and the South Pole are over-represented.

    Many methods have been proposed to generate uniformly distributed random variables on a sphere. We present one of them here. We consider the couple z = (u, v) defined as:

    - u = 2 * Pi * X1
    - v = arccos(2 * (X2 - 0.5))

    In this case, a theorem shows that z = (u , v) gives uniformly distributed points on the sphere. This can be observed on the next graph: there is no irregularity in the distribution of the points.
    A correct simulation of a uniform distribution on a sphere: there is no over-represented area.

    Other methods exist. I like this one since it is really simple and starts from a uniform distribution. The idea of this post is to show that generating a uniform distribution can be adapted to many shapes and cases. However, to do so, a prior analytical study has to be done to find the correct transformation.

    So this method will help you avoid ending up too often in chilly places such as the North Pole and the South Pole, since it does not over-represent the extreme latitudes.

    The program (R):

    # import package to plot in 3D
    install.packages("rgl", dependencies = TRUE)
    library(rgl)

    ################################################################
    # Uniform distribution in a square
    ################################################################

    size = 10000
    x1 = runif(size)
    x2 = runif(size)

    # the option pch = '.' change the symbol for the graph into dot.
    # cex = 2 doubles the size of the dots
    plot(x1,x2, col = 'blue', pch = '.', cex = 2)

    ################################################################
    # Wrong solution for the sphere
    ################################################################

    y1 = pi * x1
    y2 = 2* pi * (x2-0.5)
    y = matrix(0, nrow = 2, ncol = size)
    y[1,] = y1
    y[2,] = y2
    plot(y1, y2)

    # This function transforms the spherical coordinates into cartesian coordinates
    sphereToCartesian = function(matrice){
      x= matrix(0,nrow = 3, ncol = length(matrice[1,]))
      x[1,] = sin(matrice[2,]) * cos(matrice[1,])
      x[2,] = sin(matrice[2,]) * sin(matrice[1,])
      x[3,] = cos(matrice[2,])
      return(x)
    }

    a = sphereToCartesian(y)
    plot3d(x = a[1,], y = a[2,], z = a[3,])

    #you should enlarge the device window, before running this, if you want to have a meaningful graph
    rgl.bringtotop()
    rgl.viewpoint(0,20)

    for (i in 1:45) {
      rgl.viewpoint(i,20)
      filename <- paste("pic",i,".png",sep="")
      rgl.snapshot(filename, fmt="png")
    }


    ################################################################
    # Correct solution for the sphere
    ################################################################

    uniformSphere = function(length){
      x1 = runif(length, 0,1)
      x2 = runif(length, 0,1)
      u = 2*pi*x1
      v = acos(2*x2- 1)
      z =matrix(0, ncol = length, nrow = 2)
      z[1,] = u
      z[2,] = v
      return(z)
    }

    z = uniformSphere(size)
    b = sphereToCartesian(z)
    plot3d(x = b[1,], y = b[2,], z = b[3,])
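
    A quick extra check (not in the original post): for a uniform distribution on the sphere, the z-coordinate is itself uniform on [-1 ; 1] (a consequence of Archimedes' theorem), whereas the wrong method piles points up near the poles z = 1 and z = -1.

    hist(b[3,], breaks = 50, main = "z-coordinate, correct method", xlab = "z")   # roughly flat
    hist(a[3,], breaks = 50, main = "z-coordinate, wrong method", xlab = "z")     # peaks at -1 and 1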



  7. My previous post is about a method to simulate a Brownian motion. A friend of mine emailed me yesterday to tell me that this is useless if we do not know how to simulate a normally distributed variable.

    My first remark is: use the rnorm() function if the quality of your simulation is not too critical (later, I'll try to explain why R's default random generation functions are not perfect). However, it may be fun to generate a normal distribution from a simple uniform distribution. So, yes, I lied, I won't create the variable from scratch but from a uniform distribution.

    The method proposed here is really easy to implement, and this is why I think it is a really good one. Besides, the result is far from trivial and is really unexpected. This method is called the Box-Muller method. You can find the proof of this method here. The proof is not very complicated; however, you will need a bit of mathematical background to understand it.

    Let u and v be two independent variables, uniformly distributed on [0, 1]. Then we can define:

    x = sqrt(-2 log(u)) sin(2 Pi v)
    y = sqrt(-2 log(u)) cos(2 Pi v)

    x and y are two independent, normally distributed variables. The interest of this method is its extreme simplicity in terms of programming (we only need 9 lines if we don't test the normality of the new variables or plot the estimated density).

    We obtain a vector of normally distributed variables. The Lilliefors test does not reject the null hypothesis of a normal distribution. Besides, we can plot the estimated density of the variables: we obtain the following plot, which indeed looks similar to the Gaussian density.



    The program (R):

     # import the library to test the normality of the distribution
    library(nortest)

    size = 100000

    u = runif(size)
    v = runif(size)

    x=rep(0,size)
    y=rep(0,size)

    for (i in 1:size){
      x[i] = sqrt(-2*log(u[i]))*cos(2*pi*v[i])
      y[i] = sqrt(-2*log(u[i]))*sin(2*pi*v[i])
    }

    #a test for normality
    lillie.test(c(x,y))

    #plot the estimation of the density
    plot(density(c(x,y)))
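
    Since all the operations involved are vectorised in R, the loop is not actually needed; an equivalent, shorter variant (not in the original post):

    x = sqrt(-2*log(u))*cos(2*pi*v)
    y = sqrt(-2*log(u))*sin(2*pi*v)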



  8. The Brownian motion is certainly the most famous stochastic process (a random variable evolving in time). It was the first model used for stock prices (Louis Bachelier's thesis in 1900).


    The reason why is easy to understand: a Brownian motion path looks graphically very similar to the historical price of a stock.
    A Brownian motion generated with R (n = 1000)


    The historical price of the index FTSE MID 250, source Yahoo Finance.



    Even though it is no longer considered the true distribution of stock prices, it is still used in finance, and methods to simulate a Brownian motion are really useful. As you will see, the code is strikingly easy. We use Donsker's invariance principle.

    If Y(1), ..., Y(k) is a series of k independent, normally distributed variables, then the partial sums X(i) = (Y(1) + ... + Y(i)) / sqrt(k) give a proxy of a Brownian motion on [0, 1], evaluated at time i/k.

    The code (R):

    size = 1000
    y = rnorm(size)
    x = y

    for (i in 1:size){
      x[i] = sum(y[1:i])/sqrt(size)
    }

    plot(x, type = 'l', xlab = 'Time', ylab = 'Profit')
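
    Note that the partial sums can also be computed in one go with cumsum(), which avoids the loop (an equivalent variant, not in the original post):

    x = cumsum(y)/sqrt(size)   # same result as the loop above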

  9. Merging two insurance companies makes it possible to curb the probability of ruin by sharing the risk and the capital of the two companies.

    For example, we can consider two insurance companies, A and B. A is a well-known insurance company with a large capital, dealing with a risk of low variance. We will assume that the aggregate daily claims of all its customers follow a chi-square distribution with one degree of freedom. Besides, we assume that the initial capital of A is 3 and the daily premium charged to its customers is 1.05. The reason why the premium is 1.05 is a bit intricate and not of great importance for this post; I may explain it later, in another post. The idea is that A charges a premium slightly above the expectation of a chi-square distribution with one degree of freedom (which is 1).

    The second firm, B, is in a very different situation. It is a very new insurance company with a very low capital (let's say 0.1). Its customers have claims which follow a chi-square distribution with two degrees of freedom, which means that the insured product is riskier and the variance of the risk is higher too. B is likely to charge its customers a higher premium than A: B has a low capital, and both the expectation and the variance of the claims are higher. So we consider a premium of 3.

    To sum up:

             Capital    Premium           Claims
    Firm A   k1 = 3     premium1 = 1.05   x1 ~ χ²(1)
    Firm B   k2 = 0.1   premium2 = 3      x2 ~ χ²(2)


    Finally, we make one last assumption: the independence of the claims. This assumption, which seems reasonable, is not such a given in real insurance problems and is a real source of interesting issues.

    Every day, firms A and B receive their premiums and pay the claims of their customers.

    The probability of ruin is the measure of interest in risk management and in the current regulatory framework being developed for insurance (Solvency II). Over a given number of days, company A is in a ruin situation if there has been at least one day on which the total amount of claims exceeded the sum of the initial capital and the total amount of premiums received so far. To estimate such a probability we can easily (at least in this case) simulate the model: the probability of ruin is the probability of being in such a situation. For example, in the next plot, we can see that in one particular simulation, firm A was ruined as early as day 7.

    To estimate the probability of ruin over a given number of days, we use Monte Carlo simulation. For 100 days with a sample of 100,000 simulations, the estimated probability of ruin of company A is 0.66, and of company B 0.41. What is interesting is that if the two companies merged, that is, if they shared their claims, their premiums and their initial capital, the estimated probability of ruin of the merged company would be 0.23. We can see that merging the companies reduces the probability of ruin.

    A limit could be pointed out. The problem with merging two companies is that if the merged company is ruined, it is a very serious event. Indeed, it may be considered better to have one company ruined often, and very rarely both of them ruined at the same time, than a situation where the merged company is sometimes ruined... In our particular case, and this is why I chose two very different profiles of firms, this is not a relevant issue: the estimated probability of simultaneous ruin of both A and B is 0.26, which is slightly greater than the estimated ruin probability of the merged company.

    The last plot is an example of a situation where the merged company would not have been ruined, while both A and B would have been ruined within 100 days.

    The red line is company B, the black line is company A and the purple line is the merged company. Both A and B have been ruined, while the merged company has not.

    The code (R):

    premium1 = 1.05
    premium2 = 3

    horizon = 100
    length = 100000
    # sum1 is the number of times company 1 is ruined
    sum1 = 0
    # sum2 is the number of times company 2 is ruined
    sum2 = 0
    # sum3 is the number of times companies 1 and 2 are both ruined
    sum3 = 0
    # sum is the number of times the merged company is ruined
    sum = 0

    for(j in 1:length){
      k1 = rep(0, horizon+1)
      k2 = rep(0, horizon+1)
      k = rep(0, horizon+1)
      k1[1] = 3
      k2[1] = 0.1
      k[1] = k1[1]+k2[1]
      x1 = rchisq(horizon, 1)
      x2 = rchisq(horizon, 2)
     
      ruinX1 = FALSE
      ruinX2 = FALSE
      ruinX = FALSE
     
      for(i in 1:horizon){
        k1[i+1] = premium1 + k1[i] - x1[i]
        k2[i+1] = premium2 + k2[i] - x2[i]
        k[i+1] = premium1 + premium2 + k[i] - x1[i] - x2[i]
        if(k1[i+1]<0){
          ruinX1 = TRUE
        }
        if(k2[i+1]<0){
          ruinX2 = TRUE
        }
        if(k[i+1]<0){
          ruinX = TRUE
        }
      }
      if(ruinX1 & ruinX2){
        sum3 = sum3 + 1
        sum1 = sum1 + 1
        sum2 = sum2 + 1
      }
      else if(ruinX1){
        sum1 = sum1 + 1
      }
      else if(ruinX2){
        sum2 = sum2 + 1
      }
      if(ruinX){
        sum = sum + 1
      }
    }

    # prob of ruin of A
    prob1 = sum1 / length
    # prob1
    # prob of ruin of B
    prob2 = sum2 / length
    # prob2
    # prob of ruin of the merged firm
    prob = sum / length
    # prob
    # prob of ruin of A AND B
    prob3 = sum3 / length
    # prob3

    plot(k1, type = 'l', xlab = 'days', ylab = 'available capital', ylim =c(min(k2, k, k1),max(k2, k, k1)))
    lines(k2, col = 'red')
    lines(k, col = 'purple')
    abline(0, 0)
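
    A possible refinement (not in the original post) is to attach a 95% confidence interval to these Monte Carlo estimates, using the usual binomial approximation; for instance for the ruin probability of firm A:

    # 95% confidence interval for the estimated ruin probability of firm A
    prob1 + c(-1, 1) * 1.96 * sqrt(prob1 * (1 - prob1) / length)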
