I have already talked about networks a few times on this blog. In particular, I used this approach to explain spatial segregation in a city and to solve the Guess Who? problem. However, one open question is how to generate a good network. I aim to study strategies to split a network, but I first need a realistic social network to work with. I could have downloaded network data, but I would rather study the different models proposed to generate such networks.



I will explain and generate the three most famous models of social networks:
- The Erdős-Rényi model;
- The Watts and Strogatz model (small world model);
- The Barabási-Albert preferential attachment model.

We represent each model with a matrix of acquaintance. The intersection of column i and row j is 1 if and only if node i and node j know each other. Since we simulate reciprocal social networks (i.e. if i knows j then j knows i), we can work on an upper triangular matrix and not worry about the lower triangle. Here, I use the R function image() to represent these matrices. The 0s are in red, the 1s in white.
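
To make this representation concrete, here is a minimal sketch on a made-up network of 4 nodes (the matrix adj and its links are purely illustrative). Only the upper triangle is filled; adding the transpose recovers the full symmetric matrix if it is ever needed.

```r
# Hypothetical network: node 1 knows nodes 2 and 3, node 3 knows node 4.
adj = matrix(0, nrow = 4, ncol = 4)
adj[1, 2] = 1
adj[1, 3] = 1
adj[3, 4] = 1

# The upper triangle carries all the information of a reciprocal network.
full = adj + t(adj)      # symmetric version of the matrix
degree = rowSums(full)   # number of acquaintances of each node

# image(adj) would display the matrix in the same way as the figures of this post.
```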

The Erdős-Rényi model.

This model is certainly the simplest of the three. Only two parameters are required: N, the number of nodes, and p, the probability that any given pair of nodes is linked by an edge.
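
Since every pair of nodes is decided by its own coin flip, the upper triangle can be filled in a single call. This is only a sketch under my own assumptions: erSketch is a name I made up, and it leaves the diagonal at 0. On average, the number of links should be close to p * N * (N-1) / 2.

```r
# Sketch of an Erdos-Renyi generator (erSketch is a hypothetical name).
erSketch = function(n = 100, p = 0.5){
  adj = matrix(0, n, n)
  # One Bernoulli(p) draw for each of the n*(n-1)/2 pairs of distinct nodes
  adj[upper.tri(adj)] = rbinom(n * (n - 1) / 2, 1, p)
  adj
}

set.seed(1)
adj = erSketch(200, 0.1)
edges = sum(adj)                   # observed number of links
expected = 0.1 * 200 * 199 / 2     # p * N * (N-1) / 2 = 1990
```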

This model assumes that the existence of a link between two nodes is independent of the other links of the graph. According to Daniel A. Spielman, this model was not created to represent any realistic graph. However, it has some very interesting properties. The average path length is of the order of log(N), which is relatively short.
Besides, the expected clustering coefficient is equal to p, since two neighbors of a node are linked with probability p like any other pair. So if p decreases as N grows (for example when the average number of links per node stays fixed), the clustering coefficient converges toward 0. The clustering coefficient of a node is, in simple words, the ratio of the number of existing edges between the neighbors of this node to the number of possible edges between these neighbors.

In this figure, the clustering coefficient of A is 1/3: there are 3 possible edges between the neighbors of A (X-Y, Y-Z, Z-X) and only one of them (Y-Z) exists.
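
The computation for the node A can be sketched as follows. The helper clusteringCoef is a name I introduce for illustration; it assumes a symmetric 0/1 matrix and returns NA for a node with fewer than two neighbors.

```r
# Local clustering coefficient of node v in a symmetric 0/1 matrix adj
# (clusteringCoef is a hypothetical helper name).
clusteringCoef = function(adj, v){
  nb = setdiff(which(adj[v, ] == 1), v)   # neighbors of v, ignoring the diagonal
  d = length(nb)
  if(d < 2) return(NA)                    # undefined with fewer than 2 neighbors
  existing = sum(adj[nb, nb][upper.tri(diag(d))])   # edges among the neighbors
  possible = d * (d - 1) / 2
  existing / possible
}

# The example of the figure: A knows X, Y and Z; only Y and Z know each other.
nodes = c("A", "X", "Y", "Z")
adj = matrix(0, 4, 4, dimnames = list(nodes, nodes))
adj["A", c("X", "Y", "Z")] = 1
adj["Y", "Z"] = 1
adj = adj + t(adj)        # make the matrix symmetric
coefA = clusteringCoef(adj, 1)   # 1/3
```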




The Watts and Strogatz model (small world model).

This model is really interesting: it assumes that you know a certain number of people (k) and that you are more likely to know your closest neighbors. The algorithm, though more complicated than the Erdős-Rényi one, is simple. There are three parameters: the size of the population (N), the number of close neighbors (k) and a probability p. For any node and for each of its close neighbors, the probability of being linked with it is (1-p). For every close neighbor it is not linked with, we randomly choose another link among the more distant nodes.
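
The starting point of the algorithm, before any link is replaced, is a ring where every node knows its k closest neighbors. Here is a minimal sketch of that first step only. The name ringLattice is mine, and I assume that k is even and smaller than the number of nodes.

```r
# Ring lattice: node j is linked to its k/2 previous and k/2 next nodes
# (ringLattice is a hypothetical name; k is assumed even and smaller than n).
ringLattice = function(n = 10, k = 4){
  adj = matrix(0, n, n)
  for(j in 1:n){
    for(m in 1:(k / 2)){
      nb = ((j + m - 1) %% n) + 1   # m-th next node, wrapped into 1..n
      adj[j, nb] = 1
      adj[nb, j] = 1                # reciprocal link
    }
  }
  adj
}

adj = ringLattice(10, 4)
degree = rowSums(adj)   # every node has exactly k = 4 neighbors
```

The "+ 1" after the modulo matters in R: indices start at 1, and a plain %% n would map the last node to index 0.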


Because this model generates clusters of people who know each other, together with a few links to distant nodes, it is really easy to be linked indirectly (in very few steps) with anyone in the map. This is why we call this kind of model a small world model. Of the three models described here, it is the closest to a realistic social network of friendship.
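
To see how a single long-range link shortens the paths, here is a small sketch that measures the number of steps between nodes with a breadth-first search. The helper bfsDist and the 6-node ring are my own illustrative choices, not part of the model.

```r
# Shortest path lengths (in steps) from node s, by breadth-first search.
# adj is a symmetric 0/1 matrix; bfsDist is a hypothetical helper name.
bfsDist = function(adj, s){
  dist = rep(NA, nrow(adj))
  dist[s] = 0
  frontier = s
  while(length(frontier) > 0){
    # Nodes adjacent to at least one node of the current frontier
    reach = which(colSums(adj[frontier, , drop = FALSE]) > 0)
    nxt = reach[is.na(dist[reach])]     # keep only the unvisited ones
    dist[nxt] = dist[frontier[1]] + 1   # all frontier nodes share one distance
    frontier = nxt
  }
  dist
}

# A ring of 6 nodes where each node knows its two direct neighbors.
n = 6
ring = matrix(0, n, n)
for(j in 1:n){
  nb = (j %% n) + 1
  ring[j, nb] = 1
  ring[nb, j] = 1
}
avgRing = mean(bfsDist(ring, 1)[-1])           # 1.8: distances are 1, 2, 3, 2, 1

# One shortcut between nodes 1 and 4 shortens the paths.
shortcut = ring
shortcut[1, 4] = 1
shortcut[4, 1] = 1
avgShortcut = mean(bfsDist(shortcut, 1)[-1])   # 1.4: distances are 1, 2, 1, 2, 1
```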

The Barabási-Albert preferential attachment.

This model is built with an iterative algorithm. Two parameters are needed: the initial number of nodes (n0) and the total number of nodes (N). At the beginning, every initial node (the first n0 nodes) knows all the other initial nodes. Then we create the remaining nodes one by one. When a new node is created, it is linked randomly to an already existing node. The probability that the new node is linked to a given node is proportional to the number of edges this node already has. In other words, the more links you have, the more likely new nodes are to be linked to you.
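
A single attachment step can be sketched as follows. The degrees below are made-up values, chosen only to show how the probabilities are obtained.

```r
# Number of links of the existing nodes in a made-up situation.
degree = c(3, 1, 1, 1)

# Probability that the new node attaches to each existing node.
prob = degree / sum(degree)   # 0.5 for node 1, 1/6 for each of the others

# One attachment step. sample() rescales the weights itself,
# so passing `degree` directly would be equivalent.
set.seed(42)
target = sample(seq_along(degree), size = 1, prob = prob)
```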

This model is really interesting: it is the model for any social network following the idea of "rich get richer". The more friends a node has, the more likely new nodes are to become friends with it. This kind of model is relevant for the web. Indeed, the more famous a website is, the more likely it is to be known by other websites. For example, Google is very likely to be connected with many websites, while it is very unlikely that my little unknown blog is connected to many websites.
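
The "rich get richer" effect shows up when counting the links of each node. Below is a compact sketch under my own assumptions: baSketch is a hypothetical name, it requires n0 >= 2 and it adds exactly one link per new node.

```r
# Compact preferential-attachment sketch (baSketch is a hypothetical name;
# it assumes n0 >= 2 and adds one link per new node).
baSketch = function(n = 200, n0 = 2){
  adj = matrix(0, n, n)
  adj[1:n0, 1:n0] = 1          # the initial nodes all know each other
  diag(adj) = 0
  for(i in n0 + seq_len(n - n0)){
    # Number of links of each existing node, used as sampling weights
    degree = colSums(adj[, 1:(i - 1), drop = FALSE])
    target = sample.int(i - 1, size = 1, prob = degree)
    adj[target, i] = 1
    adj[i, target] = 1
  }
  adj
}

set.seed(1)
adj = baSketch(200)
degree = rowSums(adj)
degreeCounts = table(degree)   # how many nodes have 1 link, 2 links, ...
maxDegree = max(degree)        # the biggest hub
```

In such a run, I would expect most nodes to keep a single link while a handful of early nodes collect many.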


The code (R):

###############################################################
# ER model
###############################################################

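generateER = function(n = 100, p = 0.5){
  # The diagonal is set to 1: every node knows itself
  map = diag(rep(1, n))
  # One Bernoulli(p) draw per pair of distinct nodes
  link = rbinom(n*(n-1)/2, 1,p)
  t = 1
  # Fill the upper triangle column by column
  for(j in 2:n){
    for(i in 1:(j-1)){
      map[i,j] = link[t]
      t = t + 1
    }
  }
  return(map)
}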


###############################################################
# WS model
###############################################################
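# Entries of the upper triangle involving node j, for j < n
# (helper not used by the generators below)
f = function(j, mat){
  return(c(mat[1:j, j], mat[j,(j+1):length(mat[1,])]))
}

# Coordinates (row, column) in the upper triangle of every pair
# made of node j and another node: a 2 x (n-1) matrix
g = function(j, mat){
  k = length(mat[1,])
  a = matrix(0, nrow = 2, ncol = k)
  if(j>1){
    for(i in 1:(j-1)){
      a[1,i] = i
      a[2,i] = j
    }
  }
  if(j<k){
    for(i in (j+1):k){
      a[1,i] = j
      a[2,i] = i
    }
  }
  a = a[,-j]
  return(a)
}

# Values of mat for every pair returned by g(j, mat)
callDiag = function(j, mat){
  return(mat[t(g(j, mat))])
}

# Quick check: positions of the pairs of node 4 whose value is below 0.1
which(callDiag(4,matrix(runif(20*20),20,20)) <0.1)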

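generateWS = function(n = 100, k = 4 , p = 0.5){
  map = matrix(0,n,n)
  down = floor(k/2)
  up = ceiling(k/2)
  # Ring lattice: node j is linked to its `down` previous and `up` next nodes.
  # Indices are wrapped into 1..n (a plain %% n would turn node n into index 0).
  for(j in 1:n){
      map[((((j-down):(j+up)) - 1) %% n + 1)[-(down + 1)],j] = 1
  }
  map = (map|t(map))*1
 
  # Rewiring, on the upper triangle only: each lattice link (i,j) with i < j
  # is removed with probability p and replaced by a link of j that is absent.
  for(j in 2:n){
    for(i in 1:(j-1)){
      if((j-i<=up)|(j-i>= n-up)){
        if(rbinom(1,1,p)){
          map[i,j] = 0
          free = which(callDiag(j, map) == 0)
          # Index into `free` explicitly: sample(x, 1) misbehaves when x has length 1
          samp = free[sample.int(length(free), 1)]
          map[g(j, map)[1, samp], g(j, map)[2, samp]] = 1
        }
      }
    }
  }
 
  return(map)
}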


###############################################################
# BA model
###############################################################

generateBA = function(n = 100, n0 = 2){
  mat = matrix(0, nrow= n, ncol = n)
  for(i in 1:n0){
    for(j in 1:n0){
      if(i != j){
        mat[i,j] = 1
        mat[j,i] = 1
      }
    }
  }
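  # The new nodes n0+1, ..., n are added one by one (this assumes n0 >= 2)
  for(i in n0 + seq_len(n - n0)){
    # Number of links of each existing node, used as sampling weights
    degree = colSums(mat[, 1:(i-1), drop = FALSE])
    link = sample(c(1:(i-1)), size = 1, prob = degree)
    mat[link,i] = 1
    mat[i,link] = 1
  }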
  return(mat)
}


###############################################################
# Graphs
###############################################################

image(generateER(500))
image(generateWS(500))
image(generateBA(500))