Betweenness centrality is a slow calculation. The algorithm used by networkx is O (VE) where V is the number of vertices and E the number of edges. In your case VE = 10^13. I expect importing the graph to take O (V+E) time, so if that is taking long enough that you can tell it's not instantaneous, then O (VE) is going to be painful.
Example 1.3 Betweenness and Closeness Centrality for Computer Network Topology
Consider a small network of 10 computers spread out across an office. Let a node represent a computer, and let a link represent a direct connection between the machines. For this example, consider the links as Ethernet connections that enable data to transfer between computers. If two computers are not connected directly, then the information must flow through other connected machines. Consider a topology as shown in Figure 1.142. This is an example of the well-known kite network, which was popularized by David Krackhardt (1990) for better understanding of social networks in the workplace.
Figure 1.142: Office Computer Network
Define the link data set as follows:
To better understand the topology of the computer network, calculate the degree, closeness, and betweenness centrality. It is also interesting to look for articulation points in the computer network to identify places of vulnerability. All of these calculations can be done in one call to PROC OPTGRAPH as follows:
Output 1.3.1 shows the resulting node data set
NodeSetOut
sorted by closeness.
Output 1.3.1: Node Closeness and Betweenness Centrality, Sorted by Closeness
node | centr_degree_out | centr_close_unwt | centr_between_unwt | artpoint |
---|---|---|---|---|
C | 5 | 0.64286 | 0.23148 | 0 |
F | 5 | 0.64286 | 0.23148 | 0 |
D | 6 | 0.60000 | 0.10185 | 0 |
H | 3 | 0.60000 | 0.38889 | 1 |
B | 4 | 0.52941 | 0.02315 | 0 |
E | 4 | 0.52941 | 0.02315 | 0 |
A | 3 | 0.50000 | 0.00000 | 0 |
G | 3 | 0.50000 | 0.00000 | 0 |
I | 2 | 0.42857 | 0.22222 | 1 |
J | 1 | 0.31034 | 0.00000 | 0 |
Output 1.3.2 shows the resulting node (
NodeSetOut
) and link data sets (LinkSetOut
) sorted by betweenness.
Output 1.3.2: Node Closeness and Betweenness Centrality, Sorted by Betweenness
Obs | node | centr_degree_out | centr_close_unwt | centr_between_unwt | artpoint |
---|---|---|---|---|---|
1 | H | 3 | 0.60000 | 0.38889 | 1 |
2 | C | 5 | 0.64286 | 0.23148 | 0 |
3 | F | 5 | 0.64286 | 0.23148 | 0 |
4 | I | 2 | 0.42857 | 0.22222 | 1 |
5 | D | 6 | 0.60000 | 0.10185 | 0 |
6 | E | 4 | 0.52941 | 0.02315 | 0 |
7 | B | 4 | 0.52941 | 0.02315 | 0 |
8 | A | 3 | 0.50000 | 0.00000 | 0 |
9 | G | 3 | 0.50000 | 0.00000 | 0 |
10 | J | 1 | 0.31034 | 0.00000 | 0 |
Obs | from | to | biconcomp | centr_between_unwt |
---|---|---|---|---|
1 | H | I | 2 | 0.44444 |
2 | C | H | 3 | 0.29167 |
3 | F | H | 3 | 0.29167 |
4 | I | J | 1 | 0.25000 |
5 | E | F | 3 | 0.12963 |
6 | B | C | 3 | 0.12963 |
7 | A | C | 3 | 0.12500 |
8 | F | G | 3 | 0.12500 |
9 | C | D | 3 | 0.09259 |
10 | D | F | 3 | 0.09259 |
11 | A | D | 3 | 0.08333 |
12 | D | G | 3 | 0.08333 |
13 | C | F | 3 | 0.07407 |
14 | B | E | 3 | 0.07407 |
15 | B | D | 3 | 0.05093 |
16 | D | E | 3 | 0.05093 |
17 | A | B | 3 | 0.04167 |
18 | E | G | 3 | 0.04167 |
The computers with the highest closeness centrality are C and F, because they have the shortest paths to all other nodes. These computers are key to the efficient distribution of information across the network. Assuming that the entire office has some centralized data that should be shared with all computers, machines C and F would be the best candidates for storing the data on their local hard drives. The computer with the highest betweenness centrality is H. Although machine H has only three connections, it is one of the most important machines in the office because it serves as the only way to reach computers I and J from the other machines in the office. Notice also that machine H is an articulation point because removing it would disconnect the office network. In this setting, computers with high betweenness should be carefully maintained and secured with UPS (uninterruptible power supply) systems to ensure they are always online.
$begingroup$
I am doing some calculations about centrality in graphs, and I need to calculate Betweenness Centrality in all nodes in this graph. I only have the correct answers, not any way how to do it. Been searching around for like two days, but none of my calculations are correct. Could anyone help me on HOW to calculate this?
BetweennessCentrality GraphDon't mind the links out of the cloud, they are not counted as a part of the network.
CorneliaCornelia
$endgroup$