Social Network Analysis
Posted by Oleg Solovyev on Aug 14, 2011
One of the newest fields in data mining is Social Network Analysis (SNA). The task is to identify your friends (the first circle), then the friends of your friends (the second circle), and so on. Mathematicians call this building a graph made of nodes (the people) and edges (the ties between people).
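The idea of circles can be sketched with a breadth-first search over a small friendship graph. This is only an illustration: the `circles` function, the `friends` dictionary and the names in it are made up for the example and do not come from any real data set.

```python
from collections import deque


def circles(graph, start, max_depth=2):
    """Group people by their distance in hops from `start` (hypothetical helper)."""
    distance = {start: 0}
    queue = deque([start])
    result = {depth: set() for depth in range(1, max_depth + 1)}
    while queue:
        person = queue.popleft()
        if distance[person] == max_depth:
            continue  # do not look beyond the outermost circle
        for friend in graph.get(person, ()):
            if friend not in distance:
                distance[friend] = distance[person] + 1
                result[distance[friend]].add(friend)
                queue.append(friend)
    return result


# Made-up undirected friendship graph: each person maps to a set of friends.
friends = {
    "Ann": {"Bob", "Cid"},
    "Bob": {"Ann", "Dan"},
    "Cid": {"Ann", "Dan", "Eve"},
    "Dan": {"Bob", "Cid", "Fay"},
    "Eve": {"Cid"},
    "Fay": {"Dan"},
}

rings = circles(friends, "Ann")
print(sorted(rings[1]))  # first circle of Ann
print(sorted(rings[2]))  # second circle of Ann
```

Here Bob and Cid form Ann's first circle, Dan and Eve her second circle, and Fay is three steps away, so she falls outside both.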
For example, in telecom, graphs can be built from phone call data. The people you call are your first circle: relatives, colleagues or friends. You value those people and listen to their opinions. If one of your friends uses mobile internet, the telecom operator can offer this service to you with a high probability of purchase.
Social networks like Facebook can find out your first circle from your “friends list” or by monitoring the personal pages you visit. An ad you saw on Facebook may have been shown to you because one of your friends clicked on it earlier.
Social networks are also important in debt collection. Colleagues and friends can influence the debtor and persuade him to pay the debt. This is why banks and collection agencies actively collect contact information for your friends, neighbors and colleagues. Sometimes the debt is paid by the friends or relatives, not the debtor.
For software companies, their experts are the most valued asset. Every person in the company should have access to the experts' knowledge. The social network graph can show whether an expert is actively helping colleagues or is isolated from others. This graph can be built from data on internal mail and phone conversations.
For example, the graph above is based on the internet forum SQL.ru → OLAP and DWH. The nodes are the forum members, and an edge shows that one member took part in another member's thread. Every edge has a weight equal to the number of member A's posts in member B's threads plus the number of member B's posts in member A's threads.
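The weight rule can be sketched in a few lines of Python. This assumes the forum data is available as one (thread owner, poster) pair per post; that input format, the `edge_weights` function and the sample pairs are assumptions made for the example, not the original processing code.

```python
from collections import Counter


def edge_weights(posts):
    """Count posts between each pair of members.

    `posts` is an iterable of (thread_owner, poster) pairs, one per post.
    The edge is undirected, so A's posts in B's threads and B's posts in
    A's threads are added to the same counter.
    """
    weights = Counter()
    for owner, poster in posts:
        if owner == poster:
            continue  # a post in one's own thread creates no tie
        weights[frozenset((owner, poster))] += 1
    return weights


# Made-up sample: (thread owner, poster)
sample_posts = [
    ("A", "B"),  # B posts in A's thread
    ("A", "B"),
    ("B", "A"),  # A posts in B's thread
    ("A", "C"),
    ("C", "C"),  # C posts in C's own thread, ignored
]

weights = edge_weights(sample_posts)
for edge, weight in weights.items():
    print(sorted(edge), weight)
```

In this sample the edge between A and B gets weight 3 (two posts one way plus one the other way) and the edge between A and C gets weight 1.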
At first I made a list of all 3,000+ forum members and added the edges. The graph looked like a black spot on the monitor. I removed all edges with weights less than 10 and deleted all members left without edges. That is the last graph in the video. Then I kept deleting the edges with the smallest weights until only one edge was left. That is the first graph in the video. Finally I put the graphs into the video in reverse order, from the smallest graph to the biggest one.
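The pruning steps above can be sketched as follows. The function names, the sample weights and the choice to drop all edges tied at the current minimum in one step are assumptions made for illustration; with ties, the smallest frame may contain more than one edge.

```python
def prune(weights, min_weight):
    """Keep edges with weight >= min_weight and the members they still connect."""
    kept = {edge: w for edge, w in weights.items() if w >= min_weight}
    members = set().union(*kept)  # members left without edges disappear
    return kept, members


def frames(weights, min_weight=10):
    """Return the graphs for the video, ordered from smallest to biggest."""
    edges, _ = prune(weights, min_weight)
    result = [edges]
    while len(edges) > 1:
        lowest = min(edges.values())
        smaller = {edge: w for edge, w in edges.items() if w > lowest}
        if not smaller:
            break  # all remaining edges share the same weight
        edges = smaller
        result.append(edges)
    return result[::-1]  # reverse order: smallest graph first


# Made-up edge weights between hypothetical members a..e
sample = {
    frozenset(("a", "b")): 25,
    frozenset(("b", "c")): 14,
    frozenset(("c", "d")): 10,
    frozenset(("d", "e")): 3,
}

kept, members = prune(sample, 10)
video = frames(sample)
print(sorted(members))
print([len(frame) for frame in video])
```

With this sample the threshold of 10 removes the edge between d and e, so member e drops out, and the video frames grow from one edge to three.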
The video below is based on the forum SQL.ru → Просто треп (just chat).