Sunday, June 20, 2010

Detecting cohesive subgroups through SNA

Social Network Analysis (SNA) is a formal method, or rather a family of methods, that can be used to examine differentiated patterns of interaction between actors. Most social research methods work with attributional data. They measure the individual actor’s personal attributes (such as age, gender, or socio-economic status), then try to group together individuals possessing similar profiles of attributes, and conclude that social behavior is the result of these common attributes. By contrast, SNA works with relational data. It adopts as its unit of analysis the ties or relations between actors, and tries to explain social behavior as a result of the patterns of strong and weak ties between actors, and the resulting constraints on social behavior. Once ties are defined and measured, no assumption needs to be made about the spatial position or other characteristics of individual actors.

The fact that SNA only requires relational data made it well-suited to analyzing newsgroups, where personal attributes of participants are invisible and not readily disclosed. But one thing that was easy to observe in newsgroups is who posted to whom. Even after following a newsgroup for just a couple of days, patterns started to emerge: some people posted a lot, some people's posts drew many replies (for better or for worse), some people seemed to be ignored, etc. Since my research interest was to detect stable communities within newsgroups, I decided I would try to detect subsets of participants that displayed high levels of interaction among themselves. In other words, using SNA terminology, I was looking for cohesive subgroups within the total population of the newsgroup.

To do this, I needed a complete listing (or at least a large sample) of messages posted to the newsgroup, and I needed to record every instance of one actor replying to another's post. Fortunately, Usenet messages have a standard format that made this task relatively easy. I used the From and References headers of each message. The first told me who had posted the message; the second told me which message it was a follow-up to. Using a simple BASIC routine, I imported these headers into an Access database, and then used a query to obtain a listing of all From-To combinations. In effect, this listing was the social network that participants had created over a specific period of time through their online interactions. I then imported this into Ucinet, an SNA program, and was able to identify those people who formed the most cohesive subgroups in the 19 newsgroups I was focusing on: that is, who the most active members of the virtual community in each newsgroup were, and just how active they were.
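For readers who want to try the header-extraction step themselves, here is a minimal sketch in Python. It is not the BASIC routine or the Access query I actually used, just an illustration of the same idea. It assumes each message is available as raw text, that the last entry in the References header identifies the message being replied to, and that the Message-ID header can be used to look up who wrote that earlier message. The function name `reply_edges` and the sample messages are made up for the example.

```python
from collections import Counter
from email import message_from_string
from email.utils import parseaddr


def reply_edges(raw_messages):
    """Count (replier, original poster) pairs found in a list of raw messages.

    Hypothetical helper: assumes the last entry of References is the
    message being replied to, and resolves it to an author via Message-ID.
    """
    parsed = [message_from_string(raw) for raw in raw_messages]

    # First pass: remember who wrote each message, keyed by Message-ID.
    author_by_id = {}
    for msg in parsed:
        msg_id = (msg.get("Message-ID") or "").strip()
        if msg_id:
            author_by_id[msg_id] = parseaddr(msg.get("From", ""))[1].lower()

    # Second pass: turn each follow-up into a From-To pair.
    edges = Counter()
    for msg in parsed:
        refs = (msg.get("References") or "").split()
        if not refs:
            continue  # no References header: a thread starter, not a reply
        sender = parseaddr(msg.get("From", ""))[1].lower()
        target = author_by_id.get(refs[-1])  # None if the parent is not in the sample
        if sender and target and sender != target:
            edges[(sender, target)] += 1
    return edges


# Made-up sample: Ann starts a thread, Bob replies, Ann replies to Bob.
SAMPLE = [
    "From: Ann <ann@example.org>\nMessage-ID: <1@example.org>\n"
    "Subject: Hi\n\nFirst post.\n",
    "From: Bob <bob@example.org>\nMessage-ID: <2@example.org>\n"
    "References: <1@example.org>\nSubject: Re: Hi\n\nA reply.\n",
    "From: Ann <ann@example.org>\nMessage-ID: <3@example.org>\n"
    "References: <1@example.org> <2@example.org>\nSubject: Re: Hi\n\nThanks.\n",
]

for (sender, target), count in sorted(reply_edges(SAMPLE).items()):
    print(sender, "->", target, count)
```

The resulting list of pairs with counts is the kind of From-To listing described above. Replies whose parent message is missing from the sample are simply skipped here, which is one reason a complete listing of messages is preferable to a partial one.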

I recently wrote a book chapter describing this technique and included the BASIC routine I used as an appendix. The book will come out later this year. Here is the reference:

Murillo, E. (2010) Using social network analysis to guide theoretical sampling in an ethnographic study of a virtual community. In Ben Kei Daniel (Ed.), Handbook of Research on Methods and Techniques for Studying Virtual Communities: Paradigms and Phenomena. Hershey, PA: IGI Global.
