Using data science to probe the New York Social Diary and find which people appear most often in party pictures and which people tend to appear together.

  1. The New York Social Diary provides a fascinating lens onto New York's socially well-to-do. Its data forms a natural social graph of the city's social elite.

  2. I crawled the website for party pictures and used Python to parse the HTML for pictures from the last 10 years. Python's Natural Language Toolkit (NLTK) was used to isolate names from the picture captions.

  3. Using the NetworkX library in Python, I built a graph whose nodes are names and whose edges record whether two people appeared in a picture together.

  4. Here we look at who appears most often in New York party pictures and which celebrities tend to appear together in pictures.

  5. Check out the code for this project at 
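
The counting behind steps 3 and 4 can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the project's actual code: it uses Python's standard library in place of NetworkX, the `captions` data is made up, and `build_counts` is a hypothetical helper name. It assumes the names have already been extracted from each caption.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample data: each inner list holds the names
# already extracted from one picture caption.
captions = [
    ["Alice Adams", "Bob Brown"],
    ["Alice Adams", "Bob Brown", "Carol Clark"],
    ["Carol Clark"],
    ["Alice Adams", "Carol Clark"],
]

def build_counts(captions):
    """Count per-person appearances (node weights) and
    per-pair co-appearances (edge weights)."""
    appearances = Counter()
    pair_counts = Counter()
    for names in captions:
        # De-duplicate and sort so each pair has one canonical ordering.
        unique = sorted(set(names))
        appearances.update(unique)
        # Every unordered pair of people in the same picture is one edge.
        pair_counts.update(combinations(unique, 2))
    return appearances, pair_counts

appearances, pair_counts = build_counts(captions)

# People who appear in the most pictures.
print(appearances.most_common(3))
# Pairs who appear together most often.
print(pair_counts.most_common(3))
```

In a graph library such as NetworkX, the same information would typically be stored as nodes for names and weighted edges for pairs; the two counters here stand in for that structure.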

