Linklog - Week 9

Open Data and Civic Apps

  • I found an interesting company called Data Made, from Chicago: a civic technology company working on projects related to open data.
  • They are also the creators of Dedupe, a Python project to perform data deduplication. It uses a similar approach to the one we use for hotel deduplication @TrustYou.

Colombian Software Developers

Recently two interesting articles about software developers in Colombia were written:

The first one is from Alexander Torrenegra and the other was written by Juan Pablo Buritica. Juan Pablo somewhat rebuts some of Alexander's claims. I think both make interesting points, and both have the experience and the authority to talk about the subject. I lean more toward Alexander's opinion, as I think Computer Science education in Colombia is not good and needs an overhaul. I would love to write a small piece on my experience: I graduated from a top Colombian university and I believe the education on computer science related topics is precarious.

Social Network Extravaganza

Thanks to the Mining Massive Datasets class on Coursera, I started getting curious about the topic. I wanted to explore a bit more, so I downloaded my friends dataset from Facebook and imported it into Gephi, a tool for graph visualization and analysis.

Gephi interface.

Below are a couple of interesting links for people who want to do the same:

Install Gephi on Mac OSX Yosemite

There are some issues with installing Gephi on the latest version of OS X. This helped me solve them: http://sumnous.github.io/blog/2014/07/24/gephi-on-mac/

Extract your Friend Network in Facebook

A nice tool that helped me extract my Facebook friend graph in a Gephi-compatible file format. http://snacourse.com/getnet

Storify

https://storify.com/ I found this tool to be really nice for keeping a log of events and adding stories based on social networks.

SimString

http://www.chokkan.org/software/simstring/ I have been taking the MMDS class on Coursera. Challenging but super interesting. I wanted to use LSH to do a sort of hashing of near-duplicate short sentences. It is a small dataset, so I believe SimString will actually work better.
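
To make the idea concrete, here is a minimal sketch of near-duplicate detection on short sentences using cosine similarity over character trigrams. It is plain Python and does not use the SimString API; the function names, the trigram size and the 0.8 threshold are my own assumptions for illustration. On a small dataset, comparing every pair directly is cheap enough that no hashing is needed.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def char_ngrams(text, n=3):
    """Return a multiset of character n-grams for a lowercased, padded string."""
    padded = " " * (n - 1) + text.lower().strip() + " " * (n - 1)
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count profiles."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = sqrt(sum(c * c for c in a.values())) * sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0


def near_duplicates(sentences, threshold=0.8):
    """Compare every pair of sentences and keep those at or above the threshold.

    The threshold is a hypothetical default, not a recommended value.
    """
    profiles = [char_ngrams(s) for s in sentences]
    pairs = []
    for i, j in combinations(range(len(sentences)), 2):
        score = cosine(profiles[i], profiles[j])
        if score >= threshold:
            pairs.append((sentences[i], sentences[j], score))
    return pairs


if __name__ == "__main__":
    reviews = ["The hotel was great", "The hotel was great!", "Terrible breakfast"]
    for left, right, score in near_duplicates(reviews):
        print(f"{score:.2f}  {left!r} ~ {right!r}")
```

The pairwise loop is quadratic in the number of sentences, which is fine for a few thousand short strings; beyond that, an index like the one SimString builds, or LSH, starts to pay off.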

Delete Trailing Whitespace on Save and Compact Empty Lines on Emacs

http://ergoemacs.org/emacs/elisp_compact_empty_lines.html Awesome snippet, especially for Python development on Emacs.
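
For anyone not on Emacs, the same two clean-ups can be sketched in a few lines of Python. This is not the Emacs Lisp from the linked page, just my own approximation of the behaviour the title describes; `clean_text` is a hypothetical name.

```python
import re


def clean_text(text):
    """Strip trailing whitespace and collapse repeated blank lines."""
    # Remove trailing spaces and tabs from every line.
    lines = [line.rstrip(" \t") for line in text.split("\n")]
    stripped = "\n".join(lines)
    # Three or more newlines in a row mean two or more blank lines;
    # keep a single blank line instead.
    return re.sub(r"\n{3,}", "\n\n", stripped)


if __name__ == "__main__":
    messy = "def f():   \n    return 1\t\n\n\n\nprint(f())\n"
    print(clean_text(messy))
```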
