Baseball Data


What is this?

I am collecting tools for data science and looking for the best way to combine them. I built the smallest project I could so that I could focus on gathering ideas and streamlining my process for the future.

What tools am I using?

  • Authorea - Build collaborative and shareable project reports

  • Evernote - Save, annotate, and share notes

  • Trello - Create link boards for quick access to project materials

  • Wakari - Create and share analysis using IPython

  • Plotly - Create and share plots

  • Gistbox - Save, share, and access code snippets

  • GitHub - Share project files

  • Cloud9 - Cloud-based IDE for project files

  • WordPress - Post a summary and links to shared resources as a blog post

  • Asana - Streamline processes by creating repeatable checklists

  • Google Drive - Create and share documents

  • OpenRefine - Clean data before analysis

  • BigQuery - Large-scale SQL

Why these tools?

  • Cheap/free

  • Collaborative/shareable

  • Cloud/browser based

What did I learn?

I had to use most of these programs for the first time and come up with systems to integrate them. The end project is very unimpressive, but I now have the foundation to distribute future projects quickly.

What is next?

Lay out checklists, create templates, and build projects. Refine the system a few times before trying to extend it to any other tools.