WORKING DRAFT authorea.com/40798
  • Baseball Data

    Introduction

    What is this? I am surveying tools for data science and looking for the best way to combine them. I built the smallest project I could so that I could focus on gathering ideas and streamlining my process for future projects.

    What tools am I using?

    • Authorea - Build collaborative and shareable project reports

    • Evernote - Save, annotate, and share notes

    • Trello - Create link boards for quick access to project materials

    • Wakari - Create and share analyses using IPython

    • Plotly - Create and share plots

    • Gistbox - Save, share, and access code snippets

    • GitHub - Share project files

    • Cloud9 - Cloud-based IDE for project files

    • WordPress - Post a summary and links to shared resources as a blog post

    • Asana - Streamline processes by creating repeatable checklists

    • Google Drive - Create and share documents

    • OpenRefine - Clean data before analysis

    • BigQuery - Run large-scale SQL queries

    Why these tools?

    • Cheap/free

    • Collaborative/shareable

    • Cloud/browser-based

    What did I learn?

    I had to use most of these programs for the first time and come up with systems to integrate them. The end project is unimpressive, but I now have the foundation to distribute future projects quickly.

    What is next?

    Lay out checklists, create templates, and build projects. Refine the system a few times before trying to extend it to other tools.