Mastery: Sarah, Justin E., Emily Jane (Wild Watermelon)

Oct 5, 2012 • Emily Jane McTavish

Wild Watermelon

We found that our questions covered five main categories: trusting code/debugging, efficiency, code sharing/reuse, data storage, and help.

Trusting code/debugging
How do I know that my results and processes are “correct”?

  1. Novice: Check entire, finished code against a few known test situations.
  2. Intermediate: Test individual functions before moving on. Use common patterns and good organization to minimize errors.
  3. Advanced: Use test suites and testing frameworks to test continuously. Participate in code reviews.
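
As a rough illustration of the intermediate and advanced levels above, here is a minimal sketch of testing one function with Python's built-in `unittest` framework. The function `gc_content` and its test cases are hypothetical examples, not code from the workshop.

```python
import unittest


def gc_content(sequence):
    """Return the fraction of G and C bases in a DNA sequence.

    Hypothetical example function, used only to have something to test.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    sequence = sequence.upper()
    gc_count = sum(1 for base in sequence if base in "GC")
    return gc_count / len(sequence)


class TestGCContent(unittest.TestCase):
    def test_known_value(self):
        # A small case whose answer can be worked out by hand.
        self.assertAlmostEqual(gc_content("ATGC"), 0.5)

    def test_lowercase_input(self):
        # Lowercase input should give the same answer as uppercase.
        self.assertAlmostEqual(gc_content("atgc"), 0.5)

    def test_empty_sequence_raises(self):
        # Boundary case: an empty sequence is rejected, not silently accepted.
        with self.assertRaises(ValueError):
            gc_content("")
```

Saved in a file, these tests can be run with `python -m unittest`, so they are easy to repeat after every change to the function.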

Efficiency
How do I improve my efficiency and automate the mundane parts of my research?

  1. Novice: Use code developed by others. Write some single-use scripts.
  2. Intermediate: Write libraries of reusable code.
  3. Advanced: Develop computational pipelines for analysis. Share these with other researchers.

Code sharing/reuse
How can I become more confident in sharing / publishing my code?

  1.  Novice: Hack it together and hope it works; use some comments; provide code on request.
  2.  Intermediate: Use version control; organize code better with functions, classes, and object-oriented programming.
  3.  Advanced: Use version control and host code on GitHub; actively promote/publish.

Data storage
How can I deal with data?

  1.  Novice: Store data in a series of Excel spreadsheets. When error-checking, go through the data line by line or use ‘sort’ to find values that don’t look right. Use sort to subset the data for analyses.
  2.  Intermediate: Store data in flat files accessed by scripts, or in an SQL or other relational database. Use scripts to flag questionable data (e.g., boundary cases). Keep a README file to describe the data.
  3.  Advanced: Use an SQL or other relational database that is connected to a backup system. Optimize the database design for use by other researchers and to minimize redundancy. Keep clear metadata. Actively promote/publish.
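
The intermediate step of using a script to flag questionable data might look something like the sketch below. The column name `temp_c`, the plausible range, and the sample rows are all assumptions made up for illustration.

```python
import csv
import io

# Hypothetical plausible range for the measurement; adjust for real data.
MIN_TEMP_C = -40.0
MAX_TEMP_C = 50.0


def flag_questionable(rows, column="temp_c", low=MIN_TEMP_C, high=MAX_TEMP_C):
    """Return (line_number, row, reason) for each row that needs a second look."""
    flagged = []
    # Start at 2 because line 1 of the file is the header.
    for line_number, row in enumerate(rows, start=2):
        raw = (row.get(column) or "").strip()
        if raw == "":
            flagged.append((line_number, row, "missing"))
            continue
        try:
            value = float(raw)
        except ValueError:
            flagged.append((line_number, row, "not a number"))
            continue
        if not low <= value <= high:
            flagged.append((line_number, row, "out of range"))
    return flagged


# Made-up sample standing in for a flat file on disk.
sample = io.StringIO("site,temp_c\nA,21.5\nB,\nC,abc\nD,120\n")
for line_number, row, reason in flag_questionable(csv.DictReader(sample)):
    print(f"line {line_number}: {reason}: {row}")
```

Reporting the line number and a reason, rather than deleting rows, leaves the decision about each flagged value to the researcher.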

Help
How can I look for help and develop more skills?

  1. Novice: Take a basic programming class or tutorials online.
  2. Intermediate: Apply skills to real problems/questions. Do science. Read other people’s code.
  3. Advanced: Teach, write code collaboratively, take part in hackathons, join a Software Carpentry study group!