QBS101 – Syllabus
Foundations of Programming for Data Scientists
Summer term
General Information
Teaching and Learning
This course is broken up into 3 distinct components: lectures, office/TA hours, and asynchronous learning.
Lectures
● In-person, synchronous: 1 unit – 2x 90 min weekly
● All lectures are recorded and recordings are available to currently enrolled students
Office/TA Hours
● Times: 2-3x 1-hour sessions weekly
● Format: Synchronous, online and/or in-person, drop-in sessions
Asynchronous Learning
● Times: on your own time, week-to-week
● Format: asynchronous, computational notebooks, articles (video, reading, etc.)
Tagline
This course covers the essential concepts of programming for students who desire to understand computational approaches to problem solving using live code examples and in-class exercises in Python, Bash scripting and High Performance Computing (HPC) environments.
Description
This 1-unit course covers the essential concepts of computer programming for an audience with little to no prior programming experience but a desire to understand computational approaches to problem solving. It relies on live code examples and in-class exercises that bring the ideas to life without getting bogged down in computer idiosyncrasies. We recommend that you bring a laptop or tablet to lecture each week to follow along with the work.
Expectations and Goals
This course is divided into 3 main aspects, including Foundations of Programming and Foundations of Computational Data Science. Our new Data Scientists will get comfortable with a range of programming/scripting languages and technologies, and learn to use them to solve the problem at hand. To that end, this course will be a healthy blend of Python, Bash scripting and High Performance Computing (HPC), along with modeling techniques.
Prerequisites and Course Materials
Prerequisites
During the first half (5 weeks) of the term, students must independently familiarize themselves with the following concepts using the provided materials and optional in-person training opportunities (workshops, seminars, etc.):
● the Linux (UNIX) operating system, file system and the Bash command-line interpreter
○ Software Carpentry lesson
○ YouTube playlist (based on Software Carpentry lesson)
● DartFS – high-performance networked research data storage at Dartmouth
○ Knowledge Base Article (KBA), incl. GlobalProtect VPN Overview, Access DartFS from Macintosh, Access DartFS from Windows, Creating a web site in your DartFS account
○ YouTube playlist
● High-Performance Computing at Dartmouth – Andes, Polaris, and Discovery
○ Knowledge Base Article (KBA)
Textbook and software
There is an optional, free online textbook for the course, Project Python. This is the textbook used as CS1 lecture notes. Reading the text and doing the exercises in it is encouraged but not necessary to do well in this course. More material will be provided as needed during the term.
For Python, we will be using “notebook environments”, either local (Jupyter Notebooks via Anaconda) or online (Dartmouth JHub). For Bash scripting and to access HPC environments, you will be required to use FastX. Windows PC users will also want to install MobaXterm.
Course Schedule
Foundations of Programming and Computational Data Science
Topics covered:
● Variables and expressions. Scopes.
● Lists, tuples, dictionaries
● Functions, parameters, return values, libraries, abstraction. Recursion.
● Flow control: loops and conditions. Nesting.
● Debugging (basics), errors and exceptions.
● Defensive programming
● Basic algorithms – search, sort
● Concepts of object-oriented programming – using objects
● Command line input / Bash scripting
● File input/output
● Data frames
● Plotting (basics)
● Data Science packages
● Mixing languages: Bash + Python
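To give a flavor of how several of these topics fit together (functions, flow control, defensive programming, and basic search), here is a minimal sketch. It is not taken from the course materials; the function name `binary_search` and the sample list `readings` are illustrative assumptions only.

```python
def binary_search(sorted_values, target):
    """Return the index of target in sorted_values, or -1 if it is absent."""
    # Defensive programming: binary search is only correct on ascending input,
    # so reject unsorted data early instead of returning a wrong answer.
    if any(a > b for a, b in zip(sorted_values, sorted_values[1:])):
        raise ValueError("sorted_values must be in ascending order")

    low, high = 0, len(sorted_values) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_values[mid] == target:
            return mid
        if sorted_values[mid] < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return -1


# Hypothetical sample data for illustration.
readings = [3, 8, 15, 23, 42]
print(binary_search(readings, 23))  # prints 3
print(binary_search(readings, 7))   # prints -1
```

Passing an unsorted list raises a `ValueError`, which is the kind of error and exception handling listed among the topics above.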