Introduction to Linux In-Person

Linux was originally developed as an open source (freely available, with anyone able to contribute to the code) alternative to Unix. It has since become the most ubiquitous operating system (OS) and can be found on a wide range of electronic devices, from large computing clusters and desktops/laptops to smartphones, car/airplane/train computing systems and smart devices (the Internet of Things, a.k.a. IoT).

Many software tools used in data science (or big data), such as those for bioinformatics or artificial intelligence, are developed and run on Linux-based workstations or high-performance computing (HPC) platforms. Thus, experience with Linux and its command line environment (a.k.a. the shell) is a fundamental and essential skill for data scientists.

This self-guided tour was developed as an online resource within a very short time frame. It is based on a combination of videos available from LinkedIn Learning (formerly Lynda, a resource accessible to all members of Tufts University) and short supplementary notes for the videos. The tour consists of 5 modules derived from several LinkedIn Learning courses, and each module takes 1.5 to 2 hours.

The first module guides you through the basics of working with the Linux shell, using one of the most common variants, Bash. The second, optional module contains videos that provide some fundamental programming concepts, which will enhance your learning experience with the third module. The third module concerns Bash scripting (a somewhat different way to say programming), which allows you to build automated and/or interactive programs that perform repetitive, multi-step and sometimes complicated tasks. For instance, one could create a script that asks for input files (e.g. sequence data and a reference genome) and then automatically (and in an unsupervised manner) performs alignment and differential expression analysis on HPC. The last two optional modules focus on two powerful tools, awk and sed, which are useful for users who will be working with large text-based tables (e.g. tab-delimited tables) or large text files.
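As a rough illustration of the kind of task the scripting, awk and sed modules build toward, here is a minimal sketch of a Bash script that filters a small tab-delimited table. It is not taken from the course material: the table contents, the column names and the function name filter_and_rename are all hypothetical and chosen only for this example.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: the table contents, column names, and the
# function name filter_and_rename are hypothetical.

# Print the first column of every data row whose second column exceeds
# the given threshold, then use sed to rewrite the "gene" prefix.
filter_and_rename() {
    local table="$1" threshold="$2"
    # awk: -F'\t' splits on tabs, NR > 1 skips the header row.
    awk -F'\t' -v min="$threshold" 'NR > 1 && $2 > min { print $1 }' "$table" |
        sed 's/^gene/G_/'
}

# Build a small tab-delimited table in a temporary file.
table="$(mktemp)"
printf 'gene\tcount\ngeneA\t12\ngeneB\t0\ngeneC\t7\n' > "$table"

# Keep rows with a count above 5 and print the renamed identifiers.
result="$(filter_and_rename "$table" 5)"
echo "$result"

# Clean up the temporary file.
rm -f "$table"
```

Here awk handles the column-based filtering and sed handles the line-by-line text substitution, which is the usual division of labor between the two tools. A real analysis script would read an existing file rather than create a temporary one.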

This "Introduction to Linux" self-guided tour is a prerequisite for several other DISC courses, including "Introduction to Tufts High Performance Compute (HPC)", "Bioinformatics for RNAseq", "Introduction to NGS Bioinformatics" and possibly several others that are being planned. The tour will start with a two-hour Zoom meeting to get all participants set up and to answer initial questions. A few additional Zoom meetings for Q&A will be made available as needed. Registration closes 24 hours before the start of the event.

