Research, Workshops, Talks, and Tutoring

Research

Research at the CI Lab centers on data science, machine learning, and artificial intelligence. Current work includes building a Quantum Machine Learning Classifier (QMLC), which uses the mathematics of quantum computing in a deep neural network. The development work uses R. A. Fisher’s “Iris Data” (Fisher, 1936) to classify flowers into the three iris species. The quantum-computing machine learning classifier performs significantly better than one using classical deep learning neural network methods, requiring fewer epochs to train. Upcoming work includes the use of optimizers, encapsulation, and quantum entanglement and superposition. An extension of the QMLC work, titled “Quantum Machine Learning Classifier for Respiratory Disease Screening by Phone,” has been proposed to the NSF for funding.

The CI Lab is here to collaborate with you and your team.

Workshops

The CI Lab provides a place for workshops and tutorials spanning the scope of data analytics. Organized by tools and processes, the workshops cover data collection and munging, exploratory data analysis and visualization, and predictive modeling. Students will learn about these key pillars of data analysis using Python and R. Foundational workshops include topics from probability theory and statistics, modeling and data, and statistical computing.

Click on the modules below to access the workshop videos.

Webscraping

Learn how to extract data from a website. What is webscraping, why do we need it, and how do we do it? Mihir Zala walks us through the answers.
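
As a rough illustration of the idea (not the code used in the workshop), the sketch below pulls every link out of a small HTML string using only Python's standard library. The sample page, its URLs, and the names LinkCollector and extract_links are hypothetical; a real scraper would first download the page and would often use a dedicated parsing library instead.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (link text, href) pairs for every <a href=...> tag."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(html):
    """Return a list of (text, href) tuples found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page content, made up for this example.
SAMPLE_HTML = """
<html><body>
  <h1>Workshops</h1>
  <a href="/slides/webscraping.pdf">Webscraping Slides</a>
  <a href="/notebooks/eda.ipynb">EDA Notebook</a>
</body></html>
"""

links = extract_links(SAMPLE_HTML)
for text, href in links:
    print(f"{text} -> {href}")
```

Before scraping a live site, check its terms of use and its robots.txt file.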

Webscraping Slides

Twitter Application Programming Interface (API) - Part 1

Learn how to use the Twitter API to get data from tweets. Nayana Mahajan walks us through the details in this two-part tutorial.

Twitter API tutorial Slides

Twitter API Jupyter Notebook

Twitter API Notebook – Empty Shell

Twitter Application Programming Interface (API) - Part 2

Continue working with Nayana on the Twitter API in part 2 of this topic. (The slides for this video are the same as for part 1.)

Exploratory Data Analysis

Before any model building can begin, a data analyst must get to know the data. Exploratory Data Analysis (EDA) is a key step in the process, where the analyst gets to see and understand the possible relationships in the data before attempting to build a model. Neha Mathur takes us through this critical first step in data analysis.
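
To give a flavor of what "getting to know the data" can look like, here is a minimal sketch using only Python's standard library. It computes summary statistics for one variable and the Pearson correlation between two. The numbers and the helper names summarize and pearson are made up for illustration and are not taken from the workshop materials.

```python
import statistics


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# Made-up illustrative data: hours studied and exam scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 70, 72]

print(summarize(scores))
print("correlation:", round(pearson(hours, scores), 3))
```

In practice, summaries like these are usually paired with plots such as histograms and scatter plots, since a single number can hide outliers and nonlinear patterns.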

Exploratory Data Analysis Slides

Exploratory Data Analysis Jupyter Notebook

EDA-Shell Try it!!

Data Cleaning

One outcome of EDA is discovering that the data you have are not ideal. In fact, real data are never as pristine as they are in textbooks! Data cleaning and data munging (reshaping and manipulating data for analysis) are essential, and although not the exciting part of a data scientist’s day, they are key to good research. In this module, Gio Abou Jaoude illustrates the highs and lows of data cleansing.
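
The sketch below shows a few typical cleaning steps on a small, made-up set of records: trimming whitespace, normalizing capitalization, converting text to numbers, marking unparseable values as missing, and dropping blank and duplicate rows. The field names, the data, and the function clean_records are hypothetical and only illustrate the idea.

```python
def clean_records(rows):
    """Return cleaned copies of the rows, skipping blanks and duplicates."""
    cleaned = []
    seen = set()
    for row in rows:
        # Treat None as an empty string, then trim and normalize case.
        name = (row.get("name") or "").strip().title()
        raw_age = (row.get("age") or "").strip()

        if not name:
            continue  # a record without a name cannot be used

        try:
            age = int(raw_age)
        except ValueError:
            age = None  # keep the record, but mark the age as missing

        key = (name, age)
        if key in seen:
            continue  # same person already recorded
        seen.add(key)
        cleaned.append({"name": name, "age": age})
    return cleaned


# Made-up messy input for illustration.
RAW = [
    {"name": "  alice ", "age": "34"},
    {"name": "Alice", "age": "34 "},   # duplicate after cleaning
    {"name": "bob", "age": "n/a"},     # age cannot be parsed
    {"name": "", "age": "29"},         # missing name
    {"name": "Carol", "age": None},    # missing age
]

cleaned = clean_records(RAW)
for record in cleaned:
    print(record)
```

Whether to drop, keep, or impute a missing value depends on the analysis; this example simply keeps the record and stores None.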

Data Cleaning Slides

Monte Carlo Simulation

Monte Carlo simulation is at the core of many numerical methods used in statistical computing. It can be used to generate random processes and is central to Bayesian analysis, Markov chain Monte Carlo, and Gibbs sampling. Gio Abou Jaoude explores Monte Carlo methods in this video through their application in several settings.
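
A classic first example of the Monte Carlo idea, offered here as a minimal sketch and not necessarily one of the settings covered in the video, is estimating π by random sampling. Points are drawn uniformly in the unit square, and the fraction that lands inside the quarter circle of radius 1 approximates π/4. The function name estimate_pi, the sample size, and the seed are arbitrary choices for this illustration.

```python
import math
import random


def estimate_pi(n_samples, seed=None):
    """Estimate pi from the share of random points inside a quarter circle."""
    rng = random.Random(seed)  # a private generator keeps runs reproducible
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # The quarter circle has area pi/4 and the unit square has area 1.
    return 4.0 * inside / n_samples


estimate = estimate_pi(100_000, seed=42)
print(f"estimate: {estimate:.4f}  error: {abs(estimate - math.pi):.4f}")
```

The error of such an estimate shrinks roughly in proportion to 1/√n, so each additional digit of accuracy costs about 100 times more samples.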

Monte Carlo Slides

Open Source Computer Vision - Overview with Feature Detection

This module covers OpenCV, the Open Source Computer Vision library, which is used from Python through the cv2 module. Nayana Mahajan first walks through the functionality of the cv2 library, then moves on to feature detection and matching, and finally illustrates image classification.

Open Source Computer Vision with Python Slides

OpenCV Notebook

Talks

Fall 2023 semester: TBA

Fall 2022 semester:

- Oct 20 – Women in Data Science – “What Talent Looks for in Their Data Scientist Candidates” Jasmine Ewing, Executive Search – Data Science, Data Engineering and Consumer Insights at Netflix and Jennifer Haley, Global Talent Acquisition at PepsiCo. Skills, characteristics, and “basic knowledge” hiring managers look for in their candidates. Tips for international students who face the additional challenge of attaining a visa – how they can stand out. “Behind-the-scenes” facts and hacks to better manage your career.

Tutoring

At the CI Lab, you can get help with a variety of topics. We offer tutoring in Python, R, Git, and data science.