Research, Workshops, Talks, and Tutoring
Research
Research at the CI Lab centers on data science, machine learning, and artificial intelligence. Current work includes building a Quantum Machine Learning Classifier (QMLC), which uses the mathematics of quantum computing in a deep neural network. The development work uses R. A. Fisher’s “Iris Data” (Fisher, 1936) to classify each flower as one of the three iris species. The quantum-computing machine learning classifier performs significantly better than one using classical deep learning neural network methods, taking fewer epochs to train. Upcoming work includes the use of optimizers, encapsulation, and quantum entanglement and superposition. An extension of the QMLC work, titled “Quantum Machine Learning Classifier for Respiratory Disease Screening by Phone,” has been proposed to the NSF for funding.
The CI Lab is here to collaborate with you and your team.
Workshops
The CI Lab provides a place for workshops and tutorials spanning the scope of data analytics. Organized by tools and processes, the workshops cover data collection and munging, exploratory data analysis and visualization, and predictive modeling. Students will learn about these key pillars of data analysis using Python and R. Foundational workshops include topics from probability theory and statistics, modeling and data, and statistical computing.
Click on the modules below to access the workshop videos.
Webscraping
Learn how to extract data from a website. What is webscraping? Why do we need it and how do we do it? Mihir Zala shows us how to do it and why.
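As a small illustration of the idea (not the approach taken in the video), the sketch below pulls every link out of an HTML page using only Python's standard-library html.parser module. The sample HTML and the names LinkCollector and extract_links are made up for this example; a real scraper would first download the page and should respect the site's terms of use.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag that has one."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return all link targets found in an HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


# Hypothetical page content, used here instead of a live website.
SAMPLE_HTML = """
<html><body>
  <h1>Workshops</h1>
  <a href="/workshops/webscraping">Webscraping</a>
  <a href="/workshops/eda">Exploratory Data Analysis</a>
  <a>No link here</a>
</body></html>
"""

print(extract_links(SAMPLE_HTML))
```

The anchor tag without an href is skipped, which is the kind of edge case real pages often contain.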
Twitter Application Programming Interface (API) - Part 1
Learn how to use the Twitter API to get data from tweets. Nayana Mahajan walks us through the details in this two-part tutorial.
Twitter Application Programming Interface (API) - Part 2
Continue working with Nayana on the Twitter API in part 2 of this topic. (The slides for this video are the same as for part 1.)
Exploratory Data Analysis
Before any model building can begin, a data analyst must get to know the data. Exploratory Data Analysis (EDA) is a key step in the process, where the analyst gets to see and understand the possible relationships in the data before attempting to build a model. Neha Mathur takes us through this critical first step in data analysis.
Exploratory Data Analysis Slides
Exploratory Data Analysis Jupyter Notebook
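To give a flavor of what a first look at the data can involve, here is a minimal sketch (separate from the workshop notebook) that computes summary statistics for two variables and the Pearson correlation between them, using only the standard library. The two small lists are invented purely for illustration, and the function names are hypothetical.

```python
import statistics


def summarize(values):
    """Return basic summary statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)


# Made-up example data: hours studied and the resulting test score.
HOURS = [1, 2, 3, 4, 5]
SCORES = [52, 55, 61, 64, 70]

print(summarize(HOURS))
print(summarize(SCORES))
print("correlation:", round(pearson(HOURS, SCORES), 3))
```

In practice an analyst would usually pair numbers like these with plots, since a single correlation value can hide nonlinear patterns and outliers.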
Data Cleaning
One outcome of EDA is often the discovery that the data you have are not ideal. In fact, real data are never as pristine as they are in textbooks! Cleaning data and data munging (reshaping and manipulating data for analysis) are essential and, although not the most exciting part of a data scientist’s day, they are key to good research. In this module, Gio Abou Jaoude illustrates the highs and lows of data cleaning.
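The sketch below shows a few typical cleaning steps on a tiny invented data set: trimming whitespace, normalizing case, converting text to numbers, dropping rows with missing measurements, and removing duplicates. It is an illustration under those assumptions, not the procedure used in the video; the column names, the MISSING markers, and the helper functions are all hypothetical.

```python
# Strings that this example treats as "no value recorded" (an assumption).
MISSING = {"", "na", "n/a", "nan", "null", "none"}


def parse_number(text):
    """Return a float, or None if the value is missing or not numeric."""
    text = str(text).strip()
    if text.lower() in MISSING:
        return None
    try:
        return float(text)
    except ValueError:
        return None


def clean(rows):
    """Normalize species names, parse lengths, drop bad and duplicate rows."""
    seen = set()
    result = []
    for row in rows:
        species = str(row["species"]).strip().lower()
        length = parse_number(row["petal_length"])
        if length is None:
            continue  # no usable measurement, so skip the row
        key = (species, length)
        if key in seen:
            continue  # exact duplicate after normalization
        seen.add(key)
        result.append({"species": species, "petal_length": length})
    return result


# Invented raw records with common problems.
RAW = [
    {"species": " Setosa ", "petal_length": "1.4"},
    {"species": "setosa", "petal_length": "1.4"},      # duplicate once normalized
    {"species": "VERSICOLOR", "petal_length": "N/A"},  # missing measurement
    {"species": "virginica", "petal_length": " 5.1 "},
]

for record in clean(RAW):
    print(record)
```

Dropping rows with missing values is only one option; depending on the analysis, imputing a value or flagging the row may be more appropriate.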
Monte Carlo Simulation
Monte Carlo simulation is at the core of many numerical methods used in statistical computing. It can be used to generate random processes and is central to Bayesian analysis, Markov chain Monte Carlo, and Gibbs sampling. Gio Abou Jaoude explores Monte Carlo methods in this video through their application in several settings.
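A classic textbook example of the technique, not necessarily one covered in the video, is estimating pi by random sampling: draw points uniformly in the unit square and count the fraction that land inside the quarter circle of radius 1. That fraction approximates pi/4. The function name and the seed parameter below are choices made for this sketch.

```python
import math
import random


def estimate_pi(n_samples, seed=0):
    """Estimate pi from n_samples uniform random points in the unit square."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:  # point falls inside the quarter circle
            inside += 1
    return 4.0 * inside / n_samples


for n in (100, 10_000, 100_000):
    estimate = estimate_pi(n)
    print(f"n={n:>7}  estimate={estimate:.4f}  error={abs(estimate - math.pi):.4f}")
```

The error shrinks roughly in proportion to one over the square root of the number of samples, which is why Monte Carlo estimates typically need many draws to gain each extra digit of accuracy.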
Open Source Computer Vision - Overview with Feature Detection
This module covers OpenCV, the Open Source Computer Vision library, which is available in Python as the cv2 module. First, Nayana Mahajan walks through the functionality of the cv2 library, then moves on to feature detection and matching, and finally illustrates image classification.
Open Source Computer Vision with Python Slides
Talks
Fall 2023 semester: TBA
Fall 2022 semester:
- Oct 20 – Women in Data Science – “What Talent Looks for in Their Data Scientist Candidates,” with Jasmine Ewing, Executive Search – Data Science, Data Engineering and Consumer Insights at Netflix, and Jennifer Haley, Global Talent Acquisition at PepsiCo. Skills, characteristics, and “basic knowledge” hiring managers look for in their candidates. Tips on how to stand out for international students, who face the additional challenge of obtaining a visa. “Behind-the-scenes” facts and hacks to better manage your career.
Tutoring
At the CI Lab, you can get help with a variety of topics. We offer tutoring in Python, R, Git, and data science.