Research, Workshops, Talks, and Tutoring
Research at the CI Lab centers on data science, machine learning, and artificial intelligence. Current work includes building a Quantum Machine Learning Classifier (QMLC), which uses the mathematics of quantum computing in a deep neural network. The development work uses R. A. Fisher’s “Iris Data” (Fisher, 1936) to classify each flower as one of three iris species. The quantum-computing machine learning classifier performs significantly better than one using classical deep learning neural network methods, taking fewer epochs to train. Upcoming work includes the use of optimizers, encapsulation, and quantum entanglement and superposition.
We’re also working with outside partners on research to enhance STEM education in middle schools. The CI Lab is here to collaborate with you and your team.
The CI Lab provides a place for workshops and tutorials spanning the scope of data analytics. Organized by tools and processes, the workshops cover data collection and munging, exploratory data analysis and visualization, and predictive modeling. Students will learn about these key pillars of data analysis using Python and R. Foundational workshops include topics from probability theory and statistics, modeling and data, and statistical computing.
Click on the modules below to access the workshop videos.
Learn how to extract data from a website. What is web scraping, why do we need it, and how is it done? Mihir Zala walks us through the answers.
Twitter Application Programming Interface (API) - Part 2
Continue working with Nayana on the Twitter API in part 2 of this topic. (The slides for this video are the same as part 1.)
Exploratory Data Analysis
Before any model building can begin, a data analyst must get to know the data. Exploratory Data Analysis (EDA) is a key step in the process, where the analyst gets to see and understand the possible relationships in the data before attempting to build a model. Neha Mathur takes us through this critical first step in data analysis.
One outcome of EDA is often the discovery that the data you have are not ideal. In fact, real data are never as pristine as they are in textbooks! Data cleaning and data munging (reshaping and manipulating data for analysis) are essential, and although they are not the exciting part of a data scientist’s day, they are key to good research. In this module, Gio Abou Jaoude illustrates the highs and lows of data cleansing.
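As a small illustration of the kind of cleanup described above, here is a minimal Python sketch. It is not taken from the workshop video; the rows, the field names, and the helper functions `parse_length` and `clean_rows` are all made up for this example. It trims whitespace, normalizes case, strips a stray unit suffix, and drops rows whose measurement is missing.

```python
# Toy, made-up records standing in for messy real-world data.
RAW_ROWS = [
    {"species": " Setosa ", "petal_length": "1.4"},
    {"species": "VERSICOLOR", "petal_length": ""},        # missing value
    {"species": "virginica", "petal_length": "5.1cm"},    # stray unit
    {"species": "setosa", "petal_length": "1.3"},
]


def parse_length(text):
    """Return the measurement as a float, or None if it cannot be parsed."""
    cleaned = text.strip().lower()
    if cleaned.endswith("cm"):
        cleaned = cleaned[:-2].strip()
    try:
        return float(cleaned)
    except ValueError:
        return None


def clean_rows(rows):
    """Normalize species names and keep only rows with a usable length."""
    cleaned = []
    for row in rows:
        length = parse_length(row["petal_length"])
        if length is None:
            continue  # drop rows with a missing or unparseable measurement
        cleaned.append(
            {"species": row["species"].strip().lower(), "petal_length": length}
        )
    return cleaned


if __name__ == "__main__":
    for row in clean_rows(RAW_ROWS):
        print(row)
```

Dropping incomplete rows is only one possible choice; depending on the analysis, imputing or flagging missing values may be more appropriate.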
Monte Carlo Simulation
Monte Carlo simulation is at the core of many numerical methods used in statistical computing. It can be used to simulate random processes and is central to Bayesian analysis, Markov chain Monte Carlo, and Gibbs sampling. Gio Abou Jaoude explores Monte Carlo methods in this video through their application in several settings.
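To give a flavor of the idea, here is a minimal Python sketch of a classic Monte Carlo exercise: estimating π by sampling random points in the unit square and counting how many fall inside the quarter circle. This example is an illustration only and is not necessarily one of the settings covered in the video; the function name `estimate_pi` and its parameters are chosen for this sketch.

```python
import random


def estimate_pi(n_samples, seed=0):
    """Estimate pi from the fraction of random points inside a quarter circle."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # Quarter-circle area is pi/4, so scale the hit fraction by 4.
    return 4.0 * inside / n_samples


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        print(n, estimate_pi(n))
```

The estimate tightens slowly as the sample count grows: the error shrinks roughly in proportion to one over the square root of the number of samples.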
We have the following talks scheduled for the Spring 2022 semester:
- April 11, 2022 – Seidenberg Lounge 12:10pm-1:10pm. Gio Abou Jaoude (MS in CS May ’22): “Quantum Machine Learning Classifier.”
- April 27, 2022 – Seidenberg Lounge 12:10pm-1:10pm. Profs. Yegin Genc and Frank Parisi: “Data Science and Computational Intelligence.”
At the CI Lab you can get help with a variety of topics. We offer tutoring in Python, R, Git, and data science.