Course Description
This course introduces big data tools such as Hadoop, along with advanced SQL techniques. Students will gain a clear understanding of Hadoop concepts, the surrounding technology landscape, and market trends. They will construct SQL queries of moderate to high complexity to retrieve data from a relational database. Note: Tools taught include Hive, Pig, Oozie, LAMBDA, Giraph, and GraphLab.
What Will You Learn?
Develop a comprehensive understanding of big data and its industrial and sectoral applications.
Learn how to:
- Engage in big data and AI computing (cloud computing) and their industrial applications.
- Utilize the Hadoop ecosystem for big data.
- Employ Linux file systems, bash commands, and regular expressions.
- Write complex queries with Apache Hive against big data stored in various databases and file systems that integrate with Hadoop.
- Write scripts and analyze data using Apache Spark to run streaming and machine learning workloads efficiently on big data.
- Apply network analysis and understand its use cases.
The deadline to enroll in CIND 719 for the Summer term is June 21, 2022.
Students will also not be allowed to switch between sections of the Data Analytics courses after this date.
You must download the X2Go Client in order to access the software needed to complete the requirements for this course. Prior to your first class, you are strongly advised to test the computer you plan to use, as machines managed by a third-party administrator (such as laptops provided by a workplace) may not allow access to the required software or downloads.
International students should use their own virtual private network (VPN) software to connect to University resources.
Department consent may be provided if the student has specific professional experience.
- Practical Data Science and Machine Learning: Required