Course: Big Data Analytics with Python

Big Data Analytics with Python

4.4 / 5

INTER

IN-HOUSE

CUSTOM

Practical course, in person or remote
Available in English on request

Ref. BDA

4 days - 28 hours

Price: 2920 CHF excl. tax

Teaching objectives

At the end of the training, the participant will be able to:

	Understand the principle of statistical modeling
	Choose between regression and classification depending on the data type
	Evaluate an algorithm's predictive performance
	Create selections and classifications in large volumes of data to reveal trends

Practical details

Hands-on work

Developing and conducting analyses in Python with the pandas, NumPy, SciPy, Matplotlib, seaborn, scikit-learn, and statsmodels modules.

Course schedule

1
Introduction to modeling

Introduction to the Python language.
Introduction to the Jupyter Notebook software.
Steps for building a model.
Supervised and unsupervised algorithms.
Choosing between regression and classification.

Hands-on work

Installing Python 3, Anaconda, and Jupyter Notebook.

2
Model evaluation procedures

Resampling techniques for building training, validation, and test sets.
Testing the representativeness of the training data.
Predictive model performance measurements.
Confusion matrix, cost matrix, and ROC curve with AUC.

Hands-on work

Setting up dataset sampling. Running evaluation tests on several provided models.
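
As an illustration of the kind of exercise this module describes, here is a minimal sketch of a train/test split, cross-validation, a confusion matrix, and an AUC score with scikit-learn. It is not taken from the course material: the synthetic dataset, the logistic regression model, and all parameter values are assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic binary classification data (a stand-in for a real dataset).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Hold out 25% as a test set; stratify keeps class proportions similar.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = LogisticRegression(max_iter=1000)

# 5-fold cross-validation on the training set plays the role of validation.
cv_auc = cross_val_score(model, X_train, y_train, cv=5, scoring="roc_auc")

# Fit on the full training set, then evaluate once on the held-out test set.
model.fit(X_train, y_train)
cm = confusion_matrix(y_test, model.predict(X_test))
test_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

print("Cross-validated AUC per fold:", cv_auc.round(3))
print("Confusion matrix (rows = true class, columns = predicted class):")
print(cm)
print("Test AUC:", round(float(test_auc), 3))
```

Keeping the test set out of the cross-validation loop means the final AUC is measured on data the model has never seen.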

3
Supervised algorithms

The principle of univariate linear regression.
Multivariate regression.
Polynomial regression.
Regularized regression.
Naive Bayes.
Logistic regression.

Hands-on work

Implementing regressions and classifications on multiple data types.
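
To make the link between polynomial and regularized regression concrete, the sketch below fits the same degree-8 polynomial with and without an L2 penalty and compares the size of the coefficients. The noisy sine data, the degree, and `alpha=1.0` are assumptions made for this example, not values from the course.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical data: a noisy sine curve sampled at 40 points.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(-1.0, 1.0, size=40))
y = np.sin(3.0 * x) + rng.normal(scale=0.2, size=x.size)
X = x.reshape(-1, 1)

degree = 8  # deliberately high, so the unpenalized fit can overreact to noise

# Same polynomial features, with and without an L2 (ridge) penalty.
ols = make_pipeline(
    PolynomialFeatures(degree=degree, include_bias=False), LinearRegression()
)
ridge = make_pipeline(
    PolynomialFeatures(degree=degree, include_bias=False), Ridge(alpha=1.0)
)
ols.fit(X, y)
ridge.fit(X, y)

# The penalty shrinks the coefficient vector towards zero.
ols_norm = np.linalg.norm(ols.named_steps["linearregression"].coef_)
ridge_norm = np.linalg.norm(ridge.named_steps["ridge"].coef_)

print("Coefficient norm without penalty:", round(float(ols_norm), 2))
print("Coefficient norm with ridge penalty:", round(float(ridge_norm), 2))
```

In practice `alpha` would be tuned by cross-validation rather than fixed by hand.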

4
Unsupervised algorithms

Hierarchical clustering.
Non-hierarchical clustering.
Mixed approaches.

Hands-on work

Applying unsupervised clustering to multiple datasets.
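
A minimal sketch of the two clustering families listed above, assuming scikit-learn: k-means as a non-hierarchical method and Ward agglomerative clustering as a hierarchical one. The three synthetic blobs and their centers are hypothetical and chosen to be well separated.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

# Three well-separated synthetic groups in two dimensions.
centers = [[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]]
X, _ = make_blobs(n_samples=300, centers=centers, cluster_std=1.0, random_state=0)

# Non-hierarchical clustering: k-means with 3 clusters.
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Hierarchical clustering: agglomerative with Ward linkage, cut at 3 clusters.
hier_labels = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(X)

# The adjusted Rand index compares two partitions regardless of label names.
agreement = adjusted_rand_score(kmeans_labels, hier_labels)
print("Agreement between the two partitions (ARI):", round(float(agreement), 3))
```

On real data the two methods often disagree, and comparing their partitions is one way to judge how stable the clusters are.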

5
Component analysis

Principal component analysis.
Correspondence analysis.
Multiple correspondence analysis.
Factor analysis for mixed data.
Hierarchical clustering on principal components.

Hands-on work

Reducing the number of variables and identifying the underlying factors behind the dimensions that carry most of the variability.
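
The following sketch shows what such a reduction can look like with principal component analysis in scikit-learn. The data are simulated under an assumption made for this example: six observed variables driven by two hidden factors plus a small amount of noise.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Simulated data: 2 hidden factors generate 6 observed variables (hypothetical loadings).
rng = np.random.default_rng(0)
factors = rng.normal(size=(200, 2))
loadings = np.array([
    [1.0, 0.8, 0.6, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.9, 0.7],
])
X = factors @ loadings + rng.normal(scale=0.1, size=(200, 6))

# Standardize so that each variable contributes on the same scale.
X_std = StandardScaler().fit_transform(X)

# Keep the two components with the largest variance.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_std)
explained = pca.explained_variance_ratio_

print("Reduced shape:", X_reduced.shape)
print("Share of variance per component:", explained.round(3))
print("Total share kept:", round(float(explained.sum()), 3))
```

The rows of `pca.components_` show how strongly each original variable contributes to each retained component, which helps in interpreting the underlying factors.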

6
Text data analysis

Collecting and preprocessing text data.
Extracting primary entities and named entities; coreference resolution.
Part-of-speech tagging, syntactic analysis, semantic analysis.
Lemmatization.
Text vectorization.
TF-IDF weighting.
Word2Vec.

Hands-on work

Exploring the contents of a text corpus using latent semantic analysis.
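
As a rough idea of what this exercise involves, the sketch below applies TF-IDF weighting followed by a truncated SVD, which is one common way to perform latent semantic analysis with scikit-learn. The four short review texts are invented for the example, and the choice of two components is an assumption.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Invented mini-corpus: two reviews about delivery, two about a device.
docs = [
    "The delivery was fast and the parcel arrived early",
    "Fast delivery and the parcel arrived undamaged",
    "The battery drains quickly and the screen flickers",
    "The screen flickers and the battery overheats",
]

# TF-IDF vectorization, dropping common English stop words.
vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(docs)

# Latent semantic analysis: project documents onto 2 latent dimensions.
lsa = TruncatedSVD(n_components=2, random_state=0)
doc_topics = lsa.fit_transform(tfidf)

# Documents on the same subject should be close in the latent space.
similarity = cosine_similarity(doc_topics)
print("Vocabulary size:", tfidf.shape[1])
print("Similarity between the two delivery reviews:", round(float(similarity[0, 1]), 3))
print("Similarity between a delivery and a device review:", round(float(similarity[0, 2]), 3))
```

A real corpus would first go through the preprocessing steps listed in this module, such as lemmatization.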

Customer reviews

4.4 / 5

Customer reviews are based on end-of-course evaluations. The score is calculated from all evaluations within the past year. Only reviews with a textual comment are displayed.

Dates and locations

Last seats available

Guaranteed date, in person or remote

Guaranteed session

From 17 to 20 June 2025 *

Remote class

From 19 to 22 August 2025

Remote class

From 16 to 19 September 2025 *

Remote class

From 28 to 31 October 2025

Remote class

From 17 to 20 November 2025

Remote class

From 2 to 5 December 2025

Remote class

From 16 to 19 December 2025

Remote class
