Do you have substantial amounts of business-generated data that sits dormant and is never used? Do you want to identify the different types of customers you have just by analysing their purchases? We can help you with that.

We can show you how to extract data from your existing sources, such as spreadsheets, relational databases and data warehouses. The next steps are understanding this data (data exploration) and cleaning it (removing any inconsistencies and preparing it for analysis). After that, we can show you how to prepare dynamic reports and dashboards based on this data. These can give C-level management a quick overview of the company’s operations, highlighting areas that need attention.

If you want to take it further, we can use the power of machine learning, where additional techniques such as regression and classification can help with sales prediction, trend analysis and more!
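To give a flavour of what regression for sales prediction can look like, here is a minimal sketch using only the Python standard library. The monthly figures and the `fit_line` helper are invented for illustration; real training material would be tailored to your own data and tools.

```python
# Hypothetical monthly sales figures, invented for illustration only.
months = [1, 2, 3, 4, 5, 6]
sales = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-style and variance-style sums (unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(months, sales)

# Extrapolate the fitted trend one month ahead.
forecast_month_7 = slope * 7 + intercept
```

A straight-line fit is the simplest possible model; it is shown here only to make the idea concrete, not as a recommendation for any particular data set.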

While we will typically work with you to tailor the training to your needs, some key topics may include:

  • Online data sets – Existing data sets and their availability.
  • Relational databases – Gathering data from an existing relational database.
  • Data warehouse – Gathering data from an existing data warehouse.
  • Various file formats – Gathering data from various file formats, such as CSV, TXT, XML and JSON.
  • Exploratory research – Exploring gathered data to understand cleaning requirements.
  • Data cleaning – Cleaning data to ensure consistency and prepare it for analysis. Making necessary adjustments to data, such as scaling, null removal and factoring, for better analysis.
  • Visualising data – Basic data visualisation to better understand the data and decide how to handle it.
  • Unsupervised learning – Clustering techniques are applied to data in order to identify distinct groups within it.
  • Supervised learning – Classification and regression techniques are applied in order to correctly classify new data, or predict values, based on knowledge extracted from existing data.
  • Understanding results – Understanding results returned by various machine learning and data mining techniques.
  • Reproducible results – Understanding the importance of documentation and reproducibility of results.
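
As an illustration of the gathering and cleaning topics above, the following is a minimal sketch in Python using only the standard library. The sample data, the column names and the helper functions are all hypothetical, and "factoring" is assumed here to mean encoding category labels as integer codes.

```python
import csv
import io

# Hypothetical sample data; the column names are invented for illustration.
RAW_CSV = """customer,region,spend
A,north,120
B,south,
C,north,80
D,east,200
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def drop_nulls(rows, column):
    """Null removal: keep only rows where the given column is not empty."""
    return [r for r in rows if r[column].strip() != ""]


def min_max_scale(values):
    """Scaling: map numeric values onto the range 0..1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def factorise(labels):
    """Factoring (as assumed here): replace each label with an integer code."""
    codes = {}
    return [codes.setdefault(label, len(codes)) for label in labels]


rows = drop_nulls(load_rows(RAW_CSV), "spend")
scaled_spend = min_max_scale([float(r["spend"]) for r in rows])
region_codes = factorise([r["region"] for r in rows])
```

Keeping each step in its own small function also supports reproducibility: the same script can be rerun on fresh data and will apply the same cleaning rules every time.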