

A Beginner's Guide to Machine Learning

Machine Learning is a fascinating field of AI that empowers computers to learn from data and improve their performance over time without human intervention.

Machine Learning is a subfield of Artificial Intelligence (AI) that involves training computers to learn from data and make decisions or predictions without being explicitly programmed.

💡
It's about giving computers the ability to learn and improve on their own.

How does a Machine Learn?

Machine Learning algorithms are trained on large datasets, where they identify patterns. The trained model is then evaluated and used to make predictions on new data.

The process revolves around training a model on data, then using that model to make predictions or decisions.
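To make the train-then-predict idea concrete, here is a minimal sketch in plain Python. It assumes the simplest possible model, a straight line fitted by ordinary least squares, and uses made-up numbers (hours studied versus exam score). The names `train` and `predict` are illustrative only and do not come from any library.

```python
# Toy training data, invented for illustration:
# hours studied (input) and exam score (output).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]


def train(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Use the learned slope and intercept on an input the model has not seen."""
    slope, intercept = model
    return slope * x + intercept


# Step 1: learn the pattern from the data.
model = train(hours, scores)

# Step 2: use the learned pattern on new data (6 hours of study).
prediction = predict(model, 6.0)
print(prediction)  # 62.0
```

Real projects usually reach for a library rather than hand-written math, but the two steps stay the same: fit on known data, then predict on new data.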

Dataset

So what is a dataset? A dataset in Machine Learning (ML) is a collection of structured data (like a spreadsheet) or unstructured data (like text, images, or audio) that serves as the foundation for training, validating, and testing ML models.

A dataset usually contains features (inputs) and labels (outputs) for supervised learning, or just features for unsupervised learning.
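As a rough illustration of features versus labels, the sketch below splits a tiny, made-up table into inputs and outputs. The column names (`weight_kg`, `height_cm`, `species`) and the values are invented for this example. The `X` and `y` names follow a common convention rather than any requirement.

```python
# A tiny, invented dataset: each row describes one animal.
rows = [
    {"weight_kg": 4.0, "height_cm": 25.0, "species": "cat"},
    {"weight_kg": 30.0, "height_cm": 60.0, "species": "dog"},
    {"weight_kg": 5.5, "height_cm": 28.0, "species": "cat"},
]

feature_names = ["weight_kg", "height_cm"]  # inputs the model sees
label_name = "species"                      # output the model should learn

# Supervised learning uses both the features (X) and the labels (y).
X = [[row[name] for name in feature_names] for row in rows]
y = [row[label_name] for row in rows]

# Unsupervised learning would use only the features.
X_unlabeled = X

print(X)  # [[4.0, 25.0], [30.0, 60.0], [5.5, 28.0]]
print(y)  # ['cat', 'dog', 'cat']
```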

Dataset Format

Datasets come in various formats depending on the type of data.

Fig: Dataset formats
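As an example of how format affects loading, the sketch below reads the same two records from CSV text and from JSON text using only Python's standard library. It assumes CSV and JSON are among the formats of interest, since they are two common choices for structured data. The records themselves are made up.

```python
import csv
import io
import json

# The same invented records, stored in two different formats.
csv_text = "name,age\nAda,36\nLinus,29\n"
json_text = '[{"name": "Ada", "age": 36}, {"name": "Linus", "age": 29}]'

# CSV: the first line is the header, every later line is one record.
csv_rows = list(csv.DictReader(io.StringIO(csv_text)))

# CSV values are always read as strings, so numeric columns
# have to be converted explicitly.
for row in csv_rows:
    row["age"] = int(row["age"])

# JSON: types such as numbers and strings are preserved by the format.
json_rows = json.loads(json_text)

print(csv_rows == json_rows)  # True
```

In practice the text would come from a file on disk rather than an inline string. `io.StringIO` is used here only so the example runs on its own.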

Where does this dataset come from?

A dataset can come from almost anywhere. We can collect data manually through surveys, interviews, or experiments, or obtain it from sources such as:

Open-source (publicly available) datasets from platforms like Kaggle, the UCI Machine Learning Repository, or Google Dataset Search. Examples include the Titanic dataset and the MNIST dataset.

Company or organizational data such as internal databases, logs, or CRM systems. Examples include customer purchase history and server logs.

Web scraping is another popular method, where data is collected from websites using tools like BeautifulSoup or Scrapy.

APIs are another method, where data is pulled from services such as the Twitter API, OpenWeatherMap API, or Google Maps API. IoT devices and sensors are a further source, where data is collected from smart devices or machinery.
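Two of the sources above, web scraping and APIs, can be sketched without any network access. This example uses only Python's standard library: `html.parser` stands in for BeautifulSoup or Scrapy so that nothing needs installing, and the API part parses a sample JSON string instead of calling a real service. The HTML page, the `https://api.example.com/v1/weather` endpoint, its parameters, and the response shape are all hypothetical.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urlencode

# --- Web scraping: pull post titles out of an (invented) HTML page ---
html_page = """
<html><body>
  <h2 class="title">First post</h2>
  <p>Not a title</p>
  <h2 class="title">Second post</h2>
</body></html>
"""


class TitleCollector(HTMLParser):
    """Collects the text inside every <h2 class="title"> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "h2" and ("class", "title") in attrs:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.titles.append(data.strip())


collector = TitleCollector()
collector.feed(html_page)
collector.close()
titles = collector.titles

# --- API: build a request URL and flatten a JSON response into rows ---
BASE_URL = "https://api.example.com/v1/weather"  # hypothetical endpoint


def build_request_url(city, units="metric"):
    """Build the URL we would request; the parameters are assumptions."""
    return BASE_URL + "?" + urlencode({"city": city, "units": units})


# Stand-in for the text a real API call would return.
sample_response = (
    '{"city": "Oslo", "readings": ['
    '{"hour": 9, "temp_c": 4.5}, {"hour": 12, "temp_c": 7.0}]}'
)


def to_rows(response_text):
    """Turn the nested JSON payload into flat, table-like rows."""
    payload = json.loads(response_text)
    return [
        {"city": payload["city"], "hour": r["hour"], "temp_c": r["temp_c"]}
        for r in payload["readings"]
    ]


url = build_request_url("Oslo")
api_rows = to_rows(sample_response)

print(titles)    # ['First post', 'Second post']
print(url)       # https://api.example.com/v1/weather?city=Oslo&units=metric
print(api_rows)
```

Either way, the result is a list of plain records that can be saved and later used as a dataset. Real services typically also require an API key and have usage terms, so it is worth checking their documentation before collecting data.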