What is Machine Learning?
Machine learning is nothing short of phenomenal. A vision once deemed impossible is becoming a reality and is changing the world at an unprecedented pace. Imagine if a computer could learn like a human: no explicit programming, simply learning from experience just like us. Billions of people interact with machine learning daily without even realizing it. Let us walk through how we got to this point and how machine learning has revolutionized the world. At a high level, machine learning is a subset of artificial intelligence, the branch of computer science that focuses on making machines adaptive and responsive to their environment.
Fundamentals of Learning Algorithms
I am now going to introduce you to the very fundamentals of how machines can learn without being explicitly programmed. In machine learning, we can use different types of learning to train machines. There are three main types: supervised learning, unsupervised learning, and reinforcement learning. Supervised learning trains a model on examples in which each input is paired with its known output. The machine's job, therefore, is to learn the mapping that leads from the input to the output and to find any relevant patterns. It does this by analyzing the inputs with respect to the outputs.
In a neural network, each neuron computes a weighted combination of its inputs and passes the result on to the neurons in the next layer. The layers that sit between the input and the output are known as hidden layers. Every connection carries a weight, and these weights are adjusted as more training samples are used. Hidden layers can be a concern in high-stakes environments, for example healthcare. Because the intermediate computation is hard to inspect, the network behaves like a “black box”: it is difficult to understand the computation that resulted in a given output. The applications of AI are therefore limited in some areas by this lack of interpretability.
A low-stakes example of supervised learning is a model that predicts housing prices from several inputs (e.g. location, square footage, number of rooms), with price as the output. The machine can therefore learn the importance, or weighting, of each input variable with regard to the output. Weighting lets the model stay true to life, where some factors (e.g. locality) can have more influence on price than others (e.g. age of the property).
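To make the idea of weighting concrete, here is a minimal sketch of my own, not a production approach. It assumes just two inputs (square footage and number of rooms), a handful of invented data points, and a made-up "true" pricing rule used to label them. The function name `fit_linear_model` and every number in it are hypothetical. A simple linear model is fitted by gradient descent, and the learned weights show how much each input contributes to the price.

```python
# Toy supervised learning: learn how much each input contributes to price.
# All numbers are invented for illustration.
# Each row: (square footage in thousands, number of rooms).
features = [(1.0, 2.0), (1.5, 3.0), (2.0, 3.0), (2.5, 4.0), (3.0, 5.0), (1.2, 1.0)]

# Assumed "true" rule used to label the data (price in units of $100k).
TRUE_WEIGHTS = (1.5, 0.5)
TRUE_BIAS = 0.2
prices = [TRUE_WEIGHTS[0] * sqft + TRUE_WEIGHTS[1] * rooms + TRUE_BIAS
          for sqft, rooms in features]


def fit_linear_model(inputs, targets, learning_rate=0.02, epochs=20000):
    """Fit price = w0*sqft + w1*rooms + b by gradient descent on mean squared error."""
    weights = [0.0, 0.0]
    bias = 0.0
    n = len(inputs)
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for (x0, x1), target in zip(inputs, targets):
            # Prediction error for this labeled example.
            error = weights[0] * x0 + weights[1] * x1 + bias - target
            grad_w[0] += 2 * error * x0 / n
            grad_w[1] += 2 * error * x1 / n
            grad_b += 2 * error / n
        # Nudge each weight in the direction that reduces the error.
        weights[0] -= learning_rate * grad_w[0]
        weights[1] -= learning_rate * grad_w[1]
        bias -= learning_rate * grad_b
    return weights, bias


weights, bias = fit_linear_model(features, prices)
print("learned weights:", [round(w, 3) for w in weights], "bias:", round(bias, 3))
```

Because the labels here were generated from the assumed rule, the learned weights should land close to 1.5 and 0.5, meaning the model has recovered that square footage matters more than room count in this toy dataset.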
In this example, if we have numerous input features, the network can contain hundreds of neurons. A larger network can capture more complex relationships between the features and the price. Each neuron computes a different intermediate value from its inputs and feeds it to the neurons in the next layer. Eventually, all neurons feed into a single output neuron, which combines their results into the final prediction.
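The flow of information through such a network can be sketched in a few lines. This is an illustrative forward pass only, assuming a tiny network with two inputs, two hidden neurons and one output neuron. The weights are hand-picked, hypothetical values rather than trained ones, and the helper names (`neuron`, `forward`) are my own.

```python
def relu(value):
    """Rectified linear unit: pass positive values through, clamp negatives to 0."""
    return max(0.0, value)


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias term."""
    return sum(i * w for i, w in zip(inputs, weights)) + bias


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Run the inputs through one hidden layer, then a single output neuron."""
    hidden = [relu(neuron(inputs, w, b))
              for w, b in zip(hidden_weights, hidden_biases)]
    prediction = neuron(hidden, output_weights, output_bias)
    return prediction, hidden


# Hypothetical, hand-picked weights (a trained network would learn these).
inputs = [1.0, 2.0]
hidden_weights = [[1.0, 0.5], [-1.0, 0.0]]
hidden_biases = [0.0, 0.0]
output_weights = [3.0, 4.0]
output_bias = 1.0

prediction, hidden = forward(inputs, hidden_weights, hidden_biases,
                             output_weights, output_bias)
print("hidden layer:", hidden)   # [2.0, 0.0]
print("prediction:", prediction) # 7.0
```

Each hidden neuron produces its own intermediate value, and the single output neuron combines them into one number. Training would adjust the weights; this sketch only shows how a prediction is computed once the weights are fixed.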
Reinforcement learning is talked about less often than supervised and unsupervised learning. Essentially, the model is given a desired outcome and has to find a way to achieve it. Steps that move the model away from its goal are penalized, whereas steps that bring it closer to its objective are rewarded. The system therefore learns which actions lead to the greatest reward. An example is a program learning to play a video game. We give the program a desired outcome for the game, and it goes through many rounds of trial and error to find the best way to achieve the game's objectives. Falling off the map would incur a penalty, whereas reaching the desired location would earn a reward (e.g. 500 points).
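The video game idea can be sketched with tabular Q-learning, one common reinforcement learning algorithm (I am choosing it for illustration; the text above does not prescribe a specific one). The "game" is a hypothetical five-cell corridor: stepping onto cell 0 means falling off the map, and cell 4 is the desired location worth 500 points. The penalty of -100, the learning rate, the discount factor and the exploration rate are all assumptions of mine.

```python
import random

ACTIONS = ("left", "right")
CLIFF, GOAL = 0, 4            # cell 0 = fell off the map, cell 4 = desired location
START_STATES = (1, 2, 3)      # the cells the player can stand on


def step(state, action):
    """Move one cell and return (next_state, reward, episode_finished)."""
    next_state = state + (1 if action == "right" else -1)
    if next_state == GOAL:
        return next_state, 500.0, True     # reward from the example in the text
    if next_state == CLIFF:
        return next_state, -100.0, True    # assumed penalty size
    return next_state, 0.0, False


def choose_action(q, state, epsilon, rng):
    """Mostly pick the best-known action, but sometimes explore at random."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state].values())
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def train(episodes=1000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by trial and error over many episodes."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in START_STATES}
    for _ in range(episodes):
        state = rng.choice(START_STATES)
        done = False
        while not done:
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Value of the best follow-up action (nothing follows a finished episode).
            future = 0.0 if done else max(q[next_state].values())
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = next_state
    return q


q_table = train()
policy = {s: max(q_table[s], key=q_table[s].get) for s in START_STATES}
print(policy)
```

After enough trials the learned policy should prefer "right" in every cell, because that is the only direction that leads to the reward and away from the penalty.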
Moving on to unsupervised learning, this is when we feed the model unlabeled data, which is often unstructured. The model then looks for patterns and structure within the data, and this is how it learns. Examples of unstructured data are videos, images and audio files. The algorithms that power unsupervised learning differ from those used in supervised learning. Whereas supervised learning may use regression, or neural networks with activation functions such as ReLU (rectified linear unit), unsupervised learning uses algorithms such as k-means clustering, and it can also use neural networks. Unsupervised learning can find patterns in data that may be overlooked by humans.
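As a rough illustration of k-means clustering, the sketch below groups six invented 2D points into two clusters without ever being told which point belongs where. It is a simplified version of the algorithm: I start the centroids at the first k points so the result is reproducible, whereas real libraries typically use random or k-means++ initialization. The data and function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise average of a list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, iterations=10):
    """Alternate between assigning points to centroids and moving the centroids."""
    # Simplification: use the first k points as the starting centroids.
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = mean_point(members)
    return labels, centroids


# Invented, unlabeled data with two visually obvious groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
labels, centroids = k_means(points, k=2)
print(labels)     # [0, 1, 0, 1, 0, 1]
print(centroids)
```

No labels were supplied, yet the algorithm separates the points near (1, 1) from the points near (8.5, 9), which is the kind of pattern finding described above.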
To summarize, supervised learning trains on labeled data, whereas unsupervised learning trains on unlabeled data in which the model is expected to find patterns. Neural networks can be used in both supervised and unsupervised learning. Reinforcement learning is an approach in which the model teaches itself by means of reward and penalty.
Whichever approach is used, the quality of the training data matters, and this is where the importance of clean data comes into play, which we will explore in the next section.
Importance of Clean Data
Due to the rise of big data, there is a multitude of data that can be collated, analyzed, and used as training data on a scale never seen before. A model trained on a large dataset can reach a high level of accuracy and covers a wide range of cases, so more of its predictions are interpolations between examples it has already seen. However, this leads me to a crucial point: the data that you train your model on dictates the result of your model. We have a saying in machine learning, “garbage in, garbage out”, meaning that if you train a model on a dataset that is not representative of the real world, it will perform poorly in practice. Therefore, you must verify that the data was collected in a sound way and comes from a reputable source; otherwise the model will have low generalizability and little to no utility in a practical real-world setting.
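A small experiment can illustrate the effect of unrepresentative data. In this sketch I assume an invented "real world" in which the output grows with the square of the input, and I fit the same simple straight-line model twice: once on data covering only a narrow slice of that world, and once on data covering the full range. Both models are then scored on held-out points from the full range. The ground-truth rule, the ranges and the function names are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def mean_squared_error(model, xs, ys):
    """Average squared gap between the model's predictions and the true values."""
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def real_world(x):
    """Assumed ground truth, invented for this illustration."""
    return x ** 2


# Unrepresentative training set: only small inputs were collected.
narrow_xs = [1.0, 2.0, 3.0]
# Representative training set: inputs span the whole range seen in practice.
broad_xs = [float(x) for x in range(1, 11)]
# Held-out points from the full range, standing in for real-world usage.
test_xs = [x + 0.5 for x in range(1, 10)]
test_ys = [real_world(x) for x in test_xs]

narrow_model = fit_line(narrow_xs, [real_world(x) for x in narrow_xs])
broad_model = fit_line(broad_xs, [real_world(x) for x in broad_xs])

narrow_error = mean_squared_error(narrow_model, test_xs, test_ys)
broad_error = mean_squared_error(broad_model, test_xs, test_ys)
print("error when trained on narrow data:", round(narrow_error, 1))
print("error when trained on representative data:", round(broad_error, 1))
```

The model and the fitting procedure are identical in both runs; only the training data differs. The model trained on the narrow slice ends up with a much larger error on the held-out points, which is "garbage in, garbage out" in miniature.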