
Applied Machine Learning in Python

Introduction ¶

This book aims to provide an accessible introduction to applying machine learning with Python, in particular using the scikit-learn library. I assume that you're already somewhat familiar with Python and the libraries of the scientific Python ecosystem. If you find that you have a hard time following some of the details of numpy, matplotlib and pandas, I highly recommend you look at Jake VanderPlas' Python Data Science Handbook.

Scope and Goals ¶

After reading this book, you will be able to do exploratory data analysis on a dataset, evaluate potential machine learning solutions, and implement and evaluate them. The focus of the book is on tried-and-true methodology for solving real-world machine learning problems. However, we will not go into the details of productionizing and deploying the solutions. We will mostly focus on what's known as tabular data, i.e. data that would usually be represented as a pandas DataFrame, Excel spreadsheet, or CSV file. While we will discuss working with text data in Chapter TODO, there are many more advanced techniques, for which I'll point you towards Dive into Deep Learning by Aston Zhang, Zachary C. Lipton, Mu Li, and Alexander J. Smola. We will not look at image recognition, video or speech data, or time series forecasting, though many of the core concepts described in this book also apply there.

What is machine learning? ¶

Machine learning, also known as predictive modeling in statistics, is a research field and a collection of techniques to extract knowledge from data, often used to automate decision-making processes. Applications of machine learning are pervasive in technology, in particular in complex websites such as Facebook, Amazon, YouTube or Google. These sites use machine learning to personalize the experience, show relevant content, decide on advertisements, and much more. Without machine learning, none of these services would look anything like they do today. Outside the web, machine learning has also become integral to commercial applications in manufacturing, logistics, material design, financial markets and many more. Finally, over the last few years, machine learning has also become essential to research in practically all data-driven sciences, including physics, astronomy, biology, medicine, earth sciences and social sciences.

There are three main sub-areas of machine learning: supervised learning, unsupervised learning, and reinforcement learning, each of which applies to a somewhat different setting. We'll discuss each in turn and give some examples of how they can be used.

Supervised Learning ¶

Supervised learning is by far the most commonly used in practice. In supervised learning, a model is built from a dataset of input-output pairs, where the input is known as the features or independent variables, which we'll denote by \(x\), and the output is known as the target or label, which we'll denote by \(y\). The input here is a representation of an entity of interest, say a customer of your online shop, represented by their age, location and shopping history. The output is a quantity of interest that we want our model to predict, say whether they would buy a particular product if we recommended it to them. To build a model, we need to collect many such pairs, i.e. we need to build records of many customers and their decisions about whether or not they bought the product after a recommendation was shown to them. Such a set of input-output pairs for the purpose of building a supervised machine learning model is called a training set.
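To make this concrete, here is a hypothetical miniature training set in the shape just described; the column names and values are invented purely for illustration. Each row is one customer (the input \(x\)), and the bought column is the target \(y\):

```python
# A hypothetical miniature training set: each row represents one customer (the
# input x), and the "bought" column is the target y we want to predict.
import pandas as pd

training_set = pd.DataFrame({
    "age":              [34, 52, 23, 41],
    "location":         ["Berlin", "NYC", "Paris", "NYC"],
    "n_past_purchases": [3, 11, 0, 7],
    "bought":           [False, True, False, True],
})
X = training_set.drop(columns="bought")  # the features
y = training_set["bought"]               # the target
```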


Once we have collected this dataset, we can (attempt to) build a supervised machine learning model that will make a prediction for a new user who wasn't included in the training dataset. That might enable us to make better recommendations, i.e. only show a recommendation to a user who is likely to buy.


The name supervised learning comes from the fact that during learning, the dataset contains the correct targets, which act as a supervisor for the model training.

For both regression and classification, it's important to keep in mind the concept of generalization. Let's say we have a regression task. We have features, that is data vectors \(x_i\), and targets \(y_i\) drawn from a joint distribution. We now want to learn a function \(f\) such that \(f(x)\) is approximately \(y\), not on the training data, but on new data drawn from the same distribution. This is what's called generalization, and it is the core distinction from function approximation: in principle we don't care how well we do on the \(x_i\), we only care how well we do on new samples from the distribution. We'll go into much more detail about generalization when we dive into supervised learning.
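Stated slightly more formally (a sketch using a generic loss function \(\ell\), which the text has not introduced yet), generalization means we want \(f\) to minimize the expected loss under the data distribution, not the average loss on the training points:

\[ \min_f \; \mathbb{E}_{(x, y) \sim p(x, y)} \left[ \ell(f(x), y) \right] \qquad \text{rather than} \qquad \min_f \; \frac{1}{n} \sum_{i=1}^{n} \ell(f(x_i), y_i). \]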

Classification and Regression ¶

\(^1\) There are many other kinds of supervised learning tasks, such as ranking or probability estimation; however, in this book we will focus on classification and regression, the most commonly used tasks.

There are two main kinds of supervised learning tasks, called classification and regression \(^1\). If the target of interest \(y\) that we want to predict is a continuous quantity, the task is a regression problem. If it is discrete, i.e. one of several distinct choices, then it is a classification problem. For example, predicting the time it will take a patient to recover from an illness, say measured in days, is a regression task. We might want our model to predict whether a patient will be ready to leave the hospital 3.5 days after admission, or 5, or 10. This is regression because time is clearly a continuous quantity, and there is a clear sense of ordering and distance between the different possible predictions. If the correct prediction is that the patient can leave after 4.5 days, but instead we predict 5, that might not be exactly correct, but it might still be a useful prediction. Even 6 might be somewhat useful, while 20 would be totally wrong.

\(^2\) This might be more naturally formulated as a multi-label task, which is basically a series of binary classification tasks. There could be more than one medication that leads to success, so this could be phrased as a yes/no question for each candidate.

An example of a classification task would be predicting which of a set of medications a patient will respond best to \(^2\). Here, we have a fixed set of disjoint candidates that are known a priori, and there is usually no order or sense of distance between the classes. If medication A is the best, then predicting any other medication is a mistake, so we need to predict the exact right outcome for the prediction to be accurate. A very common instance of classification is the special case of binary classification, where there are exactly two choices. Often this can be formulated as a "yes/no" question to which you want to predict an answer. Examples of this are "is this email spam?", "is there a pedestrian on the street?", "will this customer buy this product?" or "should we run an X-ray on this patient?".

The distinction between classification and regression is important, as it changes both the algorithms we will use and the way we measure success. For classification, a common metric is accuracy, the fraction of correctly classified examples, i.e. the fraction of times the model predicted the right class. For regression, on the other hand, a common metric is mean squared error, the average squared distance from the prediction to the correct answer. In other words, in regression you want the prediction to be close to the truth, while in classification you want to predict exactly the correct class. In practice, the difference is a bit more subtle, and we will discuss model evaluation in depth in chapter TODO.
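As a small illustration (a sketch, with invented numbers), this is how the two metrics behave in scikit-learn; note how mean squared error rewards being close, while accuracy only rewards exact matches:

```python
# Accuracy for classification, mean squared error for regression.
from sklearn.metrics import accuracy_score, mean_squared_error

# Classification: only exactly correct predictions count.
print(accuracy_score([0, 1, 1, 0, 1], [0, 1, 0, 0, 1]))  # 4 of 5 correct -> 0.8

# Regression: the average squared distance to the truth; close counts too.
print(mean_squared_error([4.5, 10.0, 3.0], [5.0, 9.0, 3.5]))
# (0.25 + 1.0 + 0.25) / 3 = 0.5
```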

Usually it's quite clear whether a task is classification or regression, but there are some cases that could be solved using either approach. A somewhat common example is ratings in the 5-star rating system that's popular on many online platforms. Here, the possible ratings are one star, two stars, three stars, four stars and five stars. These are discrete choices, so you could apply a classification algorithm. On the other hand, there is a clear ordering, and if the real answer is one star, predicting two stars is probably better than predicting five stars, which means it might be more appropriate to use regression. Which one is more appropriate depends on the particular algorithm you're using and how it integrates into your larger workflow.

Generalization ¶

When building a model for classification or regression, keep in mind that what we're interested in is applying the model to new data for which we do not know the outcome. If we build a model for detecting spam emails, but it only works on emails in the training set, i.e. emails the model has seen during model building, it will be quite useless. What we want from a spam detection algorithm is to predict reasonably well whether a new email that was not included in the training set is spam or not. The ability of a supervised model to make accurate predictions on new data is called generalization, and it is the core goal of supervised learning. Without asking for generalization, an algorithm could solve the spam detection task on the training data by just storing all the data and, when presented with one of these emails, looking up what the correct answer was. This approach is known as memorization, but it's impossible to apply to new data.
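The contrast between memorization and generalization is easy to demonstrate with a held-out test set. The following is a minimal sketch (the synthetic dataset and the choice of an unpruned decision tree are mine, purely for illustration): the tree can essentially memorize the training set, but only the held-out accuracy tells us whether it generalizes.

```python
# Memorization vs. generalization, measured with a held-out test set.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An unpruned tree can store the training set more or less verbatim ...
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("training accuracy:", tree.score(X_train, y_train))  # typically 1.0
# ... but only accuracy on unseen data measures generalization:
print("test accuracy:", tree.score(X_test, y_test))  # noticeably lower
```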

Conditions for success ¶

For a supervised learning model to generalize well, i.e. for it to be able to make accurate predictions on new data, some key assumptions must be met:

First, the necessary information for making the correct prediction actually needs to be encoded in the training data. For example, if I try to learn to predict a fair coin flip before the coin is tossed, I won't be able to build a working machine learning model, no matter what I choose as the input features. The process is entirely random, and the information to make a prediction is just not available. More technically, one might say the process has high intrinsic randomness that we cannot overcome by building better models. While you're unlikely to encounter a case as extreme (and obvious) as a coin toss, many processes in the real world are quite random (such as the behavior of people), and it's impossible to make entirely accurate predictions for them.

In other cases, a prediction might be possible in principle, but we might not have provided the right information to the model. For example, it might be possible for a machine learning model to learn to diagnose pneumonia in a patient, but not if the only information we use to represent the patient is their shopping habits and wardrobe. If we use a chest X-ray as a representation of the patient, together with a collection of symptoms, we will likely have better success. Even if the information is represented in the input, learning might still fail if the model is unable to extract it. For example, visual stimuli are very easy for humans to interpret, but in general much harder for machine learning algorithms to understand. Consequently, it would be much harder for a machine to determine whether a piece of graffiti is offensive from a photograph than if the same information were represented as text.

Secondly, the training dataset needs to be large and varied enough to capture the variability of the process. In other words, the training data needs to be representative of the whole process, not only a small portion of it. Humans are very good at abstracting properties, and a child will be able to understand what a car is after seeing only a handful. Machine learning algorithms, on the other hand, require the variability to be present in the training data. For example, to learn what a car looks like, an algorithm likely needs to see pictures of vans, of trucks, of sedans, pictures from the front, the side and above, pictures of cars parking and in traffic, pictures in rain and in sunshine, in a garage and outdoors, maybe even pictures taken by a phone camera and pictures taken by a news camera. As we said before, the whole point of supervised learning is to generalize, so we want our model to apply to new settings. However, how new a setting can be depends on the representation of the data and the algorithm in question. If the algorithm has only ever seen trucks, it might not recognize a sedan. If the algorithm has never seen a snow-covered car, it's unlikely to recognize one. Photos (also known as natural images in machine learning) are a very extreme example, as they have a lot of variability and so often require a lot of training data. If your data has a simple structure, or the relationship between your features and your target is simple, then only a handful of training examples might be enough.

For example, if the target is (approximately) a linear function of a single feature, such as a taxi fare that grows roughly linearly with trip distance, a handful of training examples can be enough to pin down the relationship.

Third and finally, the data that the model is applied to needs to be generated from the same process as the data the model was trained on. A model can only generalize to data that in essence adheres to the same rules and has the same structure. If I collect data about public transit ridership in Berlin, and use it to make predictions in New York, my model is likely to perform poorly. While I might be able to measure the same things, say number of people at stations, population density, holidays etc., there are so many differences between data collected in Berlin and data collected in New York that it's unlikely a model trained on one could predict well on the other. As another example, let's say you train an image recognition model for recognizing hot dogs on a dataset of stock photos, and you want to deploy it to an app using a phone camera. This is also likely to fail, as stock photography doesn't resemble photos taken by users pointing their phone. Stock photography is professionally produced and well-lit, the angles are carefully chosen, and often the food is altered to show it in its best light (have you noticed how food in a restaurant never looks like it does in a commercial?). In short, machine learning requires you to use a training dataset that was generated by the same process as the data the model will be applied to.

Mathematical Background ¶

From a mathematical standpoint, supervised learning assumes that there is a joint distribution \(p(x, y)\), and that the training dataset consists of independent, identically distributed (i.i.d.) samples from this joint distribution. The model is then applied to new data sampled from the same distribution, but for which \(y\) is unknown. The model is used to estimate \(p(y | x)\), or more commonly the mode of this distribution, i.e. the most likely value for \(y\) to take given the \(x\) we observed. In the case of learning to predict a coin flip, you could actually learn a very accurate model of \(p(y | x)\) that predicts heads and tails with equal probability. There is no way to predict the particular outcome itself, though.
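In this notation, predicting the mode of the conditional distribution can be written as (with \(\hat{y}\) denoting the model's prediction):

\[ \hat{y}(x) = \arg\max_{y} \; p(y \mid x). \]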

The third requirement for success is easily expressed as saying that the test data is sampled i.i.d. from the same distribution \(p(x, y)\) that the training data was generated from.

Unsupervised Learning ¶

In unsupervised machine learning, we are usually just given data points \(x\), and the goal is to learn something about the structure of the data. This is usually a more open-ended task than what we saw in supervised learning. It is called unsupervised because even during training, there is no "supervision" providing a correct answer. There are several sub-categories of unsupervised learning that we'll discuss in Chapter 3, in particular clustering, dimensionality reduction, and signal decomposition. Clustering is the task of finding coherent groups within a dataset, say subgroups of customers that behave in a similar way, such as "students", "new parents" and "retirees", each with a distinct shopping pattern. However, here, in contrast to classification, the groups are not pre-defined. We might not know what the groups are, how many groups there are, or even whether there is a coherent way to define any groups. There might also be several different ways the data could be grouped: say you're looking at portraits. One way to group them could be by whether the subject wears glasses or not. Another could be by the direction they are facing. Yet another might be hair color or skin color. If you tell an algorithm to cluster the data, you don't know which aspect it will pick up on, and usually manually inspecting the groups or clusters is the only way to interpret the results.
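As a small preview of Chapter 3, here is a minimal clustering sketch; the synthetic blob data and the choice of the k-means algorithm are illustrative. Note that we have to pick the number of clusters ourselves, and nothing in the data tells us that three is "correct":

```python
# A minimal clustering sketch with k-means on synthetic data.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
# We must choose the number of clusters; there are no labels to check against.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(kmeans.labels_[:10])  # cluster assignments for the first ten points
```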

Two other, related, unsupervised learning tasks are dimensionality reduction and signal decomposition. In these, we are not looking for groups in the data, but for underlying factors of variation that are potentially more semantic than the original representation. Going back to the example of portraits, an algorithm might find that head orientation, lighting and hair color are important aspects of the image that vary independently. In dimensionality reduction, we are usually looking for a representation that is lower-dimensional, i.e. that has fewer variables than the original feature space. This can be particularly useful for visualizing datasets with many features, by projecting them into a two-dimensional space that's easily plotted. Another common application of signal decomposition is topic modeling of text data. Here, we are trying to find topics among a set of documents, say news articles, court documents, or social media posts. This is related to clustering, though with the difference that each document can be assigned multiple topics: topics in the news could be politics, religion, sports and economics, and an article could be about both politics and economics.
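For the visualization use case, a dimensionality reduction sketch might look like this (using PCA on the classic iris dataset as an illustrative choice, not a recommendation):

```python
# Projecting a four-dimensional dataset to two dimensions for plotting.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, y = load_iris(return_X_y=True)            # 150 samples, 4 features each
X_2d = PCA(n_components=2).fit_transform(X)  # 150 samples, 2 derived features
print(X.shape, "->", X_2d.shape)             # (150, 4) -> (150, 2)
```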

Both clustering and signal decomposition are most commonly used in exploratory analysis, where we are trying to understand the data. They are less commonly used in production systems, as they lend themselves less easily to automating a decision process. Sometimes signal decomposition is used as a mechanism to extract more semantic features from a dataset, on top of which a supervised model is learned. This can be particularly useful if there is a large amount of data, but only a small amount of annotated data, i.e. data for which the outcome \(y\) is known.
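One common way to wire this combination up in scikit-learn is a pipeline in which an unsupervised decomposition feeds its output features into a supervised model; the particular pairing below (PCA followed by logistic regression on iris) is just one plausible choice for illustration:

```python
# Unsupervised feature extraction feeding a supervised model.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

X, y = load_iris(return_X_y=True)
model = make_pipeline(PCA(n_components=2), LogisticRegression(max_iter=1000))
model.fit(X, y)
print(model.score(X, y))  # training accuracy of the combined pipeline
```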

Reinforcement Learning ¶

The third main family of machine learning tasks is reinforcement learning, which is quite different from the other two. Both supervised and unsupervised learning basically work on a dataset that was collected and stored, from which we then build a model; potentially, this model is then applied to new data in the future. In reinforcement learning, on the other hand, there is no real notion of a dataset. Instead, reinforcement learning is about a program (usually known as an agent) interacting with a particular environment. Through this interaction, the agent learns to achieve a particular goal. A good example of this is a program learning to play a video game. Here, the agent would be an AI playing the game, while the environment would be the game itself, i.e. the world in which it plays out. The agent observes the environment and has a choice of actions (say moving forward, moving backward, and jumping), and each of these actions will put the environment in a new state (i.e. with the agent placed a bit forward, or backward, or falling into a hole). Given the new state of the environment, the agent again chooses an action, and the environment transitions to a new state as a consequence.


Fig. 1 The reinforcement learning cycle. ¶

The learning in reinforcement learning happens via so-called rewards, which need to be specified by the data scientist building the system. The agent is trained to seek rewards (hence the name reinforcement learning), and will find series of actions that maximize its reward. In a game, a reward could be given to the agent every time it scores points, or just once when it wins the game. In the second case, there might be a long delay between the agent taking an action and the agent winning the game, and one of the main challenges in reinforcement learning is dealing with such settings (this is known as the credit assignment problem: which of my actions should get credit for winning the game?).

Compared with supervised learning, reinforcement learning is a much more indirect way to specify the learning problem: we don't provide the algorithm with the correct answer (i.e. the correct sequence of actions to win the game); instead, we only reward the agent once it achieves a goal. This can work surprisingly well in practice. It is like learning a game without anyone ever telling you the rules or the goal of the game, but only telling you whether you lost or won at the end. As you might expect, it might take you many, many tries to figure out the game.

However, algorithms are notoriously patient, and researchers have been able to use reinforcement learning to create programs that can play a wide variety of complex games. Perhaps one of the most surprising and impressive feats was learning to play the ancient Chinese board game of Go at a superhuman level.


When this was publicized in TODO, many researchers in the area were shocked, as the game was known to be notoriously hard, and many believed it could not be learned by any known algorithm. While the initial work used some human knowledge, later systems learned to play the game from scratch, i.e. without any rewards other than for winning the game, by the agent repeatedly playing against itself. The resulting programs now play at a superhuman level, meaning they are basically unbeatable, even by the best human players in the world. Similar efforts are now underway for other games, in particular computer games like StarCraft II and DOTA.

Algorithms achieved super-human performance in the game of chess long before this, in the year TODO, with the famous matches between Kasparov and Deep Blue. Chess has far fewer possible moves, and games are much shorter sequences of actions than in Go or StarCraft, which makes it much easier to devise algorithms to play chess.

Reinforcement learning also has a long history in other areas, in particular robotics, where it is used for learning and tuning behaviors such as walking or grasping. While many impressive achievements have been made with reinforcement learning, there are several aspects that limit its broad usefulness. A potential application of reinforcement learning could be self-driving cars. However, as mentioned above, reinforcement learning usually requires many attempts or iterations before it learns a specific task. If I wanted a car to learn to park, it might fail thousands or hundreds of thousands of times first. Unfortunately, in the real world this is impractical: self-driving cars are very expensive, and we don't want to crash them over and over again. It might also be risky for the person conducting the experiment. With thousands of attempts, even if the car doesn't crash, the gas will run out, and the person having to reset the experiment every time will probably get very tired very quickly. Therefore, reinforcement learning is most successful when there is a good way to simulate the environment, as is the case with games and with some aspects of robotics. For learning how to park, a simulation might actually work well, as the sensing of other cars and the steering of the car can be simulated well. However, for really learning how to drive, a car would need to be able to deal with a variety of situations, such as different weather conditions, crowded streets, people running into the street, kids chasing balls, navigating detours and many other scenarios. Simulating these in a realistic way is very hard, and so reinforcement learning is much harder to apply in the physical world.

A setting that has attracted some attention, and might become more relevant soon, is online platforms that are not games. You could think of a social media timeline as an agent that gets rewarded for you looking at it. Right now, this is often formulated as a supervised learning task (or, more accurately, active learning). However, your interactions with social media are not usually independent events: your behavior online is shaped by what is being presented to you, and what was shown to you in the past might influence what is shown to you in the future. A maybe somewhat cynical analogy would be to think of the timeline as an agent playing you, winning whenever you stay glued to the screen (or click an ad or buy a product). I'm not aware that this has been implemented anywhere, but as computational capacity increases and algorithms become more sophisticated, it is a natural direction to explore.

Reinforcement learning is a fascinating topic, but well beyond the scope of this book. For an introduction, see the classic textbook by Sutton and Barto (TODO). For an overview of modern approaches, see TODO.

As you might have noticed in the table of contents, this book mostly concerns itself with supervised and unsupervised learning, and we will not discuss reinforcement learning any further. As a matter of fact, the book heavily emphasizes supervised learning, which has found the largest success of the three in practical applications so far. While all three of these areas are interesting in their own right, when you see an application of machine learning, or when someone says they are using machine learning for something, chances are they mean supervised learning, which is arguably the best understood of the three, and the easiest to productionize and analyze.

Isn’t this just statistics? ¶

A very common question I get is "isn't machine learning just statistics?", and I want to quickly address how the approach in this book differs from the approach taken in a statistics class or textbook. The machine learning community and the statistics community have some historical differences (ML being born much later, and from within computer science), but study many of the same subjects. So I don't think it makes sense to say that one thing is statistics and the other is machine learning. However, there is usually a somewhat different emphasis in the kinds of problems and questions that are addressed in each, and I think it's important to distinguish these tasks. Much of statistics deals with inference, which means that given a dataset, we want to make statements that hold for the dataset (often called the population in statistics) as a whole. Machine learning, on the other hand, often emphasizes prediction, which means we are looking to make statements about each sample, in other words individual-level statements. Asking "do people that take zinc get sick less often?" is an inference question, as it asks whether something happens on average over the whole population. A related prediction question would be "will this particular person get sick if they take zinc?". The answer to the inference question would be either "yes" or "no", and using hypothesis testing methodology this statement could have an effect size and a significance level attached to it. The answer to the prediction question would be a prediction for each sample of interest, or maybe even a program that can make predictions given information about a new patient.

As you can see, these are two fundamentally different kinds of questions, and require fundamentally different kinds of tools to answer them. This book solely looks at the prediction task, and we consider a model a good model if it can make good predictions. We do not claim that the model allows us to make any statistical or even causal statements that hold for the whole population, or the process that generated the dataset.

There are some other interesting differences between the kinds of prediction questions studied in supervised machine learning and the inference questions traditionally studied in statistics; in particular, machine learning usually assumes that we have access to data that was generated from the process we want to model, and that all samples are created equal (i.i.d.). Statistical inference usually makes no such assumptions, and instead assumes that we have some knowledge about the structure of the process that generated the data. As an example, consider predicting a presidential election outcome. As of this writing, there are 58 past elections to learn from. For a machine learning task, this is by no means enough observations. But even worse, these samples are not created equal. The circumstances of the first election were clearly different from what they will be for the next election. The economic and societal situation will be different, as will be the candidates. So really, we have no examples whatsoever from the process that we're interested in. However, by understanding all the differences to previous elections, we might still be able to make accurate forecasts using statistical modeling.


If you’re interested in a discussion of prediction vs inference and how they relate, I highly recommend the seminal paper Statistical Modeling: The Two Cultures by Leo Breiman.

Some of my favorite machine learning textbooks are written by statisticians (the subfield is called predictive modeling), and there are certainly machine learning researchers that work on inference questions, so I think making a distinction between statistics and machine learning is not that useful. However, if you look at how a statistics textbook teaches, say, logistic regression, the intention is likely to be inference, and so the methods will be different from this book, where the emphasis is on prediction, and you should keep this in mind.

This is not to say that one is better than the other in any sense, but that it’s important to pick the right tool for the job. If you want to answer an inference question, the tools in this book are unlikely to help you, but if you want to make accurate predictions, they likely will.

The bigger picture ¶

This book is mostly technical in nature, with an emphasis on practical programming techniques. However, there are some important guiding principles for developing machine learning solutions that are often forgotten by practitioners who find themselves deep in the technical aspects. In this section, I want to draw your attention to what I think are crucial aspects of using machine learning in applications. It might seem a bit dry for now, but I encourage you to keep these ideas in mind while working through the rest of the book, and maybe come back here at the end, once you've got your toes a bit wet.

The machine learning process ¶

Outside of the depicted process are the formulation of the problem and the definition of measures, both of which are critical, but usually not part of the loop. The actual machine learning process itself starts with data collection, which might mean mining historical data, labeling data by hand, running simulations, or even performing actual physical experiments. Once the data is collected, it needs to be processed into a format suitable for machine learning, which we'll discuss in more detail in Chapter TODO. Before building the model, exploratory data analysis and visualization are essential to form or confirm intuitions about the structure of the data, to spot potential data quality issues, to select suitable candidate models, and potentially to generate new features. The next step, model building, usually involves building several candidate models, tweaking them, and comparing them. Once a model is selected, it is usually evaluated first in an offline manner, that is, using already collected data.


Then, potentially, it is further validated in a live setting with current data. Finally, the model is deployed into the production environment. For a web app, deployment might be deployment in the software sense: deploying a service that takes user data, runs the model, and renders some outcome on your website. For industrial applications, deployment could mean integrating your defect detection into an assembly line and discarding defective items; if your model is for evaluating real estate, deployment might mean buying highly valued properties.

This process is depicted as a circle, as deployment usually generates new data, or informs future data collection, and restarts the process. While it is drawn as a single loop, the process is really more like a fully connected graph: after each step, you might decide to go back to a previous step and start over, improving your model or your process. At any point, you might find a data quality issue, figure out new, informative ways to represent the data, or find out that your model doesn't perform as well as you thought. Each time, you might decide to revisit any of the previously taken steps. Usually there are many iterations before reaching integration and deployment for the first time, as deploying an unsuitable model might represent a substantial risk to your project.

The rest of the book will focus on model building and evaluation, which are at the core of machine learning. However, for a successful project, all of the steps in the process are important. Formulating the problem, collecting data, and establishing success metrics are often at least as crucial as selecting the right model and tweaking it. Given the technical nature of the material presented in this book, it’s easy to lose sight of how critical all the steps of the process are. We will discuss some of these in a bit more detail now.

The role of data ¶

Clearly, the data used for building and evaluating a machine learning model is a crucial ingredient. Data collection is often overlooked in machine learning education, where students usually work with fixed datasets, and the same is true for online competitions and platforms such as Kaggle. However, in practice, data collection is usually part of building any machine learning application, and there is usually a choice to collect additional data, or to change the data collection. Having more data can be the difference between a model that's not working and a model that outperforms human judgement, in particular if you can collect data that covers the variability you will encounter in prediction. Sometimes it might be possible to collect additional features that make the task much easier, and selecting what data to collect is often as critical as selecting the right model. Usually it's easier to throw away data later than to add new fields to the data collection. It's common for data scientists to start working on a model only to discover that a critical aspect of the process was not logged, and a task that could have been easy becomes extremely hard.


Potentially one of the most ingenious ways to capture labeled training data is ReCAPTCHA. It provides a service to verify that a web user is not a bot by having them solve visual tasks. The solutions are then used as ground-truth annotations for training machine learning models.

Depending on the problem you're tackling, the effort and cost of data collection can vary widely. In some settings, the data is basically free and endless. Say you want to predict how much attention a post will receive on social media. As long as your post is similar to other posts on the platform, you can obtain arbitrary amounts of training data by looking at existing posts and collecting the number of likes, comments and other engagement. This data-rich situation often appears when you are trying to predict the future, and you can observe the labels of past data simply by waiting, i.e. seeing how many people like a photo. In some cases the same might be true for showing ads or recommendations, where you are able to observe past behavior of users, or in content moderation, where users might flag offending content for you. This assumes that the feedback loop is relatively short and the events repeat often, though. If you work in retail, the two data points that are most crucial (at least in the US) are Black Friday and Christmas. And while you might be able to observe them, you can only observe them once a year, and if you make a bad decision, you might go out of business before observing them again.

Another common situation is automating a business process that was previously done manually. Usually collecting the answers is not free in this setting, but it's often possible to collect additional data by manual annotation. The price of collecting more data then depends on the level of qualification required to create accurate labels, and the time involved. If you want to detect personal attacks in your online community, you can likely use a crowd-sourcing platform or a contractor to get reasonable labels. If your decision requires expert knowledge, say, which period a painting was created in, hiring an expert might be much more expensive or even impossible. In this situation, it's often worth asking yourself what is more cost-effective: spending time building and tuning a complex machine learning model, or collecting more data and potentially getting results with less effort. We will discuss how to make this decision in TODO.

Finally, there are situations where getting additional data is infeasible or impossible; in these situations, people speak of precious data. Examples of this could be the outcome of a drug trial, which is lengthy and expensive and where collecting additional data might not be feasible, or the simulation of a complex physical system, or observations from a scientific measurement. Maybe each sample corresponds to a new microchip architecture for which you want to model energy efficiency. These are the settings where tweaking your model and diving deep into the problem might pay off, but such situations are overall rather rare in practice.


Feedback loops in data collection ¶

One particular aspect that is often neglected in data collection is that deploying a machine learning model might change the process generating the data. A simple example of this would be a spammer who, once a model is able to flag their content, changes their strategy or content so as to no longer be detected. Clearly, the data here changed as a consequence of deploying the model, and a model that might have been able to accurately identify spam in an offline setting might not work in practice. In this example, there is adversarial intent, and the spammers intentionally try to defeat the model. However, similar changes might happen incidentally and still invalidate a previous model. For example, when building systems for product recommendation, the model often relies on data that was collected using some other recommendation scheme, and the choice of this scheme clearly influences what data will be collected. If a streaming platform never suggests a particular movie, it's unlikely to be seen by many users, and so it will not show up in the user data that's collected, and so a machine learning algorithm will not recommend it, creating a feedback loop that leads to the movie being ignored. There is a whole subset of machine learning devoted to this kind of interactive data collection, called active learning, where the data that is collected is closely related to the model that's being built. This area also has a close relation to reinforcement learning.

Given the existence of these feedback loops, it's important to ensure that your model performs well not only in an offline test, but also in a production environment. Often this is hard to simulate, as you might not be able to anticipate the reaction of your users to deploying an algorithm. In this case, A/B testing might be a way to evaluate your system more rigorously.


A particularly nefarious example of this feedback loop has been observed (TODO citation) in what is known as predictive policing. The premise of predictive policing is to send police patrols to neighborhoods where they expect to observe crime, at times when they expect to observe crime. However, if police are sent to a neighborhood, they are likely to find criminal activity there (even if it might be minor), and clearly they will not find criminal activity in neighborhoods they did not patrol. Historically, police patrols in certain US cities have focused on non-white neighborhoods, and given this historical data, predictive policing methods steered patrols to these same neighborhoods. This led them to observe more crime there, leading to more data showing crime in these neighborhoods, leading to more patrols being sent there, and so on.

Metrics and evaluation ¶

One of the most important parts of machine learning is defining the goal, and defining a way to measure that goal. The first part of this is having the right data for evaluating your model: data that reflects the way the model will be used in production. Equally important is establishing a measure of impact for your task. Usually your application is driven by some ultimate goal, such as user engagement, revenue, keeping patients healthy, or any number of possible motivations. The question now is how your machine learning solution will impact this goal. It's important to note that the goal is rarely, if ever, to make accurate predictions. It's not some evaluation metric that counts, but the real-world impact of the decisions made by your model.

\(^3\) Here and in the following I will talk of your goals in terms of a business, however, if your goal is scientific discovery, health or justice, the same principles apply.

There are some common hurdles in measuring the impact of your model. Often, the effect on the bottom line is only very indirect. If you're removing fake news from your social media platform, this will not directly increase your ad revenue, and removing a particular fake news article will probably have no measurable impact. However, curating your platform will help maintain a brand image and might drive users to your platform, which in turn will create more revenue. But this effect is likely to be mixed in with many other effects, and much delayed in time. So data scientists often rely on surrogate metrics, measures that relate to intermediate business goals \(^3\) and can be measured more directly, such as user engagement or click-through rate.

The problem with such surrogate metrics is that they might not capture what you assume they capture. I heard an (if not true, then at least illustrative) anecdote about optimizing the placement of an ad on a shopping website. An optimization algorithm placed it right next to the search button, with the same color as the search button, which resulted in the most clicks. However, when analyzing the results more closely, the team found that the clicks were caused by users missing the search button and accidentally clicking the ad, resulting not in any sales, but in irritated users who had to go back and search again.

There is usually a hierarchy of measurements, from the accuracy of a model on an offline holdout dataset, which is easy to calculate but can be misleading in several ways, to more business-specific metrics that can be evaluated on an online system, to the actual business goal. Moving from evaluating just the model to evaluating the whole process, and then to how the process integrates into your business, makes evaluation more complex and more risky. Usually, evaluation on all levels is required if possible: if a model does well in offline tests, it can be tried in a user study. If the user study is promising, it can be deployed more widely, and potentially an effect on the actual objective can be observed. However, often we have to be satisfied with surrogate metrics at some level, as it's unlikely that each model will have a measurable impact on the bottom line of a complex product.

One aspect that I find is often overlooked by junior data scientists is establishing a baseline. If you are employing machine learning in any part of your process, you should have a baseline of not employing machine learning. What are your gains if you do? What if you replace your deep neural network with the simplest heuristic that you can come up with? How will it affect your users? There are cases in which the difference between 62% accuracy and 63% accuracy can have a big impact on the bottom line, but more often than not, small improvements in the model will not drastically alter the overall process and result.
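scikit-learn ships a dummy estimator precisely for this kind of sanity check. Here is a sketch (with an invented, imbalanced synthetic dataset) of how a trivial "always predict the majority class" baseline puts a model's accuracy in perspective:

```python
# Comparing a real model against a no-machine-learning baseline.
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# An imbalanced problem: about 90% of samples belong to one class.
X, y = make_classification(n_samples=1000, weights=[0.9], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
model = LogisticRegression().fit(X_train, y_train)
print("baseline accuracy:", baseline.score(X_test, y_test))  # ~0.9 with no learning at all
print("model accuracy:   ", model.score(X_test, y_test))
```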

When developing any machine learning solution, always keep in mind how your model will fit into the overall process, what consequences your predictions have, and how to measure the overall impact of the decisions made by the model.

When to use and not to use machine learning ¶

As you might be able to tell by the existence of this book, I'm excited about machine learning and an avid advocate. However, I think it is crucial not to fall victim to hype, and to carefully consider whether a particular situation calls for a machine learning solution. Many machine learning practitioners get caught up in the (fascinating) details of algorithms and datasets, but lose perspective of the bigger picture. To the data scientist with a machine learning hammer, too often everything looks like a classification nail. In general, I would recommend restricting yourself to supervised learning in most practical settings; in other words, if you do not have a training dataset for which you know the outcome, it will be very hard to create an effective solution. As mentioned before, machine learning will be most useful for making individual-level predictions, not for inference. I also already laid out some prerequisites for using supervised learning in the respective section above. Let's assume all of these criteria are met and you have carefully chosen your business-relevant metrics. This still doesn't mean machine learning is the right solution for your problem. There are several aspects that need to be balanced: on the one hand, there is the positive effect a successful model can have; on the other hand, there is the cost of developing the initial solution. Is it worth your time as a data scientist to attack this problem, or are there problems where you can have a bigger impact? There is also the even greater cost of maintaining a machine learning solution in a production environment [SHG+14]. Machine learning models are often opaque and hard to maintain. The exact behavior depends on the training data, and if the data changes (maybe the trends on social media change, or the political climate changes, or a new competitor appears), the model needs to be adjusted. A model might also make unexpected predictions, potentially leading to costly errors or annoyed customers. All of these issues need to be weighed against the potential benefits of using a model.

Your default should be not to use machine learning, unless you can demonstrate that your solution improves the overall process and impacts relevant business goals, while being robust to possible changes in the data and potentially even to adversarial behavior. Try hard to come up with heuristics that outperform any model you develop, and always compare your model to the simplest approach and the simplest model that you can think of. And keep in mind: don't evaluate these via model accuracy; evaluate them on something relevant to your process.

Ethical aspects of machine learning ¶

One aspect of machine learning that has only recently been getting much attention is ethics. The field of ethics in technology is quite broad, and machine learning and data science raise many of the same questions associated with any use of technology. However, there are many situations where machine learning quite directly impacts individuals, for example when hiring decisions, credit approvals, or even risk assessments in the criminal justice system [BHJ+18] are powered by machine learning. Given the complexity of machine learning algorithms, the intricate dependencies on the training data, and the potential for feedback loops, it is often hard to assess the impact that the deployment of an algorithm or model can have on individuals. However, that by no means relieves data scientists of the responsibility to investigate potential issues of bias and discrimination in machine learning. There is a growing community that investigates fairness, accountability and transparency in machine learning and data science, providing tools to detect and address issues in algorithms and datasets. On the other hand, there are some who question algorithmic solutions to ethical issues, and ask for a broader perspective on the impact of data science and machine learning on society [KHD19] [FL20]. This is a complex topic, and so far there is little consensus on best practices and concrete steps. However, most researchers and practitioners agree that fairness, accountability and transparency are essential principles for the future of machine learning in society. While approaches to fair machine learning are beyond the scope of this book, I want to encourage you to keep the issues of bias and discrimination in mind. Real-world examples, such as the use of predictive policing, racial discrimination in criminal risk assessment, and gender discrimination in ML-driven hiring, unfortunately abound. If your application involves humans in any capacity (and most do), make sure to pay special attention to these topics, and research best practices for evaluating your process, data, and modeling.

Scaling up ¶

This book focuses on using Python and scikit-learn for machine learning. One of the main limitations of scikit-learn that I'm often asked about is that it is usually restricted to using a single machine, not a cluster (though there are some ways around this in some cases). My standpoint on this is that for most applications, using a single machine is often enough, easier, and potentially even faster [RND+12]. Once your data is processed to a point where you can start your analysis, few applications require more than at most several gigabytes of data, and many applications only require megabytes. These workloads are easily handled on modern machines: even if your machine does not have enough memory, it's quick and cheap to rent machines with hundreds of GB of RAM from a cloud provider, and do all your machine learning in memory on a single machine. If this is possible, I would encourage you to go with this solution. The interactivity and simplicity that come from working on a single machine are hard to beat, and the selection of libraries available is far greater for local computations. Even if your raw data, say user logs, is many terabytes, it may well be that the data needed for machine learning, once extracted, is only in the hundreds of megabytes, and so after preparing the data in a distributed environment such as Spark, you can transition to a single machine for your machine learning workflow. Clearly there are situations when a single machine is not enough; large tech companies often use their own bespoke in-house systems to learn models on immense data streams. However, most projects don't operate on the scale of the Facebook timeline or Google searches, and even if your production environment requires truly large amounts of data, prototyping on a subset on a single machine can be helpful for a quick exploratory analysis or a prototype. Avoid premature optimization and start small, where small these days might mean hundreds of gigabytes.

Richard Berk, Hoda Heidari, Shahin Jabbari, Michael Kearns, and Aaron Roth. Fairness in criminal justice risk assessments: the state of the art. Sociological Methods & Research , pages 0049124118782533, 2018.

Sina Fazelpour and Zachary C Lipton. Algorithmic fairness from a non-ideal perspective. In AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society (AIES) 2020 . 2020.

Os Keyes, Jevan Hutson, and Meredith Durbin. A mulching proposal: analysing and improving an algorithmic system for turning the elderly into high-nutrient slurry. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing systems , 1–11. 2019.

Antony Rowstron, Dushyanth Narayanan, Austin Donnelly, Greg O’Shea, and Andrew Douglas. Nobody ever got fired for using hadoop on a cluster. In Proceedings of the 1st International Workshop on Hot Topics in Cloud Data Processing , 1–5. 2012.

David Sculley, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, and Michael Young. Machine learning: the high interest credit card of technical debt. In SE4ML: Software Engineering for Machine Learning (NIPS 2014 Workshop) . 2014.

Your browser is ancient! Upgrade to a different browser to experience this site.

Applied Machine Learning in Python

Description.

This course will introduce the learner to applied machine learning, focusing more on the techniques and methods than on the statistics behind these methods. The course will start with a discussion of how machine learning is different than descriptive statistics, and introduce the scikit learn toolkit through a tutorial. The issue of dimensionality of data will be discussed, and the task of clustering data, as well as evaluating those clusters, will be tackled. Supervised approaches for creating predictive models will be described, and learners will be able to apply the scikit learn predictive modelling methods while understanding process issues related to data generalizability (e.g. cross validation, overfitting). The course will end with a look at more advanced techniques, such as building ensembles, and practical limitations of predictive models. By the end of this course, students will be able to identify the difference between a supervised (classification) and unsupervised (clustering) technique, identify which technique they need to apply for a particular dataset and need, engineer features to meet that need, and write python code to carry out an analysis.

This course should be taken after Introduction to Data Science in Python and Applied Plotting, Charting & Data Representation in Python and before Applied Text Mining in Python and Applied Social Analysis in Python.

based on 7398 ratings

applied machine learning in python coursera assignment 2

Applied Data Science with Python

U-M Credit Eligible

applied machine learning in python coursera assignment 2

Kevyn Collins-Thompson

Associate Professor

School of Information

Know someone who would like this course? Share it with them!

Share on Facebook

Share on Twitter

Share on LinkedIn

  • Find a Course
  • For Business
  • For Educators
  • Product News

Hands-on, applied learning on Coursera

June 7, 2021

applied machine learning in python coursera assignment 2

By Melody Liu, Senior Product Manager, Coursera Labs

Demand for people with data science and artificial intelligence (AI) skills continues to rise. Data science roles are growing by 37% annually, and for artificial intelligence jobs, the annual growth rate is 74% . These are specialized skills that require hands-on learning to master, but fortunately, opportunities to build and practice in-demand data science and AI skills like PyTorch, SQL, and Python are more accessible than they’ve ever been before.

We’ve seen that with hands-on programming assignments, you can increase your skill development by up to 30% . The ability to practice the skills you’re learning in real-time—on real-world projects—can positively impact the rate at which you learn as well as the depth of your mastery. You’ll also build career confidence through knowing how to apply your newfound skills on the job.

Below we highlight three types of hands-on learning you can find on Coursera. We also share details on how you can get unlimited access to these learning programs with Coursera Plus and confidently grow your in-demand career skills.

Assess and advance your technical skills with Lab Assignments

You can develop your technical skills in courses that contain Lab Assignments. With Lab Assignments, you can work on a coding challenge in a pre-configured environment directly in your internet browser with all of the instruction files already uploaded—no additional setup required. You’ll get real-time feedback from autograders as you work on your assignment so you can easily track your progress and monitor your skill development. 


There are nearly 200 courses that feature Lab Assignments. Start exploring some of the Specializations below or view the list of all courses with Lab Assignments.

Here are a few of the most popular Specializations on Coursera with Lab Assignments: 

  • Applied Data Science with Python Specialization from University of Michigan
  • Excel to MySQL: Analytic Techniques for Business Specialization from Duke University
  • Mathematics for Machine Learning from Imperial College London
  • Python 3 Programming from University of Michigan
  • Machine Learning and Reinforcement Learning in Finance Specialization from New York University 

Practice Programming with Lab Sandbox

Many top technical courses feature Lab Sandbox, an in-browser cloud workspace pre-configured with the tools and packages relevant to the course. With Lab Sandbox, you can practice coding, work on programming assignments, run test cases, play with demo examples, and even build your own project. 96% of learners we surveyed who tried Lab Sandbox were highly satisfied with the experience. Your work is not graded, there are no instruction files, and you don’t need to set up or configure a local coding environment—you just go straight to learning and practicing job-relevant skills.

Ready to try it out? We’ve curated a collection of courses so you can easily identify which courses have Lab Sandbox. You’ll have access as soon as you enroll and pay for your course. We’ll continue adding Lab Sandbox to more technical courses in the coming year, so you’ll continue to have an expanded selection of opportunities to take advantage of this feature. 


Learn job-relevant skills in under two hours with Guided Projects

With Guided Projects, you can get hands-on practice with some of the most in-demand skills in under two hours. Whether you want to learn a new tool for a project you’re leading at work or quickly build a particular skill for an upcoming job interview, Guided Projects offer ideal hands-on learning experiences that are short, focused, and effective.

When you enroll in a Guided Project, you get access to a virtual cloud desktop with the required software and data pre-installed and configured. You’ll experience a side-by-side interface in a web browser where you can follow along with video instructions on one side of the screen as you practice the tasks in real time in a virtual workspace on the other side. This enables you to seamlessly learn how to use everything from Tableau for data visualization and SQL for querying to TensorFlow for machine learning, and so much more.


To date, there have been more than 2.5 million enrollments across the 1700+ Guided Projects currently available on Coursera. Explore some of the most popular Guided Projects in the collections below: 

  • Data Visualization Projects in Python with Plotly and Seaborn
  • TensorFlow and Keras Projects for Beginners
  • Software Development Projects for Beginners
  • Android Development Projects for Beginners

Get unlimited access to the largest volume of hands-on learning experiences with Coursera Plus

You can lock in unlimited access to most courses with Lab Assignments and Lab Sandbox—as well as nearly 2,000 Guided Projects—with Coursera Plus, now available as a monthly subscription. Try it for free with a 7-day trial. You can also choose an annual subscription plan to get the best value on hands-on learning to build job-relevant, technical skills.

Whether you’re specifically honing your Python skills in a Lab Assignment, trying out SQL in a Guided Project, or taking advantage of Coursera Plus to build your skill set in multiple learning programs, you’ll always be able to practice and apply what you’ve learned as you develop new skills and build confidence in your career.


Here are 7 free AI classes you can take online from top tech firms, universities


AI set the tech world on fire last year and it’s not stopping anytime soon.

Almost a quarter of global jobs are expected to change within the next five years thanks to AI, and with only a small percentage of workers skilled in this field, the rush to learn the ins and outs of AI is ever more important.


“AI is providing people with on-demand learning anywhere they are at any time of day on any day,” says Jared Curhan, a professor of work and organizational studies at MIT’s Sloan School of Management. Curhan recently launched two new AI-powered courses focused on the world of strategic negotiation and says that the technology is overall making education more accessible with personalized feedback and coaching.

While there are an increasing number of full-fledged AI degree programs, including within business schools, some students may be looking for a simpler or self-paced route. If you’re interested in learning more about this in-demand field, several top tech firms and universities offer free online courses that serve as an introduction to AI technologies.

AWS Skill Builder

Amazon has more than 100 free and low-cost AI courses and learning resources available through AWS. Learners can build basic skills in machine learning, generative AI, and foundation models. The company has committed to providing free AI skills training to 2 million people by 2025.

The machine learning plan has nearly seven hours of free content in which individuals can learn the foundations of the technology, including relevant terminology and decision-making processes. It also teaches users how to use Amazon SageMaker, the company’s machine learning platform, which is used by companies like AT&T and LG.

Google: Google AI for Everyone

Google offers a beginner course for anyone who may be interested in how AI is being used in the real world. Google AI for Everyone, which is offered through the online education platform edX, is a self-paced course that takes about four weeks to complete, assuming you dedicate two to three hours per week. Participants learn about both AI and machine-learning principles and real-world applications of the technologies.

Google also covers what AI programming looks like and the process of teaching a computer how to learn. The course is taught by Laurence Moroney, who leads AI Advocacy at Google as part of the Google Research into Machine Intelligence (RMI) team. Nearly 12,000 people have enrolled in this free online course, according to edX.

Harvard University: Introduction to Artificial Intelligence with Python

If you’re one of the 5.7 million people who have taken Harvard University’s CS50 Introduction to Computer Science course through edX, then the university’s introductory AI class might be the best option for you. CS50, one of the most popular free online courses of all time, is a prerequisite for Harvard’s Introduction to Artificial Intelligence with Python course.

This seven-week course covers AI algorithms, game-playing engines, handwriting recognition, and machine translation. Students have to commit between 10 and 30 hours per week to complete the course, which includes hands-on projects and lectures. The course is taught by David J. Malan, a renowned computer scientist and Harvard professor.

IBM: AI Foundations for Everyone

IBM, which is recognized as a revolutionary leader in emerging technologies, offers an AI Foundations for Everyone specialization through Coursera. The specialization includes three courses:

  • Introduction to Artificial Intelligence (AI)
  • Getting Started with AI using IBM Watson
  • Building AI Powered Chatbots Without Programming   

The entire specialization takes about three months to complete, assuming you dedicate two hours per week to coursework. Students will learn the basics of what AI is, as well as its applications and ethical concerns. They’ll also hear from experts about starting a career in AI. The program is taught by Rav Ahuja and Antonio Cangiano, who work for IBM’s Skills Network. Participants earn a certificate upon completion.

Intel

Intel has a goal to provide more than 30 million people with AI skills by 2030. As part of this commitment, the company provides dozens of free self-paced courses online on subjects such as deep learning for robotics, deep learning, and natural language processing.

Intel also has several “AI Concepts” educational pages that will walk you through definitions, real-world examples, tools, and resources for topics such as generative AI, AI inference, and transfer learning. Additionally, the company provides free on-demand webinars on more advanced AI use cases such as optimizing transformer models, optimizing AI workloads, and AI performance tuning.

University of California—Davis: Big Data, Artificial Intelligence, and Ethics

As part of its Computational Social Science specialization through Coursera, the University of California—Davis offers a course focused on AI: Big Data, Artificial Intelligence, and Ethics. During this four-week course, participants learn about big data and its limitations, the history of artificial intelligence, and research ethics. The entire self-paced course takes about 12 total hours to complete.

The course is taught by Martin Hilbert, who is a professor at UC Davis and serves as a chair for computational social science. The course uses case studies to help participants learn AI concepts. More than 31,000 participants have completed this course, and those who do earn a certificate that can be shared on LinkedIn.

University of Pennsylvania: AI for Business

For someone who may be looking to break into AI or who wants to learn more about the applications of this new technology to different industries, the University of Pennsylvania offers a series of courses focused on artificial intelligence. The AI for Business specialization includes four courses:

  • AI Fundamentals for Non-Data Scientists
  • AI Applications in Marketing and Finance
  • AI Applications in People Management
  • AI Strategy and Governance  

These beginner courses take a total of about four months to complete and culminate in an applied learning project. Program participants complete peer-reviewed exercises to illustrate what they’ve learned about data analytics, machine learning tools, and people management. The specialization is taught by eight UPenn professors from the Wharton School, a business school ranked highly by Fortune Education, and other professors from the university. The courses are offered through the online education platform Coursera, and students can earn a certificate that can be displayed on their LinkedIn profile.

Frequently asked questions

Which course is best in AI?

There is no one best course or program since AI is still so new. What ultimately matters is your curiosity to learn about AI, which you can do by working directly with prompt engineering or machine learning to gain hands-on skills.

Can I learn AI in three months?

You can certainly learn the foundations of AI in three months—especially if you already have a background in computer science. It is important to keep in mind that because AI is always changing and developing, you will need to keep up to date with the latest trends if you are looking to pursue a career focused on working with the technology.

How can beginners learn AI?

Taking free AI courses on platforms such as Udemy or Codecademy is a great way to learn AI if you’re a beginner. You can also learn AI by watching YouTube videos or reading through AI subreddits. The number of ways to learn AI is only growing, so there is ultimately no perfect path. Above all, just be curious, ask important questions, and don’t be afraid to dive down rabbit holes.

Check out all of Fortune’s rankings of degree programs, and learn more about specific career paths.

Sydney Lake contributed to this piece.



