How MUSIC Changes the Way You Read COMICS! – Comic Book Science || NerdSync


– [Voiceover] Hi, I’m Jerry Kurata. Welcome to the Pluralsight course on Understanding Machine Learning with R. In this course, you’ll learn
how to apply machine learning to solve problems that are difficult, and some might say impossible to solve with standard coding techniques. In this module, we’ll provide
some basic information about machine learning. This includes examples
of machine learning, a definition of machine learning, and importantly, how
machine learning differs from traditional programming. We’ll go over the two basic
types of machine learning, supervised and unsupervised. We’ll see each of these types in action, which will clarify how they differ, and when each type of machine
learning should be used. After that, we’ll review
the contents of this course, and the skills you need, and do not need, for this course. We’ll finish with a brief
discussion of how machine learning fits into the larger
subject of data science. Machine learning is one
of those technologies that is being used all around us, and we may not even realize it. For example, machine learning
is used to solve problems like determining if an email is spam, how people will vote in the next election, and what products people
are likely to buy. Every day you see these types
of machine learning solutions in action. When you open your email, messages are automatically scanned, classified as spam, and
moved to your spam folder. Since 2008, data scientist Nate Silver has been able to predict
with amazing accuracy the results of all major US elections before a single ballot is cast. And data giants like Google and Amazon have been able to predict which
items you’re looking to buy, and ensure you see ads for
those items on every webpage you visit. All of these useful, and sometimes annoying
and creepy behaviors, are the result of machine learning, but before we proceed, what do you think when you hear
the term “machine learning?” Perhaps something like this? While the thinking robots we see in movies likely use machine learning, defining the complex logic
that lets robots think is, unfortunately, beyond
the scope of this course, so let’s define machine
learning in general, and the specific type of
machine learning this course focuses on. For this course, we’ll
define machine learning to be building a model from example inputs to make data-driven predictions, versus following strictly
static programming instructions. This definition points out the key feature of machine learning, namely that the system learns
how to solve the problem from example data, rather than you writing specific logic. This is a significant departure
from how most programming is done. In more traditional programming, we carefully analyze the problem, and define the code that will
produce the desired results. We use constructs such as If statements, Case statements, and control loops
implemented with While and Until statements. Each of these has tests
that we have to define, and with the changing data
typical of machine learning, these tests can be difficult to maintain. In contrast, with machine learning, we don’t write the control
logic that produces the result. Instead, we gather the data we need, and modify its format into a
form machine learning can use. We then pass this data to an algorithm. The algorithm analyzes the data, and creates a model that
implements a solution to the problem based on the data. Machine learning
algorithms learn from data by utilizing one of two techniques, supervised or unsupervised
machine learning. While this course focuses on
supervised machine learning, it is important to
understand the difference between the two techniques, and when you should use
one versus the other. In supervised machine learning, the subject of this course, each row of data has fields
containing feature values and the value we want
the algorithm to predict. For example, suppose we want to predict the price of a house. To accomplish this, we would
pass a set of training data with each row of the data
containing features of the house, such as size, number of bedrooms, and the year the house was built, and the value we want to predict, namely the price. We pass many rows of this
training data to the algorithm. The algorithm analyzes the features, and the resultant price. It determines the relationship
between the features and the price, and creates a model that is trained to predict the price of the house, based on the features. Then when the trained model
is presented with data for a new house, it executes its logic, and accurately predicts
the price of the new house. Let’s contrast this to
unsupervised machine learning. In unsupervised machine learning, we are looking for clusters of like data. The algorithm analyzes input data, and identifies groups of data
that share the same traits. For example, let’s start with a recording of a room full of people talking. We can convert this recording into data, containing values for vocal
attributes, such as pitch, intonation, and inflection. We can then pass this mixed voice data to an unsupervised learning algorithm. The algorithm can analyze the voices, and create a model that clusters data, words in this case, that have certain patterns
of pitch, intonation, and other vocal features. This results in the ability to
isolate an individual’s voice from the mixture of voices. Since it’s important to
understand the correct technique to use to solve a problem, let’s recap the differences
between supervised and unsupervised machine learning. A primary difference
is the type of problem you are trying to solve. If you’re looking to predict a value like the price of a house, then you’re likely looking at a supervised machine learning problem. If you’re going into a set of data, and trying to define groups
or clusters of like items, then this is likely an unsupervised machine learning problem. The data we have also matters. Supervised machine learning
requires that we have some training data that has
the value we are trying to predict. With unsupervised machine learning, we don’t have the value. We are trying to figure out the values. As we saw in the examples, in supervised machine learning, we use the data with the value to train the model, so that it can predict future
values when it sees new data, like a new house. In contrast, with unsupervised
machine learning, we get clusters of like
data from the model. And importantly, in this course, we’ll cover supervised machine learning, and not unsupervised machine learning. This choice was made because many of the machine learning
problems you’re likely to run into are predictions, and thus solved with
supervised machine learning. Now, let’s take a few moments, and go over the content of the course. We’ll start with an overview of the machine learning workflow. This workflow will provide
the framework we use to approach our problem, and ensure we do not forget critical steps in developing the solution to the problem. Once we understand the
structure of the workflow, we’ll spend the majority of the course diving deep into each
step of the workflow. We will do this by applying
the workflow steps to a sample problem of
predicting if a flight will be on time. At the end of the course,
we’ll have a short review on what you have learned, and where you can go
from here on your journey to learn more about machine learning, but before we go further, let’s review the skills and
experience you should have to get the most from this course. Let’s start with the skills
I do not expect you to have. First of all, as the “With R” in the title implies, we’re going to be doing
our programming in R. However, you don’t need to have
any prior experience with R. This is a “learn by doing” course. I will introduce the parts
of R we need as we use them. Likewise, I will show
you how to use RStudio, which is an IDE that wraps
the R command line interface. While you could do this entire
course without RStudio, RStudio has nice features
like color-coded editing, autocompletion, script file integration, and data browsing that
make development easier, and less error-prone. Finally, I do not expect
you to have advanced math or statistics knowledge. While the algorithms are
based on advanced math and statistics, we’ll be focused on using existing algorithms, rather than creating new algorithms. Now, let’s take a look at
the skills and experience you do need to have. First, you need to have some
sort of programming experience. That can be in C, C#, VB, Java, Python, or
almost any other language. The language specifics don’t matter, but you need to be able to understand basic programming constructs such as assignment, conditional statements, loops, and passing parameters to functions. Second, you need to understand
and have some experience working with data in tables and lists. That is, you need to
understand that in a table, a given column contains
data of the same type, and that rows contain one or more columns. Also, that in lists, all of the data is the same type. Third, you need to have some basic math and statistics knowledge. The statistics needed are
not much beyond means, medians, max and min, but I am assuming this level of knowledge. Likewise, nothing more than
basic algebra skills is assumed. Finally, and most importantly, you need to have a curious
mind and be enthusiastic. This is a course about seeking
understanding through data. You’ll learn how, with a little manipulation, you can create models
that predict the future, but getting there will take
a little bit of effort. I’m going to assume that
if you’ve made it this far, it’s because you have an
interest in machine learning, but just in case you have some doubts, let’s talk about why you
want to watch the rest of this course. One obvious reason for taking this course will be to add a skill set, namely, machine learning. As developers, we always want
to add skills, and learn how to do things
in new and interesting ways, and just like when you learned
a new programming language, you need a set of
resources like this course to get you started. This course is about machine learning, and machine learning is
one of the key technologies used in the broader field
of study of data science. As we see, data science itself
is a blend of mathematics and statistics, software development, and expertise in the problem subject area. Machine learning is one of
the most important parts of data science, and is at the intersection
of software development and math and statistics. I know you have the software
development expertise, so in this course, we’ll
blend that with using math and statistical algorithms and models. And if you can add some
subject matter expertise, you can become a tech unicorn. One of the things that has made
data science so exciting now is the recognition of what
machine learning can do for companies. In particular, the ability
to predict outcomes can directly impact a
company’s bottom line by defining new business opportunities, and bringing in new customers. With this new awareness has
come a substantial increase in the salaries for data
science professionals, especially for tech unicorns, but beyond financial gains, I hope this course can
satisfy your curiosity about machine learning
and how it’s applied. Once you understand the
basics of machine learning, who knows where your skills can take you? Perhaps, this will be your next project. If so, I, for one, welcome
our new robot overlords. Now, let’s get started on
this machine learning journey. Head over to the next
section where we’ll discuss the machine learning workflow.
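
To make the supervised versus unsupervised contrast from this module concrete, here is a minimal sketch. It is not from the course: the course works in R, while this sketch uses plain Python, and the function names `fit_line` and `two_clusters`, the house sizes and prices, and the pitch values are all made up for illustration. The supervised half learns a price from labeled rows using a single feature (size) and ordinary least squares; the unsupervised half groups unlabeled values with a very small one-dimensional k-means. Real problems would use more features and a proper library.

```python
from statistics import mean


def fit_line(sizes, prices):
    """Supervised: learn price = slope * size + intercept from labeled rows."""
    mean_x, mean_y = mean(sizes), mean(prices)
    # Ordinary least squares for one feature.
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, prices))
             / sum((x - mean_x) ** 2 for x in sizes))
    intercept = mean_y - slope * mean_x
    return slope, intercept


def two_clusters(values, rounds=10):
    """Unsupervised: split unlabeled values into two groups (1-D k-means).

    Assumes the input holds at least two distinct values.
    """
    centers = [min(values), max(values)]  # deterministic starting guesses
    groups = [[], []]
    for _ in range(rounds):
        groups = [[], []]
        for v in values:
            # Assign each value to the nearer of the two centers.
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the mean of its group.
        centers = [mean(g) for g in groups]
    return groups


# Made-up training data: each row has a feature (size) and the value to predict (price).
sizes = [1000, 1500, 2000, 2500]
prices = [150000, 200000, 250000, 300000]
slope, intercept = fit_line(sizes, prices)
print("Predicted price for a 1800 sq ft house:", slope * 1800 + intercept)

# Made-up unlabeled data: pitch values with no speaker labels attached.
pitches = [110, 115, 120, 210, 220, 225]
print("Clusters found:", two_clusters(pitches))
```

Note that the supervised half needs the price column to learn from, while the unsupervised half is given no labels at all and only reports which values belong together.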
