Monday, 12 February 2024

Where machine learning is used in data science

 Although machine learning is mainly linked to the data-modeling step of the data science process, it can be used at almost every step. The data science process is shown below.

The data modeling phase can’t start until you have good-quality raw data you can understand. But before that, the data preparation phase can benefit from machine learning. For example, when cleansing a list of text strings, machine learning can group similar strings together so that spelling errors become easier to correct.
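The string-cleansing idea above can be sketched in a few lines of Python. This is only a rough illustration, not a trained model: it uses greedy similarity grouping with the standard library's difflib as a crude stand-in for clustering. The function name group_similar, the 0.8 threshold, and the sample city names are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def group_similar(strings, threshold=0.8):
    """Greedily group strings that closely resemble a group's first member.

    threshold is an assumed cut-off on difflib's similarity ratio (0 to 1).
    """
    groups = []
    for s in strings:
        for group in groups:
            ratio = SequenceMatcher(None, s.lower(), group[0].lower()).ratio()
            if ratio >= threshold:
                group.append(s)
                break
        else:
            # No existing group was similar enough, so start a new one.
            groups.append([s])
    return groups


# Hypothetical dirty data: the same cities typed in slightly different ways.
cities = ["Amsterdam", "Amsterdm", "Brussels", "Brusels", "amsterdam", "Paris"]
for group in group_similar(cities):
    print(group)
```

Once the variants sit in the same group, a person (or a simple rule such as "keep the most frequent spelling") can correct them in one pass.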

Machine learning is also useful when exploring data. Algorithms can root out underlying patterns in the data where they’d be difficult to find with only charts.

Given that machine learning is useful throughout the data science process, it shouldn’t come as a surprise that a considerable number of Python libraries were developed to make your life a bit easier.

Applications of machine learning in data science

 Regression and classification are of primary importance to a data scientist. To achieve these goals, one of the main tools a data scientist uses is machine learning. The uses for regression and automatic classification are wide ranging, such as the following:

  • Finding oil fields, gold mines, or archeological sites based on existing sites (classification and regression)
  • Finding place names or persons in text (classification)
  • Identifying people based on pictures or voice recordings (classification)
  • Recognizing birds based on their whistle (classification)
  • Identifying profitable customers (regression and classification)
  • Proactively identifying car parts that are likely to fail (regression)
  • Identifying tumors and diseases (classification)
  • Predicting the amount of money a person will spend on product X (regression)
  • Predicting the number of eruptions of a volcano in a period (regression)
  • Predicting your company’s yearly revenue (regression)
  • Predicting which team will win the Champions League in soccer (classification)
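To make the difference between the two task types concrete, here is a minimal sketch in plain Python. Regression predicts a number (here with an ordinary least-squares line), while classification predicts a label (here with a nearest-centroid rule). The function names, the customer figures, and the choice of these two simple techniques are assumptions made for illustration; real projects would normally rely on a library.

```python
def fit_line(xs, ys):
    """Regression: fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def nearest_centroid(training, value):
    """Classification: return the label whose average training value is closest."""
    centroids = {label: sum(vals) / len(vals) for label, vals in training.items()}
    return min(centroids, key=lambda label: abs(value - centroids[label]))


# Regression: predict how much a customer spends from the number of visits.
visits = [1, 2, 3, 4]
spend = [10, 20, 30, 40]
slope, intercept = fit_line(visits, spend)
print("Predicted spend after 5 visits:", slope * 5 + intercept)

# Classification: label a customer as profitable or not from yearly spend.
training = {"profitable": [80, 95, 120], "not profitable": [5, 12, 20]}
print("Customer spending 70 is:", nearest_centroid(training, 70))
```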

Occasionally data scientists build a model (an abstraction of reality) that provides insight into the underlying processes of a phenomenon. When the goal of a model isn’t prediction but interpretation, it’s called root cause analysis. Here are a few examples:

  • Understanding and optimizing a business process, such as determining which products add value to a product line
  • Discovering what causes diabetes
  • Determining the causes of traffic jams


This list of machine learning applications is only an appetizer, because machine learning is ubiquitous within data science. Regression and classification are two important techniques, but the repertoire and the applications don’t end there; clustering is one other example of a valuable technique.

What is machine learning and why should we care about it

 “Machine learning is a field of study that gives computers the ability to learn without being explicitly programmed.”
                                                 —Arthur Samuel, 1959

 

The definition of machine learning coined by Arthur Samuel is often quoted and is genius in its broadness, but it leaves you with the question of how the computer learns. To achieve machine learning, experts develop general-purpose algorithms that can be used on large classes of learning problems. When you want to solve a specific task you only need to feed the algorithm more specific data. In a way, you’re programming by example. In most cases a computer will use data as its source of information and compare its output to a desired output and then correct for it. The more data or “experience” the computer gets, the better it becomes at its designated job, like a human does.
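The "compare the output to a desired output and then correct for it" loop can be sketched as follows. This is a deliberately tiny illustration, assuming a model with a single weight and a simple error-correction update; the function name train, the learning rate, and the example data (where the desired output is always three times the input) are all hypothetical.

```python
def train(examples, learning_rate=0.01, epochs=200):
    """Learn a single weight so that weight * x approximates the desired output."""
    weight = 0.0  # the model starts out knowing nothing
    for _ in range(epochs):
        for x, desired in examples:
            output = weight * x                   # what the model currently predicts
            error = desired - output              # compare with the desired output
            weight += learning_rate * error * x   # correct in the direction of the error
    return weight


# Programming by example: nobody tells the model "multiply by 3",
# it only sees input/output pairs.
examples = [(1, 3), (2, 6), (3, 9), (4, 12)]

print("After 1 pass over the data:   ", train(examples, epochs=1))
print("After 200 passes over the data:", train(examples, epochs=200))
```

With more passes over the examples, the learned weight moves closer to 3, which mirrors the point that more "experience" tends to make the model better at its job.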

When machine learning is seen as a process, the following definition is insightful:

“Machine learning is the process by which a computer can work more accurately as it collects and learns from the data it is given.”
                                                —Mike Roberts
 

For example, as a user writes more text messages on a phone, the phone learns more about the messages’ common vocabulary and can predict (autocomplete) their words faster and more accurately.
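A toy version of this behavior can be sketched with a word counter. Real phone keyboards are far more sophisticated; the class name Autocomplete, its learn and suggest methods, and the sample messages are assumptions made purely for illustration.

```python
from collections import Counter


class Autocomplete:
    """Suggest the word the user has typed most often for a given prefix."""

    def __init__(self):
        self.counts = Counter()

    def learn(self, message):
        # Every message the user writes adds to the model's "experience".
        self.counts.update(message.lower().split())

    def suggest(self, prefix):
        prefix = prefix.lower()
        candidates = [
            (count, word)
            for word, count in self.counts.items()
            if word.startswith(prefix)
        ]
        if not candidates:
            return None
        # Highest count wins; ties are broken alphabetically.
        return min(candidates, key=lambda c: (-c[0], c[1]))[1]


phone = Autocomplete()
phone.learn("see you tomorrow")
phone.learn("see you soon")
phone.learn("sounds good see you tonight")
print(phone.suggest("to"))  # "tomorrow" and "tonight" are tied at this point

phone.learn("tonight works")
print(phone.suggest("to"))  # "tonight" has now been seen more often
```

The suggestion for the same prefix changes as more messages are written, which is the learning-from-data effect the definition describes.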


In the broader field of science, machine learning is a subfield of artificial intelligence and is closely related to applied mathematics and statistics. All this might sound a bit abstract, but machine learning has many applications in everyday life.

Here are the important points from the above discussion:


  1. Defined by Arthur Samuel as a field of study that enables computers to learn without explicit programming. Experts develop general-purpose algorithms for large classes of learning problems.
  2. For specific tasks, more specific data is fed to the algorithm.
  3. Machines use data as their source of information and compare their output to desired outputs.
  4. As a process, machine learning improves accuracy by collecting and learning from given data.
  5. Examples include a phone learning a user's common vocabulary and predicting the user's words faster and more accurately.
  6. Machine learning is a subfield of artificial intelligence, closely related to applied mathematics and statistics.
