Graphical Representation

Graphical representation is a way of analysing numerical data. It exhibits the relation between data, ideas, information and concepts in a diagram. It is easy to understand and is one of the most important learning strategies. The appropriate representation depends on the type of information in a particular domain. There are different types of graphical representation. Some of them are as follows:

  • Line Graphs – A line graph (or linear graph) is used to display continuous data. It is useful for showing trends over time, which can help in predicting future values.
  • Bar Graphs – A bar graph is used to display categorical data; it compares the data using solid bars to represent the quantities.
  • Histograms – A graph that uses bars to represent the frequency of numerical data organised into intervals. When all the intervals are equal and continuous, all the bars have the same width.
  • Line Plot – It shows the frequency of data on a given number line. An ‘x’ is placed above the number line each time that value occurs.
  • Frequency Table – The table shows the number of pieces of data that fall within each given interval.
  • Circle Graph – Also known as a pie chart, it shows the relationship of the parts to the whole. The full circle represents 100%, and each category occupies a sector proportional to its percentage, such as 15% or 56%.
  • Stem and Leaf Plot – In a stem and leaf plot, the data are organised from the least value to the greatest value. The digits of the least place value form the leaves, and the digits of the next place value form the stems.
  • Box and Whisker Plot – The plot summarises the data by dividing it into four parts. The box and whiskers show the range (spread) and the middle (median) of the data.
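
As a rough illustration of the stem and leaf description above, the following Python sketch splits each value into a stem (all digits except the last) and a leaf (the last digit). The function name `stem_and_leaf` and the sample values are made up for this example, and the sketch assumes non-negative whole numbers.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group non-negative integers by stem (leading digits) and leaf (last digit)."""
    groups = defaultdict(list)
    # Sorting first keeps the leaves in each row ordered from least to greatest.
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        groups[stem].append(leaf)
    return dict(groups)


# Hypothetical sample data, chosen only for illustration.
sample = [12, 15, 21, 23, 23, 30, 34, 38, 41]
plot = stem_and_leaf(sample)

for stem in sorted(plot):
    leaves = " ".join(str(leaf) for leaf in plot[stem])
    print(f"{stem} | {leaves}")
```

Each printed row shows one stem followed by its leaves, so the row for stem 2 lists the values 21, 23 and 23.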

General Rules for Graphical Representation of Data

There are certain rules to effectively present the information in the graphical representation. They are:

  • Suitable Title: Give the graph an appropriate title that indicates the subject of the presentation.
  • Measurement Unit: Mention the measurement unit in the graph.
  • Proper Scale: Choose a proper scale so that the data are represented accurately.
  • Index: Provide an index (legend) that explains the colours, shades, lines and designs used in the graph, for better understanding.
  • Data Sources: Include the source of the information at the bottom of the graph wherever necessary.
  • Keep it Simple: Construct the graph in a simple way that everyone can understand.
  • Neat: Choose the size, fonts, colours, etc. so that the graph works as a visual aid for presenting the information.

Graphical Representation in Maths

In Mathematics, a graph is defined as a chart with statistical data, which are represented in the form of curves or lines drawn across the coordinate points plotted on its surface. It helps to study the relationship between two variables by measuring the change in one variable with respect to another within a given interval of time. It also helps to study the time series and the frequency distribution for a given problem. There are two types of graphs to visually depict the information. They are:

  • Time Series Graphs – Example: Line Graph
  • Frequency Distribution Graphs – Example: Frequency Polygon Graph

Principles of Graphical Representation

Algebraic principles are applied to all types of graphical representation of data. A graph is drawn using two lines called coordinate axes. The horizontal axis is denoted as the x-axis and the vertical axis is denoted as the y-axis. The point at which the two axes intersect is called the origin ‘O’. On the x-axis, distances to the right of the origin take positive values and distances to the left of the origin take negative values. Similarly, on the y-axis, points above the origin take positive values and points below the origin take negative values.

Generally, the frequency distribution is represented in four methods, namely

  • Smoothed frequency graph
  • Pie diagram
  • Cumulative or ogive frequency graph
  • Frequency Polygon

Merits of Using Graphs

Some of the merits of using graphs are as follows:

  • The graph is easily understood by everyone without any prior knowledge.
  • It saves time
  • It allows us to relate and compare the data for different time periods
  • It is used in statistics to determine the mean, median and mode for different data, as well as in the interpolation and the extrapolation of data.

Example for Frequency Polygon Graph

Here are the steps to follow to draw a frequency polygon from a frequency distribution.

  • Obtain the frequency distribution and find the midpoint of each class interval.
  • Represent the midpoints along the x-axis and the frequencies along the y-axis.
  • Plot the points corresponding to the frequency at each midpoint.
  • Join these points in order, using line segments.
  • To complete the polygon, join the point at each end to the midpoint (class mark) of the adjacent lower or higher class interval on the x-axis, where the frequency is zero.

Draw the frequency polygon for the following data.

Class interval: 10-20, 20-30, 30-40, 40-50, 50-60, 60-70, 70-80, 80-90
Frequency: 4, 6, 8, 10, 12, 14, 7, 5

Mark the class intervals along the x-axis and the frequencies along the y-axis.

Let us assume a class interval 0-10 with frequency zero and a class interval 90-100 with frequency zero.

Now calculate the midpoint of each class interval.

Using the midpoint and the frequency value from the above table, plot the points A (5, 0), B (15, 4), C (25, 6), D (35, 8), E (45, 10), F (55, 12), G (65, 14), H (75, 7), I (85, 5) and J (95, 0).

To obtain the frequency polygon ABCDEFGHIJ, draw the line segments AB, BC, CD, DE, EF, FG, GH, HI and IJ, connecting all the points in order.
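
The construction above can be sketched in a few lines of Python. The class intervals and frequencies below are inferred from the plotted points B to I (midpoints 15 to 85), and the helper name `polygon_points` is made up for this example; it assumes equal-width classes and pads one zero-frequency class at each end, as described in the text.

```python
import string

# Class intervals and frequencies inferred from the points B (15, 4) to I (85, 5).
class_intervals = [(lower, lower + 10) for lower in range(10, 90, 10)]
frequencies = [4, 6, 8, 10, 12, 14, 7, 5]


def polygon_points(intervals, freqs):
    """Return (midpoint, frequency) pairs, padded with a zero-frequency class at each end."""
    width = intervals[0][1] - intervals[0][0]
    first_lower = intervals[0][0]
    last_upper = intervals[-1][1]
    padded_intervals = (
        [(first_lower - width, first_lower)]
        + list(intervals)
        + [(last_upper, last_upper + width)]
    )
    padded_freqs = [0] + list(freqs) + [0]
    return [((lo + hi) / 2, f) for (lo, hi), f in zip(padded_intervals, padded_freqs)]


points = polygon_points(class_intervals, frequencies)

# Label the vertices A, B, C, ... in order, as in the worked example.
for label, (x, y) in zip(string.ascii_uppercase, points):
    print(f"{label} ({x:g}, {y})")
```

Joining consecutive points in the printed order gives the segments AB, BC, and so on up to IJ.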

Frequently Asked Questions

What are the different types of graphical representation?

Some of the various types of graphical representation include:

  • Line Graphs
  • Frequency Table
  • Circle Graph, etc.

What are the Advantages of Graphical Method?

Some of the advantages of graphical representation are:

  • It makes data more easily understandable.
  • It saves time.
  • It makes the comparison of data more efficient.

What Is Data Visualization: Brief Theory, Useful Tips and Awesome Examples

By Al Boicheva

Updated: June 23, 2022

Creating data visualizations to present your data is no longer just a nice-to-have skill. Now, the ability to effectively sort and communicate your data through charts is a must-have for any business in any field that deals with data. Data visualization helps businesses quickly make sense of complex data and start making decisions based on that data. This is why today we’ll talk about what data visualization is. We’ll discuss how and why it works, which types of charts to choose in which cases, how to create effective charts, and, of course, end with beautiful examples.

So let’s jump right in. As usual, don’t hesitate to fast-travel to a particular section of your interest.

Article overview:

1. What Does Data Visualization Mean?
2. How Does it Work?
3. When to Use it?
4. Why Use it?
5. Types of Data Visualization
6. Data Visualization VS Infographics: 5 Main Differences
7. How to Create Effective Data Visualization?: 5 Useful Tips
8. Examples of Data Visualization

1. What is Data Visualization?

Data visualization is a graphic representation of data that aims to communicate large volumes of data efficiently, in a form that is easier to grasp and understand. In a way, data visualization is the mapping between the original data and graphic elements that determines how the attributes of these elements vary. The visualization is usually made with charts, lines, points, bars, and maps.

  • Data Viz is a branch of descriptive statistics, but it requires design, computer, and statistical skills.
  • Aesthetics and functionality go hand in hand to communicate complex statistics in an intuitive way.
  • Data Viz tools and technologies are essential for making data-driven decisions.
  • It’s a fine balance between form and functionality.
  • Every STEM field benefits from understanding data.

2. How Does it Work?

If we can see it, our brains can internalize and reflect on it. This is why it’s much easier and more effective to make sense of a chart and see trends than to read a massive document that would take a lot of time and focus to digest. We wouldn’t want to repeat the cliche that humans are visual creatures, but visual information is generally quicker to take in and easier to comprehend.

In a way, we can say that data viz is a form of storytelling whose purpose is to help us make decisions based on data. Such uses might include:

  • Tracking sales
  • Identifying trends
  • Identifying changes
  • Monitoring goals
  • Monitoring results
  • Combining data

3. When to Use it?

Data visualization is useful for companies that deal with lots of data on a daily basis. It’s essential to have your data and trends instantly visible, which is better than scrolling through colossal spreadsheets. When the trends stand out instantly, this also helps your clients or viewers understand them instead of getting lost in the clutter of numbers.

With that being said, Data Viz is suitable for:

  • Annual reports
  • Presentations
  • Social media micronarratives
  • Informational brochures
  • Trend tracking
  • Candlestick chart for financial analysis
  • Determining routes

Common cases when data visualization sees use are in sales, marketing, healthcare, science, finances, politics, and logistics.

4. Why Use it?

Short answer: decision making. Data visualization comes with the undeniable benefits of quickly recognizing patterns and interpreting data. More specifically, it is an invaluable tool in the following cases:

  • Identifying correlations between variables.
  • Getting market insights about audience behavior.
  • Determining value vs risk metrics.
  • Monitoring trends over time.
  • Examining rates and potential through frequency.
  • Reacting to changes.

5. Types of Data Visualization

As you probably already guessed, Data Viz is much more than simple pie charts and graphs styled in a visually appealing way. The methods that this branch uses to visualize statistics include a series of effective types.

Map visualization is a great method to analyze and display geographically related information and present it accurately via maps. This intuitive way aims to distribute data by region. Since maps can be 2D or 3D, static or dynamic, there are numerous combinations one can use in order to create a Data Viz map.

COVID-19 Spending Data Visualization POGO by George Railean

The most common ones, however, are:

  • Regional Maps: Classic maps that display countries, cities, or districts. They often represent data in different colors for different characteristics in each region.
  • Line Maps: They usually contain space and time and are ideal for routing, especially for driving or taxi routes in the area due to their analysis of specific scenes.
  • Point Maps: These maps distribute data of geographic information. They are ideal for businesses to pinpoint the exact locations of their buildings in a region.
  • Heat Maps: They indicate the weight of a geographical area based on a specific property. For example, a heat map may distribute the saturation of infected people by area.

Charts present data in the form of graphs, diagrams, and tables. They are often confused with graphs, since graphs are indeed a subcategory of charts. However, there is a small difference: graphs show the mathematical relationship between groups of data and are only one of the chart methods used to represent data.

Gluten in America - chart data visualization

Infographic Data Visualization by Madeline VanRemmen

With that out of the way, let’s talk about the most basic types of charts in data visualization.

Finance Statistics - Bar Graph visualization

Bar charts use a series of bars that illustrate data development. They are ideal for lighter data and for following trends of no more than three variables; with more, the bars become cluttered and hard to comprehend. They are ideal for year-on-year comparisons and monthly breakdowns.

Pie chart visualization type

Pie charts are familiar circular graphs that divide data into portions. The bigger the slice, the bigger the portion. They are ideal for depicting sections of a whole, and the portions must always sum to 100%. Avoid pie charts when you need to show data development over time or when you lack a value for any of the portions. Doughnut charts have the same use as pie charts.
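
As a small sketch of the 100% rule for pie charts, the Python snippet below converts raw counts into percentage shares of the whole. The function name `pie_shares` and the product figures are hypothetical, chosen only for illustration.

```python
def pie_shares(counts):
    """Convert raw category counts into percentage shares of the whole."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("a pie chart needs a positive total")
    return {category: 100 * value / total for category, value in counts.items()}


# Hypothetical market data, used only to illustrate the calculation.
market = {"Product A": 30, "Product B": 50, "Product C": 20}
shares = pie_shares(market)

for category, share in shares.items():
    print(f"{category}: {share:.1f}%")
```

If a value is missing for any category, the total is wrong and every slice is distorted, which is one reason to avoid a pie chart in that situation.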

Line graph - common visualization type

Line graphs use one or more lines to show development over time. They allow tracking multiple variables at the same time. A great example is tracking a brand’s product sales over the years. Area charts have the same use as line charts.

Scatter Plot

Scatter Plot - data visualization idea

These charts allow you to see patterns through data visualization. They have an x-axis and a y-axis for two different values. For example, if your x-axis contains information about car prices while the y-axis is about salaries, a positive or negative relationship between the two will show what a person’s car suggests about their salary.

Unlike the charts we just discussed, tables show data in an almost raw format. They are ideal when your data is hard to present visually and when you aim to show specific numerical data that one is supposed to read rather than visualize.

Creative data table visualization

Data Visualisation | To bee or not to bee by Aishwarya Anand Singh

For example, charts are perfect to display data about a particular illness over a time period in a particular area, but a table comes to better use when you also need to understand specifics such as causes, outcomes, relapses, a period of treatment, and so on.

6. Data Visualization VS Infographics

5 main differences.

They are not that different, as both visually represent data. Often you search for infographics and find images titled Data Visualization, and the other way around. In many cases, however, these titles aren’t misleading. Why is that?

  • Data visualization is made of just one element. It could be a map, a chart, or a table. Infographics, on the other hand, often include multiple Data Viz elements.
  • Unlike data visualizations, which can be simple or extremely complex and heavy, infographics are simple and target wider audiences. The latter are usually comprehensible even to people outside the field of research the infographic represents.
  • Interestingly enough, data viz doesn’t offer narratives and conclusions; it’s a tool and basis for reaching those. Infographics, in most cases, offer a story and a narrative. For example, a data visualization map may have the title “Air pollution saturation by region”, while an infographic with the same data would go “Areas A and B are the most polluted in Country C”.
  • Data visualizations can be made in Excel or with other tools that generate the design automatically, unless they are intended for presentation or publishing. The aesthetics of infographics, however, are of great importance, and the designs must appeal to wider audiences.
  • In terms of interaction, data visualizations often offer interactive charts, especially in an online form. Infographics, on the other hand, rarely have interaction and are usually static images.

While on topic, you could also be interested to check out these 50 engaging infographic examples that make complex data look great.

7. Tips to Create Effective Data Visualization

The process is naturally similar to creating Infographics and it revolves around understanding your data and audience. To be more precise, these are the main steps and best practices when it comes to preparing an effective visualization of data for your viewers to instantly understand.

1. Do Your Homework

Preparation is half the work already done. Before you even start visualizing data, you have to be sure you understand that data to the last detail.

Knowing your audience is undeniably another important part of the homework, as different audiences process information differently. Who are the people you’re visualizing data for? How do they process visual data? Is it enough to hand them a single pie chart, or will you need a more in-depth visual report?

The third part of preparing is to determine exactly what you want to communicate to the audience. What kind of information are you visualizing, and does it reflect your goal?

And last, think about how much data you’ll be working with and take it into account.

2. Choose the Right Type of Chart

In a previous section, we listed the basic chart types that find use in data visualization. To determine best which one suits your work, there are a few things to consider.

  • How many variables will you have in a chart?
  • How many items will you place for each of your variables?
  • What will be the relation between the values (time period, comparison, distributions, etc.)?

With that being said, a pie chart would be ideal if you need to present what portion of a whole each item takes. For example, you can use it to showcase what percentage of the market a particular product holds. Pie charts, however, are unsuitable for distributions, comparisons, and following trends through time periods. Bar graphs, scatter plots, and line graphs are much more effective in those cases.

Another example is how to use time in your charts. It’s way more accurate to use a horizontal axis because time should run left to right. It’s way more visually intuitive.

3. Sort your Data

Start by removing every piece of data that does not add value and is basically excess for the chart. Sometimes you have to work with a huge amount of data, which will inevitably make your chart pretty complex and hard to read. Don’t hesitate to split your information into two or more charts. If that won’t work for you, you could use highlights or replace the chart type with one that fits better.

Tip: When you use bar charts and columns for comparison, sort the information in an ascending or a descending way by value instead of alphabetical order.

4. Use Colors to Your Advantage

In every form of visualization, colors are your best friend and the most powerful tool. They create contrasts, accents, and emphasis and lead the eye intuitively. Even here, color theory is important.

When you design your chart, make sure you don’t use more than 5 or 6 colors. Anything more than that will make your graph overwhelming and hard to read for your viewers. However, color intensity is a different thing that you can use to your advantage. For example, when you compare the same concept in different periods of time, you could sort your data from the lightest shade of your chosen color to its darkest. It creates a strong visual progression that matches your timeline.

Things to consider when you choose colors:

  • Different colors for different categories.
  • A consistent color palette for all charts in a series that you will later compare.
  • It’s appropriate to use color blind-friendly palettes.

5. Get Inspired

Always put your inspiration to work when you want to be at the top of your game. Look through examples, infographics, and other people’s work and see what works best for each type of data you need to implement.

The Data Visualization Society Twitter account is a great place to start. In the meantime, we’ll also handpick some amazing examples that will get you in the mood to start creating the visuals for your data.

8. Examples for Data Visualization

As another art form, Data Viz is a fertile ground for some amazing well-designed graphs that prove that data is beautiful. Now let’s check out some.

Dark Souls III Experience Data

We start with Meng Hsiao Wei’s personal project presenting his experience with playing Dark Souls 3. It’s a perfect example that infographics and data visualization are tools for personal designs as well. The research is pretty massive yet very professionally sorted into different types of charts for the different concepts. All data visualizations are made with the same color palette and look great in infographics.

Data of My Dark Souls 3 example

My dark souls 3 playing data by Meng Hsiao Wei

Greatest Movies of all Time

Katie Silver has compiled a list of the 100 greatest movies of all time based on critics and crowd reviews. The visualization shows key data points for every movie, such as year of release, Oscar nominations and wins, budget, gross, IMDb score, genre, filming location, setting of the film, and production studio. All movies are ordered by release date.

Greatest Movies visualization chart

100 Greatest Movies Data Visualization by Katie Silver

The Most Violent Cities

Federica Fragapane shows data for the 50 most violent cities in the world in 2017. The items are arranged on a vertical axis based on population and ordered along the horizontal axis according to the homicide rate.

The Most Violent Cities example

The Most Violent Cities by Federica Fragapane

Family Businesses as Data

These data visualizations and illustrations were made by Valerio Pellegrini for Perspectives Magazine. They show a pie chart with a sector breakdown as well as a scatter plot of contribution to employment.

Family Businesses as Data Visual

PERSPECTIVES MAGAZINE – Family Businesses by Valerio Pellegrini

Orbit Map of the Solar System

The map shows data on the orbits of more than 18,000 asteroids in the solar system. Each asteroid is shown at its position on New Year’s Eve 1999, colored by type of asteroid.

Orbit Map of the Solar System graphic

An Orbit Map of the Solar System by Eleanor Lutz

The Semantics Of Headlines

Katja Flükiger has a take on how headlines tell the story. The data visualization aims to communicate how much the selling influences the telling. The project was completed at Maryland Institute College of Art; it visualizes references to immigration and color-codes the value judgments implied by word choice and context.

The Semantics Of Headlines graph

The Semantics of Headlines by Katja Flükiger

Moon and Earthquakes

This data visualization works on answering whether the moon is responsible for earthquakes. The chart features the time and intensity of earthquakes in response to the phase and orbit location of the moon.

Moon and Earthquakes statistics visual

Moon and Earthquakes by Aishwarya Anand Singh

Dawn of the Nanosats

The visualization shows the satellites launched from 2003 to 2015. The graph represents the types of institutions focused on the projects as well as the nations that financed them. The left side shows the number of launches per year and the satellite applications.

Dawn of the Nanosats visualization

WIRED UK – Dawn of the Nanosats by Valerio Pellegrini

Final Words

Data visualization is not only a form of science but also a form of art. Its purpose is to help businesses in any field quickly make sense of complex data and start making decisions based on that data. To make your graphs efficient and easy to read, it’s all about knowing your data and audience. This way you’ll be able to choose the right type of chart and use visual techniques to your advantage.

Al Boicheva

Al is an illustrator at GraphicMama with out-of-the-box thinking and a passion for anything creative. In her free time, you will see her drooling over tattoo art, Manga, and horror movies.

Mathematics LibreTexts

8.2: Presenting Quantitative Data Graphically

  • David Lippman & Jeff Eldridge
  • Pierce College via The OpenTextBookStore

Learning Objectives

  • Summarize quantitative data using a frequency distribution
  • Construct a histogram, frequency polygon and stem plot

Frequency Distributions

Quantitative, or numerical, data can also be summarized into frequency tables, also known as frequency distributions.

Example 1

A teacher records scores on a 20-point quiz for the 30 students in his class. The scores are:

19 20 18 18 17 18 19 17 20 18 20 16 20 15 17 12 18 19 18 19 17 20 18 16 15 18 20 5 0 0

Construct a frequency table for the data.

These scores could be summarized into a frequency table by grouping like values:

\(\begin{array}{|c|c|} \hline \textbf { Score } & \textbf { Frequency } \\ \hline 0 & 2 \\ \hline 5 & 1 \\ \hline 12 & 1 \\ \hline 15 & 2 \\ \hline 16 & 2 \\ \hline 17 & 4 \\ \hline 18 & 8 \\ \hline 19 & 4 \\ \hline 20 & 6 \\ \hline \end{array}\)
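
The grouping of like values can be checked with a short Python sketch. It uses the standard library's `collections.Counter`; the variable names are made up for this example.

```python
from collections import Counter

# The 30 quiz scores listed in the example.
scores = [
    19, 20, 18, 18, 17, 18, 19, 17, 20, 18,
    20, 16, 20, 15, 17, 12, 18, 19, 18, 19,
    17, 20, 18, 16, 15, 18, 20, 5, 0, 0,
]

# Counter tallies how many times each distinct score occurs.
frequency = Counter(scores)

print("Score | Frequency")
for score in sorted(frequency):
    print(f"{score:>5} | {frequency[score]}")
```

The printed rows should agree with the frequency table above, and the frequencies add up to the 30 students in the class.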

Using the table above, it would be possible to create a standard bar chart from this summary, like we did for categorical data:

Class Quiz Scores

A bar chart, with horizontal axis labeled Score and the vertical axis labeled Frequency.  The horizontal axis has bars labeled 0 5 12 15 16 17 18 19 20, with heights from the previous table.

However, since the scores are numerical values, this chart doesn’t really make sense; the first and second bars are five values apart, while the later bars are only one value apart. It would be more correct to treat the horizontal axis as a number line. This type of graph is called a histogram.

A histogram is a graphical representation of quantitative data, similar to a bar graph. The horizontal axis is a number line and the bars are touching.

Example 2

For the values above, a histogram would look like:

Notice that in the histogram, a bar represents values on the horizontal axis from the value on the left-hand side of the bar up to, but not including, the value on the right-hand side of the bar. Some people choose to have bars start at \(\frac{1}{2}\) values to avoid this ambiguity.

This is a histogram of the same data but this time the spaces are labeled 0 to 20 instead of the tick marks.

Unfortunately, not a lot of common software packages can correctly graph a histogram. About the best you can do in Excel or Word is a bar graph with no gap between the bars and spacing added to simulate a numerical horizontal axis.

If we have a large number of widely varying data values, creating a frequency table that lists every possible value as a category would lead to an exceptionally long frequency table, and probably would not reveal any patterns. For this reason, it is common with quantitative data to group data into class intervals.

Class Intervals

Class intervals are groupings of the data. In general, we define class intervals so that:

  • Each interval is equal in size. For example, if the first class contains values from 120-129, the second class should include values from 130-139.
  • Each interval has a lower limit and an upper limit. For example, for the class interval of 120-129, the lower limit is 120 and the upper limit is 129.
  • The size of the interval is called the class width. It is the difference between 2 consecutive lower limits. For example, the class width for a class interval of 120-129 is 10 since the next class interval starts at 130 (and 130 - 120 = 10).
  • We have between 5 and 20 classes, typically, depending upon the amount of data we’re working with.
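
The class width rule can be expressed as a small Python sketch. The helper names `class_width` and `upper_limit` are made up for this example, and the `step` parameter assumes whole-number data, so that the class starting at 120 with width 10 ends at 129.

```python
def class_width(lower_limits):
    """Return the common difference between consecutive lower limits."""
    widths = {b - a for a, b in zip(lower_limits, lower_limits[1:])}
    if len(widths) != 1:
        raise ValueError("class intervals are not equal in size")
    return widths.pop()


def upper_limit(lower, width, step=1):
    """Upper limit of a class, assuming data recorded in increments of `step`."""
    return lower + width - step


width = class_width([120, 130, 140])
print(width)                    # class width for the 120-129, 130-139, ... classes
print(upper_limit(120, width))  # upper limit of the first class
```

With the lower limits 120, 130 and 140, the sketch gives a class width of 10 and an upper limit of 129 for the first class, matching the example in the list.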

Example 3

Suppose that we have collected weights from 100 male subjects as part of a nutrition study. For our weight data, we have values ranging from a low of 121 pounds to a high of 263 pounds, giving a total range of \(263 - 121 = 142\). Construct a frequency distribution for the data and a histogram.

We could create 7 intervals with a width of around 20, 14 intervals with a width of around 10, or somewhere in between. Oftentimes we have to experiment with a few possibilities to find something that represents the data well. Let us try using a class width of 15. We could start at 121, or at 120 since it is a nice round number. The second class interval will start at 120 + 15 = 135.
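
To see where this choice leads, here is a minimal Python sketch that lists the class intervals starting at 120 with a class width of 15 until the maximum weight of 263 is covered. The function name `make_classes` is hypothetical, and the sketch assumes weights recorded in whole pounds.

```python
def make_classes(start, width, maximum):
    """List (lower, upper) class limits from `start` until `maximum` is covered."""
    classes = []
    lower = start
    while lower <= maximum:
        # With whole-number data, each class ends one unit before the next begins.
        classes.append((lower, lower + width - 1))
        lower += width
    return classes


classes = make_classes(start=120, width=15, maximum=263)

for lower, upper in classes:
    print(f"{lower} - {upper}")
print(len(classes), "classes")
```

This produces classes such as 120 - 134 and 135 - 149, and the resulting class count falls inside the typical range of 5 to 20 classes.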

A histogram of this data would look like:

Weights of Subjects in Nutrition Study

In many software packages, you can create a graph similar to a histogram by putting the class intervals as the labels on a bar chart.

A histogram of the same data as above, but instead of the tick marks being labeled, the bars are labeled with the class definition, like 120 - 134 for the first bar, and 135 - 149 for the second.

Other graph types such as pie charts are possible for quantitative data. The usefulness of different graph types will vary depending upon the number of intervals and the type of data being represented. For example, a pie chart of our weight data is difficult to read because of the quantity of intervals we used.

Try It 1

The total cost of textbooks for the term was collected from 36 students. Create a histogram for this data.

$140 $160 $160 $165 $180 $220 $235 $240 $250 $260 $280 $285

$285 $285 $290 $300 $300 $305 $310 $310 $315 $315 $320 $320

$330 $340 $345 $350 $355 $360 $360 $380 $395 $420 $460 $460

Using class intervals of size 55, we can group our data into 6 intervals:

\(\begin{array}{|l|r|} \hline \textbf { cost interval } & \textbf { Frequency } \\ \hline \$ 140-194 & 5 \\ \hline \$ 195-249 & 3 \\ \hline \$ 250-304 & 9 \\ \hline \$ 305-359 & 12 \\ \hline \$ 360-414 & 4 \\ \hline \$ 415-469 & 3 \\ \hline \end{array}\)
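The counts in this table can be checked with a short Python sketch. The function name `frequency_distribution` is a hypothetical helper, and the sketch assumes whole-dollar amounts and equal-width classes starting at $140.

```python
costs = [140, 160, 160, 165, 180, 220, 235, 240, 250, 260, 280, 285,
         285, 285, 290, 300, 300, 305, 310, 310, 315, 315, 320, 320,
         330, 340, 345, 350, 355, 360, 360, 380, 395, 420, 460, 460]


def frequency_distribution(data, start, width, n_classes):
    """Count how many values fall in each equal-width class."""
    counts = [0] * n_classes
    for value in data:
        index = (value - start) // width
        if not 0 <= index < n_classes:
            raise ValueError(f"{value} is outside the class intervals")
        counts[index] += 1
    # Pair each class's (lower limit, upper limit) with its frequency.
    return [((start + i * width, start + (i + 1) * width - 1), counts[i])
            for i in range(n_classes)]


table = frequency_distribution(costs, start=140, width=55, n_classes=6)
for (lower, upper), frequency in table:
    print(f"${lower}-{upper}: {frequency}")
```

Integer division by the class width picks the class index directly, which works only because every class has the same width.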

We can use the frequency distribution to generate the histogram.

Histogram of Total Cost of Textbooks

When collecting data to compare two groups, it is desirable to create a graph that compares quantities.

Example \(\PageIndex{4}\)

The data below came from a task in which the goal is to move a computer mouse to a target on the screen as fast as possible. On 20 of the trials, the target was a small rectangle; on the other 20, the target was a large rectangle. Time to reach the target was recorded on each trial.

\(\begin{array}{|c|c|c|} \hline \begin{array}{c} \textbf { Interval } \\ \textbf { (milliseconds) } \end{array} & \begin{array}{c} \textbf { Frequency } \\ \textbf { small target } \end{array} & \begin{array}{c} \textbf { Frequency } \\ \textbf { large target } \end{array} \\ \hline 300-399 & 0 & 0 \\ \hline 400-499 & 1 & 5 \\ \hline 500-599 & 3 & 10 \\ \hline 600-699 & 6 & 5 \\ \hline 700-799 & 5 & 0 \\ \hline 800-899 & 4 & 0 \\ \hline 900-999 & 0 & 0 \\ \hline 1000-1099 & 1 & 0 \\ \hline 1100-1199 & 0 & 0 \\ \hline \end{array}\)

One option to represent this data would be a comparative histogram or side-by-side bar chart, in which bars for the small target group and large target group are placed next to each other.

Reaction Time for Small and Large Targets

A comparative bar graph. The horizontal axis is labeled Reaction time (milliseconds) and the vertical is labeled Frequency. The horizontal axis is divided into spaces labeled with the class definitions, like 300-399 for the first, and 400-499 for the second. In each space, there are two bars next to each other; the first is labeled small target and the second is labeled large target, and the heights correspond to the frequency values for each group.

Frequency Polygons

An alternative representation is a frequency polygon .

Frequency polygon

A frequency polygon is a line graph of a frequency distribution.

It starts out like a histogram, but instead of drawing a bar, a point is placed in the midpoint of each interval with height equal to the frequency. The midpoint of a class interval is

\[\text{class midpoint} = \dfrac{\text{lower limit}_{1} + \text{lower limit}_{2}}{2} \nonumber\]

The points are connected with straight lines to emphasize the distribution of the data. By definition, a polygon is a closed figure, so the graph is "closed" on both ends by connecting the first and last points back to 0 (the \(x\)-axis) at the midpoint of the interval before the first class interval and the midpoint of the interval after the last class interval.

Example \(\PageIndex{5}\)

Construct a frequency polygon for both small and large targets on the same graph using the data on reaction time from the previous example.

Find the midpoint of the first class interval: \(\text{class midpoint} = \dfrac{400+500}{2} = \dfrac{900}{2} = 450\). Since the class width is \(500 - 400 = 100\), add 100 to find the second midpoint: \(450 + 100 = 550\). Find the rest of the midpoints, including the midpoint of the class before the first class interval (\(450 - 100 = 350\)) and the midpoint of the class after the last class interval (\(1050 + 100 = 1150\)), where the polygons will connect back to the \(x\)-axis. Plot the midpoints as \(x\)-coordinates and frequencies as \(y\)-coordinates and connect the points with straight lines. The table below shows the midpoints and frequencies.

\(\begin{array}{|c|c|c|} \hline \begin{array}{c} \textbf { Midpoint } \\ \textbf { (milliseconds) } \end{array} & \begin{array}{c} \textbf { Frequency } \\ \textbf { small target } \end{array} & \begin{array}{c} \textbf { Frequency } \\ \textbf { large target } \end{array} \\ \hline 350 & 0 & 0 \\ \hline 450 & 1 & 5 \\ \hline 550 & 3 & 10 \\ \hline 650 & 6 & 5 \\ \hline 750 & 5 & 0 \\ \hline 850 & 4 & 0 \\ \hline 950 & 0 & 0 \\ \hline 1050 & 1 & 0 \\ \hline 1150 & 0 & 0 \\ \hline \end{array}\)
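As a rough check of the midpoint table, the following Python sketch applies the class midpoint formula to every interval and pairs each midpoint with its frequency. The names `midpoints` and `is_closed` are hypothetical helpers, and the sketch assumes every class has the same width of 100.

```python
lower_limits = list(range(300, 1200, 100))   # 300, 400, ..., 1100
small = [0, 1, 3, 6, 5, 4, 0, 1, 0]          # frequencies, small target
large = [0, 5, 10, 5, 0, 0, 0, 0, 0]         # frequencies, large target


def midpoints(lowers, width):
    """Average each lower limit with the next class's lower limit."""
    return [(lower + (lower + width)) / 2 for lower in lowers]


def is_closed(frequencies):
    """A frequency polygon is closed when it starts and ends at zero."""
    return frequencies[0] == 0 and frequencies[-1] == 0


mids = midpoints(lower_limits, 100)
small_points = list(zip(mids, small))   # (x, y) points for the small target
large_points = list(zip(mids, large))   # (x, y) points for the large target

for x, y_small, y_large in zip(mids, small, large):
    print(x, y_small, y_large)
```

Connecting each list of points in order with straight segments gives one polygon per target size; `is_closed` only confirms that the zero-frequency end classes are present.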

The completed graph is shown below.

A comparative frequency polygon.  The horizontal axis is labeled Reaction time (milliseconds) and the vertical is labeled Frequency.  The horizontal axis ranges from 300 to 1200 with scale of 100.  At the middle of each class group, like 350, 450, etc. there are two dots:  the first is labeled small target and the second is labeled large target, and the heights correspond to the frequency values for each group.  The dots are connected with line segments.

This graph makes it easier to see that reaction times were generally shorter for the larger target, and that the reaction times for the smaller target were more spread out.

Stem-and-leaf plots, or stem plots, are a quick and easy way to look at small samples of numerical data. You can look for any patterns or any strange data values. It is easy to compare two samples using stem plots.

The first step is to divide each number into 2 parts, the stem (such as the leftmost digit) and the leaf (such as the rightmost digit). There are no set rules; you just have to look at the data and see what makes sense.

Example \(\PageIndex{6}\)

The following are the percentage grades of 25 students from a statistics course. Draw a stem plot of the data.

Divide each number so that the tens digit is the stem and the ones digit is the leaf. 62 becomes 6|2.

Make a vertical chart with the stems on the left of a vertical bar. Be sure to fill in any missing stems. In other words, the stems should have equal spacing (for example, count by ones or count by tens). Here is what the stems for our data look like:

\[\begin{array}{c| c} 4 & \\ 5 & \\ 6 & \\ 7 & \\ 8 & \\ 9 & \\ \end{array}\nonumber \]

Now go through the list of data and add the leaves. Put each leaf next to its corresponding stem. Don’t worry about order yet, just get all the leaves down.

When the data value 62 is placed on the plot it looks like the plot below.

\[\begin{array}{c| c} 4 & \\ 5 & \\ 6 & 2 \\ 7 & \\ 8 & \\ 9 & \\ \end{array}\nonumber \]

When the data value 87 is placed on the plot it looks like the plot below.

\[\begin{array}{c| c} 4 & \\ 5 & \\ 6 & 2 \\ 7 & \\ 8 & 7 \\ 9 & \\ \end{array}\nonumber \]

Fill in the rest of the leaves to obtain the plot below.

\[\begin{array}{c| c c c c c c c } 4 & 5 & 0 & & & & & \\ 5 & 8 & & & & & & \\ 6 & 2 & 9 & 2 & 2 &5 &7 & 4 \\ 7 & 6 & 6 & 1 & 2 & 7 & 3 & \\ 8 & 7 & 1 & 7 & 0 & 7 & 4 & 9 \\ 9 & 5 & 3 & & & & & \\ \end{array}\nonumber \]

Now you have to add labels and make the graph look pretty. You need to add a title and sort the leaves into increasing order. You also need to tell people what the stems and leaves mean by inserting a key. Be careful to line the leaves up in columns. You need to be able to compare the lengths of the rows when you interpret the graph. The final stem plot for the test grade data is shown below.

Test Grades

\[\begin{array}{c| c c c c c c c } 4 & 0 & 5 & & & & & \\ 5 & 8 & & & & & & \\ 6 & 2 & 2 &2 &4 &5 &7 &9 \\ 7 & 1 & 2 & 3& 6 &6 & 7& \\ 8 & 0 & 1 & 4& 7&7 & 7& 9 \\ 9 & 3 & 5 & & & & & \\ \end{array}\nonumber \]

key: 4|0 = 40%
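The same steps can be sketched in Python. The grade list below is read back from the unsorted stem plot above, so the order of the values is an assumption; the function name `stem_plot` is a hypothetical helper, and it assumes two-digit whole numbers with the tens digit as the stem.

```python
from collections import defaultdict

# Values read back from the unsorted plot (order is an assumption).
grades = [45, 40, 58, 62, 69, 62, 62, 65, 67, 64, 76, 76, 71, 72, 77, 73,
          87, 81, 87, 80, 87, 84, 89, 95, 93]


def stem_plot(values):
    """Return one text row per stem, with leaves sorted in increasing order."""
    leaves = defaultdict(list)
    for value in values:
        stem, leaf = divmod(value, 10)   # tens digit is the stem, ones digit the leaf
        leaves[stem].append(leaf)
    rows = []
    # Walk every stem from smallest to largest so missing stems still get a row.
    for stem in range(min(leaves), max(leaves) + 1):
        sorted_leaves = sorted(leaves.get(stem, []))
        row = f"{stem} | " + " ".join(str(leaf) for leaf in sorted_leaves)
        rows.append(row.rstrip())
    return rows


rows = stem_plot(grades)
print("Test Grades")
for row in rows:
    print(row)
print("key: 4|0 = 40%")
```

Because each leaf is printed with the same width, the row lengths can be compared directly, as the text recommends.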

Graphical Representation of Data


Graphical Representation of Data: In today’s world of the internet and connectivity, a great deal of data is available, and some method is needed for examining large data sets and the patterns and trends in them. An entire branch of mathematics is dedicated to collecting, analyzing, interpreting, and presenting numerical data in visual form, so that the data is easy to understand and easy to compare. This branch is known as Statistics.

Statistics is widely used and has many real-life applications, such as business analytics, demography, and astrostatistics. This article covers the graphical representation of data, including its types, rules, and advantages.

Table of Content

  • What is Graphical Representation?
  • Types of Graphical Representations
  • Graphical Representations used in Maths
  • Principles of Graphical Representations
  • Advantages and Disadvantages of using Graphical System
  • General Rules for Graphical Representation of Data
  • Solved Examples on Graphical Representation of Data

There are two ways of representing data,

  • Pictorial Representation through graphs.

They say, “A picture is worth a thousand words”. It is often better to represent data in a graphical format. Surveys and practical evidence suggest that people retain and understand information better when it is presented visually, because human beings process visual information more readily than other forms. A frequently quoted figure puts the improvement at 60,000 times, but that number is not backed by a traceable source and is best read as a popular saying rather than a measurement.

Comparison between different items is best shown with graphs, because they make it easier to compare the key features of the data. Let’s look briefly at the different types of graphical representations:

Line Graphs

A line graph is used to show how the value of a particular variable changes with time. We plot this graph by connecting the points at different values of the variable. It can be useful for analyzing the trends in the data and predicting further trends. 

Bar Graphs

A bar graph is a type of graphical representation of data in which bars of uniform width are drawn with equal spacing between them on one axis (usually the x-axis), which depicts the variable. The values of the variable are represented by the heights of the bars.


Histograms 

A histogram is similar to a bar graph, but it is based on the frequency of numerical values rather than their actual values. The data is organized into intervals and the bars represent the frequency of the values in each interval. That is, it counts how many values of the data lie in a particular range.

Line Plot

A line plot displays data as points or check marks above a number line, showing the frequency of each value.


Stem and Leaf Plot 

This is a type of plot in which each value is split into a “leaf” (in most cases, the last digit) and a “stem” (the remaining digits). For example, the number 42 is split into leaf (2) and stem (4).


Box and Whisker Plot 

These plots divide the data into four parts to summarize it. They show the spread and the median of the data.
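The four parts come from the five numbers a box and whisker plot is usually drawn from. A minimal Python sketch follows; the sample values are hypothetical, the function name `five_number_summary` is an assumption, and quartile conventions differ between textbooks and software, so the quartiles from `statistics.quantiles` with `method="inclusive"` may not match a hand calculation exactly.

```python
import statistics


def five_number_summary(values):
    """Return the minimum, quartiles, median, and maximum of the data."""
    ordered = sorted(values)
    # Three cut points split the ordered data into four parts.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {"min": ordered[0], "q1": q1, "median": median,
            "q3": q3, "max": ordered[-1]}


sample = [2, 4, 4, 5, 7, 8, 9, 10, 12]   # hypothetical data for illustration
summary = five_number_summary(sample)
for name, value in summary.items():
    print(name, value)
```

The box spans the first to the third quartile with a line at the median, and the whiskers extend to the minimum and maximum.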

Circle Graph

A circle graph, also known as a pie chart, represents the data in the form of a circle. The circle is divided so that each portion represents a proportion of the whole.


Graphs in maths are used to study the relationships between two or more variables that are changing. Statistical data can be summarized in a better way using graphs. There are basically two approaches to making graphs in maths:

  • Value-Based or Time Series Graphs
  • Frequency Based

Value-Based or Time Series Graphs

These graphs allow us to study the change of a variable with respect to another variable within a given interval of time. The variables can be anything. Time Series graphs study the change of variable with time. They study the trends, periodic behavior, and patterns in the series. We are more concerned with the values of the variables here rather than the frequency of those values. 

Example: Line Graph

Frequency-Based Graphs

These kinds of graphs are more concerned with the distribution of the data: how many values lie within a particular range of the variable, and which range has the highest frequency. They are used to judge the spread, the average, and sometimes the median of the variable under study.

Example: Frequency Polygon, Histograms.

All types of graphical representations follow some basic principles, which come from algebra. When we plot a graph, there is an origin and there are two axes. The two axes divide the plane into four parts called quadrants. The horizontal axis is usually called the x-axis and the vertical one is called the y-axis. The origin is the point where the two axes intersect. For the variable on the x-axis, positive values go to the right of the origin and negative values go to the left. Similarly, for the variable on the y-axis, positive values go above the x-axis and negative values go below the x-axis.

Advantages

  • It gives us a summary of the data which is easier to look at and analyze.
  • It saves time.
  • We can compare and study more than one variable at a time.

Disadvantages

A graph usually shows only one aspect of the data and ignores the others. For example, a bar graph does not show the mean, median, and other statistics of the data.

We should keep some things in mind while plotting and designing these graphs. The goal should be a better and clearer picture of the data. The following rules should be kept in mind while plotting the above graphs:

  • Whenever possible, the data source must be mentioned for the viewer.
  • Always choose appropriate colors and font sizes, so that the graph looks neat.
  • The unit of measurement should be mentioned in the top right corner of the graph.
  • A proper scale should be chosen while making the graph, so that the graph represents the data accurately.
  • Last but not least, a suitable title should be chosen.

Frequency Polygon

A frequency polygon is a graph that is constructed by joining the midpoints of the intervals. The height of the point above each interval (or bin) represents the frequency of the values that lie in that interval.


Question 1: What are different types of frequency-based plots? 

Types of frequency-based plots:

  • Histogram
  • Frequency Polygon
  • Box Plots

Question 2: A company with an advertising budget of Rs 10,00,00,000 has planned the following expenditure in the different advertising channels such as TV Advertisement, Radio, Facebook, Instagram, and Printed media. The table represents the money spent on different channels. 

Draw a bar graph for the following data. 

Steps:

  • Put each of the channels on the x-axis.
  • The height of each bar is decided by the value for that channel.

Question 3: Draw a line plot for the following data 

Steps:

  • Put each value from the x-axis row on the x-axis.
  • Join the points corresponding to each value on the x-axis.

Question 4: Make a frequency plot of the following data: 

Steps:

  • Draw the class intervals on the x-axis and frequencies on the y-axis.
  • Calculate the midpoint of each class interval.

\(\begin{array}{|c|c|c|} \hline \textbf { Class Interval } & \textbf { Mid Point } & \textbf { Frequency } \\ \hline 0-3 & 1.5 & 3 \\ \hline 3-6 & 4.5 & 4 \\ \hline 6-9 & 7.5 & 2 \\ \hline 9-12 & 10.5 & 6 \\ \hline \end{array}\)

  • Now join the midpoints of the intervals and their corresponding frequencies on the graph.

This graph shows both the histogram and frequency polygon for the given distribution.

FAQs on Graphical Representation of Data

What are the advantages of using graphs to represent data?

Graphs offer visualization, clarity, and easy comparison of data, aiding in outlier identification and predictive analysis.

What are the common types of graphs used for data representation?

Common graph types include bar, line, pie, histogram, and scatter plots, each suited for different data representations and analysis purposes.

How do you choose the most appropriate type of graph for your data?

Select a graph type based on data type, analysis objective, and audience familiarity to effectively convey information and insights.

How do you create effective labels and titles for graphs?

Use descriptive titles, clear axis labels with units, and legends to ensure the graph communicates information clearly and concisely.

How do you interpret graphs to extract meaningful insights from data?

Interpret graphs by examining trends, identifying outliers, comparing data across categories, and considering the broader context to draw meaningful insights and conclusions.


Graphical Representation

Graphical Representation Definition

Graphical representation refers to the use of charts and graphs to visually display, analyze, clarify, and interpret numerical data, functions, and other qualitative structures.


What is Graphical Representation?

Graphical representation refers to the use of intuitive charts to clearly visualize and simplify data sets. Data is ingested into data visualization software and then represented by a variety of symbols, such as lines on a line chart, bars on a bar chart, or slices on a pie chart, from which users can gain greater insight than by numerical analysis alone.

Representational graphics can quickly illustrate general behavior and highlight phenomena, anomalies, and relationships between data points that may otherwise be overlooked, and may contribute to predictions and better, data-driven decisions. The types of representational graphics used will depend on the type of data being explored.

Types of Graphical Representation

Data charts are available in a wide variety of maps, diagrams, and graphs that typically include textual titles and legends to denote the purpose, measurement units, and variables of the chart. Choosing the most appropriate chart depends on a variety of different factors -- the nature of the data, the purpose of the chart, and whether a graphical representation of qualitative data or a graphical representation of quantitative data is being depicted. There are dozens of different formats for graphical representation of data. Some of the most popular charts include:

  • Bar Graph -- contains a vertical axis and horizontal axis and displays data as rectangular bars with lengths proportional to the values that they represent; a useful visual aid for marketing purposes
  • Choropleth -- thematic map in which an aggregate summary of a geographic characteristic within an area is represented by patterns of shading proportionate to a statistical variable
  • Flow Chart -- diagram that depicts a workflow graphical representation with the use of arrows and geometric shapes; a useful visual aid for business and finance purposes
  • Heatmap -- a colored, two-dimensional matrix of cells in which each cell represents a grouping of data and each cell’s color indicates its relative value
  • Histogram -- a graphical representation of a frequency distribution that uses adjacent vertical bars erected over discrete intervals to represent the data frequency within a given interval; a useful visual aid for meteorology and environment purposes
  • Line Graph -- displays continuous data; ideal for predicting future events over time; a useful visual aid for marketing purposes
  • Pie Chart -- shows percentage values as a slice of pie; a useful visual aid for marketing purposes
  • Pointmap -- CAD & GIS contract mapping and drafting solution that visualizes the location of data on a map by plotting geographic latitude and longitude data
  • Scatter plot -- a diagram that shows the relationship between two sets of data, where each dot represents individual pieces of data and each axis represents a quantitative measure
  • Stacked Bar Graph -- a graph in which each bar is segmented into parts, with the entire bar representing the whole, and each segment representing different categories of that whole; a useful visual aid for political science and sociology purposes
  • Timeline Chart -- a long bar, labelled with dates alongside it, that displays a list of events in chronological order; a useful visual aid for history charting purposes
  • Tree Diagram -- a hierarchical genealogical tree that illustrates a family structure; a useful visual aid for history charting purposes
  • Venn Diagram -- consists of multiple overlapping closed curves, usually circles, each representing a set; the default inner join graphical representation

Proprietary and open source software for graphical representation of data is available in a wide variety of programming languages. Software packages often provide spreadsheets equipped with built-in charting functions.

Advantages and Disadvantages of Graphical Representation of Data

Tabular and graphical representations of data are vital components in analyzing and understanding large quantities of numerical data and the relationships between data points. Data visualization is one of the most fundamental approaches to data analysis, providing an intuitive and universal means to visualize, abstract, and share complex data patterns. The primary advantages of graphical representation of data are:

  • Facilitates and improves learning: graphics make data easy to understand and eliminate language and literacy barriers
  • Understanding content: visuals are more effective than text in human understanding
  • Flexibility of use: graphical representation can be leveraged in nearly every field involving data
  • Increases structured thinking: users can make quick, data-driven decisions at a glance with visual aids
  • Supports creative, personalized reports for more engaging and stimulating visual presentations
  • Improves communication: analyzing graphs that highlight relevant themes is significantly faster than reading through a descriptive report line by line
  • Shows the whole picture: an instantaneous, full view of all variables, time frames, data behavior and relationships

Disadvantages of graphical representation of data typically concern the cost of human effort and resources, the process of selecting the most appropriate graphical and tabular representation of data, greater design complexity of visualizing data, and the potential for human bias.

Why Graphical Representation of Data is Important

Graphic visual representation of information is a crucial component in understanding and identifying patterns and trends in the ever increasing flow of data. Graphical representation enables the quick analysis of large amounts of data at one time and can aid in making predictions and informed decisions. Data visualizations also make collaboration significantly more efficient by using familiar visual metaphors to illustrate relationships and highlight meaning, eliminating complex, long-winded explanations of an otherwise chaotic-looking array of figures. 

Data only has value once its significance has been revealed and consumed, and its consumption is best facilitated with graphical representation tools that are designed with human cognition and perception in mind. Human visual processing is very efficient at detecting relationships and changes between sizes, shapes, colors, and quantities. Attempting to gain insight from numerical data alone, especially in big data instances in which there may be billions of rows of data, is exceedingly cumbersome and inefficient.

Does HEAVY.AI Offer a Graphical Representation Solution?

HEAVY.AI's visual analytics platform is an interactive data visualization client that works seamlessly with server-side technologies HEAVY.AIDB and Render to enable data science analysts to easily visualize and instantly interact with massive datasets. Analysts can interact with conventional charts and data tables, as well as big data graphical representations such as massive-scale scatterplots and geo charts. Data visualization contributes to a broad range of use cases, including performance analysis in business and guiding research in academia.

Introduction to Graphs

Table of Contents

15 December 2020                 


Introduction

What are graphs?

What are the different types of data?

What are the different types of graphical representations?

The graph is nothing but an organized representation of data. It helps us to understand the data. Data are the numerical information collected through observation.

The word data comes from the Latin word datum, which means “something given”.

After a research question is developed, data is collected through observation. It is then organized, summarized, classified, and represented graphically.

Difference between data and information: data is the raw fact without any additions, whereas information is the meaning derived from data.


What are the different Types of Data?

There are two types of data:

Types of Data

Quantitative

Data that are statistical or numerical are known as quantitative data. Quantitative data is generated through experiments, tests, surveys, and market reports. Quantitative data is also known as structured data.

Quantitative data is further divided into continuous data and discrete data.

Continuous Data

Continuous data is data that can take any value within a range. This means continuous data has infinitely many possible outcomes, so it should be grouped before being represented on a graph.

  • The speed of a vehicle as it passes a checkpoint
  • The mass of a cooking apple
  • The time taken by a volunteer to perform a task

Discrete Data

Discrete data can take only certain distinct values, such as whole-number counts, with no values in between.

  • Numbers of cars sold at a dealership during a given month
  • Number of houses in a certain block
  • Number of fish caught on a fishing trip
  • Number of complaints received at the office of an airline on a given day
  • Number of customers who visit a bank during any given hour
  • Number of heads obtained in three tosses of a coin

Differences between Discrete and Continuous data

  • Numerical data could be either discrete or continuous
  • Continuous data can take any numerical value (within a range); For example, weight, height, etc.
  • There can be an infinite number of possible values in continuous data
  • Discrete data can take only certain values by finite ‘jumps’, i.e., it ‘jumps’ from one value to another but does not take any intermediate value between them (For example, number of students in the class)

Qualitative

Data that deals with description or quality instead of numbers is known as qualitative data. Qualitative data is also known as unstructured data, because this type of data is loosely organized and can’t be analyzed conventionally.

Different Types of Graphical Representations

There are many types of graph we can use to represent data. They are as follows,

A bar graph or chart is a way to represent data by rectangular columns or bars. The height or length of each bar is proportional to the value it represents.

A bar graph or chart

A line graph is a type of graph where the data are plotted as dots, known as markers, which are then joined to each other by straight line segments.

The line graph is normally used to represent the data that changes over time.

A line graph

A histogram is a graph where the information is represented by the heights of rectangular bars. Though it looks like a bar graph, there is a fundamental difference between them: in a histogram each column represents a range of quantitative data, whereas in a bar graph each bar represents a categorical variable.

Histogram and Piechart

The pie chart is also called a circle graph. It is a circular chart in which numerical information is represented as slices, each showing a fraction or percentage of the whole circle, which stands for 100%.

Pie chart

  • Stem and leaf plot

The stem and leaf plot is a way to represent quantitative data according to frequency ranges or frequency distribution.

In the stem and leaf plot, each data value is split into a stem and a leaf. For example, 32 is split into the stem 3 and the leaf 2.

Stem and leaf plot
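The stem and leaf split described above can be sketched in a few lines. This is only an illustration: the function name `stem_and_leaf` and the sample values are assumptions, and it handles non-negative whole numbers only, taking the last digit as the leaf.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (all digits but the last) to its sorted leaves (last digits)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)   # e.g. 32 -> stem 3, leaf 2
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical data set
data = [32, 35, 41, 47, 47, 53, 38]
plot = stem_and_leaf(data)
for stem in sorted(plot):
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))
```

Sorting the values first keeps the leaves on each row in ascending order, which is how the plot is normally read.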

Frequency table: Frequency means the number of occurrences of an event, and it is denoted by ‘f’. A frequency distribution table is a table which shows the frequency of each event.

Frequency table
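A frequency table can be built by counting how often each value occurs. The sketch below uses Python's standard `collections.Counter`; the function name `frequency_table` and the sample observations (number of heads in repeated sets of three coin tosses) are assumptions made for illustration.

```python
from collections import Counter

def frequency_table(observations):
    """Return (value, frequency f) pairs sorted by value."""
    counts = Counter(observations)
    return sorted(counts.items())

# Hypothetical results: number of heads in each of eight sets of three tosses
heads = [0, 1, 1, 2, 3, 2, 1, 2]
table = frequency_table(heads)
for value, f in table:
    print(f"{value} heads: f = {f}")
```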

A pictograph or pictogram is the earliest way to represent data in pictorial form, using symbols or images. Each image represents a particular number of things.

Pictograph or Pictogram

According to the above-mentioned pictograph, the number of apples sold on Monday is 6 × 2 = 12.

  • Scatter diagrams

A scatter diagram or scatter plot is a graphical representation that uses the Cartesian coordinates of two variables. The plot shows the relationship between the two variables. Below there is a data table as well as a scattergram drawn from the given data.

What is the meaning of Graphical representation?

Graphical representation is a way to represent and analyze quantitative data. A graph is a kind of chart where data are plotted as variables across the coordinate axes. It makes it easy to analyze the extent of change in one variable based on the change in other variables.

Principles of graphical representation

The principles of graphical representation are algebraic. In a graph, there are two lines known as axes or coordinate axes. These are the X-axis and the Y-axis. The horizontal axis is the X-axis and the vertical axis is the Y-axis. They are perpendicular to each other and intersect at O, the point of origin.

To the right of the origin the X-axis has positive values, and to the left it has negative values. In the same way, above the origin the Y-axis has positive values, and below it has negative values.

When the X-axis and Y-axis intersect at the origin, they divide the plane into four parts, which are called Quadrant I, Quadrant II, Quadrant III and Quadrant IV.

Principles of graphical representation

A location on the coordinate plane is given by an ordered pair, written as (x, y). The first value is measured along the x-axis and the second along the y-axis. To plot any point, we always start counting from the origin and move along the x-axis: to the right if x is positive and to the left if it is negative. Then, from that position on the x-axis, we plot the y value, moving up parallel to the y-axis for a positive value or down for a negative value.

In the following graph, the 1st ordered pair is (2, 3), where both x and y are positive, so it lies in quadrant I. The 2nd ordered pair is (-3, 1), where x is negative and y is positive, so it lies in quadrant II. The 3rd ordered pair is (-1.5, -2.5), where both x and y are negative, so it lies in quadrant III.
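The sign rules above translate directly into a small check. The function name `quadrant` is an assumption for illustration; points with a zero coordinate lie on an axis and belong to no quadrant.

```python
def quadrant(x, y):
    """Return the quadrant of the ordered pair (x, y) as a Roman numeral string."""
    if x == 0 or y == 0:
        return "on an axis"          # not inside any quadrant
    if x > 0:
        return "I" if y > 0 else "IV"
    return "II" if y > 0 else "III"

# The three ordered pairs discussed in the text
for point in [(2, 3), (-3, 1), (-1.5, -2.5)]:
    print(point, "is in quadrant", quadrant(*point))
```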

Principles of graphical representation

Methods of representing a frequency distribution

There are four methods to represent a frequency distribution graphically. These are:

  • Histogram
  • Smoothed frequency graph
  • Cumulative frequency graph or Ogive.
  • Pie diagram.

Advantages and Disadvantages of Graphical representation of data

  • It improves the way of analyzing and learning as the graphical representation makes the data easy to understand.
  • It can be used in almost all fields from mathematics to physics to psychology and so on.
  • It is easy to understand for its visual impacts.
  • It shows a large amount of data as a whole at a glance.

The main disadvantage of graphical representation of data is that it takes a lot of effort as well as resources to find the most appropriate data and then represent it graphically.

Not only in mathematics but in almost every field, the graph is a very important way to store, analyze, and represent information. After any research work or survey, the next step is to organize the observations and plot them on graph paper or a plane. The visual representation of information makes it easier to understand crucial components or trends.

A huge amount of data can be stored and analyzed in a small space.

The graphical representation of data helps in making decisions by following the trend.

A complete Idea: Graphical representation constitutes a clear and comprehensive idea in the minds of the audience. Reading a large number (say hundreds) of pages may not help to make a decision. Anyone can get a clear idea just by looking into the graph or design.

Graphs are a very conceptual topic, so it is essential to get a complete understanding of the concept. Graphs are great visual aids that help explain numerous things better, and they are important in everyday life.

Frequently Asked Questions (FAQs)

What is data?

Data are characteristics or information, usually numerical, that are collected through observation.

How do you differentiate between data and information?

Data are raw facts without any interpretation added, whereas information is the meaning derived from data.

What are the types of data?

There are two types of data: quantitative and qualitative.

What are the ways to represent data?

Tables, charts and graphs are all ways of representing data, and they can be used for two broad purposes. The first is to support the collection, organisation and analysis of data as part of the process of a scientific study.

What are the different types of graphs?

Different types of graphs include:

Guide On Graphical Representation of Data – Types, Importance, Rules, Principles And Advantages

What are Graphs and Graphical Representation?

Graphs, in the context of data visualization, are visual representations of data using graphical elements such as charts and diagrams. Graphical representation of data, often referred to as graphical presentation or simply graphs, plays a crucial role in conveying information effectively.

Principles of Graphical Representation

Effective graphical representation follows certain fundamental principles that ensure clarity, accuracy, and usability:

  • Clarity: The primary goal of any graph is to convey information clearly and concisely. Graphs should be designed in a way that allows the audience to quickly grasp the key points without confusion.
  • Simplicity: Simplicity is key to effective data visualization. Extraneous details and unnecessary complexity should be avoided to prevent confusion and distraction.
  • Relevance: Include only relevant information that contributes to the understanding of the data. Irrelevant or redundant elements can clutter the graph.
  • Visualization: Select a graph type that is appropriate for the supplied data. Different graph formats, like bar charts, line graphs, and scatter plots, are appropriate for various sorts of data and relationships.

Rules for Graphical Representation of Data

Creating effective graphical representations of data requires adherence to certain rules:

  • Select the Right Graph: Choosing the appropriate type of graph is essential. For example, bar charts are suitable for comparing categories, while line charts are better for showing trends over time.
  • Label Axes Clearly: Axis labels should be descriptive and include units of measurement where applicable. Clear labeling ensures the audience understands the data’s context.
  • Use Appropriate Colors: Colors can enhance understanding but should be used judiciously. Avoid overly complex color schemes and ensure that color choices are accessible to all viewers.
  • Avoid Misleading Scaling: Scale axes appropriately to prevent exaggeration or distortion of data. Misleading scaling can lead to incorrect interpretations.
  • Include Data Sources: Always provide the source of your data. This enhances transparency and credibility.

Importance of Graphical Representation of Data

Graphical representation of data in statistics is of paramount importance for several reasons:

  • Enhances Understanding: Graphs simplify complex data, making it more accessible and understandable to a broad audience, regardless of their statistical expertise.
  • Helps Decision-Making: Visual representations of data enable informed decision-making. Decision-makers can easily grasp trends and insights, leading to better choices.
  • Engages the Audience: Graphs capture the audience’s attention more effectively than raw data. This engagement is particularly valuable when presenting findings or reports.
  • Universal Language: Graphs serve as a universal language that transcends linguistic barriers. They can convey information to a global audience without the need for translation.

Advantages of Graphical Representation

The advantages of graphical representation of data extend to various aspects of communication and analysis:

  • Clarity: Data is presented visually, improving clarity and reducing the likelihood of misinterpretation.
  • Efficiency: Graphs enable the quick absorption of information. Key insights can be found in seconds, saving time and effort.
  • Memorability: Visuals are more memorable than raw data. Audiences are more likely to retain information presented graphically.
  • Problem-Solving: Graphs help in identifying and solving problems by revealing trends, correlations, and outliers that may require further investigation.

Use of Graphical Representations

Graphical representations find applications in a multitude of fields:

  • Business: In the business world, graphs are used to illustrate financial data, track performance metrics, and present market trends. They are invaluable tools for strategic decision-making.
  • Science: Scientists employ graphs to visualize experimental results, depict scientific phenomena, and communicate research findings to both colleagues and the general public.
  • Education: Educators utilize graphs to teach students about data analysis, statistics, and scientific concepts. Graphs make learning more engaging and memorable.
  • Journalism: Journalists rely on graphs to support their stories with data-driven evidence. Graphs make news articles more informative and impactful.

Types of Graphical Representation

There exists a diverse array of graphical representations, each suited to different data types and purposes. Common types include:

1. Bar Charts:

Used to compare categories or discrete data points, often side by side.

2. Line Charts:

Ideal for showing trends and changes over time, such as stock market performance or temperature fluctuations.

3. Pie Charts:

Display parts of a whole, useful for illustrating proportions or percentages.

4. Scatter Plots:

Reveal relationships between two variables and help identify correlations.

5. Histograms:

Depict the distribution of data, especially in the context of continuous variables.

In conclusion, the graphical representation of data is an indispensable tool for simplifying complex information, aiding in decision-making, and enhancing communication across diverse fields. By following the principles and rules of effective data visualization, individuals and organizations can harness the power of graphs to convey their messages, support their arguments, and drive informed actions.

FAQs on Graphical Representation of Data

What is the purpose of graphical representation?

Graphical representation serves the purpose of simplifying complex data, making it more accessible and understandable through visual means.

Why are graphs and diagrams important?

Graphs and diagrams are crucial because they provide visual clarity, aiding in the comprehension and retention of information.

How do graphs help learning?

Graphs engage learners by presenting information visually, which enhances understanding and retention, particularly in educational settings.

Who uses graphs?

Professionals in various fields, including scientists, analysts, educators, and business leaders, use graphs to convey data effectively and support decision-making.

Where are graphs used in real life?

Graphs are used in real-life scenarios such as business reports, scientific research, news articles, and educational materials to make data more accessible and meaningful.

Why are graphs important in business?

In business, graphs are vital for analyzing financial data, tracking performance metrics, and making informed decisions, contributing to success.

Your Article Library

Graphic representation of data: meaning, principles and methods.

Read this article to learn about the meaning, principles and methods of graphic representation of data.

Meaning of Graphic Representation of Data:

Graphic representation is another way of analysing numerical data. A graph is a sort of chart through which statistical data are represented in the form of lines or curves drawn across the coordinated points plotted on its surface.

Graphs enable us to study the cause and effect relationship between two variables. Graphs help to measure the extent of change in one variable when another variable changes by a certain amount.

Graphs also enable us to study both time series and frequency distributions, as they give a clear account and a precise picture of the problem. Graphs are also easy to understand and eye-catching.

General Principles of Graphic Representation:

There are some algebraic principles which apply to all types of graphic representation of data. In a graph there are two lines called coordinate axes. One is vertical, known as the Y axis, and the other is horizontal, called the X axis. These two lines are perpendicular to each other. The point where these two lines intersect is called ‘0’ or the Origin. On the X axis, distances to the right of the origin have positive values (see fig. 7.1) and distances to the left of the origin have negative values. On the Y axis, distances above the origin have positive values and distances below the origin have negative values.

General Principles of Graphic Representation

Methods to Represent a Frequency Distribution:

Generally four methods are used to represent a frequency distribution graphically. These are the histogram, the smoothed frequency graph, the ogive or cumulative frequency graph, and the pie diagram.

1. Histogram:

A histogram is a non-cumulative frequency graph. It is drawn on a natural scale, and the frequencies of the different classes of values are represented through vertical rectangles drawn close to each other. The mode, a measure of central tendency, can be easily determined with the help of this graph.

How to draw a Histogram :

Represent the class intervals of the variables along the X axis and their frequencies along the Y-axis on natural scale.

Start the X axis with the lower limit of the lowest class interval. When the lower limit happens to be a distant score from the origin, put a break in the X-axis to indicate that the vertical axis has been moved in for convenience.

Now draw rectangular bars parallel to the Y axis above each of the class intervals, with the class units as base. The areas of the rectangles must be proportional to the frequencies of the corresponding classes.

Plot the following Data by a Histogram

In this graph we shall take class intervals in the X axis and frequencies in the Y axis. Before plotting the graph we have to convert the class into their exact limits.
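The conversion to exact limits can be sketched as follows, assuming the common convention that scores are recorded to the nearest whole unit, so each stated limit is extended by half a unit on either side. The function name `exact_limits` and the sample class intervals are hypothetical; the article's own data table is not reproduced here.

```python
def exact_limits(lower, upper, unit=1):
    """Extend stated class limits by half a measurement unit on each side.

    Assumes scores are recorded to the nearest `unit` (an assumption,
    not stated in the article).
    """
    half = unit / 2
    return (lower - half, upper + half)

# Hypothetical class intervals of whole-number scores
classes = [(10, 19), (20, 29), (30, 39)]
limits = [exact_limits(lo, hi) for lo, hi in classes]
print(limits)
```

With exact limits, the upper limit of one class coincides with the lower limit of the next, so the histogram bars touch without gaps.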

Histogram Plotted from the Data

Advantages of histogram :

1. It is easy to draw and simple to understand.

2. It helps us to understand the distribution easily and quickly.

3. It is more precise than the polygon.

Limitations of histogram :

1. It is not possible to plot more than one distribution on same axes as histogram.

2. Comparison of more than one frequency distribution on the same axes is not possible.

3. It is not possible to make it smooth.

Uses of histogram :

1. Represents the data in graphic form.

2. Provides the knowledge of how the scores in the group are distributed. Whether the scores are piled up at the lower or higher end of the distribution or are evenly and regularly distributed throughout the scale.

Frequency Polygon:

The frequency polygon is a frequency graph which is drawn by joining the coordinating points of the mid-values of the class intervals and their corresponding frequencies.

Let us discuss how to draw a frequency polygon:

Draw a horizontal line at the bottom of the graph paper, named the ‘OX’ axis. Mark off the exact limits of the class intervals along this axis. It is better to start with the c.i. of lowest value. When the lowest score in the distribution is a large number, we cannot show it graphically if we start with the origin. Therefore put a break in the X axis to indicate that the vertical axis has been moved in for convenience. Two additional points may be added to the two extreme ends.

Draw a vertical line through the extreme end of the horizontal axis known as OY axis. Along this line mark off the units to represent the frequencies of the class intervals. The scale should be chosen in such a way that it will make the largest frequency (height) of the polygon approximately 75 percent of the width of the figure.

Plot the points at a height proportional to the frequencies directly above the point on the horizontal axis representing the mid-point of each class interval.

After plotting all the points on the graph join these points by a series of short straight lines to form the frequency polygon. In order to complete the figure two additional intervals at the high end and low end of the distribution should be included. The frequency of these two intervals will be zero.
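The plotting steps above amount to computing one point per class interval at its mid-point, plus one zero-frequency point at each end. The sketch below assumes equal-width intervals given as exact limits; the function name `polygon_points` and the sample frequencies are illustrative assumptions.

```python
def polygon_points(intervals, freqs):
    """Return (mid-point, frequency) pairs for a frequency polygon.

    `intervals` is a list of (lower, upper) exact limits of equal width,
    in ascending order. One extra interval with frequency 0 is added at
    each end so the polygon closes on the horizontal axis.
    """
    width = intervals[0][1] - intervals[0][0]
    mids = [(lo + hi) / 2 for lo, hi in intervals]
    xs = [mids[0] - width] + mids + [mids[-1] + width]
    ys = [0] + list(freqs) + [0]
    return list(zip(xs, ys))

# Hypothetical distribution
intervals = [(9.5, 19.5), (19.5, 29.5), (29.5, 39.5)]
points = polygon_points(intervals, [2, 5, 3])
print(points)
```

Joining consecutive points with straight lines gives the polygon.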

Illustration: No. 7.3 :

Draw a frequency polygon from the following data:

Frequency Polygon

In this graph we shall take the class intervals (marks in mathematics) on the X axis, and frequencies (number of students) on the Y axis. Before plotting the graph we have to convert the c.i. into their exact limits and extend one c.i. at each end with a frequency of 0.

Class intervals with exact limits:

Class intervals with exact limits

Advantages of frequency polygon :

2. It is possible to plot two distributions at a time on same axes.

3. Comparison of two distributions can be made through frequency polygon.

4. It is possible to make it smooth.

Limitations of frequency polygon :

1. It is less precise.

2. It does not accurately represent, in terms of area, the frequency in each interval.

Uses of frequency polygon :

1. When two or more distributions are to be compared the frequency polygon is used.

2. It represents the data in graphic form.

3. It provides knowledge of how the scores in one or more group are distributed. Whether the scores are piled up at the lower or higher end of the distribution or are evenly and regularly distributed throughout the scale.

2. Smoothed Frequency Polygon :

When the sample is very small and the frequency distribution is irregular, the polygon is very zig-zag. In order to wipe out the irregularities and “also get a better notion of how the figure might look if the data were more numerous, the frequency polygon may be smoothed.”

In this process to adjust the frequencies we take a series of ‘moving’ or ‘running’ averages. To get an adjusted or smoothed frequency we add the frequency of a class interval with the two adjacent intervals, just below and above the class interval. Then the sum is divided by 3. When these adjusted frequencies are plotted against the class intervals on a graph we get a smoothed frequency polygon.
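The running average described above can be sketched as below. The article does not say how the two end intervals are handled; this sketch assumes the missing neighbour beyond each end has a frequency of 0. The function name `smooth` and the sample frequencies are hypothetical.

```python
def smooth(freqs):
    """Three-point running average of class frequencies.

    Each smoothed value is (previous + current + next) / 3.
    Assumption: the neighbours beyond the first and last interval are 0.
    """
    padded = [0] + list(freqs) + [0]
    return [
        (padded[i - 1] + padded[i] + padded[i + 1]) / 3
        for i in range(1, len(padded) - 1)
    ]

# Hypothetical frequencies for three consecutive class intervals
print(smooth([3, 6, 9]))
```

For the middle interval this gives (3 + 6 + 9) / 3 = 6, and for the first interval (0 + 3 + 6) / 3 = 3.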

Illustration 7.4 :

Draw a smoothed frequency polygon, of the data given in the illustration No. 7.3:

Here we have to first convert the class intervals into their exact limits. Then we have to determine the adjusted or smoothed frequencies.

Determine the Adjusted or Smoothed Frequencies

3. Ogive or Cumulative Frequency Polygon:

An ogive is a cumulative frequency graph drawn on a natural scale to determine values such as the median, quartiles and percentiles. In these graphs the exact limits of the class intervals are shown along the X-axis and the cumulative frequencies are shown along the Y-axis. The steps to draw an ogive are given below.

Get the cumulative frequency by adding the frequencies cumulatively, from the lower end (to get a less than ogive) or from the upper end (to get a more than ogive).

Mark off the class intervals in the X-axis.

Represent the cumulative frequencies along the Y-axis, beginning with zero at the base.

Put a dot at each point whose coordinates are the upper limit of a class interval and the corresponding cumulative frequency.

Join all the dots with a smoothly drawn line. The resulting curve is called an ogive.
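The first step, accumulating the frequencies from either end, can be sketched with Python's standard `itertools.accumulate`. The function names and the sample frequencies are assumptions made for illustration.

```python
from itertools import accumulate

def less_than_cumulative(freqs):
    """Running total from the lowest class upward (for a 'less than' ogive)."""
    return list(accumulate(freqs))

def more_than_cumulative(freqs):
    """Running total from the highest class downward (for a 'more than' ogive)."""
    return list(accumulate(reversed(freqs)))[::-1]

# Hypothetical frequencies, lowest class interval first
freqs = [2, 5, 3]
upper_limits = [19.5, 29.5, 39.5]        # assumed exact upper limits

less_than = less_than_cumulative(freqs)
more_than = more_than_cumulative(freqs)
ogive_points = list(zip(upper_limits, less_than))
print(ogive_points)
```

The last "less than" value and the first "more than" value both equal the total number of observations, which is a quick check on the arithmetic.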

Illustration No. 7.5 :

Draw an ogive from the data given below:

ogive

To plot this graph first we have to convert, the class intervals into their exact limits. Then we have to calculate the cumulative frequencies of the distribution.

Cumulative Frequencies of the Distribution

Now we have to plot the cumulative frequencies in respect to their corresponding class-intervals.

Ogive plotted from the data given above:

Ogive plotted

Uses of Ogive:

1. Ogive is useful to determine the number of students below and above a particular score.

2. When the median as a measure of central tendency is wanted.

3. When the quartiles, deciles and percentiles are wanted.

4. By plotting the scores of two groups on a same scale we can compare both the groups.

4. The Pie Diagram:

The figure below shows the distribution of elementary pupils by their academic achievement in a school. Of the total, 60% are high achievers, 25% middle achievers and 15% low achievers. The construction of this pie diagram is quite simple. There are 360 degrees in the circle. Hence, 60% of 360°, or 216°, is counted off as shown in the diagram; this sector represents the proportion of high achievers.

Ninety degrees are counted off for the middle achievers (25%) and 54 degrees for the low achievers (15%). The pie diagram is useful when one wishes to picture proportions of the total in a striking way. Numbers of degrees may be measured off “by eye” or more accurately with a protractor.
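The sector angles above follow from multiplying each percentage by 360° / 100. A minimal sketch, using the percentages from the text and a hypothetical function name `sector_angles`:

```python
def sector_angles(percentages):
    """Convert percentages of the whole into sector angles in degrees."""
    return {label: pct * 360 / 100 for label, pct in percentages.items()}

# Percentages taken from the example in the text
achievement = {"high": 60, "middle": 25, "low": 15}
angles = sector_angles(achievement)
for label, angle in angles.items():
    print(f"{label} achievers: {angle} degrees")
```

The angles should add up to 360 whenever the percentages add up to 100.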

Distribution by Academic Achievement of Pupils in Class VI of a School

Uses of Pie diagram :

1. Pie diagram is useful when one wants to picture proportions of the total in a striking way.

2. The pie diagram is used when a population is stratified and each stratum is to be presented as a percentage.

17 Data Visualization Techniques All Professionals Should Know

Data Visualizations on a Page

  • 17 Sep 2019

There’s a growing demand for business analytics and data expertise in the workforce. But you don’t need to be a professional analyst to benefit from data-related skills.

Becoming skilled at common data visualization techniques can help you reap the rewards of data-driven decision-making, including increased confidence and potential cost savings. Learning how to effectively visualize data could be the first step toward using data analytics and data science to your advantage to add value to your organization.

Several data visualization techniques can help you become more effective in your role. Here are 17 essential data visualization techniques all professionals should know, as well as tips to help you effectively present your data.

What Is Data Visualization?

Data visualization is the process of creating graphical representations of information. This process helps the presenter communicate data in a way that’s easy for the viewer to interpret and draw conclusions.

There are many different techniques and tools you can leverage to visualize data, so you want to know which ones to use and when. Here are some of the most important data visualization techniques all professionals should know.

Data Visualization Techniques

The type of data visualization technique you leverage will vary based on the type of data you’re working with, in addition to the story you’re telling with your data.

Here are some important data visualization techniques to know:

  • Gantt Chart
  • Box and Whisker Plot
  • Waterfall Chart
  • Scatter Plot
  • Pictogram Chart
  • Highlight Table
  • Bullet Graph
  • Choropleth Map
  • Network Diagram
  • Correlation Matrices

1. Pie Chart

Pie Chart Example

Pie charts are one of the most common and basic data visualization techniques, used across a wide range of applications. Pie charts are ideal for illustrating proportions, or part-to-whole comparisons.

Because pie charts are relatively simple and easy to read, they’re best suited for audiences who might be unfamiliar with the information or are only interested in the key takeaways. For viewers who require a more thorough explanation of the data, pie charts fall short in their ability to display complex information.

2. Bar Chart

Bar Chart Example

The classic bar chart, or bar graph, is another common and easy-to-use method of data visualization. In this type of visualization, one axis of the chart shows the categories being compared, and the other, a measured value. The length of the bar indicates how each group measures according to the value.

One drawback is that labeling and clarity can become problematic when there are too many categories included. Like pie charts, they can also be too simple for more complex data sets.

3. Histogram

Histogram Example

Unlike bar charts, histograms illustrate the distribution of data over a continuous interval or defined period. These visualizations are helpful in identifying where values are concentrated, as well as where there are gaps or unusual values.

Histograms are especially useful for showing the frequency of a particular occurrence. For instance, if you’d like to show how many clicks your website received each day over the last week, you can use a histogram. From this visualization, you can quickly determine which days your website saw the greatest and fewest number of clicks.

4. Gantt Chart

Gantt Chart Example

Gantt charts are particularly common in project management, as they’re useful in illustrating a project timeline or progression of tasks. In this type of chart, tasks to be performed are listed on the vertical axis and time intervals on the horizontal axis. Horizontal bars in the body of the chart represent the duration of each activity.

Utilizing Gantt charts to display timelines can be incredibly helpful, and enable team members to keep track of every aspect of a project. Even if you’re not a project management professional, familiarizing yourself with Gantt charts can help you stay organized.

5. Heat Map

Heat Map Example

A heat map is a type of visualization used to show differences in data through variations in color. These charts use color to communicate values in a way that makes it easy for the viewer to quickly identify trends. Having a clear legend is necessary in order for a user to successfully read and interpret a heatmap.

There are many possible applications of heat maps. For example, if you want to analyze which time of day a retail store makes the most sales, you can use a heat map that shows the day of the week on the vertical axis and time of day on the horizontal axis. Then, by shading in the matrix with colors that correspond to the number of sales at each time of day, you can identify trends in the data that allow you to determine the exact times your store experiences the most sales.

6. Box and Whisker Plot

Box and Whisker Plot Example

A box and whisker plot, or box plot, provides a visual summary of data through its quartiles. First, a box is drawn from the first quartile to the third quartile of the data set. A line within the box represents the median. “Whiskers,” or lines, are then drawn extending from the box to the minimum (lower extreme) and maximum (upper extreme). Outliers are represented by individual points that are in line with the whiskers.

This type of chart is helpful in quickly identifying whether or not the data is symmetrical or skewed, as well as providing a visual summary of the data set that can be easily interpreted.
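The values a box plot draws can be computed before any chart is made. The following is a minimal sketch, assuming Python's standard `statistics` module and a made-up sample; quartile conventions differ between tools, so the `"inclusive"` method used here is only one reasonable choice.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    # With n=4, statistics.quantiles returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]

# Hypothetical sample, not taken from the article.
sample = [2, 4, 4, 5, 7, 8, 9, 10, 12]
summary = five_number_summary(sample)
print(summary)
```

The box would span Q1 to Q3, the line inside it would sit at the median, and the whiskers would reach the minimum and maximum.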

7. Waterfall Chart

Waterfall Chart Example

A waterfall chart is a visual representation that illustrates how a value changes as it’s influenced by different factors, such as time. The main goal of this chart is to show the viewer how a value has grown or declined over a defined period. For example, waterfall charts are popular for showing spending or earnings over time.
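Each bar in a waterfall chart starts where the previous one ended, so the underlying calculation is a running total. The sketch below assumes a hypothetical starting balance and list of changes; the labels and amounts are illustrative only.

```python
from itertools import accumulate

def waterfall_bars(start, changes):
    """Return (label, bar_start, bar_end) for each change, in order."""
    deltas = [delta for _, delta in changes]
    # Running totals, beginning with the starting value.
    running = list(accumulate(deltas, initial=start))
    return [(label, running[i], running[i + 1])
            for i, (label, _) in enumerate(changes)]

# Hypothetical earnings and spending, not taken from the article.
changes = [("Sales", 40), ("Refunds", -15), ("Costs", -30), ("Interest", 5)]
bars = waterfall_bars(100, changes)
for label, bar_start, bar_end in bars:
    print(f"{label:10s} {bar_start:>5} -> {bar_end:>5}")
```

A charting tool would draw each tuple as a floating bar from `bar_start` to `bar_end`, typically colored by whether the change is positive or negative.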

8. Area Chart

Area Chart Example

An area chart, or area graph, is a variation on a basic line graph in which the area underneath the line is shaded to represent the total value of each data point. When several data series must be compared on the same graph, stacked area charts are used.

This method of data visualization is useful for showing changes in one or more quantities over time, as well as showing how each quantity combines to make up the whole. Stacked area charts are effective in showing part-to-whole comparisons.

9. Scatter Plot

Scatter Plot Example

Another technique commonly used to display data is a scatter plot. A scatter plot displays data for two variables as points plotted against the horizontal and vertical axes. This type of data visualization is useful in illustrating the relationships that exist between variables and can be used to identify trends or correlations in data.

Scatter plots are most effective for fairly large data sets, since it’s often easier to identify trends when there are more data points present. Additionally, the closer the data points are grouped together, the stronger the correlation or trend tends to be.

10. Pictogram Chart

Pictogram Example

Pictogram charts, or pictograph charts, are particularly useful for presenting simple data in a more visual and engaging way. These charts use icons to visualize data, with each icon representing a different value or category. For example, data about time might be represented by icons of clocks or watches. Each icon can correspond to either a single unit or a set number of units (for example, each icon represents 100 units).

In addition to making the data more engaging, pictogram charts are helpful in situations where language or cultural differences might be a barrier to the audience’s understanding of the data.

11. Timeline

Timeline Example

Timelines are an effective way to visualize a sequence of events in chronological order. They’re typically linear, with key events outlined along the axis. Timelines are used to communicate time-related information and display historical data.

Timelines allow you to highlight the most important events that occurred, or need to occur in the future, and make it easy for the viewer to identify any patterns appearing within the selected time period. While timelines are often relatively simple linear visualizations, they can be made more visually appealing by adding images, colors, fonts, and decorative shapes.

12. Highlight Table

Highlight Table Example

A highlight table is a more engaging alternative to traditional tables. By highlighting cells in the table with color, you can make it easier for viewers to quickly spot trends and patterns in the data. These visualizations are useful for comparing categorical data.

Depending on the data visualization tool you’re using, you may be able to add conditional formatting rules to the table that automatically color cells that meet specified conditions. For instance, when using a highlight table to visualize a company’s sales data, you may color cells red if the sales data is below the goal, or green if sales were above the goal. Unlike a heat map, the colors in a highlight table are discrete and represent a single meaning or value.

13. Bullet Graph

Bullet Graph Example

A bullet graph is a variation of a bar graph that can act as an alternative to dashboard gauges to represent performance data. The main use for a bullet graph is to inform the viewer of how a business is performing in comparison to benchmarks that are in place for key business metrics.

In a bullet graph, the darker horizontal bar in the middle of the chart represents the actual value, while the vertical line represents a comparative value, or target. If the horizontal bar passes the vertical line, the target for that metric has been surpassed. Additionally, the segmented colored sections behind the horizontal bar represent range scores, such as “poor,” “fair,” or “good.”

14. Choropleth Map

Choropleth Map Example

A choropleth map uses color, shading, and other patterns to visualize numerical values across geographic regions. These visualizations use a progression of color (or shading) on a spectrum to distinguish high values from low.

Choropleth maps allow viewers to see how a variable changes from one region to the next. A potential downside to this type of visualization is that the exact numerical values aren’t easily accessible because the colors represent a range of values. Some data visualization tools, however, allow you to add interactivity to your map so the exact values are accessible.

15. Word Cloud

Word Cloud Example

A word cloud, or tag cloud, is a visual representation of text data in which the size of the word is proportional to its frequency. The more often a specific word appears in a dataset, the larger it appears in the visualization. In addition to size, words often appear bolder or follow a specific color scheme depending on their frequency.

Word clouds are often used on websites and blogs to identify significant keywords and compare differences in textual data between two sources. They are also useful when analyzing qualitative datasets, such as the specific words consumers used to describe a product.
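The sizing rule behind a word cloud can be sketched in a few lines. This example assumes a made-up snippet of customer feedback and a hypothetical `max_size` of 40 points for the most frequent word; every other word is scaled in proportion to its count.

```python
import re
from collections import Counter

def word_frequencies(text):
    """Count how often each lowercase word appears in the text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def font_sizes(frequencies, max_size=40):
    """Give each word a size proportional to its frequency."""
    top = max(frequencies.values())
    return {word: max_size * count / top for word, count in frequencies.items()}

# Hypothetical feedback text, not taken from the article.
feedback = "Great product, great price, fast shipping. Great service, fast!"
freqs = word_frequencies(feedback)
sizes = font_sizes(freqs)
for word, count in freqs.most_common():
    print(f"{word:10s} count={count} size={sizes[word]:.1f}")
```

Real word cloud tools usually also drop common stop words and handle layout, which this sketch leaves out.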

16. Network Diagram

Network Diagram Example

Network diagrams are a type of data visualization that represent relationships between qualitative data points. These visualizations are composed of nodes and links, also called edges. Nodes are singular data points that are connected to other nodes through edges, which show the relationship between multiple nodes.

There are many use cases for network diagrams, including depicting social networks, highlighting the relationships between employees at an organization, or visualizing product sales across geographic regions.

17. Correlation Matrix

Correlation Matrix Example

A correlation matrix is a table that shows correlation coefficients between variables. Each cell represents the relationship between two variables, and a color scale is used to communicate whether the variables are correlated and to what extent.

Correlation matrices are useful to summarize and find patterns in large data sets. In business, a correlation matrix might be used to analyze how different data points about a specific product might be related, such as price, advertising spend, launch date, etc.
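As an illustration of what sits behind each cell, the following sketch computes Pearson correlation coefficients for every pair of columns. The product figures and column names are hypothetical, and a real analysis would more likely use a data analysis library than hand-written functions.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Return a nested dict: matrix[a][b] is the correlation of a with b."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names}
            for a in names}

# Hypothetical product data, not taken from the article.
data = {
    "price": [10, 12, 14, 16, 18],
    "ad_spend": [5, 6, 7, 8, 9],
    "units_sold": [50, 44, 41, 35, 30],
}
matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(f"{name:10s}", [round(row[other], 2) for other in matrix])
```

Values near 1 or -1 indicate a strong positive or negative linear relationship, and these are the cells a color scale would shade most intensely.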

Other Data Visualization Options

While the examples listed above are some of the most commonly used techniques, there are many other ways you can visualize data to become a more effective communicator. Some other data visualization options include:

  • Bubble clouds
  • Circle views
  • Dendrograms
  • Dot distribution maps
  • Open-high-low-close charts
  • Polar areas
  • Radial trees
  • Ring Charts
  • Sankey diagram
  • Span charts
  • Streamgraphs
  • Wedge stack graphs
  • Violin plots

Tips For Creating Effective Visualizations

Creating effective data visualizations requires more than just knowing how to choose the best technique for your needs. There are several considerations you should take into account to maximize your effectiveness when it comes to presenting data.

One of the most important steps is to evaluate your audience. For example, if you’re presenting financial data to a team that works in an unrelated department, you’ll want to choose a fairly simple illustration. On the other hand, if you’re presenting financial data to a team of finance experts, it’s likely you can safely include more complex information.

Another helpful tip is to avoid unnecessary distractions. Although visual elements like animation can be a great way to add interest, they can also distract from the key points the illustration is trying to convey and hinder the viewer’s ability to quickly understand the information.

Finally, be mindful of the colors you utilize, as well as your overall design. While it’s important that your graphs or charts are visually appealing, there are more practical reasons you might choose one color palette over another. For instance, using low contrast colors can make it difficult for your audience to discern differences between data points. Using colors that are too bold, however, can make the illustration overwhelming or distracting for the viewer.

Visuals to Interpret and Share Information

No matter your role or title within an organization, data visualization is a skill that’s important for all professionals. Being able to effectively present complex data through easy-to-understand visual representations is invaluable when it comes to communicating information with members both inside and outside your business.

There’s no shortage in how data visualization can be applied in the real world. Data is playing an increasingly important role in the marketplace today, and data literacy is the first step in understanding how analytics can be used in business.

Are you interested in improving your analytical skills? Learn more about Business Analytics, our eight-week online course that can help you use data to generate insights and tackle business decisions.

This post was updated on January 20, 2022. It was originally published on September 17, 2019.

  • Graphic Presentation of Data

Apart from diagrams, graphic presentation is another way of presenting data and information. Usually, graphs are used to present time series and frequency distributions. In this article, we will look at the graphic presentation of data and information along with its merits, limitations, and types.

Construction of a Graph

The graphic presentation of data and information offers a quick and simple way of understanding the features and drawing comparisons. Further, it is an effective analytical tool and a graph can help us in finding the mode, median, etc.

We can locate a point in a plane using two mutually perpendicular lines: the X-axis (the horizontal line) and the Y-axis (the vertical line). Their point of intersection is the Origin.

We can locate the position of a point in terms of its distance from both these axes. For example, if a point P is 3 units away from the Y-axis and 5 units away from the X-axis, then its location is as follows:

presentation of data and information

Some points to remember:

  • We measure the distance of the point from the Y-axis along the X-axis. Similarly, we measure the distance of the point from the X-axis along the Y-axis. Therefore, to measure 3 units from the Y-axis, we move 3 units along the X-axis and likewise for the other coordinate.
  • We then draw perpendicular lines from these two points.
  • The point where the perpendiculars intersect is the position of the point P.
  • We denote it as (3, 5), that is, (abscissa, ordinate). Together, they are the coordinates of the point P.
  • The four parts of the plane are Quadrants.
  • Also, we can plot different points for a different pair of values.
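The points above can be sketched in code. This is a minimal illustration, assuming the usual convention (not stated in the text) that quadrants are numbered 1 to 4 counterclockwise starting from the upper right; the function name `quadrant` is hypothetical.

```python
def quadrant(x, y):
    """Return the quadrant (1-4) of the point (x, y), or None if it lies on an axis."""
    if x == 0 or y == 0:
        return None
    if x > 0:
        return 1 if y > 0 else 4
    return 2 if y > 0 else 3

# The point P from the text: abscissa 3, ordinate 5.
abscissa, ordinate = 3, 5
distance_from_y_axis = abs(abscissa)  # measured along the X-axis
distance_from_x_axis = abs(ordinate)  # measured along the Y-axis
print(distance_from_y_axis, distance_from_x_axis, quadrant(abscissa, ordinate))
```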

General Rules for Graphic Presentation of Data and Information

There are certain guidelines for an attractive and effective graphic presentation of data and information. These are as follows:

  • Suitable Title – Ensure that you give a suitable title to the graph which clearly indicates the subject for which you are presenting it.
  • Unit of Measurement – Clearly state the unit of measurement below the title.
  • Suitable Scale – Choose a suitable scale so that you can represent the entire data in an accurate manner.
  • Index – Include a brief index which explains the different colors and shades, lines and designs that you have used in the graph. Also, include a scale of interpretation for better understanding.
  • Data Sources – Wherever possible, include the sources of information at the bottom of the graph.
  • Keep it Simple – You should construct a graph which even a layman (without any exposure in the areas of statistics or mathematics) can understand.
  • Neat – A graph is a visual aid for the presentation of data and information. Therefore, you must keep it neat and attractive. Choose the right size, right lettering, and appropriate lines, colors, dashes, etc.

Merits of a Graph

  • The graph presents data in a manner which is easier to understand.
  • It allows us to present statistical data in an attractive manner as compared to tables. Users can understand the main features, trends, and fluctuations of the data at a glance.
  • A graph saves time.
  • It allows the viewer to compare data relating to two different time-periods or regions.
  • The viewer does not require prior knowledge of mathematics or statistics to understand a graph.
  • We can use a graph to locate the mode, median, and mean values of the data.
  • It is useful in forecasting, interpolation, and extrapolation of data.

Limitations of a Graph

  • A graph lacks complete accuracy of facts.
  • It depicts only a few selected characteristics of the data.
  • We cannot use a graph in support of a statement.
  • A graph is not a substitute for tables.
  • Usually, laymen find it difficult to understand and interpret a graph.
  • Typically, a graph shows only the general tendency of the data, and the actual values are not clear.

Types of Graphs

Graphs are of two types:

  • Time Series graphs
  • Frequency Distribution graphs

Time Series Graphs

A time series graph, or “historigram,” is a graph which depicts the value of a variable at different points in time. In a time series graph, time is the most important factor and the variable is related to time. It helps in the understanding and analysis of the changes in the variable over time. Many statisticians and businesspeople use these graphs because they are easy to understand and because they present complex information in a simple manner.

Further, constructing a time series graph does not require a user with technical skills. Here are some major steps in the construction of a time series graph:

  • Represent time on the X-axis and the value of the variable on the Y-axis.
  • Start the Y-value with zero and devise a suitable scale which helps you present the whole data in the given space.
  • Plot the values of the variable and join successive points with straight lines.
  • You can plot multiple variables through different lines.

You can use a line graph to summarize how two pieces of information are related and how they vary with each other.

Advantages

  • You can compare multiple continuous data sets easily.
  • You can infer interim data from the graph line.

Disadvantages

  • It is only used with continuous data.

Use of a false Base Line

Usually, in a graph, the vertical scale starts from zero at the Origin. However, in some cases, a false Base Line is used for a better representation of the data. There are two scenarios where you should use a false Base Line:

  • To magnify the minor fluctuation in the time series data
  • To economize the space

Net Balance Graph

If you have to show the net balance of income and expenditure or revenue and costs or imports and exports, etc., then you must use a net balance graph. You can use different colors or shades for positive and negative differences.

Frequency Distribution Graphs

Let’s look at the different types of frequency distribution graphs.

Histogram

A histogram is a graph of a grouped frequency distribution. In a histogram, we plot the class intervals on the X-axis and their respective frequencies on the Y-axis. Further, we create a rectangle on each class interval with its height proportional to the frequency density of the class.
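Frequency density is the class frequency divided by the class width, which matters when class intervals are unequal. The following sketch uses a hypothetical grouped distribution to show the calculation; the class limits and frequencies are illustrative only.

```python
def frequency_densities(classes):
    """classes: list of (lower, upper, frequency) tuples.

    Returns (lower, upper, density) where density = frequency / class width.
    """
    return [(lower, upper, freq / (upper - lower))
            for lower, upper, freq in classes]

# Hypothetical distribution with unequal class widths.
classes = [(0, 10, 5), (10, 20, 12), (20, 40, 16), (40, 80, 8)]
densities = frequency_densities(classes)
for lower, upper, density in densities:
    print(f"{lower:>3}-{upper:<3} height = {density}")
```

Here the class 20-40 has the largest frequency (16) but not the tallest rectangle, because its width is twice that of the class 10-20.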

presentation of data and information

Frequency Polygon or Histograph

A frequency polygon or a Histograph is another way of representing a frequency distribution on a graph. You draw a frequency polygon by joining the midpoints of the upper widths of the adjacent rectangles of the histogram with straight lines.

presentation of data and information

Frequency Curve

When you join the vertices of a frequency polygon using a smooth curve, the resulting figure is a Frequency Curve. As the number of observations increases, we need to accommodate more classes. Therefore, the width of each class reduces. In such a scenario, the variable tends to become continuous and the frequency polygon starts taking the shape of a frequency curve.

Cumulative Frequency Curve or Ogive

A cumulative frequency curve or Ogive is the graphical representation of a cumulative frequency distribution. Since a cumulative frequency is either of a ‘less than’ or a ‘more than’ type, Ogives are of two types too – ‘less than ogive’ and ‘more than ogive’.
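The points plotted for the two kinds of ogive can be computed from a grouped frequency distribution. This sketch assumes a hypothetical distribution given as (lower boundary, upper boundary, frequency) tuples; the function names are illustrative.

```python
from itertools import accumulate

def less_than_ogive(classes):
    """Points (upper boundary, number of observations below it)."""
    uppers = [upper for _, upper, _ in classes]
    cumulative = list(accumulate(freq for _, _, freq in classes))
    return list(zip(uppers, cumulative))

def more_than_ogive(classes):
    """Points (lower boundary, number of observations at or above it)."""
    freqs = [freq for _, _, freq in classes]
    total = sum(freqs)
    # Observations falling below each lower boundary.
    below = [0] + list(accumulate(freqs))[:-1]
    return [(lower, total - b) for (lower, _, _), b in zip(classes, below)]

# Hypothetical grouped distribution, not taken from the article.
classes = [(0, 10, 5), (10, 20, 12), (20, 30, 8), (30, 40, 5)]
print(less_than_ogive(classes))
print(more_than_ogive(classes))
```

The 'less than' curve rises to the total frequency, while the 'more than' curve falls from it.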

presentation of data and information

Scatter Diagram

A scatter diagram or a dot chart enables us to find the nature of the relationship between two variables. If the plotted points are widely scattered, then the relationship between the two variables is weaker.

presentation of data and information

Solved Question

Q1. What are the general rules for the graphic presentation of data and information?

Answer: The general rules for the graphic presentation of data are:

  • Use a suitable title
  • Clearly specify the unit of measurement
  • Ensure that you choose a suitable scale
  • Provide an index specifying the colors, lines, and designs used in the graph
  • If possible, provide the sources of information at the bottom of the graph
  • Keep the graph simple and neat.

Statistics LibreTexts

2.3: Graphical Displays

  • Rachel Webb
  • Portland State University

Statistical graphs are useful in getting the audience’s attention in a publication or presentation. Data presented graphically is easier to summarize at a glance compared to frequency distributions or numerical summaries. Graphs are useful to reinforce a critical point, summarize a data set, or discover patterns or trends over a period of time. Florence Nightingale (1820-1910) was one of the first people to use graphical representations to present data. Nightingale was a nurse in the Crimean War and used a type of graph that she called a polar area diagram, or coxcomb, to display mortality figures for contagious diseases such as cholera and typhus.

Nightingale

Nightingale-mortality.jpg. (2021, May 18). Wikimedia Commons, the free media repository . Retrieved July 2021 from https://commons.wikimedia.org/w/index.php?title=File:Nightingale-mortality.jpg&oldid=561529217.

It is hard to provide a complete overview of the most recent developments in data visualization with the onset of technology. The development of a variety of highly interactive software has accelerated the pace and variety of graphical displays across a wide range of disciplines.

2.3.1 Stem-and-Leaf Plot

Stem-and-leaf plots (or stemplots) are a useful way of getting a quick picture of the shape of a distribution by hand. Turn the graph sideways and you can see the shape of your data and easily identify outliers. Each observation is divided into two pieces: the stem and the leaf. If the number has just two digits, then the stem is the tens digit and the leaf is the ones digit. When a number has more than two digits, choose the cut point so that the data are split into enough classes to show the shape of the distribution.

To create a stem-and-leaf plot:

  • Separate each observation into a stem and a leaf.
  • Write the stems in a vertical column in ascending order (from smallest to largest). Fill in missing numbers even if there are gaps in the data. Draw a vertical line to the right of this column.
  • Write each leaf in the row to the right of its stem, in increasing order.

Create a stem-and-leaf plot for the sample of 35 ages.

A small sample of house prices in thousands of dollars was collected: 375, 189, 432, 225, 305, 275. Make a stem-and-leaf plot.

If we were to split the stem and leaf between the ones and tens places, then we would need stems going from 18 up to 43. Twenty-six stems for only six data points is too many. The next break for a stem would be between the tens and hundreds places. This would give stems from 1 to 4. Then each leaf will be the tens and ones digits. For example, the number 375 would have a stem = 3 and a leaf = 75.

\begin{array}{l|ll} 1 & 89 \\ 2 & 25 & 75 \\ 3 & 05 & 75 \\ 4 & 32 \end{array}

Leaf = $1000

A small sample of coffee prices: 3.75, 1.89, 4.32, 2.25, 3.05, 2.75 was collected. Make a stem-and-leaf plot.

Leaf = $0.01

Note that the last two stem-and-leaf plots look identical except for the footnote. It is important to include units to tell people what the stems and leaves mean by inserting a legend.
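The construction steps can be sketched as a short function. This is a minimal illustration that assumes non-negative integer data and a hypothetical `leaf_digits` parameter for the cut point; the coffee prices are converted to cents first so that both samples can use the same code.

```python
from collections import defaultdict

def stem_and_leaf(values, leaf_digits=1):
    """Return the rows of a stem-and-leaf plot for non-negative integers."""
    divisor = 10 ** leaf_digits
    rows = defaultdict(list)
    for value in sorted(values):
        rows[value // divisor].append(value % divisor)
    lines = []
    # Include every stem between the smallest and largest, even if empty.
    for stem in range(min(rows), max(rows) + 1):
        leaves = " ".join(f"{leaf:0{leaf_digits}d}" for leaf in rows.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines

house_prices = [375, 189, 432, 225, 305, 275]        # thousands of dollars
coffee_prices = [3.75, 1.89, 4.32, 2.25, 3.05, 2.75]  # dollars
house_plot = stem_and_leaf(house_prices, leaf_digits=2)
coffee_plot = stem_and_leaf([round(p * 100) for p in coffee_prices], leaf_digits=2)
print("\n".join(house_plot))
```

Both calls produce the same rows, which is why the legend stating the leaf unit is needed to tell the two plots apart.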

Back-to-back stem-and-leaf plots let us compare two data sets on the same number line. The two samples share the same set of stems. The sample on the left is written backward, from largest leaf to smallest leaf, and the sample on the right has leaves from smallest to largest.

Use the following back-to-back stem-and-leaf plot to compare pulse rates before and after exercise.

The group on the left has leaves going in descending order and represents the pulse rates before exercise. The stems are in the middle column. The group on the right has leaves going in ascending order and represents the pulse rates after exercise. The first row has pulse rates of 62, 65, 66, 67, 68, 68 and 69. The last row has pulse rates of 124, 125, and 128.

2.3.2 Histogram

A histogram is a graph for quantitative data (we call these bar graphs for qualitative data). The data is divided into a number of classes. The class limits become the horizontal axis demarcated with a number line and the vertical axis is either the frequency or the relative frequency of each class. Figure 2-9 is an example of a histogram.

The histogram for quantitative data looks similar to a bar graph, except there are some major differences.

First, in a bar graph the categories can be put in any order on the horizontal axis. There is no set order for these nominal data. You cannot say how the data is distributed based on the shape, since the shape can change just by putting the categories in different orders. With quantitative data, the data are in a specific order, since you are dealing with numbers. With quantitative data, you can talk about a distribution shape.

This leads to the second difference from bar graphs. In a bar graph, the categories that you made in the frequency table were the words used for the category name. In quantitative data, the categories are numerical categories, and the numbers are determined by how many classes you choose. If two people have the same number of categories, then they will have the same frequency distribution. Whereas in qualitative data, there can be many different categories depending on the point of view of the author.

The third difference is that the bars touch with quantitative data, and there will be no gaps in the graph. The reason that bar graphs have gaps is to show that the categories do not continue on, as they do in quantitative data. Since the graph for quantitative data is different from qualitative data, it is given a different name of histogram.

Some key features of a histogram:

  • Equal spacing on each axis
  • Bars are the same width
  • Label each axis and title the graph
  • Show the scale on the frequency axis
  • Label the categories on the category axis
  • The bars should touch at the class boundaries

To create a histogram, you must first create a frequency distribution. Software and calculators can create histograms easily when a large amount of sample data is being analyzed.

To create a histogram in Excel you will need to first install the Data Analysis tool.

If Data Analysis is not showing in the Data tab, follow the directions for installing the free add-in here: https://support.office.com/en-us/article/Load-the-Analysis-ToolPak-in-Excel-6a63e598-cd6d-42e3-9317-6b40ba1a66b4.

Type the data into one blank column in any order. If you want class widths other than Excel’s default setting, type the endpoints of each class from your frequency distribution into a new column. These are called the bins in Excel.

Using the sample of 35 ages, make a histogram using Excel.

Excel draws a bar for each class with a height equal to its frequency, and then overlays a line graph of the cumulative relative frequencies. This red line is also called an ogive and is discussed in a later section.

It is important to note that the number of classes that are used and the value of the first class boundary will change the shape of the histogram.

A relative frequency histogram is when the relative frequencies are used for the vertical axis instead of the frequencies and the y-axis will represent a percent instead of the number of people.

In Excel, after you create your histogram, you can manually change the frequency column to the relative frequency values by dividing each number by the sample size. As soon as you press Enter after changing a value, the corresponding bar will shrink and adjust.

After the last value =7/35 was entered and the label changed to Relative Frequency you get the following graph.

The shape of the histogram will be the same for the relative frequency distribution and the frequency distribution; the height, though, is the proportion instead of frequency.
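The conversion from frequencies to relative frequencies is a single division per class. In the sketch below, only the sample size of 35 and the final count of 7 come from the text; the other class counts are hypothetical placeholders.

```python
def relative_frequencies(frequencies):
    """Divide each class frequency by the sample size."""
    total = sum(frequencies)
    return [freq / total for freq in frequencies]

# Hypothetical class counts that sum to 35; the last class has 7 observations.
frequencies = [3, 6, 9, 10, 7]
relative = relative_frequencies(frequencies)
for freq, rel in zip(frequencies, relative):
    print(f"{freq:>2} -> {rel:.3f}")
print("total:", round(sum(relative), 10))
```

Because every height is divided by the same number, the bars keep their proportions and the relative frequencies sum to one.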

TI-84: To make a histogram, enter the data by pressing [STAT]. The first option is already highlighted (1:Edit) so you can either press [ENTER] or [1]. Make sure the cursor is in the list, not on the list name and type the desired values pressing [ENTER] after each one.

Press [2nd] [QUIT] to return to the home screen. To clear a previously stored list of data values, arrow up to the list name you want to clear, press [CLEAR], and then press [ENTER]. An alternative way is to press [STAT], press 4 for 4:ClrList, press [2nd], then press the number key corresponding to the data list you wish to clear (for example, [2nd] [1] will clear L1), and then press [ENTER]. After you enter the data, press [2nd] [STAT PLOT]. Select the first plot by pressing [ENTER] or the number [1:Plot 1]. Turn the plot [On] by moving the cursor to On and pressing [ENTER]. Select the Histogram option using the right arrow keys. Select [ZOOM], then [ZoomStat].

You can see and change the class width by selecting [Window], then change the minimum x value Xmin=20, the maximum x value Xmax=50, the x-scale to Xscl=5 and the minimum y value Ymin=-6.5 and the maximum y value to Ymax=14. Select the [GRAPH] button. We get a similar looking Histogram compared to the stem-and-leaf plot and Excel histogram. Select the [TRACE] button to see the height of each bar and the classes.

clipboard_eacdd5632d21b885a8fcceab06046a10f.png

TI-89: First, enter the data into the Stat/List editor under list 1. Press [APP], then scroll down to Stat/List Editor; on the older style TI-89 calculators, go into the Flash/App menu and then scroll down the list. Make sure the cursor is in the list, not on the list name, and type the desired values, pressing [ENTER] after each one. To clear a previously stored list of data values, arrow up to the list name you want to clear, press [CLEAR], and then press [ENTER]. After you enter the data, press [F2] Plots, scroll down to [1: Plot Setup], and press [ENTER].

clipboard_eed6707889170d9327de8ec8a35301a0c.png

Select [F1] Define. Use your arrow keys to select Histogram for Type, and then scroll down to the x-variable box. Press [2nd] [Var-Link] (this key is above the [+] sign). Then arrow down until you find your List1 name under the Main file folder. Press [ENTER] and this will bring the name List1 back to the menu. You will now see that Plot1 has a small picture of a histogram. To view the histogram, select [F5] [Zoom Data].

clipboard_e9f29fc4f8f772d101c259e0b2e46ca3b.png

The histogram looks a little different from Excel; you can change the settings for the bucket to match your table. Press [♦] [F2:Window]. Change the minimum x value xmin=20, the maximum x value xmax=50, the x-scale to xscl=5 and the minimum y value ymin=-6.5 and the maximum y value to ymax=14. Then press the [♦] [F3:GRAPH] button. Select [F3:Trace] to see the frequency for each bar. Then use your left and right arrow keys to move to the other bars.

clipboard_e8470841df965c98e3bd3238fb36877fb.png

Make a histogram for the following random sample of student rent prices using Excel.

Figure 2-11

Make sure the total of the frequencies is the same as the number of data points and the total of the relative frequencies is one. Since we want the bars on the histogram to touch, the number line needs to use the class boundaries, which are halfway between the upper limit of one class and the lower limit of the next. Start by finding the distance between adjacent class limits and divide by two: (665-664)/2 = 0.5. Then subtract 0.5 from the lower limit of each class, and add 0.5 to the upper limit of the last class, to get the points to use on the x-axis: 349.5, 664.5, 979.5, 1294.5, 1609.5, 1924.5, 2239.5, and 2554.5. Then draw your graph as in Figure 2-12. You can use frequencies or relative frequencies for the y-axis.
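As a rough check of the boundary arithmetic, here is a minimal Python sketch. The lower class limits are inferred from the boundaries listed above (each boundary plus 0.5), and `class_boundaries` is a hypothetical helper name, not part of any library.

```python
def class_boundaries(lower_limits, last_upper_limit, gap=1):
    """Return class boundaries halfway between adjacent class limits.

    gap is the distance between the upper limit of one class and the
    lower limit of the next, e.g. 665 - 664 = 1.
    """
    half_gap = gap / 2
    boundaries = [low - half_gap for low in lower_limits]
    boundaries.append(last_upper_limit + half_gap)
    return boundaries

# Lower limits inferred from the boundaries in the text; 2554 is the
# upper limit of the last class.
lower_limits = [350, 665, 980, 1295, 1610, 1925, 2240]
boundaries = class_boundaries(lower_limits, 2554)
print(boundaries)
```

There is always one more boundary than there are classes, because each bar needs both a left and a right edge.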

clipboard_ef61b10d754b81260a491e14b9ced98cb.png

Figure 2-12

clipboard_ee724d4ce0b96d53e4b01d8b31309d3e1.png

Figure 2-13

Reviewing the graph in Figure 2-13, you can see that most of the students pay around $750 per month for rent, with about $1,500 being the other common value. Most students pay between $600 and $1,600 per month for rent. Of course, these values are just estimates pulled from the graph.

There is a large gap between the $1,500 class and the highest data value. This seems to say that one student is paying a great deal more than everyone else is. This value may be an outlier.

An outlier is a data value that is far from the rest of the values. It may be an unusual value or a mistake. It is a data value that should be investigated. In this case, the student lives in a very expensive part of town, thus the value is not a mistake, and is just very unusual. There are other aspects that can be discussed, but first some other concepts need to be introduced.

2.3.3 Ogive

The line graph for the cumulative or cumulative relative frequency is called an ogive (oh-jyve). To create an ogive, first create a scale on both the horizontal and vertical axes that will fit the data. Then plot the upper class boundary of each class against its cumulative (or cumulative relative) frequency. Make sure you also include the point at the lowest class boundary, where the cumulative frequency is zero. Then just connect the dots.

The steeper the line the more accumulation occurs across the corresponding class. If the line is flat then the frequency for that class is zero. The ogive graph will always be going uphill from left to right and should never dip below the previous point. Figure 2-14 is an example of an ogive.
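The points of an ogive can be sketched in Python as follows. The class frequencies here are an assumption made for illustration (they are not taken from a table in this section), and `ogive_points` is a hypothetical helper name.

```python
from itertools import accumulate

def ogive_points(boundaries, frequencies):
    """Pair each class boundary with the cumulative frequency reached there."""
    if len(boundaries) != len(frequencies) + 1:
        raise ValueError("need exactly one more boundary than classes")
    # Start at zero for the lowest boundary, then add each class in turn.
    cumulative = [0] + list(accumulate(frequencies))
    return list(zip(boundaries, cumulative))

boundaries = [349.5, 664.5, 979.5, 1294.5, 1609.5, 1924.5, 2239.5, 2554.5]
frequencies = [4, 8, 5, 6, 0, 0, 1]  # illustrative counts (assumed)

points = ogive_points(boundaries, frequencies)
for x, y in points:
    print(x, y)
```

Classes with a frequency of zero produce flat segments, and the cumulative values never decrease, which matches the "always uphill" behavior described above.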

Ogive comes from the uphill shape used in architecture. Here is an example of an ogive in the East Hall staircase at PSU.

clipboard_eff7eea06a1faa7ff3d14232fda37fd5a.png

Figure 2-14

Make an ogive for the following random sample of rent prices students pay with the corresponding cumulative frequency distribution table.

Find the class boundaries, 349.5, 664.5 … use these for the tick mark labels on the horizontal x-axis, the same as what was used for the histogram. The y-axis uses the cumulative frequencies. The largest cumulative frequency is 24. Every third number is marked on the y-axis units. See Figure 2-15 and Figure 2-16.

clipboard_e12c6072fa9696e35bc747b124e87b09d.png

Figure 2-15

Using software:

clipboard_e965ee5ce6a583dd335f9d627a70de79b.png

Figure 2-16

The usefulness of an ogive is to allow the reader to find out how many students pay less than a certain value, and what amount of monthly rent a certain number of students pay.

For instance, if you want to know how many students pay less than $1,500 a month in rent, then you can go up from the $1,500 until you hit the line and then you go left to the cumulative frequency axis to see what cumulative frequency corresponds to $1,500. It appears that around 21 students pay less than $1,500. See Figure 2-17.

If you want to know the cost of rent that 15 students pay less than, then you start at 15 on the vertical axis and then go right to the line and down to the horizontal axis to the monthly rent of about $1,200. You can see that about 15 students pay less than about $1,200 a month. See Figure 2-18.

clipboard_ed42b868b914e331900f0d35b58f74af8.png

Figure 2-17

clipboard_e26a016d8dd476a05198002dbb5be3201.png

Figure 2-18
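Reading a value off an ogive amounts to linear interpolation along the connected dots. The sketch below assumes a set of illustrative ogive points (the cumulative frequencies are an assumption, not copied from a table in this section) and uses a hypothetical helper named `interpolate`.

```python
def interpolate(points, x):
    """Linearly interpolate y at x along a piecewise-linear curve of (x, y) points."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 == x1:
            continue  # skip zero-width segments (flat parts when axes are swapped)
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x is outside the range of the curve")

# Illustrative ogive: (upper class boundary, cumulative frequency). Assumed values.
ogive = [(349.5, 0), (664.5, 4), (979.5, 12), (1294.5, 17),
         (1609.5, 23), (1924.5, 23), (2239.5, 23), (2554.5, 24)]

# Go up from $1,500 to the line, then left to the cumulative frequency axis.
students_below_1500 = interpolate(ogive, 1500)

# Go right from 15 to the line, then down to the rent axis (swap the axes).
swapped = [(y, x) for x, y in ogive]
rent_for_15_students = interpolate(swapped, 15)

print(round(students_below_1500, 1), round(rent_for_15_students, 1))
```

With these assumed points the readings come out close to the estimates quoted above (about 21 students below $1,500, and a rent a little under $1,200 for 15 students); a reading from a printed graph is only ever approximate.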

If you graph the cumulative relative frequency then you can find out what percentage is below a certain number instead of just the number of people below a certain value.

Using the sample of 35 ages, make an ogive.

The orange line is the ogive and the vertical axis is on the right side.

clipboard_efc9754e8e10f76332c66c8eb6f80b94e.png

2.3.4 Pie Chart

You cannot make stem-and-leaf plots, histograms, ogives or time series graphs for qualitative data. Instead, we use bar or pie charts for a qualitative variable, which lists the categories and gives either the frequency (count) or the relative frequency (percent) of individual items that fall into each category.

A pie chart or pie graph is a very common and easy-to-construct graph for qualitative data. A pie chart takes a circle and divides the circle into pie shaped wedges that are proportional to the size of the relative frequency. There are 360 degrees in a full circle. Relative frequency is just the percentage as a decimal. To find the angle for each pie wedge, multiply the relative frequency for each category by 360 degrees. Figure 2-19 is an example of a pie chart.

clipboard_e59dad9fb137bd20c8e3fc152a0c55a7a.png

Figure 2-19
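The wedge-angle rule (relative frequency times 360 degrees) can be sketched in Python. The category counts below are hypothetical and `wedge_angles` is a name invented for this sketch.

```python
def wedge_angles(frequencies):
    """Return the central angle in degrees for each category's pie wedge."""
    total = sum(frequencies.values())
    # Relative frequency (f / total) multiplied by the 360 degrees in a circle.
    return {category: 360 * f / total for category, f in frequencies.items()}

# Hypothetical counts for a qualitative variable (illustrative only).
counts = {"Single": 20, "Married": 45, "Divorced": 25, "Widowed": 10}
angles = wedge_angles(counts)
print(angles)
```

The angles should add up to 360 degrees, just as the relative frequencies add up to one.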

Use Excel to make a pie chart for the following frequency distribution of marital status.

2.3.5 Bar Graph

clipboard_e1285960f8655c1fddc00458bbc5917a1.png

Figure 2-20

Some key features of a bar graph:

  • The bars do not touch.

You can draw a bar graph with frequency or relative frequency on the vertical axis. The relative frequency is useful when you want to compare two samples with different sample sizes. The relative frequency graph and the frequency graph should look the same, except for the scaling on the frequency axis.

Use Excel to make a bar chart for the following frequency distribution of marital status.

2.3.6 Pareto Chart

A Pareto (pronounced pə-RAY-toh) chart is a bar graph whose bars are ordered from the most frequent class to the least frequent class. The advantage of a Pareto chart is that you can see at a glance the ranking from the most popular answer to the least popular. This is especially useful in business applications, where you want to know what services your customers like the most, what processes result in more injuries, which issues employees find more important, and other types of questions where you are interested in comparing frequencies. Figure 2-21 is an example of a Pareto chart.

clipboard_eedbf9ccbcaede15d2dfc2d34d1c22815.png

Figure 2-21
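The only step that separates a Pareto chart from an ordinary bar graph is the sort. A minimal Python sketch, using hypothetical category counts and a made-up helper name `pareto_order`:

```python
def pareto_order(frequencies):
    """Return (category, frequency) pairs sorted from most to least frequent."""
    return sorted(frequencies.items(), key=lambda item: item[1], reverse=True)

# Hypothetical counts (illustrative only).
counts = {"Single": 20, "Married": 45, "Divorced": 25, "Widowed": 10}
ordered = pareto_order(counts)

# The bars would be drawn left to right in this order.
for category, frequency in ordered:
    print(category, frequency)
```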

Use Excel to make a Pareto chart for the following frequency distribution of marital status.

2.3.7 Stacked Column Chart

The next example illustrates one of these types known as a stacked column chart. Stacked column (bar) charts are used when we need to show the ratio between a total and its parts. Each color shows the different series as a part of the same single bar, where the entire bar is used as a total.

In the Wii Fit game, you can do four different types of exercises: yoga, strength, aerobic, and balance. The Wii system keeps track of how many minutes you spend on each of the exercises every day. The following graph shows the data for Niko over a one-week period. Discuss any interpretations you can infer from the graph.

clipboard_e8dd728660c7d01c0f6f44c217777dcf4.png

Figure 2-22

It appears that Niko spends more time on yoga than on any other exercises on any given day. He seems to spend less time on aerobic exercises on a given day. There are several days when the amount of exercise in the different categories is almost equal. The usefulness of a stacked column chart is the ability to compare several different categories over another variable, in this case time. This allows a person to interpret the data with a little more ease.

Data scientists write programs that use statistics to filter spam from incoming email messages. By noting specific characteristics of an email, a data scientist may be able to classify some emails as spam or not spam with high accuracy. One of those characteristics is whether the email contains no numbers, small numbers, or big numbers. Make a stacked column chart with the data in the table. Which type of email is more likely to be spam?

2.3.8 Multiple or Side-by-Side Bar Graph

A multiple bar graph, also called a side-by-side bar graph, allows comparisons of several different categories over another variable.

The percentages of people who use certain contraceptives in Central American countries are displayed in the graph below. Use the graph to find the type of contraceptive that is most used in Costa Rica and El Salvador.

clipboard_e25edadbce88ea3e00dabff572dd0da33.png

(9/21/2020) Retrieved from https://public.tableau.com/profile/prbdata#!/vizhome/AccesstoContraceptiveMethods/AccesstoContraceptiveMethods

Figure 2-24

This side-by-side bar graph allows you to quickly see the differences between the countries. For instance, the birth control pill is used most often in Costa Rica, while condoms are most used in El Salvador.

Make a side-by-side bar graph for the following medal count for the 2018 Olympics.

2.3.9 Time-Series Plot

A time-series plot is a graph showing the data measurements in chronological order, where the data is quantitative data. For example, a time-series plot is used to show profits over the last 5 years. To create a time-series plot, time always goes on the horizontal axis, and the frequency or relative frequency goes on the vertical axis. Then plot the ordered pairs and connect the dots. A time series allows you to see trends over time. Caution: You must realize that the trend may not continue. Just because you see an increase does not mean the increase will continue forever. As an example, prior to 2007, many people noticed that housing prices were increasing. The belief at the time was that housing prices would continue to increase. However, the housing bubble burst in 2007, and many houses lost value during the recession.

The New York Stock Exchange (NYSE) has a website where you can download information on the stock market. Use technology to make a time-series plot.

Using Excel, we will make a time series plot for NYSE daily trading volume. Using the Ctrl key highlight just the date column and the NYSE Volume, then select the Insert tab and the first 2-D line graph option.

clipboard_e333832b274e8ecd43ce27813e8daae62.png

You can then select different designs.

clipboard_e0210f1205130e85b4977554a51f2f314.png

One can use time-series plots to see when they want to cash out or buy a stock.

The time-series graph shows the behavior of one variable over time and does not reflect other variables that are influencing the trading volume.

2.3.10 Scatter Plot

Sometimes you have two quantitative variables and you want to see if they are related in any way. A scatter plot helps you to see what the relationship may look like. A scatter plot is just a plotting of the ordered pairs.

  • When you see the dots increasing from left to right then there is a positive relationship between the two quantitative variables.
  • If the dots are decreasing from left to right then there is a negative relationship.
  • If there is no apparent pattern going up or down, then we say there is no relationship between the two variables.

Is there any relationship between elevation and high temperature on a given day? The following data are the high temperatures at various cities on a single day and the elevation of the city.

Make a scatterplot to see what type of relationship exists.

2.3.11 Misleading Graphs

As a consumer, be aware that data in the media may be presented in misleading graphs. Misleading graphs not only misrepresent the data; they can also lead the reader to false conclusions. There are many ways that graphs can be misleading. One way is to use picture graphs or 3D graphs, which exaggerate differences and should be used with caution. Leaving off units and labels can also result in a misleading graph. A more common example is to rescale or reverse the vertical axis to try to show a large difference between categories. Not starting the vertical axis at zero will show a more dramatic rate of change. Other ways to mislead include changing the horizontal axis labels so that they are out of time sequence, using an inappropriate type of graph, and not showing the base population.

What is misleading about the following graph?

An ad for a new diet pill shows the following time-series plot for someone that has lost weight over a 5-month period.

clipboard_e9b1f1a8d0ccecf6442d2630d2a774821.png

If you do not start the vertical axis at zero, then a change can look much more dramatic than it really is. Notice the decrease in weight looks much larger in Figure 2-27. The graph in Figure 2-28 has the vertical axis starting at zero. Notice that over the 5 months the weight appears to be decreasing; however, it does not look like there is a large decrease.

clipboard_e525678d554f586102f7e0df130962564.png

Figure 2-27

clipboard_e9f4995ff828371d093c1bd68311e798c.png
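The effect of the starting point of the vertical axis can be put into numbers. The sketch below uses hypothetical weights and axis limits (none of them are read from the figures), and `drawn_change_fraction` is a name invented here; it measures how much of the plot's height a change appears to cover.

```python
def drawn_change_fraction(start, end, axis_min, axis_max):
    """Fraction of the plot's height covered by a change from start to end."""
    return abs(start - end) / (axis_max - axis_min)

# Hypothetical weights in pounds at month 0 and month 5 (illustrative only).
start_weight, end_weight = 200, 190

# Axis cut off just below the data versus an axis that starts at zero.
truncated = drawn_change_fraction(start_weight, end_weight, axis_min=188, axis_max=202)
zero_based = drawn_change_fraction(start_weight, end_weight, axis_min=0, axis_max=202)

print(round(truncated, 2), round(zero_based, 2))
```

In this made-up case the same 10-pound loss fills most of the height of the truncated plot but only a small sliver of the zero-based one, even though the data are identical.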

What is misleading about the graph in Figure 2-29?

clipboard_e7db4511fe0fb38c1bac7fa60e3a12439.png

https://www.mediamatters.org/blog/2014/03/31/dishonest-fox-charts-obamacare-enrollment-editi/198679.

Figure 2-29

The y-axis scale is different for each bar and there are no units on the axis. In the first bar, each tick mark represents 2 billion; in the second bar, each tick mark represents less than 1 billion.

This exaggerates the difference. If they used square scaling as in Figure 2-30, there would not be such an extreme difference between the height of the bars.

clipboard_ee3a9ca04a704504d8516825a8a52fb6d.png

Figure 2-30

What is misleading about the graph in Figure 2-31?

clipboard_e2be6760c97ced071dec328f9ee732642.png

https://www.livescience.com/45083-misleading-gun-death-chart.html

Figure 2-31

The graph has the y-axis reversed. What looks like an increasing trend line really is decreasing when you correct the y-axis. The red background is also an effect to raise alarm, almost like a curtain of blood.

What is misleading about the graph shown in a Lanacane commercial in May 2012, shown in Figure 2-32?

clipboard_e9dd527ec400d3930966ca394d728fc92.png

Retrieved 7/2/2021 from https://youtu.be/I0DapkQ-c1I?t=17

Figure 2-32

It appears that Lanacane is better than regular hydrocortisone cream at relieving itching. However, note that there are no units or labels on the axes.

What is misleading about the graph published on Georgia’s Department of Public Health website in May 2020, shown in Figure 2-33?

clipboard_eca214b90df14140425e050daa46c5f84.png

Retrieved 7/3/2021 from https://www.vox.com/covid-19-coronav...ning-reopening

Figure 2-33

There are two misleading items in this graph. First, the horizontal axis is time, yet the dates are out of sequence: April 28, April 27, April 29, May 1, April 30, May 4, May 6, May 5, May 2, May 7, April 26, May 3, May 8, May 9. The earliest date, April 26, appears almost at the end of the axis. At first glance, the graph would deceive viewers into thinking that cases were going down over time. A Pareto-style chart should never be used for time-series data.

The second misleading item is the graph’s title combined with the lack of a label on the y-axis. What does the height of each bar represent? Is the height the number of cases for each county, or is it the number of deaths and hospitalizations? The website later corrected the graphic as shown in Figure 2-34.

clipboard_ea1b30686fb788964f3cb220e40859afc.png

Retrieved 7/3/2021 from https://www.vox.com/covid-19-coronav...ning-reopening

Figure 2-34

Large data sets need to be summarized in order to make sense of all the information. The distribution of data can be represented with a table or a graph. It is the role of the researcher or data scientist to make accurate graphical representations that can help make sense of this in the context of the data. Tables and graphs can summarize data, but they alone are insufficient. In the next chapter we will look at describing data numerically.


Published on 18.4.2024 in Vol 26 (2024)

The Alzheimer’s Knowledge Base: A Knowledge Graph for Alzheimer Disease Research


Original Paper

  • Joseph D Romano (1, 2, 3), MA, MPhil, PhD
  • Van Truong (1, 4, 5), MS
  • Rachit Kumar (1, 4, 5, 6), BS
  • Mythreye Venkatesan (7), BE, MS
  • Britney E Graham (7), PhD
  • Yun Hao (1, 4), PhD
  • Nick Matsumoto (7), BA
  • Xi Li (7), MS
  • Zhiping Wang (7), MS, PhD
  • Marylyn D Ritchie (1, 3, 5), PhD
  • Li Shen (1, 3), PhD
  • Jason H Moore (7), PhD

1 Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States

2 Center of Excellence in Environmental Toxicology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States

3 Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States

4 Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States

5 Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States

6 Medical Scientist Training Program, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States

7 Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, United States

Corresponding Author:

Joseph D Romano, MA, MPhil, PhD

Institute for Biomedical Informatics

Perelman School of Medicine

University of Pennsylvania

403 Blockley Hall

423 Guardian Drive

Philadelphia, PA, 19104

United States

Phone: 1 2155735571

Email: [email protected]

Background: As global populations age and become susceptible to neurodegenerative illnesses, new therapies for Alzheimer disease (AD) are urgently needed. Existing data resources for drug discovery and repurposing fail to capture relationships central to the disease’s etiology and response to drugs.

Objective: We designed the Alzheimer’s Knowledge Base (AlzKB) to alleviate this need by providing a comprehensive knowledge representation of AD etiology and candidate therapeutics.

Methods: We designed the AlzKB as a large, heterogeneous graph knowledge base assembled using 22 diverse external data sources describing biological and pharmaceutical entities at different levels of organization (eg, chemicals, genes, anatomy, and diseases). AlzKB uses a Web Ontology Language 2 ontology to enforce semantic consistency and allow for ontological inference. We provide a public version of AlzKB and allow users to run and modify local versions of the knowledge base.

Results: AlzKB is freely available on the web and currently contains 118,902 entities with 1,309,527 relationships between those entities. To demonstrate its value, we used graph data science and machine learning to (1) propose new therapeutic targets based on similarities of AD to Parkinson disease and (2) repurpose existing drugs that may treat AD. For each use case, AlzKB recovers known therapeutic associations while proposing biologically plausible new ones.

Conclusions: AlzKB is a new, publicly available knowledge resource that enables researchers to discover complex translational associations for AD drug discovery. Through 2 use cases, we show that it is a valuable tool for proposing novel therapeutic hypotheses based on public biomedical knowledge.

Introduction

Alzheimer disease (AD) is a progressive, neurodegenerative disease affecting an estimated 6.5 million Americans aged ≥65 years and represents a significant clinical, economic, and emotional burden worldwide [ 1 ]. AD is often cited as one of the greatest health care problems of the 21st century, particularly in high-income nations with an increasing proportion of older adults. Despite its societal impact, effective pharmaceutical treatments for AD remain notoriously elusive. The US Food and Drug Administration has approved 5 drugs for the treatment of AD, 4 of which (donepezil, rivastigmine, galantamine, and memantine) only temporarily treat symptoms but do not alter the overall progression of the disease [ 2 ], whereas the fifth (aducanumab) is highly controversial in terms of evidence of effectiveness and its safety profile [ 3 ]. AD researchers have prioritized the discovery and approval of new therapies for the disease both in terms of newly discovered compounds and by repurposing drugs that are already approved to treat other (non-AD) human diseases.

AD is associated with substantial changes in pathology, including the presence of neuritic plaques associated with the amyloid-β protein, extracellular deposition of amyloid-β, and neurofibrillary tangles. Previous research has shown that these neuropathological changes begin to occur years before clinical symptoms are apparent [ 4 , 5 ]. Despite decades of research, why this pathology begins to develop remains largely unknown [ 6 ]. Current consensus is that AD risk is multifactorial. The most well-established risk factors include age; family history; and certain genetic factors, especially the presence of the ε4 allele of the apolipoprotein E (APOE) gene, which is involved in fat metabolism and cholesterol transport. However, the exact mechanism through which these factors, including APOE-ε4 presence, cause or contribute to AD risk is unknown [ 7 ].

Of the many techniques used in AD therapeutics research, there is a wealth of computer-aided approaches that leverage recent advances in bioinformatics, epidemiology, artificial intelligence (AI), and machine learning (ML). For example, Rodriguez et al [ 8 ] developed an ML framework to assess gene lists constructed by differential gene expression data in response to drug treatment to determine whether those drugs would be candidates for repurposing in AD. Tsuji et al [ 9 ] used an autoencoder neural network to perform dimensionality reduction of a high-density protein interaction network to identify new possible drug targets and then found drugs associated with those targets. Genome-wide association studies have long been used for the identification of genes that confer AD risk, particularly for rare genes or genes with small (but statistically significant) contributions to disease risk [ 10 ].

In this paper, we describe the design and deployment of a major new knowledge resource for computational AD research—named The Alzheimer’s Knowledge Base (AlzKB) [ 11 ]—with a particular focus on drug discovery and drug repurposing. The overall structure and contents of AlzKB are summarized in Figure 1 . At its core, AlzKB consists of a large, heterogeneous graph database describing entities related to AD at multiple levels of biological organization, with rich semantic relationships describing how those entities are linked to one another. To demonstrate its value, we present two data-driven analyses involving ML on AlzKB’s knowledge graph: (1) predicting Parkinson disease (PD) genes that may also be associated with AD and (2) generating and explaining drug repurposing hypotheses for treating AD, both of which replicate existing knowledge while proposing entirely novel directions for future experimental validation. AlzKB is free, open source, and publicly available [ 11 ] and consists entirely of publicly sourced knowledge integrated from 22 diverse web-based biomedical databases. We hypothesized that the relationships and entities in AlzKB contain valuable knowledge that cannot be effectively captured in existing data resources, with the additional advantage of improving the explainability of new predictions.


Existing Graph-Based Approaches to AD Research

Due to the increased popularity and success of analyses using integrated knowledge, previous efforts have used knowledge graphs in AD research for a variety of purposes, including drug repurposing [ 12 - 14 ] and gene identification [ 15 ] and as general informational resources [ 16 ]. Similar to AlzKB, these bodies of work draw from a variety of sources to construct the underlying knowledge graphs, including scientific literature and formally structured biomedical databases. Some, including the Alzheimer Disease Knowledge Graph [ 14 ] and the Heterogeneous network-based data set for AD [ 16 ], have been released as publicly accessible resources similar to AlzKB. Other studies have used existing resources not specifically intended for AD research (such as the Semantic MEDLINE Database [ 13 ]) to answer questions related to AD. To our knowledge, AlzKB is the largest graph-based knowledge representation that focuses solely on AD and draws from the greatest number of source databases. For comparison, the next largest AD-specific knowledge graph that we are aware of is AD-KG, which contains 30,729 nodes and 398,544 edges (compared to AlzKB’s 118,902 nodes and 1,309,527 edges). Our emphasis on merging similar nodes or edges and cleaning the graph structure using an underlying biomedical ontology reduces the amount of noise that tends to be associated with many different node or edge types in a single graph, enabling more robust inference about relationships in AD, especially when used with emerging graph ML algorithms. Furthermore, AlzKB offers a public, web interface that allows for easy access and application to new research questions, whereas existing resources have either restricted access or are entirely unavailable for reuse. Given the challenge of identifying new or repurposed drugs for etiologically complex diseases such as AD, AlzKB represents a major step forward by improving both quantitatively and structurally on existing resources.

AlzKB Ontology

Graph databases are renowned for their flexibility in representing data that do not conform to a rigid, tabular structure, but this comes at the expense of implicitly enforcing consistency and semantic standardization [ 17 ]. To mitigate this issue, we designed a Web Ontology Language (OWL) 2 ontology—describing the types of entities relevant to AD and treatment of AD, as well as the types of relationships that link those entities—that serves as a template for nodes and edges in the knowledge graph. Ontologies (including OWL 2 ontologies) are formal representations of knowledge that are frequently used in biomedicine to computationally structure, retrieve, and make inferences about knowledge within a domain of interest [ 18 ]. Briefly, as many of the components of a graph database have a 1-to-1 correspondence with components of an OWL 2 ontology (eg, OWL 2 classes are equivalent to graph database node labels, and OWL 2 object properties are equivalent to edge types in a graph database), it is possible to populate the ontology using biomedical knowledge and translate the contents of the populated ontology into an equivalent graph database. Therefore, enforcing consistency in the ontology becomes equivalent to enforcing consistency in the graph database.

We constructed the ontology manually using the Protégé ontology editor (version 5.5.0; Stanford Center for Biomedical Informatics Research) [ 19 ] following an iterative process guided by expert domain knowledge. First, we prototyped a class hierarchy containing the types of nodes (eg, gene, disease, pathway, and drug) desired in the knowledge base. We then annotated these classes with data properties (eg, drugs can be assigned a property value corresponding to molecular weight) and object properties (relationship types that link 2 entities, such as “drug treats disease”). A thorough description of the components of OWL 2 ontologies is provided by Hitzler et al [ 20 ]. Finally, we placed restrictions on the ontology to reflect biology and clinical practice. For example, we specified restrictions stating that all pathways must contain one or more genes or that all drugs in the knowledge base must have a valid DrugBank ID. We repeated these steps several times, making revisions on previous iterations until several domain experts agreed that the semantic contents of the ontology were consistent with current AD knowledge and systems biology processes involved in AD etiology. After collecting the data sources used to populate the ontology (see the following section), we included additional data properties corresponding to identifiers in those source databases, enabling data provenance and facilitating both interoperability and validation. The final ontology structure consists of entity types involved in AD etiology (modeled as OWL 2 classes), types of semantic relationships that can link those entity types (modeled as OWL 2 object properties), and properties that can be annotated onto entities of specific types (modeled as OWL 2 data properties). 
Both before and after populating the ontology with individuals (see the Implementing AlzKB section), we validated its contents and structure by running FaCT++—an ontology inference engine that identifies errors by evaluating all assertions in the ontology against the ontology’s class or property hierarchy and other restrictions [ 21 ].
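To make the two example restrictions concrete (every pathway contains one or more genes; every drug has a DrugBank ID), the following plain-Python sketch checks them over a toy data set. It is not FaCT++ and performs no OWL 2 reasoning; the dictionary layout, the `check_restrictions` helper, the pathway and drug names other than memantine, and the placeholder identifier are all assumptions made for illustration.

```python
def check_restrictions(pathways, drugs):
    """Return human-readable violations of two example restrictions."""
    problems = []
    # Restriction 1: every pathway must contain one or more genes.
    for name, genes in pathways.items():
        if len(genes) == 0:
            problems.append(f"pathway {name!r} contains no genes")
    # Restriction 2: every drug must carry a non-empty DrugBank ID.
    for name, properties in drugs.items():
        if not properties.get("drugbank_id"):
            problems.append(f"drug {name!r} has no DrugBank ID")
    return problems

# Toy data: names and the ID value are placeholders, not real records.
pathways = {"example_pathway_a": ["APOE"], "example_pathway_b": []}
drugs = {"memantine": {"drugbank_id": "DB-PLACEHOLDER"}, "example_drug": {}}

violations = check_restrictions(pathways, drugs)
for message in violations:
    print(message)
```

A real reasoner evaluates such restrictions against the full class and property hierarchy; this sketch only shows the kind of assertion being checked.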

Collecting and Assembling Third-Party Data Sources

Using the AlzKB ontology’s class hierarchy as a starting point, we determined a set of the most important entity types to include in the first release of the knowledge base. For example, we prioritized inclusion of entities representing diseases (specifically AD and its various subtypes), genes, and drugs, among others. Similarly, we identified important relationship types (eg, “DRUG_BINDS_GENE” or “GENE_ASSOCIATED_WITH_DISEASE”) to include in the knowledge base. For each of these entity and relationship types, we identified a third-party, public data source that would serve as a collection of “ground truth knowledge” for that entity or relationship type. In the assembled knowledge base, there is roughly a 1-to-1 correspondence between a data record in the original “ground truth” data source and its corresponding entity or relationship in AlzKB, with some important exceptions. For example, we made the decision to only include neurological diseases in AlzKB rather than all diseases described in the “ground truth” data source (in this case, the Disease Ontology). We also identified instances in which properties from additional data sources could be used to augment the “ground truth” entities. For example, while DrugBank is used to specify the drugs described in AlzKB, we also used fields from Distributed Structure-Searchable Toxicity and PubChem to augment the properties annotated onto drugs (such as molecular weight, chemical fingerprint, and synonyms).

Implementing AlzKB

We populated the ontology by sequentially carrying out the following steps:

  • Import distinct entities from each data source for the corresponding ontology class and define those entities as ontology individuals (ie, instances of that class). For example, the drug memantine is defined as an instance of the ontology class Drug.
  • Populate data properties for all instances of each ontology class using data from relevant sources. For example, memantine is annotated with the Chemical Abstracts Service Registry number 19982-08-2.
  • Populate object properties as the semantic relationships linking pairs of entities using the appropriate data source. For example, an object property of type “DRUG_TREATS_DISEASE” links memantine to the instance of Disease named Alzheimer’s Disease.
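The 3 population steps can be sketched with a small in-memory structure. This is a simplified illustration and not the AlzKB build script: the `KnowledgeBase` class, its method names, and the `cas_number` property name are assumptions, while the memantine values come from the examples above.

```python
from collections import defaultdict


class KnowledgeBase:
    """Toy container mirroring individuals, data properties, and object properties."""

    def __init__(self):
        self.individuals = {}                # name -> ontology class
        self.data_props = defaultdict(dict)  # name -> {property: value}
        self.relations = set()               # (source, relation type, target)

    def add_individual(self, name, cls):
        # Step 1: define an entity as an instance of an ontology class.
        self.individuals[name] = cls

    def set_data_property(self, name, prop, value):
        # Step 2: annotate an existing individual with a data property.
        if name not in self.individuals:
            raise KeyError(f"unknown individual: {name}")
        self.data_props[name][prop] = value

    def add_relation(self, source, rel_type, target):
        # Step 3: link two existing individuals with an object property.
        for name in (source, target):
            if name not in self.individuals:
                raise KeyError(f"unknown individual: {name}")
        self.relations.add((source, rel_type, target))


kb = KnowledgeBase()
kb.add_individual("memantine", "Drug")
kb.add_individual("Alzheimer's Disease", "Disease")
kb.set_data_property("memantine", "cas_number", "19982-08-2")
kb.add_relation("memantine", "DRUG_TREATS_DISEASE", "Alzheimer's Disease")
```

Running the steps in this order matters in the sketch because properties and relationships can only be attached to individuals that already exist.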

After populating the AlzKB ontology with entities, relationships, and data properties, we serialized the ontology into the Resource Description Framework (RDF)/XML graph data format, which is compatible with modern graph database software as an input format. A complete list of the data sources used in AlzKB at the time of writing is provided in Table 1. We then populated a Neo4j graph database (version 4.4.5; Neo4j, Inc) [ 22 ] with the contents of the RDF/XML file using the neosemantics library [ 23 ], which parses the RDF data and inserts semantic triples into the graph database corresponding to each entity or relationship. Finally, we stripped the newly populated graph database of unnecessary artifacts that are components of the OWL 2 standard, leaving only nodes, relationships, and properties defined within the hierarchy. For the publicly hosted version of AlzKB, we created a web server that hosts both the static AlzKB website (containing information, documentation, and usage details) and the Neo4j graph database, which is available by navigating to a subdomain [ 24 ] of the main website [ 11 ]. For reproducibility, this entire pipeline (including mappings to source databases) is provided as a single Python script available on GitHub (the most recent version) [ 25 ] or Zenodo (an archived version of the code at the time of publication) [ 26 ].

a As source data elements do not correspond in a 1-to-1 manner with entities in the graph (eg, entities may be merged, filtered, or used as edges rather than nodes), actual counts for entities in AlzKB stratified by source are not available. The sizes are the best available estimates at the time of publication. Table 2 and Table S1 in Multimedia Appendix 1 [ 50 - 56 ] provide actual node and edge type counts in AlzKB.

b AOP-DB: Adverse Outcome Pathway Database.

c The derived data are structured in part using Hetionet.

d AD: Alzheimer disease.

e EPA: Environmental Protection Agency.

f DSSTox: Distributed Structure-Searchable Toxicity.

g ACToR: Aggregated Computational Toxicology Resource.

h GWAS: genome-wide association studies.

i LINCS: Library of Integrated Network-Based Cellular Signatures.

j NCBI: National Center for Biotechnology Information.

k MeSH: Medical Subject Headings.

l SIDER: Side Effect Resource.

m Counts not applicable (TISSUES associations map to edges rather than nodes in the graph).

Validating AlzKB Using Real-World Use Cases

After building AlzKB’s knowledge graph, we designed 2 ML-based use cases that resemble real-world tasks for which AlzKB was originally designed: (1) proposing genetic targets for new drugs based on disease similarity and topological graph features and (2) predicting new edges in the knowledge graph linking AD to repurposed drugs via a graph completion model. These 2 use cases are intended to assess the external validity of AlzKB: for the ML models to perform well on tasks defined using real-world evaluation end points (eg, effective drugs or etiologically important genes), the informative patterns and phenomena underlying those end points need to be adequately captured in the knowledge graph.

In the first use case (identifying genetic targets via graph topology measures), we trained a random forest (RF) classifier (implemented in the scikit-learn library [Python Software Foundation] for the Python programming language) using the following topological graph features, which are computed for every node pair in the graph (regardless of whether an edge does or does not exist between them): common neighbors, total neighbors, preferential attachment, Adamic-Adar, and resource allocation [ 57 - 60 ]. Each feature gives a different measure of network “relatedness” for a pair of nodes, which are then used as predictive features in the RF model. For a given node pair ( n 1 , n 2 ), these metrics are defined as follows:
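In their standard forms from the link prediction literature, the 5 measures can be written as follows. The abbreviations CN, TN, PA, AA, and RA are introduced only for this display, and the notation is an assumption about how the measures named above are formalized:

$$
\begin{aligned}
\mathrm{CN}(n_1, n_2) &= \lvert N(n_1) \cap N(n_2) \rvert \\
\mathrm{TN}(n_1, n_2) &= \lvert N(n_1) \cup N(n_2) \rvert \\
\mathrm{PA}(n_1, n_2) &= \lvert N(n_1) \rvert \cdot \lvert N(n_2) \rvert \\
\mathrm{AA}(n_1, n_2) &= \sum_{u \in N(n_1) \cap N(n_2)} \frac{1}{\log \lvert N(u) \rvert} \\
\mathrm{RA}(n_1, n_2) &= \sum_{u \in N(n_1) \cap N(n_2)} \frac{1}{\lvert N(u) \rvert}
\end{aligned}
$$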

where N(n i ) is the set of neighbor (adjacent) nodes of node n i . Our training procedure for the RF model included 3-fold grid search cross-validation to optimize hyperparameters, an 80%/20% train/test split, and repeating the procedure 10 times with random sampling.
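As a minimal sketch of how the 5 topological features could be computed for one node pair, the function below uses the standard definitions of these measures on an undirected graph stored as a dictionary of neighbor sets. The function name, the feature keys, and the toy graph are assumptions for illustration and are not taken from the AlzKB code.

```python
import math


def topological_features(adj, n1, n2):
    """Compute pairwise link-prediction features.

    adj maps each node to the set of its neighbors (undirected graph).
    Assumes n1 != n2, so every common neighbor has degree >= 2 and
    log(degree) is strictly positive.
    """
    a, b = adj[n1], adj[n2]
    common = a & b
    return {
        "common_neighbors": len(common),
        "total_neighbors": len(a | b),
        "preferential_attachment": len(a) * len(b),
        "adamic_adar": sum(1.0 / math.log(len(adj[u])) for u in common),
        "resource_allocation": sum(1.0 / len(adj[u]) for u in common),
    }


# Toy graph: A and B share the neighbors C and D; E is attached only to B.
toy = {
    "A": {"C", "D"},
    "B": {"C", "D", "E"},
    "C": {"A", "B"},
    "D": {"A", "B"},
    "E": {"B"},
}
feats = topological_features(toy, "A", "B")
```

Feature dictionaries of this kind, one per node pair, would then form the rows of the feature matrix passed to the classifier.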

To accomplish the second use case (drug repurposing via graph completion models), we implemented and compared the performance of 5 graph completion algorithms applied to the entire AlzKB knowledge graph. These models learn low-dimensional representations of graph nodes as vector embeddings. The embeddings are then combined to propose all possible triples in the graph (source node, edge, and target node), and scores are generated to indicate the plausibility of the triple. The 5 models we evaluated are TransE, RotatE, DistMult, ComplEx, and ConvE [ 60 ].
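To make the idea of scoring candidate triples concrete, the sketch below implements the TransE scoring rule, the simplest of the 5 models, which treats a relation as a translation from the head embedding to the tail embedding. The 2-dimensional embeddings and entity names are made up for illustration (real embeddings are learned during training), and the choice of the Euclidean norm is an assumption, as TransE can also be used with the L1 norm.

```python
import math


def transe_score(head, relation, tail):
    """Negative Euclidean distance between (head + relation) and tail.

    Scores closer to zero indicate a more plausible triple.
    """
    return -math.sqrt(
        sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    )


def rank_heads(candidates, relation, tail):
    """Sort candidate head entities from most to least plausible."""
    return sorted(
        candidates,
        key=lambda name: transe_score(candidates[name], relation, tail),
        reverse=True,
    )


# Hypothetical 2-dimensional embeddings.
treats = [0.5, 0.0]             # relation vector
disease = [1.0, 1.0]            # tail entity
drugs = {
    "drug_a": [0.5, 1.0],       # drug_a + treats lands exactly on disease
    "drug_b": [0.0, 0.0],
}
ranking = rank_heads(drugs, treats, disease)
```

The other models replace the translation with a different interaction (for example, a rotation in complex space for RotatE), but the ranking step works the same way.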

We implemented the 5 models using PyKEEN, a Python library for knowledge graph embeddings [ 50 ]. We randomly split the data set of all triples into 80/10/10 training/validation/testing sets and used grid search to empirically set embedding dimensions to 256 and the number of epochs to 100 with early stopping allowed. All remaining hyperparameters were set to the PyKEEN defaults. We trained the models on Google Colab using a single Tesla T4 graphics processing unit and evaluated the results using the rank-based evaluation metrics hits@k ( k =1, 3, and 10) and mean reciprocal rank (MRR) [ 61 ]. Rank-based evaluation sorts the scores of candidate triples in descending order and sets each triple’s rank as its index in the sorted list. When a “true” triple was tied with other triples on score, we used the average of the most optimistic (best) and most pessimistic (worst) ranks for all metrics. Briefly, hits@k is the proportion of true triples in the test set that are ranked within the top k predictions of the model. The MRR, also known as the inverse harmonic mean rank, is the arithmetic mean of the inverse ranks of the true triples. For both metrics, higher values indicate better performance. We performed evaluation on both left- and right-side predictions (ie, how well the models can predict the missing entity in partial triples that lack either the head [source] or the tail [target] entity).
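The rank-based metrics can be sketched in a few lines. This is an illustrative reimplementation under stated assumptions and not PyKEEN's code: the function names are hypothetical, higher scores are assumed to mean more plausible triples, and the candidate list is assumed to include the true triple's own score.

```python
def averaged_rank(true_score, candidate_scores):
    """Average of the optimistic and pessimistic ranks of the true triple."""
    # Optimistic: the true triple is placed ahead of all ties.
    optimistic = 1 + sum(s > true_score for s in candidate_scores)
    # Pessimistic: the true triple is placed behind all ties.
    pessimistic = sum(s >= true_score for s in candidate_scores)
    return (optimistic + pessimistic) / 2


def hits_at_k(ranks, k):
    """Proportion of true triples ranked within the top k."""
    return sum(r <= k for r in ranks) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Arithmetic mean of the inverse ranks."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# The true triple scores 0.7 and ties with one other candidate:
# optimistic rank 2, pessimistic rank 3, averaged rank 2.5.
tied_rank = averaged_rank(0.7, [0.9, 0.7, 0.7, 0.2])

example_ranks = [1, tied_rank, 4]
hits1 = hits_at_k(example_ranks, 1)
hits3 = hits_at_k(example_ranks, 3)
mrr = mean_reciprocal_rank(example_ranks)
```

Averaging the two extremes avoids rewarding a model that assigns the same score to many candidates, which would otherwise look perfect under the optimistic rank alone.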

Ethical Considerations

No human participants were involved in this research. All data used to build and evaluate AlzKB were derived from publicly available biomedical knowledge retrieved from open access databases. None of the data included were derived from individual human participants. Similarly, AlzKB is entirely open source and publicly available and complies with the licensing terms of all 22 source databases used to build the knowledge base.

Knowledge Base Description

The first release of AlzKB (version 1.0) [ 26 ] contains 118,902 distinct nodes (representing biomedical entities) and 1,309,527 relationships linking those nodes. Full summaries of node and relationship types, with counts, are provided in Table 2 and Table S1 in Multimedia Appendix 1, respectively. Users can interact with AlzKB in their web browser using the built-in Neo4j interface or programmatically by connecting to the graph database over the internet. We also provide instructions for installing a local copy of the graph database as well as how to build the database from its original data sources.

Proposing New Therapeutic Targets for AD

As a proof of concept, we performed an analysis to predict whether known PD genes are also linked to AD etiology. PD is a chronic, progressive neurological disorder characterized by uncontrollable movements and possible mental and behavioral changes. Similar to AD, the precise etiology of PD is not fully understood, but the disease is characterized by the death or dysfunction of basal ganglia neurons. A growing body of work has established physiological and genetic similarities between PD and AD [ 62 ], and it has been proposed that drugs targeting PD genes could potentially treat AD as well. To approach this hypothesis computationally, we defined a binary classification task to predict whether gene nodes in the AlzKB knowledge graph are or are not AD genes [ 63 ]. To assemble the data set, we considered all gene nodes adjacent to AD as positive (n=101) and all gene nodes not adjacent to AD as negative (n=62,306). The negative samples are assumed to contain a mixture of true negatives and false negatives; in link prediction tasks, the goal is to recover the false negatives. We further filtered the negative nodes to omit PD genes (n=73) and orphan gene nodes (n=43,032) and downsampled the remaining genes to 303 (ie, 3 times the number of positive samples). To evaluate performance, we used accuracy, balanced accuracy, precision, recall, F 1 -score, area under the receiver operating characteristic curve, and area under the precision-recall curve, as shown in Figure 2 .
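The data set assembly described above (positives adjacent to AD, negatives filtered and downsampled to 3 times the number of positives) can be sketched as follows. The function and argument names are hypothetical, and treating "orphan" gene nodes as nodes with no edges is an assumption made for this illustration.

```python
import random


def assemble_dataset(gene_nodes, ad_genes, pd_genes, degree, ratio=3, seed=0):
    """Return (positives, negatives) for the AD gene classification task."""
    # Positives: gene nodes adjacent to AD.
    positives = sorted(g for g in gene_nodes if g in ad_genes)
    # Negative candidates: not AD genes, not PD genes, and not orphans
    # (orphans are assumed here to be nodes with degree 0).
    candidates = sorted(
        g for g in gene_nodes
        if g not in ad_genes and g not in pd_genes and degree.get(g, 0) > 0
    )
    # Downsample negatives to `ratio` times the number of positives.
    n_negatives = min(len(candidates), ratio * len(positives))
    negatives = random.Random(seed).sample(candidates, n_negatives)
    return positives, negatives


# Toy example with 20 placeholder gene names.
genes = [f"g{i}" for i in range(20)]
toy_degree = {g: 1 for g in genes}
toy_degree["g3"] = 0  # one orphan gene
pos, neg = assemble_dataset(genes, {"g0", "g1"}, {"g2"}, toy_degree)
```

PD genes are held out of the negative set because they are the nodes on which the trained classifier is later applied.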

The RF model predicted gene-disease relationships with an average balanced accuracy of 96.2% (precision=0.88; recall=0.98). We applied the trained models to predict PD genes that are likely to also be AD genes. Of the 73 PD genes in AlzKB, 8 (11%; FYN , DCTN1 , SNCA , SYNJ1 , RSP12 , ATXN2 , KCNIP3 , and CHRNB1 ; described in Table 3 ) were predicted to be AD genes. A total of 10% (7/73) of the genes were predicted to be AD genes in all 10 models, whereas CHRNB1 was predicted in 7 of the 10 models.
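Because the data set has 3 negatives for every positive, balanced accuracy (the mean of sensitivity and specificity) is a more informative summary than raw accuracy. The sketch below shows how the threshold-based metrics relate to a confusion matrix; the counts are made up for illustration and are not the study's results.

```python
def classification_metrics(tp, fp, tn, fn):
    """Threshold-based metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)       # sensitivity (true positive rate)
    specificity = tn / (tn + fp)  # true negative rate
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "balanced_accuracy": (recall + specificity) / 2,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts for a held-out test fold (20 positives, 60 negatives).
metrics = classification_metrics(tp=18, fp=3, tn=57, fn=2)
```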

Drug Repurposing via Graph Data Science

As a second use case, we considered the task of repurposing existing drugs—currently used to treat other diseases—based on patterns in the knowledge graph that suggest that they may also treat AD. To do this, we trained 5 state-of-the-art knowledge graph completion methods (TransE, RotatE, DistMult, ComplEx, and ConvE) [ 51 ] on AlzKB and selected the highest-performing one to predict links between drugs and AD. Additional details about the differences between these methods are provided in Multimedia Appendix 1 .

The performance of the 5 different knowledge graph completion models is shown in Table 4 . Among them, RotatE performed best, with the highest MRR and hits@k values. Therefore, we used RotatE to make predictions on the test set to obtain missing head entities with the template ([ drug ], DRUG_TREATS_DISEASE, AD). The top 10 predicted drugs are listed in Table 5 along with their current approved use and relevant clinical trial status pertaining to AD efficacy. Of the top 10 predictions, 3 (30%) have been investigated in clinical trials to treat symptoms of AD. To further explore these predictions, we generated visualizations of a minimum spanning tree linking the 10 drugs to AD in AlzKB’s knowledge graph, as shown in Figure 3 . The visualization shows that the shortest paths between the drugs and AD are mediated by a small set of AD-associated genes, each of which is associated with one or more of the proposed drugs. The visualization is suggestive of interpretable biological mechanisms through which the drugs could act on AD etiology and provides hypotheses to further explore their validity.
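The shortest paths between a predicted drug and AD can be found with a breadth-first search over the graph. The sketch below illustrates only the path-finding idea on a toy undirected graph with placeholder node names; it does not build a minimum spanning tree and does not reproduce the figure.

```python
from collections import deque


def shortest_path(adj, source, target):
    """Return one shortest path from source to target, or None if unreachable."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back through the parent pointers to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in sorted(adj.get(node, ())):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


# Toy graph: both drugs reach AD through an intermediate gene node.
toy_graph = {
    "drug_x": {"gene_1"},
    "drug_y": {"gene_1", "gene_2"},
    "gene_1": {"drug_x", "drug_y", "AD"},
    "gene_2": {"drug_y", "AD"},
    "AD": {"gene_1", "gene_2"},
    "isolated": set(),
}
path_x = shortest_path(toy_graph, "drug_x", "AD")
```

The intermediate nodes on such paths (here, gene nodes) are what make a prediction interpretable, because they name the entities through which the drug is connected to the disease.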

a MRR: mean reciprocal rank.

b Italicized values indicate maximum scores within a given column.

a No known AD-related clinical trials for the given drug.

b ER+: estrogen-receptor positive.

Principal Findings

AlzKB is a freely available resource for the biomedical research community, with the primary goal of expanding the repertoire of therapies for AD via drug repurposing. In the previous sections, we described the current contents of AlzKB, the process of constructing it, and 2 specific data-driven use cases that illustrate how it can be applied to drug repurposing tasks. These use cases consisted of predicting the shared genetic architecture of AD and PD (potentially allowing for PD therapies to be repurposed for AD) and directly proposing drugs to repurpose for treating AD by predicting new links between drug and disease nodes in the knowledge graph. In both cases, the results are both biologically plausible and supported by quantitative metrics, yielding new hypotheses that merit experimental validation. AlzKB is a flexible resource that is not limited to these analyses, and we encourage other research teams to use it for different and complementary knowledge discovery tasks.

The Role of AlzKB in Biomedical Knowledge Discovery

AD and other neurodegenerative diseases present one of the greatest challenges in modern biomedicine. AD is by and large a disease of old age, and as improvements to health care continue to increase the overall global life expectancy, we can expect the number of people with various forms of dementia to also increase. As the etiology and pathophysiology of AD are highly multifactorial, there is likely no single “cure” for the disease. Instead, researchers and public health officials have shifted much of their focus toward finding therapies that reduce risk, slow the progression of the disease, or reverse neuronal damage. In addition, as there are various subtypes of AD with distinct underlying mechanisms, any therapy might be effective for only some patients with AD. Therefore, an essential step for reducing global disease burden is to propose many new therapeutic agents that target various aspects of AD pathology. This is precisely the motivating use case for AlzKB. As we have demonstrated, AlzKB provides a rich representation of existing knowledge about AD and the biological context in which it acts. The 2 ML-based use cases we presented previously use real-world end points to demonstrate that the knowledge captured in AlzKB is meaningful and representative of the biological processes underlying the disease. AlzKB stands to become a major resource in the AD research community, where pattern analysis and integration with observational data can be used to propose a diverse array of new therapeutic hypotheses along with interpretable mechanistic explanations of how those therapies may act in the human body.

Building the initial release of AlzKB was a highly interdisciplinary effort involving contributions from experts in translational bioinformatics, data science, and clinical informatics as well as medical scientists. Although each of these domains was essential in delivering a knowledge base that reflects important biomedical patterns describing AD etiology and treatment, a key need during the design and implementation phases was data literacy. To support future work in this and related areas, we encourage the inclusion of informatics and data analysis techniques in all types of biomedical curricula. Beyond AlzKB, our approach for building the knowledge graph is generalizable to practically any domain and depends on (1) defining an ontology using expert knowledge that formally describes the domain of interest and (2) identifying source databases that provide the entities and relationships described in the ontology. We are directly involved in the ongoing development of other knowledge bases using this same approach, including ComptoxAI—a knowledge base that supports AI research in toxicology [ 64 ]. As both knowledge bases share many of the same “core” entities (genes, diseases, pathways, and anatomical structures), the knowledge graphs are already semantically harmonized and ready for integration in larger, cross-disciplinary biomedical knowledge applications.

Discovering Putative Therapies Through Graph Data Science

Of the PD genes predicted to also be AD genes (see the Proposing New Therapeutic Targets for AD section; Table 3 ), some are involved in neuronal signaling and structure, and some are known to be involved in a wide range of neurological disorders. FYN has seen recent attention and investigation into its possible link to AD due to its broad expression in brain tissue and known interactions with tau proteins [ 65 , 66 ]. Among the other identified genes, one ( CHRNB1 ) is known to be involved in acetylcholine signaling [ 67 , 68 ], and another ( KCNIP3 ) codes a protein that interacts with presenilin, and mutations in presenilin are causal for hereditary AD [ 69 , 70 ]. Some of these gene hits ( ATXN2 and DCTN1 ) have limited or no current research directly linking them to AD but are biologically plausible. As such, they may represent novel therapeutic targets or targets for further research and investigation [ 71 ]. For example, DCTN1 encodes the dynactin-1 protein, and deficits in dynactin are connected to several neurodegenerative diseases; however, there is limited research linking this gene to AD [ 72 , 73 ].

Among the drug repurposing predictions (see the Drug Repurposing via Graph Data Science section; Table 5 ) are some agents that have previously been proposed for the treatment of AD (risperidone and sertraline) or for symptoms associated with AD (nicotine). Sumatriptan has been the subject of several studies focused on AD [ 74 ] and is connected to a strong comorbidity of migraine headaches and dementia in women [ 75 ]. Pimozide has been shown to reduce the aggregation of tau protein in mice [ 76 ] and is linked to AD in a number of unrelated in silico models [ 77 ]. The inclusion of nicotine is also noteworthy as it has seen recent interest among AD researchers and is the subject of an ongoing clinical trial to improve memory [ 78 ]. Other drugs listed in Table 5 have not yet been identified as AD treatments and represent novel repurposing candidates. Each can be considered a testable hypothesis meriting further investigation, giving credence to the increased detective power of AlzKB’s knowledge graph approach over existing AD data resources. It should be noted that this approach can only propose new indications for existing drugs and is based on existing knowledge and derived from known biological associations with those drugs. Other approaches (including emerging techniques in graph ML) could be used to propose entirely new drugs that could treat AD.

Future Directions With AlzKB

AlzKB is a growing resource, and we have plans for adding new features and data types that are in various stages of implementation. As a central hypothesis of AD pathogenesis revolves around the atypical accumulation of proteins within and around brain cells, an important step will be to adequately distinguish and differentiate genes from the proteins that those genes code for. Existing data resources available for inclusion in AlzKB largely fail to make this distinction in a way that is accepted by the scientific community, so we are currently evaluating options to use either postprocessing of existing knowledge sources or synthesis of new knowledge to achieve a good representation of genes, proteins, and functional or structural variants that are key to understanding AD.

Current ML models often do not generalize well to heterogeneous graphs such as the one that constitutes AlzKB’s knowledge graph. This is largely because traditional models cannot use the network structure and heterogeneous nature of different entity types. Several promising algorithms can be used for prediction on heterogeneous graphs, including GraphSAGE [ 79 ] and metapath2vec [ 80 ], but most fail to scale effectively when the number of node or edge types increases. As any effective therapy must be accompanied by a mechanistic understanding of how it functions, we also need to ensure that new heterogeneous graph ML models are explainable. With this in mind, we are using AlzKB as a motivating resource for designing new, cutting-edge algorithms that produce interpretable predictions from highly heterogeneous knowledge graphs. Furthermore, the increasing popularity of large language models (LLMs; such as GPT-4) presents a wealth of opportunities for incorporating knowledge graphs such as AlzKB into diverse AI applications [ 81 ]. One application we are considering is using AlzKB to provide LLMs with formalized knowledge about AD that allows them to more effectively produce informative outputs about AD etiology. Currently, LLMs can perform poorly on technically complex or poorly understood domains due to a scarcity of relevant content in their training corpora, and augmenting their performance using domain-specific knowledge graphs is an emerging strategy for addressing that issue. As we develop these algorithms and applications, we will release them alongside AlzKB with educational resources that facilitate ease of use and adoption by various stakeholders.

Knowledge graphs—including AlzKB—come with several important limitations that will be crucial to address in coming years. One of these is the subjective nature of determining what does and does not constitute “knowledge,” implying broad acceptance by the scientific community (as opposed to “data,” which consist of individual observations). Currently, we use expert domain knowledge and careful screening of source databases to accomplish this, but with the advent of broadly accessible generative AI tools, there may be emerging strategies that minimize sources of human bias [ 82 ]. Furthermore, new predictions made using knowledge graphs still necessitate costly and time-consuming experimental or observational follow-up studies to validate those predictions. This is due in part to the absence of negative samples for training predictive models. While the presence of an edge between 2 nodes in a knowledge graph is interpreted as a “positive sample” for model training, the absence of an edge simply means that we do not know whether a relationship does or does not exist, and therefore, it may not in fact be a negative sample. New methods, including self-supervised contrastive learning, show promise in alleviating this issue [ 83 ], but further work is needed to determine whether these generalize well to AlzKB and similar highly heterogeneous biomedical knowledge graphs. Nonetheless, these are active areas of research in the AI, informatics, and computer science communities, and in spite of them, our results are still robust enough to provide compelling evidence demonstrating AlzKB’s scientific value.

Ultimately, we aim to provide AlzKB as a robust resource that helps unravel the etiology of AD. It is already a large, high-quality knowledge base from which graph-based AI or ML approaches can be developed for drug repurposing and drug discovery. As we and the rest of the biomedical research community make these discoveries in the coming years, they will be included and publicized on the AlzKB website as a public resource to drive innovation and scientific progress.

Obtaining AlzKB for Local Use and Extending the Knowledge Graph

As it is a public and open-source resource for scientific discovery, we provide AlzKB through a variety of interfaces with distinct advantages for different use cases and user types. Casual users who wish to browse the knowledge base or perform simple analyses can do so directly through the Neo4j browser interface [ 24 ]. However, for more advanced use cases (or when computational needs exceed those available on the public version of the knowledge base), AlzKB can be either downloaded and populated locally into a Neo4j installation or built from the original source data files via the tools included on the AlzKB GitHub repository [ 25 ]. The latter of these options also allows users to extend the knowledge base to include additional data sources, entity types, or relationships beyond those provided in the official knowledge base distribution. We also encourage users who make modifications to the knowledge base to submit their changes for review to be included in the main code distribution. Instructions for how to contribute to AlzKB are also available on the GitHub repository.

As the data sources included in AlzKB are all, themselves, from open-source databases, we urge users to ensure that any new data sources they merge into AlzKB similarly comply with open-source standards. In brief, AlzKB can only be maintained under the most restrictive license terms of its included third-party sources, so restrictive license terms in a database being considered decrease that database’s suitability for inclusion. We hope for AlzKB to be recognized as a community effort for aggregating and democratizing the discovery of new AD therapeutics and, therefore, encourage public discussion of new methods and data sources to be included.

Conclusions

In this work, we introduced AlzKB as a free, publicly available toolkit and data resource for novel discoveries in AD research, with a particular focus on therapeutic approaches to treating AD. AlzKB is both new and continually growing, and we aim to cultivate a community of researchers to collaboratively increase the impact, speed, and throughput of AD research, along with rapid dissemination to health care, academia, and the pharmaceutical industry. In the future, we will develop new AI and data science methods to continually extract knowledge from AlzKB, but in this study, we already demonstrate through graph data science that AlzKB can both replicate existing AD knowledge and generate entirely new, testable hypotheses to drive the future of drug repurposing and drug discovery.

Acknowledgments

The Alzheimer’s Knowledge Base is supported by US National Institutes of Health grants U01-AG066833, R01-LM010098, R01-LM013463 (principal investigator [PI]: JHM), and R00-LM013646 (PI: JDR).

Data Availability

The data sets generated and analyzed during this study are available in the GitHub and Zenodo repositories [ 25 , 26 ].

Conflicts of Interest

None declared.

Supplemental information providing expanded details on the knowledge graph completion methods used to validate Alzheimer’s Knowledge Base, as well as counts for relationship types in the knowledge graph.

  • 2022 Alzheimer's disease facts and figures. Alzheimers Dement. Apr 2022;18(4):700-789.
  • Yiannopoulou KG, Papageorgiou SG. Current and future treatments in Alzheimer disease: an update. J Cent Nerv Syst Dis. Feb 29, 2020;12:1179573520907397.
  • Rabinovici GD. Controversy and progress in Alzheimer's disease - FDA approval of aducanumab. N Engl J Med. Aug 26, 2021;385(9):771-774.
  • DeTure MA, Dickson DW. The neuropathological diagnosis of Alzheimer's disease. Mol Neurodegener. Aug 02, 2019;14(1):32.
  • Aisen PS, Cummings J, Jack CRJ, Morris JC, Sperling R, Frölich L, et al. On the path to 2025: understanding the Alzheimer's disease continuum. Alzheimers Res Ther. Aug 09, 2017;9(1):60.
  • Fan L, Mao C, Hu X, Zhang S, Yang Z, Hu Z, et al. New insights into the pathogenesis of Alzheimer's disease. Front Neurol. 2019;10:1312.
  • Silva MV, de Mello Gomide Loures C, Alves LC, de Souza LC, Borges KB, Carvalho MD. Alzheimer's disease: risk factors and potentially protective measures. J Biomed Sci. May 09, 2019;26(1):33.
  • Rodriguez S, Hug C, Todorov P, Moret N, Boswell SA, Evans K, et al. Machine learning identifies candidates for drug repurposing in Alzheimer's disease. Nat Commun. Feb 15, 2021;12(1):1033.
  • Tsuji S, Hase T, Yachie-Kinoshita A, Nishino T, Ghosh S, Kikuchi M, et al. Artificial intelligence-based computational framework for drug-target prioritization and inference of novel repositionable drugs for Alzheimer's disease. Alzheimers Res Ther. May 03, 2021;13(1):92.
  • Grupe A, Abraham R, Li Y, Rowland C, Hollingworth P, Morgan A, et al. Evidence for novel susceptibility genes for late-onset Alzheimer's disease from a genome-wide association study of putative functional variants. Hum Mol Genet. Apr 15, 2007;16(8):865-873.
  • The Alzheimer's KnowledgeBase (AlzKB). AlzKB. URL: https://alzkb.ai/ [accessed 2023-02-24]
  • Daluwatumulle G, Wijesinghe R, Weerasinghe R. In silico drug repurposing using knowledge graph embeddings for Alzheimer's disease. In: Proceedings of the 9th International Conference on Bioinformatics Research and Applications. 2022. Presented at: ICBRA '22; September 18-20, 2022; Berlin, Germany.
  • Nian Y, Hu X, Zhang R, Feng J, Du J, Li F, et al. Mining on Alzheimer's diseases related knowledge graph to identity potential AD-related semantic triples for drug repurposing. BMC Bioinformatics. Sep 30, 2022;23(Suppl 6):407.
  • Hsieh KL, Plascencia-Villa G, Lin KH, Perry G, Jiang X, Kim Y. Synthesize heterogeneous biological knowledge via representation learning for Alzheimer's disease drug repurposing. iScience. Nov 26, 2022;26(1):105678.
  • Binder J, Ursu O, Bologa C, Jiang S, Maphis N, Dadras S, et al. Machine learning prediction and tau-based screening identifies potential Alzheimer's disease genes relevant to immunity. Commun Biol. Feb 11, 2022;5(1):125.
  • Sügis E, Dauvillier J, Leontjeva A, Adler P, Hindie V, Moncion T, et al. HENA, heterogeneous network-based data set for Alzheimer's disease. Sci Data. Aug 14, 2019;6(1):151.
  • Robinson I, Webber J, Eifrem E. Graph Databases: New Opportunities for Connected Data. Sebastopol, CA. O'Reilly Media; 2015.
  • Davis R, Shrobe H, Szolovits P. What is a knowledge representation? AI Mag. 1993;14(1):17.
  • Musen MA, Protégé Team. The Protégé project: a look back and a look forward. AI Matters. Jun 2015;1(4):4-12.
  • Hitzler P, Krötzsch M, Parsia B, Patel-Schneider PF, Rudolph S. OWL 2 Web ontology language primer. World Wide Web Consortium. Apr 21, 2009. URL: https://www.w3.org/TR/2009/WD-owl2-primer-20090421/ [accessed 2024-03-25]
  • Tsarkov D, Horrocks I. FaCT++ description logic reasoner: system description. In: Proceedings of the International Joint Conference on Automated Reasoning. 2006. Presented at: IJCAR 2006; August 17-20, 2006; Seattle, WA.
  • Neo4j. URL: https://neo4j.com/ [accessed 2022-10-25]
  • Barrasa J, Cowley A. neosemantics (n10s): Neo4j RDF and semantics toolkit. Neo4j. URL: https://neo4j.com/labs/neosemantics/ [accessed 2022-10-25]
  • Neo4j browser. Neo4j. URL: http://neo4j.alzkb.ai/browser/ [accessed 2023-02-24]
  • EpistasisLab/AlzKB. GitHub. URL: https://github.com/EpistasisLab/AlzKB [accessed 2023-02-24]
  • Romano J, Wang P. EpistasisLab/AlzKB: AlzKB first DOI release. Zenodo. Aug 22, 2022. URL: https://zenodo.org/records/7015728 [accessed 2024-03-27]
  • Mortensen HM, Senn J, Levey T, Langley P, Williams AJ. The 2021 update of the EPA's adverse outcome pathway database. Sci Data. Jul 12, 2021;8(1):169.
  • Bastian F, Parmentier G, Roux J, Moretti S, Laudet V, Robinson-Rechavi M. Bgee: integrating and comparing heterogeneous transcriptome data among species. In: Proceedings of the Data Integration in the Life Sciences. 2008. Presented at: DILS 2008; June 25-27, 2008; Evry, France.
  • Schriml LM, Mitraka E, Munro J, Tauber B, Schor M, Nickle L, et al. Human Disease Ontology 2018 update: classification, content and workflow expansion. Nucleic Acids Res. Jan 08, 2019;47(D1):D955-D962.
  • Piñero J, Queralt-Rosinach N, Bravo A, Deu-Pons J, Bauer-Mehren A, Baron M, et al. DisGeNET: a discovery platform for the dynamical exploration of human diseases and their genes. Database (Oxford). 2015;2015:bav028.
  • Wishart DS, Knox C, Guo AC, Cheng D, Shrivastava S, Tzur D, et al. DrugBank: a knowledgebase for drugs, drug actions and drug targets. Nucleic Acids Res. Jan 2008;36(Database issue):D901-D906.
  • Grulke CM, Williams AJ, Thillanadarajah I, Richard AM. EPA's DSSTox database: history of development of a curated chemistry resource supporting computational toxicology research. Comput Toxicol. Nov 01, 2019;12:100096.
  • Judson R, Richard A, Dix D, Houck K, Elloumi F, Martin M, et al. ACToR--Aggregated computational toxicology resource. Toxicol Appl Pharmacol. Nov 15, 2008;233(1):7-13.
  • Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. May 2000;25(1):25-29.
  • Gene Ontology Consortium. The Gene Ontology resource: enriching a GOld mine. Nucleic Acids Res. Jan 08, 2021;49(D1):D325-D334.
  • Buniello A, MacArthur JA, Cerezo M, Harris LW, Hayhurst J, Malangone C, et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. Jan 08, 2019;47(D1):D1005-D1012.
  • Himmelstein DS, Lizee A, Hessler C, Brueggeman L, Chen SL, Hadley D, et al. Systematic integration of biomedical knowledge prioritizes drugs for repurposing. Elife. Sep 22, 2017;6:e26726.
  • The human reference protein interactome mapping project. The Human Reference Interactome. URL: http://www.interactome-atlas.org/ [accessed 2023-02-24]
  • Duan Q, Reid SP, Clark NR, Wang Z, Fernandez NF, Rouillard AD, et al. L1000CDS: LINCS L1000 characteristic direction signatures search engine. NPJ Syst Biol Appl. Aug 04, 2016;2(1):16015. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Lipscomb CE. Medical Subject Headings (MeSH). Bull Med Libr Assoc. Jul 2000;88(3):265-266. [ FREE Full text ] [ Medline ]
  • Maglott D, Ostell J, Pruitt KD, Tatusova T. Entrez Gene: gene-centered information at NCBI. Nucleic Acids Res. Jan 2011;39(Database issue):D52-D57. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Schaefer C, Anthony K, Krupa S, Buchoff J, Day M, Hannay T, et al. PID: the pathway interaction database. Nat Prec. Aug 29, 2008. [ CrossRef ]
  • Himmelstein D, Pouya K, Hessler CS, Green AJ, Baranzini S. PharmacotherapyDB 1.0: the open catalog of drug therapies for disease. Figshare. 2016. URL: https://tinyurl.com/mv8k46em [accessed 2024-03-25]
  • Kim S, Chen J, Cheng T, Gindulyte A, He J, He S, et al. PubChem in 2021: new data content and improved web interfaces. Nucleic Acids Res. Jan 08, 2021;49(D1):D1388-D1395. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Wu G, Haw R. Functional interaction network construction and analysis for disease discovery. Methods Mol Biol. 2017;1558:235-253. [ CrossRef ] [ Medline ]
  • Kuhn M, Letunic I, Jensen LJ, Bork P. The SIDER database of drugs and side effects. Nucleic Acids Res. Jan 04, 2016;44(D1):D1075-D1079. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Palasca O, Santos A, Stolte C, Gorodkin J, Jensen LJ. TISSUES 2.0: an integrative web resource on mammalian tissue expression. Database (Oxford). Jan 01, 2018;2018:2. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Mungall CJ, Torniai C, Gkoutos GV, Lewis SE, Haendel MA. Uberon, an integrative multi-species anatomy ontology. Genome Biol. Jan 31, 2012;13(1):R5. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Martens M, Ammar A, Riutta A, Waagmeester A, Slenter DN, Hanspers KA, et al. WikiPathways: connecting communities. Nucleic Acids Res. Jan 08, 2021;49(D1):D613-D621. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Ali M, Berrendorf M, Hoyt CT, Vermue L, Sharifzadeh S, Tresp V, et al. PyKEEN 1.0: a python library for training and evaluating knowledge graph embeddings. J Mach Learn Res. 2021;22(82):1-6. [ FREE Full text ]
  • Zamini M, Reza H, Rabiei M. A review of knowledge graph completion. Information. Aug 21, 2022;13(8):396. [ CrossRef ]
  • Bordes A, Usunier N, Garcia-Duran A, Weston J, Yakhnenko O. Translating embeddings for modeling multi-relational data. In: Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2. 2013. Presented at: NIPS'13; December 5-10, 2013; Lake Tahoe, Nevada.
  • Sun Z, Deng ZH, Nie JY, Tang J. RotatE: knowledge graph embedding by relational rotation in complex space. arXiv. Preprint posted online February 26, 2019. [ FREE Full text ]
  • Yang B, Yih WT, He X, Gao J, Deng L. Embedding entities and relations for learning and inference in knowledge bases. arXiv. Preprint posted online December 20, 2014. [ FREE Full text ]
  • Trouillon T, Welbl J, Riedel S, Gaussier E, Bouchard G. Complex embeddings for simple link prediction. arXiv. Preprint posted online June 20, 2016. [ FREE Full text ]
  • Dettmers T, Minervini P, Stenetorp P, Riedel S. Convolutional 2D knowledge graph embeddings. arXiv. Preprint posted online July 5, 2017. [ FREE Full text ] [ CrossRef ]
  • Newman ME. Clustering and preferential attachment in growing networks. Phys Rev E. Jul 26, 2001;64(2):025102. [ CrossRef ]
  • Barabasi AL, Albert R. Emergence of scaling in random networks. Science. Oct 15, 1999;286(5439):509-512. [ CrossRef ] [ Medline ]
  • Adamic LA, Adar E. Friends and neighbors on the web. Soc Netw. Jul 2003;25(3):211-230. [ CrossRef ]
  • Zhou T, Lü L, Zhang YC. Predicting missing links via local information. Eur Phys J B. Oct 10, 2009;71(4):623-630. [ CrossRef ]
  • Gao Z, Ding P, Xu R. KG-Predict: a knowledge graph computational framework for drug repurposing. J Biomed Inform. Aug 2022;132:104133. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Nussbaum RL, Ellis CE. Alzheimer's disease and Parkinson's disease. N Engl J Med. Apr 03, 2003;348(14):1356-1364. [ CrossRef ] [ Medline ]
  • Abbas K, Abbasi A, Dong S, Niu L, Yu L, Chen B, et al. Application of network link prediction in drug discovery. BMC Bioinformatics. Apr 12, 2021;22(1):187. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Romano JD, Hao Y, Moore JH, Penning TM. Automating predictive toxicology using ComptoxAI. Chem Res Toxicol. Aug 15, 2022;35(8):1370-1382. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Iannuzzi F, Sirabella R, Canu N, Maier TJ, Annunziato L, Matrone C. Fyn tyrosine kinase elicits amyloid precursor protein Tyr682 phosphorylation in neurons from Alzheimer's disease patients. Cells. Jul 30, 2020;9(8):1807. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Nygaard HB, van Dyck CH, Strittmatter SM. Fyn kinase inhibition as a novel therapy for Alzheimer's disease. Alzheimers Res Ther. Feb 5, 2014;6(1):8. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Lardenoije R, Roubroeks JA, Pishva E, Leber M, Wagner H, Iatrou A, et al. Alzheimer's disease-associated (hydroxy)methylomic changes in the brain and blood. Clin Epigenetics. Nov 27, 2019;11(1):164. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Lombardo S, Maskos U. Role of the nicotinic acetylcholine receptor in Alzheimer's disease pathology and treatment. Neuropharmacology. Sep 2015;96(Pt B):255-262. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Jo DG, Lee JY, Hong YM, Song S, Mook-Jung I, Koh JY, et al. Induction of pro-apoptotic calsenilin/DREAM/KChIP3 in Alzheimer's disease and cultured neurons after amyloid-beta exposure. J Neurochem. Feb 2004;88(3):604-611. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Jin JK, Choi JK, Wasco W, Buxbaum JD, Kozlowski PB, Carp RI, et al. Expression of calsenilin in neurons and astrocytes in the Alzheimer's disease brain. Neuroreport. Apr 04, 2005;16(5):451-455. [ CrossRef ] [ Medline ]
  • Rosas I, Martínez C, Clarimón J, Lleó A, Illán-Gala I, Dols-Icardo O, et al. Role for ATXN1, ATXN2, and HTT intermediate repeats in frontotemporal dementia and Alzheimer's disease. Neurobiol Aging. Mar 2020;87:139.e1-139.e7. [ CrossRef ] [ Medline ]
  • Aboud O, Parcon PA, DeWall KM, Liu L, Mrak RE, Griffin WS. Aging, Alzheimer's, and APOE genotype influence the expression and neuronal distribution patterns of microtubule motor protein dynactin-P50. Front Cell Neurosci. Mar 25, 2015;9:103. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Caroppo P, Le Ber I, Clot F, Rivaud-Péchoux S, Camuzat A, De Septenville A, et al. DCTN1 mutation analysis in families with progressive supranuclear palsy-like phenotypes. JAMA Neurol. Feb 2014;71(2):208-215. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Zochodne DW, Ho LT. Sumatriptan blocks neurogenic inflammation in the peripheral nerve trunk. Neurology. Jan 1994;44(1):161-163. [ CrossRef ] [ Medline ]
  • Liu CT, Wu BY, Hung YC, Wang LY, Lee YY, Lin TK, et al. Decreased risk of dementia in migraine patients with traditional Chinese medicine use: a population-based cohort study. Oncotarget. Oct 03, 2017;8(45):79680-79692. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Kim YD, Jeong EI, Nah J, Yoo SM, Lee WJ, Kim Y, et al. Pimozide reduces toxic forms of tau in TauC3 mice via 5' adenosine monophosphate-activated protein kinase-mediated autophagy. J Neurochem. Sep 11, 2017;142(5):734-746. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Kumar S, Chowdhury S, Kumar S. In silico repurposing of antipsychotic drugs for Alzheimer's disease. BMC Neurosci. Oct 27, 2017;18(1):76. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • van Duijn CM, Hofman A. Relation between nicotine intake and Alzheimer's disease. BMJ. Jun 22, 1991;302(6791):1491-1494. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Hamilton WL, Ying R, Leskovec J. Inductive representation learning on large graphs. arXiv. Preprint posted online June 7, 2017. [ FREE Full text ]
  • Dong Y, Chawla NV, Swami A. metapath2vec: scalable representation learning for heterogeneous networks. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2017. Presented at: KDD '17; August 13-17, 2017; Halifax, NS. URL: https://dl.acm.org/doi/10.1145/3097983.3098036
  • Pan S, Luo L, Wang Y, Chen C, Wang J, Wu X. Unifying large language models and knowledge graphs: a roadmap. arXiv. Preprint posted online June 14, 2023. [ FREE Full text ] [ CrossRef ]
  • Zhu Y, Wang X, Chen J, Qiao S, Ou Y, Yao Y, et al. LLMs for knowledge graph construction and reasoning: recent capabilities and future opportunities. arXiv. Preprint posted online May 22, 2023. [ FREE Full text ]
  • Kefato ZT, Girdzijauskas S. Self-supervised Graph Neural Networks without explicit negative sampling. arXiv. Preprint posted online March 27, 2021. [ FREE Full text ]

Edited by T de Azevedo Cardoso; submitted 24.02.23; peer-reviewed by P Dabas, N Mungoli, B Xie, C Sun; comments to author 21.04.23; revised version received 23.06.23; accepted 07.11.23; published 18.04.24.

©Joseph D Romano, Van Truong, Rachit Kumar, Mythreye Venkatesan, Britney E Graham, Yun Hao, Nick Matsumoto, Xi Li, Zhiping Wang, Marylyn D Ritchie, Li Shen, Jason H Moore. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 18.04.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.

Title: Chiplet Placement Order Exploration Based on Learning to Rank with Graph Representation

Abstract: Chiplet-based systems, integrating various silicon dies manufactured at different integrated circuit technology nodes on a carrier interposer, have garnered significant attention in recent years due to their cost-effectiveness and competitive performance. The widespread adoption of reinforcement learning as a sequential placement method has introduced a new challenge in determining the optimal placement order for each chiplet. The order in which chiplets are placed on the interposer influences the spatial resources available for earlier and later placed chiplets, making the placement results highly sensitive to the sequence of chiplet placement. To address these challenges, we propose a learning to rank approach with graph representation, building upon the reinforcement learning framework RLPlanner. This method aims to select the optimal chiplet placement order for each chiplet-based system. Experimental results demonstrate that compared to placement order obtained solely based on the descending order of the chiplet area and the number of interconnect wires between the chiplets, utilizing the placement order obtained from the learning to rank network leads to further improvements in system temperature and inter-chiplet wirelength. Specifically, applying the top-ranked placement order obtained from the learning to rank network results in a 10.05% reduction in total inter-chiplet wirelength and a 1.01% improvement in peak system temperature during the chiplet placement process.
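The baseline ordering mentioned in the abstract (descending chiplet area and number of interconnect wires) can be illustrated with a minimal sketch. The abstract does not say how the two criteria are combined, so the code below assumes area is the primary key and wire count breaks ties. The `Chiplet` class, its field names, and the sample values are hypothetical and do not come from RLPlanner or the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chiplet:
    # Hypothetical record; field names are illustrative assumptions.
    name: str
    area: float   # footprint area, arbitrary units
    wires: int    # total interconnect wires to other chiplets


def baseline_order(chiplets):
    """Sort by descending area, then by descending wire count (assumed tie-break)."""
    return sorted(chiplets, key=lambda c: (c.area, c.wires), reverse=True)


# Made-up example system.
system = [
    Chiplet("A", area=40.0, wires=10),
    Chiplet("B", area=60.0, wires=5),
    Chiplet("C", area=40.0, wires=30),
]

order = [c.name for c in baseline_order(system)]
print(order)
```

A learned ranking model, as the abstract describes, would replace this fixed sort key with scores predicted for candidate orders.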

VIDEO

  1. Graphical Representation

  2. Graphical Representation of Data| Team 9 #histogram #datascience #data #exam #statistics

  3. Analyzing numerical data using graphical representation 🧮📝

  4. Dashboards, Graphs and Maps NetXMS Explained

  5. Diagrammatic and Graphical Representation

  6. Graphical Representation Of Data

COMMENTS

  1. Graphical Representation of Data

    Examples on Graphical Representation of Data. Example 1: A pie chart is divided into 3 parts with the angles measuring as 2x, 8x, and 10x respectively. Find the value of x in degrees. Solution: We know, the sum of all angles in a pie chart would give 360º as result. ⇒ 2x + 8x + 10x = 360º. ⇒ 20 x = 360º.
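The pie-chart example above stops at 20x = 360º. As a small sketch of the remaining arithmetic, the code below divides the full circle by the sum of the coefficients and then scales each coefficient to get the sector angles. The function name `solve_pie_angles` is an illustrative choice.

```python
from fractions import Fraction


def solve_pie_angles(coefficients, total_degrees=360):
    """Solve c1*x + c2*x + ... = total_degrees and return x and each angle."""
    x = Fraction(total_degrees, sum(coefficients))
    return x, [c * x for c in coefficients]


x, angles = solve_pie_angles([2, 8, 10])
print(x)                         # value of x in degrees
print([int(a) for a in angles])  # the three sector angles
```

Here x comes out to 18º, so the three sectors measure 36º, 144º, and 180º, which add up to the full 360º.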

  2. Graphical Representation

    Frequency Distribution Graphs - Example: Frequency Polygon Graph; Principles of Graphical Representation. Algebraic principles are applied to all types of graphical representation of data. In graphs, it is represented using two lines called coordinate axes. The horizontal axis is denoted as the x-axis and the vertical axis is denoted as the y ...

  3. 2: Graphical Representations of Data

    2.3: Histograms, Frequency Polygons, and Time Series Graphs. A histogram is a graphic version of a frequency distribution. The graph consists of bars of equal width drawn adjacent to each other. The horizontal scale represents classes of quantitative data values and the vertical scale represents frequencies. The heights of the bars correspond ...
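To make the histogram description concrete, here is a minimal sketch that groups values into equal-width classes and prints one text bar per class. The data set, the class boundaries, and the function name `class_frequencies` are made up for illustration.

```python
def class_frequencies(values, low, width, n_classes):
    """Count how many values fall in each of n_classes equal-width classes.

    Class k covers [low + k*width, low + (k+1)*width); the top edge of the
    last class is included so the maximum value is not dropped.
    """
    counts = [0] * n_classes
    top = low + width * n_classes
    for v in values:
        if v == top:
            k = n_classes - 1
        else:
            k = int((v - low) // width)
        if 0 <= k < n_classes:
            counts[k] += 1
    return counts


scores = [52, 55, 61, 64, 67, 70, 73, 78, 81, 89]  # made-up data
counts = class_frequencies(scores, low=50, width=10, n_classes=4)

# Adjacent bars of equal width; bar length is the class frequency.
for k, count in enumerate(counts):
    start = 50 + 10 * k
    print(f"{start}-{start + 10}: {'#' * count}")
```

Each printed row plays the role of one bar: the label is the class on the horizontal scale and the number of `#` marks is the frequency.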

  4. What Is Data Visualization: Definition, Types, Tips, and Examples

    Data Visualization is a graphic representation of data that aims to communicate numerous heavy data in an efficient way that is easier to grasp and understand. In a way, data visualization is the mapping between the original data and graphic elements that determine how the attributes of these elements vary. The visualization is usually made by ...

  5. Data representations

    Data representations are useful for interpreting data and identifying trends and relationships. When working with data representations, pay close attention to both the data values and the key words in the question. When matching data to a representation, check that the values are graphed accurately for all categories.

  6. What Is Data Visualization? Definition & Examples

    Data visualization is the graphical representation of information and data. By using visual elements like charts, graphs, and maps, data visualization tools provide an accessible way to see and understand trends, outliers, and patterns in data. Additionally, it provides an excellent way for employees or business owners to present data to non ...

  7. 2: Graphical Descriptions of Data

    2.3: Other Graphical Representations of Data This page titled 2: Graphical Descriptions of Data is shared under a CC BY-SA 4.0 license and was authored, remixed, and/or curated by Kathryn Kozak via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.

  8. 2.1: Introduction

    Then patterns can more easily be discerned. Figure 2.1.1: When you have large amounts of data, you will need to organize it in a way that makes sense. These ballots from an election are rolled together with similar ballots to keep them organized. (credit: William Greeson) In this chapter, you will study graphical ways to describe and ...

  9. 8.2: Presenting Quantitative Data Graphically

    This type of graph is called a histogram. Histogram. A histogram is a graphical representation of quantitative data, similar to a bar graph. The horizontal axis is a number line and the bars are touching. Example 2: For the values above, a histogram would look like:

  10. Graphical Representation of Data

    Graphical Representation of Data: In today's world of the internet and connectivity, there is a lot of data available, ... These two axes divide the plane into four parts called quadrants. The horizontal one is usually called the x-axis and the other one is called the y-axis. The origin is the point where these two axes intersect.

  11. What is Graphical Representation? Definition and FAQs

    Graphical representation refers to the use of intuitive charts to clearly visualize and simplify data sets. Data is ingested into graphical representation of data software and then represented by a variety of symbols, such as lines on a line chart, bars on a bar chart, or slices on a pie chart, from which users can gain greater insight than by ...

  12. Introduction to Graphs

    Principles of graphical representation. The principles of graphical representation are algebraic. In a graph, there are two lines known as the axes or coordinate axes. These are the X-axis and Y-axis. The horizontal axis is the X-axis and the vertical axis is the Y-axis. They are perpendicular to each other and intersect at O, the point of origin.

  13. What Is Graphical Representation Of Data

    Graphical representation of data, often referred to as graphical presentation or simply graphs, plays a crucial role in conveying information effectively. Principles of Graphical Representation. Effective graphical representation follows certain fundamental principles that ensure clarity, accuracy, and usability: Clarity: The primary goal ...

  14. Graphical Representation: Types, Rules, Principles & Examples

    A graphical representation is the geometrical image of a set of data that preserves its characteristics and displays them at a glance. It is a mathematical picture of data points. It enables us to think about a statistical problem in visual terms. It is an effective tool for the preparation, understanding and interpretation of the collected data.

  15. Graphic Representation of Data: Meaning, Principles and Methods

    General Principles of Graphic Representation: There are some algebraic principles which apply to all types of graphic representation of data. In a graph there are two lines called coordinate axes. One is vertical known as Y axis and the other is horizontal called X axis. These two lines are perpendicular to each other.

  16. 17 Important Data Visualization Techniques

    Bullet Graph. Choropleth Map. Word Cloud. Network Diagram. Correlation Matrices. 1. Pie Chart. Pie charts are one of the most common and basic data visualization techniques, used across a wide range of applications. Pie charts are ideal for illustrating proportions, or part-to-whole comparisons.

  17. Data Representation: Definition, Types, Examples

    Graphical Representation of Data: Histogram. The histogram is another kind of graph that uses bars in its display. The histogram is used for quantitative data; ranges of values known as classes are listed at the bottom, and the classes with greater frequencies have the taller bars.

  18. Graphical Representation of Data

    Graphical Representation of the data is a visual representation of data in statistics done in the form of graphs, charts, lines, and plots. This method is used for the comparison and analysis of qualitative and categorical data using discrete and non-discrete variables. While plotting a graph, make sure that it has a proper title, scale, unit ...

  19. 2.1: Three Popular Data Displays

    Learning Objectives. To learn to interpret the meaning of three graphical representations of sets of data: stem and leaf diagrams, frequency histograms, and relative frequency histograms. A well-known adage is that "a picture is worth a thousand words.". This saying proves true when it comes to presenting statistical information in a data set.
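As a small sketch of how a stem and leaf diagram is built, the code below splits each value into a stem (all digits except the last) and a leaf (the last digit). It assumes non-negative integers; the sample data and the function name `stem_and_leaf` are made up for illustration.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group non-negative integers by stem (value // 10) with sorted leaves (value % 10)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)


data = [38, 25, 47, 31, 50, 23, 42, 38]  # made-up data
plot = stem_and_leaf(data)

for stem in sorted(plot):
    leaves = " ".join(str(leaf) for leaf in plot[stem])
    print(f"{stem} | {leaves}")
```

Because every original value can be read back from its stem and leaf, the display keeps the raw data while still showing the shape of the distribution.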

  20. Graphic Presentation of Data and Information

    Data Sources - Wherever possible, include the sources of information at the bottom of the graph. Keep it Simple - You should construct a graph which even a layman (without any exposure in the areas of statistics or mathematics) can understand. Neat - A graph is a visual aid for the presentation of data and information.

  21. Analyzing and Visualizing Data Flashcards

    Study with Quizlet and memorize flashcards containing terms like The graphical representation of data, usually in a visually appealing way, is called _____., Data visualization refers to _____., What is data visualization? and more.

  22. Graphs in Data Structure: Types, Representation, Operations

    Graph Representation in Data Structure. Below are the two most common ways of representing graphs in data structure: 1. Adjacency Matrix ... In a complete graph or fully connected graph in the data structure, every vertex has an edge to all other vertices. A graph is called a complete graph if there is an edge from every vertex to every other ...
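A minimal sketch of the adjacency matrix representation for an undirected graph, together with a completeness check, is given below. A complete graph needs a direct edge between every pair of distinct vertices, so a graph that is merely connected does not qualify. The function names and the two sample graphs are illustrative.

```python
def adjacency_matrix(n, edges):
    """Build an n x n 0/1 matrix for an undirected graph on vertices 0..n-1."""
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[u][v] = 1
        matrix[v][u] = 1  # undirected: mirror the entry
    return matrix


def is_complete(matrix):
    """True if every pair of distinct vertices is joined by a direct edge."""
    n = len(matrix)
    return all(matrix[i][j] == 1 for i in range(n) for j in range(n) if i != j)


triangle = adjacency_matrix(3, [(0, 1), (1, 2), (0, 2)])
path = adjacency_matrix(3, [(0, 1), (1, 2)])  # connected, but 0 and 2 are not adjacent

print(is_complete(triangle))
print(is_complete(path))
```

The matrix costs space proportional to the square of the vertex count but answers "is there an edge between u and v?" with a single lookup.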

  23. Survey on Embedding Models for Knowledge Graph and its Applications

    Knowledge Graph (KG) is a graph based data structure to represent facts of the world where nodes represent real world entities or abstract concept and edges represent relation between the entities. Graph as representation for knowledge has several drawbacks like data sparsity, computational complexity and manual feature engineering. Knowledge Graph embedding tackles the drawback by ...

  24. Graph Neural Networks for Wireless Networks: Graph Representation

    Graph neural networks (GNNs) have been regarded as the basic model to facilitate deep learning (DL) to revolutionize resource allocation in wireless networks. GNN-based models are shown to be able to learn the structural information about graphs representing the wireless networks to adapt to the time-varying channel state information and dynamics of network topology. This article aims to ...

  25. 2.3: Graphical Displays

    Florence Nightingale (1820-1910) was one of the first people to use graphical representations to present data. Nightingale was a nurse in the Crimean War and used a type of graph that she called a polar area diagram, or coxcomb, to display mortality figures for contagious diseases such as cholera and typhus.

  26. [2404.10443] AGHINT: Attribute-Guided Representation Learning on

    Recently, heterogeneous graph neural networks (HGNNs) have achieved impressive success in representation learning by capturing long-range dependencies and heterogeneity at the node level. However, few existing studies have delved into the utilization of node attributes in heterogeneous information networks (HINs). In this paper, we investigate the impact of inter-node attribute disparities on ...

  27. The Alzheimer's Knowledge Base: A Knowledge Graph for Alzheimer Disease

    Background: As global populations age and become susceptible to neurodegenerative illnesses, new therapies for Alzheimer disease (AD) are urgently needed. Existing data resources for drug discovery and repurposing fail to capture relationships central to the disease's etiology and response to drugs. Objective: We designed the Alzheimer's Knowledge Base (AlzKB) to alleviate this need by ...

  28. Chiplet Placement Order Exploration Based on Learning to Rank with

    To address these challenges, we propose a learning to rank approach with graph representation, building upon the reinforcement learning framework RLPlanner. This method aims to select the optimal chiplet placement order for each chiplet-based system. Experimental results demonstrate that compared to placement order obtained solely based on the ...