Logo Logo
  • Home Menu
  • Visualization Services
  • Contact

Contact Info

  • Email info@accessnowgroup.com
  • Phone Drop us an email
  • Office Hours 24 X 7

Additional Links

  • Contact Us

Connect With Us

The Curse of Dimensionality

  • Home
  • Blog Details
July 19 2020
  • Atrificial
  • data

Why High Dimensional Data Can Be So Troublesome

Have you ever been in the middle of telling someone a story or struggling through a long explanation of something complicated when the other person looks at you and asks, “What’s the point?”

First, your friend is so rude!

But we have also been in your friend’s shoes too — we are all busy people with places to go and folks to see. We want our information quick and to the point.

That is the essence of dimensionality reduction. When confronted with a ton of data, we can use dimensionality reduction algorithms to make the data “get to the point”.

In a previous post, I covered PCA, a powerful and versatile dimensionality reduction algorithm that unearths the underlying trends in our data. There are other options as well — such as non-negative matrix factorization, linear discriminant analysis and more (I promise to cover these in the future).

However, today our topic is not about a specific algorithm, but rather about why we need dimensionality reduction algorithms in the first place — The Curse of Dimensionality.

When is Data High Dimensional and Why Might That Be a Problem?

The Curse of Dimensionality sounds like something straight out of a pirate movie but what it really refers to is when your data has too many features.

The phrase, attributed to Richard Bellman, was coined to express the difficulty of using brute force (a.k.a. grid search) to optimize a function with too many input variables. Imagine running an Excel Solver optimization where there are 100,000 possible input variables, or in other words, 100,000 potential levers to pull (pretty sure that would crash your Excel).

In today’s big data world it can also refer to several other potential issues that arise when your data has a huge number of dimensions:

If we have more features than observations than we run the risk of massively overfitting our model — this would generally result in terrible out of sample performance.

When we have too many features, observations become harder to cluster — believe it or not, too many dimensions causes every observation in your dataset to appear equidistant from all the others. And because clustering uses a distance measure such as Euclidean distance to quantify the similarity between observations, this is a big problem. If the distances are all approximately equal, then all the observations appear equally alike (as well as equally different), and no meaningful clusters can be formed.

Previous Post Next Post
DataStore

2 Comments

Jonathom Doe
July 19, 2020

Great

Reply
    Spart Lee
    July 19, 2020

    Very nice article for dimension and it talk about various aspects of engineering

    Reply

Leave a Comment Cancel reply

Recent Posts

  • M
  • M
  • M
  • M
  • M

Recent Comments

  • Soumya on What Is Data Engineering and Is It Right for You?
  • Spart Lee on What is Data Engineering: Explaining the Data Pipeline, Data Warehouse, and Data Engineer Role
  • Spart Lee on The Curse of Dimensionality
  • Spart Lee on It’s Time to Re-evaluate Your Relationship With Hadoop
  • Jonathom Doe on What is Data Engineering: Explaining the Data Pipeline, Data Warehouse, and Data Engineer Role

Archives

  • January 2026
  • December 2025
  • November 2025
  • October 2025
  • September 2025
  • March 2021
  • July 2020

Categories

  • Analysis
  • Atrificial
  • data
  • Process
  • Store
  • Uncategorized

Meta

  • Log in
  • Entries feed
  • Comments feed
  • WordPress.org

Tags

Analysis Artificial Data Process Software Store
Logo

We are there to assist you in building cutting edge solutions which will propel business in profitable direction.

Useful Links

  • Contact

Services

  • Coming soon

Contact Info

Feel free to reach us at any time. We are happy to assist you.

  • Email: info@accessnowgroup.com
  • Contact: info@accessnowgroup.com

© Copyright 2021. AccessNowGroup.com

  • Contact Us